Quantifying the scale effect in geospatial big data using semi-variograms

Authors: Lei Chen (1); Yong Gao (1); Di Zhu (1); Yihong Yuan (2); Yu Liu (1)
Author affiliations: (1) Institute of Remote Sensing and Geographic Information System, School of Earth and Space Sciences, Peking University, Beijing, China; (2) Department of Geography, Texas State University, San Marcos, Texas, United States of America
Published in: PLoS ONE 14(11)
Category: Research Article
doi: 10.1371/journal.pone.0225139


The scale effect is an important research topic in geography. When individual-level data are aggregated into areal units, the scale problem is unavoidable. The problem is more pronounced when mining collective patterns from big geo-data, because of the characteristics of extensive spatial data. Although multi-scale models have been constructed to mitigate this issue, most studies still choose a single scale arbitrarily when extracting spatial patterns. In this research, we introduce the nugget-sill ratio (NSR), derived from semi-variograms, as an indicator for identifying the optimal scale. We conducted two simulation experiments to demonstrate the feasibility of this method. Our results showed that the optimal scale is negatively correlated with spatial point density but positively correlated with the degree of dispersion in a point pattern. We also applied the proposed method to a case study using Weibo check-in data from Beijing, Shanghai, Chengdu, and Wuhan. Our study provides a new perspective for measuring the spatial heterogeneity of big geo-data and for selecting an optimal spatial scale for big data analytics.
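
As a rough illustration of the NSR idea, the sketch below aggregates points into square cells at several cell sizes, computes an empirical semi-variogram of the cell counts, fits a spherical model, and reports the ratio of nugget to sill. It is a minimal sketch under stated assumptions, not the authors' implementation: the spherical model, the definition NSR = nugget / (nugget + partial sill), the grid-search fit, the synthetic point pattern, and all function names (`grid_counts`, `empirical_semivariogram`, `fit_spherical`, `nugget_sill_ratio`) are assumptions introduced here. The abstract does not state how the optimal scale is read off the NSR values, so the sketch stops at reporting NSR per cell size.

```python
import math
import random


def grid_counts(points, cell, extent):
    """Count (x, y) points in square cells of side `cell` over [0, extent) x [0, extent)."""
    n = int(math.ceil(extent / cell))
    grid = [[0] * n for _ in range(n)]
    for x, y in points:
        i, j = int(y // cell), int(x // cell)
        if 0 <= i < n and 0 <= j < n:
            grid[i][j] += 1
    return grid


def empirical_semivariogram(grid, cell, max_lag_cells):
    """Return [(h, gamma(h))] from row-wise and column-wise cell pairs of a square grid."""
    n = len(grid)
    result = []
    for k in range(1, max_lag_cells + 1):
        total, pairs = 0.0, 0
        for i in range(n):
            for j in range(n):
                if j + k < n:
                    total += (grid[i][j] - grid[i][j + k]) ** 2
                    pairs += 1
                if i + k < n:
                    total += (grid[i][j] - grid[i + k][j]) ** 2
                    pairs += 1
        if pairs:
            # gamma(h) = sum of squared differences / (2 * number of pairs)
            result.append((k * cell, total / (2.0 * pairs)))
    return result


def spherical_shape(h, a):
    """Unit spherical model: rises from 0 at h = 0 to 1 at the range a."""
    r = min(h / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3


def fit_spherical(variogram, n_ranges=40):
    """Grid-search the range; solve nugget c0 and partial sill c by least squares."""
    lags = [h for h, _ in variogram]
    g = [v for _, v in variogram]
    n = len(lags)
    best = None
    for step in range(1, n_ranges + 1):
        a = max(lags) * step / n_ranges
        f = [spherical_shape(h, a) for h in lags]
        mf, mg = sum(f) / n, sum(g) / n
        sff = sum((fi - mf) ** 2 for fi in f)
        if sff == 0.0:
            continue  # range not above the shortest lag: nugget and sill not separable
        c = sum((fi - mf) * (gi - mg) for fi, gi in zip(f, g)) / sff
        c0 = mg - c * mf
        if c0 < 0.0:  # keep the nugget non-negative, refit the slope through the origin
            c0 = 0.0
            c = sum(fi * gi for fi, gi in zip(f, g)) / sum(fi * fi for fi in f)
        if c < 0.0:  # keep the partial sill non-negative
            c, c0 = 0.0, mg
        sse = sum((c0 + c * fi - gi) ** 2 for fi, gi in zip(f, g))
        if best is None or sse < best[0]:
            best = (sse, c0, c, a)
    if best is None:
        raise ValueError("need at least two distinct lags to fit the model")
    return best[1], best[2], best[3]


def nugget_sill_ratio(c0, c):
    """NSR = nugget / sill, with sill = nugget + partial sill (assumed definition)."""
    sill = c0 + c
    return c0 / sill if sill > 0 else float("nan")


if __name__ == "__main__":
    random.seed(0)
    extent = 100.0
    # Synthetic clustered point pattern (illustrative only, not the paper's data).
    centers = [(random.uniform(20, 80), random.uniform(20, 80)) for _ in range(5)]
    points = [(random.gauss(cx, 8.0), random.gauss(cy, 8.0))
              for cx, cy in centers for _ in range(400)]
    for cell in (2.5, 5.0, 10.0):
        grid = grid_counts(points, cell, extent)
        gamma = empirical_semivariogram(grid, cell, len(grid) // 2)
        c0, c, a = fit_spherical(gamma)
        print(f"cell={cell:5.1f}  nugget={c0:.2f}  sill={c0 + c:.2f}  "
              f"range={a:.1f}  NSR={nugget_sill_ratio(c0, c):.3f}")
```

Because the model is linear in the nugget and partial sill once the range is fixed, the sketch only searches over the range and solves the other two parameters in closed form; a production analysis would more likely use a dedicated geostatistics library and weight lags by their pair counts.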

Keywords:

Cell cycle and cell division – Data processing – Human mobility – Polynomials – Simulation and modeling – Social media – Urban areas – Remote sensing imagery


Journal issue: 2019, No. 11