Spatial Big Data: A Perspective


Shashi Shekhar


Computer Science Department, University of Minnesota.





Increasingly, the size, variety, update rate, and complexity of location-based datasets exceed the capacity of common spatial computing platforms to manage, process, and analyze the data with reasonable effort. Such data is known as Spatial Big Data (SBD). Examples include cell-phone trajectories, location-based service requests, social media check-ins, sensor measurements, and temporally detailed road maps.

SBD has transformative potential. For example, a 2011 McKinsey Global Institute report estimates savings of about $600 billion annually by 2020 in fuel and time from helping vehicles avoid congestion and idling. Geo-social media is being leveraged for timely detection of tornadoes and outbreaks. The sciences are investigating SBD for spatio-temporal hypothesis generation as well as for complex questions on which progress was previously hampered by data paucity.

SBD challenges, opportunities, and debates arise at the level of platforms, analytics, and scientific methods. Platforms (e.g., Hadoop, SQL/OGIS) are challenged by iterative and interdependent spatial algorithms as well as by increasing variety (e.g., the Lagrangian frame of reference). Opportunities include both adaptation to current platforms (e.g., non-iterative algorithms, Mahout) and exploration of alternative platforms. Analytics methods are challenged by spatial auto-correlation, geographic heterogeneity, and the need to reduce user burden by estimating neighborhood relationships. A current debate juxtaposes the rise of both simpler models (e.g., data as the model) and more complex models (e.g., ensembles). Scientific-method debates include data quality measures (e.g., bias vs. timeliness), the impact of corporate ownership of SBD on transparency and reproducibility, and the effect of data-intensive science (the fourth paradigm) on classical methods (e.g., theory-based, hypothesis testing).

KEYWORDS: Spatial, Spatio-temporal, Big Data, Data Analytics.

NOTE 1: Some of the ideas discussed in this talk appeared in the following publications:

  1. Sushil K. Prasad et al., Parallel Processing over Spatial-Temporal Datasets from Geo, Bio, Climate and Social Science Communities: A Research Roadmap. IEEE BigData Congress, 2017, pages 232-250.
  2. Agriculture Big Data (AgBD) Challenges and Opportunities From Farm To Table: A Midwest Big Data Hub Community Whitepaper, NSF Midwest Big Data Hub, December 2017.
  3. Transdisciplinary Foundations of Geospatial Data Science, ISPRS International Journal of Geo-Information, 6(12), 2017 (DOI: 10.3390/ijgi6120395). (with Y. Xie, E. Eftelioglu, R. Ali, X. Tang, Y. Li, and R. Doshi).
  4. Spatiotemporal Data Mining: A Computational Perspective, ISPRS International Journal of Geo-Information, 4(4):2306-2338, 2015 (DOI: 10.3390/ijgi4042306). (with Z. Jiang, R. Ali, E. Eftelioglu, X. Tang, V. Gunturi, and X. Zhou).
  5. Identifying patterns in spatial information: a survey of methods, S. Shekhar, M. R. Evans, J. M. Kang, and P. Mohan, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3):193-214, May/June 2011 (DOI: 10.1002/widm.25).
  6. M. Evans, D. Oliver, K. Yang, X. Zhou, R. Ali, and S. Shekhar, Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities, in CyberGIS for Geospatial Discovery and Innovation (Ed. S. Wang and M. Goodchild), Springer, 2019, ISBN 978-94-024-1529-2.
  7. Spatial Big Data: Platforms, Analytics and Science , under review for GeoJournal Special Issue on Big Data, (planned).
  8. Spatial Big Data: Case Studies on Volume, Velocity, and Variety, in Big Data: Techniques and Technologies in Geoinformatics (Ed. H. Karimi), CRC Press, 2014, ISBN 978-1-46-658651-2.
  9. Spatiotemporal data mining in the era of big spatial data: Algorithms and applications, Proceedings ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, pages 1-10, November 2012 (DOI: 10.1145/2447481.2447482). (with R. Vatsavai, A. Ganguly, V. Chandola, A. Stefanidis, S. Klasky).
  10. Spatial Big-Data Challenges Intersecting Mobility and Cloud Computing, 11th International ACM SIGMOD Workshop on Data Engineering for Wireless and Mobile Access, 2012. A summary appeared in NSF Workshop on Social Networks and Mobility in the Cloud, 2012. (with Michael R. Evans, Viswanath Gunturi, KwangSoo Yang).

NOTE 2: This talk has been presented at the following forums: