Spatial Big Data: A Perspective


Shashi Shekhar: Biography, Homepage, Picture


Computer Science Department, University of Minnesota.





Increasingly, the size, variety, update rate, and complexity of location-based datasets exceed the capacity of common spatial computing platforms to manage, process, and analyze the data with reasonable effort. Such data is known as Spatial Big Data (SBD). Examples include cell-phone trajectories, location-based service requests, social media check-ins, sensor measurements, and temporally detailed road maps.

SBD has transformative potential. For example, a 2011 McKinsey Global Institute report estimates savings of about $600 billion annually by 2020 in fuel and time by helping vehicles avoid congestion and idling. Geo-social media is being leveraged for timely detection of tornadoes and outbreaks. The sciences are investigating SBD for spatio-temporal hypothesis generation as well as for complex questions where progress has been hampered by data paucity.

SBD challenges, opportunities, and debates arise at the level of platforms, analytics, and scientific methods. Platforms (e.g., Hadoop, SQL/OGIS) are challenged by iterative and interdependent spatial algorithms as well as by increasing variety (e.g., the Lagrangian frame of reference). Opportunities include both adaptation to current platforms (e.g., non-iterative algorithms, Mahout) and exploration of alternative platforms. Analytics methods are challenged by spatial auto-correlation, geographic heterogeneity, and the need to reduce user burden by estimating neighborhood relationships. A current debate juxtaposes the rise of both simpler models (e.g., data as the model) and more complex models (e.g., ensembles). Scientific method debates include data quality measures (e.g., bias vs. timeliness), the impact of corporate ownership of SBD on transparency and reproducibility, and the effect of data-intensive science (the fourth paradigm) on classical methods (e.g., theory-based, hypothesis testing).
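To make the spatial auto-correlation challenge concrete, the following minimal sketch computes Moran's I, a standard measure of how similar nearby values are. It is an illustration only and is not taken from the talk: the helper names (`line_neighbors`, `morans_i`), the one-dimensional chain of locations, and the binary adjacency weights are all assumptions chosen to keep the example small. Values of I well above zero suggest that neighbors resemble each other, which violates the independence assumption behind many classical analytics methods.

```python
from typing import Dict, List, Tuple

Weights = Dict[Tuple[int, int], float]


def line_neighbors(n: int) -> Weights:
    """Hypothetical neighborhood: n locations on a line, adjacent ones are neighbors."""
    weights: Weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0  # binary weight, stored in both directions
        weights[(i + 1, i)] = 1.0
    return weights


def morans_i(values: List[float], weights: Weights) -> float:
    """Moran's I = (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation of value i from the mean and W is the total weight."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)


w = line_neighbors(5)
smooth = [1.0, 2.0, 3.0, 4.0, 5.0]       # neighbors are similar
alternating = [1.0, 5.0, 1.0, 5.0, 1.0]  # neighbors are dissimilar

print(morans_i(smooth, w))       # positive: spatially auto-correlated
print(morans_i(alternating, w))  # negative: neighbors tend to differ
```

In practice the neighborhood weights are rarely known in advance, which is one reason the abstract lists estimating neighborhood relationships as a way to reduce user burden.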

KEYWORDS: Spatial, Spatio-temporal, Big Data, Data Analytics.

NOTE 1: Some of the ideas discussed in this talk appeared in the following publications:

  1. Enabling Spatial Big Data via CyberGIS: Challenges and Opportunities, under review as a book chapter in CyberGIS: Fostering a New Wave of Geospatial Innovation and Discovery (Ed. S. Wang and M. Goodchild), Springer, 2014 (expected).
  2. Spatial Big Data: Platforms, Analytics and Science, under review for GeoJournal Special Issue on Big Data (planned).
  3. Spatial Big Data: Case Studies on Volume, Velocity, and Variety, in Big Data: Techniques and Technologies in Geoinformatics (Ed. H. Karimi), ISBN 978-1-46-658651-2, CRC Press, 2014.
  4. Spatial Big-Data Challenges Intersecting Mobility and Cloud Computing, 11th International ACM SIGMOD Workshop on Data Engineering for Wireless and Mobile Access, 2012. (A summary appeared in NSF Workshop on Social Networks and Mobility in the Cloud, 2012.)

NOTE 2: This talk has been presented at the following forums: