TITLE:

What is special about Geo-AI and Spatial data science?

PRESENTER:

Shashi Shekhar: Biography, Homepage, Picture

AFFILIATION:

Computer Science Department, University of Minnesota.

URL:

http://www.cs.umn.edu/~shekhar

ABSTRACT:

The rise of spatial big data (e.g., trajectories, remote sensing) is fueling the growth of Geo-AI (e.g., automated geo-imagery analysis) for making previously unimaginable maps, answering trail-blazing geo-content-based queries, and understanding the spatiotemporal patterns of our lives. Applications range from navigation, ride-sharing, and delivery apps, to monitoring global crops, climate change, diseases, and smart cities, to understanding cellular or urban patterns of life.

However, one-size-fits-all machine learning performs poorly (e.g., salt-and-pepper noise, inaccuracy) due to spatial autocorrelation and variability, which violate the common i.i.d. assumption (i.e., data samples are generated independently and from an identical distribution). Furthermore, the high cost of spurious patterns requires guardrails such as noise tolerance and the modeling of spatial concepts (e.g., polygons) and implicit relationships (e.g., distance, inside). In addition, methods that discretize continuous space face the modifiable areal unit problem (e.g., gerrymandering).
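As a rough illustration of the spatial autocorrelation mentioned above, the following sketch computes global Moran's I, a standard autocorrelation statistic, on small synthetic grids. The function name `morans_i`, the rook (4-neighbor) adjacency with binary weights, and the three example grids are assumptions made for this illustration; they are not taken from the talk.

```python
import numpy as np

def morans_i(grid):
    """Global Moran's I of a 2-D array, rook (4-neighbor) adjacency, binary weights."""
    z = grid - grid.mean()  # deviations from the global mean
    # Sum of z_i * z_j over every unordered pair of horizontal and vertical neighbors.
    pair_sum = (z[:, :-1] * z[:, 1:]).sum() + (z[:-1, :] * z[1:, :]).sum()
    n_pairs = z[:, :-1].size + z[:-1, :].size
    # With symmetric weights each pair is counted twice in both the cross-product
    # sum and the total weight W, so the factor of 2 cancels in (N / W) * (...).
    return (grid.size / n_pairs) * pair_sum / (z ** 2).sum()

# Hypothetical test surfaces (illustrative only).
gradient = np.add.outer(np.arange(20), np.arange(20)).astype(float)  # smooth trend
checkerboard = np.indices((20, 20)).sum(axis=0) % 2                  # alternating cells
noise = np.random.default_rng(0).normal(size=(20, 20))               # i.i.d. samples

print("gradient    ", round(morans_i(gradient), 3))      # strongly positive
print("checkerboard", round(morans_i(checkerboard), 3))  # strongly negative
print("i.i.d. noise", round(morans_i(noise), 3))         # close to zero
```

Values near zero are what an i.i.d. model implicitly expects. Smooth geographic surfaces typically sit far from zero, which is one way to see why neighboring samples cannot be treated as independent.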

Thus, the talk suggests spatial data science approaches and describes methods for spatial classification and prediction (e.g., spatial auto-regression, spatial decision trees, spatial-variability-aware neural networks) along with techniques for discovering patterns such as noise-robust hotspots (e.g., SaTScan, linear, arbitrary shapes), interactions (e.g., co-locations, tele-connections), spatial outliers, and their spatio-temporal counterparts (e.g., cascade, mixed-drove co-occurrence). It concludes by calling for inclusion of spatial perspectives in data science courses and curricula.
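To make one of these pattern families concrete, here is a minimal sketch of a neighborhood-difference test for spatial outliers: each location's value is compared with the mean of its nearest neighbors, and locations whose standardized difference exceeds a threshold are flagged. The function name `spatial_outliers`, the brute-force k-nearest-neighbor search, the defaults `k=4` and `theta=3.0`, and the synthetic data are all assumptions for illustration. This is not presented as the exact algorithm of any paper listed on this page.

```python
import numpy as np

def spatial_outliers(coords, values, k=4, theta=3.0):
    """Return indices whose value differs unusually from their k nearest neighbors."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Brute-force pairwise Euclidean distances (fine for small examples).
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d, axis=1, kind="stable")[:, :k]
    # Difference between each value and the average of its neighborhood.
    diff = values - values[nbrs].mean(axis=1)
    # Standardize the differences and flag those beyond the threshold.
    z = (diff - diff.mean()) / diff.std()
    return np.flatnonzero(np.abs(z) > theta)

# Hypothetical data: a smooth surface on a 10 x 10 grid with one injected spike.
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()])
values = (xs + ys).ravel().astype(float)
values[55] += 50.0  # the cell at x=5, y=5 no longer matches its neighbors

print(spatial_outliers(coords, values))
```

Note that the spiked value (60) would also stand out globally in this toy data. The point of the local test is that it would still flag a value that is unremarkable overall but inconsistent with its surroundings.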

KEYWORDS: Spatial, Spatio-temporal, Auto-correlation, Data Mining, Machine Learning, Statistics.

ACKNOWLEDGMENTS: This work was supported in part by the National Science Foundation, the U.S. Department of Defense, the National Aeronautics and Space Administration, the Federal Highway Administration, and the University of Minnesota (e.g., Center for Transportation Studies).

SURVEYS, TUTORIALS

  1. Statistically-Robust Clustering Techniques for Mapping Spatial Hotspots: A Survey, ACM Computing Surveys, 55(2):1-38, March 2023, doi.org/10.1145/3487893.
  2. U.S. Bureau of Industry and Security, Addition of Software Specially Designed To Automate the Analysis of Geospatial Imagery to the Export Control Classification Number 0Y521 Series, Federal Register, 459-462 (4 pages), 85 FR 459, Document Number 2019-27649, Jan. 6th, 2020.
  3. A UCGIS Call to Action: Bringing the Geospatial Perspective to Data Science Degrees and Curricula, 2019.
  4. S. Shekhar and P. Vold, Spatial Computing, Essential Knowledge Series, MIT Press, 2020.
  5. AM-97 - An Introduction to Spatial Data Mining, The Geographic Information Science & Technology Body of Knowledge (4th Quarter 2020 Edition), John P. Wilson (Ed.). DOI:10.22224/gistbok/2020.4.5. (Also University of Minnesota Computer Science technical report 18-013, 2018).
  6. Shashi Shekhar. What is special about spatial data science and Geo-AI? In 33rd International Conference on Scientific and Statistical Database Management (SSDBM 2021), ACM, page 271, 2021. DOI:https://doi.org/10.1145/3468791.3472263.
  7. L. Chauhan and S. Shekhar. GeoAI-Accelerating a Virtuous Cycle between AI and Geo, Proc. 13th ACM Intl. Conference on Contemporary Computing (IC3-2021), pp. 355-370. DOI:https://doi.org/10.1145/3474124.3474179
  8. Data Science for Earth: An Earth Day Report, ACM SIGKDD Explor. Newsl. 22(1):4-7, June 2020. DOI:https://doi.org/10.1145/3400051.3400055 (E. Eftelioglu, S. Shekhar, J. Hudson, L. Joppa, C. Baru, and V. Janeja).
  9. Transdisciplinary Foundations of Geospatial Data Science, ISPRS International Journal of Geo-Information, 6(12), 2017. doi:10.3390/ijgi6120395. (with Y. Xie, E. Eftelioglu, R. Ali, X. Tang, Y. Li, and R. Doshi)
  10. Spatiotemporal Data Mining: A Computational Perspective, ISPRS International Journal of Geo-Information, 4(4):2306-2338, 2015 (DOI: 10.3390/ijgi4042306). (w/ Z. Jiang, R. Ali, E. Eftelioglu, X. Tang, V. Gunturi, and X. Zhou).
  11. Identifying patterns in spatial information: a survey of methods, S. Shekhar, M. R. Evans, J. M. Kang and P. Mohan, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 193-214, 1(3), May/June 2011. (DOI: 10.1002/widm.25).
  12. Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data, IEEE Transactions on Knowledge and Data Engineering, 29(10):2318-2331, June 2017. (DOI: 10.1109/TKDE.2017.2720168). (w/ A. Karpatne et al.).
  13. Parallel Processing over Spatial-Temporal Datasets from Geo, Bio, Climate and Social Science Communities: A Research Roadmap. IEEE Big Data Congress 2017: 232-250 (with S. Prasad et al.).

PAPERS ON SPECIFIC PATTERN FAMILIES

  1. Towards Spatial Variability Aware Deep Neural Networks (SVANN): A General Approach, ACM Transactions on Intelligent Systems and Technology 12(6):1-21, December 2021. https://doi.org/10.1145/3466688 (Note: a summary of results appeared in the 1st ACM SIGKDD Workshop on Deep Learning for Spatiotemporal Data, Applications, and Systems (Deepspatial 2020), 2020, Best Paper) (J. Gupta, Y. Xie and S. Shekhar).
  2. Significant DBSCAN towards Statistically Robust Clustering, Proc. 16th Intl. Symposium on Spatial and Temporal Databases (SSTD 19), 31-40, ACM (Best Paper). DOI:https://doi.org/10.1145/3340964.3340968 (Y. Xie and S. Shekhar).
  3. Discovering colocation patterns from spatial data sets: a general approach, IEEE Trans. on Know. and Data Eng., 16(12), 2004 (w/ Y. Huang et al.)
  4. A join-less approach for mining spatial colocation patterns, IEEE Trans. on Know. and Data Eng., 18(10), 2006. (w/ J. Yoo).
  5. Cascading Spatio-Temporal Pattern Discovery, IEEE Trans. Knowl. Data Eng. 24(11): 1977-1992, 2012 (w/ P. Mohan et al.).
  6. Detecting graph-based spatial outliers: algorithms and applications, Proc. ACM Intl. Conf. on Knowledge Discovery & Data Mining, 2001 (with Q. Lu et al.)
  7. A unified approach to detecting spatial outliers, Springer GeoInformatica, 7 (2), 2003. (w/ C. Lu, et al.)
  8. Discovering Flow Anomalies: A SWEET Approach, IEEE Intl. Conf. on Data Mining, 2008 (w/ J. Kang).
  9. Discovering personally meaningful places: An interactive clustering approach, ACM Trans. on Info. Systems (TOIS) 25 (3), 2007. (with C. Zhou et al.)
  10. A K-Main Routes Approach to Spatial Network Activity Summarization, IEEE Trans on Know. & Data Eng., 26(6), 2014. (with D. Oliver et al.)
  11. Significant Linear Hotspot Discovery, IEEE Trans. Big Data 3(2): 140-153, 2017, (w/ X. Tang et al.)
  12. Ring-Shaped Hotspot Detection, IEEE Trans. Know. and Data Eng., 28(12): 3367-3381, 2016, (w/ E. Eftelioglu et al.)
  13. Spatial contextual classification and prediction models for mining geospatial data, IEEE Transactions on Multimedia, 4 (2), 2002. (with P. Schrater et al.)
  14. Focal-Test-Based Spatial Decision Tree Learning, IEEE Trans. Knowl. Data Eng. 27(6): 1547-1559, 2015 (summary in Proc. IEEE Intl. Conf. on Data Mining, 2013) (w/ Z. Jiang et al.).
  15. Spatiotemporal change footprint pattern discovery: an inter-disciplinary survey, Wiley Interdisc. Rev.: Data Mining and Know. Discovery 4(1), 2014. (with X. Zhou et al.)