## Jeff Calder

Associate Professor of Mathematics538 Vincent Hall

School of Mathematics

University of Minnesota

Phone: 612-626-1324

Email: jcalder at umn dot edu

# Research

The overall goal of my research is to use the theory of partial differential equations (PDEs) and the calculus of variations to study foundational problems in machine learning and data science, and develop new, more efficient, algorithms founded on strong theoretical principles. Much of my work involves proving large sample size continuum limits for discrete problems. In this context, the continuum is the big data limit of discrete machine learning, and tools from continuum analysis can be used to understand why algorithms work well, and when they will fail. It is particularly exciting when this kind of analysis leads to new algorithms with performance guarantees.

My research is currently supported by NSF DMS Career Award 1944925 (2020—2025), NSF DMS Award 1816917 (2018—2021), an Alfred P. Sloan Foundation Research Fellowship (2020—2022), and a McKnight Presidential Fellowship (2021—2024). I also grateful for previous support by the NSF under DMS grants 1500829 (2015—2017) and 1713691 (2017—2020), and a University of Minnesota Grant in Aid award (2017—2019).

The links below lead to brief descriptions of some research projects that I have worked recently.

- Graph-Based Learning
- Continuum Limits for Data Peeling
- Prediction with Expert Advice
- Mathematics and Anthropology

Much of my work is in collaboration with current and past PhD students and postdoctoral researchers (listed below), as well as many other collaborators.

#### Reproducible Research

I am committed to reproducibility of research. Code to reproduce results from our research papers can be found on my publications page under the corresponding paper. Most of the code is posted to my personal GitHub account, including our new GraphLearning (GitHub, PyPI) Python package, or to the AMAAZE website (under the software tab). As part of the AMAAZE consortium, we are developing plugins for MeshLab to make our work easily accessible to the community. Please reach out if you need help running any of our code.

#### Current graduate students

- Brendan Cook, PhD
- Drisana Mosaphir, PhD
- Riley O'Neill, co-advised with Peter Olver, PhD
- Kodjo Houssou, PhD
- Rosalind Hong, MSc

#### Current Postdocs

- Mahmood Ettehad, IMA/Target postdoc 2020—2022

#### Former PhD students

- Amber Yuan, graduated 2021. First job: Machine Learning Engineer at Spotify.
- Mauricio Flores Rios (co-advised wtih Gilad Lerman), graduated 2018 First Job: Lead data scientist at Target Corporation.

#### Former Postdocs

- Nadejda Drenska, MCFAM postdoc 2018—2021. Currently a Rufus Isaacs Postdoctoral Fellow in the Department of Applied Mathematics and Statistics at Johns Hopkins University.
- Shannon Negaard-Paper, IMA/Cargill postdoc 2019—2021. Currently a data scientist at NextEra Energy.

# Graph-Based Learning

A major focus of my current research is graph-based learning. A graph structure encodes interdependencies among constituents, such as social media users, images or video sequences, or physical or biological agents, and provides a convenient representation for high dimensional data. Common tasks in graph-based learning include clustering (i.e., grouping similar elements), semi-supervised learning (e.g., label propagation), or dimension reduction via embeddings, among many others. My research is guided by viewing discrete graph-based learning problems as discretizations of continuous PDEs or variational problems, and I use ideas and analysis from the continuum to draw new insights and develop new algorithms.

I have been interested recently in graph-based semi-supervised learning in the regime where very few labeled examples are available. In this setting, the widely used Laplacian regularization is ill-posed, and tends to put most data points into the same class. I have worked on developing new algorithms that are provably well-posed for low label rate problems. In particular, I have studied the p-Laplace regularizers, and proved they are well posed for sufficiently large p (including the infinity-Laplacian), and have worked on developing efficient algorithms for solving the p-Laplace equation on a graph. I also developed the Properly Weighted Laplacian, which involves re-weighting the graph near labeled nodes in a way that is provably well posed with very few labels. The analysis of the Properly Weighted Laplacian involves an interesting study of singularly weighted Sobolev spaces.

More recently, we developed a new framework, called Poisson learning, for semi-supervised learning at low label rates. The idea is to replace the assignment of label values by the placement of point sources and sinks and solving the corresponding graph Poisson equation. Our paper on Poisson learning was recently accepted to ICML 2020. As part of the paper, I have released a Python package GraphLearning (GitHub, PyPI) that is devoted to efficient implementations of modern graph-based learning algorithms for semi-supervised learning and clustering.

I have also worked on spectral convergence and regularity for graph Laplacian eigenvectors. The spectrum of the graph Laplacian is used for many tasks in data science, such as clustering (e.g., spectral clustering) and dimension reduction through spectral embeddings (e.g., diffusion maps). In the manifold setting, where the graph is assumed to be a random geometric graph sampled from a low dimensional embedded manifold, it is known that the spectrum of the graph Laplacian converges, in the large sample size and sparse graph limit, to the spectrum of a particular weighted Laplace-Beltrami operator on the data manifold. In recent paper we improved the state-of-the-art spectral convergence rates to show that they match the linear pointwise consistency convergence rates. In a follow up work we went a step further and established a type of macroscopic Lipschitz estimate for solutions of graph Poisson equations, and used this to establish uniform (in fact, Holder) convergence of graph Laplacian spectra.

I am also interested in problems on directed graphs, and have established a continuum limit for the PageRank vector, which is used to rank the importance of nodes in a graph. I plan to continue working on problems in graph based learning, including semi-supervised learning on directed graphs, clustering, and spectral convergence and regularity theory for graph Laplacians.

## Related Publications

- M. Flores, J. Calder, and G. Lerman.
**Analysis and algorithms for Lp-based semi-supervised learning on graphs.***To appear in Applied and Computational Harmonic Analysis*, 2022. [ arXiv ] [ Code ] - L. Bungert, J. Calder, and T. Roith.
**Uniform Convergence Rates for Lipschitz Learning on Graphs.***arXiv preprint*, 2021. [ arXiv ] [ Code ] - J. Calder, S. Park, and D. Slepčev.
**Boundary Estimation from Point Clouds: Algorithms, Guarantees and Applications.***arXiv preprint*, 2021. [ arXiv ] [ Code ] - J. Calder, N. García Trillos, and M. Lewicka.
**Lipschitz regularity of graph Laplacians on random data clouds.***To appear in SIAM Journal on Mathematical Analysis*, 2021. [ arXiv ] - J. Calder, D. Slepčev, and M. Thorpe.
**Rates of convergence for Laplacian semi-supervised learning with low labeling rates.***arXiv preprint*, 2020. [ arXiv ] - A. Yuan, J. Calder, and B. Osting.
**A continuum limit for the PageRank algorithm.***European Journal of Applied Mathematics*, 2021. [ arXiv ] [ Journal ] [ Code ] - J. Calder, B. Cook, M. Thorpe, and D. Slepčev.
**Poisson Learning: Graph based semi-supervised learning at very low label rates.***Proceedings of the 37th International Conference on Machine Learning, PMLR*, 119:1306--1316, 2020. [ arXiv ] [ Journal ] [ Code ] - J. Calder and N. García Trillos.
**Improved spectral convergence rates for graph Laplacians on ε-graphs and k-NN graphs.***arXiv preprint*, 2019. [ arXiv ] [ Code ] - J. Calder and D. Slepčev.
**Properly-weighted graph Laplacian for semi-supervised learning.***Applied Mathematics and Optimization: Special Issue on Optimization in Data Science*, 82:1111--1159, 2019. [ arXiv ] [ Journal ] - J. Calder.
**Consistency of Lipschitz learning with infinite unlabeled data and finite labeled data.***SIAM Journal on Mathematics of Data Science*, 1(4):780--812, 2019. [ arXiv ] [ Journal ] [ Code ] - J. Calder.
**The game theoretic p-Laplacian and semi-supervised learning with few labels.***Nonlinearity*, 32(1):301--330, 2018. [ arXiv ] [ Journal ]

# Continuum Limits for Data Peeling

Many problems in science and engineering involve the sorting, or ordering, of large amounts of data, which can be challenging in the context of big data problems. There is a wide class of such sorting algorithms, called data peeling, that work by repeatedly peeling away extremal points. Different notions of extremality lead to different notions of data peeling. Data peeling

Nondominated sorting is a data peeling algorithm that sorts points in Euclidean space into layers by repeatedly removing the coordinatewise minimal points. The algorithm is fundamental in multi-objective optimization, where it is a key component of the so-called genetic and evolutionary algorithms for continuous multi-objective optimization, which is ubiquitous in science and engineering. We have also applied nondominated sorting to the problems of multi-criteria anomaly detection, and multi-query image retrieval. In my doctoral dissertation, I proved that the layers obtained by nondominated sorting of a random point cloud converge in the large sample size limit (with probability one) to the level sets of a function that satisfies a Hamilton-Jacobi. Recently, with my student Brendan Cook, we established a quantitative convergence rate for this continuum limit. I utilized this continuum limit to develop fast approximate sorting algorithms based on estimating the point cloud density and solving the Hamilton-Jacobi equation numerically. We applied this fast approximate algorithm to the problem of anomaly detection in a streaming setting.

Convex hull peeling is another data peeling algorithm that is used for defining data depth. Convex hull peeling works by repeatedly removing the vertices of the convex hull of a point cloud. The convex hull peeling depth is the number of steps required to remove a given point, and the convex hull median is the centroid of the final convex layer, before all points are peeled away. With Charles K. Smart, we proved that convex hull peeling has a continuum limit that corresponds to affine invariant curvature motion. In particular, the convex layers approximate motion by Gauss curvature to the power 1/(d+1), weighted by the point density. Using this result we were able to determine asymptotic formulas for the number of points on each convex layer for simple densities, like a Gaussian, giving a distributional theory that was lacking for convex hull peeling.

## Related Publications

- B. Cook and J. Calder.
**Rates of convergence for the continuum limit of nondominated sorting.***To appear in SIAM Journal on Mathematical Analysis*, 2021. [ arXiv ] - J. Calder and C. K. Smart.
**The limit shape of convex hull peeling.***Duke Mathematical Journal*, 169(11):2079--2124, 2020. [ arXiv ] [ Journal ] - B. Abbasi, J. Calder, and A. M. Oberman.
**Anomaly detection and classification for streaming data using partial differential equations.***SIAM Journal on Applied Mathematics*, 78(2):921-941, 2018. [ arXiv ] [ Journal ] - W. Thawinrak and J. Calder.
**High-order filtered schemes for the Hamilton-Jacobi continuum limit of nondominated sorting.***Journal of Mathematics Research*, 10(1):90--109, 2018. [ arXiv ] [ Journal ] - J. Calder.
**Numerical schemes and rates of convergence for the Hamilton-Jacobi equation continuum limit of nondominated sorting.***Numerische Mathematik*, 137(4):819--856, 2017. [ arXiv ] [ Journal ] - J. Calder.
**A direct verification argument for the Hamilton-Jacobi equation continuum limit of nondominated sorting.***Nonlinear Analysis Series A: Theory, Methods, & Applications*, 141:88--108, 2016. [ arXiv ] [ Journal ] [ PDF ] - K.-J. Hsiao, K. Xu, J. Calder, and A. O. Hero.
**Multi-criteria similarity-based anomaly detection using Pareto Depth Analysis.***IEEE Transactions on Neural Networks and Learning Systems*, 27(6):1307--1321, 2016. [ arXiv ] [ Journal ] - K.-J. Hsiao, J. Calder, and A. O. Hero.
**Pareto-depth for multiple-query image retrieval.***IEEE Transactions on Image Processing*, 24(2):583--594, 2015. [ arXiv ] [ Journal ] [ PDF ] - J. Calder.
**Directed last passage percolation with discontinuous weights.***Journal of Statistical Physics*, 158(45):903--949, 2015. [ arXiv ] [ Journal ] [ PDF ] - J. Calder, S. Esedoḡlu, and A. O. Hero.
**A PDE-based approach to nondominated sorting.***SIAM Journal on Numerical Analysis*, 53(1):82--104, 2015. [ arXiv ] [ Journal ] [ PDF ] - J. Calder, S. Esedoḡlu, and A. O. Hero.
**A continuum limit for non-dominated sorting.***Information Theory and Applications Workshop*, 2014. [ Journal ] [ PDF ] - J. Calder, S. Esedoḡlu, and A. O. Hero.
**A Hamilton-Jacobi equation for the continuum limit of non-dominated sorting.***SIAM Journal on Mathematical Analysis*, 46(1):603--638, 2014. [ arXiv ] [ Journal ] [ PDF ] - K.-J. Hsiao, K. Xu, J. Calder, and A. O. Hero.
**Multi-criteria anomaly detection using Pareto Depth Analysis.***Advances in Neural Information Processing Systems 25*:854--862, 2012. [ arXiv ] [ Journal ]

# Prediction with Expert Advice

Prediction with expert advice refers to problems in online machine learning where a player synthesizes advice from many experts to make predictions in real-time. Together with Nadejda Drenska, we studied prediction with expert advice against an adversarial environment that controls the gains of the experts and is making their choices to ensure the player achieves the largest loss (or regret) possible. This is essentially a worst-case analysis, and is used as the foundation for robust prediction systems. Applications of prediction with expert advice include algorithm boosting, stock price prediction and portfolio optimization, self-driving car software, and many other problems.

Much of the work in this field is heuristic, and based on taking hand-tuned weighted averages of expert predictions. In contrast, we are interested in provably optimal strategies for the player and environment in the asymptotic limit of a large number of plays of the game. In particular, we studied the problem of prediction of a binary sequence with history dependent experts. The history dependence introduces an additional discrete variable that lives on a directed graph called the de Bruijn graph. We identified an asymptotically optimal strategy for the player, which involves solving a continuum limit PDE and a Poisson equation over the de Bruijn graph.

## Related Publications

- J. Calder and N. Drenska.
**Asymptotically optimal strategies for online prediction with history-dependent experts.***Journal of Fourier Analysis and Applications Special Collection on Harmonic Analysis on Combinatorial Graphs*, 27(20), 2021. [ arXiv ] [ Journal ] - N. Drenska and J. Calder.
**Online prediction with history-dependent experts: The general case.***To appear in Communications on Pure and Applied Mathematics*, 2020. [ arXiv ]

# Mathematics and Anthropology

I am also interested in practical applications of machine learning to problems in science and engineering, and have begun a collaboration with Peter Olver and PhD candidate Katrina Yezzi-Woodley from the Department of Anthropology. A major open question for anthropologists is whether the geometry of broken bone fragments contains sufficient information to indicate the actor responsible for fragmentation, or whether the fragmentation is largely a product of intrinsic architecture and structure of the bone itself. Part of our collaborative work is focused on addressing this question by using machine learning to classify broken bone fragments based on agent of breakage, which may be stone tools, animal bites, or natural causes, for example. Another major problem is to automatically reassemble broken bone fragments that are found in archaeological sites, since manual refitting is extremely time-consuming. The goal of our collaboration is to apply modern geometric and data analytical tools to address these problems. The expected anthropological impacts include a better understanding of early human origins and dispersal, prehistory, culture, predator avoidance, and social organization.

Together with Peter Olver and Katrina Yezzi-Woodley, we founded the Anthropological and Mathematical Analysis of Archaeological and Zooarchaeological Evidence (AMAAZE) consortium. AMAAZE is an international consortium of anthropologists, mathematicians, and computer scientists who are working together to advance analytical methods and to use advanced mathematical methods to address important questions within archaeology and zooarchaeology. Our research project is NSF-funded (DMS grant 1816917) and has provided opportunites for many research experiences for undergraduates (REU).

## Related Publications

- K. Yezzi-Woodley, J. Calder, P. J. Olver, A. Melton, P. Cody, T. Huffstutler, A. Terwilliger, G. Tostevin, M. Tappen, and R. Coil.
**The Virtual Goniometer: A new method for measuring angles on 3D models of fragmentary bone and lithics.***Archaeological and Anthropological Sciences*, 13(106), 2021. [ arXiv ] [ Journal ] [ Code ] - R. O'Neill, P. Angulo-Umana, J. Calder, B. Hessburg, P. J. Olver, C. Shakiban, and K. Yezzi-Woodley.
**Computation of circular area and spherical volume invariants via boundary integrals.***SIAM Journal on Imaging Sciences*, 13(1):53--77, 2020. [ arXiv ] [ Journal ] [ Code ]