High-dimensional data sets often exhibit low-dimensional structures. Traditional methods for recovering these structures fail for many important applications in imaging. The workshop will present emerging approaches for geometric data modeling, while targeting important imaging applications. The topics to be discussed include two alternatives to PCA (one robustly recovers a low-rank approximation and the other accounts for higher-order cumulants), fast modeling of data with several affine subspaces (via a multiscale geometric procedure), and modeling data by mixtures of manifolds based on semi-supervised information. The applications include optical flow estimation from video, face recognition, motion segmentation, and identification of hand-written digits.
The minisymposium is funded by ONR and will be held as a part of the 2010 SIAM Imaging Meeting.
Organizer: Gilad Lerman, University of Minnesota, Twin Cities
The official program for the minisymposium (MS14) is online.
2:00-2:25 Robust Recovery of Low-Rank Structure from Imagery Data
John Wright, University of Illinois at Urbana-Champaign
Abstract: Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. This problem can be formally stated as one of recovering a low-rank matrix from large but sparse corruption. In this talk, we discuss recent progress on this problem. Recent theoretical results show that under fairly broad circumstances it can be exactly and efficiently solved by convex programming. We discuss fast and scalable solutions to this convex program, via proximal gradient algorithms as well as duality. We then show how these new tools impact real imaging applications, including automatic face recognition and optical flow estimation from video. These applications show the power of the tools, but also demand non-trivial extensions to nonlinear and manifold structure (e.g., due to transformations of domain).
2:30-2:55 Image Processing with Higher-order Statistics on Grassmannians
Lek-Heng Lim, University of California, Berkeley
Abstract: We identify a subspace, i.e., a point on a Grassmannian, that best describes a collection of images by simultaneously accounting for variation in cumulants of all orders. For comparison, PCA-based techniques (e.g., Eigenfaces and Fisherfaces) take into account only the second-order cumulant, i.e., covariance. While ICA-based techniques do use higher-order cumulants, they invariably make strong assumptions on the higher-order cumulants (e.g., E[XYZ]=0). We try to model such irreducible higher-order dependence rather than assuming it is zero. A limited-memory quasi-Newton method on the Grassmannian is used to perform the requisite optimization.
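The quantity E[XYZ] mentioned above is an entry of the third-order cumulant tensor; for mean-centered variables the third cumulant coincides with the third central moment. A small sketch of how it can be estimated from data (our own illustration, not the speaker's code):

```python
import numpy as np

def third_cumulant(X):
    """Third-order cumulant tensor K[i, j, k] = E[x_i x_j x_k]
    of the columns of X (samples in rows), after mean-centering.
    For centered data this equals the third central moment."""
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    return np.einsum('ti,tj,tk->ijk', Xc, Xc, Xc) / n
```

For jointly Gaussian data all third-order cumulants vanish, which is exactly the kind of assumption ICA-based techniques impose and that this talk avoids.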
3:00-3:25 Hybrid Linear Modeling by Multiscale Geometric Analysis
Arthur Szlam, Courant Institute, New York University
Abstract: The hybrid linear modeling problem is to identify a collection of d-dimensional affine subspaces in a D-dimensional Euclidean space. It arises, for example, in object tracking and structure from motion. The hybrid linear model can be considered the second simplest (behind linear) manifold model of data. In this talk we will present a very simple geometric method for hybrid linear modeling based on selecting a set of local best-fit flats that minimize a global l1 error measure. The size of the local neighborhoods is determined automatically using Jones' l2 beta numbers; it is proven under certain geometric conditions that good local neighborhoods exist and are found by our method. We give extensive experimental evidence demonstrating the state-of-the-art accuracy and speed of the algorithm on synthetic and real hybrid linear data.
3:30-3:55 Multi-manifold Semi-supervised Learning
Aarti Singh, Carnegie Mellon University
Abstract: Semi-supervised learning on a single manifold has been the subject of intense study. I will discuss the setting of multiple manifolds, in which it is assumed that the target function is smooth with respect to the geodesic distance on each manifold, and the manifolds can intersect or partly overlap. I will discuss a semi-supervised learning algorithm that separates different manifolds into decision sets, and performs supervised learning within each set. Using a finite-sample analysis, it is possible to quantify the potential gain of using unlabeled data in this multi-manifold setting. I will also present a practical version of the algorithm that involves a novel application of the Hellinger distance and size-constrained spectral clustering.
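The Hellinger distance mentioned in the abstract compares two probability distributions and is bounded in [0, 1]; in its discrete form it is H(p, q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2. A minimal sketch (how the talk applies it to separate manifolds is not reproduced here):

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions:
    H(p, q) = (1/sqrt(2)) * || sqrt(p) - sqrt(q) ||_2, in [0, 1]."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()                       # normalize to probabilities
    q = q / q.sum()
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)
```

It equals 0 for identical distributions and 1 for distributions with disjoint support, which makes it a natural dissimilarity for deciding whether two local neighborhoods come from the same manifold.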
For further information on this event, please email Gilad Lerman at lerman(at)umn.edu.