NUMERICAL LINEAR ALGEBRA IN DATA EXPLORATION

CSci 8363 -- Spring 2005

TTh 4-5:15pm, EE/CS 3-111

Daniel Boley

Call Number 64994

Linear algebra has contributed many methods for handling very large quantities of numerical data. This course examines these methods and how they have been applied to the exploration and analysis of very large data collections. Part of the class will be an overview of linear algebra methods for large sparse matrix problems, such as eigenvalue and singular value problems. The bulk of the class, however, will be devoted to how these methods have been used in information retrieval, data mining, unsupervised clustering, and the like. Examples of methods we will examine are Latent Semantic Indexing, Linear Least Squares Fit, Principal Direction Divisive Partitioning, Hubs and Authorities Analysis, Support Vector Machines, and recent ideas on non-negative matrix decompositions. A collection of basic research papers, some of a tutorial nature, will be used for the class. Examples will be taken from vision recognition systems, biological gene analysis, and document retrieval.

Prerequisites and Work Plan

Students should be familiar with basic linear algebra concepts and methods such as Gaussian elimination for systems of linear equations. Exposure to concepts such as matrix eigenvalues and matrix least squares problems will be very useful, but we will review the principal numerical methods used to solve them. We will also review the concepts and methods from information retrieval, pattern recognition, and machine learning as the topics are encountered in class. Students will be expected to read a research paper and/or do a project, with some results presented in class.

For Further Information

Contact Daniel Boley, 6-209 EE/CS Bldg, boley@cs.umn.edu, 625-3887, http://www-users.cs.umn.edu/~boley