Spring 2019 CSCI 5561
Computer Vision

Mon/Wed 9:45am-11:00am @ Keller Hall 3-111

Information
Syllabus

Instructor: Hyun Soo Park (hspark at umn.edu)
Office hours: Mon/Wed 4:00pm-5:00pm (Shepherd Laboratory 261)

TA: Jingfan Guo (guo00109 at umn.edu)
Office hours: Tue/Thu 4:00pm-5:00pm (Shepherd Laboratory 234)

Textbook: Not required, but the following books will be referred to frequently:
+ "Computer Vision: Algorithms and Applications", Richard Szeliski
+ "Computer Vision: A Modern Approach", David A. Forsyth and Jean Ponce

The course materials are inspired by the slides of Jianbo Shi (UPenn), Kris Kitani (CMU), James Hays (GATECH), Kristen Grauman (UT Austin), Steve Seitz (UW), and Robert Collins (PSU).

Slide

Lecture Reading

Introduction

Image filtering Ch 3 (Szeliski)
Ch 4 (Forsyth)

Convolution Ch 3 (Szeliski)
Ch 4 (Forsyth)
wiki

Image gradient Ch 3 (Szeliski)

Edge Ch 4 (Szeliski)
Ch 5 (Forsyth)
Canny, "A computational approach to edge detection" TPAMI (1986)

HOG Ch 4 (Szeliski)
Ch 5 (Forsyth)
Dalal and Triggs, "Histograms of oriented gradients for human detection" CVPR (2005)
Felzenszwalb et al., "Object detection with discriminatively trained part based models", TPAMI (2010)

Image pyramid Ch 4, 5 (Forsyth)
wiki
P.A. Bromiley, "Products and Convolutions of Gaussian Probability Density Functions"
Burt and Adelson, "The Laplacian Pyramid as a Compact Image Code", Trans. on Comm. (1983)
E.H. Adelson et al., "Pyramid methods in image processing" (1984)

Scale space Ch 4, 5 (Forsyth)
wiki
Bretzner and Lindeberg, "Feature Tracking with Automatic Selection of Spatial Scales", CVIU (1998)

SIFT Ch 5 (Forsyth)
Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", IJCV (2004)
VLfeat SIFT

Feature matching Ch 4 (Szeliski)

Image warping Ch 4, 6, 9 (Szeliski)

RANSAC Ch 6 (Szeliski)
Ch 15 (Forsyth)
Fischler and Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", Comm. ACM (1981)

Optical flow Ch 8 (Szeliski)
Lucas and Kanade, "An iterative image registration technique with an application to stereo vision", 1981

Good features to track / Harris corner Ch 4, 8 (Szeliski)
Shi and Tomasi, "Good Features to Track," CVPR, 1994
Harris and Stephens, "A Combined Corner and Edge Detector", Alvey Vision Conference, 1988

Image Alignment (Lucas-Kanade) Ch 8 (Szeliski)
Baker and Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework: Part 1", 2002

Inverse Compositional Image Alignment Baker and Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework: Part 1", 2002

Meanshift Tracking (nonparametric tracking) Ch 5.3 (Szeliski)
Comaniciu et al., "Kernel-Based Object Tracking", TPAMI, 2003
Cheng, "Mean Shift, Mode Seeking, and Clustering", TPAMI, 1995
Fukunaga et al., "The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition", IEEE Transactions on Information Theory, 1975

Dense Optical Flow Ch 8 (Szeliski)
Horn and Schunck, "Determining Optical Flow", Artificial Intelligence, 1981

Eigenfaces Ch 14 (Szeliski)
Turk and Pentland, "Face recognition using eigenfaces" CVPR 1991

Fisher's Linear Discriminant Ch 14 (Szeliski)
Belhumeur et al., "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection" PAMI 1997

Face Detection Ch 14 (Szeliski)
Viola and Jones, "Robust Real-time Object Detection", IJCV 2001
Freund and Schapire, "A Short Introduction to Boosting", 1999

Bag-of-Words Image Classification Lazebnik et al., "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories", CVPR 2006
Fei-Fei, Fergus, and Torralba, "Recognizing and Learning Object Categories", Short course (link)

Neural Network Forsyth Ch 22

Convolutional Neural Network LeCun et al., "Gradient-based learning applied to document recognition", 1998
Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks", NIPS 2012
Zeiler and Fergus, "Visualizing and Understanding Convolutional Networks", ECCV 2014
Mahendran and Vedaldi, "Understanding Deep Image Representations by Inverting Them", CVPR 2014

Training Convolutional Neural Network Stanford Visual Recognition Lecture Note by Andrej Karpathy

CNN Object Detection Stanford Visual Recognition Lecture Note by Andrej Karpathy
Girshick et al., "Rich feature hierarchies for accurate object detection and semantic segmentation"
Girshick, "Fast R-CNN"
Ren et al., "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"
Redmon et al., "You Only Look Once: Unified, Real-Time Object Detection"

Camera Model Szeliski Ch 2
Forsyth Ch 1

Camera Projection Szeliski Ch 2
Forsyth Ch 1

Projective Line Szeliski Ch 2

Epipolar Geometry Szeliski Ch 7
Forsyth Ch 10
Longuet-Higgins, "A computer algorithm for reconstructing a scene from two projections", Nature 1981

Triangulation Szeliski Ch 7

Stereo Reconstruction Szeliski Ch 7
Forsyth Ch 10

Final Exam Review

Homework

HW #1: Histogram of Oriented Gradients (last update Feb 2)

HW #2: Registration (Inverse Compositional Image Tracking)

HW #3: Scene Recognition (BoW)

HW #4: Digit Recognition (CNN)

HW #5: Stereo Reconstruction

Scholastic misconduct
Scholastic misconduct is broadly defined as "any act that violates the right of another student in academic work or that involves misrepresentation of your own work. Scholastic dishonesty includes, (but is not necessarily limited to): cheating on assignments or examinations; plagiarizing, which means misrepresenting as your own work any part of work done by another; submitting the same paper, or substantially similar papers, to meet the requirements of more than one course without the approval and consent of all instructors concerned; depriving another student of necessary course materials; or interfering with another student's work."
