Lecture | Reading |
Introduction |
|
Image filtering |
Ch 3 (Szeliski) Ch 4 (Forsyth) |
Convolution |
Ch 3 (Szeliski) Ch 4 (Forsyth) wiki |
Image gradient |
Ch 3 (Szeliski) |
Edge |
Ch 4 (Szeliski) Ch 5 (Forsyth)
Canny, "A computational approach to edge detection" TPAMI (1986) |
HOG |
Ch 4 (Szeliski) Ch 5 (Forsyth)
Dalal and Triggs, "Histograms of oriented gradients for human detection" CVPR (2005)
Felzenszwalb et al., "Object detection with discriminatively trained
part based models", TPAMI (2010)
|
Image pyramid |
Ch 4, 5 (Forsyth) wiki
P.A. Bromiley, "Products and Convolutions of Gaussian Probability Density
Functions"
Burt and Adelson, "The Laplacian Pyramid as a Compact Image Code", Trans. on Comm. (1983)
E.H. Adelson et al., "Pyramid methods in image processing" (1984)
|
Scale space |
Ch 4, 5 (Forsyth) wiki
Bretzner and Lindeberg, "Feature Tracking with Automatic Selection of Spatial Scales", CVIU (1998)
|
SIFT |
Ch 5 (Forsyth)
Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", IJCV (2004)
VLFeat SIFT
|
Feature matching |
Ch 4 (Szeliski)
|
Image warping |
Ch 4, 6, 9 (Szeliski)
|
RANSAC |
Ch 6 (Szeliski)
Ch 15 (Forsyth)
Fischler and Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography" Comm. ACM. (1981)
|
Optical flow |
Ch 8 (Szeliski)
Lucas and Kanade, "An iterative image registration technique with an application to stereo vision", 1981
|
Good features to track / Harris corner |
Ch 4, 8 (Szeliski)
Shi and Tomasi, "Good Features to Track," CVPR, 1994
Harris and Stephens, "A Combined Corner and Edge Detector", Alvey Vision Conference, 1988 |
Image Alignment (Lucas-Kanade) |
Ch 8 (Szeliski)
Baker and Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework: Part 1", 2002
|
Inverse Compositional Image Alignment |
Baker and Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework: Part 1", 2002
|
Meanshift Tracking (nonparametric tracking) |
Ch 5.3 (Szeliski)
Comaniciu et al., "Kernel-Based Object Tracking", TPAMI, 2003
Cheng, "Mean Shift, Mode Seeking, and Clustering", TPAMI, 1995
Fukunaga et al., "The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition", IEEE Transactions on Information Theory, 1975
|
Dense Optical Flow |
Ch 8 (Szeliski)
Horn and Schunck, "Determining Optical Flow", Artificial Intelligence, 1981
|
Eigenfaces |
Ch 14 (Szeliski)
Turk and Pentland, "Face recognition using eigenfaces" CVPR 1991
|
Fisher's Linear Discriminant |
Ch 14 (Szeliski)
Belhumeur et al., "Eigenfaces vs. Fisherfaces: Recognition
Using Class Specific Linear Projection" PAMI 1997
|
Face Detection |
Ch 14 (Szeliski)
Viola and Jones, "Robust Real-time Object Detection", IJCV 2001
Freund and Schapire, "A Short Introduction to Boosting", 1999
|
Bag-of-Words Image Classification |
Lazebnik et al., "Beyond bags of features: spatial pyramid matching for
recognizing natural scene categories", CVPR 2006
Fei-Fei, Fergus, Torralba, "Recognizing and Learning Object Categories", Short course (link)
|
Neural Network |
Forsyth Ch 22
|
Convolutional Neural Network |
LeCun et al., "Gradient-based learning applied to document recognition", 1998
Krizhevsky et al., "ImageNet Classification with Deep Convolutional
Neural Networks", NIPS 2012
Zeiler and Fergus, "Visualizing and Understanding
Convolutional Networks", ECCV 2014
Mahendran and Vedaldi, "Understanding Deep Image Representations by Inverting Them", CVPR 2015
|
Training Convolutional Neural Network |
Stanford Visual Recognition Lecture Note by Andrej Karpathy
|
CNN Object Detection |
Stanford Visual Recognition Lecture Note by Andrej Karpathy
Girshick et al., "Rich feature hierarchies for accurate object detection and semantic segmentation"
Girshick, "Fast R-CNN"
Ren et al., "Faster R-CNN: Towards Real-Time Object Detection
with Region Proposal Networks"
Redmon et al., "You Only Look Once:
Unified, Real-Time Object Detection"
|
Structured Prediction (Human Pose) |
Felzenszwalb et al., "Object Detection with Discriminatively Trained Part Based Models"
Yang and Ramanan, "Articulated Pose Estimation with Flexible Mixtures-of-parts"
Ramakrishna et al., "Pose Machines: Articulated Pose Estimation via Inference Machines"
Toshev and Szegedy, "DeepPose: Human Pose Estimation via Deep Neural Networks"
Wei et al., "Convolutional Pose Machines"
Cao et al., "Realtime Multi-person 2D Pose Estimation using Part Affinity Fields"
|
Structured Prediction (Semantic Segmentation) |
Dumoulin and Visin, "A guide to convolution arithmetic for deep learning"
Long et al., "Fully Convolutional Networks for Semantic Segmentation"
Noh et al., "Learning Deconvolution Network for Semantic Segmentation"
Chen et al., "Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs"
Zheng et al., "Conditional Random Fields as Recurrent Neural Networks"
|
Camera Model |
Szeliski Ch 2
Forsyth Ch 1
|
Camera Projection |
Szeliski Ch 2
Forsyth Ch 1
|
Projective Line |
Szeliski Ch 2
|
Epipolar Geometry |
Szeliski Ch 7
Forsyth Ch 10
Longuet-Higgins, "A computer algorithm for reconstructing a scene from two projections", Nature 1981
|
Triangulation and Stereo |
Szeliski Ch 7
Forsyth Ch 10
|