Additional Information
Book Details
Abstract
Appropriate for upper-division undergraduate- and graduate-level courses in computer vision found in departments of Computer Science, Computer Engineering and Electrical Engineering.
This textbook provides the most complete treatment of modern computer vision methods by two of the leading authorities in the field. This accessible presentation gives both a general view of the entire computer vision enterprise and also offers sufficient detail for students to be able to build useful applications. Students will learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods.
Table of Contents
Section Title | Page | Action | Price |
---|---|---|---|
Cover | Cover | ||
Title Page | Title | ||
Contents\r | 5 | ||
I IMAGE FORMATION\r | 31 | ||
1 Geometric Camera Models | 33 | ||
1.1 IMAGE FORMATION | 34 | ||
1.1.1 Pinhole Perspective | 34 | ||
1.1.2 Weak Perspective | 36 | ||
1.1.3 Cameras with Lenses | 38 | ||
1.1.4 The Human Eye | 42 | ||
1.2 INTRINSIC AND EXTRINSIC PARAMETERS | 44 | ||
1.2.1 Rigid Transformations and Homogeneous Coordinates | 44 | ||
1.2.2 Intrinsic Parameters | 46 | ||
1.2.3 Extrinsic Parameters | 48 | ||
1.2.4 Perspective Projection Matrices | 49 | ||
1.2.5 Weak-Perspective Projection Matrices | 50 | ||
1.3 GEOMETRIC CAMERA CALIBRATION | 52 | ||
1.3.1 ALinear Approach to Camera Calibration | 53 | ||
1.3.2 ANonlinear Approach to Camera Calibration | 57 | ||
1.4 NOTES | 59 | ||
2 \rLight and Shading | 62 | ||
2.1 MODELLING PIXEL BRIGHTNESS | 62 | ||
2.1.1 Reflection at Surfaces | 63 | ||
2.1.2 Sources and Their Effects | 64 | ||
2.1.3 The Lambertian+Specular Model | 66 | ||
2.1.4 Area Sources | 66 | ||
2.2 INFERENCE FROM SHADING | 67 | ||
2.2.1 Radiometric Calibration and High Dynamic Range Images | 68 | ||
2.2.2 The Shape of Specularities | 70 | ||
2.2.3 Inferring Lightness and Illumination | 73 | ||
2.2.4 Photometric Stereo: Shape from Multiple Shaded Images | 76 | ||
2.3 MODELLING INTERREFLECTION | 82 | ||
2.3.1 The Illumination at a Patch Due to an Area Source | 82 | ||
2.3.2 Radiosity and Exitance | 84 | ||
2.3.3 An Interreflection Model | 85 | ||
2.3.4 Qualitative Properties of Interreflections | 86 | ||
2.4 SHAPE FROM ONE SHADED IMAGE | 89 | ||
2.5 NOTES | 91 | ||
3 \rColor | 98 | ||
3.1 HUMAN COLOR PERCEPTION | 98 | ||
3.1.1 Color Matching | 98 | ||
3.1.2 Color Receptors | 101 | ||
3.2 THE PHYSICS OF COLOR | 103 | ||
3.2.1 The Color of Light Sources | 103 | ||
3.2.2 The Color of Surfaces | 106 | ||
3.3 REPRESENTING COLOR | 107 | ||
3.3.1 Linear Color Spaces | 107 | ||
3.3.2 Non-linear Color Spaces | 113 | ||
3.4 A \rMODEL OF IMAGE COLOR | 116 | ||
3.4.1 The Diffuse Term | 118 | ||
3.4.2 The Specular Term | 120 | ||
3.5 INFERENCE FROM COLOR | 120 | ||
3.5.1 Finding Specularities Using Color | 120 | ||
3.5.2 Shadow Removal Using Color | 122 | ||
3.5.3 Color Constancy: Surface Color from Image Color | 125 | ||
3.6 NOTES | 129 | ||
II EARLY VISION: JUST ONE IMAGE\r | 135 | ||
4 \rLinear Filters | 137 | ||
4.1 LINEAR FILTERS AND CONVOLUTION | 137 | ||
4.1.1 Convolution | 137 | ||
4.2 SHIFT INVARIANT LINEAR SYSTEMS | 142 | ||
4.2.1 Discrete Convolution | 143 | ||
4.2.2 Continuous Convolution | 145 | ||
4.2.3 Edge Effects in Discrete Convolutions | 148 | ||
4.3 SPATIAL FREQUENCY AND FOURIER TRANSFORMS | 148 | ||
4.3.1 Fourier Transforms | 149 | ||
4.4 SAMPLING AND ALIASING | 151 | ||
4.4.1 Sampling | 152 | ||
4.4.2 Aliasing | 155 | ||
4.4.3 Smoothing and Resampling | 156 | ||
4.5 FILTERS AS TEMPLATES | 161 | ||
4.5.1 Convolution as a Dot Product | 161 | ||
4.5.2 Changing Basis | 162 | ||
4.6 TECHNIQUE: NORMALIZED CORRELATION AND FINDING PATTERNS | 162 | ||
4.6.1 Controlling the Television by Finding Hands by Normalized Correlation\r | 163 | ||
4.7 TECHNIQUE: SCALE AND IMAGE PYRAMIDS | 164 | ||
4.7.1 The Gaussian Pyramid | 165 | ||
4.7.2 Applications of Scaled Representations | 166 | ||
4.8 NOTES | 167 | ||
5 \rLocal Image Features | 171 | ||
5.1 COMPUTING THE IMAGE GRADIENT | 171 | ||
5.1.1 Derivative of Gaussian Filters | 172 | ||
5.2 REPRESENTING THE IMAGE GRADIENT | 174 | ||
5.2.1 Gradient-Based Edge Detectors | 175 | ||
5.2.2 Orientations | 177 | ||
5.3 FINDING CORNERS AND BUILDING NEIGHBORHOODS | 178 | ||
5.3.1 Finding Corners | 179 | ||
5.3.2 Using Scale and Orientation to Build a Neighborhood | 181 | ||
5.4 DESCRIBING NEIGHBORHOODS WITH SIFT AND HOG FEATURES | 185 | ||
5.4.1 SIFT Features | 187 | ||
5.4.2 HOG Features | 189 | ||
5.5 COMPUTING LOCAL FEATURES IN PRACTICE | 190 | ||
5.6 NOTES | 190 | ||
6 \rTexture | 194 | ||
6.1 LOCAL TEXTURE REPRESENTATIONS USING FILTERS | 196 | ||
6.1.1 Spots and Bars | 197 | ||
6.1.2 From Filter Outputs to Texture Representation | 198 | ||
6.1.3 Local Texture Representations in Practice | 200 | ||
6.2 POOLED TEXTURE REPRESENTATIONS BY DISCOVERING TEXTONS | 201 | ||
6.2.1 Vector Quantization and Textons | 202 | ||
6.2.2 K-means Clustering for Vector Quantization | 202 | ||
6.3 SYNTHESIZING TEXTURES AND FILLING HOLES IN IMAGES | 206 | ||
6.3.1 Synthesis by Sampling Local Models | 206 | ||
6.3.2 Filling in Holes in Images | 209 | ||
6.4 IMAGE DENOISING | 212 | ||
6.4.1 Non-local Means | 213 | ||
6.4.2 Block Matching 3D (BM3D) | 213 | ||
6.4.3 Learned Sparse Coding | 214 | ||
6.4.4 Results | 216 | ||
6.5 SHAPE FROM TEXTURE | 217 | ||
6.5.1 Shape from Texture for Planes | 217 | ||
6.5.2 Shape from Texture for Curved Surfaces | 220 | ||
6.6 NOTES | 221 | ||
III \rEARLY VISION: MULTIPLEIMAGES | 225 | ||
7 \rStereopsis | 227 | ||
7.1 BINOCULAR CAMERA GEOMETRY AND THE EPIPOLAR CONSTRAINT | 228 | ||
7.1.1 Epipolar Geometry | 228 | ||
7.1.2 The Essential Matrix | 230 | ||
7.1.3 The Fundamental Matrix | 231 | ||
7.2 BINOCULAR RECONSTRUCTION | 231 | ||
7.2.1 Image Rectification | 232 | ||
7.3 HUMAN STEREOPSIS | 233 | ||
7.4 LOCAL METHODS FOR BINOCULAR FUSION | 235 | ||
7.4.1 Correlation | 235 | ||
7.4.2 Multi-Scale Edge Matching | 237 | ||
7.5 GLOBAL METHODS FOR BINOCULAR FUSION | 240 | ||
7.5.1 Ordering Constraints and Dynamic Programming | 240 | ||
7.5.2 Smoothness Constraints and Combinatorial Optimization over Graphs | 241 | ||
7.6 USING MORE CAMERAS | 244 | ||
7.7 APPLICATION: ROBOT NAVIGATION | 245 | ||
7.8 NOTES | 246 | ||
8 \rStructure from Motion | 251 | ||
8.1 INTERNALLY CALIBRATED PERSPECTIVE CAMERAS | 251 | ||
8.1.1 Natural Ambiguity of the Problem | 253 | ||
8.1.2 Euclidean Structure and Motion from Two Images | 254 | ||
8.1.3 Euclidean Structure and Motion from Multiple Images | 258 | ||
8.2 UNCALIBRATED WEAK-PERSPECTIVE CAMERAS | 260 | ||
8.2.1 Natural Ambiguity of the Problem | 261 | ||
8.2.2 Affine Structure and Motion from Two Images | 263 | ||
8.2.3 Affine Structure and Motion from Multiple Images | 267 | ||
8.2.4 From Affine to Euclidean Shape | 268 | ||
8.3 UNCALIBRATED PERSPECTIVE CAMERAS | 270 | ||
8.3.1 Natural Ambiguity of the Problem | 271 | ||
8.3.2 Projective Structure and Motion from Two Images | 272 | ||
8.3.3 Projective Structure and Motion from Multiple Images | 274 | ||
8.3.4 From Projective to Euclidean Shape | 276 | ||
8.4 NOTES | 278 | ||
IV \rMID-LEVEL VISION | 283 | ||
9 \rSegmentation by Clustering | 285 | ||
9.1 HUMAN VISION: GROUPING AND GESTALT | 286 | ||
9.2 IMPORTANT APPLICATIONS | 291 | ||
9.2.1 Background Subtraction | 291 | ||
9.2.2 Shot Boundary Detection | 294 | ||
9.2.3 Interactive Segmentation | 295 | ||
9.2.4 Forming Image Regions | 296 | ||
9.3 IMAGE SEGMENTATION BY CLUSTERING PIXELS | 298 | ||
9.3.1 Basic Clustering Methods | 299 | ||
9.3.2 The Watershed Algorithm | 301 | ||
9.3.3 Segmentation Using K-means | 302 | ||
9.3.4 Mean Shift: Finding Local Modes in Data | 303 | ||
9.3.5 Clustering and Segmentation with Mean Shift | 305 | ||
9.4 SEGMENTATION, CLUSTERING, AND GRAPHS | 307 | ||
9.4.1 Terminology and Facts for Graphs | 307 | ||
9.4.2 Agglomerative Clustering with a Graph | 309 | ||
9.4.3 Divisive Clustering with a Graph | 311 | ||
9.4.4 Normalized Cuts | 314 | ||
9.5 IMAGE SEGMENTATION IN PRACTICE | 315 | ||
9.5.1 Evaluating Segmenters | 316 | ||
9.6 NOTES | 317 | ||
10 \rGrouping and Model Fitting | 320 | ||
10.1 THE HOUGH TRANSFORM | 320 | ||
10.1.1 Fitting Lines with the Hough Transform | 320 | ||
10.1.2 Using the Hough Transform | 322 | ||
10.2 FITTING LINES AND PLANES | 323 | ||
10.2.1 Fitting a Single Line | 324 | ||
10.2.2 Fitting Planes | 325 | ||
10.2.3 Fitting Multiple Lines | 326 | ||
10.3 FITTING CURVED STRUCTURES | 327 | ||
10.4 Robustness | 329 | ||
10.4.1 M-Estimators | 330 | ||
10.4.2 RANSAC: Searching for Good Points | 332 | ||
10.5 FITTING USING PROBABILISTIC MODELS | 336 | ||
10.5.1 Missing Data Problems | 337 | ||
10.5.2 Mixture Models and Hidden Variables | 339 | ||
10.5.3 The EM Algorithm for Mixture Models | 340 | ||
10.5.4 Difficulties with the EM Algorithm | 342 | ||
10.6 MOTION SEGMENTATION BY PARAMETER ESTIMATION | 343 | ||
10.6.1 Optical Flow and Motion | 345 | ||
10.6.2 Flow Models | 346 | ||
10.6.3 Motion Segmentation with Layers | 347 | ||
10.7 MODEL SELECTION: WHICH MODEL IS THE BEST FIT? | 349 | ||
10.7.1 Model Selection Using Cross-Validation | 352 | ||
10.8 NOTES | 352 | ||
11 \rTracking | 356 | ||
11.1 SIMPLE TRACKING STRATEGIES | 357 | ||
11.1.1 Tracking by Detection | 357 | ||
11.1.2 Tracking Translations by Matching | 360 | ||
11.1.3 Using Affine Transformations to Confirm a Match | 362 | ||
11.2 TRACKING USING MATCHING | 364 | ||
11.2.1 Matching Summary Representations | 365 | ||
11.2.2 Tracking Using Flow | 367 | ||
11.3 TRACKING LINEAR DYNAMICAL MODELS WITH KALMAN FILTERS | 369 | ||
11.3.1 Linear Measurements and Linear Dynamics | 370 | ||
11.3.2 The Kalman Filter | 374 | ||
11.3.3 Forward-backward Smoothing | 375 | ||
11.4 DATA ASSOCIATION | 379 | ||
11.4.1 Linking Kalman Filters with Detection Methods | 379 | ||
11.4.2 Key Methods of Data Association | 380 | ||
11.5 PARTICLE FILTERING | 380 | ||
11.5.1 Sampled Representations of Probability Distributions | 381 | ||
11.5.2 The Simplest Particle Filter | 385 | ||
11.5.3 The Tracking Algorithm | 386 | ||
11.5.4 A \rWorkable Particle Filter | 388 | ||
11.5.5 \rPractical Issues in Building Particle Filters | 390 | ||
11.6 NOTES | 392 | ||
V\r HIGH-LEVEL VISION | 395 | ||
12 \rRegistration | 397 | ||
12.1 REGISTERING RIGID OBJECTS | 398 | ||
12.1.1 Iterated Closest Points | 398 | ||
12.1.2 Searching for Transformations via Correspondences | 399 | ||
12.1.3 Application: Building Image Mosaics | 400 | ||
12.2 MODEL-BASED VISION: REGISTERING RIGID OBJECTS WITH PROJECTION | 405 | ||
12.2.1 Verification: Comparing Transformed and Rendered Source to Target\r | 407 | ||
12.3 REGISTERING DEFORMABLE OBJECTS | 408 | ||
12.3.1 Deforming Texture with Active Appearance Models | 408 | ||
12.3.2 Active Appearance Models in Practice | 411 | ||
12.3.3 Application: Registration in Medical Imaging Systems | 413 | ||
12.4 NOTES | 418 | ||
13 \rSmooth Surfaces and Their Outlines | 421 | ||
13.1 ELEMENTS OF DIFFERENTIAL GEOMETRY | 423 | ||
13.1.1 Curves | 423 | ||
13.1.2 Surfaces | 427 | ||
13.2 CONTOUR GEOMETRY | 432 | ||
13.2.1 The Occluding Contour and the Image Contour | 432 | ||
13.2.2 The Cusps and Inflections of the Image Contour | 433 | ||
13.2.3 Koenderink’s Theorem | 434 | ||
13.3 VISUAL EVENTS: MORE DIFFERENTIAL GEOMETRY | 437 | ||
13.3.1 The Geometry of the Gauss Map | 437 | ||
13.3.2 Asymptotic Curves | 439 | ||
13.3.3 The Asymptotic Spherical Map | 440 | ||
13.3.4 Local Visual Events | 442 | ||
13.3.5 The Bitangent Ray Manifold | 443 | ||
13.3.6 Multilocal Visual Events | 444 | ||
13.3.7 The Aspect Graph | 446 | ||
13.4 NOTES | 447 | ||
14 \rRange Data | 452 | ||
14.1 ACTIVE RANGE SENSORS | 452 | ||
14.2 RANGE DATA SEGMENTATION | 454 | ||
14.2.1 Elements of Analytical Differential Geometry | 454 | ||
14.2.2 Finding Step and Roof Edges in Range Images | 456 | ||
14.2.3 Segmenting Range Images into Planar Regions | 461 | ||
14.3 RANGE IMAGE REGISTRATION AND MODEL ACQUISITION | 462 | ||
14.3.1 Quaternions | 463 | ||
14.3.2 Registering Range Images | 464 | ||
14.3.3 Fusing Multiple Range Images | 466 | ||
14.4 OBJECT RECOGNITION | 468 | ||
14.4.1 Matching Using Interpretation Trees \r | 468 | ||
14.4.2 Matching Free-Form Surfaces Using Spin Images | 471 | ||
14.5 KINECT | 476 | ||
14.5.1 Features | 477 | ||
14.5.2 Technique: Decision Trees and Random Forests | 478 | ||
14.5.3 Labeling Pixels | 480 | ||
14.5.4 Computing Joint Positions | 483 | ||
14.6 NOTES | 483 | ||
15 \rLearning to Classify | 487 | ||
15.1 CLASSIFICATION, ERROR, AND LOSS | 487 | ||
15.1.1 Using Loss to Determine Decisions | 487 | ||
15.1.2 Training Error, Test Error, and Overfitting | 489 | ||
15.1.3 Regularization | 490 | ||
15.1.4 Error Rate and Cross-Validation | 493 | ||
15.1.5 Receiver Operating Curves | 495 | ||
15.2 MAJOR CLASSIFICATION STRATEGIES | 497 | ||
15.2.1 Example: Mahalanobis Distance\r | 497 | ||
15.2.2 Example: Class-Conditional Histograms and Naive Bayes | 498 | ||
15.2.3 Example: Classification Using Nearest Neighbors \r | 499 | ||
15.2.4 Example: The Linear Support Vector Machine | 500 | ||
15.2.5 Example: Kernel Machines | 503 | ||
15.2.6 Example: Boosting and Adaboost | 505 | ||
15.3 PRACTICAL METHODS FOR BUILDING CLASSIFIERS | 505 | ||
15.3.1 Manipulating Training Data to Improve Performance | 507 | ||
15.3.2 Building Multi-Class Classifiers Out of Binary Classifiers | 509 | ||
15.3.3 Solving for SVMS and Kernel Machines | 510 | ||
15.4 NOTES | 511 | ||
16 classifying Images\r | 512 | ||
16.1 BUILDING GOOD IMAGE FEATURES | 512 | ||
16.1.1 Example Applications | 512 | ||
16.1.2 Encoding Layout with GIST Features | 515 | ||
16.1.3 Summarizing Images with Visual Words | 517 | ||
16.1.4 The Spatial Pyramid Kernel | 519 | ||
16.1.5 Dimension Reduction with Principal Components | 523 | ||
16.1.6 Dimension Reduction with Canonical Variates | 524 | ||
16.1.7 Example Application: Identifying Explicit Images | 528 | ||
16.1.8 Example Application: Classifying Materials | 532 | ||
16.1.9 Example Application: Classifying Scenes | 532 | ||
16.2 CLASSIFYING IMAGES OF SINGLE OBJECTS | 534 | ||
16.2.1 Image Classification Strategies | 535 | ||
16.2.2 Evaluating Image Classification Systems | 535 | ||
16.2.3 Fixed Sets of Classes | 538 | ||
16.2.4 Large Numbers of Classes | 539 | ||
16.2.5 Flowers, Leaves, and Birds: Some Specialized Problems | 541 | ||
16.3 IMAGE CLASSIFICATION IN PRACTICE | 542 | ||
16.3.1 Codes for Image Features | 543 | ||
16.3.2 Image Classification Datasets | 543 | ||
16.3.3 Dataset Bias | 545 | ||
16.3.4 Crowdsourcing Dataset Collection | 545 | ||
16.4 NOTES | 547 | ||
17 \rDetecting Objects in Images | 549 | ||
17.1 THE SLIDING WINDOW METHOD | 549 | ||
17.1.1 Face Detection | 550 | ||
17.1.2 Detecting Humans | 555 | ||
17.1.3 Detecting Boundaries | 557 | ||
17.2 DETECTING DEFORMABLE OBJECTS | 560 | ||
17.3 THE STATE OF THE ART OF OBJECT DETECTION | 565 | ||
17.3.1 Datasets and Resources | 568 | ||
17.4 NOTES | 569 | ||
18 \rTopics in Object Recognition | 570 | ||
18.1 WHAT SHOULD OBJECT RECOGNITION DO? | 570 | ||
18.1.1 What Should an Object Recognition System Do? | 570 | ||
18.1.2 Current Strategies for Object Recognition | 572 | ||
18.1.3 What Is Categorization? | 572 | ||
18.1.4 Selection: What Should Be Described? | 574 | ||
18.2 FEATURE QUESTIONS | 574 | ||
18.2.1 Improving Current Image Features | 574 | ||
18.2.2 Other Kinds of Image Feature | 576 | ||
18.3 Geometric Questions | 577 | ||
18.4 SEMANTIC QUESTIONS | 579 | ||
18.4.1 Attributes and the Unfamiliar | 580 | ||
18.4.2 Parts, Poselets and Consistency | 581 | ||
18.4.3 Chunks of Meaning\r | 584 | ||
VI APPLICATIONS AND TOPICS\r | 587 | ||
19 \rImage-Based Modeling andRendering | 589 | ||
19.1 VISUAL HULLS | 589 | ||
19.1.1 Main Elements of the Visual Hull Model | 591 | ||
19.1.2 Tracing Intersection Curves | 593 | ||
19.1.3 Clipping Intersection Curves | 596 | ||
19.1.4 Triangulating Cone Strips | 597 | ||
19.1.5 Results | 598 | ||
19.1.6 Going Further: Carved Visual Hulls | 602 | ||
19.2 PATCH-BASED MULTI-VIEW STEREOPSIS | 603 | ||
19.2.1 Main Elements of the PMVS Model | 605 | ||
19.2.2 Initial Feature Matching | 608 | ||
19.2.3 Expansion | 609 | ||
19.2.4 Filtering | 610 | ||
19.2.5 Results | 611 | ||
19.3 THE LIGHT FIELD | 614 | ||
19.4 NOTES | 617 | ||
20 \rLooking at People | 620 | ||
20.1 HMM’S, DYNAMIC PROGRAMMING, AND TREE-STRUCTURED MODELS | 620 | ||
20.1.1 Hidden Markov Models | 620 | ||
20.1.2 Inference for an HMM | 622 | ||
20.1.3 Fitting an HMM with EM | 627 | ||
20.1.4 Tree-Structured Energy Models | 630 | ||
20.2 PARSING PEOPLE IN IMAGES | 632 | ||
20.2.1 Parsing with Pictorial Structure Models | 632 | ||
20.2.2 Estimating the Appearance of Clothing | 634 | ||
20.3 TRACKING PEOPLE | 636 | ||
20.3.1 Why Human Tracking Is Hard | 636 | ||
20.3.2 Kinematic Tracking by Appearance | 638 | ||
20.3.3 Kinematic Human Tracking Using Templates | 639 | ||
20.4 3D FROM 2D: LIFTING | 641 | ||
20.4.1 Reconstruction in an Orthographic View | 641 | ||
20.4.2 Exploiting Appearance for Unambiguous Reconstructions | 643 | ||
20.4.3 Exploiting Motion for Unambiguous Reconstructions \r | 645 | ||
20.5 ACTIVITY RECOGNITION | 647 | ||
20.5.1 Background: Human Motion Data | 647 | ||
20.5.2 Body Configuration and Activity Recognition | 651 | ||
20.5.3 Recognizing Human Activities with Appearance Features | 652 | ||
20.5.4 Recognizing Human Activities with Compositional Models | 654 | ||
20.6 RESOURCES | 654 | ||
20.7 NOTES | 656 | ||
21 \rImage Search and Retrieval | 657 | ||
21.1 THE APPLICATION CONTEXT | 657 | ||
21.1.1 Applications | 658 | ||
21.1.2 User Needs | 659 | ||
21.1.3 Types of Image Query | 660 | ||
21.1.4 What Users Do with Image Collections | 661 | ||
21.2 BASIC TECHNOLOGIES FROM INFORMATION RETRIEVAL | 662 | ||
21.2.1 Word Counts | 662 | ||
21.2.2 Smoothing Word Counts | 663 | ||
21.2.3 Approximate Nearest Neighbors and Hashing | 664 | ||
21.2.4 Ranking Documents | 668 | ||
21.3 IMAGES AS DOCUMENTS | 669 | ||
21.3.1 Matching Without Quantization | 670 | ||
21.3.2 Ranking Image Search Results | 671 | ||
21.3.4 Laying Out Images for Browsing | 674 | ||
21.4 PREDICTING ANNOTATIONS FOR PICTURES | 675 | ||
21.4.1 Annotations from Nearby Words | 676 | ||
21.4.2 Annotations from the Whole Image | 676 | ||
21.4.3 Predicting Correlated Words with Classifiers | 678 | ||
21.4.4 Names and Faces | 679 | ||
21.4.5 Generating Tags with Segments | 681 | ||
21.5 THE STATE OF THE ART OF WORD PREDICTION | 684 | ||
21.5.1 Resources | 685 | ||
21.5.2 Comparing Methods | 685 | ||
21.5.3 Open Problems | 686 | ||
21.6 NOTES | 689 | ||
VII BACKGROUND MATERIAL\r | 691 | ||
22 \rOptimization Techniques | 693 | ||
22.1 LINEAR LEAST-SQUARES METHODS | 693 | ||
22.1.1 Normal Equations and the Pseudoinverse | 694 | ||
22.1.2 Homogeneous Systems and Eigenvalue Problems | 695 | ||
22.1.3 Generalized Eigenvalues Problems | 696 | ||
22.1.4 An Example: Fitting a Line to Points in a Plane | 696 | ||
22.1.5 Singular Value Decomposition | 697 | ||
22.2 NONLINEAR LEAST-SQUARES METHODS | 699 | ||
22.2.1 Newton’s Method: Square Systems of Nonlinear Equations. | 700 | ||
22.2.2 Newton’s Method for Overconstrained Systems\r | 700 | ||
22.2.3 The Gauss–Newton and Levenberg–Marquardt Algorithms | 701 | ||
22.3 SPARSE CODING AND DICTIONARY LEARNING | 702 | ||
22.3.1 Sparse Coding \r | 702 | ||
22.3.2 Dictionary Learning | 703 | ||
22.3.3 Supervised Dictionary Learning | 705 | ||
22.4 MIN-CUT/MAX-FLOW PROBLEMS AND COMBINATORIAL OPTIMIZATION | 705 | ||
22.4.1 Min-Cut Problems | 706 | ||
22.4.2 Quadratic Pseudo-Boolean Functions | 707 | ||
22.4.3 Generalization to Integer Variables | 709 | ||
22.5 NOTES | 712 | ||
Bibliography | 714 | ||
Index | 767 | ||
List of Algorithms | 790 |