
11 - Representing and Matching Scenes

from Part III - Image Understanding

Published online by Cambridge University Press:  25 October 2017

Wesley E. Snyder
Affiliation:
North Carolina State University
Hairong Qi
Affiliation:
University of Tennessee

Summary

One of these things is not like the other.

– Sesame Street

Introduction

In this chapter, rather than matching regions as we did in Chapter 10, we consider issues associated with matching scenes.

Matching at this level establishes an interpretation. That is, it puts two representations into correspondence:

  • (Section 11.2) In this section, both representations may be of the same form. For example, correlation matches an observed image with a template, an approach called template matching. Eigenimages are also a representation for images, one that uses the concepts of principal components to match images.

  • (Section 11.3) When matching scenes, we don't really want to match every single pixel, but only do matching at points that are “interesting.” This requires a definition for interest points.

  • (Sections 11.4, 11.5, and 11.6) Once the interest points are identified, these sections develop three methods, SIFT, SKS, and HoG, for describing the neighborhood of the interest points using descriptors and then matching those descriptors.

  • (Section 11.7) If the scene is represented abstractly, by nodes in graphs, methods are provided for matching graphs.

  • (Sections 11.8 and 11.9) In these sections, two other matching methods, including deformable templates, are described.

As we investigate matching scenes, or components of scenes, a new word is introduced: descriptor. This word denotes a representation for a local neighborhood in a scene, a neighborhood of perhaps 200 pixels, larger than the kernels we have thought about, but smaller than templates. The terms kernels, templates, and descriptors, while they do connote size to some extent, are really describing how this local representation is used, as the reader will see.

Matching Iconic Representations

Matching Templates to Scenes

Recall that an iconic representation of an image is an image, e.g., a smaller image, an image that is not blurred, etc. In this section, we need to match two images.

A template is a representation for an image (or sub-image) that is itself an image, but almost always smaller than the original. A template is typically moved around the target image until a location is found that optimizes some match function (a maximum for a similarity measure, a minimum for an error measure). The most obvious such function is the sum squared error, sometimes referred to as the sum-squared difference (SSD),

which provides a measure of how well the template (T) matches the image (f) at point x, y, assuming the template is N × N.
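The displayed equation is not reproduced on this page. As an illustration only, one common way to write the SSD is SSD(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} [f(x+u, y+v) - T(u, v)]^2. The zero-based indices and the choice to anchor the template's top-left corner at (x, y) are assumptions and may differ from the chapter's convention. The sketch below applies that form in plain Python. The names `ssd` and `best_match` are hypothetical, images are stored as lists of rows (so a pixel is read as `image[y][x]`), and only placements where the template fits entirely inside the image are tried.

```python
def ssd(image, template, x, y):
    """Sum-squared difference between the template and the image patch
    whose top-left corner is at column x, row y (assumed convention)."""
    n_rows = len(template)
    n_cols = len(template[0])
    total = 0
    for v in range(n_rows):
        for u in range(n_cols):
            diff = image[y + v][x + u] - template[v][u]
            total += diff * diff
    return total


def best_match(image, template):
    """Slide the template over every placement that fits inside the image
    and return (x, y, score) for the smallest SSD found."""
    n_rows, n_cols = len(template), len(template[0])
    best = None
    for y in range(len(image) - n_rows + 1):
        for x in range(len(image[0]) - n_cols + 1):
            score = ssd(image, template, x, y)
            if best is None or score < best[2]:
                best = (x, y, score)
    return best


# Toy data: the 2 x 2 template appears exactly once in the image.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 6],
]
print(best_match(image, template))  # (2, 1, 0)
```

Because SSD is an error measure, the best placement is the one with the smallest score; a score of 0 means the patch and the template agree at every pixel.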

Publisher: Cambridge University Press
Print publication year: 2017


