Book contents
- Frontmatter
- Contents
- List of contributors
- 1 Multimodal signal processing for meetings: an introduction
- 2 Data collection
- 3 Microphone arrays and beamforming
- 4 Speaker diarization
- 5 Speech recognition
- 6 Sampling techniques for audio-visual tracking and head pose estimation
- 7 Video processing and recognition
- 8 Language structure
- 9 Multimodal analysis of small-group conversational dynamics
- 10 Summarization
- 11 User requirements for meeting support technology
- 12 Meeting browsers and meeting assistants
- 13 Evaluation of meeting support technology
- 14 Conclusion and perspectives
- References
- Index
6 - Sampling techniques for audio-visual tracking and head pose estimation
Published online by Cambridge University Press: 05 July 2012
Summary
Introduction
Analyzing the behavior of people in smart environments using multimodal sensors requires answering a set of typical questions: who are the people? where are they? what activities are they doing? when? with whom are they interacting? and how are they interacting? In this view, locating people or their faces and characterizing them (e.g. extracting their body or head orientation) allows us to address the first two questions (who and where), and is usually one of the first steps before applying higher-level multimodal scene analysis algorithms that address the other questions. In the last ten years, tracking algorithms have made considerable progress, particularly in indoor environments or for specific applications, where they have reached a maturity that allows their deployment in real systems and applications. Nevertheless, several issues can still make tracking difficult: background clutter and potentially small object size; complex shape, appearance, and motion, and their changes over time or across camera views; inaccurate or rough scene calibration, or inconsistent camera calibration between views for 3D tracking; and real-time processing requirements. In what follows, we discuss some important aspects of tracking algorithms and introduce the remainder of the chapter.
Scenarios and Set-ups. Scenarios and application needs strongly influence the physical environment considered, and therefore the set-up (where, how many, and what type of sensors are used) and the choice of tracking method. A first set of scenarios commonly involves tracking people in so-called smart spaces (Singh et al., 2006).
- Type: Chapter
- Information: Multimodal Signal Processing: Human Interactions in Meetings, pp. 84-102. Publisher: Cambridge University Press. Print publication year: 2012.