
Azimuthal source localization using interaural coherence in a robotic dog: modeling and application

Published online by Cambridge University Press:  15 January 2010

Rong Liu*
Affiliation:
State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150080, P.R. China; School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116024, P.R. China
Yongxuan Wang
Affiliation:
School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116024, P.R. China
*Corresponding author. E-mail: [email protected]

Summary

In nature, sounds from multiple sources, as well as reflections from surrounding surfaces, arrive concurrently from different directions at the ears of a listener. Although all of these waveforms sum at the eardrums, humans with normal hearing can effortlessly segregate sounds of interest from echoes and other background noise. This paper presents a two-microphone technique for localizing sound sources to guide robotic navigation. Its fundamental structure is adopted from a binaural signal-processing scheme employed in biological systems for the localization of sources using interaural time differences (ITDs). The two input signals are analyzed for coincidences along left/right-channel delay-line pairs. The coincidence time instants are presented as a function of the interaural coherence (IC). Specifically, we build a spherical head model for the selected robot and apply the binaural cue selection mechanism observed in the mammalian hearing system to mitigate the effects of sound echoes. The sound source is found by determining the azimuth at which the maximum of the probability density function (PDF) of the ITD cues occurs. This eliminates the localization artifacts found during tests. The experimental results of a systematic evaluation demonstrate the superior performance of the proposed method.
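
The pipeline described in the summary (per-frame ITD estimation, cue selection by interaural coherence, and a peak search over the distribution of the selected ITDs) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes plain normalized cross-correlation in place of the delay-line coincidence model, uses the correlation peak height as a stand-in for IC, and uses the Woodworth formula as one possible spherical head model. The head radius, IC threshold, frame sizes, and all function names are hypothetical. In this sketch, positive azimuth means the source is on the left (the right channel lags).

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
HEAD_RADIUS = 0.05      # m, hypothetical radius for a small robot head


def woodworth_itd(azimuth_rad, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """ITD predicted by the Woodworth spherical-head formula (far field)."""
    return (radius / c) * (azimuth_rad + np.sin(azimuth_rad))


def frame_lag_and_ic(left, right, max_lag):
    """Return (lag in samples, coherence) for one frame.

    The lag maximizes the normalized cross-correlation, and the peak
    height serves as the coherence estimate. A positive lag means the
    right channel lags the left one.
    """
    norm = np.sqrt(np.dot(left, left) * np.dot(right, right))
    if norm == 0.0:
        return 0, 0.0
    full = np.correlate(right, left, mode="full")
    zero = len(left) - 1  # index of zero lag in the "full" output
    window = full[zero - max_lag: zero + max_lag + 1] / norm
    k = int(np.argmax(window))
    return k - max_lag, float(window[k])


def localize(left, right, fs, frame_len=1024, hop=512, ic_threshold=0.9):
    """Return (itd_seconds, azimuth_degrees), or None if no frame is coherent."""
    max_lag = int(np.ceil(woodworth_itd(np.pi / 2) * fs))
    counts = np.zeros(2 * max_lag + 1, dtype=int)  # histogram over lags
    n = min(len(left), len(right))
    for start in range(0, n - frame_len + 1, hop):
        lag, ic = frame_lag_and_ic(left[start:start + frame_len],
                                   right[start:start + frame_len], max_lag)
        if ic >= ic_threshold:  # keep only cues from coherent frames
            counts[lag + max_lag] += 1
    if counts.sum() == 0:
        return None
    itd = (int(np.argmax(counts)) - max_lag) / fs  # mode of the ITD histogram
    # Invert the monotonic Woodworth mapping by interpolation on a grid.
    grid = np.linspace(-np.pi / 2, np.pi / 2, 721)
    azimuth = np.interp(itd, woodworth_itd(grid), grid)
    return itd, float(np.degrees(azimuth))
```

Calling `localize(left, right, fs)` on two equal-rate microphone signals returns the most frequent ITD among the coherent frames and the corresponding azimuth. Frames dominated by echoes or uncorrelated noise tend to have a low correlation peak and are excluded from the histogram, which is the intuition behind selecting cues by coherence.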

Type
Article
Copyright
Copyright © Cambridge University Press 2010

