Scale-invariant optical flow in tracking using a pan-tilt-zoom camera

Salam Dhou; Yuichi Motai

doi:10.1017/S0263574714002665

Published online by Cambridge University Press: 09 December 2014

Salam Dhou: Department of Electrical and Computer Engineering, Virginia Commonwealth University, Richmond, VA 23284-3068, USA
Yuichi Motai*: Department of Electrical and Computer Engineering, Virginia Commonwealth University, Richmond, VA 23284-3068, USA
*Corresponding author. E-mail: [email protected]

Summary

An efficient method for tracking a target with a single Pan-Tilt-Zoom (PTZ) camera is proposed. The proposed Scale-Invariant Optical Flow (SIOF) method estimates the motion of the target and rotates the camera accordingly to keep the target at the center of the image. SIOF also estimates the scale of the target and changes the focal length accordingly to adjust the Field of View (FoV), so that the target appears at the same size in all captured frames. SIOF is a feature-based tracking method. Feature points are extracted and tracked using Optical Flow (OF) and the Scale-Invariant Feature Transform (SIFT), then combined in groups to achieve robust tracking. The feature points in these groups are used within a twist model to recover the 3D free motion of the target. The merits of the proposed method are (i) an efficient scale-invariant tracking method that tracks the target and keeps it in the FoV of the camera at the same size, and (ii) tracking with prediction and correction, which speeds up the PTZ control and yields smooth camera control. Experiments were performed on online video streams and validated the efficiency of SIOF in comparison with OF, SIFT, and other tracking methods. The proposed SIOF has around 36% less average tracking error and around 70% less tracking overshoot than OF.
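The centering and size-preserving behavior described in the summary can be illustrated with a minimal sketch. This is not the authors' SIOF implementation: it assumes an ideal pinhole camera, treats the pan and tilt axes independently, and takes the target centroid and apparent size as already estimated by some tracker. The function name `ptz_correction` and all of its parameters are hypothetical, introduced only for illustration.

```python
import math


def ptz_correction(cx, cy, size, focal_px, image_w, image_h, ref_size):
    """Hypothetical helper: return (pan_rad, tilt_rad, new_focal_px).

    cx, cy    -- estimated target centroid in pixels
    size      -- estimated apparent target size in pixels (must be > 0)
    focal_px  -- current focal length expressed in pixels
    ref_size  -- apparent size the target should keep across frames
    """
    # Offset of the target centroid from the image center, in pixels.
    dx = cx - image_w / 2.0
    dy = cy - image_h / 2.0

    # Pinhole assumption: a pixel offset d at focal length f subtends
    # an angle atan(d / f). The two axes are treated independently,
    # which is a simplification for small offsets.
    pan = math.atan2(dx, focal_px)
    tilt = math.atan2(dy, focal_px)

    # Under the same assumption, apparent size is proportional to the
    # focal length, so scaling f by ref_size / size restores the size.
    new_focal = focal_px * ref_size / size
    return pan, tilt, new_focal
```

For example, a target that appears at half the reference size leads to a doubled focal length, and a target already at the image center yields zero pan and tilt. A real controller would also need the prediction and correction step mentioned in the summary, which this sketch omits.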

Keywords

Object tracking; Optical flow; Scale-invariant feature transform; Pan-tilt-zoom

Type: Articles
Information: Robotica, Volume 34, Issue 9, September 2016, pp. 1923–1947

Copyright © Cambridge University Press 2014
