
RGB-D visual odometry by constructing and matching features at superpixel level

Published online by Cambridge University Press:  18 September 2024

Meiyi Yang
Affiliation:
Department of Automation, University of Science and Technology of China, Hefei, China; Department of Mechanical Engineering, City University of Hong Kong, Kowloon, Hong Kong
Junlin Xiong*
Affiliation:
Department of Automation, University of Science and Technology of China, Hefei, China
Youfu Li
Affiliation:
Department of Mechanical Engineering, City University of Hong Kong, Kowloon, Hong Kong
* Corresponding author: Junlin Xiong; Email: [email protected]

Abstract

Visual odometry (VO) is a key technology for estimating camera motion from captured images. In this paper, we propose a novel RGB-D visual odometry that constructs and matches features at the superpixel level and adapts to different environments better than state-of-the-art solutions. Superpixels are content-sensitive and aggregate local information well, so they can characterize the complexity of the environment. First, we design the superpixel-based feature SegPatch and its 3D counterpart MapPatch. By exploiting neighboring information, SegPatch remains distinctive in environments with different texture densities, while the depth measurements included in MapPatch represent the scene structurally. We then define a distance between SegPatches to characterize regional similarity and use a graph-based search in scale space for searching and matching, which improves both the accuracy and the efficiency of the matching process. Finally, we minimize the reprojection error between matched SegPatches and estimate the camera pose from these correspondences. The proposed VO is evaluated on the TUM dataset both quantitatively and qualitatively, showing a good balance of accuracy and adaptability under different realistic conditions.
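To make the pipeline described above concrete, the following is a minimal illustrative sketch, not the authors' implementation, of the first two steps: extracting superpixels from an RGB frame and back-projecting each superpixel's centroid with the depth map to obtain a 3D patch anchor, which could then be matched and fed to a reprojection-error minimization. The segmentation routine (scikit-image SLIC), the mean-colour descriptor, the pinhole intrinsics fx, fy, cx, cy, and the function name patch_features are all assumptions made here for illustration only.

# Minimal sketch (illustrative only): superpixel "patch" features from one RGB-D frame.
# Assumptions: scikit-image SLIC as the superpixel method, mean colour as the
# per-patch descriptor, and a pinhole camera model; none of these are taken
# verbatim from the paper.
import numpy as np
from skimage.segmentation import slic

def patch_features(rgb, depth, fx, fy, cx, cy, n_segments=300):
    """Return per-superpixel (descriptor, 3D centroid) pairs for one RGB-D frame."""
    labels = slic(rgb, n_segments=n_segments, compactness=10, start_label=0)
    features = []
    for label in np.unique(labels):
        mask = labels == label
        v, u = np.nonzero(mask)                # pixel rows/columns of this superpixel
        z = depth[mask].astype(np.float64)
        valid = z > 0                          # ignore pixels with missing depth
        if valid.sum() < 10:
            continue
        desc = rgb[mask].mean(axis=0)          # toy descriptor: mean colour of the patch
        zc = z[valid].mean()
        uc, vc = u[valid].mean(), v[valid].mean()
        # Back-project the patch centroid with the pinhole model (a "MapPatch"-style anchor).
        point = np.array([(uc - cx) * zc / fx, (vc - cy) * zc / fy, zc])
        features.append((desc, point))
    return features

Matched patch centroids from two frames could then be used in the standard way the abstract describes at the SegPatch level: find the camera pose T that minimizes the summed reprojection error ||project(T * P_i) - p_i||^2 over all matched pairs (P_i, p_i).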

Type: Research Article
Copyright: © The Author(s), 2024. Published by Cambridge University Press

