
Research on obstacle avoidance of underactuated autonomous underwater vehicle based on offline reinforcement learning

Published online by Cambridge University Press: 29 October 2024

Tao Liu*
Affiliation:
School of Ocean Engineering and Technology, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China; Guangdong Provincial Key Laboratory of Information Technology for Deep Water Acoustics, Zhuhai, China
Junhao Huang
Affiliation:
School of Ocean Engineering and Technology, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
Jintao Zhao
Affiliation:
School of Ocean Engineering and Technology, Sun Yat-sen University & Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
* Corresponding author: Tao Liu; Email: [email protected]

Abstract

The autonomous navigation and obstacle avoidance capabilities of autonomous underwater vehicles (AUVs) are essential for safe navigation and long-term, efficient operation. However, the complexity of the marine environment poses significant challenges to safe and effective obstacle avoidance. To address this issue, this study proposes an AUV obstacle avoidance control algorithm based on offline reinforcement learning. The method adopts the Conservative Q-Learning (CQL) algorithm, built on the Soft Actor-Critic (SAC) framework, which learns an effective obstacle avoidance control policy from previously collected historical data. PID and SAC controllers are used to generate expert obstacle avoidance data and construct a diversified offline database. In addition, based on the line-of-sight (LOS) guidance method and the artificial potential field (APF) method, the distance and bearing of targets and obstacles are incorporated into the state space, and heading and obstacle avoidance terms are integrated into the reward function. The algorithm successfully guides the AUV through autonomous navigation and dynamic obstacle avoidance in three-dimensional space. Furthermore, it exhibits a degree of robustness to uncertain disturbances and ocean currents, enhancing the safety and reliability of the AUV system. Simulation results demonstrate the feasibility and effectiveness of the proposed offline reinforcement learning obstacle avoidance method. This study highlights the significance of offline reinforcement learning for robust and reliable AUV control systems, paving the way for enhanced operational capabilities in challenging marine environments.
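As a rough illustration of the core mechanism, the sketch below shows a CQL-style conservative penalty layered on a SAC critic update. The interfaces `q_net(states, actions)` and `policy.sample(states)` and the penalty weight are assumptions for illustration, not the authors' implementation; the full CQL(H) objective also involves uniform-action sampling and importance weights, omitted here for brevity.

```python
import torch

# Minimal sketch of the CQL conservative penalty (hypothetical interfaces).
def cql_penalty(q_net, policy, states, dataset_actions, n_samples=10):
    # Q-values of in-distribution actions taken from the offline dataset.
    q_data = q_net(states, dataset_actions)
    # Q-values of actions drawn from the current policy; these may be
    # out-of-distribution with respect to the offline data.
    q_pi = torch.stack(
        [q_net(states, policy.sample(states)) for _ in range(n_samples)]
    )
    # Soft maximum over the sampled actions: pushing this down while the
    # Bellman loss fits q_data keeps the critic conservative on unseen actions.
    return (q_pi.logsumexp(dim=0) - q_data).mean()

# Critic loss = SAC Bellman error + alpha * cql_penalty(...), where alpha
# trades off conservatism against fitting the offline data.
```

The reward design described above can likewise be sketched as the sum of a goal-progress term, an LOS-style heading term, and an APF-style repulsive obstacle term. The weights `w_goal`, `w_head`, `w_obs` and safety radius `d_safe` below are placeholders; the paper's exact coefficients and functional forms are not reproduced here.

```python
import numpy as np

# Hedged sketch of a shaped reward for 3-D navigation with obstacle avoidance.
def shaped_reward(pos, goal, psi, obstacles, d_safe=5.0,
                  w_goal=1.0, w_head=0.5, w_obs=2.0):
    """pos, goal: 3-D position arrays; psi: yaw angle [rad];
    obstacles: iterable of 3-D obstacle positions."""
    # Goal-progress term: closer to the target is better.
    r = -w_goal * np.linalg.norm(goal - pos)

    # Heading term (LOS-inspired): penalize the wrapped error between the
    # current yaw and the line-of-sight bearing to the target.
    psi_los = np.arctan2(goal[1] - pos[1], goal[0] - pos[0])
    e_psi = np.arctan2(np.sin(psi_los - psi), np.cos(psi_los - psi))
    r -= w_head * abs(e_psi)

    # Obstacle term (APF-inspired): repulsive penalty that grows as the
    # vehicle enters an obstacle's safety radius.
    for obs in obstacles:
        d = np.linalg.norm(obs - pos)
        if d < d_safe:
            r -= w_obs * (1.0 / max(d, 1e-6) - 1.0 / d_safe)
    return r
```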

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press

