
Four-Dimensional Trajectory Generation for UAVs Based on Multi-Agent Q Learning

Published online by Cambridge University Press: 12 February 2020

Wenjie Zhao
Affiliation: School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, Zhejiang Province, China
Zhou Fang
Affiliation: School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, Zhejiang Province, China
Zuqiang Yang*
Affiliation: Information Science Academy of China Electronics Technology Group Corporation, Beijing, China

Abstract

A distributed four-dimensional (4D) trajectory generation method based on multi-agent Q learning is presented for multiple unmanned aerial vehicles (UAVs). With this method, each vehicle can intelligently generate collision-free 4D trajectories for time-constrained cooperative flight tasks. For a single UAV, the 4D trajectory is generated by a bio-inspired improved tau-gravity guidance strategy, which synchronously guides the position and velocity to their desired values at the arrival time. To optimise the trajectory parameters, the continuous-state, continuous-action wire fitting neural network Q (WFNNQ) learning method is applied. For multi-UAV applications, the learning is organised by the win-or-learn-fast policy hill-climbing (WoLF-PHC) algorithm. Dynamic simulation results show that the proposed method efficiently provides 4D trajectories for the multi-UAV system in challenging simultaneous-arrival tasks, and that the fully trained method can be applied to similar trajectory generation scenarios.
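The tau-gravity guidance named above can be sketched from general tau theory (a minimal outline, assuming the standard gravity guide with zero arrival velocity; the paper's improved strategy also handles non-zero desired arrival states). The tau of a motion gap x(t) is the gap divided by its closure rate, and the intrinsic gravity guide is a virtual motion that accelerates from rest under a constant g over the manoeuvre duration T:

\tau_x(t) = \frac{x(t)}{\dot{x}(t)}, \qquad x_g(t) = \tfrac{1}{2} g \bigl(T^2 - t^2\bigr), \qquad \tau_g(t) = \frac{x_g(t)}{\dot{x}_g(t)} = \frac{t^2 - T^2}{2t}.

Coupling the gap to the guide by \tau_x(t) = k\,\tau_g(t) integrates to

x(t) = x_0 \Bigl(1 - \frac{t^2}{T^2}\Bigr)^{1/k},

so for 0 < k < 1 the gap and its closure rate reach zero together exactly at the arrival time T, and the coupling constant k shapes the deceleration profile; parameters of this kind are what the learning layer can tune.

For the multi-agent layer, the following is a minimal tabular sketch of the WoLF-PHC update, assuming discrete states and actions purely for illustration (the paper instead pairs WoLF-PHC with WFNNQ to handle continuous spaces; the class name and hyperparameter values are illustrative, not taken from the paper):

import numpy as np

class WoLFPHCAgent:
    # Minimal tabular WoLF-PHC learner. This discrete sketch only
    # illustrates the "win or learn fast" policy update itself; all
    # hyperparameter values are illustrative.
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 delta_win=0.01, delta_lose=0.04):
        self.Q = np.zeros((n_states, n_actions))                 # action values
        self.pi = np.full((n_states, n_actions), 1 / n_actions)  # current mixed policy
        self.pi_avg = self.pi.copy()                             # running average policy
        self.visits = np.zeros(n_states)                         # state visit counts
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose

    def act(self, s, rng=np.random):
        # Sample an action from the current mixed policy at state s.
        return rng.choice(self.pi.shape[1], p=self.pi[s])

    def update(self, s, a, r, s_next):
        # 1. Ordinary Q-learning backup.
        self.Q[s, a] += self.alpha * (r + self.gamma * self.Q[s_next].max()
                                      - self.Q[s, a])

        # 2. Incrementally update the running average policy at s.
        self.visits[s] += 1
        self.pi_avg[s] += (self.pi[s] - self.pi_avg[s]) / self.visits[s]

        # 3. Win or learn fast: small step while winning (current policy
        #    beats the average policy in expected value), large otherwise.
        winning = self.pi[s] @ self.Q[s] > self.pi_avg[s] @ self.Q[s]
        delta = self.delta_win if winning else self.delta_lose

        # 4. Hill-climb: shift probability mass toward the greedy action
        #    while keeping pi[s] a valid distribution.
        greedy = self.Q[s].argmax()
        n = self.pi.shape[1]
        for b in range(n):
            if b != greedy:
                step = min(delta / (n - 1), self.pi[s, b])
                self.pi[s, b] -= step
                self.pi[s, greedy] += step

The "win or learn fast" element is the switch between delta_win and delta_lose: an agent whose current mixed policy outperforms its long-run average policy adapts cautiously, while a losing agent adapts quickly, which is what allows several simultaneously learning UAVs to converge on compatible policies.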

Type
Research Article
Copyright
Copyright © The Royal Institute of Navigation 2020

