UAV Autonomous Tracking and Landing Based on Deep Reinforcement Learning Strategy.

Xie, Jingyi; Peng, Xiaodong; Wang, Haijiao; Niu, Wenlong; Zheng, Xiao

Affiliation

Xie J; Key Laboratory of Electronics and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China.
Peng X; University of Chinese Academy of Sciences, Beijing 100049, China.
Wang H; Key Laboratory of Electronics and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China.
Niu W; University of Chinese Academy of Sciences, Beijing 100049, China.
Zheng X; Alibaba Damo Academy, Hangzhou 311121, China.

Sensors (Basel); 20(19), 2020 Oct 01.

Article in English | MEDLINE | ID: mdl-33019747

ABSTRACT

Unmanned aerial vehicle (UAV) autonomous tracking and landing plays an increasingly important role in military and civil applications. At the same time, machine learning has been successfully applied to robotics-related tasks. This paper presents a novel UAV autonomous tracking and landing approach based on a deep reinforcement learning strategy, aimed at the UAV motion control problem in unpredictable and harsh environments. Instead of building a prior model and inferring the landing actions from heuristic rules, a model-free method based on a partially observable Markov decision process (POMDP) is proposed. In the POMDP model, the UAV automatically learns the landing maneuver through an end-to-end neural network that combines the Deep Deterministic Policy Gradients (DDPG) algorithm with heuristic rules. A Modular Open Robots Simulation Engine (MORSE)-based reinforcement learning framework is designed and validated on a continuous UAV tracking and landing task on a randomly moving platform under high sensor noise and intermittent measurements. The simulation results show that, when the platform moves along different trajectories, the average landing success rate of the proposed algorithm is about 10% higher than that of the Proportional-Integral-Derivative (PID) method. As an indirect result, a state-of-the-art deep reinforcement learning-based UAV control method is validated, in which the UAV can learn the optimal strategy for continuous autonomous landing and perform properly in a simulation environment.

Keywords

autonomous tracking and landing; deep reinforcement learning; quadrotor unmanned aerial vehicle

Full text: 1 | Database: MEDLINE | Study type: Prognostic_studies | Language: English | Publication year: 2020 | Document type: Article