Results 1 - 7 of 7
1.
Sensors (Basel); 21(4), 2021 Feb 04.
Article in English | MEDLINE | ID: mdl-33557359

ABSTRACT

Unmanned aerial vehicles (UAVs) have been widely used in search and rescue (SAR) missions due to their high flexibility. A key problem in SAR missions is to search for and track moving targets in an area of interest. In this paper, we focus on the problem of Cooperative Multi-UAV Observation of Multiple Moving Targets (CMUOMMT). In contrast to the existing literature, we not only optimize the average observation rate of the discovered targets but also emphasize fairness in observing the discovered targets and continuous exploration for undiscovered targets, under the assumption that the total number of targets is unknown. To achieve this objective, a deep reinforcement learning (DRL)-based method is proposed under the Partially Observable Markov Decision Process (POMDP) framework, where each UAV maintains four observation history maps, and maps from different UAVs within communication range can be merged to enhance the UAVs' awareness of the environment. A deep convolutional neural network (CNN) processes the merged maps and generates the control commands for the UAVs. The simulation results show that, compared with other methods, our policy enables the UAVs to balance fair observation of the discovered targets against exploration of the search region.
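The map-merging step in this abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, not the authors' implementation: it assumes a discretized search region, four per-UAV history maps (their exact contents are an assumption here), and a fusion rule that keeps the most informative entry per cell for UAVs within communication range.

```python
import numpy as np

GRID = (64, 64)          # assumed discretization of the search region
COMM_RANGE = 15.0        # assumed communication range (grid units)

class UAV:
    def __init__(self, position):
        self.position = np.asarray(position, dtype=float)
        # Four per-UAV observation history maps (contents assumed for illustration):
        # last time each cell was observed, last time a target was seen in it,
        # how often it was visited, and whether it has been explored at all.
        self.maps = {
            "last_seen": np.full(GRID, -np.inf),
            "target_seen": np.full(GRID, -np.inf),
            "visits": np.zeros(GRID),
            "explored": np.zeros(GRID, dtype=bool),
        }

def merge_maps(uavs):
    """Fuse the history maps of UAVs that are within communication range."""
    for a in uavs:
        for b in uavs:
            if a is b or np.linalg.norm(a.position - b.position) > COMM_RANGE:
                continue
            # Element-wise maxima keep the freshest timestamp / largest count
            # per cell; the explored masks are OR-ed together.
            for key in ("last_seen", "target_seen", "visits"):
                a.maps[key] = np.maximum(a.maps[key], b.maps[key])
            a.maps["explored"] |= b.maps["explored"]

uavs = [UAV((10, 10)), UAV((18, 12)), UAV((50, 50))]   # the third UAV is out of range
merge_maps(uavs)
```

Merging by element-wise maxima means a UAV never discards fresher information received from a neighbor, which is the intuition behind enhancing environmental awareness through map sharing.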

2.
Sensors (Basel); 21(23), 2021 Dec 06.
Article in English | MEDLINE | ID: mdl-34884162

ABSTRACT

Planetary soft landing has been studied extensively due to its promising application prospects. In this paper, a soft landing control algorithm based on deep reinforcement learning (DRL) with good convergence properties is proposed. First, the soft landing problem of the powered descent phase is formulated and the theoretical basis of the Reinforcement Learning (RL) used in this paper is introduced. Second, to ease convergence, a reward function is designed that includes process rewards such as a velocity-tracking reward, addressing the sparse-reward problem. By further including a fuel-consumption penalty and a constraint-violation penalty, the lander can learn to achieve the velocity-tracking goal while saving fuel and keeping the attitude angle within safe ranges. Then, training simulations are carried out under the Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor-Critic (SAC) frameworks, which are classical RL frameworks, and all of them converge. Finally, the trained policy is deployed in velocity-tracking and soft-landing experiments, the results of which demonstrate the validity of the proposed algorithm.
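The shaped reward described above can be illustrated with a short sketch. The snippet below is a minimal, hedged example rather than the paper's exact reward: the weights, attitude limit, and terminal bonus are illustrative assumptions.

```python
import numpy as np

W_TRACK, W_FUEL, W_ATTITUDE = 1.0, 0.05, 10.0   # assumed weights
MAX_ATTITUDE_DEG = 20.0                          # assumed safe attitude-angle bound

def shaped_reward(velocity, reference_velocity, fuel_used, attitude_deg,
                  landed, landed_ok):
    # Dense process reward: penalize deviation from the reference velocity profile,
    # so the agent receives feedback at every step instead of only at touchdown.
    r = -W_TRACK * np.linalg.norm(np.asarray(velocity) - np.asarray(reference_velocity))
    # Fuel-consumption penalty encourages a fuel-efficient descent.
    r -= W_FUEL * fuel_used
    # Constraint-violation penalty keeps the attitude angle within the safe range.
    if abs(attitude_deg) > MAX_ATTITUDE_DEG:
        r -= W_ATTITUDE
    # Terminal bonus or penalty on touchdown, added on top of the dense terms.
    if landed:
        r += 100.0 if landed_ok else -100.0
    return r

print(shaped_reward([0.0, 0.0, -2.1], [0.0, 0.0, -2.0], fuel_used=0.3,
                    attitude_deg=5.0, landed=False, landed_ok=False))
```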

3.
Sensors (Basel); 19(12), 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31200583

ABSTRACT

Accurate perception of the detected terrain is a precondition for a planetary rover to perform its mission. However, terrain measurement based on vision and LiDAR is subject to environmental changes such as strong illumination and dust storms. In this paper, considering the influence of uncertainty in the detection process, a vibration/gyro-coupled terrain estimation method based on multipoint ranging information is proposed. The terrain update model is derived by analyzing the measurement uncertainty and the motion uncertainty. Using a Clearpath Jackal unmanned ground vehicle, terrain-mapping accuracy tests were completed in a ROS (Robot Operating System) simulation environment, an indoor OptiTrack-assisted environment, and an outdoor soil environment. The results show that the proposed algorithm reconstructs terrain of a given scale with high accuracy; the reconstruction accuracy in the above test environments is within 1 cm, 2 cm, and 6 cm, respectively.
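The uncertainty-aware terrain update can be illustrated with a per-cell fusion sketch. This is an assumption-laden stand-in for the paper's update model, using a simple 1-D Kalman-style fusion only to make concrete the idea of weighting a new ranging measurement against the measurement and motion uncertainties.

```python
import numpy as np

class ElevationCell:
    def __init__(self):
        self.height = 0.0
        self.variance = np.inf   # the cell is unknown until its first measurement

    def update(self, z, meas_var, motion_var):
        # Motion uncertainty inflates the stored variance before fusion, so a cell
        # measured long ago (or after large rover motion) trusts new data more.
        prior_var = self.variance + motion_var
        if not np.isfinite(prior_var):
            self.height, self.variance = z, meas_var
            return
        # 1-D Kalman-style fusion of the prior height and the new ranging-derived height.
        k = prior_var / (prior_var + meas_var)
        self.height += k * (z - self.height)
        self.variance = (1.0 - k) * prior_var

cell = ElevationCell()
for z in (0.12, 0.10, 0.11):                      # successive height measurements (m)
    cell.update(z, meas_var=1e-3, motion_var=1e-4)
print(round(cell.height, 3), round(cell.variance, 6))
```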

4.
Sensors (Basel); 19(14), 2019 Jul 13.
Article in English | MEDLINE | ID: mdl-31337058

ABSTRACT

Accurate classification and identification of the detected terrain is the basis for the long-distance patrol mission of a planetary rover. However, terrain measurement based on vision and radar is subject to conditions such as illumination changes and dust storms. In this paper, without increasing the sensor load of the existing rover, a vibration-based terrain classification and recognition method is proposed. First, the time-frequency-domain transformation of the vibration signal is obtained by the fast Fourier transform (FFT), and a feature representation of the vibration information is given. Second, a deep neural network based on a multi-layer perceptron is designed to classify the different terrains. Finally, using the Jackal unmanned vehicle platform, the XQ unmanned vehicle platform, and a vibration sensor, comparative terrain-classification tests on five different terrains were completed. The results show that the proposed algorithm achieves higher classification accuracy and that the platform and driving speed both influence terrain classification, which provides support for subsequent practical applications.
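The FFT-feature-plus-MLP pipeline can be sketched briefly. The snippet below uses synthetic vibration windows and scikit-learn's MLPClassifier as a stand-in for the paper's network; the window length, number of frequency bands, and layer sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def fft_features(window, n_bands=16):
    """Log-magnitude spectrum of a vibration window, pooled into frequency bands."""
    spectrum = np.abs(np.fft.rfft(window))
    bands = np.array_split(spectrum, n_bands)
    return np.log1p(np.array([band.mean() for band in bands]))

# Synthetic stand-in data: 200 windows of 512 vibration samples, five terrain labels.
rng = np.random.default_rng(0)
windows = rng.normal(size=(200, 512))
labels = rng.integers(0, 5, size=200)
features = np.stack([fft_features(w) for w in windows])

# Small multi-layer perceptron acting as the terrain classifier.
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(features, labels)
print("training accuracy on synthetic data:", clf.score(features, labels))
```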

5.
IEEE Trans Cybern; 54(1): 462-475, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37028361

ABSTRACT

This article explores deep reinforcement learning (DRL) for the flocking control of unmanned aerial vehicle (UAV) swarms. The flocking control policy is trained using a centralized-training-with-decentralized-execution (CTDE) paradigm, where a centralized critic network augmented with additional information about the entire UAV swarm is utilized to improve learning efficiency. Instead of learning inter-UAV collision avoidance capabilities, a repulsion function is encoded as an inner-UAV "instinct." In addition, the UAVs can obtain the states of other UAVs through onboard sensors in communication-denied environments, and the impact of varying visual fields on flocking control is analyzed. Through extensive simulations, it is shown that the proposed policy with the repulsion function and a limited visual field achieves a success rate of 93.8% in training environments, 85.6% in environments with a high number of UAVs, 91.2% in environments with a high number of obstacles, and 82.2% in environments with dynamic obstacles. Furthermore, the results indicate that the proposed learning-based methods are more suitable than traditional methods in cluttered environments.
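The repulsion "instinct" can be illustrated with a short sketch. The gains, safety radius, and the way the repulsion term is added to the learned command below are assumptions, not the authors' controller.

```python
import numpy as np

REPULSION_RADIUS = 5.0   # assumed distance below which repulsion activates
REPULSION_GAIN = 2.0     # assumed repulsion strength

def repulsion(own_pos, neighbor_positions):
    """Hand-crafted inter-UAV repulsion term (the 'instinct')."""
    force = np.zeros(2)
    for p in neighbor_positions:
        diff = own_pos - p
        d = np.linalg.norm(diff)
        if 1e-6 < d < REPULSION_RADIUS:
            # Push away from the neighbor, growing stronger as the gap closes.
            force += REPULSION_GAIN * (1.0 / d - 1.0 / REPULSION_RADIUS) * diff / d
    return force

def control_command(policy_action, own_pos, neighbor_positions):
    """Final velocity command = learned flocking action + repulsion instinct."""
    return np.asarray(policy_action, dtype=float) + repulsion(own_pos, neighbor_positions)

cmd = control_command([1.0, 0.0], np.array([0.0, 0.0]),
                      [np.array([2.0, 0.5]), np.array([10.0, 10.0])])
print(cmd)
```

Because the repulsion term is hard-coded rather than learned, the policy network only has to learn the flocking behavior itself, which is the stated motivation for encoding it as an "instinct."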

6.
Article in English | MEDLINE | ID: mdl-37566499

ABSTRACT

Unmanned aerial vehicles (UAVs) have been widely used in urban target-tracking tasks, where long-term tracking of evasive targets is of great significance for public safety. However, the tracked targets are easily lost due to their evasive behavior and the unstructured characteristics of the urban environment. To address this issue, this article proposes a hybrid target-tracking approach based on target intention inference and deep reinforcement learning (DRL). First, a target intention inference model based on convolutional neural networks (CNNs) is built to infer target intentions by fusing urban environment information with the observed target trajectory. The prediction of the target trajectory can then be informed by the inferred target intentions, which further provides effective guidance to the target search process. In order to fully explore the policy space, the target search policy is developed under a DRL framework, where the search policy is modeled as a deep neural network (DNN) and trained by interacting with the task environment. The simulation results show that inferring the target intentions effectively guides the UAV to search for the target and significantly improves the target-tracking performance. Meanwhile, the generalization results indicate that the proposed DRL-based search policy is highly robust to uncertainty in the target behavior.
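How inferred intentions can guide the search is sketched below. The intention model is stubbed out (in the article it is a CNN fusing environment information and the observed trajectory), and the candidate goals, grid size, and Gaussian search prior are illustrative assumptions.

```python
import numpy as np

GRID = (32, 32)                                  # assumed discretized urban area
CANDIDATE_GOALS = [(5, 5), (25, 8), (16, 28)]    # assumed candidate destinations

def infer_intention(observed_trajectory):
    """Stub for the CNN-based intention model: returns P(goal) over the candidates."""
    last = np.asarray(observed_trajectory[-1], dtype=float)
    dists = np.array([np.linalg.norm(last - np.asarray(g)) for g in CANDIDATE_GOALS])
    logits = -dists / 5.0                        # closer candidates get higher weight
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()

def search_prior(intention_probs, sigma=4.0):
    """Mix Gaussian bumps around the candidate goals, weighted by intention probability."""
    ys, xs = np.mgrid[0:GRID[0], 0:GRID[1]]
    prior = np.zeros(GRID)
    for p, (gy, gx) in zip(intention_probs, CANDIDATE_GOALS):
        prior += p * np.exp(-((ys - gy) ** 2 + (xs - gx) ** 2) / (2 * sigma ** 2))
    return prior / prior.sum()

probs = infer_intention([(10, 10), (12, 9), (14, 8)])
prior = search_prior(probs)                      # used to bias where the search policy looks
print("most likely goal:", CANDIDATE_GOALS[int(np.argmax(probs))])
```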

7.
IEEE Trans Neural Netw Learn Syst; 34(9): 5440-5451, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37027270

ABSTRACT

Motion planning is important for the automatic operation of manipulators. It is difficult for traditional motion planning algorithms to achieve efficient online motion planning in rapidly changing environments and high-dimensional planning spaces. The neural motion planning (NMP) algorithm based on reinforcement learning provides a new way to solve this task. To overcome the difficulty of training the neural network for high-accuracy planning tasks, this article proposes to combine the artificial potential field (APF) method with reinforcement learning. The neural motion planner avoids obstacles over a wide range, while the APF method is exploited to adjust the local position. Since the action space of the manipulator is high-dimensional and continuous, the soft actor-critic (SAC) algorithm is adopted to train the neural motion planner. Training and testing with different accuracy requirements in a simulation engine verify that, in high-accuracy planning tasks, the success rate of the proposed hybrid method is higher than that of either algorithm alone. Finally, the feasibility of directly transferring the learned neural network to a real manipulator is verified by a dynamic obstacle-avoidance task.
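The hybrid planner-plus-APF idea can be sketched briefly. The snippet below is not the article's implementation: the learned policy is stubbed out, and the switching threshold and attractive-field gain are illustrative assumptions.

```python
import numpy as np

FINE_RANGE = 0.05     # assumed distance (m) below which the APF refines the motion
APF_GAIN = 1.0        # assumed attractive-field gain

def apf_step(ee_pos, goal_pos):
    """Attractive potential-field step pulling the end-effector toward the goal."""
    return APF_GAIN * (np.asarray(goal_pos, dtype=float) - np.asarray(ee_pos, dtype=float))

def neural_policy(observation):
    """Stub for the SAC-trained neural motion planner (returns a Cartesian step)."""
    return np.zeros(3)

def hybrid_step(observation, ee_pos, goal_pos):
    # Far from the goal: rely on the learned planner for wide-range obstacle avoidance.
    # Close to the goal: let the APF term handle the high-accuracy local adjustment.
    dist = np.linalg.norm(np.asarray(goal_pos, dtype=float) - np.asarray(ee_pos, dtype=float))
    if dist < FINE_RANGE:
        return apf_step(ee_pos, goal_pos)
    return neural_policy(observation)

step = hybrid_step(observation=None, ee_pos=[0.40, 0.10, 0.30], goal_pos=[0.42, 0.10, 0.30])
print(step)
```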
