1.
Biomimetics (Basel) ; 9(4)2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38667204

ABSTRACT

Falling is inevitable for legged robots deployed in unstructured, unpredictable real-world scenarios, such as uneven terrain in the wild. To recover dynamically from a fall without unintended termination of locomotion, the robot must possess the complex motor skills required for recovery maneuvers. This is exceptionally challenging for existing methods, since it involves multiple unspecified internal and external contacts. To overcome this limitation, we introduce a novel deep reinforcement learning framework that trains a learning-based state estimator and a proprioceptive history policy for dynamic fall recovery under external disturbances. The proposed learning-based framework applies to different fall cases both indoors and outdoors. Furthermore, we show that the learned fall recovery policies are hardware-feasible and can be implemented on real robots. The approach is evaluated in extensive trials with a quadruped robot, which show that it effectively recovers the robot after falls on flat surfaces and grassland.
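The abstract's key design choice, a policy that acts on a history of proprioceptive observations rather than a single frame, can be illustrated with a minimal sketch. This is not the authors' implementation: the history length, dimensions, and the single linear layer (standing in for their actual network) are all placeholder assumptions.

```python
import numpy as np
from collections import deque

class ProprioHistoryPolicy:
    """Toy policy acting on a fixed-length history of proprioceptive
    observations (e.g. joint angles/velocities). Weights are random
    placeholders, not a trained fall-recovery policy."""

    def __init__(self, obs_dim, act_dim, history_len=15, seed=0):
        self.obs_dim = obs_dim
        self.history_len = history_len
        self.buffer = deque(maxlen=history_len)
        rng = np.random.default_rng(seed)
        # one linear layer standing in for the real MLP
        self.W = rng.normal(0.0, 0.1, size=(act_dim, obs_dim * history_len))

    def act(self, obs):
        self.buffer.append(np.asarray(obs, dtype=float))
        # zero-pad until the history window is full
        while len(self.buffer) < self.history_len:
            self.buffer.appendleft(np.zeros(self.obs_dim))
        history = np.concatenate(self.buffer)
        return np.tanh(self.W @ history)  # bounded joint targets

policy = ProprioHistoryPolicy(obs_dim=12, act_dim=12)
action = policy.act(np.zeros(12))
print(action.shape)  # (12,)
```

The rolling buffer gives the network implicit access to contact and velocity information that a single proprioceptive frame cannot disambiguate, which is why such histories are common in learned legged locomotion.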

2.
Biomimetics (Basel) ; 9(2)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38392127

ABSTRACT

Underwater bionic-legged robots encounter significant challenges in attitude, velocity, and position control due to lift and drag in water-current environments, making it difficult to balance operational efficiency with motion stability. This study investigates the hydrodynamic properties of a bionic crab robot's shell, drawing inspiration from the sea crab's motion postures, and refines the robot's underwater locomotion strategy based on these insights. First, attitude data were collected from crabs during underwater movement through biological observation. Subsequently, hydrodynamic simulations and experimental validations of the bionic shell were conducted to examine the impact of attitude parameters on hydrodynamic performance. The findings reveal that the transverse angle predominantly influences lift and drag. Experiments in a test pool with the crab-like robot at varying transverse angles demonstrated that larger transverse angles enhance the robot's underwater walking efficiency, stability, and overall performance.
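The relationship the paper studies, lift and drag as a function of transverse angle, follows the standard quadratic hydrodynamic model. The sketch below uses that textbook model; the coefficient table is purely illustrative (the real values would come from the paper's CFD simulations and pool experiments, which are not reproduced here).

```python
import numpy as np

RHO_WATER = 1000.0  # water density, kg/m^3

def drag_lift(v, area, cd, cl):
    """Quadratic drag/lift model: F = 0.5 * rho * v^2 * A * C."""
    q = 0.5 * RHO_WATER * v**2 * area  # dynamic pressure * reference area
    return q * cd, q * cl

# Hypothetical coefficients vs. transverse angle (degrees) --
# placeholder numbers, NOT the paper's measured values.
angles = np.array([0.0, 10.0, 20.0, 30.0])
cd_tab = np.array([0.90, 0.80, 0.70, 0.65])
cl_tab = np.array([0.10, 0.25, 0.40, 0.50])

def coeffs_at(angle):
    """Linearly interpolate drag/lift coefficients at a given angle."""
    return np.interp(angle, angles, cd_tab), np.interp(angle, angles, cl_tab)

cd, cl = coeffs_at(15.0)                      # -> (0.75, 0.325)
drag, lift = drag_lift(v=0.5, area=0.05, cd=cd, cl=cl)
```

With a table like this fitted from simulation data, a controller can pick the transverse angle that trades drag against stabilizing downforce, which is the locomotion-strategy refinement the abstract describes.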

3.
Front Neurorobot ; 16: 1081242, 2022.
Article in English | MEDLINE | ID: mdl-36699950

ABSTRACT

Introduction: Value approximation bias is known to lead to suboptimal policies or catastrophic accumulation of overestimation bias, preventing the agent from making the right trade-off between exploration and exploitation. Algorithms have been proposed to mitigate this contradiction. However, we still lack an understanding of how value bias impacts performance, and a method for efficient exploration that keeps updates stable. This study aims to clarify the effect of value bias and to improve reinforcement learning algorithms so as to enhance sample efficiency. Methods: This study designs a simple episodic tabular MDP to investigate value underestimation and overestimation in actor-critic methods. It proposes a unified framework called Realistic Actor-Critic (RAC), which employs Universal Value Function Approximators (UVFA) to simultaneously learn, with a single neural network, policies guided by different value confidence bounds, each with a different under-/overestimation trade-off. Results: This study highlights that agents can over-explore low-value states due to an inflexible under-/overestimation trade-off under fixed hyperparameters, a particular form of the exploration-exploitation dilemma. RAC performs directed exploration without over-exploration using the upper bounds, while still avoiding overestimation using the lower bounds. Through carefully designed experiments, this study empirically verifies that RAC achieves 10x sample efficiency and a 25% performance improvement compared with Soft Actor-Critic in the most challenging Humanoid environment. All source code is available at https://github.com/ihuhuhu/RAC.
Discussion: This research not only provides valuable insights into the exploration-exploitation trade-off by studying how often policies visit low-value states under the guidance of different value confidence bounds, but also proposes a unified framework that can be combined with current actor-critic methods to improve sample efficiency in the continuous-control domain.
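The central idea of the abstract, value estimates parameterized by a confidence level so that one network serves both optimistic exploration and pessimistic exploitation, can be sketched with a critic ensemble. This is a simplified stand-in for RAC's UVFA critic, not the repository's implementation; the mean-plus-beta-times-std form of the bound is an illustrative assumption.

```python
import numpy as np

def confidence_bound_value(q_ensemble, beta):
    """Confidence-bound Q-value from an ensemble of critic estimates.

    beta > 0: optimistic upper bound -> drives directed exploration.
    beta < 0: pessimistic lower bound -> suppresses overestimation.
    A UVFA would condition one network on beta; here an explicit
    ensemble plays that role.
    """
    q = np.asarray(q_ensemble, dtype=float)
    return q.mean(axis=0) + beta * q.std(axis=0)

# Two critics' Q-estimates for three candidate actions.
qs = np.array([[1.0, 2.0, 3.0],
               [3.0, 2.0, 1.0]])

optimistic  = confidence_bound_value(qs, beta=+1.0)  # exploration policy
pessimistic = confidence_bound_value(qs, beta=-1.0)  # exploitation policy
print(optimistic, pessimistic)  # [3. 2. 3.] [1. 2. 1.]
```

Note how the two policies disagree exactly where the critics disagree: the optimistic bound favors uncertain actions, while the pessimistic bound backs off from them, which is the under-/overestimation trade-off the abstract describes.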
