Path Integral Policy Improvement With Population Adaptation.

Yamamoto, Kosuke; Ariizumi, Ryo; Hayakawa, Tomohiro; Matsuno, Fumitoshi

IEEE Trans Cybern ; 52(1): 312-322, 2022 Jan.

Article in English | MEDLINE | ID: mdl-32324589

ABSTRACT

Path integral policy improvement (PI2) is known to be an efficient reinforcement learning algorithm, particularly when the target system is a high-dimensional dynamical system. However, PI2 and its existing extensions have adjustable parameters on which their efficiency depends significantly. This article proposes an extension of PI2 that adjusts all of the critical parameters automatically. Motion acquisition tasks for three different types of simulated legged robots were performed to test the efficacy of the proposed algorithm. The results show that the proposed method not only relieves the user of the burden of setting the parameters appropriately but also improves the optimization performance significantly. For one of the acquired motions, a real robot experiment was conducted to show the validity of the motion.
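
To make concrete what "adjustable parameters" can mean here, the following is a minimal sketch of a simplified, episode-level PI2-style update in Python. It is not the method proposed in the article, and it omits the per-time-step weighting of full PI2. The function names, the toy cost, and the default values of `n_rollouts`, `sigma`, and `h` are all assumptions made for illustration; they stand in for the kind of hand-set quantities the abstract says the efficiency depends on.

```python
import math
import random


def pi2_style_update(theta, cost_fn, n_rollouts=10, sigma=0.5, h=10.0, rng=None):
    """One simplified, episode-level PI2-style update of the parameters theta.

    n_rollouts, sigma (exploration noise std) and h (sensitivity of the
    exponential weighting) are hand-set here; the values are illustrative
    assumptions, not taken from the article.
    """
    rng = rng or random.Random()
    # Sample perturbed parameter vectors and evaluate the cost of each.
    samples = [[t + rng.gauss(0.0, sigma) for t in theta] for _ in range(n_rollouts)]
    costs = [cost_fn(s) for s in samples]

    # Map costs to weights: low cost gives high weight (min-max normalised).
    c_min, c_max = min(costs), max(costs)
    span = (c_max - c_min) or 1.0  # guard against all costs being equal
    weights = [math.exp(-h * (c - c_min) / span) for c in costs]
    total = sum(weights)

    # The new parameters are the weight-averaged samples.
    return [
        sum(w * s[i] for w, s in zip(weights, samples)) / total
        for i in range(len(theta))
    ]


def sphere(x):
    """Toy quadratic cost, used only for demonstration."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [2.0, -1.5, 1.0]
    for _ in range(50):
        theta = pi2_style_update(theta, sphere, rng=rng)
    print("final cost:", sphere(theta))
```

In this sketch `sigma` stays fixed for the whole run, so the search keeps exploring at the same scale even near the optimum; choosing it too small or too large slows progress, which is the sort of sensitivity to manual settings that motivates automatic adaptation.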

Subjects

Robotics; Algorithms; Motion (Physics); Policy; Reinforcement, Psychology

Full text: 1 Database: MEDLINE Main subject: Robotics Language: En Journal: IEEE Trans Cybern Publication year: 2022 Document type: Article
