Toward reliable designs of data-driven reinforcement learning tracking control for Euler-Lagrange systems.

Yao, Zhikai; Yao, Jianyong

Affiliation

Yao Z; College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu Province, 210023, China; School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu Province, 210094, China.
Yao J; School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu Province, 210094, China. Electronic address: jerryyao.buaa@gmail.com.

Neural Netw ; 153: 564-575, 2022 Sep.

Article in En | MEDLINE | ID: mdl-35843117

ABSTRACT

This paper addresses reinforcement-learning-based direct signal tracking control, with the objective of developing mathematically suitable and practically useful design approaches. Specifically, we aim to provide reliable and easy-to-implement designs that yield reproducible neural-network-based solutions. The proposed design combines two control design frameworks: a reinforcement-learning-based, data-driven approach that provides the needed adaptation and (sub)optimality, and a backstepping-based approach that provides a framework for closed-loop system stability. We build on the established direct heuristic dynamic programming (dHDP) learning paradigm for online learning and adaptation, and on a backstepping design for an important class of nonlinear dynamics described as Euler-Lagrange systems. We provide theoretical guarantees for the stability of the overall dynamic system, weight convergence of the approximating nonlinear neural networks, and Bellman (sub)optimality of the resulting control policy. Simulations demonstrate significantly improved design performance of the proposed approach over the original dHDP.
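
For orientation only, the ingredients named in the abstract can be written in their generic textbook form. The sketch below is not taken from the paper: the symbols M, C, G, \tau, q_d, K_1, \alpha, \gamma, r, and \hat{J} are assumed notation, and the authors' actual error definitions and update laws may differ.

```latex
% Generic Euler-Lagrange dynamics (assumed notation, not the paper's):
% q = generalized coordinates, \tau = control input.
M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q) = \tau

% Standard two-step backstepping error variables for tracking q_d(t),
% with an assumed positive-definite gain K_1 and virtual control \alpha:
z_1 = q - q_d, \qquad
\alpha = \dot{q}_d - K_1 z_1, \qquad
z_2 = \dot{q} - \alpha

% Generic discounted Bellman recursion and the temporal-difference
% residual that a critic network \hat{J} is trained to reduce
% (\gamma \in (0,1] is an assumed discount factor, r the stage cost):
J(x_k) = r(x_k,u_k) + \gamma\, J(x_{k+1}), \qquad
e_c(k) = r(x_k,u_k) + \gamma\, \hat{J}(x_{k+1}) - \hat{J}(x_k)
```

Read this way, the backstepping variables supply the stability structure and the critic residual supplies the data-driven adaptation that the abstract attributes to dHDP.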

Subjects

Algorithms; Nonlinear Dynamics; Computer Simulation; Feedback; Neural Networks, Computer

Keywords

Backstepping; Direct heuristic dynamic programming (dHDP); Reinforcement learning; Tracking control


Full text: 1 Database: MEDLINE Main subject: Algorithms / Nonlinear Dynamics Language: En Year of publication: 2022 Document type: Article