Reinforcement learning-based attitude control for a barbell electric sail.

Ma, Xiaolei; Wen, Hao

Affiliation

Ma X; State Key Laboratory of Mechanics and Control of Aerospace Structures, Nanjing University of Aeronautics and Astronautics, No.29 Yudao Street, Nanjing, Jiangsu 210016, China; College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, No.29 Yudao Street, Nanjing, Jiangsu 210016, China.
Wen H; State Key Laboratory of Mechanics and Control of Aerospace Structures, Nanjing University of Aeronautics and Astronautics, No.29 Yudao Street, Nanjing, Jiangsu 210016, China; College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, No.29 Yudao Street, Nanjing, Jiangsu 210016, China. Electronic address: wenhao@nuaa.edu.cn.

ISA Trans; 147: 252-264, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38429140

ABSTRACT

The electric solar wind sail (E-sail) is a new propellant-free propulsion concept. Because E-sail systems are under-actuated and highly nonlinear, designing their attitude controllers is a major challenge, and conventional control schemes may not be able to handle it. To address this, a reinforcement learning (RL)-based control scheme, which can explore and obtain optimal policies without training datasets, is proposed for the attitude control of a barbell E-sail system. The barbell E-sail comprises two end satellites linked to an insulated confluence point through long conductive tethers. The voltages of the two tethers can be modulated individually for attitude control. The attitude dynamics of the system are described using a nonsingular formulation. The control scheme has a two-stage design. In the first stage, an RL controller based on the Proximal Policy Optimization (PPO) algorithm is used to obtain an RL control strategy, which is emulated and updated by neural networks. In the second stage, attitude feedback control is performed by mapping the system state to the control output in real time using the updated control strategy, which gives low computational cost, low energy consumption, and fast convergence. Simulation results demonstrate that the proposed RL-based control scheme can effectively steer the E-sail to the design attitude by regulating the tether voltage difference. Comparisons with a nonlinear model predictive control (NMPC) scheme also indicate that the developed control scheme significantly reduces computation time while maintaining control accuracy.
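
For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective used in the most common variant of the algorithm. The abstract does not state which PPO variant or hyperparameters the authors use, so the function name `ppo_clip_objective` and the default `clip_eps=0.2` are illustrative assumptions, not details taken from the paper.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    Illustrative only: clip_eps=0.2 is a common default, not a value
    reported in the abstract.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is often chosen for training control policies in simulation.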

Keywords

Attitude control; Electric sail; Reinforcement learning; Tether

Full text: 1 Database: MEDLINE Language: En Publication year: 2024 Document type: Article
