ABSTRACT
Improving the generalization ability of multi-robot formation control can reduce repetitive training and computation. In this paper, we study the multi-robot formation problem with generalization over the target position. Since the generalization ability of a neural network is directly proportional to the spatial dimension, we use different networks for different objectives, so that each network focuses on learning a single objective and achieves better performance. In addition, this paper presents a distributed deep reinforcement learning method based on the soft actor-critic algorithm for solving the multi-robot formation problem. A formation evaluation assignment function is also designed to suit distributed training. Compared with the original algorithm, the improved algorithm obtains higher cumulative rewards. The experimental results show that the proposed algorithm better maintains the desired formation while moving, and that the rotation design in the reward function gives the multi-robot system greater flexibility in formation. A comparison of the control signal curves shows that the proposed algorithm is more stable. Finally, the experiments demonstrate the generality of the proposed algorithm for both formation maintenance and formation variation.
Subjects
Robotics, Reinforcement (Psychology), Reward, Learning, Algorithms
ABSTRACT
This paper proposes a novel constrained optimization model to address the loco-manipulation problem of a mobile robot with a redundant manipulator performing trajectory tracking. To alleviate the accumulated error in the end-effector's position, a new control law is designed to eliminate the negative effect of deviation in the initial position, leading to better performance than existing control laws. To handle the locomotion constraints in the loco-manipulation problem, the optimization model is converted into an augmented Lagrangian primal-dual problem. Furthermore, an inertial neural network approach is used to solve the problem, and the corresponding Lyapunov proof guarantees the convergence of the variables. Numerical simulations show that the proposed approach is better suited to application, since the model is more effective and the algorithm has a better convergence rate.
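The abstract does not give the model or the solver, so the following is only a minimal sketch of what a primal-dual iteration on an augmented Lagrangian can look like. It assumes a toy problem (minimise 0.5*||x - c||^2 subject to A x = b) rather than the paper's loco-manipulation model, and it uses plain first-order gradient steps, not the inertial neural network the authors describe. The function name, the penalty weight `rho`, and the step size are all hypothetical choices made for illustration.

```python
def primal_dual_augmented_lagrangian(c, A, b, rho=1.0, step=0.1, iters=2000):
    """Minimise 0.5*||x - c||^2 subject to A x = b (toy problem, assumed here).

    Uses simultaneous gradient descent on x and gradient ascent on the
    multipliers of the augmented Lagrangian
        L(x, lam) = 0.5*||x - c||^2 + lam^T (A x - b) + (rho/2)*||A x - b||^2.
    """
    n, m = len(c), len(b)
    x = [0.0] * n
    lam = [0.0] * m
    for _ in range(iters):
        # Constraint residual r = A x - b at the current iterate.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the augmented Lagrangian with respect to x.
        grad_x = [
            (x[j] - c[j]) + sum(A[i][j] * (lam[i] + rho * r[i]) for i in range(m))
            for j in range(n)
        ]
        # Descend in the primal variable, ascend in the dual variable.
        x = [x[j] - step * grad_x[j] for j in range(n)]
        lam = [lam[i] + step * r[i] for i in range(m)]
    return x, lam


# Toy instance: the point closest to the origin on the line x1 + x2 = 1.
x_opt, lam_opt = primal_dual_augmented_lagrangian(
    c=[0.0, 0.0], A=[[1.0, 1.0]], b=[1.0]
)
```

On this toy instance the iterates approach the constrained minimiser (0.5, 0.5). The quadratic penalty term leaves that minimiser unchanged and adds damping along the constraint direction, which is one common reason for preferring the augmented form over the plain Lagrangian.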