Results 1 - 9 of 9
1.
IEEE Trans Cybern; PP, 2020 Jul 31.
Article in English | MEDLINE | ID: mdl-32735543

ABSTRACT

In this article, we develop a learning-based secure control framework for cyber-physical systems in the presence of sensor and actuator attacks. Specifically, we use a bank of observer-based estimators to detect the attacks while introducing a threat-detection level function. Under nominal conditions, the system operates with a nominal feedback controller, with the developed attack-monitoring process checking the reliability of the measurements. If an attacker injects attack signals into a subset of the sensors and/or actuators, then the attack-mitigation process is triggered and a two-player, zero-sum differential game is formulated, with the defender as the minimizer and the attacker as the maximizer. Next, we solve the underlying joint state estimation and attack mitigation problem and learn the secure control policy using a reinforcement-learning-based algorithm. Finally, two illustrative numerical examples show the efficacy of the proposed framework.
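To make the monitoring step concrete, below is a minimal sketch of one observer from the bank together with a leaky threat-detection level function. The plant matrices, observer gain, decay rate, and threshold are all illustrative placeholders, not values from the article.

```python
# Minimal sketch of residual-based attack monitoring, assuming a known
# linear plant (A, B, C) and an observer gain L; all matrices illustrative.
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])   # hypothetical plant dynamics
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.5], [0.2]])             # hypothetical observer gain

def monitor(x_hat, u, y, level, decay=0.95, threshold=0.3):
    """One Luenberger-observer step plus a threat-detection level update.

    The level is a leaky accumulator of the residual norm; crossing the
    threshold is the event that would trigger the mitigation mode.
    """
    residual = y - C @ x_hat
    x_hat = A @ x_hat + B @ u + L @ residual
    level = decay * level + np.linalg.norm(residual)
    under_attack = level > threshold
    return x_hat, level, under_attack
```

In this sketch, a sustained residual drives the accumulated level past the threshold, which is what would switch the system from the nominal controller to the game-based mitigation described above.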

2.
Article in English | MEDLINE | ID: mdl-32203039

ABSTRACT

We develop a method for obtaining safe initial policies for reinforcement learning via approximate dynamic programming (ADP) techniques for uncertain systems with discrete-time dynamics. We employ kernelized Lipschitz estimation to learn multiplier matrices, which are then used in a semidefinite programming framework to compute initial control policies that are admissible with provably high probability. Such admissible controllers enable safe initialization and constraint enforcement while guaranteeing exponential stability of the equilibrium of the closed-loop system.
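As a rough illustration of the estimation step, the sketch below computes the simplest data-driven Lipschitz-constant estimate over sampled pairs; the paper's kernelized estimator refines this idea, and the function name and interface here are hypothetical.

```python
# A minimal, data-driven Lipschitz-constant estimate from sampled function
# evaluations -- the simplest (non-kernelized) variant of the idea.
import numpy as np

def lipschitz_estimate(X, FX):
    """Estimate sup ||f(x)-f(x')|| / ||x-x'|| over all sampled pairs.

    X  : (n, d) array of sampled states
    FX : (n, m) array of f evaluated at those states
    """
    n = len(X)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dx = np.linalg.norm(X[i] - X[j])
            if dx > 1e-9:                  # skip near-duplicate samples
                best = max(best, np.linalg.norm(FX[i] - FX[j]) / dx)
    return best
```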

3.
IEEE Trans Neural Netw Learn Syst; 31(12): 5441-5455, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32054590

ABSTRACT

In this article, we present an intermittent framework for safe reinforcement learning (RL) algorithms. First, we develop a barrier-function-based system transformation that imposes state constraints while converting the original problem to an unconstrained optimization problem. Second, based on the derived optimal policies, two types of intermittent feedback RL algorithms are presented, namely a static one and a dynamic one. Finally, we leverage an actor/critic structure to solve the problem online while guaranteeing optimality, stability, and safety. Simulation results show the efficacy of the proposed approach.
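The barrier-function transformation can be illustrated with a minimal sketch: a state constrained to an interval (lo, hi) is mapped to an unconstrained variable, so the constrained problem can be solved in the transformed coordinates. The logit-style map below is one common choice assumed here for illustration; the article's specific barrier function may differ.

```python
# Minimal sketch of a barrier-function state transformation: x in (lo, hi)
# is mapped to an unconstrained s, and back. Staying finite in s-space
# automatically keeps x inside its constraint set.
import numpy as np

def to_unconstrained(x, lo, hi):
    """Map x in (lo, hi) to s in (-inf, inf)."""
    return np.log((x - lo) / (hi - x))

def to_constrained(s, lo, hi):
    """Inverse map: s in (-inf, inf) back to x in (lo, hi)."""
    return (lo + hi * np.exp(s)) / (1.0 + np.exp(s))

# quick round-trip check
x = 0.7
s = to_unconstrained(x, lo=-1.0, hi=1.0)
assert abs(to_constrained(s, lo=-1.0, hi=1.0) - x) < 1e-12
```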

4.
IEEE Trans Cybern; 50(8): 3752-3765, 2020 Aug.
Article in English | MEDLINE | ID: mdl-31478887

ABSTRACT

This article develops a novel distributed intermittent control framework with the ultimate goal of reducing the communication burden in containment control of multiagent systems. Agents are assumed to be subject to disturbances and to communicate over a directed graph. Both static and dynamic intermittent protocols are proposed. Intermittent H∞ containment control design is considered to attenuate the effect of the disturbances, and the game algebraic Riccati equation (GARE) is employed to design the coupling and feedback gains for both static and dynamic intermittent feedback. A novel scheme is then used to unify continuous, static, and dynamic intermittent containment protocols. Finally, simulation results verify the efficacy of the proposed approach.
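As a sketch of the gain-design step, the snippet below solves a continuous-time algebraic Riccati equation with SciPy and forms the feedback gain K. The article's GARE additionally accounts for the worst-case disturbance channel, so this disturbance-free version is only the simplest special case, and the matrices are illustrative.

```python
# Minimal sketch of Riccati-based gain design: solve the ARE and form
# K = R^{-1} B^T P. The paper's game ARE (GARE) extends this with a
# disturbance-attenuation term; matrices below are illustrative.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -0.5]])  # hypothetical agent dynamics
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                              # state weighting
R = np.array([[1.0]])                      # input weighting

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)            # feedback gain used in the protocol
print(K)
```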

5.
IEEE Int Conf Rehabil Robot; 2019: 682-688, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31374710

ABSTRACT

This paper presents a compliant, underactuated finger for the development of anthropomorphic robotic and prosthetic hands. Using two actuators, the finger achieves both flexion/extension and adduction/abduction at the metacarpophalangeal joint. The design employs moment-arm pulleys to drive the tendon laterally and amplify the abduction motion while maintaining the flexion motion. Particular emphasis has been given to the analysis of the mechanism. The proposed finger has been fabricated with the hybrid deposition manufacturing technique, and the efficiency of the actuation mechanism has been validated with experiments that include the computation of the reachable workspace, the assessment of the forces exerted at the fingertip, the demonstration of the feasible motions, and the presentation of the grasping and manipulation capabilities. The proposed mechanism facilitates the collaboration of the two actuators to increase the exerted finger forces. Moreover, the extended workspace allows the execution of dexterous manipulation tasks.


Subjects
Fingers/physiology, Biomechanical Phenomena, Compliance, Humans, Joints/physiology, Rotation, Tendons/physiology
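One of the validation steps mentioned in the abstract, assessing the force exerted at the fingertip, can be sketched with a planar two-link model via the Jacobian-transpose relation F = J^{-T} τ. The link lengths, joint angles, and torques below are illustrative, not the paper's values.

```python
# Back-of-the-envelope sketch: map joint torques of a planar two-link
# finger to the force exerted at the fingertip (F = J^{-T} tau).
import numpy as np

def fingertip_force(q1, q2, tau, l1=0.04, l2=0.03):
    """q1, q2: joint angles (rad); tau: (2,) joint torques (N*m);
    l1, l2: link lengths (m). Returns the planar fingertip force (N)."""
    # Jacobian of the fingertip position for a planar two-link chain
    J = np.array([
        [-l1*np.sin(q1) - l2*np.sin(q1+q2), -l2*np.sin(q1+q2)],
        [ l1*np.cos(q1) + l2*np.cos(q1+q2),  l2*np.cos(q1+q2)],
    ])
    return np.linalg.solve(J.T, tau)       # solve J^T F = tau for F

print(fingertip_force(0.3, 0.5, np.array([0.05, 0.02])))
```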
6.
IEEE Trans Neural Netw Learn Syst; 30(12): 3803-3817, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30946679

ABSTRACT

This paper presents an online kinodynamic motion planning algorithmic framework using the asymptotically optimal rapidly-exploring random tree (RRT*) and continuous-time Q-learning, which we term RRT-Q*. We formulate a model-free, Q-based advantage function and utilize integral reinforcement learning to develop tuning laws for the online approximation of the optimal cost and the optimal policy of continuous-time linear systems. Moreover, we provide rigorous Lyapunov-based proofs for the stability of the equilibrium point, which result in asymptotic convergence properties. A terminal-state evaluation procedure is introduced to facilitate the online implementation. We propose a static obstacle augmentation and a local replanning framework, based on topological connectedness, to locally recompute the robot's path and ensure collision-free navigation. We perform simulations and a qualitative comparison to evaluate the efficacy of the proposed methodology.
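For a rough picture of the planning layer, below is a minimal 2-D RRT sketch, without the rewiring step that makes RRT* asymptotically optimal and without the Q-learning tracking layer. The workspace bounds, step size, and iteration budget are arbitrary, and collision checking is stubbed out.

```python
# Minimal 2-D RRT sketch: grow a tree toward random samples until the goal
# region is reached, then walk parent pointers back to recover a path.
import random, math

def rrt(start, goal, steer=0.5, iters=2000, goal_tol=0.5):
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        sample = (random.uniform(0, 10), random.uniform(0, 10))
        i = min(range(len(nodes)),                 # nearest tree node
                key=lambda k: math.dist(nodes[k], sample))
        nx, ny = nodes[i]
        d = math.dist((nx, ny), sample)
        t = min(1.0, steer / d) if d > 0 else 0.0  # limit step length
        new = (nx + t * (sample[0] - nx), ny + t * (sample[1] - ny))
        # a real planner would reject `new` here if it collides
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, goal) < goal_tol:        # goal reached: extract path
            path, k = [], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None

print(rrt((1.0, 1.0), (9.0, 9.0)))
```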

7.
IEEE Trans Neural Netw Learn Syst; 29(6): 2042-2062, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29771662

ABSTRACT

This paper reviews the current state of the art in reinforcement learning (RL)-based feedback control solutions to optimal regulation and tracking of single-agent and multiagent systems. Existing RL solutions to both optimal H2 and H∞ control problems, as well as graphical games, are reviewed. RL methods learn the solution to optimal control and game problems online, using measured data along the system trajectories. We discuss Q-learning and the integral RL algorithm as core algorithms for discrete-time (DT) and continuous-time (CT) systems, respectively. Moreover, we discuss a new direction of off-policy RL for both CT and DT systems. Finally, we review several applications.
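As a concrete instance of the DT core algorithm mentioned above, the sketch below runs model-free Q-learning for a discrete-time LQR problem: a quadratic Q-function is fit by least squares on the Bellman equation, and the policy is improved from the fitted parameters. The plant matrices are illustrative and used only to simulate data; the learner never accesses them.

```python
# Minimal model-free Q-learning (policy iteration) for discrete-time LQR.
# Q(x,u) = z' H z with z = [x; u]; H is fit by least squares from data.
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])    # hypothetical plant (simulation only)
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

def svec(z):
    """Upper-triangle features of z z', matching a symmetric quadratic form."""
    M = np.outer(z, z)
    return M[np.triu_indices_from(M)]

K = np.zeros((1, 2))                       # initial policy (plant is stable)
for _ in range(10):                        # policy-iteration sweeps
    Phi, target = [], []
    for _ in range(60):                    # exploratory one-step transitions
        x = np.random.randn(2)
        u = -K @ x + 0.1 * np.random.randn(1)
        xn = A @ x + B @ u
        un = -K @ xn                       # on-policy next action
        cost = x @ Q @ x + u @ R @ u
        # Bellman equation: Q(x,u) - Q(xn,un) = cost
        Phi.append(svec(np.concatenate([x, u])) -
                   svec(np.concatenate([xn, un])))
        target.append(cost)
    h = np.linalg.lstsq(np.array(Phi), np.array(target), rcond=None)[0]
    H = np.zeros((3, 3))
    H[np.triu_indices(3)] = h
    H = (H + H.T) / 2                      # recover symmetric H from features
    K = np.linalg.solve(H[2:, 2:], H[2:, :2])  # greedy policy u = -K x

print(K)
```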

8.
IEEE Trans Neural Netw Learn Syst; 27(11): 2386-2398, 2016 Nov.
Article in English | MEDLINE | ID: mdl-26513810

ABSTRACT

This paper proposes a control algorithm based on adaptive dynamic programming to solve the infinite-horizon optimal control problem for known deterministic nonlinear systems with saturating actuators and nonquadratic cost functionals. The algorithm is based on an actor/critic framework, where a critic neural network (NN) is used to learn the optimal cost and an actor NN is used to learn the optimal control policy. The adaptive nature of the algorithm requires a persistence of excitation condition to be validated a priori, but this can be relaxed by using previously stored data concurrently with current data in the update of the critic NN. A robustifying control term is added to the controller to eliminate the effect of residual errors, leading to asymptotic stability of the closed-loop system. Simulation results show the effectiveness of the proposed approach for a controlled Van der Pol oscillator and for a power system plant.
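The concurrent-learning relaxation of persistence of excitation can be sketched as follows: the critic-weight update sums a normalized-gradient term over the current data point and over a memory of previously recorded data, so excitation captured in the past keeps driving the update. The interface and gains below are illustrative, not the paper's exact update law.

```python
# Minimal sketch of a concurrent-learning critic update for a critic that
# is linear in its parameters: V(x) ~ W' phi(x).
import numpy as np

def critic_update(W, phi_now, target_now, memory, lr=0.05):
    """One normalized-gradient step on the Bellman residual.

    W          : critic weight vector
    phi_now    : current Bellman regressor (difference of features)
    target_now : current observed stage cost
    memory     : stored (phi_j, target_j) pairs recorded earlier
    """
    def step(phi, target):
        e = phi @ W - target               # residual evaluated with current W
        return e * phi / (1.0 + phi @ phi) ** 2
    grad = step(phi_now, target_now)
    for phi_j, target_j in memory:         # concurrent (stored-data) terms
        grad += step(phi_j, target_j)
    return W - lr * grad
```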

9.
IEEE Trans Syst Man Cybern B Cybern; 41(1): 14-25, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20350860

ABSTRACT

Approximate dynamic programming (ADP) is a class of reinforcement learning methods that have shown their importance in a variety of applications, including feedback control of dynamical systems. ADP generally requires full information about the system's internal states, which is usually not available in practical situations. In this paper, we show how to implement ADP methods using only measured input/output data from the system. Linear dynamical systems with deterministic behavior are considered herein, as they are of great interest in the control systems community. In control system theory, these types of methods are referred to as output feedback (OPFB). The stochastic equivalent of the systems dealt with in this paper is a class of partially observable Markov decision processes. We develop both policy iteration and value iteration algorithms that converge to an optimal controller requiring only OPFB. It is shown that, similar to Q-learning, the new methods have the important advantage that knowledge of the system dynamics is not needed for the implementation of these learning algorithms or for the OPFB control. Only the order of the system, as well as an upper bound on its "observability index," must be known. The learned OPFB controller takes the form of a polynomial autoregressive moving-average controller with performance equivalent to that of the optimal state-variable feedback gain.


Subjects
Algorithms, Artificial Intelligence, Feedback, Learning, Markov Chains, Reinforcement (Psychology)
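A minimal sketch of the OPFB idea follows: the unmeasured state is replaced by a vector of the last N inputs and outputs, where N upper-bounds the observability index, and the Q-function and controller are learned over that vector. The variable names and the ARMA-style control comment are illustrative assumptions.

```python
# Minimal sketch of the output-feedback data vector that stands in for the
# unmeasured state x_k: the last N inputs and outputs, stacked.
import numpy as np
from collections import deque

N = 3                                      # assumed observability-index bound
u_hist = deque(maxlen=N)                   # u_{k-N}, ..., u_{k-1}
y_hist = deque(maxlen=N)                   # y_{k-N}, ..., y_{k-1}

def io_vector():
    """Stack past I/O into the regressor that replaces the state."""
    return np.concatenate([np.ravel(list(u_hist)), np.ravel(list(y_hist))])

# At every step: u_hist.append(u_k); y_hist.append(y_k);
# z_k = io_vector() then feeds the learned ARMA-style controller u_k = -K z_k.
```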