Results 1 - 8 of 8
1.
Entropy (Basel) ; 23(8), 2021 Jul 23.
Article in English | MEDLINE | ID: mdl-34441081

ABSTRACT

Stochastic spatio-temporal processes are prevalent across domains ranging from plasma modeling and fluid turbulence to the wave functions of quantum systems. This letter studies a measure-theoretic description of such systems by describing them as evolutionary processes on Hilbert spaces, and in doing so, derives a framework for spatio-temporal manipulation from fundamental thermodynamic principles. This approach yields a variational optimization framework for controlling stochastic fields. The resulting scheme is applicable to a wide class of spatio-temporal processes and can be used for optimizing parameterized control policies. Our simulated experiments explore the application of two forms of this approach on four stochastic spatio-temporal processes, with results that suggest new perspectives and directions for studying stochastic control problems for spatio-temporal systems.

2.
Chaos ; 28(6): 061103, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29960410

ABSTRACT

Empirically derived continuum models of collective behavior among large populations of dynamic agents are a subject of intense study in several fields, including biology, engineering, and finance. We formulate and study a mean-field game model whose behavior mimics an empirically derived nonlocal homogeneous flocking model for agents with gradient self-propulsion dynamics. The mean-field game framework provides a non-cooperative optimal control description of the behavior of a population of agents in a distributed setting. In this description, each agent's state is driven by optimally controlled dynamics that result in a Nash equilibrium between itself and the population. The optimal control is computed by minimizing a cost that depends only on its own state and a mean-field term. The agent distribution in phase space evolves under the optimal feedback control policy. We exploit the low-rank perturbative nature of the nonlocal term in the forward-backward system of equations governing the state and control distributions and provide a closed-loop linear stability analysis demonstrating that our model exhibits bifurcations similar to those found in the empirical model. The present work is a step towards developing a set of tools for systematic analysis, and eventually design, of collective behavior of non-cooperative dynamic agents via an inverse modeling approach.

3.
IEEE Robot Autom Lett ; 7(1): 279-286, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35005225

ABSTRACT

One of the main challenges in autonomous robotic exploration and navigation in unknown and unstructured environments is determining where the robot can or cannot safely move. A significant source of difficulty in this determination arises from stochasticity and uncertainty, coming from localization error, sensor sparsity and noise, difficult-to-model robot-ground interactions, and disturbances to the motion of the vehicle. Classical approaches to this problem rely on geometric analysis of the surrounding terrain, which can be prone to modeling errors and can be computationally expensive. Moreover, modeling the distribution of uncertain traversability costs is a difficult task, compounded by the various error sources mentioned above. In this work, we take a principled learning approach to this problem. We introduce a neural network architecture for robustly learning the distribution of traversability costs. Because we are motivated by preserving the life of the robot, we tackle this learning problem from the perspective of learning tail-risks, i.e. the conditional value-at-risk (CVaR). We show that this approach reliably learns the expected tail risk given a desired probability risk threshold between 0 and 1, producing a traversability costmap which is more robust to outliers, more accurately captures tail risks, and is more computationally efficient, when compared against baselines. We validate our method on data collected by a legged robot navigating challenging, unstructured environments including an abandoned subway, limestone caves, and lava tube caves.
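
The tail-risk quantity named in this abstract, the conditional value-at-risk (CVaR), can be illustrated with a plain empirical estimate. The sketch below is not the paper's neural network; it only shows what CVaR measures on a list of sampled costs. The function names `empirical_var` and `empirical_cvar`, the quantile convention, and the sample data are assumptions made for illustration.

```python
import math


def empirical_var(costs, alpha):
    """Value-at-risk: the alpha-quantile of the sampled costs (higher cost = worse)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(costs)
    # Quantile index under a simple 'ceil' convention (an assumption of this sketch).
    k = min(len(ordered) - 1, max(0, math.ceil(alpha * len(ordered)) - 1))
    return ordered[k]


def empirical_cvar(costs, alpha):
    """CVaR: mean of the sampled costs at or above the alpha-level VaR."""
    var = empirical_var(costs, alpha)
    tail = [c for c in costs if c >= var]
    return sum(tail) / len(tail)


# Hypothetical traversability costs for one map cell.
sample_costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
risk_neutral = empirical_cvar(sample_costs, 0.0)  # alpha = 0 reduces to the plain mean
risk_averse = empirical_cvar(sample_costs, 0.9)   # mean of the worst costs only
```

With these samples, `alpha = 0.0` gives the mean cost 5.5, while `alpha = 0.9` averages only the tail {9, 10} and gives 9.5, which is why a CVaR costmap is more conservative than an expected-cost costmap.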

4.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 4694-4699, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33019040

ABSTRACT

Determining how the nervous system controls tendon-driven bodies remains an open question. Stochastic optimal control (SOC) has been proposed as a plausible analogy in the neuroscience community. SOC relies on solving the Hamilton-Jacobi-Bellman equation, which seeks to minimize a desired cost function for a given task with noisy controls. We evaluate and compare three SOC methodologies to produce tapping by a simulated planar 3-joint human index finger: iterative Linear Quadratic Gaussian (iLQG), Model-Predictive Path Integral Control (MPPI), and Deep Forward-Backward Stochastic Differential Equations (FBSDE). We show that, averaged over 128 repeats, these methodologies can place the fingertip at the desired final joint angles but, because of kinematic redundancy and the presence of noise, they each have joint trajectories and final postures with different means and variances. iLQG, in particular, had the largest kinematic variance and departure from the final desired joint angles. We demonstrate that MPPI and FBSDE have superior performance for such nonlinear, tendon-driven systems with noisy controls. Clinical relevance: The mathematical framework provided by MPPI and FBSDE may be best suited for tendon-driven anthropomorphic robots, exoskeletons, and prostheses for amputees.


Subjects
Algorithms; Tendons; Biomechanical Phenomena; Fingers; Humans; Normal Distribution
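
Of the three methods compared in this abstract, MPPI is the simplest to sketch: sample noisy control sequences, roll each one out, and average the noise with weights that decay exponentially in trajectory cost. The example below is a minimal sketch on a hypothetical one-dimensional integrator, not the tendon-driven finger model of the paper; `mppi_step`, `run_receding_horizon`, the dynamics, the cost, and all parameter values are assumptions made for illustration.

```python
import math
import random


def mppi_step(x0, u_nominal, dynamics, cost, n_samples=256, sigma=1.0, lam=1.0, rng=random):
    """One MPPI update: sample perturbed control sequences and reweight by exp(-cost / lam)."""
    horizon = len(u_nominal)
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, sigma) for _ in range(horizon)]
        x, total = x0, 0.0
        for t in range(horizon):
            x = dynamics(x, u_nominal[t] + eps[t])  # roll out the perturbed controls
            total += cost(x)
        noises.append(eps)
        costs.append(total)
    # Subtract the minimum cost before exponentiating for numerical stability.
    best = min(costs)
    weights = [math.exp(-(c - best) / lam) for c in costs]
    norm = sum(weights)
    return [
        u_nominal[t] + sum(w * eps[t] for w, eps in zip(weights, noises)) / norm
        for t in range(horizon)
    ]


def run_receding_horizon(steps=50, horizon=10, seed=0):
    """Drive a toy integrator x <- x + 0.1 * u from 0 toward the target 1 (assumed task)."""
    rng = random.Random(seed)

    def dynamics(x, u):
        return x + 0.1 * u

    def cost(x):
        return (x - 1.0) ** 2

    x, plan = 0.0, [0.0] * horizon
    for _ in range(steps):
        plan = mppi_step(x, plan, dynamics, cost, rng=rng)
        x = dynamics(x, plan[0])   # apply only the first control
        plan = plan[1:] + [0.0]    # shift the plan to warm-start the next update
    return x
```

Calling `run_receding_horizon()` returns the final state of the toy system. The temperature `lam` controls how sharply low-cost samples dominate the average; smaller values make the update greedier.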
5.
Resusc Plus ; 3: 100021, 2020 Sep.
Article in English | MEDLINE | ID: mdl-34223304

ABSTRACT

OBJECTIVES: We evaluated the feasibility of optimising coronary perfusion pressure (CPP) during cardiopulmonary resuscitation (CPR) with a closed-loop, machine-controlled CPR system (MC-CPR) that sends real-time haemodynamic feedback to a set of machine learning and control algorithms which determine compression/decompression characteristics over time. BACKGROUND: American Heart Association CPR guidelines (AHA-CPR) and standard mechanical devices employ a "one-size-fits-all" approach to CPR that fails to adjust compressions over time or individualise therapy, thus leading to deterioration of CPR effectiveness as duration exceeds 15-20 min. METHODS: CPR was administered for 30 min in a validated porcine model of cardiac arrest. Intubated anaesthetised pigs were randomly assigned to receive MC-CPR (6), mechanical CPR conducted according to AHA-CPR (6), or human-controlled CPR (HC-CPR) (10). MC-CPR directly controlled the CPR piston's amplitude of compression and decompression to maximise CPP over time. In HC-CPR a physician controlled the piston amplitudes to maximise CPP without any algorithmic feedback, while AHA-CPR had one compression depth without adaptation. RESULTS: MC-CPR significantly improved CPP throughout the 30-min resuscitation period compared to both AHA-CPR and HC-CPR. CPP and carotid blood flow (CBF) remained stable or improved with MC-CPR but deteriorated with AHA-CPR. HC-CPR showed initial but transient improvement that dissipated over time. CONCLUSION: Machine learning implemented in a closed-loop system successfully controlled CPR for 30 min in our preclinical model. MC-CPR significantly improved CPP and CBF compared to AHA-CPR and ameliorated the temporal haemodynamic deterioration that occurs with standard approaches.

6.
IEEE Trans Neural Netw Learn Syst ; 29(11): 5459-5474, 2018 Nov.
Article in English | MEDLINE | ID: mdl-29993609

ABSTRACT

We present a trajectory optimization approach to reinforcement learning in continuous state and action spaces, called probabilistic differential dynamic programming (PDDP). Our method represents systems dynamics using Gaussian processes (GPs), and performs local dynamic programming iteratively around a nominal trajectory in Gaussian belief spaces. Different from model-based policy search methods, PDDP does not require a policy parameterization and learns a time-varying control policy via successive forward-backward sweeps. A convergence analysis of the iterative scheme is given, showing that our algorithm converges to a stationary point globally under certain conditions. We show that prior model knowledge can be incorporated into the proposed framework to speed up learning, and a generalized optimization criterion based on the predicted cost distribution can be employed to enable risk-sensitive learning. We demonstrate the effectiveness and efficiency of the proposed algorithm using nontrivial tasks. Compared with a state-of-the-art GP-based policy search method, PDDP offers a superior combination of learning speed, data efficiency, and applicability.

7.
Annu Int Conf IEEE Eng Med Biol Soc ; 2016: 6357-6360, 2016 Aug.
Article in English | MEDLINE | ID: mdl-28269703

ABSTRACT

This study presents a reinforcement learning approach for the optimization of the proportional-integral gains of the feedback controller represented in a computational model of epilepsy. The chaotic oscillator model provides a feedback control systems view of the dynamics of an epileptic brain with an internal feedback controller representative of the natural seizure suppression mechanism within the brain circuitry. Normal and pathological brain activity is simulated in this model by adjusting the feedback gain values of the internal controller. With insufficient gains, the internal controller cannot provide enough feedback to the brain dynamics, causing an increase in correlation between different brain sites. This increase in synchronization results in the destabilization of the brain dynamics, which is representative of an epileptic seizure. To compensate for an insufficient internal controller, an external controller is designed using a proportional-integral feedback control strategy. Instead of hand-tuning the gains, a cross-entropy optimization algorithm is applied to the chaotic oscillator network model to learn the optimal feedback gains for the external controller, so that it provides sufficient control to the pathological brain and prevents seizure generation. The correlation between the dynamics of neural activity within different brain sites is calculated for experimental data to show similar dynamics of epileptic neural activity as simulated by the network of chaotic oscillators.


Subjects
Entropy; Epilepsy/pathology; Models, Neurological; Neurons/pathology; Brain/pathology; Brain/physiopathology; Epilepsy/physiopathology; Feedback; Machine Learning
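
The gain-learning step described in this abstract, cross-entropy optimization of proportional-integral gains, can be sketched as follows. This is a minimal illustration under stated assumptions: the plant is a toy first-order system dx/dt = -x + u tracking a unit setpoint, not the chaotic oscillator network of the paper, and `tracking_cost`, `cross_entropy_search`, and every parameter value are hypothetical.

```python
import random


def tracking_cost(kp, ki, steps=200, dt=0.05):
    """Integrated squared error of a PI loop on the assumed plant dx/dt = -x + u."""
    x, integral, total = 0.0, 0.0, 0.0
    for _ in range(steps):
        error = 1.0 - x                   # unit setpoint
        integral += error * dt
        u = kp * error + ki * integral    # PI control law
        x += dt * (-x + u)                # explicit Euler step of the plant
        total += error * error * dt
        if abs(x) > 1e6:                  # diverging loop: return a large penalty
            return 1e9
    return total


def cross_entropy_search(cost, mean, std, iterations=30, population=50, elite=10, seed=0):
    """Cross-entropy method: refit a Gaussian over gains to the lowest-cost samples."""
    rng = random.Random(seed)
    mean, std = list(mean), list(std)
    for _ in range(iterations):
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(population)]
        samples.sort(key=lambda gains: cost(*gains))
        best = samples[:elite]
        for i in range(len(mean)):
            column = [gains[i] for gains in best]
            mean[i] = sum(column) / elite
            variance = sum((v - mean[i]) ** 2 for v in column) / elite
            std[i] = variance ** 0.5 + 1e-3  # small floor keeps sampling from collapsing
    return mean


kp, ki = cross_entropy_search(tracking_cost, mean=[0.0, 0.0], std=[1.0, 1.0])
```

The search starts from zero gains, where the toy loop never moves toward the setpoint, and repeatedly narrows the sampling distribution around the gains with the lowest tracking cost. Swapping `tracking_cost` for a cost computed from another simulated model is the only change needed to reuse the search.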
8.
IEEE Rev Biomed Eng ; 2: 110-135, 2009.
Article in English | MEDLINE | ID: mdl-21687779

ABSTRACT

Computational models of the neuromuscular system hold the potential to allow us to reach a deeper understanding of neuromuscular function and clinical rehabilitation by complementing experimentation. By serving as a means to distill and explore specific hypotheses, computational models emerge from prior experimental data and motivate future experimental work. Here we review computational tools used to understand neuromuscular function including musculoskeletal modeling, machine learning, control theory, and statistical model analysis. We conclude that these tools, when used in combination, have the potential to further our understanding of neuromuscular function by serving as a rigorous means to test scientific hypotheses in ways that complement and leverage experimental data.
