Results 1 - 17 of 17
1.
Neural Netw ; 174: 106243, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38531123

ABSTRACT

Generative Flow Networks (GFlowNets) aim to generate diverse trajectories whose final states are sampled with probability proportional to the reward, serving as a powerful alternative to reinforcement learning for exploratory control tasks. However, the individual-flow matching constraint in GFlowNets limits their application to multi-agent systems, especially continuous joint-control problems. In this paper, we propose a novel Multi-Agent generative Continuous Flow Networks (MACFN) method that enables multiple agents to perform cooperative exploration over various compositional continuous objects. Technically, MACFN trains decentralized individual-flow-based policies in a centralized, global-flow-based matching fashion. During centralized training, MACFN introduces a continuous flow decomposition network to deduce the flow contribution of each agent when only global rewards are available. Agents can then deliver actions solely based on their assigned local flow in a decentralized way, forming a joint policy distribution proportional to the rewards. To guarantee the expressiveness of the continuous flow decomposition, we theoretically derive a consistency condition on the decomposition network. Experimental results demonstrate that the proposed method outperforms state-of-the-art counterparts and exhibits better exploration capability. Our code is available at https://github.com/isluoshuang/MACFN.
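For readers less familiar with the flow-decomposition idea, the sketch below illustrates one simple way such a decomposition network could be set up: per-agent flow shares constrained to sum to the global flow, which is the kind of consistency condition the abstract refers to. The names and architecture are illustrative assumptions; the authors' actual implementation is in the linked repository.

```python
# Illustrative sketch (not the authors' code; see the linked repo for MACFN itself).
# A toy "flow decomposition network": it maps a global state to per-agent flow
# shares that are constrained to sum to the global flow.
import torch
import torch.nn as nn

class FlowDecomposition(nn.Module):
    def __init__(self, state_dim: int, n_agents: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_agents),
        )

    def forward(self, state: torch.Tensor, global_flow: torch.Tensor) -> torch.Tensor:
        # Softmax shares guarantee that the local flows sum to the global flow.
        shares = torch.softmax(self.net(state), dim=-1)     # (batch, n_agents)
        return shares * global_flow.unsqueeze(-1)           # per-agent local flows

# Each agent's decentralized policy would then be trained to match its assigned
# local flow, while only the global reward is observed.
decomp = FlowDecomposition(state_dim=8, n_agents=3)
state = torch.randn(16, 8)
global_flow = torch.rand(16)
local_flows = decomp(state, global_flow)
assert torch.allclose(local_flows.sum(-1), global_flow, atol=1e-5)
```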


Subjects
Learning, Policies, Reinforcement (Psychology), Reward
2.
Neural Netw ; 169: 32-43, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37857171

ABSTRACT

Currently, the fixed-time (FXT) synchronization of competitive artificial neural networks (ANNs) has been explored by proposing discontinuous control strategies based on the signum function and by analysing the short-term memory (STM) and long-term memory (LTM) subsystems separately. Separate analysis usually leads to complicated theoretical derivations and synchronization conditions, and the signum function inevitably causes chattering that degrades the performance of the control schemes. To address these challenges, this paper studies the FXT synchronization of competitive ANNs by establishing an FXT stability theorem of switching type and developing continuous control schemes based on a class of saturation functions. Firstly, unlike the traditional approach of studying the STM and LTM separately, the STM and LTM models are combined into a single higher-dimensional system to reduce the complexity of the theoretical analysis. Additionally, as an important theoretical preliminary, an FXT stability theorem with switching differential conditions is established, and high-precision estimates of the convergence time are presented explicitly in terms of several special functions. To achieve FXT synchronization of the considered competitive ANNs, a continuous pure power-law control scheme is developed by replacing the signum function with a saturation function, and synchronization criteria are then derived from the established FXT stability theorem. Finally, the theoretical results are illustrated by a numerical example and applied to image encryption.
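As background, the classical fixed-time stability bound that switching-type FXT theorems of this kind refine is the following; the paper's own theorem and constants are not reproduced here.

```latex
% Classical (Polyakov-type) fixed-time stability bound: if a Lyapunov function V satisfies
\[
\dot V(x) \le -\alpha V^{p}(x) - \beta V^{q}(x), \qquad \alpha,\beta>0,\; 0<p<1<q,
\]
% then the origin is fixed-time stable with a settling time bounded independently of x_0:
\[
T(x_0) \;\le\; T_{\max} \;=\; \frac{1}{\alpha(1-p)} + \frac{1}{\beta(q-1)} .
\]
```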


Assuntos
Algoritmos , Redes Neurais de Computação , Fatores de Tempo
3.
Neural Netw ; 167: 104-117, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37647740

ABSTRACT

The implementation of robotic reinforcement learning is hampered by problems such as unspecified reward functions and high training costs. Many previous works have used cross-domain policy transfer to obtain a policy for the problem domain. However, these studies require paired and aligned dynamics trajectories or other interactions with the environment. We propose a cross-domain dynamics alignment framework for problem-domain policy acquisition that can transfer a policy trained in the source domain to the problem domain. Our framework aims to learn dynamics alignment across two domains that differ in the agents' physical parameters (armature, rotation range, or torso mass) or the agents' morphologies (limbs). Most importantly, we learn dynamics alignment between two domains using unpaired and unaligned dynamics trajectories. For these two scenarios, we propose a cross-physics-domain policy adaptation algorithm (CPD) and a cross-morphology-domain policy adaptation algorithm (CMD) based on our cross-domain dynamics alignment framework. To improve the performance of the policy in the source domain, so that a better policy can be transferred to the problem domain, we propose the Boltzmann TD3 (BTD3) algorithm. We conduct diverse experiments on agent continuous-control domains to demonstrate the performance of our approaches. Experimental results show that our approaches obtain better policies and higher rewards for agents in the problem domains, even when the problem-domain dataset is small.


Subjects
Algorithms, Learning, Physics, Policies, Reinforcement (Psychology)
4.
Exp Brain Res ; 241(1): 81-104, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36371477

ABSTRACT

Reaching movements are guided by estimates of the target object's location. Since the precision of instantaneous estimates is limited, one might accumulate visual information over time. However, if the object is not stationary, accumulating information can bias the estimate. How do people deal with this trade-off between improving precision and reducing bias? To find out, we asked participants to tap on targets. The targets were either stationary or moving, with jitter added to their positions. By analysing the response to the jitter, we show that people continuously use the latest available information about the target's position. When the target is moving, they combine this instantaneous target position with an extrapolation based on the target's average velocity during the last several hundred milliseconds. This strategy leads to a bias if the target's velocity changes systematically. Having people tap on accelerating targets showed that the bias that results from ignoring systematic changes in velocity is removed by compensating for endpoint errors when such errors are consistent across trials. We conclude that combining simple continuous updating of visual information with the low-pass filter characteristics of muscles, and adjusting movements to compensate for errors made in previous trials, leads to precise and accurate human goal-directed movements.
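A minimal numerical sketch of the strategy described above, with assumed values for the visuomotor delay and the averaging window (both illustrative, not the paper's fitted parameters):

```python
import numpy as np

def predicted_tap_position(positions, timestamps, delay=0.1, window=0.4):
    """Illustrative model of the described strategy (assumed parameters):
    latest observed target position plus an extrapolation based on the
    target's average velocity over the last `window` seconds."""
    t_now = timestamps[-1]
    recent = timestamps >= t_now - window
    # Average velocity over the recent window (finite difference of the endpoints).
    dt = timestamps[recent][-1] - timestamps[recent][0]
    avg_vel = (positions[recent][-1] - positions[recent][0]) / dt
    return positions[-1] + avg_vel * delay

# Example: a target moving at 0.5 m/s sampled at 100 Hz.
t = np.arange(0, 1.0, 0.01)
x = 0.5 * t
print(predicted_tap_position(x, t))   # ~ latest position + 0.5 * 0.1
```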


Subjects
Motion Perception, Psychomotor Performance, Humans, Psychomotor Performance/physiology, Motion Perception/physiology, Uncertainty, Motion (Physics), Movement/physiology
5.
Article in English | MEDLINE | ID: mdl-36545711

ABSTRACT

The paper discusses the reasons and the need for controlling the level of microbiological contamination of juice in a vertical extractor, as well as the methods available for controlling it. The requirements and possibilities for controlling the microbiological contamination level of juice extracted in the vertical extractor are described using measurement of the redox potential value. Aerating the extractor in a controlled manner, regulating the pH level of the juice, and implementing one of the presented proposals for recording the measurement results make it possible to automatically regulate the microbiological contamination level of the juice in the tower extractor.

6.
J Pers Med ; 12(4), 2022 Apr 08.
Article in English | MEDLINE | ID: mdl-35455713

ABSTRACT

Introduction: Understanding the factors associated with the development of ventilator-associated pneumonia (VAP) in critically ill patients in the intensive care unit (ICU) will allow for better prevention and control of VAP. The aim of the study was to evaluate the incidence of VAP and to determine risk factors for, and protective factors against, VAP. Design: Mixed prospective and retrospective cohort study. Methods: The cohort involved 371 critically ill patients who received standard interventions to prevent VAP. Additionally, patients in the prospective cohort were provided with continuous automatic pressure control in tapered cuffs of endotracheal or tracheostomy tubes and continuous automatic subglottic secretion suction. Logistic regression was used to assess factors affecting VAP. Results: 52 (14%) patients developed VAP, and the incidence density of VAP per 1000 ventilator days was 9.7. The median time to VAP onset was 7 [4; 13] days. Early- and late-onset VAP occurred in 6.2% and 7.8% of patients, respectively. According to the multivariable logistic regression analysis, tracheotomy (OR = 1.6; 95% CI: 1.1 to 2.31), multidrug-resistant bacteria isolated from cultures of lower respiratory secretions (OR = 2.73; 95% CI: 1.83 to 4.07), and ICU length of stay >5 days (OR = 3.32; 95% CI: 1.53 to 7.19) were positively associated with VAP, while continuous control of cuff pressure and subglottic secretion suction used together were negatively associated with VAP (OR = 0.61; 95% CI: 0.43 to 0.87). Conclusions: Tracheotomy, multidrug-resistant bacteria, and ICU length of stay >5 days were independent risk factors for VAP, whereas continuous control of cuff pressure and subglottic secretion suction used together were protective factors against VAP.
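As an illustration of the analysis reported above, the following sketch fits a multivariable logistic regression on synthetic data and reads off odds ratios as exponentiated coefficients; the variable names are assumptions and the data are simulated, not the study's.

```python
# Sketch of a multivariable logistic-regression analysis with odds ratios and 95% CIs,
# on synthetic data (the study's individual-level data are not reproduced here).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 371
df = pd.DataFrame({
    "tracheotomy": rng.integers(0, 2, n),
    "mdr_bacteria": rng.integers(0, 2, n),
    "icu_gt_5_days": rng.integers(0, 2, n),
    "cuff_and_suction": rng.integers(0, 2, n),
})
# Simulated outcome with assumed coefficients, only to make the example runnable.
logit = -2.2 + 0.47*df.tracheotomy + 1.0*df.mdr_bacteria + 1.2*df.icu_gt_5_days - 0.5*df.cuff_and_suction
df["vap"] = rng.random(n) < 1 / (1 + np.exp(-logit))

model = sm.Logit(df["vap"].astype(float), sm.add_constant(df.drop(columns="vap"))).fit(disp=0)
odds_ratios = np.exp(pd.concat([model.params, model.conf_int()], axis=1))
odds_ratios.columns = ["OR", "2.5%", "97.5%"]
print(odds_ratios)   # OR = exp(coefficient), with its 95% confidence interval
```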

7.
Front Neurorobot ; 16: 1075647, 2022.
Article in English | MEDLINE | ID: mdl-36742191

ABSTRACT

Deep reinforcement learning (DRL) combines reinforcement learning algorithms with deep neural networks (DNNs). Spiking neural networks (SNNs) have been shown to be a biologically plausible and energy-efficient alternative to DNNs. Since the introduction of surrogate-gradient approaches, which overcome the discontinuity of the spike function, SNNs can be trained with the backpropagation-through-time (BPTT) algorithm. While largely explored on supervised learning problems, little work has investigated the use of SNNs as function approximators in DRL. Here we show how SNNs can be applied to different DRL algorithms, such as Deep Q-Network (DQN) and Twin-Delayed Deep Deterministic Policy Gradient (TD3), for discrete and continuous action-space environments, respectively. We found that SNNs are sensitive to the additional hyperparameters introduced by spiking neuron models, such as current and voltage decay factors and firing thresholds, and that extensive hyperparameter tuning is unavoidable. However, we show that increasing the simulation time of SNNs, as well as applying a two-neuron encoding to the input observations, helps reduce the sensitivity to the membrane parameters. Furthermore, we show that randomizing the membrane parameters, instead of selecting uniform values for all neurons, has a stabilizing effect on training. We conclude that SNNs can be utilized to learn complex continuous-control problems with state-of-the-art DRL algorithms. While the training complexity increases, the resulting SNNs can be directly executed on neuromorphic processors and potentially benefit from their high energy efficiency.
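The two-neuron input encoding mentioned above can be sketched as follows (an assumed, minimal version: each observation dimension drives one neuron with its positive part and another with its negative part; the authors' exact scheme may differ in details).

```python
import numpy as np

def two_neuron_encoding(obs: np.ndarray) -> np.ndarray:
    """Map each observation dimension to two non-negative input currents:
    one neuron receives the positive part, a second the negative part."""
    return np.concatenate([np.maximum(obs, 0.0), np.maximum(-obs, 0.0)])

print(two_neuron_encoding(np.array([0.3, -1.2, 0.0])))
# -> [0.3 0.  0.  0.  1.2 0. ]
```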

8.
J Clin Med ; 10(21), 2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34768471

ABSTRACT

The ventilator bundle consists of multiple methods to reduce ventilator-associated pneumonia (VAP) rates in intensive care units (ICUs). The aim of the study was to evaluate how continuous automatic pressure control in tapered cuffs of endotracheal/tracheostomy tubes, applied along with continuous automatic subglottic secretion suction, affects the incidence of VAP. In the prospective cohort (n = 198), the standard VAP bundle was modified by continuous automatic pressure control in the tapered cuffs of endotracheal/tracheostomy tubes and subglottic secretion suction. VAP incidence, time to VAP onset, invasive mechanical ventilation days/free days, length of ICU stay, ICU mortality, and multidrug-resistant bacteria were assessed and compared to the retrospective cohort (n = 173) with the standard bundle (intermittent pressure control of a standard cuff, no subglottic secretion suction). A lower incidence of VAP (9.6% vs. 19.1%) and of early-onset VAP (1.5% vs. 8.1%) was found in the prospective compared to the retrospective cohort (p < 0.01). Patients in the prospective cohort were less likely to develop VAP (RR = 0.50; 95% CI: 0.29 to 0.85) and early-onset VAP (RR = 0.19; 95% CI: 0.05 to 0.64) and had a longer time to VAP onset (median 9 vs. 5 days; p = 0.03). There was no significant difference (p > 0.05) between the cohorts in invasive mechanical ventilation days/free days, length of ICU stay, ICU mortality, or multidrug-resistant bacteria. Modifying the bundle for prevention of VAP can reduce early-onset VAP and the total incidence of VAP and delay the time of VAP occurrence.
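For reference, the reported relative risk can be reproduced approximately from the published percentages and cohort sizes with the standard Katz log-method confidence interval; the event counts below are reconstructed from the percentages, not taken from the raw data.

```python
import math

def relative_risk(a, n1, c, n2):
    """Relative risk of two proportions with a 95% CI (Katz log method)."""
    rr = (a / n1) / (c / n2)
    se = math.sqrt(1/a - 1/n1 + 1/c - 1/n2)
    lo, hi = rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se)
    return rr, lo, hi

# Event counts reconstructed from the reported rates (9.6% of 198 vs. 19.1% of 173);
# an approximation for illustration only.
print(relative_risk(19, 198, 33, 173))   # ~ (0.50, 0.30, 0.85)
```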

9.
Elife ; 10, 2021 Jun 25.
Article in English | MEDLINE | ID: mdl-34169838

ABSTRACT

How do people learn to perform tasks that require continuous adjustments of motor output, like riding a bicycle? People rely heavily on cognitive strategies when learning discrete movement tasks, but such time-consuming strategies are infeasible in continuous control tasks that demand rapid responses to ongoing sensory feedback. To understand how people can learn to perform such tasks without the benefit of cognitive strategies, we imposed a rotation/mirror reversal of visual feedback while participants performed a continuous tracking task. We analyzed behavior using a system identification approach, which revealed two qualitatively different components of learning: adaptation of a baseline controller and formation of a new, task-specific continuous controller. These components exhibited different signatures in the frequency domain and were differentially engaged under the rotation/mirror reversal. Our results demonstrate that people can rapidly build a new continuous controller de novo and can simultaneously deploy this process with adaptation of an existing controller.
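A hedged sketch of the kind of frequency-domain system identification described above: the complex gain of the response relative to the target at the frequencies contained in the target signal (the sampling rate, frequencies, and toy response here are assumptions, not the study's).

```python
import numpy as np

def frequency_response(stimulus, response, fs, freqs):
    """Complex gain of the response relative to the stimulus at given frequencies,
    estimated from the ratio of their Fourier coefficients."""
    n = len(stimulus)
    f = np.fft.rfftfreq(n, d=1/fs)
    S, R = np.fft.rfft(stimulus), np.fft.rfft(response)
    idx = [np.argmin(np.abs(f - fq)) for fq in freqs]
    return R[idx] / S[idx]          # abs() = gain, angle() = phase lag

fs, t = 60.0, np.arange(0, 40, 1/60.0)
freqs = [0.1, 0.45, 0.85]
target = sum(np.sin(2*np.pi*fq*t) for fq in freqs)
hand = 0.8 * np.roll(target, 6)     # toy response: attenuated and delayed by 0.1 s
print(np.abs(frequency_response(target, hand, fs, freqs)))   # ~0.8 at each frequency
```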


Subjects
Psychological Adaptation, Learning, Motor Skills, Psychomotor Performance, Visual Perception, Adult, Female, Humans, Male, Young Adult
10.
Nonlinear Dyn ; 100(3): 2933-2951, 2020.
Article in English | MEDLINE | ID: mdl-32421101

ABSTRACT

This paper studies a rumor propagation model on heterogeneous networks in a multilingual environment. Firstly, a rumor propagation model with spreaders of two languages, in which an immunization mechanism acting on the ignorant population is considered, is proposed on heterogeneous networks. Secondly, the basic reproduction number and the dynamic behaviors are analyzed using the next-generation matrix method and Lyapunov stability theory, respectively. Moreover, two control strategies are designed to effectively suppress the spread of the rumor. The first is a continuous control strategy: by applying real-time control to the spreaders, the rumor spreading time can be greatly reduced and the rumor dies out in a short time. The second is an event-triggered impulsive control strategy, which can effectively reduce the consumption of resources while still ensuring the extinction of the rumor. Finally, the correctness of the theoretical analysis and the feasibility of the control methods are verified by numerical simulations.
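The next-generation matrix method mentioned above defines the basic reproduction number in the standard way (the paper's specific new-spreader and transition matrices are not reproduced here):

```latex
% Standard next-generation-matrix definition of the basic reproduction number:
\[
R_0 \;=\; \rho\!\left(F V^{-1}\right),
\]
% where F is the Jacobian of the new-infection (here, new-spreader) terms, V the Jacobian
% of the transition/removal terms at the rumor-free equilibrium, and \rho(\cdot) the spectral radius.
```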

11.
Front Robot AI ; 7: 98, 2020.
Article in English | MEDLINE | ID: mdl-33501265

ABSTRACT

We analyze the efficacy of modern neuro-evolutionary strategies for continuous control optimization. Overall, the results collected on a wide variety of qualitatively different benchmark problems indicate that these methods are generally effective and scale well with respect to the number of parameters and the complexity of the problem. Moreover, they are relatively robust with respect to the setting of hyper-parameters. The comparison of the most promising methods indicates that the OpenAI-ES algorithm outperforms or equals the other algorithms on all considered problems. Moreover, we demonstrate how the reward functions optimized for reinforcement learning methods are not necessarily effective for evolutionary strategies and vice versa. This finding can lead to reconsideration of the relative efficacy of the two classes of algorithm since it implies that the comparisons performed to date are biased toward one or the other class.
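For context, the core OpenAI-ES update being benchmarked is the antithetic finite-difference gradient estimate below; the full algorithm also uses fitness shaping, distributed mirrored sampling, and Adam, which are omitted in this sketch.

```python
import numpy as np

def openai_es_step(theta, fitness, alpha=0.02, sigma=0.05, pop=50,
                   rng=np.random.default_rng(0)):
    """One basic OpenAI-ES update with antithetic sampling:
    theta <- theta + alpha * (1 / (2*pop*sigma)) * sum_i (F(theta+s*eps_i) - F(theta-s*eps_i)) eps_i."""
    eps = rng.standard_normal((pop, theta.size))
    rewards = np.array([fitness(theta + sigma * e) for e in eps] +
                       [fitness(theta - sigma * e) for e in eps])
    grad = (rewards[:pop, None] * eps - rewards[pop:, None] * eps).sum(0) / (2 * pop * sigma)
    return theta + alpha * grad

# Toy objective: maximize -||theta - 1||^2.
f = lambda w: -np.sum((w - 1.0) ** 2)
theta = np.zeros(5)
for _ in range(200):
    theta = openai_es_step(theta, f)
print(theta)   # approaches [1, 1, 1, 1, 1]
```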

12.
Front Robot AI ; 7: 566037, 2020.
Article in English | MEDLINE | ID: mdl-33585570

ABSTRACT

Control theory provides engineers with a multitude of tools to design controllers that manipulate the closed-loop behavior and stability of dynamical systems. These methods rely heavily on insights into the mathematical model governing the physical system. However, in complex systems, such as autonomous underwater vehicles performing the dual objective of path following and collision avoidance, decision making becomes nontrivial. We propose a solution using state-of-the-art Deep Reinforcement Learning (DRL) techniques to develop autonomous agents capable of achieving this hybrid objective without having a priori knowledge about the goal or the environment. Our results demonstrate the viability of DRL in path following and avoiding collisions towards achieving human-level decision making in autonomous vehicle systems within extreme obstacle configurations.

13.
ISA Trans ; 98: 483-495, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31515092

ABSTRACT

This work proposes a model-free robust control for cable-driven manipulators subject to disturbances. To achieve accurate, singularity-free, and fast dynamical control performance, we design a new nonsingular fast terminal sliding-mode (NFTSM) surface utilizing a new continuous TSM-type switch element. By replacing the integer power with a fractional one in the error dynamics, the designed TSM-type switch element can effectively enhance the dynamical performance of the NFTSM surface. The time-delay estimation (TDE) technique is applied to cancel out the complicated nonlinear dynamics, yielding a model-free scheme. Thanks to the designed NFTSM surface, the adopted reaching law, and TDE, the proposed controller provides good comprehensive control performance. Stability is analyzed theoretically, along with comparisons of control precision and convergence speed. Finally, comparative experiments were conducted to demonstrate the advantages of the proposed controller.
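For orientation, the standard forms behind the abstract's terminology are sketched below: the usual time-delay estimation of the lumped unknown dynamics and a commonly used NFTSM surface. These are generic textbook forms, not the paper's specific surface or its continuous switch element.

```latex
% Time-delay estimation: with the dynamics written as \bar{M}\ddot{q} + H(q,\dot{q},t) = \tau,
% the lumped unknown term is estimated from measurements one sampling period L earlier:
\[
\hat{H}(t) \;\approx\; H(t-L) \;=\; \tau(t-L) - \bar{M}\,\ddot{q}(t-L).
\]
% A typical nonsingular fast terminal sliding-mode surface on the tracking error e:
\[
s \;=\; e + k_1\,|e|^{\alpha_1}\operatorname{sgn}(e) + k_2\,|\dot e|^{\alpha_2}\operatorname{sgn}(\dot e),
\qquad 1<\alpha_2<2,\;\; \alpha_1>\alpha_2,\;\; k_1,k_2>0 .
\]
```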

14.
Sensors (Basel) ; 19(18), 2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31491927

ABSTRACT

In this paper, we propose a novel deep reinforcement learning (DRL) algorithm that can navigate non-holonomic robots with continuous control in unknown dynamic environments with moving obstacles. We call the approach MK-A3C (Memory and Knowledge-based Asynchronous Advantage Actor-Critic) for short. As its first component, MK-A3C builds a GRU-based memory neural network to enhance the robot's capability for temporal reasoning. Robots without such memory tend to lack rationality in the face of incomplete and noisy estimates of complex environments. Additionally, robots endowed with a certain memory ability by MK-A3C can avoid local-minimum traps by estimating the environmental model. Secondly, MK-A3C combines a domain-knowledge-based reward function with a transfer-learning-based training task architecture, which can solve the policy non-convergence problems caused by sparse rewards. These improvements allow MK-A3C to efficiently navigate robots in unknown dynamic environments and to satisfy kinetic constraints while handling moving objects. Simulation experiments show that, compared with existing methods, MK-A3C can achieve successful robotic navigation in unknown and challenging environments by outputting continuous acceleration commands.

15.
Sensors (Basel) ; 19(7), 2019 Mar 30.
Article in English | MEDLINE | ID: mdl-30935035

ABSTRACT

Model-free reinforcement learning is a powerful and efficient machine-learning paradigm that has been widely used in the robotic control domain. In the reinforcement learning setting, the value-function method learns policies by maximizing the state-action value (Q value), but it suffers from inaccurate Q estimation and results in poor performance in stochastic environments. To mitigate this issue, we present an approach based on the actor-critic framework; in the critic branch, we modify the manner of estimating the Q value by introducing an advantage function, as in the dueling network, which estimates the action-advantage value. Because the action-advantage value is independent of state and environment noise, we use it as a fine-tuning factor for the estimated Q value. We refer to this approach as the actor-dueling-critic (ADC) network, since the framework is inspired by the dueling network. Furthermore, we redesign the dueling-network part in the critic branch to adapt it to continuous action spaces. The method was tested on gym classic control environments and an obstacle-avoidance environment, and we designed a noisy environment to test training stability. The results indicate that the ADC approach is more stable and converges faster than the DDPG method in noisy environments.
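A minimal sketch of the dueling-style critic decomposition the ADC network builds on, with the advantage stream conditioned on the continuous action; this is an illustrative assumption about the structure, not the authors' exact redesign of the advantage branch.

```python
# Illustrative dueling-style critic for continuous actions: Q(s,a) = V(s) + A(s,a),
# where the advantage acts as a fine-tuning term on the state value.
import torch
import torch.nn as nn

class DuelingCritic(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.value = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, 1))
        self.advantage = nn.Sequential(nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, 1))

    def forward(self, state, action):
        # State value plus an action-dependent advantage correction.
        return self.value(state) + self.advantage(torch.cat([state, action], dim=-1))

q = DuelingCritic(state_dim=3, action_dim=1)
print(q(torch.randn(4, 3), torch.randn(4, 1)).shape)   # torch.Size([4, 1])
```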


Subjects
Algorithms, Deep Learning, Markov Chains, Robotics
16.
Ultramicroscopy ; 190: 66-76, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29689446

ABSTRACT

Scanning ion conductance microscopy (SICM), a scanning probe microscopy technique with the advantage of non-contact, high-resolution, three-dimensional imaging of sample surfaces, has been widely applied to characterize sample topography, especially for soft materials. However, the time-consuming imaging process of SICM restricts its further application, for example to characterizing dynamic changes of a sample surface. In this work, a fast control mode of SICM, termed the continuous control mode, has been developed. In this mode, the SICM probe (i.e., the pipette) is controlled by speed instructions along the axial direction of the pipette (the Z axis), and the pipette position is determined by a position sensor. Compared to the conventional piezo control mode of SICM (i.e., the stepwise control mode), in which the pipette is controlled by position instructions and moves step by step, the continuous control mode performs continuous movement of the pipette along the Z axis and overcomes the time-consuming problem caused by the repeated acceleration and deceleration of the pipette in the stepwise mode. Moreover, the imaging resolution along the Z axis is not restricted by the pipette movement step, and the imaging rate in the continuous control mode can be significantly increased without loss of imaging quality. The approach speed of the pipette in the continuous control mode can reach 300 nm/ms, which is much faster than that in the stepwise mode. The surfaces of soft polydimethylsiloxane (PDMS) samples with three different patterns, a hard metal grating sample, and cardiac fibroblasts as a biological sample were scanned by SICM using both the continuous control mode and the stepwise approach mode. The obtained SICM images of the sample topography show that the continuous control mode can not only reduce the imaging deviation but also efficiently improve the scanning rate of SICM. Furthermore, the continuous control mode reconstructs the sample topography more stably than the stepwise control mode. The continuous control mode developed in this work provides an efficient and reliable control strategy for improving the imaging performance of SICM systems and can therefore potentially be applied to dynamic characterization of various samples in materials science, biology, and chemistry.
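A schematic sketch of the continuous (velocity-command) approach described above, with a toy simulated instrument standing in for the hardware; real SICM APIs, current models, and parameters will differ.

```python
def approach_continuous(read_current, command_speed, read_z, setpoint_ratio=0.99):
    """Command a constant approach speed and stop when the ion current drops to the
    setpoint fraction of its far-from-surface reference value; the pipette position
    comes from the sensor rather than from counted steps."""
    i_ref = read_current()
    command_speed(300.0)                      # nm/ms, the approach speed quoted above
    while read_current() > setpoint_ratio * i_ref:
        pass                                  # continuous motion, no step-and-settle cycle
    command_speed(0.0)
    return read_z()                           # surface height at this pixel

# Toy simulation standing in for the instrument.
state = {"z": 5000.0, "v": 0.0}               # nm above the surface, nm/ms
def command_speed(v): state["v"] = v
def read_current():
    state["z"] -= state["v"] * 0.01           # advance one 0.01 ms control tick
    return 1.0 / (1.0 + 50.0 / max(state["z"], 1.0))   # current drops near the surface
def read_z():
    return state["z"]

print(approach_continuous(read_current, command_speed, read_z))
```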


Subjects
Ions/chemistry, Scanning Probe Microscopy/methods, Animals, Dimethylpolysiloxanes/chemistry, Equipment Design/methods, Fibroblasts/physiology, Heart/physiology, Nylons/chemistry, Rats, Sprague-Dawley Rats
17.
Respir Care ; 62(10): 1316-1323, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28720674

ABSTRACT

Microaspiration of contaminated oropharyngeal and gastric secretions is the main mechanism for ventilator-associated pneumonia (VAP) in critically ill patients. Improving the performance of tracheal tubes in reducing microaspiration is one potential means to prevent VAP. The aim of this narrative review is to discuss recent findings on the impact of tracheal tube design on VAP prevention. Several randomized controlled studies have reported that subglottic secretion drainage (SSD) is efficient in VAP prevention. Meta-analyses have reported conflicting results regarding the impact of SSD on duration of mechanical ventilation, and one animal study raised concern about SSD-related tracheal lesions. However, this measure appears to be cost-effective. Therefore, SSD should probably be used in all patients with expected duration of mechanical ventilation > 48 h. Three randomized controlled trials have shown that tapered-cuff tracheal tubes are not useful to prevent VAP and should probably not be used in critically ill patients. Further studies are required to confirm the promising effects of continuous control of cuff pressure, polyurethane-cuffed, silver-coated, and low-volume low-pressure tracheal tubes. There is moderate evidence for the use of SSD and strong evidence against the use of tapered-cuff tracheal tubes in critically ill patients for VAP prevention. However, more data on the safety and cost-effectiveness of these measures are needed. Other tracheal tube-related preventive measures require further investigation.


Subjects
Equipment Design, Intratracheal Intubation/instrumentation, Ventilator-Associated Pneumonia/prevention & control, Artificial Respiration/adverse effects, Critical Illness/therapy, Drainage/instrumentation, Drainage/methods, Glottis/metabolism, Humans, Intensive Care Units, Ventilator-Associated Pneumonia/etiology, Polyurethanes, Pressure, Trachea