Search | Portal Regional da BVS

1.

Warming up recurrent neural networks to maximise reachable multistability greatly improves learning.

Lambrechts, Gaspard; De Geeter, Florent; Vecoven, Nicolas; Ernst, Damien; Drion, Guillaume.

Neural Netw ; 166: 645-669, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37604075

ABSTRACT

Training recurrent neural networks is known to be difficult when time dependencies become long. In this work, we show that most standard cells only have one stable equilibrium at initialisation, and that learning on tasks with long time dependencies generally occurs once the number of network stable equilibria increases; a property known as multistability. Multistability is often not easily attained by initially monostable networks, making learning of long time dependencies between inputs and outputs difficult. This insight leads to the design of a novel way to initialise any recurrent cell connectivity through a procedure called "warmup" to improve its capability to learn arbitrarily long time dependencies. This initialisation procedure is designed to maximise network reachable multistability, i.e., the number of equilibria within the network that can be reached through relevant input trajectories, in few gradient steps. We show on several information restitution, sequence classification, and reinforcement learning benchmarks that warming up greatly improves learning speed and performance, for multiple recurrent cells, but sometimes impedes precision. We therefore introduce a double-layer architecture initialised with a partial warmup that is shown to greatly improve learning of long time dependencies while maintaining high levels of precision. This approach provides a general framework for improving learning abilities of any recurrent cell when long time dependencies are present. We also show empirically that other initialisation and pretraining procedures from the literature implicitly foster reachable multistability of recurrent cells.

Subjects

Learning; Reinforcement, Psychology; Benchmarking; Intelligence; Neural Networks, Computer

2.

Parallax Inference for Robust Temporal Monocular Depth Estimation in Unstructured Environments.

Fonder, Michaël; Ernst, Damien; Van Droogenbroeck, Marc.

Sensors (Basel) ; 22(23), 2022 Dec 01.

Article in English | MEDLINE | ID: mdl-36502073

ABSTRACT

Estimating the distance to objects is crucial for autonomous vehicles, but cost, weight or power constraints sometimes prevent the use of dedicated depth sensors. In this case, the distance has to be estimated from on-board mounted RGB cameras, which is a complex task especially for environments such as natural outdoor landscapes. In this paper, we present a new depth estimation method suitable for use in such landscapes. First, we establish a bijective relationship between depth and the visual parallax of two consecutive frames and show how to exploit it to perform motion-invariant pixel-wise depth estimation. Then, we detail our architecture which is based on a pyramidal convolutional neural network where each level refines an input parallax map estimate by using two customized cost volumes. We use these cost volumes to leverage the visual spatio-temporal constraints imposed by motion and make the network robust for varied scenes. We benchmarked our approach both in test and generalization modes on public datasets featuring synthetic camera trajectories recorded in a wide variety of outdoor scenes. Results show that our network outperforms the state of the art on these datasets, while also performing well on a standard depth estimation benchmark.

Subjects

Motion Perception; Neural Networks, Computer; Motion

3.

A bio-inspired bistable recurrent cell allows for long-lasting memory.

Vecoven, Nicolas; Ernst, Damien; Drion, Guillaume.

PLoS One ; 16(6): e0252676, 2021.

Article in English | MEDLINE | ID: mdl-34101750

ABSTRACT

Recurrent neural networks (RNNs) provide state-of-the-art performance in a wide variety of tasks that require memory. This performance can often be achieved thanks to gated recurrent cells such as gated recurrent units (GRU) and long short-term memory (LSTM). Standard gated cells share a layer internal state to store information at the network level, and long-term memory is shaped by network-wide recurrent connection weights. Biological neurons, on the other hand, are capable of holding information at the cellular level for an arbitrarily long amount of time through a process called bistability. Through bistability, cells can stabilize to different stable states depending on their own past state and inputs, which permits the durable storing of past information in the neuron state. In this work, we take inspiration from biological neuron bistability to embed RNNs with long-lasting memory at the cellular level. This leads to the introduction of a new bistable biologically-inspired recurrent cell that is shown to strongly improve RNN performance on time-series which require very long memory, despite using only cellular connections (all recurrent connections are from neurons to themselves, i.e. a neuron state is not influenced by the state of other neurons). Furthermore, equipping this cell with recurrent neuromodulation allows it to be linked to standard GRU cells, taking a step towards the biological plausibility of GRU. With this link, this work paves the way for studying more complex and biologically plausible neuromodulation schemes as gating mechanisms in RNNs.

Subjects

Algorithms; Memory, Long-Term/physiology; Models, Neurological; Neurons/physiology; Synapses/physiology; Animals; Feedback, Physiological/physiology; Humans; Neural Networks, Computer; Neurons/cytology
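
The notion of bistability used in the abstract of record 3 can be illustrated with a minimal sketch. It is not the recurrent cell introduced in that paper: it assumes a single self-connected tanh unit with zero input, and the names `settle` and `gain` are hypothetical. For a self-connection gain below 1 the unit has a single stable state at 0, while for a gain above 1 it settles into one of two stable states depending on its own past state.

```python
import math

def settle(h0, gain, steps=200):
    """Iterate h <- tanh(gain * h) with zero input and return the final state."""
    h = h0
    for _ in range(steps):
        h = math.tanh(gain * h)
    return h

# gain < 1: both starting points decay to the single equilibrium at 0 (monostable).
mono_pos = settle(0.5, gain=0.5)
mono_neg = settle(-0.5, gain=0.5)

# gain > 1: the sign of the starting state selects one of two stable equilibria.
bi_pos = settle(0.5, gain=2.0)
bi_neg = settle(-0.5, gain=2.0)

print(mono_pos, mono_neg, bi_pos, bi_neg)
```

In the bistable case the unit keeps the sign of its initial state indefinitely, which is the sense in which a neuron can store information at the cellular level.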

4.

The impact of different COVID-19 containment measures on electricity consumption in Europe.

Bahmanyar, Alireza; Estebsari, Abouzar; Ernst, Damien.

Energy Res Soc Sci ; 68: 101683, 2020 Oct.

Article in English | MEDLINE | ID: mdl-32839702

ABSTRACT

As of March 13, 2020, the director general of the World Health Organization (WHO) considered Europe as the centre of the global COVID-19 outbreak. All countries within Europe had a confirmed case of COVID-19 by March 17. In response to the pandemic, different European countries took different approaches. This paper compares the impact of different containment measures taken by European countries in response to COVID-19 on their electricity consumption profiles. The comparisons are made for Spain, Italy, Belgium and the UK as countries with severe restrictions, and for the Netherlands and Sweden as countries with less restrictive measures. The results show that the consumption profiles reflect the difference in people's activities in different countries using various measures.

5.

Introducing neuromodulation in deep neural networks to learn adaptive behaviours.

Vecoven, Nicolas; Ernst, Damien; Wehenkel, Antoine; Drion, Guillaume.

PLoS One ; 15(1): e0227922, 2020.

Article in English | MEDLINE | ID: mdl-31986189

ABSTRACT

Animals excel at adapting their intentions, attention, and actions to the environment, making them remarkably efficient at interacting with a rich, unpredictable and ever-changing external world, a property that intelligent machines currently lack. Such an adaptation property relies heavily on cellular neuromodulation, the biological mechanism that dynamically controls intrinsic properties of neurons and their response to external stimuli in a context-dependent manner. In this paper, we take inspiration from cellular neuromodulation to construct a new deep neural network architecture that is specifically designed to learn adaptive behaviours. The network adaptation capabilities are tested on navigation benchmarks in a meta-reinforcement learning context and compared with state-of-the-art approaches. Results show that neuromodulation is capable of adapting an agent to different tasks and that neuromodulation-based approaches provide a promising way of improving adaptation of artificial systems.

Subjects

Adaptation, Psychological/physiology; Models, Neurological; Neurons/physiology; Algorithms; Animals; Artificial Intelligence; Attention/physiology; Neural Networks, Computer

6.

Benchmarking for Bayesian Reinforcement Learning.

Castronovo, Michael; Ernst, Damien; Couëtoux, Adrien; Fonteneau, Raphael.

PLoS One ; 11(6): e0157088, 2016.

Article in English | MEDLINE | ID: mdl-27304891

ABSTRACT

In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the collected rewards while interacting with their environment, using some prior knowledge that is accessed beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this problem, and provides a new BRL comparison methodology along with the corresponding open source library. In this methodology, a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from some probability distributions is defined. In order to enable the comparison of non-anytime algorithms, our methodology also includes a detailed analysis of the computation time requirement of each algorithm. Our library is released with all source code and documentation: it includes three test problems, each of which has two different prior distributions, and seven state-of-the-art RL algorithms. Finally, our library is illustrated by comparing all the available algorithms and the results are discussed.

Subjects

Algorithms; Bayes Theorem; Benchmarking/methods; Learning/physiology; Reinforcement, Psychology; Animals; Choice Behavior; Computational Biology/methods; Computer Simulation; Decision Making; Humans; Markov Chains; Reproducibility of Results; Reward

7.

Mathematical Modeling of HIV Dynamics After Antiretroviral Therapy Initiation: A Review.

Rivadeneira, Pablo S; Moog, Claude H; Stan, Guy-Bart; Brunet, Cecile; Raffi, François; Ferré, Virginie; Costanza, Vicente; Mhawej, Marie J; Biafore, Federico; Ouattara, Djomangan A; Ernst, Damien; Fonteneau, Raphael; Xia, Xiaohua.

Biores Open Access ; 3(5): 233-41, 2014 Oct 01.

Article in English | MEDLINE | ID: mdl-25371860

ABSTRACT

This review shows the potential ground-breaking impact that mathematical tools may have in the analysis and the understanding of HIV dynamics. In the first part, early diagnosis of immunological failure is inferred from the estimation of certain parameters of a mathematical model of the HIV infection dynamics. This method is supported by clinical research results from an original clinical trial: data collected just 1 month after therapy initiation are used to carry out the model identification. The diagnosis is shown to be consistent with results from monitoring of the patients after 6 months. In the second part of this review, prospective research results are given for the design of individual anti-HIV treatments optimizing the recovery of the immune system and minimizing side effects. In this respect, two methods are discussed. The first one combines HIV population dynamics with pharmacokinetics and pharmacodynamics models to generate drug treatments using impulsive control systems. The second one is based on optimal control theory and uses a recently published differential equation to model the side effects produced by highly active antiretroviral therapies. The main advantage of these revisited methods is that the drug treatment is computed directly in amounts of drugs, which is easier to interpret by physicians and patients.

8.

Mathematical modeling of HIV dynamics after antiretroviral therapy initiation: a clinical research study.

Rivadeneira, Pablo S; Moog, Claude H; Stan, Guy-Bart; Costanza, Vicente; Brunet, Cécile; Raffi, Francois; Ferré, Virginie; Mhawej, Marie-José; Biafore, Federico; Ouattara, Djomangan A; Ernst, Damien; Fonteneau, Raphael; Xia, Xiaohua.

AIDS Res Hum Retroviruses ; 30(9): 831-4, 2014 Sep.

Article in English | MEDLINE | ID: mdl-25055189

Subjects

Anti-HIV Agents/therapeutic use; HIV Infections/drug therapy; HIV/physiology; Models, Theoretical; HIV Infections/virology; Humans

9.

Batch Mode Reinforcement Learning based on the Synthesis of Artificial Trajectories.

Fonteneau, Raphael; Murphy, Susan A; Wehenkel, Louis; Ernst, Damien.

Ann Oper Res ; 208(1): 383-416, 2013 Sep 01.

Article in English | MEDLINE | ID: mdl-24049244

ABSTRACT

In this paper, we consider the batch mode reinforcement learning setting, where the central problem is to learn from a sample of trajectories a policy that satisfies or optimizes a performance criterion. We focus on the continuous state space case for which usual resolution schemes rely on function approximators either to represent the underlying control problem or to represent its value function. As an alternative to the use of function approximators, we rely on the synthesis of "artificial trajectories" from the given sample of trajectories, and show that this idea opens new avenues for designing and analyzing algorithms for batch mode reinforcement learning.

10.

Cross-entropy optimization of control policies with adaptive basis functions.

Busoniu, Lucian; Ernst, Damien; De Schutter, Bart; Babuska, Robert.

IEEE Trans Syst Man Cybern B Cybern ; 41(1): 196-209, 2011 Feb.

Article in English | MEDLINE | ID: mdl-20570774

ABSTRACT

This paper introduces an algorithm for direct search of control policies in continuous-state discrete-action Markov decision processes. The algorithm looks for the best closed-loop policy that can be represented using a given number of basis functions (BFs), where a discrete action is assigned to each BF. The type of the BFs and their number are specified in advance and determine the complexity of the representation. Considerable flexibility is achieved by optimizing the locations and shapes of the BFs, together with the action assignments. The optimization is carried out with the cross-entropy method and evaluates the policies by their empirical return from a representative set of initial states. The return for each representative state is estimated using Monte Carlo simulations. The resulting algorithm for cross-entropy policy search with adaptive BFs is extensively evaluated in problems with two to six state variables, for which it reliably obtains good policies with only a small number of BFs. In these experiments, cross-entropy policy search requires vastly fewer BFs than value-function techniques with equidistant BFs, and outperforms policy search with a competing optimization algorithm called DIRECT.
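
The search loop described in the abstract of record 10 can be sketched on a toy problem. This is an illustration under stated assumptions, not the algorithm from the paper: a one-parameter threshold policy stands in for the basis-function parameterisation, the dynamics, reward, initial states and all names (`policy`, `empirical_return`, `cross_entropy_search`) are hypothetical, and because the toy dynamics are deterministic, a single rollout per initial state replaces the Monte Carlo estimate.

```python
import random
import statistics

ACTIONS = (-1.0, 1.0)                      # discrete action set (assumed)
INITIAL_STATES = (-1.0, -0.5, 0.5, 1.0)    # representative initial states (assumed)

def policy(x, theta):
    """Threshold policy: push left when the state is above theta, right otherwise."""
    return ACTIONS[0] if x > theta else ACTIONS[1]

def empirical_return(theta, horizon=20, step=0.1):
    """Average return of the policy over the representative initial states.

    Toy deterministic dynamics x <- x + step * a with reward -|x|,
    so good policies keep the state close to 0.
    """
    total = 0.0
    for x in INITIAL_STATES:
        for _ in range(horizon):
            x += step * policy(x, theta)
            total -= abs(x)
    return total / len(INITIAL_STATES)

def cross_entropy_search(n_iter=20, n_samples=50, n_elite=10, seed=0):
    """Cross-entropy method over one Gaussian-distributed policy parameter."""
    rng = random.Random(seed)
    mean, std = 0.5, 1.0
    for _ in range(n_iter):
        # Sample candidate policies, rank them by empirical return,
        # and refit the sampling distribution to the best ones.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        samples.sort(key=empirical_return, reverse=True)
        elite = samples[:n_elite]
        mean = sum(elite) / len(elite)
        std = statistics.pstdev(elite)
    return mean

best_theta = cross_entropy_search()
print(best_theta, empirical_return(best_theta))
```

The sampling distribution concentrates on thresholds close to 0, where the state is held near the origin from every initial state. In a stochastic problem, `empirical_return` would average several simulated rollouts per initial state.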

11.

Reinforcement learning versus model predictive control: a comparison on a power system problem.

Ernst, Damien; Glavic, Mevludin; Capitanescu, Florin; Wehenkel, Louis.

IEEE Trans Syst Man Cybern B Cybern ; 39(2): 517-29, 2009 Apr.

Article in English | MEDLINE | ID: mdl-19095542

ABSTRACT

This paper compares reinforcement learning (RL) with model predictive control (MPC) in a unified framework and reports experimental results of their application to the synthesis of a controller for a nonlinear and deterministic electrical power oscillations damping problem. Both families of methods are based on the formulation of the control problem as a discrete-time optimal control problem. The considered MPC approach exploits an analytical model of the system dynamics and cost function and computes open-loop policies by applying an interior-point solver to a minimization problem in which the system dynamics are represented by equality constraints. The considered RL approach infers in a model-free way closed-loop policies from a set of system trajectories and instantaneous cost values by solving a sequence of batch-mode supervised learning problems. The results obtained provide insight into the pros and cons of the two approaches and show that RL may certainly be competitive with MPC even in contexts where a good deterministic system model is available.
