Results 1 - 19 of 19
1.
PLoS Comput Biol ; 12(6): e1005003, 2016 06.
Article in English | MEDLINE | ID: mdl-27341100

ABSTRACT

Animals learn to make predictions, such as associating the sound of a bell with upcoming feeding or predicting a movement that a motor command is eliciting. How predictions are realized on the neuronal level and what plasticity rule underlies their learning is not well understood. Here we propose a biologically plausible synaptic plasticity rule for learning predictions at the single-neuron level on a timescale of seconds. The learning rule allows a spiking two-compartment neuron to match its current firing rate to its own expected future discounted firing rate. For instance, if an originally neutral event is repeatedly followed by an event that elevates the firing rate of a neuron, the originally neutral event will eventually also elevate the neuron's firing rate. The plasticity rule is a form of spike-timing-dependent plasticity in which a presynaptic spike followed by a postsynaptic spike leads to potentiation. Even if the plasticity window has a width of 20 milliseconds, associations on the timescale of seconds can be learned. We illustrate prospective coding with three examples: learning to predict a time-varying input, learning to predict the next stimulus in a delayed paired-associate task, and learning with a recurrent network to reproduce a temporally compressed version of a sequence. We discuss the potential role of the learning mechanism in classical trace conditioning. In the special case that the signal to be predicted encodes reward, the neuron learns to predict the discounted future reward and learning is closely related to the temporal difference learning algorithm TD(λ).
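To make the prediction objective concrete, the following minimal TD(lambda) sketch (Python) illustrates the problem described above: an initially neutral cue precedes an event that elevates activity, and learning makes the prediction at cue time approximate the discounted future drive. The one-hot time features, constants and trial timing are illustrative assumptions, not the paper's two-compartment spiking model.

import numpy as np

T = 50                      # time steps per trial (e.g. 100 ms each, so 5 s trials)
gamma, lam, eta = 0.95, 0.9, 0.1
drive = np.zeros(T)
drive[30:40] = 1.0          # the later event that elevates the firing rate

def features(t):
    """One indicator per time step since trial onset (complete serial compound)."""
    x = np.zeros(T)
    x[t] = 1.0
    return x

w = np.zeros(T)             # w[t] comes to approximate the discounted future drive at t
for trial in range(300):
    e = np.zeros(T)                                  # eligibility trace
    for t in range(T - 1):
        v_t, v_next = w @ features(t), w @ features(t + 1)
        delta = drive[t] + gamma * v_next - v_t      # TD error
        e = gamma * lam * e + features(t)
        w += eta * delta * e

# The prediction at an early "cue" time is now positive and close to the true
# discounted sum of the future drive.
print(round(w[20], 3), "~", round(sum(gamma ** (s - 20) * drive[s] for s in range(20, T)), 3))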


Subjects
Action Potentials/physiology; Computational Biology/methods; Models, Neurological; Neurons/physiology; Animals; Dendrites/physiology; Macaca; Neuronal Plasticity/physiology
2.
PLoS Comput Biol ; 12(2): e1004638, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26841235

ABSTRACT

In the last decade, dendrites of cortical neurons have been shown to nonlinearly combine synaptic inputs by evoking local dendritic spikes. It has been suggested that these nonlinearities raise the computational power of a single neuron, making it comparable to a 2-layer network of point neurons. But how these nonlinearities can be incorporated into synaptic plasticity to optimally support learning remains unclear. We present a theoretically derived synaptic plasticity rule for supervised and reinforcement learning that depends on the timing of the presynaptic, the dendritic and the postsynaptic spikes. For supervised learning, the rule can be seen as a biological version of the classical error-backpropagation algorithm applied to the dendritic case. When modulated by a delayed reward signal, the same plasticity is shown to maximize the expected reward in reinforcement learning for various coding scenarios. Our framework makes specific experimental predictions and highlights the unique advantage of active dendrites for implementing powerful synaptic plasticity rules that have access to downstream information via backpropagation of action potentials.
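As a rough illustration of the supervised case, the sketch below treats each dendritic branch as a sigmoidal subunit of a two-layer model neuron and derives the weight update by the chain rule, i.e. error backpropagation through the branches. It is a rate-based stand-in under assumed parameters, not the paper's spike-timing-based rule; the teacher signal and network sizes are made up.

import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

rng = np.random.default_rng(1)
n_in, n_branch, eta = 10, 4, 0.1
W_branch = rng.normal(0, 0.3, (n_branch, n_in))   # synapses onto dendritic branches
w_soma = rng.normal(0, 0.3, n_branch)             # branch-to-soma couplings

for step in range(2000):
    x = rng.integers(0, 2, n_in).astype(float)    # presynaptic activity pattern
    y_target = float(x[:5].sum() > x[5:].sum())   # arbitrary teacher signal
    d = sigmoid(W_branch @ x)                     # dendritic branch activations
    y = sigmoid(w_soma @ d)                       # somatic output rate
    err = y_target - y                            # somatic error
    # error reaching each branch, backpropagated through the soma (chain rule)
    err_branch = err * y * (1 - y) * w_soma * d * (1 - d)
    w_soma += eta * err * y * (1 - y) * d
    W_branch += eta * np.outer(err_branch, x)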


Subjects
Dendrites/physiology; Models, Neurological; Neuronal Plasticity/physiology; Action Potentials/physiology; Algorithms; Computational Biology
3.
PLoS One ; 10(12): e0144636, 2015.
Article in English | MEDLINE | ID: mdl-26670700

ABSTRACT

Predictive coding has been previously introduced as a hierarchical coding framework for the visual system. At each level, activity predicted by the higher level is dynamically subtracted from the input, while the difference in activity continuously propagates further. Here we introduce modular predictive coding as a feedforward hierarchy of prediction modules without back-projections from higher to lower levels. Within each level, recurrent dynamics optimally segregates the input into novelty and familiarity components. Although the anatomical feedforward connectivity passes through the novelty-representing neurons, it is nevertheless the familiarity information which is propagated to higher levels. This modularity results in a twofold advantage compared to the original predictive coding scheme: the familiarity-novelty representation forms quickly, and at each level the full representational power is exploited for an optimized readout. As we show, natural images are successfully compressed and can be reconstructed by the familiarity neurons at each level. Missing information on different spatial scales is identified by novelty neurons and complements the familiarity representation. Furthermore, by virtue of the recurrent connectivity within each level, non-classical receptive field properties still emerge. Hence, modular predictive coding is a biologically realistic metaphor for the visual system that dynamically extracts novelty at various scales while propagating the familiarity information.
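The familiarity/novelty split within a single module can be illustrated with a linear toy example: a low-dimensional subspace (here obtained by PCA, standing in for the recurrently optimized representation) reconstructs the "familiar" part of the input, and the residual is what the novelty neurons would flag. The data and dimensions are invented for the sketch and do not reproduce the paper's dynamics or learning rule.

import numpy as np

rng = np.random.default_rng(2)
# Toy "image patches" with low-dimensional structure plus noise.
latent = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 64))
patches = latent + 0.1 * rng.normal(size=(500, 64))
patches -= patches.mean(axis=0)

# Familiarity subspace of one module: here simply the leading principal components.
U, S, Vt = np.linalg.svd(patches, full_matrices=False)
basis = Vt[:16]                                  # 16 familiarity neurons at this level

def split(x):
    familiarity = basis.T @ (basis @ x)          # compressed part, propagated onward
    novelty = x - familiarity                    # residual flagged by novelty neurons
    return familiarity, novelty

_, nov_known = split(patches[0])                 # a familiar input: small residual
_, nov_new = split(rng.normal(size=64))          # an unstructured input: large residual
print(np.linalg.norm(nov_known), "<", np.linalg.norm(nov_new))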


Subjects
Recognition, Psychology; Visual Pathways/physiology; Image Processing, Computer-Assisted; Learning; Nonlinear Dynamics
4.
PLoS One ; 10(7): e0127269, 2015.
Article in English | MEDLINE | ID: mdl-26158660

ABSTRACT

Spatial navigation and planning are assumed to involve a cognitive map for evaluating trajectories towards a goal. How such a map is realized in neuronal terms, however, remains elusive. Here we describe a simple and noise-robust neuronal implementation of a path-finding algorithm in complex environments. We consider a neuronal map of the environment that supports a traveling wave spreading out from the goal location, opposite to the direction of the physical movement. At each position of the map, the smallest firing phase between adjacent neurons indicates the shortest direction towards the goal. In contrast to diffusion or single wave fronts, local phase differences build up in time at arbitrary distances from the goal, providing minimal and robust directional information throughout the map. The time needed to reach the steady state represents an estimate of an agent's waiting time before it heads off to the goal. Given typical waiting times, we estimate the minimal number of neurons involved in the cognitive map. In the context of the planning model, forward and backward spread of neuronal activity, oscillatory waves, and phase precession get a functional interpretation, allowing for speculations about the biological counterpart.
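The following Python sketch abstracts the mechanism to a discrete wavefront (breadth-first expansion) spreading from the goal through a grid map; each position then simply moves to the neighbour with the smallest arrival time, which plays the role of the earliest firing phase. The maze, grid size and step rule are made-up illustrations, not the neuronal implementation itself.

from collections import deque

maze = [
    "..........",
    ".####.###.",
    ".#....#...",
    ".#.####.#.",
    "...#..#.#.",
    ".#.#.##.#.",
    ".#......#G",
]
H, W = len(maze), len(maze[0])
goal = next((r, c) for r in range(H) for c in range(W) if maze[r][c] == "G")

# Breadth-first wave from the goal: arrival[(r, c)] plays the role of the firing phase.
arrival = {goal: 0}
queue = deque([goal])
while queue:
    r, c = queue.popleft()
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < H and 0 <= nc < W and maze[nr][nc] != "#" and (nr, nc) not in arrival:
            arrival[(nr, nc)] = arrival[(r, c)] + 1
            queue.append((nr, nc))

def step_towards_goal(pos):
    """From any position, move to the neighbour with the smallest arrival time."""
    r, c = pos
    neighbours = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    return min((n for n in neighbours if n in arrival), key=arrival.get)

pos = (0, 0)
path = [pos]
while pos != goal:
    pos = step_towards_goal(pos)
    path.append(pos)
print("path length:", len(path) - 1)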


Subjects
Models, Neurological; Neurons/physiology; Algorithms
5.
Article in English | MEDLINE | ID: mdl-24999328

ABSTRACT

The recurrent interaction among orientation-selective neurons in the primary visual cortex (V1) is suited to enhance contours in a noisy visual scene. Motion is known to have a strong pop-out effect in perceiving contours, but how motion-sensitive neurons in V1 support contour detection remains largely elusive. Here we suggest how the various types of motion-sensitive neurons observed in V1 should be wired together in a micro-circuitry to optimally extract contours in the visual scene. Motion-sensitive neurons can be selective for the direction of motion occurring at some spot or respond equally to all directions (pandirectional). We show that, in the light of figure-ground segregation, direction-selective motion neurons should additively modulate the corresponding orientation-selective neurons with preferred orientation orthogonal to the motion direction. In turn, to maximally enhance contours, pandirectional motion neurons should multiplicatively modulate all orientation-selective neurons with co-localized receptive fields. This multiplicative modulation amplifies the local V1 circuitry among co-aligned orientation-selective neurons for detecting elongated contours. We suggest that the additive modulation by direction-specific motion neurons is achieved through synaptic projections to the somatic region, and the multiplicative modulation by pandirectional motion neurons through projections to the apical region of orientation-specific pyramidal neurons. For the purpose of contour detection, the V1-intrinsic integration of motion information is advantageous over a downstream integration as it exploits the recurrent V1 circuitry designed for that task.
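A toy numerical sketch of the proposed modulation scheme: direction-selective motion input adds drive to the orientation channel orthogonal to the motion direction, while pandirectional motion input multiplicatively scales all co-localized orientation responses. Tuning curves, gains and stimulus values are illustrative assumptions, not fitted V1 parameters.

import numpy as np

orientations = np.deg2rad(np.arange(0, 180, 15))   # preferred orientations at one location

def orientation_drive(theta, kappa=4.0):
    """Feedforward drive of orientation-selective neurons (von Mises-like tuning)."""
    return np.exp(kappa * (np.cos(2 * (orientations - theta)) - 1))

stim_theta = np.deg2rad(30)     # local contour orientation
motion_dir = np.deg2rad(120)    # local motion direction (orthogonal to the contour)

base = orientation_drive(stim_theta)

# Additive modulation: direction-selective neurons boost the orientation orthogonal to motion.
additive = 0.5 * orientation_drive(motion_dir - np.pi / 2)

# Multiplicative modulation: pandirectional motion scales every co-localized response.
pandirectional_gain = 1.8

response = pandirectional_gain * (base + additive)
print("preferred orientation after modulation (deg):",
      round(float(np.rad2deg(orientations[np.argmax(response)])), 1))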

6.
PLoS Comput Biol ; 10(6): e1003640, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24901935

ABSTRACT

Recent experiments revealed that the fruit fly Drosophila melanogaster has a dedicated mechanism for forgetting: blocking the G-protein Rac leads to slower and activating Rac to faster forgetting. This active form of forgetting lacks a satisfactory functional explanation. We investigated optimal decision making for an agent adapting to a stochastic environment where a stimulus may switch between being indicative of reward or punishment. Like Drosophila, an optimal agent shows forgetting with a rate that is linked to the time scale of changes in the environment. Moreover, to reduce the odds of missing future reward, an optimal agent may trade the risk of immediate pain for information gain and thus forget faster after aversive conditioning. A simple neuronal network reproduces these features. Our theory shows that forgetting in Drosophila appears as an optimal adaptive behavior in a changing environment. This is in line with the view that forgetting is adaptive rather than a consequence of limitations of the memory system.
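A small Bayesian sketch of why an optimal forgetting rate should track the volatility of the environment: the agent's belief that a stimulus currently predicts reward relaxes towards the prior between observations, at a speed set by the assumed hazard (switching) rate. The hazard rates, observation reliability and trial counts below are illustrative choices, not the paper's model of the fly.

def relax(belief, hazard, prior=0.5):
    """One unobserved time step: the contingency may have switched with probability `hazard`."""
    return (1 - hazard) * belief + hazard * prior

def update(belief, rewarded, p_correct=0.9):
    """Bayes update after observing reward (True) or punishment (False) for the stimulus."""
    like_pos = p_correct if rewarded else 1 - p_correct
    like_neg = 1 - p_correct if rewarded else p_correct
    return like_pos * belief / (like_pos * belief + like_neg * (1 - belief))

belief = 0.5
for _ in range(3):                 # three rewarded pairings
    belief = update(belief, True)
print("after conditioning:", round(belief, 3))

for volatility in (0.02, 0.2):     # stable vs. rapidly changing environment
    b = belief
    for _ in range(10):            # ten trials without seeing the stimulus
        b = relax(b, volatility)
    print(f"hazard {volatility}: belief decays to {b:.3f}")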


Subjects
Drosophila melanogaster/physiology; Memory/physiology; Adaptation, Physiological; Adaptation, Psychological; Animals; Behavior, Animal/physiology; Computational Biology; Conditioning, Psychological; Decision Making/physiology; Environment; Learning/physiology; Models, Biological; Models, Psychological; Odorants; Reward; Stochastic Processes
7.
Int J Neural Syst ; 24(5): 1450002, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24875790

ABSTRACT

Population coding is widely regarded as a key mechanism for achieving reliable behavioral decisions. We previously introduced reinforcement learning for population-based decision making by spiking neurons. Here we generalize population reinforcement learning to spike-based plasticity rules that take account of the postsynaptic neural code. We consider spike/no-spike, spike count and spike latency codes. The multi-valued and continuous-valued features in the postsynaptic code allow for a generalization of binary decision making to multi-valued decision making and continuous-valued action selection. We show that code-specific learning rules speed up learning both for discrete classification and for continuous regression tasks. The suggested learning rules also speed up with increasing population size, in contrast to standard reinforcement learning rules. Continuous action selection is further shown to explain realistic learning speeds in the Morris water maze. Finally, we introduce the concept of action perturbation, as opposed to classical weight or node perturbation, as an exploration mechanism underlying reinforcement learning. Exploration in the action space greatly increases the speed of learning as compared to exploration in the neuron or weight space.


Subjects
Action Potentials/physiology; Learning/physiology; Models, Neurological; Neurons/physiology; Animals; Decision Making/physiology; Humans; Neural Networks, Computer
8.
Neuron ; 81(3): 521-8, 2014 Feb 05.
Article in English | MEDLINE | ID: mdl-24507189

ABSTRACT

Recent modeling of spike-timing-dependent plasticity indicates that, besides pre- and postsynaptic firing times, plasticity involves a local dendritic potential as a third factor. We present a simple compartmental neuron model together with a non-Hebbian, biologically plausible learning rule for dendritic synapses where plasticity is modulated by these three factors. In functional terms, the rule seeks to minimize discrepancies between somatic firings and a local dendritic potential. Such prediction errors can arise in our model from stochastic fluctuations as well as from synaptic input that directly targets the soma. Depending on the nature of this direct input, our plasticity rule subserves supervised or unsupervised learning. When a reward signal modulates the learning rate, reinforcement learning results. Hence a single plasticity rule supports diverse learning paradigms.
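A rate-based caricature of the stated objective, under assumed parameters: dendritic synapses change so that the dendritic potential predicts the somatic firing rate, which here is partly imposed by a direct "teacher" input onto the soma. This is only a sketch of the minimization idea, not the compartmental spiking model of the paper.

import numpy as np

def phi(u):                        # somatic rate as a function of potential
    return 1.0 / (1.0 + np.exp(-u))

rng = np.random.default_rng(3)
n_in, eta = 20, 0.05
w = np.zeros(n_in)                 # dendritic synaptic weights

for step in range(5000):
    x = rng.integers(0, 2, n_in).astype(float)            # presynaptic rates
    v_dend = w @ x                                         # local dendritic potential
    u_direct = 2.0 * (x[:10].sum() - x[10:].sum()) / 10.0  # direct (teacher) input to the soma
    u_soma = 0.5 * v_dend + 0.5 * u_direct                 # soma mixes dendrite and direct input
    rate_soma = phi(u_soma)
    # three-factor update: (somatic firing - dendritic prediction) x presynaptic activity
    w += eta * (rate_soma - phi(v_dend)) * x

# After learning, phi(v_dend) tracks the somatic rate: the dendrite predicts the soma.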


Subjects
Action Potentials/physiology; Dendrites/physiology; Learning; Models, Neurological; Neuronal Plasticity/physiology; Neurons/cytology; Animals; Computer Simulation; Neurons/physiology; Synapses/physiology; Time Factors
9.
J Math Neurosci ; 2(1): 2, 2012 Feb 15.
Article in English | MEDLINE | ID: mdl-22657827

ABSTRACT

We study synaptic plasticity in a complex neuronal cell model where NMDA-spikes can arise in certain dendritic zones. In the context of reinforcement learning, two kinds of plasticity rules are derived, zone reinforcement (ZR) and cell reinforcement (CR), which both optimize the expected reward by stochastic gradient ascent. For ZR, the synaptic plasticity response to the external reward signal is modulated exclusively by quantities which are local to the NMDA-spike initiation zone in which the synapse is situated. CR, in addition, uses nonlocal feedback from the soma of the cell, provided by mechanisms such as the backpropagating action potential. Simulation results show that, compared to ZR, the use of nonlocal feedback in CR can drastically enhance learning performance. We suggest that the availability of nonlocal feedback for learning is a key advantage of complex neurons over networks of simple point neurons, which have previously been found to be largely equivalent with regard to computational capability.

10.
PLoS Comput Biol ; 7(6): e1002092, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21738460

ABSTRACT

In learning from trial and error, animals need to relate behavioral decisions to environmental reinforcement even though it may be difficult to assign credit to a particular decision when outcomes are uncertain or subject to delays. When considering the biophysical basis of learning, the credit-assignment problem is compounded because the behavioral decisions themselves result from the spatio-temporal aggregation of many synaptic releases. We present a model of plasticity induction for reinforcement learning in a population of leaky integrate-and-fire neurons which is based on a cascade of synaptic memory traces. Each synaptic cascade correlates presynaptic input first with postsynaptic events, next with the behavioral decisions and finally with external reinforcement. For operant conditioning, learning succeeds even when reinforcement is delivered with a delay so large that temporal contiguity between decision and pertinent reward is lost due to intervening decisions which are themselves subject to delayed reinforcement. This shows that the model provides a viable mechanism for temporal credit assignment. Further, learning speeds up with increasing population size, so the plasticity cascade simultaneously addresses the spatial problem of assigning credit to synapses in different population neurons. Simulations on other tasks, such as sequential decision making, serve to contrast the performance of the proposed scheme to that of temporal difference-based learning. We argue that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcement learning in the brain.
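The role of the trace cascade in bridging a reward delay can be caricatured with a single decaying eligibility trace feeding a REINFORCE-style update (the paper chains several such traces, from pre/post coincidences through the decision to the reward). The task, delay and constants below are invented for the sketch.

import numpy as np

rng = np.random.default_rng(4)
n_in = 8
w = np.zeros(n_in)
eta, trace_decay, delay = 0.2, 0.7, 2       # reward arrives `delay` trials late

x_A = (np.arange(n_in) < 4).astype(float)   # stimulus A -> correct action 1
x_B = 1.0 - x_A                             # stimulus B -> correct action 0

trace = np.zeros(n_in)
pending = []                                # rewards queued for delayed delivery

for trial in range(5000):
    x = x_A if rng.random() < 0.5 else x_B
    p = 1.0 / (1.0 + np.exp(-(w @ x)))      # probability of choosing action 1
    action = float(rng.random() < p)
    correct = (action == 1.0) == (x is x_A)

    trace = trace_decay * trace + (action - p) * x   # synaptic eligibility trace
    pending.append(1.0 if correct else -1.0)

    if len(pending) > delay:                # reinforcement for an earlier decision arrives
        reward = pending.pop(0)
        w += eta * reward * trace           # reward x eligibility update

# Despite the delay, the weights come to map each stimulus onto its rewarded action.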


Subjects
Learning/physiology; Models, Neurological; Neuronal Plasticity/physiology; Neurons/physiology; Synapses/physiology; Algorithms; Animals; Computational Biology; Computer Simulation; Decision Making/physiology; Dogs; Markov Chains; Memory; Reward; Signal Transduction; Time Factors
11.
Neural Comput ; 22(7): 1698-717, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20235820

ABSTRACT

We investigate a recently proposed model for decision learning in a population of spiking neurons where synaptic plasticity is modulated by a population signal in addition to reward feedback. For the basic model, binary population decision making based on spike/no-spike coding, a detailed computational analysis is given about how learning performance depends on population size and task complexity. Next, we extend the basic model to n-ary decision making and show that it can also be used in conjunction with other population codes such as rate or even latency coding.


Subjects
Action Potentials/physiology; Learning/physiology; Models, Neurological; Nerve Net/physiology; Neurons/physiology; Reward; Animals; Artificial Intelligence; Decision Making/physiology; Feedback, Physiological/physiology; Humans; Neural Networks, Computer; Reaction Time/physiology
12.
PLoS Comput Biol ; 5(12): e1000586, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19997492

ABSTRACT

Changes of synaptic connections between neurons are thought to be the physiological basis of learning. These changes can be gated by neuromodulators that encode the presence of reward. We study a family of reward-modulated synaptic learning rules for spiking neurons on a learning task in continuous space inspired by the Morris water maze. The synaptic update rule modifies the release probability of synaptic transmission and depends on the timing of presynaptic spike arrival, postsynaptic action potentials, as well as the membrane potential of the postsynaptic neuron. The family of learning rules includes an optimal rule derived from policy gradient methods as well as reward-modulated Hebbian learning. The synaptic update rule is implemented in a population of spiking neurons using a network architecture that combines feedforward input with lateral connections. Actions are represented by a population of hypothetical action cells with strong Mexican-hat connectivity and are read out at theta frequency. We show that in this architecture, a standard policy gradient rule fails to solve the Morris water maze task, whereas a variant with a Hebbian bias can learn the task within 20 trials, consistent with experiments. This result does not depend on implementation details such as the size of the neuronal populations. Our theoretical approach shows how learning new behaviors can be linked to reward-modulated plasticity at the level of single synapses and makes predictions about the voltage and spike-timing dependence of synaptic plasticity and the influence of neuromodulators such as dopamine. It is an important step towards connecting formal theories of reinforcement learning with neuronal and synaptic properties.


Subjects
Maze Learning/physiology; Models, Neurological; Neurons/physiology; Reward; Synaptic Potentials/physiology; Algorithms; Animals; Computational Biology/methods; Computer Simulation; Rats; Signal Transduction
13.
Neural Comput ; 21(2): 340-52, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19431262

ABSTRACT

We introduce a new supervised learning rule for the tempotron task: the binary classification of input spike trains by an integrate-and-fire neuron that encodes its decision by firing or not firing. The rule is based on the gradient of a cost function, is found to have enhanced performance, and does not rely on a specific reset mechanism in the integrate-and-fire neuron.
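For orientation, the sketch below sets up the tempotron task itself and applies a gradient-style correction at the time of the voltage maximum, in the spirit of the original tempotron; the specific cost function and reset-independence result of the paper are not reproduced here. Kernel shapes, thresholds and the random data set are assumptions of the toy example.

import numpy as np

rng = np.random.default_rng(5)
n_syn, T, dt = 30, 0.5, 0.001
t = np.arange(0, T, dt)
tau_m, tau_s, thresh, eta = 0.02, 0.005, 0.5, 0.02

def psp(t_spike):
    """Postsynaptic potential kernel evoked by an input spike at t_spike."""
    s = np.clip(t - t_spike, 0.0, None)
    k = np.exp(-s / tau_m) - np.exp(-s / tau_s)
    k[t < t_spike] = 0.0
    return k

# Random data set: one input spike per synapse and a random binary label per pattern.
patterns = [rng.uniform(0, T, n_syn) for _ in range(10)]
labels = rng.integers(0, 2, 10)

w = rng.normal(0, 0.1, n_syn)
for epoch in range(200):
    for spikes, label in zip(patterns, labels):
        kernels = np.array([psp(ts) for ts in spikes])   # (n_syn, time)
        v = w @ kernels                                  # subthreshold voltage trace
        t_max = int(np.argmax(v))
        fired = v[t_max] > thresh
        if fired != bool(label):                         # classification error
            sign = 1.0 if label else -1.0
            w += eta * sign * kernels[:, t_max]          # correct at the voltage maximum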


Subjects
Action Potentials/physiology; Learning/physiology; Models, Neurological; Neurons/physiology; Animals; Humans; Nerve Net/physiology; Time Factors
14.
Biol Cybern ; 100(4): 319-30, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19360435

ABSTRACT

Reinforcement learning in neural networks requires a mechanism for exploring new network states in response to a single, nonspecific reward signal. Existing models have introduced synaptic or neuronal noise to drive this exploration. However, those types of noise tend to almost average out, precluding or significantly hindering learning, when coding in neuronal populations or by mean firing rates is considered. Furthermore, careful tuning is required to find the elusive balance between the often conflicting demands of speed and reliability of learning. Here we show that there is in fact no need to rely on intrinsic noise. Instead, ongoing synaptic plasticity triggered by the naturally occurring online sampling of a stimulus out of an entire stimulus set produces enough fluctuations in the synaptic efficacies for successful learning. By combining stimulus sampling with reward attenuation, we demonstrate that a simple Hebbian-like learning rule yields performance very close to that of primates on visuomotor association tasks. In contrast, learning rules based on intrinsic noise (node and weight perturbation) are markedly slower. Furthermore, the performance advantage of our approach persists for more complex tasks and network architectures. We suggest that stimulus sampling and reward attenuation are two key components of a framework by which any single-cell supervised learning rule can be converted into a reinforcement learning rule for networks without requiring any intrinsic noise source.


Subjects
Brain/physiology; Learning/physiology; Neuronal Plasticity/physiology; Reinforcement, Psychology; Animals; Haplorhini
15.
Nat Neurosci ; 12(3): 250-2, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19219040

ABSTRACT

Population coding is widely regarded as an important mechanism for achieving reliable behavioral responses despite neuronal variability. However, standard reinforcement learning slows down with increasing population size, as the global reward signal becomes less and less related to the performance of any single neuron. We found that learning speeds up with increasing population size if, in addition to global reward, feedback about the population response modulates synaptic plasticity.


Subjects
Action Potentials/physiology; Learning/physiology; Models, Neurological; Neurons/physiology; Reinforcement, Psychology; Neuronal Plasticity/physiology
16.
BMC Bioinformatics ; 7: 129, 2006 Mar 13.
Article in English | MEDLINE | ID: mdl-16533403

ABSTRACT

BACKGROUND: Despite recent algorithmic and conceptual progress, the stoichiometric network analysis of large metabolic models remains a computationally challenging problem. RESULTS: SNA is an interactive, high-performance toolbox for analysing the possible steady-state behaviour of metabolic networks by computing the generating and elementary vectors of their flux and conversion cones. It also supports analysing the steady states by linear programming. The toolbox is implemented mainly in Mathematica and returns numerically exact results. It is available under an open source license from: http://bioinformatics.org/project/?group_id=546. CONCLUSION: Thanks to its performance and modular design, SNA is demonstrably useful in analysing genome-scale metabolic networks. Further, the integration into Mathematica provides a very flexible environment for the subsequent analysis and interpretation of the results.
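For readers unfamiliar with the underlying computation, the tiny numpy sketch below shows the steady-state condition such a toolbox works with: flux vectors v with N v = 0 for a stoichiometric matrix N (here a made-up three-reaction cycle). SNA itself computes the generating and elementary vectors of the flux and conversion cones in Mathematica; this sketch only illustrates the balance condition, not those algorithms.

import numpy as np

# Toy network: A -> B (v1), B -> C (v2), C -> A (v3); rows are metabolites A, B, C.
N = np.array([
    [-1,  0,  1],
    [ 1, -1,  0],
    [ 0,  1, -1],
], dtype=float)

# Nullspace of N = all internal flux distributions keeping every metabolite balanced.
_, s, Vt = np.linalg.svd(N)
nullspace = Vt[np.isclose(s, 0.0)]             # rows spanning the nullspace
print(nullspace)                               # proportional to (1, 1, 1): the cycle flux

v = np.array([2.0, 2.0, 2.0])                  # any nonnegative multiple is a steady state
assert np.allclose(N @ v, 0.0)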


Subjects
Algorithms; Cell Physiological Phenomena; Models, Biological; Proteome/metabolism; Signal Transduction/physiology; Software; Transcription Factors/metabolism; User-Computer Interface; Computer Simulation
17.
FEBS J ; 272(24): 6244-53, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16336262

ABSTRACT

A representative model of mitochondrial pyruvate metabolism was broken down into its extremal independent currents and compared with experimental data obtained from liver mitochondria incubated with pyruvate as a substrate but in the absence of added adenosine diphosphate. Assuming no regulation of enzymatic activities, the free-flow prediction for the output of the model shows large discrepancies with the experimental data. To study the objective of the incubated mitochondria, we calculate the conversion cone of the model, which describes the possible input/output behaviour of the network. We demonstrate that the experimental data are consistent with the model, since all measured data lie within this cone. Because they are close to the boundary of the cone, we deduce that pyruvate is converted very efficiently (93%) to produce the measured extramitochondrial metabolites. We find that the main function of the incubated mitochondria is the production of malate and citrate, supporting the anaplerotic pathways in the cytosol, notably gluconeogenesis and fatty acid synthesis. Finally, we show that the major flow through the enzymatic steps of mitochondrial pyruvate metabolism can be reliably predicted based on the stoichiometric model plus the measured extramitochondrial products. A major advantage of this method is that neither kinetic simulations nor radioactive tracers are needed.
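The cone-membership test described above can be phrased as a small linear-programming feasibility problem: a measured conversion vector is consistent with the network if it is a nonnegative combination of the cone's generating vectors. The generating vectors and the "measured" vector in this sketch are toy numbers, not the mitochondrial model or data from the paper.

import numpy as np
from scipy.optimize import linprog

# Columns: generating conversions of a toy cone (net pyruvate -> malate / citrate / CO2).
G = np.array([
    [-1.0, -1.0, -1.0],   # pyruvate consumed by every route
    [ 1.0,  0.0,  0.0],   # malate produced
    [ 0.0,  1.0,  0.0],   # citrate produced
    [ 0.5,  1.0,  3.0],   # CO2 produced
])
measured = np.array([-1.1, 0.6, 0.3, 1.2])     # hypothetical measured conversion

# Feasibility: are there nonnegative coefficients a with G a = measured?
res = linprog(c=np.zeros(G.shape[1]), A_eq=G, b_eq=measured,
              bounds=[(0, None)] * G.shape[1], method="highs")
print("inside the conversion cone:", res.status == 0)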


Subjects
Mitochondria, Liver/metabolism; Pyruvic Acid/metabolism; Animals; Citric Acid/metabolism; Computer Simulation; Fatty Acids/biosynthesis; Gluconeogenesis; Malates/metabolism; Models, Biological; Rats
18.
Phys Rev E Stat Nonlin Soft Matter Phys ; 72(2 Pt 2): 026117, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16196654

ABSTRACT

A training algorithm for multilayer perceptrons that relates to the technique of principal component analysis is discussed and studied in detail. The principal component analysis is performed with respect to a correlation matrix computed from the example inputs and their target outputs. Typical properties of the training procedure are investigated by means of a statistical physics analysis in models of learning regression and classification tasks. We demonstrate that the procedure requires far fewer examples for good generalization than traditional online training. For networks with a large number of hidden units we derive the training prescription which achieves, within our model, the optimal generalization behavior.
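One plausible reading of the "PCA on an input-target correlation matrix" ingredient is sketched below on a toy regression task: a correlation matrix built from target-weighted inputs is diagonalized, its leading components define the hidden-layer weights, and the output layer is then fitted by least squares. The exact construction in the paper may differ; everything here (task, sizes, the target-weighted matrix) is an assumption for illustration.

import numpy as np

rng = np.random.default_rng(6)
n, d, hidden = 2000, 20, 5
teacher = rng.normal(size=d)
teacher /= np.linalg.norm(teacher)
X = rng.normal(size=(n, d))
y = np.tanh(X @ teacher)                       # noiseless regression targets

# Correlation matrix of target-weighted inputs; its top eigenvector aligns with the teacher.
Z = X * y[:, None]
C = Z.T @ Z / n
eigval, eigvec = np.linalg.eigh(C)
W1 = eigvec[:, -hidden:].T                     # hidden weights from the top components

H = np.tanh(X @ W1.T)                          # hidden-layer activations
w2, *_ = np.linalg.lstsq(H, y, rcond=None)     # output layer by least squares
print("training RMSE:", float(np.sqrt(np.mean((H @ w2 - y) ** 2))))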

19.
Biophys J ; 89(6): 3837-45, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16183876

ABSTRACT

The analysis of metabolic networks has become a major topic in biotechnology in recent years. Applications range from the enhanced production of selected outputs to the prediction of genotype-phenotype relationships. The concepts used are based on the assumption of a pseudo steady state of the network, so that for each metabolite inputs and outputs are balanced. Stoichiometric network analysis expands the steady state into a combination of nonredundant subnetworks with positive coefficients called extremal currents. Based on the unidirectional representation of the system, these subnetworks form a convex cone in flux space. A modification of this approach allowing for reversible reactions led to the definition of elementary modes. Extreme pathways are obtained with the same method but splitting internal reactions into forward and backward rates. In this study, we explore the relationship between these concepts. Due to the combinatorial explosion of the number of elementary modes in large networks, we promote a further set of metabolic routes, which we call the minimal generating set. It is the smallest subset of elementary modes required to describe all steady states of the system. For large-scale networks, the size of this set is several orders of magnitude smaller than the number of elementary modes or extreme pathways.


Subjects
Cell Physiological Phenomena; Energy Metabolism/physiology; Gene Expression Regulation/physiology; Models, Biological; Models, Chemical; Proteome/metabolism; Signal Transduction/physiology; Animals; Computer Simulation; Humans