Results 1 - 20 of 30
1.
Nature ; 588(7839): 604-609, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33361790

ABSTRACT

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess [1] and Go [2], where a perfect simulator is available. However, in real-world problems, the dynamics governing the environment are often complex and unknown. Here we present the MuZero algorithm, which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. The MuZero algorithm learns an iterable model that produces predictions relevant to planning: the action-selection policy, the value function and the reward. When evaluated on 57 different Atari games [3], the canonical video game environment for testing artificial intelligence techniques, in which model-based planning approaches have historically struggled [4], the MuZero algorithm achieved state-of-the-art performance. When evaluated on Go, chess and shogi (canonical environments for high-performance planning), the MuZero algorithm matched, without any knowledge of the game dynamics, the superhuman performance of the AlphaZero algorithm [5], which was supplied with the rules of the game.
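The planning interface the abstract describes can be sketched in a few lines. This is a toy illustration, not DeepMind's implementation: the three learned functions are stand-in linear maps, and all names here (`represent`, `dynamics`, `value`, `rollout_return`) are our own labels for the roles the abstract assigns to the learned model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for MuZero's three learned functions (the real versions
# are deep networks trained end-to-end; these random linear maps only
# illustrate the interface the abstract describes).
W_h = rng.normal(size=(4, 8))   # representation: observation -> hidden state
W_g = rng.normal(size=(4, 5))   # dynamics: (hidden state, action) -> next state
W_r = rng.normal(size=4)        # reward head
W_v = rng.normal(size=4)        # value head

def represent(obs):
    return np.tanh(W_h @ obs)

def dynamics(state, action):
    x = np.concatenate([state, [action]])
    next_state = np.tanh(W_g @ x)
    reward = float(W_r @ next_state)
    return next_state, reward

def value(state):
    return float(W_v @ state)

def rollout_return(obs, actions, discount=0.997):
    """Unroll the learned model along an action sequence; return the
    discounted sum of predicted rewards plus a bootstrapped value."""
    s = represent(obs)
    total, d = 0.0, 1.0
    for a in actions:
        s, r = dynamics(s, a)
        total += d * r
        d *= discount
    return total + d * value(s)

# Planning then amounts to comparing imagined action sequences without
# ever touching the real environment dynamics.
obs = rng.normal(size=8)
best = max([(0, 1), (1, 0), (1, 1)], key=lambda seq: rollout_return(obs, seq))
```

MuZero replaces this exhaustive comparison with Monte Carlo tree search over the same learned model, but the key property is visible already: every quantity used for planning is a prediction, not a simulator call.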

2.
J Neurosci ; 44(5)2024 Jan 31.
Article in English | MEDLINE | ID: mdl-37989593

ABSTRACT

Scientists have long conjectured that the neocortex learns patterns in sensory data to generate top-down predictions of upcoming stimuli. In line with this conjecture, different responses to pattern-matching vs pattern-violating visual stimuli have been observed in both spiking and somatic calcium imaging data. However, it remains unknown whether these pattern-violation signals are different between the distal apical dendrites, which are heavily targeted by top-down signals, and the somata, where bottom-up information is primarily integrated. Furthermore, it is unknown how responses to pattern-violating stimuli evolve over time as an animal gains more experience with them. Here, we address these unanswered questions by analyzing responses of individual somata and dendritic branches of layer 2/3 and layer 5 pyramidal neurons tracked over multiple days in primary visual cortex of awake, behaving female and male mice. We use sequences of Gabor patches with patterns in their orientations to create pattern-matching and pattern-violating stimuli, and two-photon calcium imaging to record neuronal responses. Many neurons in both layers show large differences between their responses to pattern-matching and pattern-violating stimuli. Interestingly, these responses evolve in opposite directions in the somata and distal apical dendrites, with somata becoming less sensitive to pattern-violating stimuli and distal apical dendrites more sensitive. These differences between the somata and distal apical dendrites may be important for hierarchical computation of sensory predictions and learning, since these two compartments tend to receive bottom-up and top-down information, respectively.


Subject(s)
Calcium , Neocortex , Male , Female , Mice , Animals , Calcium/physiology , Neurons/physiology , Dendrites/physiology , Pyramidal Cells/physiology , Neocortex/physiology
3.
Nat Rev Neurosci ; 21(6): 335-346, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32303713

ABSTRACT

During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to determine the effect of an individual synaptic modification on the behaviour of the system. The backpropagation algorithm solves this problem in deep artificial neural networks, but historically it has been viewed as biologically problematic. Nonetheless, recent developments in neuroscience and the successes of artificial neural networks have reinvigorated interest in whether backpropagation offers insights for understanding learning in the cortex. The backpropagation algorithm learns quickly by computing synaptic updates using feedback connections to deliver error signals. Although feedback connections are ubiquitous in the cortex, it is difficult to see how they could deliver the error signals required by strict formulations of backpropagation. Here we build on past and recent developments to argue that feedback connections may instead induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.
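The closing idea, that feedback-induced activity differences can locally stand in for explicit error signals, can be illustrated with a toy network. This is our sketch of the general scheme, not the authors' model: for simplicity the feedback path reuses the transposed forward weights, which real circuits would themselves have to approximate.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy regression task for a one-hidden-layer network. Errors are never
# propagated explicitly; instead top-down feedback nudges hidden activity,
# and the *difference* between nudged and free activity is the locally
# available learning signal.
X = rng.normal(size=(100, 6))
Y = np.sin(X @ rng.normal(size=(6, 2)))

W1 = rng.normal(size=(6, 12)) * 0.3
W2 = rng.normal(size=(12, 2)) * 0.3

def loss():
    return float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))

lr, beta = 0.05, 0.5
start = loss()
for _ in range(400):
    H_free = np.tanh(X @ W1)           # free (bottom-up) hidden activity
    E = Y - H_free @ W2                # output-layer error
    H_nudged = H_free + beta * (E @ W2.T)  # feedback-nudged activity
    delta = H_nudged - H_free          # local activity difference
    W1 += lr * X.T @ (delta * (1 - H_free ** 2)) / len(X)
    W2 += lr * H_free.T @ E / len(X)
end = loss()
```

Because the nudge points hidden activity toward states that reduce the output error, updates driven by the activity difference approximate the backprop gradient without any unit ever "receiving" an error signal as such.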


Subject(s)
Cerebral Cortex/physiology , Feedback , Learning/physiology , Algorithms , Animals , Humans , Models, Neurological , Neural Networks, Computer
4.
Nature ; 575(7782): 350-354, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31666705

ABSTRACT

Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges. Over the course of a decade and numerous competitions [1-3], the strongest agents have simplified important aspects of the game, utilized superhuman capabilities, or employed hand-crafted sub-systems [4]. Despite these advantages, no previous agent has come close to matching the overall skill of top StarCraft players. We chose to address the challenge of StarCraft using general-purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counter-strategies, each represented by deep neural networks [5,6]. We evaluated our agent, AlphaStar, in the full game of StarCraft II, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.


Subject(s)
Reinforcement, Psychology , Video Games , Artificial Intelligence , Humans , Learning
5.
Nature ; 557(7705): 429-433, 2018 May.
Article in English | MEDLINE | ID: mdl-29743670

ABSTRACT

Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go [1,2]. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning [3-5] failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex [6]. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space [7,8] and is critical for integrating self-motion (path integration) [6,7,9] and planning direct trajectories to goals (vector-based navigation) [7,10,11]. Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types [12]. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation [7,10,11], demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.


Subject(s)
Biomimetics/methods , Machine Learning , Neural Networks, Computer , Spatial Navigation , Animals , Entorhinal Cortex/cytology , Entorhinal Cortex/physiology , Environment , Grid Cells/physiology , Humans
6.
Nature ; 550(7676): 354-359, 2017 Oct 18.
Article in English | MEDLINE | ID: mdl-29052630

ABSTRACT

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo's own move selections and also the winner of AlphaGo's games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100-0 against the previously published, champion-defeating AlphaGo.
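The self-play training target described above can be written down compactly. A hedged sketch (our naming, not DeepMind's code): the network's policy head is regressed onto the MCTS visit distribution, and its value head onto the eventual game outcome.

```python
import numpy as np

def azero_loss(policy_logits, value_pred, visit_counts, outcome):
    """Per-position AlphaGo Zero-style loss: cross-entropy between the
    policy head and the MCTS visit distribution, plus squared error
    between the value head and the game winner (+1 / -1)."""
    pi = visit_counts / visit_counts.sum()              # search policy target
    logp = policy_logits - np.log(np.exp(policy_logits).sum())
    policy_loss = -(pi * logp).sum()
    value_loss = (value_pred - outcome) ** 2
    return float(policy_loss + value_loss)

counts = np.array([8.0, 1.0, 1.0])   # MCTS visited move 0 most often
matched = azero_loss(np.log(counts / counts.sum()), 1.0, counts, 1.0)
uniform = azero_loss(np.zeros(3), 1.0, counts, 1.0)
```

Training alternates self-play data generation with gradient steps on this loss; `matched` is lower than `uniform` because the cross-entropy term is minimized exactly when the policy reproduces the search distribution, which is what makes the search a policy-improvement operator.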


Subject(s)
Games, Recreational , Software , Unsupervised Machine Learning , Humans , Neural Networks, Computer , Reinforcement, Psychology , Supervised Machine Learning
7.
Nature ; 529(7587): 484-9, 2016 Jan 28.
Article in English | MEDLINE | ID: mdl-26819042

ABSTRACT

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses 'value networks' to evaluate board positions and 'policy networks' to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.


Subject(s)
Games, Recreational , Neural Networks, Computer , Software , Supervised Machine Learning , Computers , Europe , Humans , Monte Carlo Method , Reinforcement, Psychology
8.
Behav Brain Sci ; 42: e240, 2019 Nov 28.
Article in English | MEDLINE | ID: mdl-31775918

ABSTRACT

Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does. Here, we argue that it is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.


Subject(s)
Artificial Intelligence , Metaphor , Brain , Learning
9.
10.
Neural Comput ; 29(3): 578-602, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28095195

ABSTRACT

Recent work in computer science has shown the power of deep learning driven by the backpropagation algorithm in networks of artificial neurons. But real neurons in the brain are different from most of these artificial ones in at least three crucial ways: they emit spikes rather than graded outputs, their inputs and outputs are related dynamically rather than by piecewise-smooth functions, and they have no known way to coordinate arrays of synapses in separate forward and feedback pathways so that they change simultaneously and identically, as they do in backpropagation. Given these differences, it is unlikely that current deep learning algorithms can operate in the brain, but we show that these problems can be solved by two simple devices: learning rules can approximate dynamic input-output relations with piecewise-smooth functions, and a variation on the feedback alignment algorithm can train deep networks without having to coordinate forward and feedback synapses. Our results also show that deep spiking networks learn much better if each neuron computes an intracellular teaching signal that reflects that cell's nonlinearity. With this mechanism, networks of spiking neurons show useful learning in synapses at least nine layers upstream from the output cells and perform well compared to other spiking networks in the literature on the MNIST digit recognition task.
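The feedback-alignment idea the abstract builds on can be demonstrated in a few lines, here in a non-spiking toy network for brevity (the paper's contribution is making this work with spikes and dynamic input-output relations; this sketch shows only the core trick of a fixed random feedback matrix replacing the transposed forward weights).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression task
X = rng.normal(size=(200, 10))
Y = X @ rng.normal(size=(10, 2))

W1 = rng.normal(size=(10, 16)) * 0.1
W2 = rng.normal(size=(16, 2)) * 0.1
B = rng.normal(size=(2, 16)) * 0.1   # fixed random feedback matrix

def loss():
    return float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))

lr = 0.02
start = loss()
for _ in range(300):
    H = np.tanh(X @ W1)
    E = H @ W2 - Y                        # output error
    dW2 = H.T @ E / len(X)                # exact gradient for the top layer
    dH = E @ B                            # error routed through B, not W2.T
    dW1 = X.T @ (dH * (1 - H ** 2)) / len(X)
    W1 -= lr * dW1
    W2 -= lr * dW2
end = loss()
```

Because `B` never changes, no mechanism is needed to keep forward and feedback synapses coordinated; instead the forward weights gradually align themselves with `B` during training, which is why the random feedback still carries useful credit information.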

11.
Behav Brain Sci ; 40: e255, 2017 Jan.
Article in English | MEDLINE | ID: mdl-29342685

ABSTRACT

We agree with Lake and colleagues on their list of "key ingredients" for building human-like intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here, we survey several important examples of the progress that has been made toward building autonomous agents with human-like abilities, and highlight some outstanding challenges.


Subject(s)
Learning , Thinking , Humans , Problem Solving
12.
J Neurophysiol ; 115(4): 2021-32, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26843605

ABSTRACT

Primary motor cortex (M1) activity correlates with many motor variables, making it difficult to demonstrate how it participates in motor control. We developed a two-stage method that separates classifying the motor field of M1 neurons from predicting the spatiotemporal patterns of that motor field during reaching. We tested our approach with a neural network model that controlled a two-joint arm, showing the statistical relationship between network connectivity and neural activity across different motor tasks. In rhesus monkeys, M1 neurons classified by this method showed preferred reaching directions similar to those of their associated muscle groups. Importantly, the neural population signals predicted the spatiotemporal dynamics of their associated muscle groups, although a subgroup of atypical neurons reversed their directional preference, suggesting a selective role in antagonist control. These results highlight that M1 provides important details on the spatiotemporal patterns of muscle activity during motor skills such as reaching.


Subject(s)
Motor Cortex/physiology , Motor Neurons/physiology , Movement , Muscle, Skeletal/innervation , Posture , Animals , Arm/innervation , Arm/physiology , Macaca mulatta , Male , Motor Cortex/cytology , Muscle, Skeletal/physiology
13.
J Neurophysiol ; 113(7): 2812-23, 2015 Apr 01.
Article in English | MEDLINE | ID: mdl-25673733

ABSTRACT

A prevailing theory of the cortical control of limb movement posits that premotor cortex initiates a high-level motor plan that is transformed by the primary motor cortex (M1) into a low-level motor command to be executed. This theory implies that the premotor cortex is shielded from the motor periphery, and therefore its activity should not represent the low-level features of movement. Contrary to this theory, we show that both the dorsal (PMd) and ventral (PMv) premotor cortices exhibit population-level tuning properties that reflect the biomechanical properties of the periphery, similar to those observed in M1. We recorded single-unit activity from M1, PMd, and PMv and characterized their tuning properties while six rhesus macaques performed a reaching task in the horizontal plane. Each area exhibited a bimodal distribution of preferred directions during execution, consistent with the known biomechanical anisotropies of the muscles and limb segments. Moreover, these distributions varied in orientation or shape from planning to execution. A network model shows that such population dynamics are linked to a change in the biomechanics of the limb as the monkey begins to move, specifically to the state-dependent properties of muscles. We suggest that, like M1, neural populations in PMd and PMv are more directly linked with the motor periphery than previously thought.


Subject(s)
Arm/physiology , Executive Function/physiology , Motor Cortex/physiology , Movement/physiology , Muscle Contraction/physiology , Muscle, Skeletal/physiology , Animals , Computer Simulation , Female , Macaca mulatta , Male , Models, Neurological , Muscle, Skeletal/innervation , Time Factors
14.
Exp Brain Res ; 228(3): 327-39, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23700129

ABSTRACT

While sensorimotor adaptation to prisms that displace the visual field takes minutes, adapting to an inversion of the visual field takes weeks. In spite of a long history of study, the basis of this profound difference remains poorly understood. Here, we describe the computational issue that underpins this phenomenon and present experiments designed to explore the mechanisms involved. We show that displacements can be mastered without altering the update rule used to adjust the motor commands. In contrast, inversions flip the sign of crucial variables called sensitivity derivatives (variables that capture how changes in motor commands affect task error) and therefore require an update of the feedback learning rule itself. Models of sensorimotor learning that assume internal estimates of these variables are known and fixed predict that when the sign of a sensitivity derivative is flipped, adaptations should become increasingly counterproductive. In contrast, models that relearn these derivatives predict that performance should initially worsen, but then improve smoothly and remain stable once the estimate of the new sensitivity derivative has been corrected. Here, we evaluated these predictions by looking at human performance on a set of pointing tasks with vision perturbed by displacing and inverting prisms. Our experimental data corroborate the classic observation that subjects reduce their motor errors under inverted vision. Subjects' accuracy initially worsened and then improved. However, improvement was jagged rather than smooth, and performance remained unstable even after 8 days of continually inverted vision, suggesting that subjects improve via an unknown mechanism, perhaps a combination of cognitive and implicit strategies. These results offer a new perspective on classic work with inverted vision.
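The distinction between the two model classes can be made concrete with a one-dimensional toy (our construction, not the paper's model): a learner adjusts its motor command `u` using an internal estimate `s_hat` of the sensitivity derivative, and inverting vision flips the true sign `s_true`.

```python
def adapt(steps, s_true, s_hat, u=0.0, goal=1.0, lr=0.2):
    """Iteratively adjust command u to drive visual error to zero,
    crediting each error via the internal estimate s_hat of the true
    sensitivity derivative s_true."""
    errors = []
    for _ in range(steps):
        e = s_true * u - goal          # error the subject actually sees
        errors.append(abs(e))
        u -= lr * s_hat * e            # update uses the internal estimate
    return errors

normal = adapt(50, s_true=1.0, s_hat=1.0)        # ordinary vision: converges
fixed = adapt(50, s_true=-1.0, s_hat=1.0)        # inverted, estimate kept
relearned = adapt(50, s_true=-1.0, s_hat=-1.0)   # inverted, estimate corrected
```

With the estimate kept fixed after inversion, each update pushes the error in the wrong direction and performance grows increasingly counterproductive; once the estimate's sign is relearned, the same rule converges again, which is exactly the contrast the two model classes predict.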


Subject(s)
Adaptation, Physiological/physiology , Visual Fields/physiology , Visual Perception/physiology , Female , Humans , Male , Psychomotor Performance/physiology , Rotation
15.
Sci Data ; 10(1): 287, 2023 May 17.
Article in English | MEDLINE | ID: mdl-37198203

ABSTRACT

The apical dendrites of pyramidal neurons in sensory cortex receive primarily top-down signals from associative and motor regions, while cell bodies and nearby dendrites are heavily targeted by locally recurrent or bottom-up inputs from the sensory periphery. Based on these differences, a number of theories in computational neuroscience postulate a unique role for apical dendrites in learning. However, due to technical challenges in data collection, little data is available for comparing the responses of apical dendrites to cell bodies over multiple days. Here we present a dataset collected through the Allen Institute Mindscope's OpenScope program that addresses this need. This dataset comprises high-quality two-photon calcium imaging from the apical dendrites and the cell bodies of visual cortical pyramidal neurons, acquired over multiple days in awake, behaving mice that were presented with visual stimuli. Many of the cell bodies and dendrite segments were tracked over days, enabling analyses of how their responses change over time. This dataset allows neuroscientists to explore the differences between apical and somatic processing and plasticity.


Subject(s)
Pyramidal Cells , Visual Cortex , Animals , Mice , Cell Body , Dendrites/physiology , Neurons , Pyramidal Cells/physiology , Visual Cortex/physiology
16.
Nat Commun ; 14(1): 1597, 2023 Mar 22.
Article in English | MEDLINE | ID: mdl-36949048

ABSTRACT

Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts the focus from those capabilities, like game playing and language, that are especially well-developed or uniquely human, to those capabilities (inherited from over 500 million years of evolution) that are shared with all animals. Building models that can pass the embodied Turing test will provide a roadmap for the next generation of AI.


Subject(s)
Artificial Intelligence , Neurosciences , Animals , Humans
17.
Elife ; 10, 2021 Nov 03.
Article in English | MEDLINE | ID: mdl-34730516

ABSTRACT

Recent studies have identified rotational dynamics in motor cortex (MC), which many assume arise from intrinsic connections in MC. However, behavioral and neurophysiological studies suggest that MC behaves like a feedback controller where continuous sensory feedback and interactions with other brain areas contribute substantially to MC processing. We investigated these apparently conflicting theories by building recurrent neural networks that controlled a model arm and received sensory feedback from the limb. Networks were trained to counteract perturbations to the limb and to reach toward spatial targets. Network activities and sensory feedback signals to the network exhibited rotational structure even when the recurrent connections were removed. Furthermore, neural recordings in monkeys performing similar tasks also exhibited rotational structure not only in MC but also in somatosensory cortex. Our results argue that rotational structure may also reflect dynamics throughout the voluntary motor system involved in online control of motor actions.


Subject(s)
Feedback, Sensory/physiology , Macaca mulatta/physiology , Motor Cortex/physiology , Somatosensory Cortex/physiology , Animals , Models, Neurological
18.
J Neurophysiol ; 103(1): 564-72, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19923243

ABSTRACT

Correlations between neural activity in primary motor cortex (M1) and arm kinematics have recently been shown to be temporally extensive and spatially complex. These results provide a sophisticated account of M1 processing and suggest that M1 neurons encode high-level movement trajectories, termed "pathlets." However, interpreting pathlets is difficult because the mapping between M1 activity and arm kinematics is indirect: M1 activity can generate movement only via spinal circuitry and the substantial complexities of the musculoskeletal system. We hypothesized that filter-like complexities of the musculoskeletal system are sufficient to generate temporally extensive and spatially complex correlations between motor commands and arm kinematics. To test this hypothesis, we extended the computational and experimental method proposed for extracting pathlets from M1 activity to extract pathlets from muscle activity. Unlike M1 activity, it is clear that muscle activity does not encode arm kinematics. Accordingly, any spatiotemporal correlations in muscle pathlets can be attributed to musculoskeletal complexities rather than explicit higher-order representations. Our results demonstrate that extracting muscle pathlets is a robust and repeatable process. Pathlets extracted from the same muscle but different subjects or from the same muscle on different days were remarkably similar and roughly appropriate for that muscle's mechanical action. Critically, muscle pathlets included extensive spatiotemporal complexity, including kinematic features before and after the present muscle activity, similar to that reported for M1 neurons. These results suggest the possibility that M1 pathlets at least partly reflect the filter-like complexities of the periphery rather than high-level representations.


Subject(s)
Arm/physiology , Motor Activity/physiology , Motor Cortex/physiology , Muscle, Skeletal/physiology , Neurons/physiology , Analysis of Variance , Biomechanical Phenomena , Electromyography , Female , Hand/physiology , Humans , Male , Models, Neurological , Robotics , Time Factors
19.
Curr Opin Neurobiol ; 55: 82-89, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30851654

ABSTRACT

It has long been speculated that the backpropagation-of-error algorithm (backprop) may be a model of how the brain learns. Backpropagation-through-time (BPTT) is the canonical temporal analogue to backprop, used to assign credit in recurrent neural networks in machine learning, but there is even less conviction about whether BPTT has anything to do with the brain. Even in machine learning, the use of BPTT in classic neural network architectures has proven insufficient for some challenging temporal credit assignment (TCA) problems that we know the brain is capable of solving. Nonetheless, recent work in machine learning has made progress in solving difficult TCA problems by employing novel memory-based and attention-based architectures and algorithms, some of which are brain-inspired. Importantly, these recent machine learning methods have been developed in the context of, and with reference to, BPTT, and thus serve to strengthen BPTT's position as a useful normative guide for thinking about temporal credit assignment in artificial and biological systems alike.
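BPTT itself is compact enough to show. A minimal sketch with a one-parameter linear RNN (our toy, not from the review): the backward loop walks the unrolled computation graph, multiplying the credit signal by the recurrent weight at every step.

```python
def bptt_grad(w, xs, target):
    """Forward-simulate h_t = w*h_{t-1} + x_t, then backpropagate the
    final output error through every time step to get dLoss/dw."""
    hs = [0.0]
    for x in xs:
        hs.append(w * hs[-1] + x)
    err = hs[-1] - target
    grad, dh = 0.0, err
    for t in range(len(xs), 0, -1):   # walk backwards through time
        grad += dh * hs[t - 1]        # local contribution: dh_t/dw = h_{t-1}
        dh *= w                       # carry credit one step further back
    return err ** 2 / 2, grad

xs, target = [1.0, 2.0, 3.0], 6.0   # with w = 1 the RNN outputs the sum
w = 0.5
for _ in range(200):
    loss, g = bptt_grad(w, xs, target)
    w -= 0.05 * g
```

The `dh *= w` line is the crux of the biological puzzle the review discusses: it requires replaying the stored forward trajectory in reverse, which is exactly what memory-based and attention-based alternatives try to avoid.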


Subject(s)
Algorithms , Brain , Machine Learning , Memory , Neural Networks, Computer
20.
Curr Opin Neurobiol ; 54: 28-36, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30205266

ABSTRACT

Guaranteeing that synaptic plasticity leads to effective learning requires a means for assigning credit to each neuron for its contribution to behavior. The 'credit assignment problem' refers to the fact that credit assignment is non-trivial in hierarchical networks with multiple stages of processing. One difficulty is that if credit signals are integrated with other inputs, then it is hard for synaptic plasticity rules to distinguish credit-related activity from non-credit-related activity. A potential solution is to use the spatial layout and non-linear properties of dendrites to distinguish credit signals from other inputs. In cortical pyramidal neurons, evidence hints that top-down feedback signals are integrated in the distal apical dendrites and have a distinct impact on spike-firing and synaptic plasticity. This suggests that the distal apical dendrites of pyramidal neurons help the brain to solve the credit assignment problem.


Subject(s)
Brain/cytology , Dendrites/physiology , Learning , Neuronal Plasticity/physiology , Action Potentials/physiology , Animals , Brain/physiology , Humans , Neural Pathways/physiology