Search | BVS - MINISTÉRIO DA SAÚDE

Learning agile soccer skills for a bipedal robot with deep reinforcement learning.

Haarnoja, Tuomas; Moran, Ben; Lever, Guy; Huang, Sandy H; Tirumala, Dhruva; Humplik, Jan; Wulfmeier, Markus; Tunyasuvunakool, Saran; Siegel, Noah Y; Hafner, Roland; Bloesch, Michael; Hartikainen, Kristian; Byravan, Arunkumar; Hasenclever, Leonard; Tassa, Yuval; Sadeghi, Fereshteh; Batchelor, Nathan; Casarini, Federico; Saliceti, Stefano; Game, Charles; Sreendra, Neil; Patel, Kushal; Gwira, Marlon; Huber, Andrea; Hurley, Nicole; Nori, Francesco; Hadsell, Raia; Heess, Nicolas.

Sci Robot; 9(89): eadi8022, 2024 Apr 10.

Article in English | MEDLINE | ID: mdl-38598610

ABSTRACT

We investigated whether deep reinforcement learning (deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies. We used deep RL to train a humanoid robot to play a simplified one-versus-one soccer game. The resulting agent exhibits robust and dynamic movement skills, such as rapid fall recovery, walking, turning, and kicking, and it transitions between them in a smooth and efficient manner. It also learned to anticipate ball movements and block opponent shots. The agent's tactical behavior adapts to specific game contexts in a way that would be impractical to manually design. Our agent was trained in simulation and transferred to real robots zero-shot. A combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training enabled good-quality transfer. In experiments, the agent walked 181% faster, turned 302% faster, took 63% less time to get up, and kicked a ball 34% faster than a scripted baseline.

Subjects

Robotics; Soccer; Robotics/methods; Learning; Walking; Computer Simulation

Skilful precipitation nowcasting using deep generative models of radar.

Ravuri, Suman; Lenc, Karel; Willson, Matthew; Kangin, Dmitry; Lam, Remi; Mirowski, Piotr; Fitzsimons, Megan; Athanassiadou, Maria; Kashem, Sheleem; Madge, Sam; Prudden, Rachel; Mandhane, Amol; Clark, Aidan; Brock, Andrew; Simonyan, Karen; Hadsell, Raia; Robinson, Niall; Clancy, Ellen; Arribas, Alberto; Mohamed, Shakir.

Nature; 597(7878): 672-677, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34588668

ABSTRACT

Precipitation nowcasting, the high-resolution forecasting of precipitation up to two hours ahead, supports the real-world socioeconomic needs of many sectors reliant on weather-dependent decision-making [1,2]. State-of-the-art operational nowcasting methods typically advect precipitation fields with radar-based wind estimates, and struggle to capture important non-linear events such as convective initiations [3,4]. Recently introduced deep learning methods use radar to directly predict future rain rates, free of physical constraints [5,6]. While they accurately predict low-intensity rainfall, their operational utility is limited because their lack of constraints produces blurry nowcasts at longer lead times, yielding poor performance on rarer medium-to-heavy rain events. Here we present a deep generative model for the probabilistic nowcasting of precipitation from radar that addresses these challenges. Using statistical, economic and cognitive measures, we show that our method provides improved forecast quality, forecast consistency and forecast value. Our model produces realistic and spatiotemporally consistent predictions over regions up to 1,536 km × 1,280 km and with lead times from 5 to 90 min ahead. Using a systematic evaluation by more than 50 expert meteorologists, we show that our generative model ranked first for its accuracy and usefulness in 89% of cases against two competitive methods. When verified quantitatively, these nowcasts are skillful without resorting to blurring. We show that generative nowcasting can provide probabilistic predictions that improve forecast value and support operational utility, and at resolutions and lead times where alternative methods struggle.

Embracing Change: Continual Learning in Deep Neural Networks.

Hadsell, Raia; Rao, Dushyant; Rusu, Andrei A; Pascanu, Razvan.

Trends Cogn Sci; 24(12): 1028-1040, 2020 Dec.

Article in English | MEDLINE | ID: mdl-33158755

ABSTRACT

Artificial intelligence research has seen enormous progress over the past few decades, but it predominantly relies on fixed datasets and stationary environments. Continual learning is an increasingly relevant area of study that asks how artificial systems might learn sequentially, as biological systems do, from a continuous stream of correlated data. In the present review, we relate continual learning to the learning dynamics of neural networks, highlighting the potential it has to considerably improve data efficiency. We further consider the many new biologically inspired approaches that have emerged in recent years, focusing on those that utilize regularization, modularity, memory, and meta-learning, and highlight some of the most promising and impactful directions.

Subjects

Artificial Intelligence; Neural Networks, Computer; Humans; Learning; Memory

Vector-based navigation using grid-like representations in artificial agents.

Banino, Andrea; Barry, Caswell; Uria, Benigno; Blundell, Charles; Lillicrap, Timothy; Mirowski, Piotr; Pritzel, Alexander; Chadwick, Martin J; Degris, Thomas; Modayil, Joseph; Wayne, Greg; Soyer, Hubert; Viola, Fabio; Zhang, Brian; Goroshin, Ross; Rabinowitz, Neil; Pascanu, Razvan; Beattie, Charlie; Petersen, Stig; Sadik, Amir; Gaffney, Stephen; King, Helen; Kavukcuoglu, Koray; Hassabis, Demis; Hadsell, Raia; Kumaran, Dharshan.

Nature; 557(7705): 429-433, 2018 May.

Article in English | MEDLINE | ID: mdl-29743670

ABSTRACT

Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go [1,2]. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning [3-5] failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex [6]. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space [7,8] and is critical for integrating self-motion (path integration) [6,7,9] and planning direct trajectories to goals (vector-based navigation) [7,10,11]. Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types [12]. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation [7,10,11], demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.

Subjects

Biomimetics/methods; Machine Learning; Neural Networks, Computer; Spatial Navigation; Animals; Entorhinal Cortex/cytology; Entorhinal Cortex/physiology; Environment; Grid Cells/physiology; Humans

Reply to Huszár: The elastic weight consolidation penalty is empirically valid.

Kirkpatrick, James; Pascanu, Razvan; Rabinowitz, Neil; Veness, Joel; Desjardins, Guillaume; Rusu, Andrei A; Milan, Kieran; Quan, John; Ramalho, Tiago; Grabska-Barwinska, Agnieszka; Hassabis, Demis; Clopath, Claudia; Kumaran, Dharshan; Hadsell, Raia.

Proc Natl Acad Sci U S A; 115(11): E2498, 2018 Mar 13.

Article in English | MEDLINE | ID: mdl-29463734

Subjects

Machine Learning

Overcoming catastrophic forgetting in neural networks.

Proc Natl Acad Sci U S A; 114(13): 3521-3526, 2017 Mar 28.

Article in English | MEDLINE | ID: mdl-28292907

ABSTRACT

The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Until now neural networks have not been capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks that they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on a hand-written digit dataset and by learning several Atari 2600 games sequentially.

Subjects

Neural Networks, Computer; Algorithms; Artificial Intelligence; Computer Simulation; Humans; Learning; Memory; Mental Recall
