Search | BVS - MINISTÉRIO DA SAÚDE

Learning agile soccer skills for a bipedal robot with deep reinforcement learning.

Haarnoja, Tuomas; Moran, Ben; Lever, Guy; Huang, Sandy H; Tirumala, Dhruva; Humplik, Jan; Wulfmeier, Markus; Tunyasuvunakool, Saran; Siegel, Noah Y; Hafner, Roland; Bloesch, Michael; Hartikainen, Kristian; Byravan, Arunkumar; Hasenclever, Leonard; Tassa, Yuval; Sadeghi, Fereshteh; Batchelor, Nathan; Casarini, Federico; Saliceti, Stefano; Game, Charles; Sreendra, Neil; Patel, Kushal; Gwira, Marlon; Huber, Andrea; Hurley, Nicole; Nori, Francesco; Hadsell, Raia; Heess, Nicolas.

Sci Robot ; 9(89): eadi8022, 2024 Apr 10.

Article in English | MEDLINE | ID: mdl-38598610

ABSTRACT

We investigated whether deep reinforcement learning (deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies. We used deep RL to train a humanoid robot to play a simplified one-versus-one soccer game. The resulting agent exhibits robust and dynamic movement skills, such as rapid fall recovery, walking, turning, and kicking, and it transitions between them in a smooth and efficient manner. It also learned to anticipate ball movements and block opponent shots. The agent's tactical behavior adapts to specific game contexts in a way that would be impractical to manually design. Our agent was trained in simulation and transferred to real robots zero-shot. A combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training enabled good-quality transfer. In experiments, the agent walked 181% faster, turned 302% faster, took 63% less time to get up, and kicked a ball 34% faster than a scripted baseline.

Subjects

Robotics; Soccer; Robotics/methods; Learning; Walking; Computer Simulation

Prefrontal cortex as a meta-reinforcement learning system.

Wang, Jane X; Kurth-Nelson, Zeb; Kumaran, Dharshan; Tirumala, Dhruva; Soyer, Hubert; Leibo, Joel Z; Hassabis, Demis; Botvinick, Matthew.

Nat Neurosci ; 21(6): 860-868, 2018 Jun.

Article in English | MEDLINE | ID: mdl-29760527

ABSTRACT

Over the past 20 years, neuroscience research on reward-based learning has converged on a canonical model, under which the neurotransmitter dopamine 'stamps in' associations between situations, actions and rewards by modulating the strength of synaptic connections between neurons. However, a growing number of recent findings have placed this standard model under strain. We now draw on recent advances in artificial intelligence to introduce a new theory of reward-based learning. Here, the dopamine system trains another part of the brain, the prefrontal cortex, to operate as its own free-standing learning system. This new perspective accommodates the findings that motivated the standard model, but also deals gracefully with a wider range of observations, providing a fresh foundation for future research.

Subjects

Learning/physiology; Prefrontal Cortex/physiology; Reinforcement, Psychology; Algorithms; Animals; Artificial Intelligence; Computer Simulation; Dopamine/physiology; Humans; Models, Neurological; Optogenetics; Reward
