Search | BVS España Search Portal

Faster sorting algorithms discovered using deep reinforcement learning.

Mankowitz, Daniel J; Michi, Andrea; Zhernov, Anton; Gelmi, Marco; Selvi, Marco; Paduraru, Cosmin; Leurent, Edouard; Iqbal, Shariq; Lespiau, Jean-Baptiste; Ahern, Alex; Köppe, Thomas; Millikin, Kevin; Gaffney, Stephen; Elster, Sophie; Broshear, Jackson; Gamble, Chris; Milan, Kieran; Tung, Robert; Hwang, Minjae; Cemgil, Taylan; Barekatain, Mohammadamin; Li, Yujia; Mandhane, Amol; Hubert, Thomas; Schrittwieser, Julian; Hassabis, Demis; Kohli, Pushmeet; Riedmiller, Martin; Vinyals, Oriol; Silver, David.

Nature ; 618(7964): 257-263, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37286649

ABSTRACT

Fundamental algorithms such as sorting or hashing are used trillions of times on any given day [1]. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past [2], making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library [3]. This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.
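To make "small sorting algorithms" concrete, the sketch below shows one well-known fixed-size routine: a three-element sort built from a fixed sequence of compare-exchange steps. It is only an illustration of the kind of routine the abstract refers to, not the routine AlphaDev discovered and not the LLVM implementation. The names `compare_exchange` and `sort3` are hypothetical.

```python
def compare_exchange(values, i, j):
    """Swap values[i] and values[j] in place if they are out of order."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]


def sort3(values):
    """Return a sorted copy of exactly three items.

    Assumption: this is a textbook three-step compare-exchange
    sequence, shown for illustration only. It is not the routine
    described in the paper.
    """
    if len(values) != 3:
        raise ValueError("sort3 expects exactly three items")
    out = list(values)
    compare_exchange(out, 0, 1)  # order the first pair
    compare_exchange(out, 1, 2)  # move the largest item to the end
    compare_exchange(out, 0, 1)  # re-order the first pair
    return out
```

Because the sequence of comparisons is fixed and does not depend on the input, a routine of this shape can be checked exhaustively over all orderings of three items.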

Overcoming catastrophic forgetting in neural networks.

Kirkpatrick, James; Pascanu, Razvan; Rabinowitz, Neil; Veness, Joel; Desjardins, Guillaume; Rusu, Andrei A; Milan, Kieran; Quan, John; Ramalho, Tiago; Grabska-Barwinska, Agnieszka; Hassabis, Demis; Clopath, Claudia; Kumaran, Dharshan; Hadsell, Raia.

Proc Natl Acad Sci U S A ; 114(13): 3521-3526, 2017 03 28.

Article in English | MEDLINE | ID: mdl-28292907

ABSTRACT

The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Until now neural networks have not been capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks that they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on a hand-written digit dataset and by learning several Atari 2600 games sequentially.

Subject(s)

Neural Networks, Computer; Algorithms; Artificial Intelligence; Computer Simulation; Humans; Learning; Memory; Mental Recall
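The abstract above describes remembering old tasks by "selectively slowing down learning on the weights important for those tasks". One common way to express that idea is a quadratic penalty that pulls each weight toward its value after the old task, scaled by a per-weight importance. The sketch below is a minimal illustration under that assumption. The function names, the `strength` parameter, and the way `importance` is supplied are hypothetical, and the abstract does not say how importance is estimated.

```python
def ewc_penalty(weights, anchor_weights, importance, strength):
    """Quadratic penalty: 0.5 * strength * sum(F_i * (w_i - a_i)^2).

    weights        -- current parameter values
    anchor_weights -- parameter values kept from the earlier task
    importance     -- per-parameter importance (assumed given here)
    strength       -- hypothetical scalar weighting old vs. new task
    """
    if not (len(weights) == len(anchor_weights) == len(importance)):
        raise ValueError("all inputs must have the same length")
    total = sum(
        f * (w - a) ** 2
        for w, a, f in zip(weights, anchor_weights, importance)
    )
    return 0.5 * strength * total


def ewc_penalty_gradient(weights, anchor_weights, importance, strength):
    """Gradient of the penalty with respect to each weight."""
    if not (len(weights) == len(anchor_weights) == len(importance)):
        raise ValueError("all inputs must have the same length")
    return [
        strength * f * (w - a)
        for w, a, f in zip(weights, anchor_weights, importance)
    ]
```

In this sketch a weight with high importance gets a large restoring gradient as soon as it drifts from its anchor, so it changes slowly on the new task, while a weight with importance near zero remains free to move.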

Reply to Huszár: The elastic weight consolidation penalty is empirically valid.

Proc Natl Acad Sci U S A ; 115(11): E2498, 2018 03 13.

Article in English | MEDLINE | ID: mdl-29463734

Subject(s)

Machine Learning
