Results 1 - 4 of 4
1.
Nature; 602(7896): 223-228, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35140384

ABSTRACT

Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits [1]. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world's best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing's important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world's best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.
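
As a rough illustration only: the abstract describes a reward function that rewards competitiveness while penalising violations of under-specified sportsmanship rules. The sketch below shows one plausible shape such a composite reward could take; all term names and weights are illustrative assumptions, not the reward actually used for Gran Turismo Sophy.

    # Hypothetical composite racing reward: course progress plus penalties for
    # off-course excursions and at-fault contact. Weights are made-up examples.
    def racing_reward(progress_m, off_course, collision, at_fault,
                      w_progress=1.0, w_off=5.0, w_collision=10.0):
        """Scalar reward for one control step.

        progress_m : metres advanced along the track this step
        off_course : True if the car left the track limits
        collision  : True if the car contacted an opponent
        at_fault   : True if the agent is judged responsible for the contact
        """
        reward = w_progress * progress_m
        if off_course:
            reward -= w_off
        if collision and at_fault:
            reward -= w_collision   # penalise only unsportsmanlike contact
        return reward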


Subjects
Automobile Driving, Deep Learning, Reinforcement (Psychology), Sports, Video Games, Automobile Driving/standards, Competitive Behavior, Humans, Reward, Sports/standards
2.
Neural Comput; 28(8): 1599-662, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27348735

ABSTRACT

Consider a self-motivated artificial agent who is exploring a complex environment. Part of the complexity is due to the raw high-dimensional sensory input streams, which the agent needs to make sense of. Such inputs can be compactly encoded through a variety of means; one of these is slow feature analysis (SFA). Slow features encode spatiotemporal regularities, which are information-rich explanatory factors (latent variables) underlying the high-dimensional input streams. In our previous work, we have shown how slow features can be learned incrementally, while the agent explores its world, and modularly, such that different sets of features are learned for different parts of the environment (since a single set of regularities does not explain everything). In what order should the agent explore the different parts of the environment? Following Schmidhuber's theory of artificial curiosity, the agent should always concentrate on the area where it can learn the easiest-to-learn set of features that it has not already learned. We formalize this learning problem and theoretically show that, using our model, called curiosity-driven modular incremental slow feature analysis, the agent on average will learn slow feature representations in order of increasing learning difficulty, under certain mild conditions. We provide experimental results to support the theoretical analysis.
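
A hedged sketch of the curiosity mechanism described above (my own simplification, not the authors' code): the agent tracks how fast its encoding error is dropping in each part of the environment and prefers the region whose features are currently easiest to learn.

    # Learning-progress curiosity over a fixed set of modules/regions.
    # The window size and the "unexplored looks best" default are assumptions.
    import numpy as np

    class LearningProgressCuriosity:
        def __init__(self, n_modules, window=20):
            self.errors = [[] for _ in range(n_modules)]   # recent errors per module
            self.window = window

        def update(self, module_id, error):
            self.errors[module_id].append(error)

        def progress(self, module_id):
            # Learning progress = older mean error minus newer mean error.
            e = self.errors[module_id][-2 * self.window:]
            if len(e) < 2 * self.window:
                return float("inf")    # unexplored modules look maximally promising
            older, newer = np.mean(e[:self.window]), np.mean(e[self.window:])
            return older - newer

        def choose_module(self):
            return int(np.argmax([self.progress(i)
                                  for i in range(len(self.errors))]))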

3.
Neural Comput; 24(11): 2994-3024, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22845826

ABSTRACT

We introduce here an incremental version of slow feature analysis (IncSFA), combining candid covariance-free incremental principal components analysis (CCIPCA) and covariance-free incremental minor components analysis (CIMCA). IncSFA's feature updating complexity is linear with respect to the input dimensionality, while batch SFA's (BSFA) updating complexity is cubic. IncSFA does not need to store, or even compute, any covariance matrices. The drawback to IncSFA is data efficiency: it does not use each data point as effectively as BSFA. But IncSFA allows SFA to be tractably applied, with just a few parameters, directly on high-dimensional input streams (e.g., visual input of an autonomous agent), while BSFA has to resort to hierarchical receptive-field-based architectures when the input dimension is too high. Further, IncSFA's updates have simple Hebbian and anti-Hebbian forms, extending the biological plausibility of SFA. Experimental results show IncSFA learns the same set of features as BSFA and can handle a few cases where BSFA fails.
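
A minimal one-feature sketch of the idea behind IncSFA (a simplification under stated assumptions, not the published algorithm): after whitening, the slowest feature is the minor component of the whitened derivative signal, and it can be tracked with a purely sample-based anti-Hebbian update without storing any covariance matrix. The real method uses CCIPCA for incremental whitening and a CIMCA rule for several components; here the whitening is done in batch only to keep the example short, and the toy signal is invented for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0, 8 * np.pi, 4000)
    slow, fast = np.sin(t), np.sin(17 * t)
    x = np.stack([slow + 0.1 * fast, fast + 0.1 * slow], axis=1)  # mixed input stream

    # Batch whitening (stand-in for CCIPCA in the incremental algorithm).
    xc = x - x.mean(axis=0)
    d, E = np.linalg.eigh(np.cov(xc.T))
    z = (xc @ E) / np.sqrt(d)

    # Anti-Hebbian minor-component tracking on the derivative signal.
    w = rng.normal(size=2)
    w /= np.linalg.norm(w)
    eta = 0.05
    for _ in range(5):                     # a few passes over the stream
        for i in range(1, len(z)):
            z_dot = z[i] - z[i - 1]
            y = w @ z_dot
            w -= eta * y * z_dot           # shrink the response to fast directions
            w /= np.linalg.norm(w)         # unit-norm constraint, as in SFA

    slow_feature = z @ w                   # should recover a scaled copy of the slow source
    print(abs(np.corrcoef(slow_feature, slow)[0, 1]))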


Subjects
Algorithms, Computer Simulation, Learning, Models (Neurological), Artificial Intelligence, Principal Component Analysis
4.
Front Neurorobot; 7: 9, 2013.
Article in English | MEDLINE | ID: mdl-23755011

ABSTRACT

Curiosity-Driven Modular Incremental Slow Feature Analysis (CD-MISFA) is a recently introduced model of intrinsically motivated invariance learning. Artificial curiosity enables the orderly formation of multiple stable sensory representations to simplify the agent's complex sensory input. We discuss computational properties of the CD-MISFA model itself as well as neurophysiological analogs fulfilling similar functional roles. CD-MISFA combines (1) unsupervised representation learning through the slowness principle, (2) generation of an intrinsic reward signal through learning progress of the developing features, and (3) balancing of exploration and exploitation to maximize learning progress and quickly learn multiple feature sets for perceptual simplification. Experimental results on synthetic observations and on the iCub robot show that the intrinsic value system is essential for representation learning. Representations are typically explored and learned in order from least to most costly, as predicted by the theory of curiosity.
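
One way the exploration/exploitation balance mentioned above could look in code (a hypothetical sketch, not the CD-MISFA implementation): exploit the module with the highest estimated learning progress most of the time, while a small epsilon of random choices keeps every module occasionally revisited.

    import random

    def select_module(progress_estimates, epsilon=0.1):
        # progress_estimates: per-module learning-progress values
        # (higher = currently easier to learn). Epsilon is an assumed constant.
        if random.random() < epsilon:
            return random.randrange(len(progress_estimates))      # explore
        return max(range(len(progress_estimates)),
                   key=lambda i: progress_estimates[i])           # exploit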
