Pesquisa | BVS - MINISTÉRIO DA SAÚDE

1.

TacticAI: an AI assistant for football tactics.

Wang, Zhe; Velickovic, Petar; Hennes, Daniel; Tomasev, Nenad; Prince, Laurel; Kaisers, Michael; Bachrach, Yoram; Elie, Romuald; Wenliang, Li Kevin; Piccinini, Federico; Spearman, William; Graham, Ian; Connor, Jerome; Yang, Yi; Recasens, Adrià; Khan, Mina; Beauguerlange, Nathalie; Sprechmann, Pablo; Moreno, Pol; Heess, Nicolas; Bowling, Michael; Hassabis, Demis; Tuyls, Karl.

Nat Commun ; 15(1): 1906, 2024 Mar 19.

Artigo em Inglês | MEDLINE | ID: mdl-38503774

RESUMO

Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing corner kicks, as they offer coaches the most direct opportunities for interventions and improvements. TacticAI incorporates both a predictive and a generative component, allowing the coaches to effectively sample and explore alternative player setups for each corner kick routine and to select those with the highest predicted likelihood of success. We validate TacticAI on a number of relevant benchmark tasks: predicting receivers and shot attempts and recommending player position adjustments. The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC. We show that TacticAI's model suggestions are not only indistinguishable from real tactics, but also favoured over existing tactics 90% of the time, and that TacticAI offers an effective corner kick retrieval system. TacticAI achieves these results despite the limited availability of gold-standard data, achieving data efficiency through geometric deep learning.

Assuntos

Desempenho Atlético , Desempenho Atlético/fisiologia , Pesquisa Qualitativa , Futebol

2.

Mastering the game of Stratego with model-free multiagent reinforcement learning.

Perolat, Julien; De Vylder, Bart; Hennes, Daniel; Tarassov, Eugene; Strub, Florian; de Boer, Vincent; Muller, Paul; Connor, Jerome T; Burch, Neil; Anthony, Thomas; McAleer, Stephen; Elie, Romuald; Cen, Sarah H; Wang, Zhe; Gruslys, Audrunas; Malysheva, Aleksandra; Khan, Mina; Ozair, Sherjil; Timbers, Finbarr; Pohlen, Toby; Eccles, Tom; Rowland, Mark; Lanctot, Marc; Lespiau, Jean-Baptiste; Piot, Bilal; Omidshafiei, Shayegan; Lockhart, Edward; Sifre, Laurent; Beauguerlange, Nathalie; Munos, Remi; Silver, David; Singh, Satinder; Hassabis, Demis; Tuyls, Karl.

Science ; 378(6623): 990-996, 2022 12 02.

Artigo em Inglês | MEDLINE | ID: mdl-36454847

RESUMO

We introduce DeepNash, an autonomous agent that plays the imperfect information game Stratego at a human expert level. Stratego is one of the few iconic board games that artificial intelligence (AI) has not yet mastered. It is a game characterized by a twin challenge: It requires long-term strategic thinking as in chess, but it also requires dealing with imperfect information as in poker. The technique underpinning DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego through self-play from scratch. DeepNash beat existing state-of-the-art AI methods in Stratego and achieved a year-to-date (2022) and all-time top-three ranking on the Gravon games platform, competing with human expert players.

Assuntos

Inteligência Artificial , Reforço Psicológico , Jogos de Vídeo , Humanos

3.

From motor control to team play in simulated humanoid football.

Liu, Siqi; Lever, Guy; Wang, Zhe; Merel, Josh; Eslami, S M Ali; Hennes, Daniel; Czarnecki, Wojciech M; Tassa, Yuval; Omidshafiei, Shayegan; Abdolmaleki, Abbas; Siegel, Noah Y; Hasenclever, Leonard; Marris, Luke; Tunyasuvunakool, Saran; Song, H Francis; Wulfmeier, Markus; Muller, Paul; Haarnoja, Tuomas; Tracey, Brendan; Tuyls, Karl; Graepel, Thore; Heess, Nicolas.

Sci Robot ; 7(69): eabo0235, 2022 08 31.

Artigo em Inglês | MEDLINE | ID: mdl-36044556

RESUMO

Learning to combine control at the level of joint torques with longer-term goal-directed behavior is a long-standing challenge for physically embodied artificial agents. Intelligent behavior in the physical world unfolds across multiple spatial and temporal scales: Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals that are defined on much longer time scales and that often involve complex interactions with the environment and other agents. Recent research has demonstrated the potential of learning-based approaches applied to the respective problems of complex movement, long-term planning, and multiagent coordination. However, their integration traditionally required the design and optimization of independent subsystems and remains challenging. In this work, we tackled the integration of motor control and long-horizon decision-making in the context of simulated humanoid football, which requires agile motor control and multiagent coordination. We optimized teams of agents to play simulated football via reinforcement learning, constraining the solution space to that of plausible movements learned using human motion capture data. They were trained to maximize several environment rewards and to imitate pretrained football-specific skills if doing so led to improved performance. The result is a team of coordinated humanoid football players that exhibit complex behavior at different scales, quantified by a range of analysis and statistics, including those used in real-world sport analytics. Our work constitutes a complete demonstration of learned integrated decision-making at multiple scales in a multiagent setting.

Assuntos

Futebol Americano , Futebol , Humanos , Aprendizagem , Movimento , Reforço Psicológico , Futebol/fisiologia

4.

Multiagent off-screen behavior prediction in football.

Omidshafiei, Shayegan; Hennes, Daniel; Garnelo, Marta; Wang, Zhe; Recasens, Adria; Tarassov, Eugene; Yang, Yi; Elie, Romuald; Connor, Jerome T; Muller, Paul; Mackraz, Natalie; Cao, Kris; Moreno, Pol; Sprechmann, Pablo; Hassabis, Demis; Graham, Ian; Spearman, William; Heess, Nicolas; Tuyls, Karl.

Sci Rep ; 12(1): 8638, 2022 05 23.

Artigo em Inglês | MEDLINE | ID: mdl-35606400

RESUMO

In multiagent worlds, several decision-making individuals interact while adhering to the dynamics constraints imposed by the environment. These interactions, combined with the potential stochasticity of the agents' dynamic behaviors, make such systems complex and interesting to study from a decision-making perspective. Significant research has been conducted on learning models for forward-direction estimation of agent behaviors, for example, pedestrian predictions used for collision-avoidance in self-driving cars. In many settings, only sporadic observations of agents may be available in a given trajectory sequence. In football, subsets of players may come in and out of view of broadcast video footage, while unobserved players continue to interact off-screen. In this paper, we study the problem of multiagent time-series imputation in the context of human football play, where available past and future observations of subsets of agents are used to estimate missing observations for other agents. Our approach, called the Graph Imputer, uses past and future information in combination with graph networks and variational autoencoders to enable learning of a distribution of imputed trajectories. We demonstrate our approach on multiagent settings involving players that are partially-observable, using the Graph Imputer to predict the behaviors of off-screen players. To quantitatively evaluate the approach, we conduct experiments on football matches with ground truth trajectory data, using a camera module to simulate the off-screen player state estimation setting. We subsequently use our approach for downstream football analytics under partial observability using the well-established framework of pitch control, which traditionally relies on fully observed data. We illustrate that our method outperforms several state-of-the-art approaches, including those hand-crafted for football, across all considered metrics.

Assuntos

Futebol Americano , Futebol , Humanos , Aprendizagem

5.

Navigating the landscape of multiplayer games.

Omidshafiei, Shayegan; Tuyls, Karl; Czarnecki, Wojciech M; Santos, Francisco C; Rowland, Mark; Connor, Jerome; Hennes, Daniel; Muller, Paul; Pérolat, Julien; Vylder, Bart De; Gruslys, Audrunas; Munos, Rémi.

Nat Commun ; 11(1): 5603, 2020 11 05.

Artigo em Inglês | MEDLINE | ID: mdl-33154362

RESUMO

Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence. Traditionally, researchers have focused on using well-known games to build strong agents. This progress, however, can be better informed by characterizing games and their topological landscape. Tackling this latter question can facilitate understanding of agents and help determine what game an agent should target next as part of its training. Here, we show how network measures applied to response graphs of large-scale games enable the creation of a landscape of games, quantifying relationships between games of varying sizes and characteristics. We illustrate our findings in domains ranging from canonical games to complex empirical games capturing the performance of trained agents pitted against one another. Our results culminate in a demonstration leveraging this information to generate new and interesting games, including mixtures of empirical games synthesized from real world games.

6.

α-Rank: Multi-Agent Evaluation by Evolution.

Omidshafiei, Shayegan; Papadimitriou, Christos; Piliouras, Georgios; Tuyls, Karl; Rowland, Mark; Lespiau, Jean-Baptiste; Czarnecki, Wojciech M; Lanctot, Marc; Perolat, Julien; Munos, Remi.

Sci Rep ; 9(1): 9937, 2019 07 09.

Artigo em Inglês | MEDLINE | ID: mdl-31289288

RESUMO

We introduce α-Rank, a principled evolutionary dynamics methodology, for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov-Conley chains (MCCs). The approach leverages continuous-time and discrete-time evolutionary dynamical systems applied to empirical games, and scales tractably in the number of agents, in the type of interactions (beyond dyadic), and the type of empirical games (symmetric and asymmetric). Current models are fundamentally limited in one or more of these dimensions, and are not guaranteed to converge to the desired game-theoretic solution concept (typically the Nash equilibrium). α-Rank automatically provides a ranking over the set of agents under evaluation and provides insights into their strengths, weaknesses, and long-term dynamics in terms of basins of attraction and sink components. This is a direct consequence of the correspondence we establish to the dynamical MCC solution concept when the underlying evolutionary model's ranking-intensity parameter, α, is chosen to be large, which exactly forms the basis of α-Rank. In contrast to the Nash equilibrium, which is a static solution concept based solely on fixed points, MCCs are a dynamical solution concept based on the Markov chain formalism, Conley's Fundamental Theorem of Dynamical Systems, and the core ingredients of dynamical systems: fixed points, recurrent sets, periodic orbits, and limit cycles. Our α-Rank method runs in polynomial time with respect to the total number of pure strategy profiles, whereas computing a Nash equilibrium for a general-sum game is known to be intractable. We introduce mathematical proofs that not only provide an overarching and unifying perspective of existing continuous- and discrete-time evolutionary evaluation models, but also reveal the formal underpinnings of the α-Rank methodology. We illustrate the method in canonical games and empirically validate it in several domains, including AlphaGo, AlphaZero, MuJoCo Soccer, and Poker.

7.

Symmetric Decomposition of Asymmetric Games.

Tuyls, Karl; Pérolat, Julien; Lanctot, Marc; Ostrovski, Georg; Savani, Rahul; Leibo, Joel Z; Ord, Toby; Graepel, Thore; Legg, Shane.

Sci Rep ; 8(1): 1015, 2018 01 17.

Artigo em Inglês | MEDLINE | ID: mdl-29343692

RESUMO

We introduce new theoretical insights into two-population asymmetric games allowing for an elegant symmetric decomposition into two single population symmetric games. Specifically, we show how an asymmetric bimatrix game (A,B) can be decomposed into its symmetric counterparts by envisioning and investigating the payoff tables (A and B) that constitute the asymmetric game, as two independent, single population, symmetric games. We reveal several surprising formal relationships between an asymmetric two-population game and its symmetric single population counterparts, which facilitate a convenient analysis of the original asymmetric game due to the dimensionality reduction of the decomposition. The main finding reveals that if (x,y) is a Nash equilibrium of an asymmetric game (A,B), this implies that y is a Nash equilibrium of the symmetric counterpart game determined by payoff table A, and x is a Nash equilibrium of the symmetric counterpart game determined by payoff table B. Also the reverse holds and combinations of Nash equilibria of the counterpart games form Nash equilibria of the asymmetric game. We illustrate how these formal relationships aid in identifying and analysing the Nash structure of asymmetric games, by examining the evolutionary dynamics of the simpler counterpart games in several canonical examples.

Assuntos

Jogos Experimentais , Modelos Estatísticos , Feminino , Teoria dos Jogos , Humanos , Masculino

8.

Space Debris Removal: Learning to Cooperate and the Price of Anarchy.

Klima, Richard; Bloembergen, Daan; Savani, Rahul; Tuyls, Karl; Wittig, Alexander; Sapera, Andrei; Izzo, Dario.

Front Robot AI ; 5: 54, 2018.

Artigo em Inglês | MEDLINE | ID: mdl-33500936

RESUMO

In this paper we study space debris removal from a game-theoretic perspective. In particular we focus on the question whether and how self-interested agents can cooperate in this dilemma, which resembles a tragedy of the commons scenario. We compare centralised and decentralised solutions and the corresponding price of anarchy, which measures the extent to which competition approximates cooperation. In addition we investigate whether agents can learn optimal strategies by reinforcement learning. To this end, we improve on an existing high fidelity orbital simulator, and use this simulator to obtain a computationally efficient surrogate model that can be used for our subsequent game-theoretic analysis. We study both single- and multi-agent approaches using stochastic (Markov) games and reinforcement learning. The main finding is that the cost of a decentralised, competitive solution can be significant, which should be taken into consideration when forming debris removal strategies.

9.

How to reach linguistic consensus: a proof of convergence for the naming game.

De Vylder, Bart; Tuyls, Karl.

J Theor Biol ; 242(4): 818-31, 2006 Oct 21.

Artigo em Inglês | MEDLINE | ID: mdl-16843499

RESUMO

In this paper we introduce a mathematical model of naming games. Naming games have been widely used within research on the origins and evolution of language. Despite the many interesting empirical results these studies have produced, most of this research lacks a formal elucidating theory. In this paper we show how a population of agents can reach linguistic consensus, i.e. learn to use one common language to communicate with one another. Our approach differs from existing formal work in two important ways: one, we relax the too strong assumption that an agent samples infinitely often during each time interval. This assumption is usually made to guarantee convergence of an empirical learning process to a deterministic dynamical system. Two, we provide a proof that under these new realistic conditions, our model converges to a common language for the entire population of agents. Finally the model is experimentally validated.

Assuntos

Consenso , Linguística , Modelos Psicológicos , Evolução Biológica , Teoria dos Jogos , Humanos , Idioma , Aprendizagem , Meio Social

10.

The evolutionary language game: an orthogonal approach.

Lenaerts, Tom; Jansen, Bart; Tuyls, Karl; De Vylder, Bart.

J Theor Biol ; 235(4): 566-82, 2005 Aug 21.

Artigo em Inglês | MEDLINE | ID: mdl-15935174

RESUMO

Evolutionary game dynamics have been proposed as a mathematical framework for the cultural evolution of language and more specifically the evolution of vocabulary. This article discusses a model that is mutually exclusive in its underlying principals with some previously suggested models. The model describes how individuals in a population culturally acquire a vocabulary by actively participating in the acquisition process instead of passively observing and communicate through peer-to-peer interactions instead of vertical parent-offspring relations. Concretely, a notion of social/cultural learning called the naming game is first abstracted using learning theory. This abstraction defines the required cultural transmission mechanism for an evolutionary process. Second, the derived transmission system is expressed in terms of the well-known selection-mutation model defined in the context of evolutionary dynamics. In this way, the analogy between social learning and evolution at the level of meaning-word associations is made explicit. Although only horizontal and oblique transmission structures will be considered, extensions to vertical structures over different genetic generations can easily be incorporated. We provide a number of simplified experiments to clarify our reasoning.

Assuntos

Evolução Biológica , Teoria dos Jogos , Modelos Psicológicos , Vocabulário , Humanos , Aprendizagem , Meio Social

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA