Búsqueda | Portal de Búsqueda de la BVS España

1.

Emotions and courtship help bonded pairs cooperate, but emotional agents are vulnerable to deceit.

Sadedin, Suzanne; Duéñez-Guzmán, Edgar A; Leibo, Joel Z.

Proc Natl Acad Sci U S A ; 120(46): e2308911120, 2023 Nov 14.

Artículo en Inglés | MEDLINE | ID: mdl-37948585

RESUMEN

Coordinated pair bonds are common in birds and also occur in many other taxa. How do animals solve the social dilemmas they face in coordinating with a partner? We developed an evolutionary model to explore this question, based on observations that a) neuroendocrine feedback provides emotional bookkeeping which is thought to play a key role in vertebrate social bonds and b) these bonds are developed and maintained via courtship interactions that include low-stakes social dilemmas. Using agent-based simulation, we found that emotional bookkeeping and courtship sustained cooperation in the iterated prisoner's dilemma in noisy environments, especially when combined. However, when deceitful defection was possible at low cost, courtship often increased cooperation, whereas emotional bookkeeping decreased it.

Asunto(s)

Conducta Cooperativa , Cortejo , Animales , Emociones , Dilema del Prisionero , Simulación por Computador , Teoría del Juego

2.

Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents.

Köster, Raphael; Hadfield-Menell, Dylan; Everett, Richard; Weidinger, Laura; Hadfield, Gillian K; Leibo, Joel Z.

Proc Natl Acad Sci U S A ; 119(3)2022 01 18.

Artículo en Inglés | MEDLINE | ID: mdl-35022231

RESUMEN

How do societies learn and maintain social norms? Here we use multiagent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviors. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behavior is punished by other agents. The taboo helps overcome a credit assignment problem in discovering delayed health effects. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This "silly rule" counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. By probing what individual agents have learned, we demonstrate that normative behavior relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Our results highlight the benefit of employing a multiagent reinforcement learning computational model focused on learning to implement complex actions.

Asunto(s)

Aprendizaje , Refuerzo en Psicología , Normas Sociales , Ambiente , Humanos

3.

Dynamic diversity is the answer to proxy failure.

Kurth-Nelson, Zeb; Sullivan, Steve; Leibo, Joel Z; Guitart-Masip, Marc.

Behav Brain Sci ; 47: e77, 2024 May 13.

Artículo en Inglés | MEDLINE | ID: mdl-38738350

RESUMEN

We argue that a diverse and dynamic pool of agents mitigates proxy failure. Proxy modularity plays a key role in the ongoing production of diversity. We review examples from a range of scales.

Asunto(s)

Encéfalo , Humanos , Toma de Decisiones , Encéfalo/fisiología

4.

The emergence of division of labour through decentralized social sanctioning.

Yaman, Anil; Leibo, Joel Z; Iacca, Giovanni; Wan Lee, Sang.

Proc Biol Sci ; 290(2009): 20231716, 2023 10 25.

Artículo en Inglés | MEDLINE | ID: mdl-37876187

RESUMEN

Human ecological success relies on our characteristic ability to flexibly self-organize into cooperative social groups, the most successful of which employ substantial specialization and division of labour. Unlike most other animals, humans learn by trial and error during their lives what role to take on. However, when some critical roles are more attractive than others, and individuals are self-interested, then there is a social dilemma: each individual would prefer others take on the critical but unremunerative roles so they may remain free to take one that pays better. But disaster occurs if all act thus and a critical role goes unfilled. In such situations learning an optimum role distribution may not be possible. Consequently, a fundamental question is: how can division of labour emerge in groups of self-interested lifetime-learning individuals? Here, we show that by introducing a model of social norms, which we regard as emergent patterns of decentralized social sanctioning, it becomes possible for groups of self-interested individuals to learn a productive division of labour involving all critical roles. Such social norms work by redistributing rewards within the population to disincentivize antisocial roles while incentivizing prosocial roles that do not intrinsically pay as well as others.

Asunto(s)

Conducta Cooperativa , Conducta Social , Animales , Humanos , Aprendizaje , Recompensa

5.

Meta-control of social learning strategies.

Yaman, Anil; Bredeche, Nicolas; Çaylak, Onur; Leibo, Joel Z; Lee, Sang Wan.

PLoS Comput Biol ; 18(2): e1009882, 2022 02.

Artículo en Inglés | MEDLINE | ID: mdl-35226667

RESUMEN

Social learning, copying other's behavior without actual experience, offers a cost-effective means of knowledge acquisition. However, it raises the fundamental question of which individuals have reliable information: successful individuals versus the majority. The former and the latter are known respectively as success-based and conformist social learning strategies. We show here that while the success-based strategy fully exploits the benign environment of low uncertainly, it fails in uncertain environments. On the other hand, the conformist strategy can effectively mitigate this adverse effect. Based on these findings, we hypothesized that meta-control of individual and social learning strategies provides effective and sample-efficient learning in volatile and uncertain environments. Simulations on a set of environments with various levels of volatility and uncertainty confirmed our hypothesis. The results imply that meta-control of social learning affords agents the leverage to resolve environmental uncertainty with minimal exploration cost, by exploiting others' learning as an external knowledge base.

Asunto(s)

Aprendizaje Social , Humanos , Aprendizaje , Conducta Social , Incertidumbre

6.

Learning agents that acquire representations of social groups.

Leibo, Joel Z; Vezhnevets, Alexander Sasha; Eckstein, Maria K; Agapiou, John P; Duéñez-Guzmán, Edgar A.

Behav Brain Sci ; 45: e111, 2022 07 07.

Artículo en Inglés | MEDLINE | ID: mdl-35796369

RESUMEN

Humans are learning agents that acquire social group representations from experience. Here, we discuss how to construct artificial agents capable of this feat. One approach, based on deep reinforcement learning, allows the necessary representations to self-organize. This minimizes the need for hand-engineering, improving robustness and scalability. It also enables "virtual neuroscience" research on the learned representations.

Asunto(s)

Aprendizaje , Neurociencias , Humanos

7.

What is the simplest model that can account for high-fidelity imitation?

Leibo, Joel Z; Köster, Raphael; Vezhnevets, Alexander Sasha; Duénez-Guzmán, Edgar A; Agapiou, John P; Sunehag, Peter.

Behav Brain Sci ; 45: e261, 2022 11 10.

Artículo en Inglés | MEDLINE | ID: mdl-36353886

RESUMEN

What inductive biases must be incorporated into multi-agent artificial intelligence models to get them to capture high-fidelity imitation? We think very little is needed. In the right environments, both instrumental- and ritual-stance imitation can emerge from generic learning mechanisms operating on non-deliberative decision architectures. In this view, imitation emerges from trial-and-error learning and does not require explicit deliberation.

Asunto(s)

Inteligencia Artificial , Conducta Imitativa , Humanos , Aprendizaje

8.

Building machines that learn and think for themselves.

Botvinick, Matthew; Barrett, David G T; Battaglia, Peter; de Freitas, Nando; Kumaran, Darshan; Leibo, Joel Z; Lillicrap, Timothy; Modayil, Joseph; Mohamed, Shakir; Rabinowitz, Neil C; Rezende, Danilo J; Santoro, Adam; Schaul, Tom; Summerfield, Christopher; Wayne, Greg; Weber, Theophane; Wierstra, Daan; Legg, Shane; Hassabis, Demis.

Behav Brain Sci ; 40: e255, 2017 01.

Artículo en Inglés | MEDLINE | ID: mdl-29342685

RESUMEN

We agree with Lake and colleagues on their list of "key ingredients" for building human-like intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here, we survey several important examples of the progress that has been made toward building autonomous agents with human-like abilities, and highlight some outstanding challenges.

Asunto(s)

Aprendizaje , Pensamiento , Humanos , Solución de Problemas

9.

The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex.

Leibo, Joel Z; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso.

PLoS Comput Biol ; 11(10): e1004390, 2015 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-26496457

RESUMEN

Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects is only transferable to new objects that share properties with the old, then the recognition system's optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions.

Asunto(s)

Modelos Neurológicos , Red Nerviosa/fisiología , Reconocimiento Visual de Modelos/fisiología , Reconocimiento en Psicología/fisiología , Corteza Visual/fisiología , Vías Visuales/fisiología , Animales , Simulación por Computador , Humanos

10.

The dynamics of invariant object recognition in the human visual system.

Isik, Leyla; Meyers, Ethan M; Leibo, Joel Z; Poggio, Tomaso.

J Neurophysiol ; 111(1): 91-102, 2014 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-24089402

RESUMEN

The human visual system can rapidly recognize objects despite transformations that alter their appearance. The precise timing of when the brain computes neural representations that are invariant to particular transformations, however, has not been mapped in humans. Here we employ magnetoencephalography decoding analysis to measure the dynamics of size- and position-invariant visual information development in the ventral visual stream. With this method we can read out the identity of objects beginning as early as 60 ms. Size- and position-invariant visual information appear around 125 ms and 150 ms, respectively, and both develop in stages, with invariance to smaller transformations arising before invariance to larger transformations. Additionally, the magnetoencephalography sensor activity localizes to neural sources that are in the most posterior occipital regions at the early decoding times and then move temporally as invariant information develops. These results provide previously unknown latencies for key stages of human-invariant object recognition, as well as new and compelling evidence for a feed-forward hierarchical model of invariant object recognition where invariance increases at each successive visual area along the ventral stream.

Asunto(s)

Reconocimiento Visual de Modelos , Tiempo de Reacción , Corteza Visual/fisiología , Adolescente , Adulto , Potenciales Evocados Visuales , Femenino , Humanos , Masculino

11.

Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity.

Amirova, Aliya; Fteropoulli, Theodora; Ahmed, Nafiso; Cowie, Martin R; Leibo, Joel Z.

PLoS One ; 19(3): e0300024, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38470890

RESUMEN

Today, with the advent of Large-scale generative Language Models (LLMs) it is now possible to simulate free responses to interview questions such as those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative analysis methods in such a way as to generate insights that could generalize to real human populations. The key concept in our analysis is algorithmic fidelity, a validity concept capturing the degree to which LLM-generated outputs mirror human sub-populations' beliefs and attitudes. By definition, high algorithmic fidelity suggests that latent beliefs elicited from LLMs may generalize to real humans, whereas low algorithmic fidelity renders such research invalid. Here we used an LLM to generate interviews with "silicon participants" matching specific demographic characteristics one-for-one with a set of human participants. Using framework-based qualitative analysis, we showed the key themes obtained from both human and silicon participants were strikingly similar. However, when we analyzed the structure and tone of the interviews we found even more striking differences. We also found evidence of a hyper-accuracy distortion. We conclude that the LLM we tested (GPT-3.5) does not have sufficient algorithmic fidelity to expect in silico research on it to generalize to real human populations. However, rapid advances in artificial intelligence raise the possibility that algorithmic fidelity may improve in the future. Thus we stress the need to establish epistemic norms now around how to assess the validity of LLM-based qualitative research, especially concerning the need to ensure the representation of heterogeneous lived experiences.

Asunto(s)

Inteligencia Artificial , Comunicación , Lenguaje

12.

The Puzzle of Evaluating Moral Cognition in Artificial Agents.

Reinecke, Madeline G; Mao, Yiran; Kunesch, Markus; Duéñez-Guzmán, Edgar A; Haas, Julia; Leibo, Joel Z.

Cogn Sci ; 47(8): e13315, 2023 08.

Artículo en Inglés | MEDLINE | ID: mdl-37555649

RESUMEN

In developing artificial intelligence (AI), researchers often benchmark against human performance as a measure of progress. Is this kind of comparison possible for moral cognition? Given that human moral judgment often hinges on intangible properties like "intention" which may have no natural analog in artificial agents, it may prove difficult to design a "like-for-like" comparison between the moral behavior of artificial and human agents. What would a measure of moral behavior for both humans and AI look like? We unravel the complexity of this question by discussing examples within reinforcement learning and generative AI, and we examine how the puzzle of evaluating artificial agents' moral cognition remains open for further investigation within cognitive science.

Asunto(s)

Inteligencia Artificial , Cognición , Humanos , Principios Morales , Juicio , Aprendizaje

13.

Machine culture.

Brinkmann, Levin; Baumann, Fabian; Bonnefon, Jean-François; Derex, Maxime; Müller, Thomas F; Nussberger, Anne-Marie; Czaplicka, Agnieszka; Acerbi, Alberto; Griffiths, Thomas L; Henrich, Joseph; Leibo, Joel Z; McElreath, Richard; Oudeyer, Pierre-Yves; Stray, Jonathan; Rahwan, Iyad.

Nat Hum Behav ; 7(11): 1855-1868, 2023 Nov.

Artículo en Inglés | MEDLINE | ID: mdl-37985914

RESUMEN

The ability of humans to create and disseminate culture is often credited as the single most important factor of our success as a species. In this Perspective, we explore the notion of 'machine culture', culture mediated or generated by machines. We argue that intelligent machines simultaneously transform the cultural evolutionary processes of variation, transmission and selection. Recommender algorithms are altering social learning dynamics. Chatbots are forming a new mode of cultural transmission, serving as cultural models. Furthermore, intelligent machines are evolving as contributors in generating cultural traits-from game strategies and visual art to scientific results. We provide a conceptual framework for studying the present and anticipated future impact of machines on cultural evolution, and present a research agenda for the study of machine culture.

Asunto(s)

Evolución Cultural , Hominidae , Humanos , Animales , Cultura , Aprendizaje

14.

Rethink reporting of evaluation results in AI.

Burnell, Ryan; Schellaert, Wout; Burden, John; Ullman, Tomer D; Martinez-Plumed, Fernando; Tenenbaum, Joshua B; Rutar, Danaja; Cheke, Lucy G; Sohl-Dickstein, Jascha; Mitchell, Melanie; Kiela, Douwe; Shanahan, Murray; Voorhees, Ellen M; Cohn, Anthony G; Leibo, Joel Z; Hernandez-Orallo, Jose.

Science ; 380(6641): 136-138, 2023 04 14.

Artículo en Inglés | MEDLINE | ID: mdl-37053341

RESUMEN

Aggregate metrics and lack of access to results limit understanding.

15.

Importance of prefrontal meta control in human-like reinforcement learning.

Lee, Jee Hang; Leibo, Joel Z; An, Su Jin; Lee, Sang Wan.

Front Comput Neurosci ; 16: 1060101, 2022.

Artículo en Inglés | MEDLINE | ID: mdl-36618272

RESUMEN

Recent investigation on reinforcement learning (RL) has demonstrated considerable flexibility in dealing with various problems. However, such models often experience difficulty learning seemingly easy tasks for humans. To reconcile the discrepancy, our paper is focused on the computational benefits of the brain's RL. We examine the brain's ability to combine complementary learning strategies to resolve the trade-off between prediction performance, computational costs, and time constraints. The complex need for task performance created by a volatile and/or multi-agent environment motivates the brain to continually explore an ideal combination of multiple strategies, called meta-control. Understanding these functions would allow us to build human-aligned RL models.

16.

Promises and challenges of human computational ethology.

Mobbs, Dean; Wise, Toby; Suthana, Nanthia; Guzmán, Noah; Kriegeskorte, Nikolaus; Leibo, Joel Z.

Neuron ; 109(14): 2224-2238, 2021 07 21.

Artículo en Inglés | MEDLINE | ID: mdl-34143951

RESUMEN

The movements an organism makes provide insights into its internal states and motives. This principle is the foundation of the new field of computational ethology, which links rich automatic measurements of natural behaviors to motivational states and neural activity. Computational ethology has proven transformative for animal behavioral neuroscience. This success raises the question of whether rich automatic measurements of behavior can similarly drive progress in human neuroscience and psychology. New technologies for capturing and analyzing complex behaviors in real and virtual environments enable us to probe the human brain during naturalistic dynamic interactions with the environment that so far were beyond experimental investigation. Inspired by nonhuman computational ethology, we explore how these new tools can be used to test important questions in human neuroscience. We argue that application of this methodology will help human neuroscience and psychology extend limited behavioral measurements such as reaction time and accuracy, permit novel insights into how the human brain produces behavior, and ultimately reduce the growing measurement gap between human and animal neuroscience.

Asunto(s)

Encéfalo , Cognición , Etología/métodos , Neurociencias/métodos , Humanos

17.

Toward high-performance, memory-efficient, and fast reinforcement learning-Lessons from decision neuroscience.

Lee, Jee Hang; Seymour, Ben; Leibo, Joel Z; An, Su Jin; Lee, Sang Wan.

Sci Robot ; 4(26)2019 01 16.

Artículo en Inglés | MEDLINE | ID: mdl-33137757

RESUMEN

Recent insights from decision neuroscience raise hope for the development of intelligent brain-inspired solutions to robot learning in real dynamic environments full of noise and unpredictability.

18.

Human-level performance in 3D multiplayer games with population-based reinforcement learning.

Jaderberg, Max; Czarnecki, Wojciech M; Dunning, Iain; Marris, Luke; Lever, Guy; Castañeda, Antonio Garcia; Beattie, Charles; Rabinowitz, Neil C; Morcos, Ari S; Ruderman, Avraham; Sonnerat, Nicolas; Green, Tim; Deason, Louise; Leibo, Joel Z; Silver, David; Hassabis, Demis; Kavukcuoglu, Koray; Graepel, Thore.

Science ; 364(6443): 859-865, 2019 May 31.

Artículo en Inglés | MEDLINE | ID: mdl-31147514

RESUMEN

Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input. We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.

Asunto(s)

Aprendizaje Automático , Refuerzo en Psicología , Juegos de Video , Recompensa

19.

Symmetric Decomposition of Asymmetric Games.

Tuyls, Karl; Pérolat, Julien; Lanctot, Marc; Ostrovski, Georg; Savani, Rahul; Leibo, Joel Z; Ord, Toby; Graepel, Thore; Legg, Shane.

Sci Rep ; 8(1): 1015, 2018 01 17.

Artículo en Inglés | MEDLINE | ID: mdl-29343692

RESUMEN

We introduce new theoretical insights into two-population asymmetric games allowing for an elegant symmetric decomposition into two single population symmetric games. Specifically, we show how an asymmetric bimatrix game (A,B) can be decomposed into its symmetric counterparts by envisioning and investigating the payoff tables (A and B) that constitute the asymmetric game, as two independent, single population, symmetric games. We reveal several surprising formal relationships between an asymmetric two-population game and its symmetric single population counterparts, which facilitate a convenient analysis of the original asymmetric game due to the dimensionality reduction of the decomposition. The main finding reveals that if (x,y) is a Nash equilibrium of an asymmetric game (A,B), this implies that y is a Nash equilibrium of the symmetric counterpart game determined by payoff table A, and x is a Nash equilibrium of the symmetric counterpart game determined by payoff table B. Also the reverse holds and combinations of Nash equilibria of the counterpart games form Nash equilibria of the asymmetric game. We illustrate how these formal relationships aid in identifying and analysing the Nash structure of asymmetric games, by examining the evolutionary dynamics of the simpler counterpart games in several canonical examples.

Asunto(s)

Juegos Experimentales , Modelos Estadísticos , Femenino , Teoría del Juego , Humanos , Masculino

20.

Prefrontal cortex as a meta-reinforcement learning system.

Wang, Jane X; Kurth-Nelson, Zeb; Kumaran, Dharshan; Tirumala, Dhruva; Soyer, Hubert; Leibo, Joel Z; Hassabis, Demis; Botvinick, Matthew.

Nat Neurosci ; 21(6): 860-868, 2018 06.

Artículo en Inglés | MEDLINE | ID: mdl-29760527

RESUMEN

Over the past 20 years, neuroscience research on reward-based learning has converged on a canonical model, under which the neurotransmitter dopamine 'stamps in' associations between situations, actions and rewards by modulating the strength of synaptic connections between neurons. However, a growing number of recent findings have placed this standard model under strain. We now draw on recent advances in artificial intelligence to introduce a new theory of reward-based learning. Here, the dopamine system trains another part of the brain, the prefrontal cortex, to operate as its own free-standing learning system. This new perspective accommodates the findings that motivated the standard model, but also deals gracefully with a wider range of observations, providing a fresh foundation for future research.

Asunto(s)

Aprendizaje/fisiología , Corteza Prefrontal/fisiología , Refuerzo en Psicología , Algoritmos , Animales , Inteligencia Artificial , Simulación por Computador , Dopamina/fisiología , Humanos , Modelos Neurológicos , Optogenética , Recompensa

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA