Search | VHL Regional Portal

1.

Dynamic diversity is the answer to proxy failure.

Kurth-Nelson, Zeb; Sullivan, Steve; Leibo, Joel Z; Guitart-Masip, Marc.

Behav Brain Sci ; 47: e77, 2024 May 13.

Article in English | MEDLINE | ID: mdl-38738350

ABSTRACT

We argue that a diverse and dynamic pool of agents mitigates proxy failure. Proxy modularity plays a key role in the ongoing production of diversity. We review examples from a range of scales.

Subject(s)

Brain , Humans , Decision Making , Brain/physiology

2.

Framework-based qualitative analysis of free responses of Large Language Models: Algorithmic fidelity.

Amirova, Aliya; Fteropoulli, Theodora; Ahmed, Nafiso; Cowie, Martin R; Leibo, Joel Z.

PLoS One ; 19(3): e0300024, 2024.

Article in English | MEDLINE | ID: mdl-38470890

ABSTRACT

Today, with the advent of Large-scale generative Language Models (LLMs) it is now possible to simulate free responses to interview questions such as those traditionally analyzed using qualitative research methods. Qualitative methodology encompasses a broad family of techniques involving manual analysis of open-ended interviews or conversations conducted freely in natural language. Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative analysis methods in such a way as to generate insights that could generalize to real human populations. The key concept in our analysis is algorithmic fidelity, a validity concept capturing the degree to which LLM-generated outputs mirror human sub-populations' beliefs and attitudes. By definition, high algorithmic fidelity suggests that latent beliefs elicited from LLMs may generalize to real humans, whereas low algorithmic fidelity renders such research invalid. Here we used an LLM to generate interviews with "silicon participants" matching specific demographic characteristics one-for-one with a set of human participants. Using framework-based qualitative analysis, we showed the key themes obtained from both human and silicon participants were strikingly similar. However, when we analyzed the structure and tone of the interviews we found even more striking differences. We also found evidence of a hyper-accuracy distortion. We conclude that the LLM we tested (GPT-3.5) does not have sufficient algorithmic fidelity to expect in silico research on it to generalize to real human populations. However, rapid advances in artificial intelligence raise the possibility that algorithmic fidelity may improve in the future. Thus we stress the need to establish epistemic norms now around how to assess the validity of LLM-based qualitative research, especially concerning the need to ensure the representation of heterogeneous lived experiences.

Subject(s)

Artificial Intelligence , Communication , Language

3.

Machine culture.

Brinkmann, Levin; Baumann, Fabian; Bonnefon, Jean-François; Derex, Maxime; Müller, Thomas F; Nussberger, Anne-Marie; Czaplicka, Agnieszka; Acerbi, Alberto; Griffiths, Thomas L; Henrich, Joseph; Leibo, Joel Z; McElreath, Richard; Oudeyer, Pierre-Yves; Stray, Jonathan; Rahwan, Iyad.

Nat Hum Behav ; 7(11): 1855-1868, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37985914

ABSTRACT

The ability of humans to create and disseminate culture is often credited as the single most important factor of our success as a species. In this Perspective, we explore the notion of 'machine culture', culture mediated or generated by machines. We argue that intelligent machines simultaneously transform the cultural evolutionary processes of variation, transmission and selection. Recommender algorithms are altering social learning dynamics. Chatbots are forming a new mode of cultural transmission, serving as cultural models. Furthermore, intelligent machines are evolving as contributors in generating cultural traits-from game strategies and visual art to scientific results. We provide a conceptual framework for studying the present and anticipated future impact of machines on cultural evolution, and present a research agenda for the study of machine culture.

Subject(s)

Cultural Evolution , Hominidae , Humans , Animals , Culture , Learning

4.

Emotions and courtship help bonded pairs cooperate, but emotional agents are vulnerable to deceit.

Sadedin, Suzanne; Duéñez-Guzmán, Edgar A; Leibo, Joel Z.

Proc Natl Acad Sci U S A ; 120(46): e2308911120, 2023 Nov 14.

Article in English | MEDLINE | ID: mdl-37948585

ABSTRACT

Coordinated pair bonds are common in birds and also occur in many other taxa. How do animals solve the social dilemmas they face in coordinating with a partner? We developed an evolutionary model to explore this question, based on observations that a) neuroendocrine feedback provides emotional bookkeeping which is thought to play a key role in vertebrate social bonds and b) these bonds are developed and maintained via courtship interactions that include low-stakes social dilemmas. Using agent-based simulation, we found that emotional bookkeeping and courtship sustained cooperation in the iterated prisoner's dilemma in noisy environments, especially when combined. However, when deceitful defection was possible at low cost, courtship often increased cooperation, whereas emotional bookkeeping decreased it.

Subject(s)

Cooperative Behavior , Courtship , Animals , Emotions , Prisoner Dilemma , Computer Simulation , Game Theory

5.

The emergence of division of labour through decentralized social sanctioning.

Yaman, Anil; Leibo, Joel Z; Iacca, Giovanni; Wan Lee, Sang.

Proc Biol Sci ; 290(2009): 20231716, 2023 10 25.

Article in English | MEDLINE | ID: mdl-37876187

ABSTRACT

Human ecological success relies on our characteristic ability to flexibly self-organize into cooperative social groups, the most successful of which employ substantial specialization and division of labour. Unlike most other animals, humans learn by trial and error during their lives what role to take on. However, when some critical roles are more attractive than others, and individuals are self-interested, then there is a social dilemma: each individual would prefer others take on the critical but unremunerative roles so they may remain free to take one that pays better. But disaster occurs if all act thus and a critical role goes unfilled. In such situations learning an optimum role distribution may not be possible. Consequently, a fundamental question is: how can division of labour emerge in groups of self-interested lifetime-learning individuals? Here, we show that by introducing a model of social norms, which we regard as emergent patterns of decentralized social sanctioning, it becomes possible for groups of self-interested individuals to learn a productive division of labour involving all critical roles. Such social norms work by redistributing rewards within the population to disincentivize antisocial roles while incentivizing prosocial roles that do not intrinsically pay as well as others.

Subject(s)

Cooperative Behavior , Social Behavior , Animals , Humans , Learning , Reward

6.

The Puzzle of Evaluating Moral Cognition in Artificial Agents.

Reinecke, Madeline G; Mao, Yiran; Kunesch, Markus; Duéñez-Guzmán, Edgar A; Haas, Julia; Leibo, Joel Z.

Cogn Sci ; 47(8): e13315, 2023 08.

Article in English | MEDLINE | ID: mdl-37555649

ABSTRACT

In developing artificial intelligence (AI), researchers often benchmark against human performance as a measure of progress. Is this kind of comparison possible for moral cognition? Given that human moral judgment often hinges on intangible properties like "intention" which may have no natural analog in artificial agents, it may prove difficult to design a "like-for-like" comparison between the moral behavior of artificial and human agents. What would a measure of moral behavior for both humans and AI look like? We unravel the complexity of this question by discussing examples within reinforcement learning and generative AI, and we examine how the puzzle of evaluating artificial agents' moral cognition remains open for further investigation within cognitive science.

Subject(s)

Artificial Intelligence , Cognition , Humans , Morals , Judgment , Learning

7.

Rethink reporting of evaluation results in AI.

Burnell, Ryan; Schellaert, Wout; Burden, John; Ullman, Tomer D; Martinez-Plumed, Fernando; Tenenbaum, Joshua B; Rutar, Danaja; Cheke, Lucy G; Sohl-Dickstein, Jascha; Mitchell, Melanie; Kiela, Douwe; Shanahan, Murray; Voorhees, Ellen M; Cohn, Anthony G; Leibo, Joel Z; Hernandez-Orallo, Jose.

Science ; 380(6641): 136-138, 2023 04 14.

Article in English | MEDLINE | ID: mdl-37053341

ABSTRACT

Aggregate metrics and lack of access to results limit understanding.

8.

What is the simplest model that can account for high-fidelity imitation?

Leibo, Joel Z; Köster, Raphael; Vezhnevets, Alexander Sasha; Duénez-Guzmán, Edgar A; Agapiou, John P; Sunehag, Peter.

Behav Brain Sci ; 45: e261, 2022 11 10.

Article in English | MEDLINE | ID: mdl-36353886

ABSTRACT

What inductive biases must be incorporated into multi-agent artificial intelligence models to get them to capture high-fidelity imitation? We think very little is needed. In the right environments, both instrumental- and ritual-stance imitation can emerge from generic learning mechanisms operating on non-deliberative decision architectures. In this view, imitation emerges from trial-and-error learning and does not require explicit deliberation.

Subject(s)

Artificial Intelligence , Imitative Behavior , Humans , Learning

9.

Learning agents that acquire representations of social groups.

Leibo, Joel Z; Vezhnevets, Alexander Sasha; Eckstein, Maria K; Agapiou, John P; Duéñez-Guzmán, Edgar A.

Behav Brain Sci ; 45: e111, 2022 07 07.

Article in English | MEDLINE | ID: mdl-35796369

ABSTRACT

Humans are learning agents that acquire social group representations from experience. Here, we discuss how to construct artificial agents capable of this feat. One approach, based on deep reinforcement learning, allows the necessary representations to self-organize. This minimizes the need for hand-engineering, improving robustness and scalability. It also enables "virtual neuroscience" research on the learned representations.

Subject(s)

Learning , Neurosciences , Humans

10.

Meta-control of social learning strategies.

Yaman, Anil; Bredeche, Nicolas; Çaylak, Onur; Leibo, Joel Z; Lee, Sang Wan.

PLoS Comput Biol ; 18(2): e1009882, 2022 02.

Article in English | MEDLINE | ID: mdl-35226667

ABSTRACT

Social learning, copying other's behavior without actual experience, offers a cost-effective means of knowledge acquisition. However, it raises the fundamental question of which individuals have reliable information: successful individuals versus the majority. The former and the latter are known respectively as success-based and conformist social learning strategies. We show here that while the success-based strategy fully exploits the benign environment of low uncertainly, it fails in uncertain environments. On the other hand, the conformist strategy can effectively mitigate this adverse effect. Based on these findings, we hypothesized that meta-control of individual and social learning strategies provides effective and sample-efficient learning in volatile and uncertain environments. Simulations on a set of environments with various levels of volatility and uncertainty confirmed our hypothesis. The results imply that meta-control of social learning affords agents the leverage to resolve environmental uncertainty with minimal exploration cost, by exploiting others' learning as an external knowledge base.

Subject(s)

Social Learning , Humans , Learning , Social Behavior , Uncertainty

11.

Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents.

Köster, Raphael; Hadfield-Menell, Dylan; Everett, Richard; Weidinger, Laura; Hadfield, Gillian K; Leibo, Joel Z.

Proc Natl Acad Sci U S A ; 119(3)2022 01 18.

Article in English | MEDLINE | ID: mdl-35022231

ABSTRACT

How do societies learn and maintain social norms? Here we use multiagent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviors. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behavior is punished by other agents. The taboo helps overcome a credit assignment problem in discovering delayed health effects. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This "silly rule" counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. By probing what individual agents have learned, we demonstrate that normative behavior relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Our results highlight the benefit of employing a multiagent reinforcement learning computational model focused on learning to implement complex actions.

Subject(s)

Learning , Reinforcement, Psychology , Social Norms , Environment , Humans

12.

Importance of prefrontal meta control in human-like reinforcement learning.

Lee, Jee Hang; Leibo, Joel Z; An, Su Jin; Lee, Sang Wan.

Front Comput Neurosci ; 16: 1060101, 2022.

Article in English | MEDLINE | ID: mdl-36618272

ABSTRACT

Recent investigation on reinforcement learning (RL) has demonstrated considerable flexibility in dealing with various problems. However, such models often experience difficulty learning seemingly easy tasks for humans. To reconcile the discrepancy, our paper is focused on the computational benefits of the brain's RL. We examine the brain's ability to combine complementary learning strategies to resolve the trade-off between prediction performance, computational costs, and time constraints. The complex need for task performance created by a volatile and/or multi-agent environment motivates the brain to continually explore an ideal combination of multiple strategies, called meta-control. Understanding these functions would allow us to build human-aligned RL models.

13.

Promises and challenges of human computational ethology.

Mobbs, Dean; Wise, Toby; Suthana, Nanthia; Guzmán, Noah; Kriegeskorte, Nikolaus; Leibo, Joel Z.

Neuron ; 109(14): 2224-2238, 2021 07 21.

Article in English | MEDLINE | ID: mdl-34143951

ABSTRACT

The movements an organism makes provide insights into its internal states and motives. This principle is the foundation of the new field of computational ethology, which links rich automatic measurements of natural behaviors to motivational states and neural activity. Computational ethology has proven transformative for animal behavioral neuroscience. This success raises the question of whether rich automatic measurements of behavior can similarly drive progress in human neuroscience and psychology. New technologies for capturing and analyzing complex behaviors in real and virtual environments enable us to probe the human brain during naturalistic dynamic interactions with the environment that so far were beyond experimental investigation. Inspired by nonhuman computational ethology, we explore how these new tools can be used to test important questions in human neuroscience. We argue that application of this methodology will help human neuroscience and psychology extend limited behavioral measurements such as reaction time and accuracy, permit novel insights into how the human brain produces behavior, and ultimately reduce the growing measurement gap between human and animal neuroscience.

Subject(s)

Brain , Cognition , Ethology/methods , Neurosciences/methods , Humans

14.

Human-level performance in 3D multiplayer games with population-based reinforcement learning.

Jaderberg, Max; Czarnecki, Wojciech M; Dunning, Iain; Marris, Luke; Lever, Guy; Castañeda, Antonio Garcia; Beattie, Charles; Rabinowitz, Neil C; Morcos, Ari S; Ruderman, Avraham; Sonnerat, Nicolas; Green, Tim; Deason, Louise; Leibo, Joel Z; Silver, David; Hassabis, Demis; Kavukcuoglu, Koray; Graepel, Thore.

Science ; 364(6443): 859-865, 2019 May 31.

Article in English | MEDLINE | ID: mdl-31147514

ABSTRACT

Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input. We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.

Subject(s)

Machine Learning , Reinforcement, Psychology , Video Games , Reward

15.

Toward high-performance, memory-efficient, and fast reinforcement learning-Lessons from decision neuroscience.

Lee, Jee Hang; Seymour, Ben; Leibo, Joel Z; An, Su Jin; Lee, Sang Wan.

Sci Robot ; 4(26)2019 01 16.

Article in English | MEDLINE | ID: mdl-33137757

ABSTRACT

Recent insights from decision neuroscience raise hope for the development of intelligent brain-inspired solutions to robot learning in real dynamic environments full of noise and unpredictability.

16.

Prefrontal cortex as a meta-reinforcement learning system.

Wang, Jane X; Kurth-Nelson, Zeb; Kumaran, Dharshan; Tirumala, Dhruva; Soyer, Hubert; Leibo, Joel Z; Hassabis, Demis; Botvinick, Matthew.

Nat Neurosci ; 21(6): 860-868, 2018 06.

Article in English | MEDLINE | ID: mdl-29760527

ABSTRACT

Over the past 20 years, neuroscience research on reward-based learning has converged on a canonical model, under which the neurotransmitter dopamine 'stamps in' associations between situations, actions and rewards by modulating the strength of synaptic connections between neurons. However, a growing number of recent findings have placed this standard model under strain. We now draw on recent advances in artificial intelligence to introduce a new theory of reward-based learning. Here, the dopamine system trains another part of the brain, the prefrontal cortex, to operate as its own free-standing learning system. This new perspective accommodates the findings that motivated the standard model, but also deals gracefully with a wider range of observations, providing a fresh foundation for future research.

Subject(s)

Learning/physiology , Prefrontal Cortex/physiology , Reinforcement, Psychology , Algorithms , Animals , Artificial Intelligence , Computer Simulation , Dopamine/physiology , Humans , Models, Neurological , Optogenetics , Reward

17.

Symmetric Decomposition of Asymmetric Games.

Tuyls, Karl; Pérolat, Julien; Lanctot, Marc; Ostrovski, Georg; Savani, Rahul; Leibo, Joel Z; Ord, Toby; Graepel, Thore; Legg, Shane.

Sci Rep ; 8(1): 1015, 2018 01 17.

Article in English | MEDLINE | ID: mdl-29343692

ABSTRACT

We introduce new theoretical insights into two-population asymmetric games allowing for an elegant symmetric decomposition into two single population symmetric games. Specifically, we show how an asymmetric bimatrix game (A,B) can be decomposed into its symmetric counterparts by envisioning and investigating the payoff tables (A and B) that constitute the asymmetric game, as two independent, single population, symmetric games. We reveal several surprising formal relationships between an asymmetric two-population game and its symmetric single population counterparts, which facilitate a convenient analysis of the original asymmetric game due to the dimensionality reduction of the decomposition. The main finding reveals that if (x,y) is a Nash equilibrium of an asymmetric game (A,B), this implies that y is a Nash equilibrium of the symmetric counterpart game determined by payoff table A, and x is a Nash equilibrium of the symmetric counterpart game determined by payoff table B. Also the reverse holds and combinations of Nash equilibria of the counterpart games form Nash equilibria of the asymmetric game. We illustrate how these formal relationships aid in identifying and analysing the Nash structure of asymmetric games, by examining the evolutionary dynamics of the simpler counterpart games in several canonical examples.

Subject(s)

Games, Experimental , Models, Statistical , Female , Game Theory , Humans , Male

18.

Building machines that learn and think for themselves.

Botvinick, Matthew; Barrett, David G T; Battaglia, Peter; de Freitas, Nando; Kumaran, Darshan; Leibo, Joel Z; Lillicrap, Timothy; Modayil, Joseph; Mohamed, Shakir; Rabinowitz, Neil C; Rezende, Danilo J; Santoro, Adam; Schaul, Tom; Summerfield, Christopher; Wayne, Greg; Weber, Theophane; Wierstra, Daan; Legg, Shane; Hassabis, Demis.

Behav Brain Sci ; 40: e255, 2017 01.

Article in English | MEDLINE | ID: mdl-29342685

ABSTRACT

We agree with Lake and colleagues on their list of "key ingredients" for building human-like intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here, we survey several important examples of the progress that has been made toward building autonomous agents with human-like abilities, and highlight some outstanding challenges.

Subject(s)

Learning , Thinking , Humans , Problem Solving

19.

View-Tolerant Face Recognition and Hebbian Learning Imply Mirror-Symmetric Neural Tuning to Head Orientation.

Leibo, Joel Z; Liao, Qianli; Anselmi, Fabio; Freiwald, Winrich A; Poggio, Tomaso.

Curr Biol ; 27(1): 62-67, 2017 Jan 09.

Article in English | MEDLINE | ID: mdl-27916522

ABSTRACT

The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and robust against identity-preserving transformations, like depth rotations [1, 2]. Current computational models of object recognition, including recent deep-learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cells operations [3-6]. Here, we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules generate approximate invariance to identity-preserving transformations at the top level of the processing hierarchy. However, all past models tested failed to reproduce the most salient property of an intermediate representation of a three-level face-processing hierarchy in the brain: mirror-symmetric tuning to head orientation [7]. Here, we demonstrate that one specific biologically plausible Hebb-type learning rule generates mirror-symmetric tuning to bilaterally symmetric stimuli, like faces, at intermediate levels of the architecture and show why it does so. Thus, the tuning properties of individual cells inside the visual stream appear to result from group properties of the stimuli they encode and to reflect the learning rules that sculpted the information-processing system within which they reside.

Subject(s)

Brain/physiology , Facial Recognition/physiology , Head Movements/physiology , Learning/physiology , Macaca/physiology , Models, Neurological , Animals , Orientation , Orientation, Spatial , Pattern Recognition, Visual , Photic Stimulation/methods , Visual Cortex/physiology

20.

The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex.

Leibo, Joel Z; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso.

PLoS Comput Biol ; 11(10): e1004390, 2015 Oct.

Article in English | MEDLINE | ID: mdl-26496457

ABSTRACT

Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects is only transferable to new objects that share properties with the old, then the recognition system's optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions.

Subject(s)

Models, Neurological , Nerve Net/physiology , Pattern Recognition, Visual/physiology , Recognition, Psychology/physiology , Visual Cortex/physiology , Visual Pathways/physiology , Animals , Computer Simulation , Humans

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL