Search | BVS Infectious and Parasitic Diseases

Results 1 - 20 of 126

Evaluating language models for mathematics through interactions.

Collins, Katherine M; Jiang, Albert Q; Frieder, Simon; Wong, Lionel; Zilka, Miri; Bhatt, Umang; Lukasiewicz, Thomas; Wu, Yuhuai; Tenenbaum, Joshua B; Hart, William; Gowers, Timothy; Li, Wenda; Weller, Adrian; Jamnik, Mateja.

Proc Natl Acad Sci U S A ; 121(24): e2318124121, 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38830100

ABSTRACT

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs; this is insufficient for making an informed decision about which LLMs are best to use in an interactive setting, and how that varies by setting. Static assessment therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analyzing MathConverse, we derive a taxonomy of human query behaviors and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, among other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by experienced mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations, may constitute better assistants. Humans should inspect LLM output carefully given their current shortcomings and potential for surprising fallibility.

Subjects

Language, Mathematics, Problem Solving, Humans, Problem Solving/physiology, Students/psychology

Machine behaviour.

Rahwan, Iyad; Cebrian, Manuel; Obradovich, Nick; Bongard, Josh; Bonnefon, Jean-François; Breazeal, Cynthia; Crandall, Jacob W; Christakis, Nicholas A; Couzin, Iain D; Jackson, Matthew O; Jennings, Nicholas R; Kamar, Ece; Kloumann, Isabel M; Larochelle, Hugo; Lazer, David; McElreath, Richard; Mislove, Alan; Parkes, David C; Pentland, Alex 'Sandy'; Roberts, Margaret E; Shariff, Azim; Tenenbaum, Joshua B; Wellman, Michael.

Nature ; 568(7753): 477-486, 2019 04.

Article in English | MEDLINE | ID: mdl-31019318

ABSTRACT

Machines powered by artificial intelligence increasingly mediate our social, cultural, economic and political interactions. Understanding the behaviour of artificial intelligence systems is essential to our ability to control their actions, reap their benefits and minimize their harms. Here we argue that this necessitates a broad scientific research agenda to study machine behaviour that incorporates and expands upon the discipline of computer science and includes insights from across the sciences. We first outline a set of questions that are fundamental to this emerging field and then explore the technical, legal and institutional constraints on the study of machine behaviour.

Subjects

Artificial Intelligence, Artificial Intelligence/legislation & jurisprudence, Artificial Intelligence/trends, Humans, Motivation, Robotics

Bayes and Darwin: How replicator populations implement Bayesian computations.

Czégel, Dániel; Giaffar, Hamza; Tenenbaum, Joshua B; Szathmáry, Eörs.

Bioessays ; 44(4): e2100255, 2022 04.

Article in English | MEDLINE | ID: mdl-35212408

ABSTRACT

Bayesian learning theory and evolutionary theory both formalize adaptive competition dynamics in possibly high-dimensional, varying, and noisy environments. What do they have in common and how do they differ? In this paper, we discuss structural and dynamical analogies and their limits, both at a computational and an algorithmic-mechanical level. We point out mathematical equivalences between their basic dynamical equations, generalizing the isomorphism between Bayesian update and replicator dynamics. We discuss how these mechanisms provide analogous answers to the challenge of adapting to stochastically changing environments at multiple timescales. We elucidate an algorithmic equivalence between a sampling approximation, particle filters, and the Wright-Fisher model of population genetics. These equivalences suggest that the frequency distribution of types in replicator populations optimally encodes regularities of a stochastic environment to predict future environments, without invoking the known mechanisms of multilevel selection and evolvability. A unified view of the theories of learning and evolution comes in sight.

Subjects

Biological Evolution, Population Genetics, Bayes Theorem, Learning

The neural architecture of language: Integrative modeling converges on predictive processing.

Schrimpf, Martin; Blank, Idan Asher; Tuckute, Greta; Kauf, Carina; Hosseini, Eghbal A; Kanwisher, Nancy; Tenenbaum, Joshua B; Fedorenko, Evelina.

Proc Natl Acad Sci U S A ; 118(45), 2021 11 09.

Article in English | MEDLINE | ID: mdl-34737231

ABSTRACT

The neuroscience of perception has recently been revolutionized with an integrative modeling approach in which computation, brain function, and behavior are linked across many datasets and many computational models. By revealing trends across models, this approach yields novel insights into cognitive and neural mechanisms in the target domain. We here present a systematic study taking this approach to higher-level cognition: human language processing, our species' signature cognitive skill. We find that the most powerful "transformer" models predict nearly 100% of explainable variance in neural responses to sentences and generalize across different datasets and imaging modalities (functional MRI and electrocorticography). Models' neural fits ("brain score") and fits to behavioral responses are both strongly correlated with model accuracy on the next-word prediction task (but not other language tasks). Model architecture appears to substantially contribute to neural fit. These results provide computationally explicit evidence that predictive processing fundamentally shapes the language comprehension mechanisms in the human brain.

Subjects

Brain/physiology, Language, Neurological Models, Computer Neural Networks, Humans

Probabilistic programming versus meta-learning as models of cognition.

Ong, Desmond C; Zhi-Xuan, Tan; Tenenbaum, Joshua B; Goodman, Noah D.

Behav Brain Sci ; 47: e158, 2024 Sep 23.

Article in English | MEDLINE | ID: mdl-39311521

ABSTRACT

We summarize the recent progress made by probabilistic programming as a unifying formalism for the probabilistic, symbolic, and data-driven aspects of human cognition. We highlight differences with meta-learning in flexibility, statistical assumptions and inferences about cognition. We suggest that the meta-learning approach could be further strengthened by considering Connectionist and Bayesian approaches, rather than exclusively one or the other.

Subjects

Bayes Theorem, Cognition, Learning, Humans, Cognition/physiology, Learning/physiology, Psychological Models

Emotion prediction as computation over a generative theory of mind.

Houlihan, Sean Dae; Kleiman-Weiner, Max; Hewitt, Luke B; Tenenbaum, Joshua B; Saxe, Rebecca.

Philos Trans A Math Phys Eng Sci ; 381(2251): 20220047, 2023 Jul 24.

Article in English | MEDLINE | ID: mdl-37271174

ABSTRACT

From sparse descriptions of events, observers can make systematic and nuanced predictions of what emotions the people involved will experience. We propose a formal model of emotion prediction in the context of a public high-stakes social dilemma. This model uses inverse planning to infer a person's beliefs and preferences, including social preferences for equity and for maintaining a good reputation. The model then combines these inferred mental contents with the event to compute 'appraisals': whether the situation conformed to the expectations and fulfilled the preferences. We learn functions mapping computed appraisals to emotion labels, allowing the model to match human observers' quantitative predictions of 20 emotions, including joy, relief, guilt and envy. Model comparison indicates that inferred monetary preferences are not sufficient to explain observers' emotion predictions; inferred social preferences are factored into predictions for nearly every emotion. Human observers and the model both use minimal individualizing information to adjust predictions of how different people will respond to the same event. Thus, our framework integrates inverse planning, event appraisals and emotion concepts in a single computational model to reverse-engineer people's intuitive theory of emotions. This article is part of a discussion meeting issue 'Cognitive artificial intelligence'.

Subjects

Theory of Mind, Humans, Artificial Intelligence, Emotions

DreamCoder: growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning.

Ellis, Kevin; Wong, Lionel; Nye, Maxwell; Sablé-Meyer, Mathias; Cary, Luc; Anaya Pozo, Lore; Hewitt, Luke; Solar-Lezama, Armando; Tenenbaum, Joshua B.

Philos Trans A Math Phys Eng Sci ; 381(2251): 20220050, 2023 Jul 24.

Article in English | MEDLINE | ID: mdl-37271169

ABSTRACT

Expert problem-solving is driven by powerful languages for thinking about problems and their solutions. Acquiring expertise means learning these languages: systems of concepts, alongside the skills to use them. We present DreamCoder, a system that learns to solve problems by writing programs. It builds expertise by creating domain-specific programming languages for expressing domain concepts, together with neural networks to guide the search for programs within these languages. A 'wake-sleep' learning algorithm alternately extends the language with new symbolic abstractions and trains the neural network on imagined and replayed problems. DreamCoder solves both classic inductive programming tasks and creative tasks such as drawing pictures and building scenes. It rediscovers the basics of modern functional programming, vector algebra and classical physics, including Newton's and Coulomb's laws. Concepts are built compositionally from those learned earlier, yielding multilayered symbolic representations that are interpretable and transferable to new tasks, while still growing scalably and flexibly with experience. This article is part of a discussion meeting issue 'Cognitive artificial intelligence'.

Rapid trial-and-error learning with simulation supports flexible tool use and physical reasoning.

Allen, Kelsey R; Smith, Kevin A; Tenenbaum, Joshua B.

Proc Natl Acad Sci U S A ; 117(47): 29302-29310, 2020 11 24.

Article in English | MEDLINE | ID: mdl-33229515

ABSTRACT

Many animals, and an increasing number of artificial agents, display sophisticated capabilities to perceive and manipulate objects. But human beings remain distinctive in their capacity for flexible, creative tool use: using objects in new ways to act on the world, achieve a goal, or solve a problem. To study this type of general physical problem solving, we introduce the Virtual Tools game. In this game, people solve a large range of challenging physical puzzles in just a handful of attempts. We propose that the flexibility of human physical problem solving rests on an ability to imagine the effects of hypothesized actions, while the efficiency of human search arises from rich action priors which are updated via observations of the world. We instantiate these components in the "sample, simulate, update" (SSUP) model and show that it captures human performance across 30 levels of the Virtual Tools game. More broadly, this model provides a mechanism for explaining how people condense general physical knowledge into actionable, task-specific plans to achieve flexible and efficient physical problem solving.

Subjects

Psychological Models, Problem Solving/physiology, Tool Use Behavior/physiology, Cognition/physiology, Computer Simulation, Deep Learning, Experimental Games, Humans, Imagination/physiology, Knowledge

Predicting responsibility judgments from dispositional inferences and causal attributions.

Langenhoff, Antonia F; Wiegmann, Alex; Halpern, Joseph Y; Tenenbaum, Joshua B; Gerstenberg, Tobias.

Cogn Psychol ; 129: 101412, 2021 09.

Article in English | MEDLINE | ID: mdl-34303092

ABSTRACT

The question of how people hold others responsible has motivated decades of theorizing and empirical work. In this paper, we develop and test a computational model that bridges the gap between broad but qualitative framework theories, and quantitative but narrow models. In our model, responsibility judgments are the result of two cognitive processes: a dispositional inference about a person's character from their action, and a causal attribution about the person's role in bringing about the outcome. We test the model in a group setting in which political committee members vote on whether or not a policy should be passed. We assessed participants' dispositional inferences and causal attributions by asking how surprising and important a committee member's vote was. Participants' answers to these questions in Experiment 1 accurately predicted responsibility judgments in Experiment 2. In Experiments 3 and 4, we show that the model also predicts moral responsibility judgments, and that importance matters more for responsibility, while surprise matters more for judgments of wrongfulness.

Subjects

Judgment, Social Perception, Causality, Humans, Social Behavior

Modeling human intuitions about liquid flow with particle-based simulation.

Bates, Christopher J; Yildirim, Ilker; Tenenbaum, Joshua B; Battaglia, Peter.

PLoS Comput Biol ; 15(7): e1007210, 2019 07.

Article in English | MEDLINE | ID: mdl-31329579

ABSTRACT

Humans can easily describe, imagine, and, crucially, predict a wide variety of behaviors of liquids (splashing, squirting, gushing, sloshing, soaking, dripping, draining, trickling, pooling, and pouring) despite tremendous variability in their material and dynamical properties. Here we propose and test a computational model of how people perceive and predict these liquid dynamics, based on coarse approximate simulations of fluids as collections of interacting particles. Our model is analogous to a "game engine in the head", drawing on techniques for interactive simulations (as in video games) that optimize for efficiency and natural appearance rather than physical accuracy. In two behavioral experiments, we found that the model accurately captured people's predictions about how liquids flow among complex solid obstacles, and was significantly better than several alternatives based on simple heuristics and deep neural networks. Our model was also able to explain how people's predictions varied as a function of the liquids' properties (e.g., viscosity and stickiness). Together, the model and empirical results extend the recent proposal that human physical scene understanding for the dynamics of rigid, solid objects can be supported by approximate probabilistic simulation, to the more complex and unexplored domain of fluid dynamics.

Subjects

Hydrodynamics, Intuition, Computational Biology, Computer Simulation, Heuristics, Humans, Judgment, Psychological Models, Statistical Models, Computer Neural Networks, Physical Phenomena
