1.
Proc Natl Acad Sci U S A ; 121(24): e2318124121, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38830100

ABSTRACT

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs; this is insufficient for making an informed decision about which LLMs are best to use in an interactive setting, and how that varies by setting. Static assessment therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analyzing MathConverse, we derive a taxonomy of human query behaviors and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, among other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by experienced mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations, may constitute better assistants. Humans should inspect LLM output carefully given their current shortcomings and potential for surprising fallibility.


Subjects
Language, Mathematics, Problem Solving, Humans, Problem Solving/physiology, Students/psychology
2.
Int J Comput Vis ; 132(2): 555-580, 2024.
Article in English | MEDLINE | ID: mdl-38303742

ABSTRACT

We propose a method for constructing generative models of 3D objects from a single 3D mesh and improving them through unsupervised low-shot learning from 2D images. Our method produces a 3D morphable model that represents shape and albedo in terms of Gaussian processes. Whereas previous approaches have typically built 3D morphable models from multiple high-quality 3D scans through principal component analysis, we build 3D morphable models from a single scan or template. As we demonstrate in the face domain, these models can be used to infer 3D reconstructions from 2D data (inverse graphics) or 3D data (registration). Specifically, we show that our approach can be used to perform face recognition using only a single 3D template (one scan total, not one per person). We extend our model to a preliminary unsupervised learning framework that enables the learning of the distribution of 3D faces using one 3D template and a small number of 2D images. Our approach is motivated as a potential model for the origins of face perception in human infants, who appear to start with an innate face template and subsequently develop a flexible system for perceiving the 3D structure of any novel face from experience with only 2D images of a relatively small number of familiar faces.

3.
Trends Cogn Sci ; 28(7): 628-642, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38616478

ABSTRACT

Humans often pursue idiosyncratic goals that appear remote from functional ends, including information gain. We suggest that this is valuable because goals (even prima facie foolish or unachievable ones) contain structured information that scaffolds thinking and planning. By evaluating hypotheses and plans with respect to their goals, humans can discover new ideas that go beyond prior knowledge and observable evidence. These hypotheses and plans can be transmitted independently of their original motivations, adapted across generations, and serve as an engine of cultural evolution. Here, we review recent empirical and computational research underlying goal generation and planning and discuss the ways that the flexibility of our motivational system supports cognitive gains for both individuals and societies.


Subjects
Cognition, Goals, Humans, Cognition/physiology, Motivation, Thinking/physiology
4.
Cognition ; 250: 105790, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38908304

ABSTRACT

Rules help guide our behavior, particularly in complex social contexts. But rules sometimes give us the "wrong" answer. How do we know when it is okay to break the rules? In this paper, we argue that we sometimes use contractualist (agreement-based) mechanisms to determine when a rule can be broken. Our model draws on a theory of social interactions, "virtual bargaining", that assumes that actors engage in a simulated bargaining process when navigating the social world. We present experimental data which suggest that rule-breaking decisions are sometimes driven by virtual bargaining and show that these data cannot be explained by more traditional rule-based or outcome-based approaches.


Subjects
Judgment, Morals, Humans, Judgment/physiology, Adult, Female, Male, Social Interaction, Young Adult, Decision Making/physiology, Negotiating
5.
Trends Cogn Sci ; 28(6): 517-540, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38508911

ABSTRACT

Large language models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (understanding and using language in the world). We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of separate mechanisms specialized for formal versus functional linguistic competence.


Subjects
Language, Humans, Thinking/physiology, Linguistics
6.
Nat Commun ; 15(1): 6847, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39127796

ABSTRACT

Throughout their lives, humans seem to learn a variety of rules for things like applying category labels, following procedures, and explaining causal relationships. These rules are often algorithmically rich but are nonetheless acquired with minimal data and computation. Symbolic models based on program learning successfully explain rule-learning in many domains, but performance degrades quickly as program complexity increases. It remains unclear how to scale symbolic rule-learning methods to model human performance in challenging domains. Here we show that symbolic search over the space of metaprograms (programs that revise programs) dramatically improves learning efficiency. On a behavioral benchmark of 100 algorithmically rich rules, this approach fits human learning more accurately than alternative models while also using orders of magnitude less search. The computation required to match median human performance is consistent with conservative estimates of human thinking time. Our results suggest that metaprogram-like representations may help human learners to efficiently acquire rules.


Subjects
Algorithms, Learning, Humans, Learning/physiology
7.
ArXiv ; 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38259351

ABSTRACT

Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted in roughly equal numbers in each of the conceptions and motivated to overcome what might be a false dichotomy between them and engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.

8.
Nat Hum Behav ; 8(6): 1035-1043, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38907029

ABSTRACT

Board, card or video games have been played by virtually every individual in the world. Games are popular because they are intuitive and fun. These distinctive qualities of games also make them ideal for studying the mind. By being intuitive, games provide a unique vantage point for understanding the inductive biases that support behaviour in more complex, ecological settings than traditional laboratory experiments. By being fun, games allow researchers to study new questions in cognition such as the meaning of 'play' and intrinsic motivation, while also supporting more extensive and diverse data collection by attracting many more participants. We describe the advantages and drawbacks of using games relative to standard laboratory-based experiments and lay out a set of recommendations on how to gain the most from using games to study cognition. We hope this Perspective will lead to a wider use of games as experimental paradigms, elevating the ecological validity, scale and robustness of research on the mind.


Subjects
Cognition, Video Games, Humans, Video Games/psychology, Experimental Games, Motivation
9.
Nat Comput Sci ; 1(10): 678-685, 2021 Oct.
Article in English | MEDLINE | ID: mdl-38217198

ABSTRACT

How do pedestrians choose their paths within city street networks? Researchers have tried to shed light on this matter through strictly controlled experiments, but an ultimate answer based on real-world mobility data is still lacking. Here, we analyze salient features of human path planning through a statistical analysis of a massive dataset of GPS traces, which reveals that (1) people increasingly deviate from the shortest path when the distance between origin and destination increases and (2) chosen paths are statistically different when origin and destination are swapped. We posit that direction to goal is a main driver of path planning and develop a vector-based navigation model; the resulting trajectories, which we have termed pointiest paths, are a statistically better predictor of human paths than a model based on minimizing distance with stochastic effects. Our findings generalize across two major US cities with different street networks, hinting that vector-based navigation might be a universal property of human path planning.
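
As a rough illustration of the idea that direction to goal can drive route choice, the sketch below greedily picks, at each intersection, the street segment whose heading deviates least from the straight line to the destination. This is an assumption made for illustration only, not the model from the paper: the function names, the greedy rule, and the four-node toy network are all hypothetical. It does show how a direction-based rule can give different routes when origin and destination are swapped.

```python
import math


def angle_between(u, v):
    """Unsigned angle in radians between two non-zero 2D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def greedy_direction_path(coords, edges, origin, destination):
    """Hypothetical greedy rule: always take the unvisited neighbor whose
    heading is closest to the direction of the destination.
    Returns the list of visited nodes, or None on a dead end."""
    path = [origin]
    visited = {origin}
    current = origin
    while current != destination:
        cx, cy = coords[current]
        gx, gy = coords[destination]
        to_goal = (gx - cx, gy - cy)
        candidates = [n for n in edges[current] if n not in visited]
        if not candidates:
            return None
        current = min(
            candidates,
            key=lambda n: angle_between(
                (coords[n][0] - cx, coords[n][1] - cy), to_goal
            ),
        )
        visited.add(current)
        path.append(current)
    return path


# Toy network (made up): two alternative routes between A and D.
coords = {"A": (0, 0), "B": (1, 1), "C": (3, -1), "D": (4, 0)}
edges = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

forward = greedy_direction_path(coords, edges, "A", "D")
backward = greedy_direction_path(coords, edges, "D", "A")
print(forward)   # route chosen from A to D
print(backward)  # route chosen from D to A
```

In this toy network the two routes have equal length, so a pure shortest-path rule is indifferent between them, whereas the direction-based rule goes through C on the way out and through B on the way back.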
