Results 1 - 20 of 33,397
1.
Cell ; 187(6): 1316-1326, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38490173

ABSTRACT

Understanding sex-related variation in health and illness requires rigorous and precise approaches to revealing underlying mechanisms. A first step is to recognize that sex is not in and of itself a causal mechanism; rather, it is a classification system comprising a set of categories, usually assigned according to a range of varying traits. Moving beyond sex as a system of classification to working with concrete and measurable sex-related variables is necessary for precision. Whether and how these sex-related variables matter, and what patterns of difference they contribute to, will vary in context-specific ways. Second, when researchers incorporate these sex-related variables into research designs, rigorous analytical methods are needed to allow strongly supported conclusions. Third, the interpretation and reporting of sex-related variation require care to ensure that basic and preclinical research advance health equity for all.


Subjects
Biomedical Research, Health Equity, Sex, Humans
2.
Cell ; 181(4): 749-753, 2020 May 14.
Article in English | MEDLINE | ID: mdl-32413294

ABSTRACT

In 1991, Buck and Axel published a landmark study in Cell for work that was awarded the 2004 Nobel Prize. The identification of the olfactory receptors as the largest family of GPCRs catapulted olfaction into mainstream neurobiology. This BenchMark revisits Buck's experimental innovation and its surprising success at the time.


Subjects
Odorant Receptors/metabolism, Smell/physiology, Awards and Prizes, 20th Century History, Humans, Neurobiology, Nobel Prize, Olfactory Receptor Neurons, G-Protein-Coupled Receptors/metabolism
3.
Cell ; 175(6): 1665-1678.e18, 2018 Nov 29.
Article in English | MEDLINE | ID: mdl-30343896

ABSTRACT

Low-grade gliomas almost invariably progress into secondary glioblastoma (sGBM) with limited therapeutic option and poorly understood mechanism. By studying the mutational landscape of 188 sGBMs, we find significant enrichment of TP53 mutations, somatic hypermutation, MET-exon-14-skipping (METex14), PTPRZ1-MET (ZM) fusions, and MET amplification. Strikingly, METex14 frequently co-occurs with ZM fusion and is present in ∼14% of cases with significantly worse prognosis. Subsequent studies show that METex14 promotes glioma progression by prolonging MET activity. Furthermore, we describe a MET kinase inhibitor, PLB-1001, that demonstrates remarkable potency in selectively inhibiting MET-altered tumor cells in preclinical models. Importantly, this compound also shows blood-brain barrier permeability and is subsequently applied in a phase I clinical trial that enrolls MET-altered chemo-resistant glioma patients. Encouragingly, PLB-1001 achieves partial response in at least two advanced sGBM patients with rarely significant side effects, underscoring the clinical potential for precisely treating gliomas using this therapy.


Subjects
Brain Neoplasms, Exons, Glioblastoma, Mutation, Protein Kinase Inhibitors, Proto-Oncogene Proteins c-met, Animals, Blood-Brain Barrier/metabolism, Blood-Brain Barrier/pathology, Brain Neoplasms/drug therapy, Brain Neoplasms/genetics, Brain Neoplasms/metabolism, Brain Neoplasms/pathology, Drug Delivery Systems, Female, Glioblastoma/drug therapy, Glioblastoma/genetics, Glioblastoma/metabolism, Humans, Male, Mice, Inbred BALB C Mice, Nude Mice, Protein Kinase Inhibitors/pharmacokinetics, Protein Kinase Inhibitors/pharmacology, Proto-Oncogene Proteins c-met/antagonists & inhibitors, Proto-Oncogene Proteins c-met/genetics, Proto-Oncogene Proteins c-met/metabolism, Sprague-Dawley Rats, Tumor Suppressor Protein p53/genetics, Tumor Suppressor Protein p53/metabolism, Xenograft Model Antitumor Assays
4.
CA Cancer J Clin ; 73(5): 461-479, 2023.
Article in English | MEDLINE | ID: mdl-37329257

ABSTRACT

There remains a need to synthesize linkages between social determinants of health (SDOH) and cancer screening to reduce persistent inequities contributing to the US cancer burden. The authors conducted a systematic review of US-based breast, cervical, colorectal, and lung cancer screening intervention studies to summarize how SDOH have been considered in interventions and relationships between SDOH and screening. Five databases were searched for peer-reviewed research articles published in English between 2010 and 2021. The Covidence software platform was used to screen articles and extract data using a standardized template. Data items included study and intervention characteristics, SDOH intervention components and measures, and screening outcomes. The findings were summarized using descriptive statistics and narratives. The review included 144 studies among diverse population groups. SDOH interventions increased screening rates overall by a median of 8.4 percentage points (interquartile interval, 1.8-18.8 percentage points). The objective of most interventions was to increase community demand (90.3%) and access (84.0%) to screening. SDOH interventions related to health care access and quality were most prevalent (227 unique intervention components). Other SDOH, including educational, social/community, environmental, and economic factors, were less common (90, 52, 21, and zero intervention components, respectively). Studies that included analyses of health policy, access to care, and lower costs yielded the largest proportions of favorable associations with screening outcomes. SDOH were predominantly measured at the individual level. This review describes how SDOH have been considered in the design and evaluation of cancer screening interventions and effect sizes for SDOH interventions. Findings may guide future intervention and implementation research aiming to reduce US screening inequities.


Subjects
Lung Neoplasms, Social Determinants of Health, Humans, Early Detection of Cancer, Health Status Disparities, Educational Status
5.
CA Cancer J Clin ; 72(3): 266-286, 2022 May.
Article in English | MEDLINE | ID: mdl-34797562

ABSTRACT

Smoking cessation reduces the risk of death, improves recovery, and reduces the risk of hospital readmission. Evidence and policy support hospital admission as an ideal time to deliver smoking-cessation interventions. However, this is not well implemented in practice. In this systematic review, the authors summarize the literature on smoking-cessation implementation strategies and evaluate their success to guide the implementation of best-practice smoking interventions into hospital settings. The CINAHL Complete, Embase, MEDLINE Complete, and PsycInfo databases were searched using terms associated with the following topics: smoking cessation, hospitals, and implementation. In total, 14,287 original records were identified and screened, resulting in 63 eligible articles from 56 studies. Data were extracted on the study characteristics, implementation strategies, and implementation outcomes. Implementation outcomes were guided by Proctor and colleagues' framework and included acceptability, adoption, appropriateness, cost, feasibility, fidelity, penetration, and sustainability. The findings demonstrate that studies predominantly focused on the training of staff to achieve implementation. Brief implementation approaches using a small number of implementation strategies were less successful and poorly sustained compared with well resourced and multicomponent approaches. Although brief implementation approaches may be viewed as advantageous because they are less resource-intensive, their capacity to change practice in a sustained way lacks evidence. Attempts to change clinician behavior or introduce new models of care are challenging in a short time frame, and implementation efforts should be designed for long-term success. There is a need to embrace strategic, well planned implementation approaches to embed smoking-cessation interventions into hospitals and to reap and sustain the benefits for people who smoke.


Subjects
Smoking Cessation, Hospitals, Humans, Smoking Cessation/methods
6.
Trends Biochem Sci ; 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37953092

ABSTRACT

Science is a collaborative endeavor, and the importance of collaborations across disciplines and boundaries is becoming clearer with the advent of new technologies. This article focuses on key aspects of initiating and sustaining new collaborations, and expanding from bilateral to multilateral efforts to create major impact through team science.

7.
Annu Rev Genomics Hum Genet ; 25(1): 369-395, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38608642

ABSTRACT

The ethical standards for the responsible conduct of human research have come a long way; however, concerns surrounding equity remain in human genetics and genomics research. Addressing these concerns will help society realize the full potential of human genomics research. One outstanding concern is the fair and equitable sharing of benefits from research on human participants. Several international bodies have recognized that benefit-sharing can be an effective tool for ethical research conduct, but international laws, including the Convention on Biological Diversity and its Nagoya Protocol on Access and Benefit-Sharing, explicitly exclude human genetic and genomic resources. These agreements face significant challenges that must be considered and anticipated if similar principles are applied in human genomics research. We propose that benefit-sharing from human genomics research can be a bottom-up effort and embedded into the existing research process. We propose the development of a "benefit-sharing by design" framework to address concerns of fairness and equity in the use of human genomic resources and samples and to learn from the aspirations and decade of implementation of the Nagoya Protocol.


Subjects
Genomics, Humans, Genomics/ethics, Genomics/methods, Human Genome, Genetic Research/ethics, Genetic Research/legislation & jurisprudence
8.
Am J Hum Genet ; 111(3): 433-444, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38307026

ABSTRACT

We use the implementation science framework RE-AIM (reach, effectiveness, adoption, implementation, and maintenance) to describe outcomes of In Our DNA SC, a population-wide genomic screening (PWGS) program. In Our DNA SC involves participation through clinical appointments, community events, or at home collection. Participants provide a saliva sample that is sequenced by Helix, and those with a pathogenic variant or likely pathogenic variant for CDC Tier 1 conditions are offered free genetic counseling. We assessed key outcomes among the first cohort of individuals recruited. Over 14 months, 20,478 participants enrolled, and 14,053 samples were collected. The majority selected at-home sample collection followed by clinical sample collection and collection at community events. Participants were predominately female, White (self-identified), non-Hispanic, and between the ages of 40-49. Participants enrolled through community events were the most racially diverse and the youngest. Half of those enrolled completed the program. We identified 137 individuals with pathogenic or likely pathogenic variants for CDC Tier 1 conditions. The majority (77.4%) agreed to genetic counseling, and of those that agreed, 80.2% completed counseling. Twelve clinics participated, and we conducted 108 collection events. Participants enrolled at home were most likely to return their sample for sequencing. Through this evaluation, we identified facilitators and barriers to implementation of our state-wide PWGS program. Standardized reporting using implementation science frameworks can help generalize strategies and improve the impact of PWGS.


Subjects
Genetic Counseling, Implementation Science, Humans, Female, Adult, Middle Aged, Genomics
9.
Development ; 151(20), 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-39369308

ABSTRACT

Humans are curious to understand the causes of traits that distinguish us from other animals and that distinguish vastly different species from one another. We also have a proclivity for simple stories and sometimes tend toward seeking and accepting simple genetic explanations for large evolutionary shifts, even to a single gene. Here, I reveal how a biased expectation of mechanistic simplicity threads through the long history of evolutionary and developmental genetics. I argue, however, that expecting a simple mechanism threatens a deeper understanding of evolution, and I define the limitations for interpreting experimental evidence in evolutionary developmental genetics.


Subjects
Biological Evolution, Animals, Humans, Developmental Biology, Molecular Evolution, Genetic Models
10.
Trends Immunol ; 45(7): 483-485, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38862366

ABSTRACT

Despite prevalent diversity and inclusion programs in STEM, gender biases and stereotypes persist across educational and professional settings. Recognizing this enduring bias is crucial for achieving transformative change on gender equity and can help orient policy toward more effective strategies to address ongoing disparities.


Subjects
Sexism, Humans, Female, Male, Stereotyping, Science, Engineering, Mathematics
11.
Proc Natl Acad Sci U S A ; 121(19): e2301436121, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38687798

ABSTRACT

Amid the discourse on foreign influence investigations in research, this study examines the impact of NIH-initiated investigations starting in 2018 on U.S. scientists' productivity, focusing on those collaborating with Chinese peers. Using publication data from 2010 to 2021, we analyze over 113,000 scientists and find that investigations coincide with reduced productivity for those with China collaborations compared to those with other international collaborators, especially when accounting for publication impact. The decline is particularly pronounced in fields that received greater preinvestigation NIH funding and engaged more in U.S.-China collaborations. Indications of scientist migration and broader scientific progress implications also emerge. We also offer insights into the underlying mechanisms via qualitative interviews.


Subjects
National Institutes of Health (U.S.), China, United States, Humans, International Cooperation, Research Personnel/statistics & numerical data, Biomedical Research
12.
Proc Natl Acad Sci U S A ; 121(17): e2307213121, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38621134

ABSTRACT

In the past three decades, there has been a rise in young academy movements in the Global North and South. Such movements, in at least Germany and the Netherlands, have been shown to be quite effective in connecting scientific work with society. Likewise, these movements share a common goal of developing interdisciplinary collaboration among young scientists, which contributes to the growth of a nation's (but also global) scientific endeavors. This paper focuses on the young academy movement in the fourth-largest country hosting the biggest Muslim population in the world, which is also the third-most populous democracy: Indonesia. We observe that there has been rising awareness among the young generation of scientists in Indonesia of the need to advocate for the use of sciences in responding to upcoming and current multidimensional crises. Science advocacy can be seen in their peer-based identification of Indonesia's future challenges, encompassing the fundamental areas for scientific inquiry, discovery, and intervention. We focus on the Indonesian Young Academy of Sciences (ALMI) and its network of young scientists. We describe ALMI's science communication practice, specifically SAINS45 and Science for Indonesia's Biodiversity, and how they have been useful for policymakers, media, and school engagements. The article closes with a reflection on future directions for the young academy movement in Indonesia and beyond.


Subjects
Islam, Indonesia, Germany, Netherlands
13.
Proc Natl Acad Sci U S A ; 121(27): e2311888121, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38913887

ABSTRACT

The prediction of protein 3D structure from amino acid sequence is a computational grand challenge in biophysics and plays a key role in robust protein structure prediction algorithms, from drug discovery to genome interpretation. The advent of AI models, such as AlphaFold, is revolutionizing applications that depend on robust protein structure prediction algorithms. To maximize the impact, and ease the usability, of these AI tools we introduce APACE, AlphaFold2 and advanced computing as a service, a computational framework that effectively handles this AI model and its TB-size database to conduct accelerated protein structure prediction analyses in modern supercomputing environments. We deployed APACE in the Delta and Polaris supercomputers and quantified its performance for accurate protein structure predictions using four exemplar proteins: 6AWO, 6OAN, 7MEZ, and 6D6U. Using up to 300 ensembles, distributed across 200 NVIDIA A100 GPUs, we found that APACE is up to two orders of magnitude faster than off-the-shelf AlphaFold2 implementations, reducing time-to-solution from weeks to minutes. This computational approach may be readily linked with robotics laboratories to automate and accelerate scientific discovery.


Subjects
Algorithms, Biophysics, Proteins, Proteins/chemistry, Biophysics/methods, Protein Conformation, Software, Computational Biology/methods, Molecular Models
14.
Proc Natl Acad Sci U S A ; 121(38): e2404035121, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39236231

ABSTRACT

We discuss a relatively new meta-scientific research design: many-analyst studies that attempt to assess the replicability and credibility of research based on large-scale observational data. In these studies, a large number of analysts try to answer the same research question using the same data. The key idea is the greater the variation in results, the greater the uncertainty in answering the research question and, accordingly, the lower the credibility of any individual research finding. Compared to individual replications, the large crowd of analysts allows for a more systematic investigation of uncertainty and its sources. However, many-analyst studies are also resource-intensive, and there are some doubts about their potential to provide credible assessments. We identify three issues that any many-analyst study must address: 1) identifying the source of variation in the results; 2) providing an incentive structure similar to that of standard research; and 3) conducting a proper meta-analysis of the results. We argue that some recent many-analyst studies have failed to address these issues satisfactorily and have therefore provided an overly pessimistic assessment of the credibility of science. We also provide some concrete guidance on how future many-analyst studies could provide a more constructive assessment.

15.
Proc Natl Acad Sci U S A ; 121(19): e2209196121, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38640256

ABSTRACT

Increasing the speed of scientific progress is urgently needed to address the many challenges associated with the biosphere in the Anthropocene. Consequently, the critical question becomes: How can science most rapidly progress to address large, complex global problems? We suggest that the lag in the development of a more predictive science of the biosphere is not only because the biosphere is so much more complex, or because we do not have enough data, or are not doing enough experiments, but, in large part, because of unresolved tension between the three dominant scientific cultures that pervade the research community. We introduce and explain the concept of the three scientific cultures and present a novel analysis of their characteristics, supported by examples and a formal mathematical definition/representation of what this means and implies. The three cultures operate, to varying degrees, across all of science. However, within the biosciences, and in contrast to some of the other sciences, they remain relatively more separated, and their lack of integration has hindered their potential power and insight. Our solution to accelerating a broader, predictive science of the biosphere is to enhance integration of scientific cultures. The process of integration, Scientific Transculturalism, recognizes that the push for interdisciplinary research, in general, is just not enough. Unless these cultures of science are formally appreciated and their thinking iteratively integrated into scientific discovery and advancement, there will continue to be numerous significant challenges that will increasingly limit forecasting and prediction efforts.


Subjects
Forecasting, Mathematics
16.
Proc Natl Acad Sci U S A ; 121(35): e2404328121, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39163339

ABSTRACT

How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research Librarian, Research Ethicist, Data Generator, and Novel Data Predictor, using psychological science as a testing field. In Study 1 (Research Librarian), unlike human researchers, GPT-3.5 and GPT-4 hallucinated, authoritatively generating fictional references 36.0% and 5.4% of the time, respectively, although GPT-4 exhibited an evolving capacity to acknowledge its fictions. In Study 2 (Research Ethicist), GPT-4 (though not GPT-3.5) proved capable of detecting violations like p-hacking in fictional research protocols, correcting 88.6% of blatantly presented issues, and 72.6% of subtly presented issues. In Study 3 (Data Generator), both models consistently replicated patterns of cultural bias previously discovered in large language corpora, indicating that ChatGPT can simulate known results, an antecedent to usefulness for both data generation and skills like hypothesis generation. Contrastingly, in Study 4 (Novel Data Predictor), neither model was successful at predicting new results absent in their training data, and neither appeared to leverage substantially new information when predicting more vs. less novel outcomes. Together, these results suggest that GPT is a flawed but rapidly improving librarian, a decent research ethicist already, capable of data generation in simple domains with known characteristics but poor at predicting novel patterns of empirical data to aid future experimentation.


Subjects
Librarians, Humans, Ethicists, Research Personnel, Research Ethics
17.
Proc Natl Acad Sci U S A ; 121(38): e2320177121, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39269775

ABSTRACT

One of the longstanding aims of network neuroscience is to link a connectome's topological properties (i.e., features defined from connectivity alone) with an organism's neurobiology. One approach for doing so is to compare connectome properties with annotational maps. This type of analysis is popular at the meso-/macroscale, but is less common at the nano-scale, owing to a paucity of neuron-level connectome data. However, recent methodological advances have made possible the reconstruction of whole-brain connectomes at single-neuron resolution for a select set of organisms. These include the fruit fly, Drosophila melanogaster, and its developing larvae. In addition to fine-scale descriptions of connectivity, these datasets are accompanied by rich annotations. Here, we use a variant of the stochastic blockmodel to detect multilevel communities in the larval Drosophila connectome. We find that communities partition neurons based on function and cell type and that most interact assortatively, reflecting the principle of functional segregation. However, a small number of communities interact nonassortatively, forming a "rich-club" of interneurons that receive sensory/ascending inputs and deliver outputs along descending pathways. Next, we investigate the role of community structure in shaping communication patterns. We find that polysynaptic signaling follows specific trajectories across modular hierarchies, with interneurons playing a key role in mediating communication routes between modules and hierarchical scales. Our work suggests a relationship between system-level architecture and the biological function and classification of individual neurons. We envision our study as an important step toward bridging the gap between complex systems and neurobiological lines of investigation in brain sciences.


Subjects
Brain, Connectome, Drosophila melanogaster, Larva, Animals, Connectome/methods, Brain/physiology, Brain/growth & development, Nerve Net/physiology, Neurons/physiology, Neurons/metabolism, Interneurons/physiology, Interneurons/metabolism
18.
Proc Natl Acad Sci U S A ; 121(12): e2320232121, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38478684

ABSTRACT

The chemisorption energy of reactants on a catalyst surface, [Formula: see text], is among the most informative characteristics of understanding and pinpointing the optimal catalyst. The intrinsic complexity of catalyst surfaces and chemisorption reactions presents significant difficulties in identifying the pivotal physical quantities determining [Formula: see text]. In response to this, the study proposes a methodology, the feature deletion experiment, based on Automatic Machine Learning (AutoML) for knowledge extraction from a high-throughput density functional theory (DFT) database. The study reveals that, for binary alloy surfaces, the local adsorption site geometric information is the primary physical quantity determining [Formula: see text], compared to the electronic and physicochemical properties of the catalyst alloys. By integrating the feature deletion experiment with instance-wise variable selection (INVASE), a neural network-based explainable AI (XAI) tool, we established the best-performing feature set containing 21 intrinsic, non-DFT computed properties, achieving an MAE of 0.23 eV across a periodic table-wide chemical space involving more than 1,600 types of alloy surfaces and 8,400 chemisorption reactions. This study demonstrates the stability, consistency, and potential of AutoML-based feature deletion experiment in developing concise, predictive, and theoretically meaningful models for complex chemical problems with minimal human intervention.

19.
Proc Natl Acad Sci U S A ; 121(41): e2402802121, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39356667

ABSTRACT

Scientific datasets play a crucial role in contemporary data-driven research, as they allow for the progress of science by facilitating the discovery of new patterns and phenomena. This mounting demand for empirical research raises important questions on how strategic data utilization in research projects can stimulate scientific advancement. In this study, we examine the hypothesis inspired by the recombination theory, which suggests that innovative combinations of existing knowledge, including the use of unusual combinations of datasets, can lead to high-impact discoveries. Focusing on social science, we investigate the scientific outcomes of such atypical data combinations in more than 30,000 publications that leverage over 5,000 datasets curated within one of the largest social science databases, Interuniversity Consortium for Political and Social Research. This study offers four important insights. First, combining datasets, particularly those infrequently paired, significantly contributes to both scientific and broader impacts (e.g., dissemination to the general public). Second, infrequently paired datasets maintain a strong association with citation even after controlling for the atypicality of dataset topics. In contrast, the atypicality of dataset topics has a much smaller positive impact on citation counts. Third, smaller and less experienced research teams tend to use atypical combinations of datasets in research more frequently than their larger and more experienced counterparts. Last, despite the benefits of data combination, papers that amalgamate data remain infrequent. This finding suggests that the unconventional combination of datasets is an underutilized but powerful strategy correlated with the scientific impact and broader dissemination of scientific discoveries.

20.
Proc Natl Acad Sci U S A ; 121(41): e2322420121, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39365822

ABSTRACT

The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that to develop a holistic understanding of these systems, we must consider the problem that they were trained to solve: next-word prediction over Internet text. By recognizing the pressures that this task exerts, we can make predictions about the strategies that LLMs will adopt, allowing us to reason about when they will succeed or fail. Using this approach, which we call the teleological approach, we identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. To test our predictions, we evaluate five LLMs (GPT-3.5, GPT-4, Claude 3, Llama 3, and Gemini 1.0) on 11 tasks, and we find robust evidence that LLMs are influenced by probability in the hypothesized ways. Many of the experiments reveal surprising failure modes. For instance, GPT-4's accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter. These results show that AI practitioners should be careful about using LLMs in low-probability situations. More broadly, we conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system, one that has been shaped by its own particular set of pressures.


Subjects
Language, Humans, Theoretical Models