ABSTRACT
High-resolution electron microscopy of nervous systems has enabled the reconstruction of synaptic connectomes. However, we do not know the synaptic sign for each connection (i.e., whether a connection is excitatory or inhibitory), which is implied by the released transmitter. We demonstrate that artificial neural networks can predict transmitter types for presynapses from electron micrographs: a network trained to predict six transmitters (acetylcholine, glutamate, GABA, serotonin, dopamine, octopamine) achieves an accuracy of 87% for individual synapses, 94% for neurons, and 91% for known cell types across a D. melanogaster whole brain. We visualize the ultrastructural features used for prediction, discovering subtle but significant differences between transmitter phenotypes. We also analyze transmitter distributions across the brain and find that neurons that develop together largely express only one fast-acting transmitter (acetylcholine, glutamate, or GABA). We hope that our publicly available predictions act as an accelerant for neuroscientific hypothesis generation for the fly.
Subjects
Drosophila melanogaster, Electron Microscopy, Neurotransmitter Agents, Synapses, Animals, Brain/ultrastructure, Brain/metabolism, Connectome, Drosophila melanogaster/ultrastructure, Drosophila melanogaster/metabolism, gamma-Aminobutyric Acid/metabolism, Electron Microscopy/methods, Computer Neural Networks, Neurons/metabolism, Neurons/ultrastructure, Neurotransmitter Agents/metabolism, Synapses/ultrastructure, Synapses/metabolism
ABSTRACT
We envision "AI scientists" as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate AI models and biomedical tools with experimental platforms. Rather than taking humans out of the discovery process, biomedical AI agents combine human creativity and expertise with AI's ability to analyze large datasets, navigate hypothesis spaces, and execute repetitive tasks. AI agents are poised to be proficient in various tasks, planning discovery workflows and performing self-assessment to identify and mitigate gaps in their knowledge. These agents use large language models and generative models to feature structured memory for continual learning and use machine learning tools to incorporate scientific knowledge, biological principles, and theories. AI agents can impact areas ranging from virtual cell simulation, programmable control of phenotypes, and the design of cellular circuits to developing new therapies.
Subjects
Artificial Intelligence, Biomedical Research, Humans, Machine Learning
ABSTRACT
Spatial molecular profiling of complex tissues is essential to investigate cellular function in physiological and pathological states. However, methods for molecular analysis of large biological specimens imaged in 3D are lacking. Here, we present DISCO-MS, a technology that combines whole-organ/whole-organism clearing and imaging, deep-learning-based image analysis, robotic tissue extraction, and ultra-high-sensitivity mass spectrometry. DISCO-MS yielded proteome data indistinguishable from uncleared samples in both rodent and human tissues. We used DISCO-MS to investigate microglia activation along axonal tracts after brain injury and characterized early- and late-stage individual amyloid-beta plaques in a mouse model of Alzheimer's disease. DISCO-bot robotic sample extraction enabled us to study the regional heterogeneity of immune cells in intact mouse bodies and aortic plaques in a complete human heart. DISCO-MS enables unbiased proteome analysis of preclinical and clinical tissues after unbiased imaging of entire specimens in 3D, identifying diagnostic and therapeutic opportunities for complex diseases.
Subjects
Alzheimer Disease, Proteome, Mice, Humans, Animals, Proteome/analysis, Proteomics/methods, Alzheimer Disease/pathology, Amyloid beta-Peptides, Mass Spectrometry, Amyloid Plaque
ABSTRACT
Many COVID-19 patients infected by SARS-CoV-2 virus develop pneumonia (called novel coronavirus pneumonia, NCP) and rapidly progress to respiratory failure. However, rapid diagnosis and identification of high-risk patients for early intervention are challenging. Using a large computed tomography (CT) database from 3,777 patients, we developed an AI system that can diagnose NCP and differentiate it from other common pneumonia and normal controls. The AI system can assist radiologists and physicians in performing a quick diagnosis especially when the health system is overloaded. Significantly, our AI system identified important clinical markers that correlated with the NCP lesion properties. Together with the clinical data, our AI system was able to provide accurate clinical prognosis that can aid clinicians to consider appropriate early clinical management and allocate resources appropriately. We have made this AI system available globally to assist the clinicians to combat COVID-19.
Subjects
Artificial Intelligence, Coronavirus Infections/diagnosis, Viral Pneumonia/diagnosis, X-Ray Computed Tomography, COVID-19, China, Cohort Studies, Coronavirus Infections/pathology, Coronavirus Infections/therapy, Datasets as Topic, Humans, Lung/pathology, Biological Models, Pandemics, Pilot Projects, Viral Pneumonia/pathology, Viral Pneumonia/therapy, Prognosis, Radiologists, Respiratory Insufficiency/diagnosis
ABSTRACT
Crucial transitions in cancer, including tumor initiation, local expansion, metastasis, and therapeutic resistance, involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatments of cancer patients and those at risk for cancer.
Subjects
Neoplastic Cell Transformation/metabolism, Neoplasms/metabolism, Tumor Microenvironment/physiology, Atlases as Topic, Neoplastic Cell Transformation/pathology, Genomics/methods, Humans, Precision Medicine/methods, Single-Cell Analysis/methods
ABSTRACT
tRNA function is based on unique structures that enable mRNA decoding using anticodon trinucleotides. These structures interact with specific aminoacyl-tRNA synthetases and ribosomes using 3D shape and sequence signatures. Beyond translation, tRNAs serve as versatile signaling molecules interacting with other RNAs and proteins. Through evolutionary processes, tRNA fragmentation emerges as not merely random degradation but an act of recreation, generating specific shorter molecules called tRNA-derived small RNAs (tsRNAs). These tsRNAs exploit their linear sequences and newly arranged 3D structures for unexpected biological functions, epitomizing the tRNA "renovatio" (from Latin, meaning renewal, renovation, and rebirth). Emerging methods to uncover full tRNA/tsRNA sequences and modifications, combined with techniques to study RNA structures and to integrate AI-powered predictions, will enable comprehensive investigations of tRNA fragmentation products and new interaction potentials in relation to their biological functions. We anticipate that these directions will herald a new era for understanding biological complexity and advancing pharmaceutical engineering.
Subjects
Aminoacyl-tRNA Synthetases, Transfer RNA, Transfer RNA/metabolism, Anticodon, Aminoacyl-tRNA Synthetases/metabolism, Ribosomes/metabolism, Messenger RNA/genetics
ABSTRACT
Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.
Subjects
Machine Learning, Proteins, Proteins/chemistry, Computational Biology/methods, Protein Conformation
ABSTRACT
Here we discuss approaches to K-Ras inhibition and drug resistance scenarios. A breakthrough offered a covalent drug against K-RasG12C. Subsequent innovations harnessed same-allele drug combinations, as well as cotargeting K-RasG12C with a companion drug to upstream regulators or downstream kinases. However, primary, adaptive, and acquired resistance inevitably emerge. The preexisting mutation load can explain how even exceedingly rare mutations with unobservable effects can promote drug resistance, seeding growth of insensitive cell clones, and proliferation. Statistics confirm the expectation that most resistance-related mutations are in cis, pointing to the high probability of cooperative, same-allele effects. In addition to targeted Ras inhibitors and drug combinations, bifunctional molecules and innovative tri-complex inhibitors to target Ras mutants are also under development. Since the identities and potential contributions of preexisting and evolving mutations are unknown, selecting a pharmacologic combination is taxing. Collectively, our broad review outlines considerations and provides new insights into pharmacology and resistance.
Subjects
Antineoplastic Agents, Neoplasms, Humans, Antineoplastic Agents/pharmacology, Antineoplastic Agents/therapeutic use, Neoplasms/drug therapy, Paclitaxel, Alleles, Drug Combinations
ABSTRACT
Drug discovery is adapting to novel technologies such as data science, informatics, and artificial intelligence (AI) to accelerate effective treatment development while reducing costs and animal experiments. AI is transforming drug discovery, as indicated by increasing interest from investors, industrial and academic scientists, and legislators. Successful drug discovery requires optimizing properties related to pharmacodynamics, pharmacokinetics, and clinical outcomes. This review discusses the use of AI in the three pillars of drug discovery: diseases, targets, and therapeutic modalities, with a focus on small-molecule drugs. AI technologies, such as generative chemistry, machine learning, and multiproperty optimization, have enabled several compounds to enter clinical trials. The scientific community must carefully vet known information to address the reproducibility crisis. The full potential of AI in drug discovery can only be realized with sufficient ground truth and appropriate human intervention at later pipeline stages.
Subjects
Artificial Intelligence, Physicians, Animals, Humans, Reproducibility of Results, Drug Discovery, Technology
ABSTRACT
Deciphering the regulatory code of gene expression and interpreting the transcriptional effects of genome variation are critical challenges in human genetics. Modern experimental technologies have resulted in an abundance of data, enabling the development of sequence-based deep learning models that link patterns embedded in DNA to the biochemical and regulatory properties contributing to transcriptional regulation, including modeling epigenetic marks, 3D genome organization, and gene expression, with tissue and cell-type specificity. Such methods can predict the functional consequences of any noncoding variant in the human genome, even rare or never-before-observed variants, and systematically characterize their consequences beyond what is tractable from experiments or quantitative genetics studies alone. Recently, the development and application of interpretability approaches have led to the identification of key sequence patterns contributing to the predicted tasks, providing insights into the underlying biological mechanisms learned and revealing opportunities for improvement in future models.
Subjects
Deep Learning, Gene Expression Regulation, Genetic Transcription, Humans, Human Genome, Genetic Epigenesis
ABSTRACT
The notion of common sense is invoked so frequently in contexts as diverse as everyday conversation, political debates, and evaluations of artificial intelligence that its meaning might be surmised to be unproblematic. Surprisingly, however, neither the intrinsic properties of common sense knowledge (what makes a claim commonsensical) nor the degree to which it is shared by people (its "commonness") have been characterized empirically. In this paper, we introduce an analytical framework for quantifying both these elements of common sense. First, we define the commonsensicality of individual claims and people in terms of the latter's propensity to agree on the former and their awareness of one another's agreement. Second, we formalize the commonness of common sense as a clique detection problem on a bipartite belief graph of people and claims, defining [Formula: see text] common sense as the fraction [Formula: see text] of claims shared by a fraction [Formula: see text] of people. Evaluating our framework on a dataset of [Formula: see text] raters evaluating [Formula: see text] diverse claims, we find that commonsensicality aligns most closely with plainly worded, fact-like statements about everyday physical reality. Psychometric attributes such as social perceptiveness influence individual common sense, but surprisingly demographic factors such as age or gender do not. Finally, we find that collective common sense is rare: At most, a small fraction [Formula: see text] of people agree on more than a small fraction [Formula: see text] of claims. Together, these results undercut universalistic beliefs about common sense and raise questions about its variability that are relevant both to human and artificial intelligence.
Subjects
Artificial Intelligence, Knowledge, Humans, Psychometrics
ABSTRACT
Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods. In light of these results, it remains unclear for practitioners how to use these methods and choose between them in a principled way. In this paper, we show that for moderately rich model classes (easily satisfied by neural networks), any feature attribution method that is complete and linear, such as Integrated Gradients and Shapley Additive Explanations (SHAP), can provably fail to improve on random guessing for inferring model behavior. Our results apply to common end-tasks such as characterizing local model behavior, identifying spurious features, and algorithmic recourse. One takeaway from our work is the importance of concretely defining end-tasks: Once such an end-task is defined, a simple and direct approach of repeated model evaluations can outperform many other complex feature attribution methods.
ABSTRACT
Sex plays a crucial role in human brain development, aging, and the manifestation of psychiatric and neurological disorders. However, our understanding of sex differences in human functional brain organization and their behavioral consequences has been hindered by inconsistent findings and a lack of replication. Here, we address these challenges using a spatiotemporal deep neural network (stDNN) model to uncover latent functional brain dynamics that distinguish male and female brains. Our stDNN model accurately differentiated male and female brains, demonstrating consistently high cross-validation accuracy (>90%), replicability, and generalizability across multisession data from the same individuals and three independent cohorts (N ~ 1,500 young adults aged 20 to 35). Explainable AI (XAI) analysis revealed that brain features associated with the default mode network, striatum, and limbic network consistently exhibited significant sex differences (effect sizes > 1.5) across sessions and independent cohorts. Furthermore, XAI-derived brain features accurately predicted sex-specific cognitive profiles, a finding that was also independently replicated. Our results demonstrate that sex differences in functional brain dynamics are not only highly replicable and generalizable but also behaviorally relevant, challenging the notion of a continuum in male-female brain organization. Our findings underscore the crucial role of sex as a biological determinant in human brain organization, have significant implications for developing personalized sex-specific biomarkers in psychiatric and neurological disorders, and provide innovative AI-based computational tools for future research.
Subjects
Deep Learning, Nervous System Diseases, Young Adult, Humans, Male, Female, Sex Characteristics, Brain, Aging
ABSTRACT
Design of hardware based on biological principles of neuronal computation and plasticity in the brain is a leading approach to realizing energy- and sample-efficient AI and learning machines. An important factor in selection of the hardware building blocks is the identification of candidate materials with physical properties suitable to emulate the large dynamic ranges and varied timescales of neuronal signaling. Previous work has shown that the all-or-none spiking behavior of neurons can be mimicked by threshold switches utilizing material phase transitions. Here, we demonstrate that devices based on a prototypical metal-insulator-transition material, vanadium dioxide (VO₂), can be dynamically controlled to access a continuum of intermediate resistance states. Furthermore, the timescale of their intrinsic relaxation can be configured to match a range of biologically relevant timescales from milliseconds to seconds. We exploit these device properties to emulate three aspects of neuronal analog computation: fast (~1 ms) spiking in a neuronal soma compartment, slow (~100 ms) spiking in a dendritic compartment, and ultraslow (~1 s) biochemical signaling involved in temporal credit assignment for a recently discovered biological mechanism of one-shot learning. Simulations show that an artificial neural network using properties of VO₂ devices to control an agent navigating a spatial environment can learn an efficient path to a reward in up to fourfold fewer trials than standard methods. The phase relaxations described in our study may be engineered in a variety of materials and can be controlled by thermal, electrical, or optical stimuli, suggesting further opportunities to emulate biological learning in neuromorphic hardware.
Subjects
Learning, Computer Neural Networks, Computers, Brain/physiology, Neurons/physiology
ABSTRACT
Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Thus, aligning them with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are under suspicion of becoming able to deceive human operators and utilizing this ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to possess a conceptual understanding of deception strategies. This study reveals that such strategies emerged in state-of-the-art LLMs, but were nonexistent in earlier LLMs. We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents, that their performance in complex deception scenarios can be amplified utilizing chain-of-thought reasoning, and that eliciting Machiavellianism in LLMs can trigger misaligned deceptive behavior. GPT-4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time (P < 0.001). In complex second-order deception test scenarios where the aim is to mislead someone who expects to be deceived, GPT-4 resorts to deceptive behavior 71.46% of the time (P < 0.001) when augmented with chain-of-thought reasoning. In sum, revealing hitherto unknown machine behavior in LLMs, our study contributes to the nascent field of machine psychology.
Subjects
Deception, Language, Humans, Artificial Intelligence
ABSTRACT
The prediction of protein 3D structure from amino acid sequence is a computational grand challenge in biophysics and plays a key role in robust protein structure prediction algorithms, from drug discovery to genome interpretation. The advent of AI models, such as AlphaFold, is revolutionizing applications that depend on robust protein structure prediction algorithms. To maximize the impact, and ease the usability, of these AI tools we introduce APACE, AlphaFold2 and advanced computing as a service, a computational framework that effectively handles this AI model and its TB-size database to conduct accelerated protein structure prediction analyses in modern supercomputing environments. We deployed APACE in the Delta and Polaris supercomputers and quantified its performance for accurate protein structure predictions using four exemplar proteins: 6AWO, 6OAN, 7MEZ, and 6D6U. Using up to 300 ensembles, distributed across 200 NVIDIA A100 GPUs, we found that APACE is up to two orders of magnitude faster than off-the-shelf AlphaFold2 implementations, reducing time-to-solution from weeks to minutes. This computational approach may be readily linked with robotics laboratories to automate and accelerate scientific discovery.
Subjects
Algorithms, Biophysics, Proteins, Proteins/chemistry, Biophysics/methods, Protein Conformation, Software, Computational Biology/methods, Molecular Models
ABSTRACT
Eleven large language models (LLMs) were assessed using 40 bespoke false-belief tasks, considered a gold standard in testing theory of mind (ToM) in humans. Each task included a false-belief scenario, three closely matched true-belief control scenarios, and the reversed versions of all four. An LLM had to solve all eight scenarios to solve a single task. Older models solved no tasks; Generative Pre-trained Transformer (GPT)-3-davinci-003 (from November 2022) and ChatGPT-3.5-turbo (from March 2023) solved 20% of the tasks; ChatGPT-4 (from June 2023) solved 75% of the tasks, matching the performance of 6-y-old children observed in past studies. We explore the potential interpretation of these results, including the intriguing possibility that ToM-like ability, previously considered unique to humans, may have emerged as an unintended by-product of LLMs' improving language skills. Regardless of how we interpret these outcomes, they signify the advent of more powerful and socially skilled AI, with profound positive and negative implications.
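The all-or-nothing scoring rule described in this abstract (a task counts as solved only when all eight of its scenarios are solved) can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the task identifiers are assumptions made for illustration; they are not taken from the study's materials.

```python
from typing import Dict, List

# Per the abstract: one false-belief scenario, three true-belief controls,
# and the reversed versions of all four, i.e. eight scenarios per task.
SCENARIOS_PER_TASK = 8


def task_solved(scenario_results: List[bool]) -> bool:
    """Return True only if every one of the eight scenarios was solved."""
    if len(scenario_results) != SCENARIOS_PER_TASK:
        raise ValueError(f"expected {SCENARIOS_PER_TASK} scenario results")
    return all(scenario_results)


def fraction_of_tasks_solved(results: Dict[str, List[bool]]) -> float:
    """Fraction of tasks solved under the all-or-nothing rule.

    `results` maps a hypothetical task identifier to its eight
    per-scenario outcomes (True means the scenario was solved).
    """
    if not results:
        return 0.0
    solved = sum(task_solved(outcomes) for outcomes in results.values())
    return solved / len(results)
```

Under this rule a model that solves seven of the eight scenarios in a task receives no credit for that task, which is why task-level percentages can sit well below scenario-level accuracy.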
Subjects
Language, Theory of Mind, Theory of Mind/physiology, Humans, Child, Male, Female
ABSTRACT
Recent advancements in large language models (LLMs) have raised the prospect of scalable, automated, and fine-grained political microtargeting on a scale previously unseen; however, the persuasive influence of microtargeting with LLMs remains unclear. Here, we build a custom web application capable of integrating self-reported demographic and political data into GPT-4 prompts in real-time, facilitating the live creation of unique messages tailored to persuade individual users on four political issues. We then deploy this application in a preregistered randomized control experiment (n = 8,587) to investigate the extent to which access to individual-level data increases the persuasive influence of GPT-4. Our approach yields two key findings. First, messages generated by GPT-4 were broadly persuasive, in some cases increasing support for an issue stance by up to 12 percentage points. Second, in aggregate, the persuasive impact of microtargeted messages was not statistically different from that of non-microtargeted messages (4.83 vs. 6.20 percentage points, respectively, P = 0.226). These trends hold even when manipulating the type and number of attributes used to tailor the message. These findings suggest, contrary to widespread speculation, that the influence of current LLMs may reside not in their ability to tailor messages to individuals but rather in the persuasiveness of their generic, nontargeted messages. We release our experimental dataset, GPTarget2024, as an empirical baseline for future research.
Subjects
Persuasive Communication, Politics, Humans, Language
ABSTRACT
How genomic differences contribute to phenotypic differences is a major question in biology. The recently characterized genomes, isolation environments, and qualitative patterns of growth on 122 sources and conditions of 1,154 strains from 1,049 fungal species (nearly all known) in the yeast subphylum Saccharomycotina provide a powerful, yet complex, dataset for addressing this question. We used a random forest algorithm trained on these genomic, metabolic, and environmental data to predict growth on several carbon sources with high accuracy. Known structural genes involved in assimilation of these sources and presence/absence patterns of growth in other sources were important features contributing to prediction accuracy. By further examining growth on galactose, we found that it can be predicted with high accuracy from either genomic (92.2%) or growth data (82.6%) but not from isolation environment data (65.6%). Prediction accuracy was even higher (93.3%) when we combined genomic and growth data. After the GALactose utilization genes, the most important feature for predicting growth on galactose was growth on galactitol, raising the hypothesis that several species in two orders, Serinales and Pichiales (containing the emerging pathogen Candida auris and the genus Ogataea, respectively), have an alternative galactose utilization pathway because they lack the GAL genes. Growth and biochemical assays confirmed that several of these species utilize galactose through an alternative oxidoreductive D-galactose pathway, rather than the canonical GAL pathway. Machine learning approaches are powerful for investigating the evolution of the yeast genotype-phenotype map, and their application will uncover novel biology, even in well-studied traits.
Subjects
Galactose, Machine Learning, Galactose/metabolism, Fungal Genome, Metabolic Networks and Pathways/genetics, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae/genetics
ABSTRACT
The Saccharomycotina yeasts ("yeasts" hereafter) are a fungal clade of scientific, economic, and medical significance. Yeasts are highly ecologically diverse, found across a broad range of environments in every biome and continent on Earth; however, little is known about what rules govern the macroecology of yeast species and their range limits in the wild. Here, we trained machine learning models on 12,816 terrestrial occurrence records and 96 environmental variables to infer global distribution maps at ~1 km² resolution for 186 yeast species (~15% of described species from 75% of orders) and to test environmental drivers of yeast biogeography and macroecology. We found that predicted yeast diversity hotspots occur in mixed montane forests in temperate climates. Diversity in vegetation type and topography were some of the greatest predictors of yeast species richness, suggesting that microhabitats and environmental clines are key to yeast diversity. We further found that range limits in yeasts are significantly influenced by carbon niche breadth and range overlap with other yeast species, with carbon specialists and species in high-diversity environments exhibiting reduced geographic ranges. Finally, yeasts contravene many long-standing macroecological principles, including the latitudinal diversity gradient, temperature-dependent species richness, and a positive relationship between latitude and range size (Rapoport's rule). These results unveil how the environment governs the global diversity and distribution of species in the yeast subphylum. These high-resolution models of yeast species distributions will facilitate the prediction of economically relevant and emerging pathogenic species under current and future climate scenarios.