ABSTRACT
The cell is arguably the most fundamental unit of life and is central to understanding biology. Accurate modeling of cells is important for this understanding as well as for determining the root causes of disease. Recent advances in artificial intelligence (AI), combined with the ability to generate large-scale experimental data, present novel opportunities to model cells. Here we propose a vision of leveraging advances in AI to construct virtual cells: high-fidelity simulations of cells and cellular systems under different conditions that are directly learned from biological data across measurements and scales. We discuss desired capabilities of such AI Virtual Cells, including generating universal representations of biological entities across scales and facilitating interpretable in silico experiments to predict and understand their behavior using Virtual Instruments. We further address the challenges, opportunities, and requirements to realize this vision, including data needs, evaluation strategies, and community standards and engagement to ensure biological accuracy and broad utility. We envision a future where AI Virtual Cells help identify new drug targets, predict cellular responses to perturbations, and scale hypothesis exploration. With open science collaborations across the biomedical ecosystem that includes academia, philanthropy, and the biopharma and AI industries, a comprehensive predictive understanding of cell mechanisms and interactions has come into reach.
ABSTRACT
With the growing number of single-cell analysis tools, benchmarks are increasingly important to guide analysis and method development. However, a lack of standardisation and extensibility in current benchmarks limits their usability, longevity, and relevance to the community. We present Open Problems, a living, extensible, community-guided benchmarking platform including 10 current single-cell tasks that we envision will raise standards for the selection, evaluation, and development of methods in single-cell analysis.
ABSTRACT
Checkpoint inhibitors (CPIs) targeting programmed death 1 (PD-1)/programmed death ligand 1 (PD-L1) and cytotoxic T lymphocyte antigen 4 (CTLA-4) have revolutionized cancer treatment but can trigger autoimmune complications, including CPI-induced diabetes mellitus (CPI-DM), which occurs preferentially with PD-1 blockade. We found evidence of pancreatic inflammation in patients with CPI-DM, with shrinkage of pancreases, increased pancreatic enzymes, and, in one patient who died with CPI-DM, peri-islet lymphocytic infiltration. In the NOD mouse model, anti-PD-L1 but not anti-CTLA-4 induced diabetes rapidly. RNA sequencing revealed that cytolytic IFN-γ+CD8+ T cells infiltrated islets with anti-PD-L1. Changes in β cells were predominantly driven by IFN-γ and TNF-α and included induction of a potentially novel β cell population with transcriptional changes suggesting dedifferentiation. IFN-γ increased checkpoint ligand expression and activated apoptosis pathways in human β cells in vitro. Treatment with anti-IFN-γ and anti-TNF-α prevented CPI-DM in anti-PD-L1-treated NOD mice. CPIs targeting the PD-1/PD-L1 pathway resulted in transcriptional changes in β cells and immune infiltrates that may lead to the development of diabetes. Inhibition of inflammatory cytokines can prevent CPI-DM, suggesting a strategy for clinical application to prevent this complication.
Subjects
Diabetes Mellitus , Programmed Cell Death 1 Receptor , Animals , Humans , Inflammation Mediators , Mice , Mice, Inbred NOD , Tumor Necrosis Factor Inhibitors
ABSTRACT
ABSTRACT: Phenotypic plasticity describes the ability of cancer cells to undergo dynamic, nongenetic cell state changes that amplify cancer heterogeneity to promote metastasis and therapy evasion. Thus, cancer cells occupy a continuous spectrum of phenotypic states connected by trajectories defining dynamic transitions upon a cancer cell state landscape. With technologies proliferating to systematically record molecular mechanisms at single-cell resolution, we illuminate manifold learning techniques as emerging computational tools to effectively model cell state dynamics in a way that mimics our understanding of the cell state landscape. We anticipate that "state-gating" therapies targeting phenotypic plasticity will limit cancer heterogeneity, metastasis, and therapy resistance. SIGNIFICANCE: Nongenetic mechanisms underlying phenotypic plasticity have emerged as significant drivers of tumor heterogeneity, metastasis, and therapy resistance. Herein, we discuss new experimental and computational techniques to define phenotypic plasticity as a scaffold to guide accelerated progress in uncovering new vulnerabilities for therapeutic exploitation.
Subjects
Epithelial-Mesenchymal Transition , Neoplasms , Adaptation, Physiological , Humans , Neoplasms/drug therapy
ABSTRACT
As the biomedical community produces datasets that are increasingly complex and high dimensional, there is a need for more sophisticated computational tools to extract biological insights. We present Multiscale PHATE, a method that sweeps through all levels of data granularity to learn abstracted biological features directly predictive of disease outcome. Built on a coarse-graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse resolutions for high-level summarizations of data and at fine resolutions for detailed representations of subsets. We apply Multiscale PHATE to a coronavirus disease 2019 (COVID-19) dataset with 54 million cells from 168 hospitalized patients and find that patients who die show CD16hiCD66blo neutrophil and IFN-γ+ granzyme B+ Th17 cell responses. We also show that population groupings from Multiscale PHATE directly fed into a classifier predict disease outcome more accurately than naive featurizations of the data. Multiscale PHATE is broadly generalizable to different data types, including flow cytometry, single-cell RNA sequencing (scRNA-seq), single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq), and clinical variables.
Subjects
COVID-19 , Single-Cell Analysis , Chromatin , Humans , Single-Cell Analysis/methods , Transposases , Exome Sequencing
ABSTRACT
The evolution of uniquely human traits likely entailed changes in developmental gene regulation. Human Accelerated Regions (HARs), which include transcriptional enhancers harboring a significant excess of human-specific sequence changes, are leading candidates for driving gene regulatory modifications in human development. However, insight into whether HARs alter the level, distribution, and timing of endogenous gene expression remains limited. We examined the role of the HAR HACNS1 (HAR2) in human evolution by interrogating its molecular functions in a genetically humanized mouse model. We find that HACNS1 maintains its human-specific enhancer activity in the mouse embryo and modifies expression of Gbx2, which encodes a transcription factor, during limb development. Using single-cell RNA-sequencing, we demonstrate that Gbx2 is upregulated in the limb chondrogenic mesenchyme of HACNS1 homozygous embryos, supporting that HACNS1 alters gene expression in cell types involved in skeletal patterning. Our findings illustrate that humanized mouse models provide mechanistic insight into how HARs modified gene expression in human evolution.
Subjects
Gene Expression Regulation , Genome , Models, Genetic , Animals , Base Sequence , Cell Differentiation/genetics , Chondrocytes/cytology , Chondrogenesis/genetics , Embryo, Mammalian/metabolism , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Extremities/embryology , Gene Expression Profiling , Gene Knock-In Techniques , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Homozygote , Humans , Mesoderm/embryology , Mesoderm/metabolism , Mice, Inbred C57BL , Pan troglodytes , Promoter Regions, Genetic/genetics , Time Factors
ABSTRACT
Often when biological entities are measured in multiple ways, there are distinct categories of information: some information is easy-to-obtain information (EI) and can be gathered on virtually every subject of interest, while other information is hard-to-obtain information (HI) and can only be gathered on some. We propose building a model to make probabilistic predictions of HI using EI. Our feature mapping GAN (FMGAN), based on the conditional GAN framework, uses an embedding network to process conditions as part of the conditional GAN training to create manifold structure when it is not readily present in the conditions. We experiment on generating RNA sequencing of cell lines perturbed with a drug conditioned on the drug's chemical structure and generating FACS data from clinical monitoring variables on a cohort of COVID-19 patients, effectively describing their immune response in great detail.
ABSTRACT
Current methods for comparing single-cell RNA sequencing datasets collected in multiple conditions focus on discrete regions of the transcriptional state space, such as clusters of cells. Here we quantify the effects of perturbations at the single-cell level using a continuous measure of the effect of a perturbation across the transcriptomic space. We describe this space as a manifold and develop a relative likelihood estimate of observing each cell in each of the experimental conditions using graph signal processing. This likelihood estimate can be used to identify cell populations specifically affected by a perturbation. We also develop vertex frequency clustering to extract populations of affected cells at the level of granularity that matches the perturbation response. The accuracy of our algorithm at identifying clusters of cells that are enriched or depleted in each condition is, on average, 57% higher than the next-best-performing algorithm tested. Gene signatures derived from these clusters are more accurate than those of six alternative algorithms in ground truth comparisons.
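The relative likelihood idea described above can be illustrated with a small sketch. This is not the authors' implementation: the binary k-nearest-neighbor graph, the Tikhonov-style low-pass filter that solves (I + beta * L) y = s, and the names `knn_affinity`, `relative_likelihood`, `k`, and `beta` are all assumptions chosen for illustration. Each condition label is treated as a signal on the cell graph, smoothed, and then normalized per cell across conditions.

```python
import numpy as np

def knn_affinity(X, k=5):
    """Symmetric binary k-nearest-neighbor adjacency matrix (illustrative choice)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # a cell is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]       # indices of the k closest cells
    A = np.zeros_like(d)
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, idx.ravel()] = 1.0
    return np.maximum(A, A.T)                # symmetrize

def relative_likelihood(X, labels, k=5, beta=1.0):
    """Per-cell relative likelihood of each condition via low-pass graph filtering."""
    A = knn_affinity(X, k)
    L = np.diag(A.sum(axis=1)) - A           # combinatorial graph Laplacian
    conditions = np.unique(labels)
    # One indicator signal per condition, scaled so each condition sums to 1.
    S = np.stack([(labels == c).astype(float) for c in conditions], axis=1)
    S /= S.sum(axis=0, keepdims=True)
    # Low-pass filter (assumed form): solve (I + beta * L) Y = S.
    Y = np.linalg.solve(np.eye(len(X)) + beta * L, S)
    Y = np.clip(Y, 0.0, None)                # guard against tiny negative round-off
    return conditions, Y / Y.sum(axis=1, keepdims=True)
```

Calling `relative_likelihood(X, labels)` on a cells-by-features matrix `X` with one condition label per cell returns the sorted condition names and a cells-by-conditions matrix whose rows sum to 1. Cells in regions dominated by one condition receive values near 1 for that condition, which is the kind of continuous, per-cell score the abstract contrasts with cluster-level comparisons.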
Subjects
Computational Biology , Sequence Analysis, RNA/trends , Single-Cell Analysis/trends , Transcriptome/genetics , Algorithms , Cluster Analysis , Computer Simulation , Humans , Likelihood Functions
ABSTRACT
Obesity is a major modifiable risk factor for pancreatic ductal adenocarcinoma (PDAC), yet how and when obesity contributes to PDAC progression is not well understood. Leveraging an autochthonous mouse model, we demonstrate a causal and reversible role for obesity in early PDAC progression, showing that obesity markedly enhances tumorigenesis, while genetic or dietary induction of weight loss intercepts cancer development. Molecular analyses of human and murine samples define microenvironmental consequences of obesity that foster tumorigenesis rather than new driver gene mutations, including significant pancreatic islet cell adaptation in obesity-associated tumors. Specifically, we identify aberrant beta cell expression of the peptide hormone cholecystokinin (Cck) in response to obesity and show that islet Cck promotes oncogenic Kras-driven pancreatic ductal tumorigenesis. Our studies argue that PDAC progression is driven by local obesity-associated changes in the tumor microenvironment and implicate endocrine-exocrine signaling beyond insulin in PDAC development.
Subjects
Carcinoma, Pancreatic Ductal/etiology , Carcinoma, Pancreatic Ductal/metabolism , Obesity/metabolism , Animals , Carcinogenesis/genetics , Carcinoma, Pancreatic Ductal/pathology , Cell Line , Cell Line, Tumor , Cell Transformation, Neoplastic/genetics , Disease Models, Animal , Disease Progression , Endocrine Cells/metabolism , Exocrine Glands/metabolism , Female , Gene Expression Regulation, Neoplastic/genetics , Humans , Male , Mice , Mice, Inbred C57BL , Mutation/genetics , Obesity/genetics , Pancreatic Neoplasms/metabolism , Pancreatic Neoplasms/pathology , Signal Transduction/genetics , Tumor Microenvironment/physiology , Pancreatic Neoplasms
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
The high-dimensional data created by high-throughput technologies require visualization tools that reveal data structure and patterns in an intuitive form. We present PHATE, a visualization method that captures both local and global nonlinear structure using an information-geometric distance between data points. We compare PHATE to other tools on a variety of artificial and biological datasets, and find that it consistently preserves a range of patterns in data, including continual progressions, branches and clusters, better than other tools. We define a manifold preservation metric, which we call denoised embedding manifold preservation (DEMaP), and show that PHATE produces lower-dimensional embeddings that are quantitatively better denoised as compared to existing visualization methods. An analysis of a newly generated single-cell RNA sequencing dataset on human germ-layer differentiation demonstrates how PHATE reveals unique biological insight into the main developmental branches, including identification of three previously undescribed subpopulations. We also show that PHATE is applicable to a wide variety of data types, including mass cytometry, single-cell RNA sequencing, Hi-C and gut microbiome data.
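To make the notion of a diffusion-based, information-geometric distance more concrete, here is a minimal sketch. It is a simplified illustration rather than the PHATE implementation: the fixed-bandwidth Gaussian kernel, the fixed diffusion time `t`, the small constant `eps` inside the logarithm, and the use of classical MDS for the final embedding are all simplifying assumptions, and the function name `potential_distance_embedding` is hypothetical.

```python
import numpy as np

def potential_distance_embedding(X, sigma=1.0, t=5, n_components=2, eps=1e-7):
    """Embed points using distances between log-transformed diffusion probabilities."""
    # Gaussian affinities between all pairs of points (assumed kernel).
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    # Row-normalize to a Markov (diffusion) operator and take t steps.
    P = K / K.sum(axis=1, keepdims=True)
    Pt = np.linalg.matrix_power(P, t)
    # Log-transform the t-step probabilities, then compare rows pairwise.
    U = -np.log(Pt + eps)
    D = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=-1)
    # Classical MDS on the distance matrix (a simplification).
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order] * np.sqrt(np.clip(w[order], 0.0, None))
```

The log transform is what makes the comparison sensitive to differences in small transition probabilities, so that points far apart on the data manifold remain well separated in the embedding, while the diffusion step averages over local noise.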
Subjects
Genomics/methods , High-Throughput Screening Assays/methods , Image Processing, Computer-Assisted/methods , Algorithms , Animals , Big Data , Cell Differentiation , Cells, Cultured , Computer Simulation , Databases, Genetic , Gastrointestinal Microbiome , Humans , Mice , Sequence Analysis, RNA , Single-Cell Analysis
ABSTRACT
Cancer is a hyper-proliferative disease. Whether the proliferative state originates from the cell-of-origin or emerges later remains difficult to resolve. By tracking de novo transformation from normal hematopoietic progenitors expressing an acute myeloid leukemia (AML) oncogene MLL-AF9, we reveal that the cell cycle rate heterogeneity among granulocyte-macrophage progenitors (GMPs) determines their probability of transformation. A fast cell cycle intrinsic to these progenitors provides permissiveness for transformation, with the fastest-cycling 3% of GMPs acquiring malignancy with near certainty. Molecularly, we propose that MLL-AF9 preserves gene expression of the cellular states in which it is expressed. As such, when expressed in the naturally existing, rapidly cycling immature myeloid progenitors, this cell state becomes perpetuated, yielding malignancy. In humans, high CCND1 expression predicts worse prognosis for MLL fusion AMLs. Our work elucidates one of the earliest steps toward malignancy and suggests that modifying the cycling state of the cell-of-origin could be a preventative approach against malignancy.
Subjects
Cell Transformation, Neoplastic/genetics , Gene Expression Regulation, Leukemic , Leukemia, Myeloid, Acute/genetics , Myeloid Progenitor Cells/pathology , Myeloid-Lymphoid Leukemia Protein/genetics , Oncogene Proteins, Fusion/genetics , Animals , Cell Cycle/drug effects , Cell Cycle/genetics , Cell Differentiation/drug effects , Cell Differentiation/genetics , Cell Proliferation/drug effects , Cell Proliferation/genetics , Cell Transformation, Neoplastic/drug effects , Cyclin D1/metabolism , Disease Models, Animal , Female , Gene Knock-In Techniques , Humans , Kaplan-Meier Estimate , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/mortality , Male , Mice, Transgenic , Piperazines/administration & dosage , Primary Cell Culture , Prognosis , Pyridines/administration & dosage
ABSTRACT
As Earth's climate warms, soil carbon pools and the microbial communities that process them may change, altering the way in which carbon is recycled in soil. In this study, we used a combination of metagenomics and bacterial cultivation to evaluate the hypothesis that experimentally raising soil temperatures by 5°C for 5, 8, or 20 years increased the potential for temperate forest soil microbial communities to degrade carbohydrates. Warming decreased the proportion of carbohydrate-degrading genes in the organic horizon derived from eukaryotes and increased the fraction of genes in the mineral soil associated with Actinobacteria in all studies. Genes associated with carbohydrate degradation increased in the organic horizon after 5 years of warming but had decreased in the organic horizon after warming the soil continuously for 20 years. However, a greater proportion of the 295 bacteria from 6 phyla (10 classes, 14 orders, and 34 families) isolated from heated plots in the 20-year experiment were able to depolymerize cellulose and xylan than bacterial isolates from control soils. Together, these findings indicate that the enrichment of bacteria capable of degrading carbohydrates could be important for accelerated carbon cycling in a warmer world. IMPORTANCE: The massive carbon stocks currently held in soils have been built up over millennia, and while numerous lines of evidence indicate that climate change will accelerate the processing of this carbon, it is unclear whether the genetic repertoire of the microbes responsible for this elevated activity will also change. In this study, we showed that bacteria isolated from plots subject to 20 years of 5°C of warming were more likely to depolymerize the plant polymers xylan and cellulose, but that carbohydrate degradation capacity is not uniformly enriched by warming treatment in the metagenomes of soil microbial communities. 
This study illustrates the utility of combining culture-dependent and culture-independent surveys of microbial communities to improve our understanding of the role changing microbial communities may play in soil carbon cycling under climate change.