Results 1 - 20 of 168
1.
Article in English | MEDLINE | ID: mdl-38594933

ABSTRACT

Deciphering the regulatory code of gene expression and interpreting the transcriptional effects of genome variation are critical challenges in human genetics. Modern experimental technologies have resulted in an abundance of data, enabling the development of sequence-based deep learning models that link patterns embedded in DNA to the biochemical and regulatory properties contributing to transcriptional regulation, including modeling epigenetic marks, 3D genome organization, and gene expression, with tissue and cell-type specificity. Such methods can predict the functional consequences of any noncoding variant in the human genome, even rare or never-before-observed variants, and systematically characterize their consequences beyond what is tractable from experiments or quantitative genetics studies alone. Recently, the development and application of interpretability approaches have led to the identification of key sequence patterns contributing to the predicted tasks, providing insights into the underlying biological mechanisms learned and revealing opportunities for improvement in future models.
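The variant-scoring idea described in this abstract can be sketched as follows: encode the reference and alternate sequences, run both through a sequence model, and take the difference of the predictions. This is only an illustrative sketch. `toy_model` is a hypothetical stand-in (a hard-coded "TATA" motif matcher), not any of the deep learning models the review covers, and the function names are assumptions made for this example.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq: str) -> np.ndarray:
    """Encode a DNA string as a (length, 4) one-hot matrix.
    Unknown bases (e.g. N) become all-zero rows."""
    mat = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat

def toy_model(x: np.ndarray) -> float:
    """Hypothetical stand-in for a trained network:
    returns the best match count against a 'TATA' motif."""
    motif = one_hot("TATA")
    scores = [float((x[i:i + 4] * motif).sum()) for i in range(len(x) - 3)]
    return max(scores)

def variant_effect(ref_seq: str, pos: int, alt: str, model=toy_model) -> float:
    """Predicted effect of a single-base substitution:
    model(alternate sequence) - model(reference sequence)."""
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return model(one_hot(alt_seq)) - model(one_hot(ref_seq))

ref = "GGCTATAAGG"                   # contains a TATA motif at positions 3-6
effect = variant_effect(ref, 4, "C")  # A>C substitution inside the motif
print(effect)                         # negative: the substitution weakens the motif match
```

Because the score depends only on sequence, the same call works for any substitution, including variants never observed in a population, which is the property the abstract highlights.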

2.
Brain ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38462574

ABSTRACT

Neurons from layer II of the entorhinal cortex (ECII) are the first to accumulate tau protein aggregates and degenerate during prodromal Alzheimer's disease (AD). Gaining insight into the molecular mechanisms underlying this vulnerability will help reveal genes and pathways at play during incipient stages of the disease. Here, we use a data-driven functional genomics approach to model ECII neurons in silico and identify the proto-oncogene DEK as a regulator of tau pathology. We show that epigenetic changes caused by Dek silencing alter activity-induced transcription, with major effects on neuronal excitability. This is accompanied by gradual accumulation of tau in the somatodendritic compartment of mouse ECII neurons in vivo, reactivity of surrounding microglia, and microglia-mediated neuron loss. These features are all characteristic of early AD. The existence of a cell-autonomous mechanism linking AD pathogenic mechanisms in the precise neuron type where the disease starts provides unique evidence that synaptic homeostasis dysregulation is of central importance in the onset of tau pathology in AD.

3.
Nat Methods ; 21(3): 488-500, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38361019

ABSTRACT

Protein-protein interactions (PPIs) drive cellular processes and responses to environmental cues, reflecting the cellular state. Here we develop Tapioca, an ensemble machine learning framework for studying global PPIs in dynamic contexts. Tapioca predicts de novo interactions by integrating mass spectrometry interactome data from thermal/ion denaturation or cofractionation workflows with protein properties and tissue-specific functional networks. Focusing on the thermal proximity coaggregation method, we improved the experimental workflow. Finely tuned thermal denaturation afforded increased throughput, while cell lysis optimization enhanced protein detection from different subcellular compartments. The Tapioca workflow was next leveraged to investigate viral infection dynamics. Temporal PPIs were characterized during the reactivation from latency of the oncogenic Kaposi's sarcoma-associated herpesvirus. Together with functional assays, NUCKS was identified as a proviral hub protein, and a broader role was uncovered by integrating PPI networks from alpha- and betaherpesvirus infections. Altogether, Tapioca provides a web-accessible platform for predicting PPIs in dynamic contexts.


Subjects
Human Herpesvirus 8, Manihot, Kaposi Sarcoma, Kaposi Sarcoma/metabolism, Viral Proteins/metabolism, Manihot/metabolism, Virus Latency, Human Herpesvirus 8/metabolism
4.
Nucleic Acids Res ; 52(2): 572-582, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38084892

ABSTRACT

Single same cell RNAseq/ATACseq multiome data provide unparalleled potential to develop high resolution maps of the cell-type specific transcriptional regulatory circuitry underlying gene expression. We present CREMA, a framework that recovers the full cis-regulatory circuitry by modeling gene expression and chromatin activity in individual cells without peak-calling or cell type labeling constraints. We demonstrate that CREMA overcomes the limitations of existing methods that fail to identify about half of functional regulatory elements which are outside the called chromatin 'peaks'. These circuit sites outside called peaks are shown to be important cell type specific functional regulatory loci, sufficient to distinguish individual cell types. Analysis of mouse pituitary data identifies a Gata2-circuit for the gonadotrope-enriched disease-associated Pcsk1 gene, which is experimentally validated by reduced gonadotrope expression in a gonadotrope conditional Gata2-knockout model. We present a web accessible human immune cell regulatory circuit resource, and provide CREMA as an R package.


Subjects
Gonadotrophs, Pituitary Gland, Mice, Humans, Animals, Pituitary Gland/metabolism, Gonadotrophs/metabolism, Chromatin/genetics, Chromatin/metabolism, Nucleic Acid Regulatory Sequences
5.
bioRxiv ; 2023 Nov 04.
Article in English | MEDLINE | ID: mdl-37961197

ABSTRACT

To facilitate single cell multi-omics analysis and improve reproducibility, we present SPEEDI (Single-cell Pipeline for End to End Data Integration), a fully automated end-to-end framework for batch inference, data integration, and cell type labeling. SPEEDI introduces data-driven batch inference and transforms the often heterogeneous data matrices obtained from different samples into a uniformly annotated and integrated dataset. Without requiring user input, it automatically selects parameters and executes pre-processing, sample integration, and cell type mapping. It can also perform downstream analyses of differential signals between treatment conditions and gene functional modules. SPEEDI's data-driven batch inference method works with widely used integration and cell-typing tools. By developing data-driven batch inference, providing full end-to-end automation, and eliminating parameter selection, SPEEDI improves reproducibility and lowers the barrier to obtaining biological insight from these valuable single-cell datasets. The SPEEDI interactive web application can be accessed at https://speedi.princeton.edu/.

6.
Nat Comput Sci ; 3(7): 644-657, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37974651

ABSTRACT

Resolving chromatin-remodeling-linked gene expression changes at cell-type resolution is important for understanding disease states. Here we describe MAGICAL (Multiome Accessibility Gene Integration Calling and Looping), a hierarchical Bayesian approach that leverages paired single-cell RNA sequencing and single-cell transposase-accessible chromatin sequencing from different conditions to map disease-associated transcription factors, chromatin sites, and genes as regulatory circuits. By simultaneously modeling signal variation across cells and conditions in both omics data types, MAGICAL achieved high accuracy on circuit inference. We applied MAGICAL to study Staphylococcus aureus sepsis from peripheral blood mononuclear single-cell data that we generated from subjects with bloodstream infection and uninfected controls. MAGICAL identified sepsis-associated regulatory circuits predominantly in CD14 monocytes, known to be activated by bacterial sepsis. We addressed the challenging problem of distinguishing host regulatory circuit responses to methicillin-resistant and methicillin-susceptible S. aureus infections. Although differential expression analysis failed to show predictive value, MAGICAL identified epigenetic circuit biomarkers that distinguished methicillin-resistant from methicillin-susceptible S. aureus infections.

7.
ArXiv ; 2023 Sep 29.
Article in English | MEDLINE | ID: mdl-37808087

ABSTRACT

Finely-tuned enzymatic pathways control cellular processes, and their dysregulation can lead to disease. Creating predictive and interpretable models for these pathways is challenging because of the complexity of the pathways and of the cellular and genomic contexts. Here we introduce Elektrum, a deep learning framework which addresses these challenges with data-driven and biophysically interpretable models for determining the kinetics of biochemical systems. First, it uses in vitro kinetic assays to rapidly hypothesize an ensemble of high-quality Kinetically Interpretable Neural Networks (KINNs) that predict reaction rates. It then employs a novel transfer learning step, where the KINNs are inserted as intermediary layers into deeper convolutional neural networks, fine-tuning the predictions for reaction-dependent in vivo outcomes. Elektrum makes effective use of the limited, but clean in vitro data and the complex, yet plentiful in vivo data that captures cellular context. We apply Elektrum to predict CRISPR-Cas9 off-target editing probabilities and demonstrate that Elektrum achieves state-of-the-art performance, regularizes neural network architectures, and maintains physical interpretability.

8.
bioRxiv ; 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37808658

ABSTRACT

Endurance exercise is an important health modifier. We studied cell-type specific adaptations of human skeletal muscle to acute endurance exercise using single-nucleus (sn) multiome sequencing in human vastus lateralis samples collected before and 3.5 hours after 40 min exercise at 70% VO2max in four subjects, as well as in matched time of day samples from two supine resting circadian controls. High quality same-cell RNA-seq and ATAC-seq data were obtained from 37,154 nuclei comprising 14 cell types. Among muscle fiber types, both shared and fiber-type specific regulatory programs were identified. Single-cell circuit analysis identified distinct adaptations in fast, slow and intermediate fibers as well as LUM-expressing FAP cells, involving a total of 328 transcription factors (TFs) acting at altered accessibility sites regulating 2,025 genes. These data and circuit mapping provide single-cell insight into the processes underlying tissue and metabolic remodeling responses to exercise.

9.
Cell Rep Methods ; 3(9): 100580, 2023 Sep 25.
Article in English | MEDLINE | ID: mdl-37703883

ABSTRACT

Human biology is rooted in highly specialized cell types programmed by a common genome, 98% of which is outside of genes. Genetic variation in the enormous noncoding space is linked to the majority of disease risk. To address the problem of linking these variants to expression changes in primary human cells, we introduce ExPectoSC, an atlas of modular deep-learning-based models for predicting cell-type-specific gene expression directly from sequence. We provide models for 105 primary human cell types covering 7 organ systems, demonstrate their accuracy, and then apply them to prioritize relevant cell types for complex human diseases. The resulting atlas of sequence-based gene expression and variant effects is publicly available in a user-friendly interface and readily extensible to any primary cell types. We demonstrate the accuracy of our approach through systematic evaluations and apply the models to prioritize ClinVar clinical variants of uncertain significance, verifying our top predictions experimentally.


Subjects
Ascomycota, Humans, Gene Expression/genetics
10.
Cell Rep Methods ; 3(2): 100395, 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36936082

ABSTRACT

Assays detecting blood transcriptome changes are studied for infectious disease diagnosis. Blood-based RNA alternative splicing (AS) events, which have not been well characterized in pathogen infection, have potential normalization and assay platform stability advantages over gene expression for diagnosis. Here, we present a computational framework for developing AS diagnostic biomarkers. Leveraging a large prospective cohort of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and whole-blood RNA sequencing (RNA-seq) data, we identify a major functional AS program switch upon viral infection. Using an independent cohort, we demonstrate the improved accuracy of AS biomarkers for SARS-CoV-2 diagnosis compared with six reported transcriptome signatures. We then optimize a subset of AS-based biomarkers to develop microfluidic PCR diagnostic assays. This assay achieves nearly perfect test accuracy (61/62 = 98.4%) using a naive principal component classifier, significantly more accurate than a gene expression PCR assay in the same cohort. Therefore, our RNA splicing computational framework enables a promising avenue for host-response diagnosis of infection.


Subjects
COVID-19, Communicable Diseases, Humans, SARS-CoV-2/genetics, COVID-19/diagnosis, Alternative Splicing/genetics, COVID-19 Testing, RNA, Prospective Studies, Biomarkers/analysis
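The abstract above reports 61/62 = 98.4% accuracy from a "naive principal component classifier" but does not define it. One plausible reading, offered here purely as an assumption, is to project samples onto the first principal component and threshold the score at zero. The sketch below uses synthetic, well-separated data (not the study's measurements), and the function names are hypothetical.

```python
import numpy as np

def fit_pc1_classifier(X: np.ndarray, y: np.ndarray):
    """Find the first principal component of X and orient it so that
    class 1 samples receive higher scores than class 0 samples."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    pc1 = vt[0]
    scores = (X - mean) @ pc1
    if scores[y == 1].mean() < scores[y == 0].mean():
        pc1 = -pc1
    return mean, pc1

def predict(X: np.ndarray, mean: np.ndarray, pc1: np.ndarray) -> np.ndarray:
    """Label a sample 1 when its centered PC1 score is positive."""
    return ((X - mean) @ pc1 > 0).astype(int)

# Synthetic stand-in data: 62 samples, 10 features, two balanced classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 31)
X = rng.normal(size=(62, 10)) + np.where(y[:, None] == 1, 5.0, -5.0)

mean, pc1 = fit_pc1_classifier(X, y)
accuracy = float((predict(X, mean, pc1) == y).mean())  # training accuracy on toy data

# The figure quoted in the abstract, recomputed from its own counts.
reported = 61 / 62
print(f"toy accuracy: {accuracy:.3f}, reported: {reported:.1%}")
```

Note that the toy accuracy is measured on the same data used to fit the component, whereas the abstract's figure refers to test accuracy in its cohort; the two numbers are not comparable.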
11.
Mol Syst Biol ; 19(5): e11361, 2023 May 09.
Article in English | MEDLINE | ID: mdl-36919946

ABSTRACT

DNA methylation comprises a cumulative record of lifetime exposures superimposed on genetically determined markers. Little is known about methylation dynamics in humans following an acute perturbation, such as infection. We characterized the temporal trajectory of blood epigenetic remodeling in 133 participants in a prospective study of young adults before, during, and after asymptomatic and mildly symptomatic SARS-CoV-2 infection. The differential methylation caused by asymptomatic or mildly symptomatic infections was indistinguishable. While differential gene expression largely returned to baseline levels after the virus became undetectable, some differentially methylated sites persisted for months of follow-up, with a pattern resembling autoimmune or inflammatory disease. We leveraged these responses to construct methylation-based machine learning models that distinguished samples from pre-, during-, and postinfection time periods, and quantitatively predicted the time since infection. The clinical trajectory in the young adults and in a diverse cohort with more severe outcomes was predicted by the similarity of methylation before or early after SARS-CoV-2 infection to the model-defined postinfection state. Unlike the phenomenon of trained immunity, the postacute SARS-CoV-2 epigenetic landscape we identify is antiprotective.


Subjects
COVID-19, Young Adult, Humans, COVID-19/genetics, SARS-CoV-2/genetics, Prospective Studies, DNA Methylation/genetics, Post-Translational Protein Processing
12.
Sci Transl Med ; 15(684): eabq8476, 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36812347

ABSTRACT

Periodontal disease is more common in individuals with rheumatoid arthritis (RA) who have detectable anti-citrullinated protein antibodies (ACPAs), implicating oral mucosal inflammation in RA pathogenesis. Here, we performed paired analysis of human and bacterial transcriptomics in longitudinal blood samples from RA patients. We found that patients with RA and periodontal disease experienced repeated oral bacteremias associated with transcriptional signatures of ISG15+HLADRhi and CD48highS100A2pos monocytes, recently identified in inflamed RA synovia and blood of those with RA flares. The oral bacteria observed transiently in blood were broadly citrullinated in the mouth, and their in situ citrullinated epitopes were targeted by extensively somatically hypermutated ACPAs encoded by RA blood plasmablasts. Together, these results suggest that (i) periodontal disease results in repeated breaches of the oral mucosa that release citrullinated oral bacteria into circulation, which (ii) activate inflammatory monocyte subsets that are observed in inflamed RA synovia and blood of RA patients with flares and (iii) activate ACPA B cells, thereby promoting affinity maturation and epitope spreading to citrullinated human antigens.


Subjects
Rheumatoid Arthritis, Periodontal Diseases, Humans, Autoantibodies, Mouth Mucosa, Antibody Formation, Epitopes, Bacteria
14.
Nat Comput Sci ; 3(12): 1056-1066, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38177723

ABSTRACT

Finely tuned enzymatic pathways control cellular processes, and their dysregulation can lead to disease. Developing predictive and interpretable models for these pathways is challenging because of the complexity of the pathways and of the cellular and genomic contexts. Here we introduce Elektrum, a deep learning framework that addresses these challenges with data-driven and biophysically interpretable models for determining the kinetics of biochemical systems. First, it uses in vitro kinetic assays to rapidly hypothesize an ensemble of high-quality kinetically interpretable neural networks (KINNs) that predict reaction rates. It then employs a transfer learning step, where the KINNs are inserted as intermediary layers into deeper convolutional neural networks, fine-tuning the predictions for reaction-dependent in vivo outcomes. We apply Elektrum to predict CRISPR-Cas9 off-target editing probabilities and demonstrate that Elektrum achieves improved performance, regularizes neural network architectures and maintains physical interpretability.


Subjects
CRISPR-Cas Systems, Computer Neural Networks, CRISPR-Cas Systems/genetics, CRISPR-Cas Systems Guide RNA, Genomics, Machine Learning
15.
Cell Syst ; 13(12): 989-1001.e8, 2022 Dec 21.
Article in English | MEDLINE | ID: mdl-36549275

ABSTRACT

The identification of a COVID-19 host response signature in blood can increase the understanding of SARS-CoV-2 pathogenesis and improve diagnostic tools. Applying a multi-objective optimization framework to both massive public and new multi-omics data, we identified a COVID-19 signature regulated at both transcriptional and epigenetic levels. We validated the signature's robustness in multiple independent COVID-19 cohorts. Using public data from 8,630 subjects and 53 conditions, we demonstrated no cross-reactivity with other viral and bacterial infections, COVID-19 comorbidities, or confounders. In contrast, previously reported COVID-19 signatures were associated with significant cross-reactivity. The signature's interpretation, based on cell-type deconvolution and single-cell data analysis, revealed prominent yet complementary roles for plasmablasts and memory T cells. Although the signal from plasmablasts mediated COVID-19 detection, the signal from memory T cells controlled against cross-reactivity with other viral infections. This framework identified a robust, interpretable COVID-19 signature and is broadly applicable in other disease contexts. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
COVID-19, Virus Diseases, Humans, SARS-CoV-2
16.
Cell Syst ; 13(11): 924-931.e4, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36323307

ABSTRACT

Male sex is a major risk factor for SARS-CoV-2 infection severity. To understand the basis for this sex difference, we studied SARS-CoV-2 infection in a young adult cohort of United States Marine recruits. Among 2,641 male and 244 female unvaccinated and seronegative recruits studied longitudinally, SARS-CoV-2 infections occurred in 1,033 males and 137 females. We identified sex differences in symptoms, viral load, blood transcriptome, RNA splicing, and proteomic signatures. Females had higher pre-infection expression of antiviral interferon-stimulated gene (ISG) programs. Causal mediation analysis implicated ISG differences in number of symptoms, levels of ISGs, and differential splicing of CD45 lymphocyte phosphatase during infection. Our results indicate that the antiviral innate immunity set point causally contributes to sex differences in response to SARS-CoV-2 infection. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
COVID-19, Innate Immunity, Sex Characteristics, Female, Humans, Male, Young Adult, COVID-19/immunology, Interferons, Proteomics, SARS-CoV-2
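The cohort counts quoted in the abstract above (1,033 infections among 2,641 males and 137 among 244 females) can be turned into crude infection proportions with a few lines of arithmetic. This is a reader-side worked example, not an analysis from the paper: the proportions are unadjusted, and the variable names are illustrative.

```python
# Counts taken from the abstract: (infections, recruits studied)
cohort = {"male": (1033, 2641), "female": (137, 244)}

# Crude proportion infected in each group
attack_rate = {sex: cases / n for sex, (cases, n) in cohort.items()}

# Unadjusted ratio of the female to the male proportion
risk_ratio = attack_rate["female"] / attack_rate["male"]

for sex, rate in attack_rate.items():
    print(f"{sex}: {rate:.1%}")
print(f"female/male ratio: {risk_ratio:.2f}")
```

These crude figures describe how often infection occurred in each group; the abstract's conclusions concern symptoms, viral load, and molecular signatures during infection, which the counts alone do not address.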
17.
Epidemiology ; 33(6): 797-807, 2022 Nov 01.
Article in English | MEDLINE | ID: mdl-35944149

ABSTRACT

BACKGROUND: Marine recruits training at Parris Island experienced an unexpectedly high rate of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, despite preventive measures including a supervised, 2-week, pre-entry quarantine. We characterize SARS-CoV-2 transmission in this cohort. METHODS: Between May and November 2020, we monitored 2,469 unvaccinated, mostly male, Marine recruits prospectively during basic training. If participants tested negative for SARS-CoV-2 by quantitative polymerase chain reaction (qPCR) at the end of quarantine, they were transferred to the training site in segregated companies and underwent biweekly testing for 6 weeks. We assessed the effects of coronavirus disease 2019 (COVID-19) prevention measures on other respiratory infections with passive surveillance data, performed phylogenetic analysis, and modeled transmission dynamics and testing regimens. RESULTS: Preventive measures were associated with drastically lower rates of other respiratory illnesses. However, among the trainees, 1,107 (44.8%) tested SARS-CoV-2-positive, with either mild or no symptoms. Phylogenetic analysis of viral genomes from 580 participants revealed that all cases but one were linked to five independent introductions, each characterized by accumulation of mutations across and within companies, and similar viral isolates in individuals from the same company. Variation in company transmission rates (mean reproduction number R0, 5.5 [95% confidence interval (CI), 5.0, 6.1]) could be accounted for by multiple initial cases within a company and superspreader events. Simulations indicate that frequent rapid-report testing with case isolation may minimize outbreaks. CONCLUSIONS: Transmission of wild-type SARS-CoV-2 among Marine recruits was approximately twice that seen in the community. Insights from SARS-CoV-2 outbreak dynamics and mutations spread in a remote, congregate setting may inform effective mitigation strategies.


Subjects
COVID-19, Disease Outbreaks, Military Personnel, COVID-19/epidemiology, COVID-19/prevention & control, Disease Outbreaks/prevention & control, Female, Humans, Male, Military Personnel/statistics & numerical data, Phylogeny, SARS-CoV-2/genetics, SARS-CoV-2/isolation & purification, United States/epidemiology
18.
Nat Genet ; 54(7): 940-949, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35817977

ABSTRACT

Epigenomic profiling has enabled large-scale identification of regulatory elements, yet we still lack a systematic mapping from any sequence or variant to regulatory activities. We address this challenge with Sei, a framework for integrating human genetics data with sequence information to discover the regulatory basis of traits and diseases. Sei learns a vocabulary of regulatory activities, called sequence classes, using a deep learning model that predicts 21,907 chromatin profiles across >1,300 cell lines and tissues. Sequence classes provide a global classification and quantification of sequence and variant effects based on diverse regulatory activities, such as cell type-specific enhancer functions. These predictions are supported by tissue-specific expression, expression quantitative trait loci and evolutionary constraint data. Furthermore, sequence classes enable characterization of the tissue-specific, regulatory architecture of complex traits and generate mechanistic hypotheses for individual regulatory pathogenic mutations. We provide Sei as a resource to elucidate the regulatory basis of human health and disease.


Subjects
Quantitative Trait Loci, Nucleic Acid Regulatory Sequences, Chromatin/genetics, Epigenomics, Human Genetics, Humans, Quantitative Trait Loci/genetics, Nucleic Acid Regulatory Sequences/genetics
19.
Nucleic Acids Res ; 50(14): 8168-8192, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-35871289

ABSTRACT

Nucleocapsid protein (N-protein) is required for multiple steps in betacoronavirus replication. SARS-CoV-2 N-protein condenses with specific viral RNAs at particular temperatures, making it a powerful model for deciphering RNA sequence specificity in condensates. We identify two separate and distinct double-stranded RNA motifs (dsRNA stickers) that promote N-protein condensation. These dsRNA stickers are separately recognized by N-protein's two RNA binding domains (RBDs). RBD1 prefers structured RNA with sequences like the transcription-regulatory sequence (TRS). RBD2 prefers long stretches of dsRNA, independent of sequence. Thus, the two N-protein RBDs interact with distinct dsRNA stickers, and these interactions impart specific droplet physical properties that could support varied viral functions. Specifically, we find that addition of dsRNA lowers the condensation temperature dependent on RBD2 interactions and tunes translational repression. In contrast, RBD1 sites are sequences critical for sub-genomic (sg) RNA generation and promote gRNA compression. The density of RBD1 binding motifs in proximity to TRS-L/B sequences is associated with levels of sub-genomic RNA generation. The switch to packaging is likely mediated by RBD1 interactions, which generate particles that recapitulate the packaging unit of the virion. Thus, SARS-CoV-2 can achieve biochemical complexity, performing multiple functions in the same cytoplasm, with minimal protein components by utilizing multiple distinct RNA motifs that control N-protein interactions.


Subjects
Coronavirus Nucleocapsid Proteins, Double-Stranded RNA, SARS-CoV-2, Binding Sites, Coronavirus Nucleocapsid Proteins/chemistry, Phosphoproteins/chemistry, Double-Stranded RNA/genetics, Viral RNA/genetics, RNA-Binding Proteins/metabolism, SARS-CoV-2/genetics, Temperature
20.
Sci Adv ; 8(23): eabn4965, 2022 Jun 10.
Article in English | MEDLINE | ID: mdl-35675394

ABSTRACT

Kidney Precision Medicine Project (KPMP) is building a spatially specified human kidney tissue atlas in health and disease with single-cell resolution. Here, we describe the construction of an integrated reference map of cells, pathways, and genes using unaffected regions of nephrectomy tissues and undiseased human biopsies from 56 adult subjects. We use single-cell/nucleus transcriptomics, subsegmental laser microdissection transcriptomics and proteomics, near-single-cell proteomics, 3D and CODEX imaging, and spatial metabolomics to hierarchically identify genes, pathways, and cells. Integrated data from these different technologies coherently identify cell types/subtypes within different nephron segments and the interstitium. These profiles describe cell-level functional organization of the kidney following its physiological functions and link cell subtypes to genes, proteins, metabolites, and pathways. They further show that messenger RNA levels along the nephron are congruent with the subsegmental physiological activity. This reference atlas provides a framework for the classification of kidney disease when multiple molecular mechanisms underlie convergent clinical phenotypes.


Subjects
Kidney Diseases, Kidney, Humans, Kidney/pathology, Kidney Diseases/metabolism, Metabolomics/methods, Proteomics/methods, Transcriptome