1.
Nat Genet ; 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641644

ABSTRACT

Methylation quantitative trait loci (mQTLs) are essential for understanding the role of DNA methylation changes in genetic predisposition, yet they have not been fully characterized in East Asians (EAs). Here we identified mQTLs in whole blood from 3,523 Chinese individuals and replicated them in an additional 1,858 Chinese individuals from two cohorts. Over 9% of mQTLs displayed specificity to EAs, facilitating the fine-mapping of EA-specific genetic associations, as shown for variants associated with height. Trans-mQTL hotspots revealed biological pathways contributing to EA-specific genetic associations, including an ERG-mediated 233 trans-mCpG network, implicated in hematopoietic cell differentiation, which likely reflects binding efficiency modulation of the ERG protein complex. More than 90% of mQTLs were shared between different blood cell lineages, with a smaller fraction of lineage-specific mQTLs displaying preferential hypomethylation in the respective lineages. Our study provides new insights into the mQTL landscape across genetic ancestries and their downstream effects on cellular processes and diseases/traits.

2.
Protein Cell ; 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38482631

ABSTRACT

Epigenetic clocks are accurate predictors of human chronological age based on the analysis of DNA methylation at specific CpG sites. However, available DNA methylation (DNAm) age predictors are based on datasets with limited ethnic representation. Moreover, a systematic comparison between DNAm data and other omics datasets has not yet been performed. To address these knowledge gaps, we generated and analyzed DNA methylation datasets from two independent Chinese cohorts, revealing age-related DNAm changes. Additionally, a DNAm aging clock (iCAS-DNAmAge) and a group of DNAm-based multi-modal clocks for Chinese individuals were developed, with most of them demonstrating strong predictive capabilities for chronological age. The clocks were further employed to predict factors influencing aging rates. The DNAm aging clock derived from multi-modal aging features (compositeAge-DNAmAge) exhibited a close association with multi-omics changes, lifestyles, and disease status, underscoring its robust potential for precise biological age assessment. Our findings offer novel insights into the regulatory mechanism of age-related DNAm changes and extend the application of the DNAm clock for measuring biological age and aging pace, providing a basis for evaluating aging intervention strategies.

3.
PLoS Genet ; 20(1): e1011037, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38206971

ABSTRACT

Explicitly sharing individual-level data in genomics studies has many merits compared to sharing summary statistics, including stricter QC, common statistical analyses, relative identification and improved statistical power in GWAS, but it is hampered by privacy or ethical constraints. In this study, we developed encG-reg, a regression approach that can detect relatives of various degrees based on encrypted genomic data, which is immune to ethical constraints. The encryption properties of encG-reg are based on random matrix theory: the original genotype matrix is masked without sacrificing the precision of individual-level genotype data. We established a connection between the dimension of the random matrix that masks the genotype matrices and the required precision of a study for encrypted genotype data. encG-reg has false positive and false negative rates equivalent to sharing original individual-level data, and is computationally efficient when searching for relatives. We split the UK Biobank into its respective centers, and then encrypted the genotype data. We observed that the relatives estimated using encG-reg were as accurate as those estimated by KING, which is widely used software but requires original genotype data. In a more complex application, we launched a finely devised multi-center collaboration across 5 research institutes in China, covering 9 cohorts of 54,092 GWAS samples. encG-reg again identified true relatives existing across the cohorts, even with different ethnic backgrounds and genotypic qualities. Our study clearly demonstrates that encrypted genomic data can be used for data sharing without loss of information or data-sharing barriers.


Subjects
Genome-Wide Association Study; Privacy; Humans; Genome-Wide Association Study/methods; Genotype; Software; Genomics
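
The random-matrix masking described in the abstract of record 3 can be illustrated with a toy simulation. The sketch below is not the encG-reg implementation: it assumes standardized genotype vectors, a shared secret matrix with independent Gaussian entries of variance 1/K, and a plain dot-product relatedness estimate. All names, sizes, and the simulated data are hypothetical. It only shows that dot products between masked vectors approximate those between the original vectors, so related pairs can still be distinguished after masking.

```python
import random


def standardized_genotypes(rng, n_markers, freq=0.5):
    """Simulate one individual's allele counts (0/1/2) and standardize them."""
    scale = (2 * freq * (1 - freq)) ** 0.5
    return [
        ((rng.random() < freq) + (rng.random() < freq) - 2 * freq) / scale
        for _ in range(n_markers)
    ]


def mask(genotypes, secret):
    """Multiply an m-marker genotype vector by the m x k secret random matrix."""
    k = len(secret[0])
    return [sum(g * row[j] for g, row in zip(genotypes, secret)) for j in range(k)]


def relatedness(a, b, n_markers):
    """Dot product of two vectors, scaled by the number of markers."""
    return sum(x * y for x, y in zip(a, b)) / n_markers


rng = random.Random(0)
M, K = 400, 600  # markers and masked dimension (hypothetical toy sizes)

# Shared secret matrix S with entries ~ N(0, 1/K), so that E[S S^T] = I.
secret = [[rng.gauss(0.0, (1.0 / K) ** 0.5) for _ in range(K)] for _ in range(M)]

person = standardized_genotypes(rng, M)
duplicate = list(person)                    # same genotypes, e.g. a duplicated sample
stranger = standardized_genotypes(rng, M)   # independently simulated individual

plain_dup = relatedness(person, duplicate, M)
plain_unrel = relatedness(person, stranger, M)
masked_person = mask(person, secret)
masked_dup = relatedness(masked_person, mask(duplicate, secret), M)
masked_unrel = relatedness(masked_person, mask(stranger, secret), M)

print(f"duplicate pair: plain={plain_dup:.2f} masked={masked_dup:.2f}")
print(f"unrelated pair: plain={plain_unrel:.2f} masked={masked_unrel:.2f}")
```

In this toy setting the noise added by masking shrinks as K grows, which mirrors the abstract's statement that the dimension of the random matrix is tied to the required precision.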
4.
Nucleic Acids Res ; 52(D1): D871-D881, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37941154

ABSTRACT

Large-scale genome-wide association studies (GWAS) have provided profound insights into complex traits and diseases. Yet, deciphering the fine-scale molecular mechanisms of how genetic variants manifest to cause the phenotypes remains a daunting task. Here, we present COLOCdb (https://ngdc.cncb.ac.cn/colocdb), a comprehensive genetic colocalization database by integrating more than 3000 GWAS summary statistics and 13 types of xQTL to date. By employing two representative approaches for the colocalization analysis, COLOCdb deposits results from three key components: (i) GWAS-xQTL, pair-wise colocalization between GWAS loci and different types of xQTL, (ii) GWAS-GWAS, pair-wise colocalization between the trait-associated genetic loci from GWASs and (iii) xQTL-xQTL, pair-wise colocalization between the genetic loci associated with molecular phenotypes in xQTLs. These results together represent the most comprehensive colocalization analysis, which also greatly expands the list of shared variants with genetic pleiotropy. We expect that COLOCdb can serve as a unique and useful resource in advancing the discovery of new biological mechanisms and benefit future functional studies.


Subjects
Genome-Wide Association Study; Multifactorial Inheritance; Genome-Wide Association Study/methods; Multifactorial Inheritance/genetics; Quantitative Trait Loci; Phenotype; Genetic Pleiotropy; Polymorphism, Single Nucleotide
5.
Nucleic Acids Res ; 52(D1): D1072-D1081, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37870478

ABSTRACT

Annotating genetic variants to their target genes is of great importance in unraveling the causal variants and genetic mechanisms that underlie complex diseases. However, disease-associated genetic variants are often located in non-coding regions and manifest context-specific effects, making it challenging to accurately identify the target genes and regulatory mechanisms. Here, we present TargetGene (https://ngdc.cncb.ac.cn/targetgene/), a comprehensive database reporting target genes for human genetic variants from various aspects. Specifically, we collected a comprehensive catalog of multi-omics data at the single-cell and bulk levels and from various human tissues, cell types and developmental stages. To facilitate the identification of Single Nucleotide Polymorphism (SNP)-to-gene connections, we have implemented multiple analytical tools based on chromatin co-accessibility, 3D interaction, enhancer activities and quantitative trait loci, among others. We applied the pipeline to evaluate variants from nearly 1300 Genome-wide association studies (GWAS) and assembled a comprehensive atlas of multiscale regulation of genetic variants. TargetGene is equipped with user-friendly web interfaces that enable intuitive searching, navigation and browsing through the results. Overall, TargetGene provides a unique resource to empower researchers to study the regulatory mechanisms of genetic variants in complex human traits.


Subjects
Databases, Genetic; Genome-Wide Association Study; Quantitative Trait Loci; Humans; Chromatin/genetics; Genome-Wide Association Study/methods; Polymorphism, Single Nucleotide
6.
Nucleic Acids Res ; 52(D1): D972-D979, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37831083

ABSTRACT

Leveraging genetics insights to promote drug repurposing has become a promising and active strategy in pharmacology. Indeed, among the 50 drugs approved by the FDA in 2021, two-thirds have genetically supported evidence. In this regard, the increasing number of widely available genome-wide association study (GWAS) datasets has provided substantial opportunities for drug repurposing based on genetics discoveries. Here, we developed PharmGWAS, a comprehensive knowledgebase designed to identify candidate drugs through the integration of GWAS data. PharmGWAS focuses on novel connections between diseases and small-molecule compounds derived using a reverse relationship between the genetically-regulated expression signature and the drug-induced signature. Specifically, we collected and processed 1929 GWAS datasets across a diverse spectrum of diseases and 724 485 perturbation signatures pertaining to 33 609 molecular compounds. To obtain reliable and robust predictions for the reverse connections, we implemented six distinct connectivity methods. In the current version, PharmGWAS deposits a total of 740 227 genetically-informed disease-drug pairs derived from drug-perturbation signatures, presenting a valuable and comprehensive catalog. Further equipped with its user-friendly web design, PharmGWAS is expected to greatly aid the discovery of novel drugs, the exploration of drug combination therapies and the identification of drug resistance or side effects. PharmGWAS is available at https://ngdc.cncb.ac.cn/pharmgwas.


Subjects
Databases, Pharmaceutical; Drug Repositioning; Genome-Wide Association Study; Drug Repositioning/methods; Genome-Wide Association Study/methods
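
The "reverse relationship" between a disease expression signature and a drug-induced signature, mentioned in the abstract of record 6, can be sketched with a simple correlation-based score. The abstract does not name its six connectivity methods, so the negative Pearson correlation used here is only an assumed stand-in, and the gene labels and values are invented toy data rather than PharmGWAS output.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def reversal_score(disease_sig, drug_sig):
    """Negative correlation over shared genes; higher means stronger reversal."""
    shared = sorted(set(disease_sig) & set(drug_sig))
    return -pearson([disease_sig[g] for g in shared], [drug_sig[g] for g in shared])


# Hypothetical signatures: gene label -> expression change score.
disease = {"GENE_A": 2.1, "GENE_B": 1.4, "GENE_C": -1.8, "GENE_D": -0.9, "GENE_E": 0.3}
drug_reversing = {"GENE_A": -1.9, "GENE_B": -1.1, "GENE_C": 1.6, "GENE_D": 1.0, "GENE_E": -0.2}
drug_mimicking = {"GENE_A": 1.7, "GENE_B": 1.2, "GENE_C": -1.5, "GENE_D": -0.7, "GENE_E": 0.4}

score_reversing = reversal_score(disease, drug_reversing)
score_mimicking = reversal_score(disease, drug_mimicking)
print(f"reversing compound: {score_reversing:.2f}")
print(f"mimicking compound: {score_mimicking:.2f}")
```

A compound whose signature opposes the disease signature scores close to 1 and would be a repurposing candidate under this toy criterion, while a compound that mimics the disease signature scores close to -1.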
7.
J Adv Res ; 2023 Sep 16.
Article in English | MEDLINE | ID: mdl-37722560

ABSTRACT

INTRODUCTION: Atrial fibrillation (AF) is the most prevalent cardiac arrhythmia, and it significantly increases the risk of cardiovascular complications and morbidity, even with appropriate treatment. Tissue remodeling has been a significant topic, while its systematic transcriptional signature remains unclear in AF. OBJECTIVES: Our study aims to systematically investigate the molecular characteristics of AF at the cellular level. METHODS: We conducted single-nuclei RNA-sequencing (snRNA-seq) analysis using nuclei isolated from the left atrial appendage (LAA) of patients with AF and of individuals in sinus rhythm. Pathological staining was performed to validate the key findings of snRNA-seq. RESULTS: A total of 30 cell subtypes were identified among 80,592 nuclei. Within the LAA of AF, we observed a specific subtype of dedifferentiated cardiomyocytes (CMs) characterized by reduced expression of cardiac contractile proteins (TTN and TRDN) and heightened expression of extracellular-matrix related genes (COL1A2 and FBN1). Transcription factor prediction analysis revealed that gene expression patterns in dedifferentiated CMs were primarily regulated by CEBPG and GISLI. Additionally, we identified a distinct subtype of endothelial progenitor cells (EPCs) demonstrating elevated expression of PROM1 and KDR, a population decreased within the LAA of AF. Epicardial adipocytes disclosed a reduced release of the anti-inflammatory and anti-fibrotic factor PRG4, and an augmented secretion of VEGF signals targeting CMs. Additionally, we noted accumulation of M2-like macrophages and CD8+ T cells with a high pro-inflammatory score in the LAA of AF. Furthermore, the analysis of intercellular communication revealed specific pathways related to AF, such as inflammation, extracellular matrix, and vascular remodeling signals. CONCLUSIONS: This study has discovered the presence of dedifferentiated CMs, a decrease in endothelial progenitor cells, a shift in the secretion profile of adipocytes, and an amplified inflammatory response in AF. These findings could offer crucial insights for future research on AF and serve as valuable references for investigating novel therapeutic approaches for AF.

8.
Cancers (Basel) ; 15(16)2023 Aug 14.
Article in English | MEDLINE | ID: mdl-37627121

ABSTRACT

Immune checkpoint blockades (ICBs) have revolutionized cancer therapy by inducing durable clinical responses, but only a small percentage of patients can benefit from ICB treatments. Many studies have established various biomarkers to predict ICB responses. However, different biomarkers were found with diverse performances in practice, and a timely and unbiased assessment has yet to be conducted due to the complexity of ICB-related studies and trials. In this study, we manually curated 29 published datasets with matched transcriptome and clinical data from more than 1400 patients, and uniformly preprocessed these datasets for further analyses. In addition, we collected 39 sets of transcriptomic biomarkers, and based on the nature of the corresponding computational methods, we categorized them into the gene-set-like group (with the self-contained design and the competitive design, respectively) and the deconvolution-like group. Next, we investigated the correlations and patterns of these biomarkers and utilized a standardized workflow to systematically evaluate their performance in predicting ICB responses and survival statuses across different datasets, cancer types, antibodies, biopsy times, and combinatory treatments. In our benchmark, most biomarkers showed poor performance in terms of stability and robustness across different datasets. Two scores (TIDE and CYT) had a competitive performance for ICB response prediction, and two others (PASS-ON and EIGS_ssGSEA) showed the best association with clinical outcome. Finally, we developed ICB-Portal to host the datasets, biomarkers, and benchmark results and to implement the computational methods for researchers to test their custom biomarkers. Our work provided valuable resources and a one-stop solution to facilitate ICB-related research.

9.
PLoS Genet ; 19(7): e1010786, 2023 07.
Article in English | MEDLINE | ID: mdl-37459304

ABSTRACT

Human ear morphology, a complex anatomical structure represented by a multidimensional set of correlated and heritable phenotypes, has a poorly understood genetic architecture. In this study, we quantitatively assessed 136 ear morphology traits using deep learning analysis of digital face images in 14,921 individuals from five different cohorts in Europe, Asia, and Latin America. Through GWAS meta-analysis and C-GWASs, a recently introduced method to effectively combine GWASs of many traits, we identified 16 genetic loci involved in various ear phenotypes, eight of which have not been previously associated with human ear features. Our findings suggest that ear morphology shares genetic determinants with other surface ectoderm-derived traits such as facial variation, mono eyebrow, and male pattern baldness. Our results enhance the genetic understanding of human ear morphology and shed light on the shared genetic contributors of different surface ectoderm-derived phenotypes. Additionally, gene editing experiments in mice have demonstrated that knocking out the newly ear-associated gene (Intu) and a previously ear-associated gene (Tbx15) causes deviating mouse ear morphology.


Subjects
Genetic Loci; Genome-Wide Association Study; Humans; Male; Animals; Mice; Genome-Wide Association Study/methods; Phenotype; Asia; Polymorphism, Single Nucleotide/genetics
10.
iScience ; 26(5): 106646, 2023 May 19.
Article in English | MEDLINE | ID: mdl-37168554

ABSTRACT

Ischemia reperfusion injury (IRI), often related to surgical procedures, is one of the important causes of acute kidney injury (AKI). To decipher the dynamic process of AKI caused by IRI (with prolonged ischemia phase), we performed single-cell RNA sequencing (scRNA-seq) of a clinically relevant murine IRI model with different ischemic intervals. We discovered that Slc5a2hi proximal tubular cells were susceptible to AKI and highly expressed the neutral amino acid transporter gene Slc6a19, whose expression decreased dramatically over the time course. Using mass spectrometry-based metabolomic analysis, we detected that the level of the neutral amino acid isoleucine dropped in the plasma of AKI mice. The reduction of plasma isoleucine was also verified in patients with cardiac surgery-associated acute kidney injury (CSA-AKI). The findings advanced the understanding of the dynamic process of AKI and introduced the reduction of isoleucine as a potential biomarker for CSA-AKI.

11.
Am J Med Genet B Neuropsychiatr Genet ; 192(3-4): 62-70, 2023 04.
Article in English | MEDLINE | ID: mdl-36863698

ABSTRACT

Investigating functional, temporal, and cell-type expression features of mutations is important for understanding a complex disease. Here, we collected and analyzed common variants and de novo mutations (DNMs) in schizophrenia (SCZ). We collected 2,636 missense and loss-of-function (LoF) DNMs in 2,263 genes across 3,477 SCZ patients (SCZ-DNMs). We curated three gene lists: (a) SCZ-neuroGenes (159 genes), which are intolerant to LoF and missense DNMs and are neurologically important, (b) SCZ-moduleGenes (52 genes), which were derived from network analyses of SCZ-DNMs, and (c) SCZ-commonGenes (120 genes) from a recent GWAS as reference. To compare temporal gene expression, we used the BrainSpan dataset. We defined a fetal effect score (FES) to quantify the involvement of each gene in prenatal brain development. We further employed the specificity indexes (SIs) to evaluate cell-type expression specificity from single-cell expression data in cerebral cortices of humans and mice. Compared with SCZ-commonGenes, SCZ-neuroGenes and SCZ-moduleGenes were highly expressed in the prenatal stage, had higher FESs, and had higher SIs in fetal replicating cells and undifferentiated cell types. Our results suggested that gene expression patterns in specific cell types in early fetal stages might have impacts on the risk of SCZ during adulthood.


Subjects
Brain; Mutation; Schizophrenia; Schizophrenia/genetics; Schizophrenia/pathology; Schizophrenia/physiopathology; Brain/cytology; Brain/embryology; Brain/growth & development; Brain/pathology; Animals; Mice; Fetus/cytology; Fetus/embryology; Neurons/metabolism; Loss of Function Mutation; Mutation, Missense; Humans; Organ Specificity
12.
Nat Commun ; 14(1): 1694, 2023 03 27.
Article in English | MEDLINE | ID: mdl-36973285

ABSTRACT

N6-methyladenosine (m6A), one of the most prevalent mRNA modifications in eukaryotes, plays a critical role in modulating both biological and pathological processes. However, it is unknown whether mutant p53 neomorphic oncogenic functions exploit dysregulation of m6A epitranscriptomic networks. Here, we investigate Li-Fraumeni syndrome (LFS)-associated neoplastic transformation driven by mutant p53 in iPSC-derived astrocytes, the cell-of-origin of gliomas. We find that mutant p53 but not wild-type (WT) p53 physically interacts with SVIL to recruit the H3K4me3 methyltransferase MLL1 to activate the expression of m6A reader YTHDF2, culminating in an oncogenic phenotype. Aberrant YTHDF2 upregulation markedly hampers expression of multiple m6A-marked tumor-suppressing transcripts, including CDKN2B and SPOCK2, and induces oncogenic reprogramming. Mutant p53 neoplastic behaviors are significantly impaired by genetic depletion of YTHDF2 or by pharmacological inhibition using MLL1 complex inhibitors. Our study reveals how mutant p53 hijacks epigenetic and epitranscriptomic machinery to initiate gliomagenesis and suggests potential treatment strategies for LFS gliomas.


Subjects
Glioma; Li-Fraumeni Syndrome; Humans; Tumor Suppressor Protein p53/genetics; Tumor Suppressor Protein p53/metabolism; Li-Fraumeni Syndrome/genetics; Cell Transformation, Neoplastic/genetics; Glioma/genetics; Proteoglycans/metabolism
13.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36961310

ABSTRACT

Prediction of therapy response has been a major challenge in cancer precision medicine due to the extensive tumor heterogeneity. Recently, several deep learning methods have been developed to predict drug response by utilizing various omics data. Most of them train models by using the drug-response screening data generated from cell lines and then use these models to predict response in cancer patient data. In this study, we focus on and evaluate deep learning methods using transcriptome data for the long-standing question of personalized drug-response prediction. We developed an embedding-based approach for drug-response prediction and benchmarked similar methods for their performance. For all methods, we used pretreatment transcriptome data to train models and then conducted a comprehensive evaluation and comparison of the models using cross-panels, cross-datasets and target genes. We further validated the methods using three independent datasets assessing multiple compounds for their predictive capability of drug response, survival outcome and cell line status. As a result, the methods building on gene embeddings had an overall competitive performance with reduced overfitting when we applied evaluation parameters for model fitting as well as the correlation with clinical outcomes in the validation data. We further developed an ensemble model to combine the results from the three most competitive methods for an overall prediction. Finally, we developed DrVAEN (https://bioinfo.uth.edu/drvaen), a user-friendly and easy-accessible web-server that hosts all these methods for drug-response prediction and model comparison for broad use in cancer research, method evaluation and drug development.


Subjects
Benchmarking; Neoplasms; Humans; Neoplasms/drug therapy; Neoplasms/genetics; Precision Medicine/methods
14.
Hum Mol Genet ; 32(6): 998-1009, 2023 03 06.
Article in English | MEDLINE | ID: mdl-36282535

ABSTRACT

Multiple sclerosis (MS) is a complex dysimmune disorder of the central nervous system. Genome-wide association studies (GWAS) have identified 233 genetic variations associated with MS at the genome-wide significant level. Epigenetic studies have pinpointed differentially methylated CpG sites in MS patients. However, the interplay between genetic risk factors and epigenetic regulation remains elusive. Here, we employed a network model to integrate GWAS summary statistics of 14 802 MS cases and 26 703 controls with DNA methylation profiles from 140 MS cases and 139 controls and the human interactome. We identified differentially methylated genes by aggregating additive effects of differentially methylated CpG sites within promoter regions. We reconstructed a gene regulatory network (GRN) using literature-curated transcription factor knowledge. Colocalization of the MS GWAS and methylation quantitative trait loci (mQTL) was performed to assess the GRN. The resultant MS-associated GRN highlighted several single nucleotide polymorphisms with GWAS-mQTL colocalization: rs6032663, rs6065926 and rs2024568 of CD40 locus, rs9913597 of STAT3 locus, and rs887864 and rs741175 of CIITA locus. Moreover, synergistic mQTL and expression QTL signals were identified in CD40, suggesting gene expression alteration was likely induced by epigenetic changes. Web-based Cell-type Specific Enrichment Analysis of Genes (WebCSEA) indicated that the GRN was enriched in T follicular helper cells (P-value = 0.0016). Drug target enrichment analysis of annotations from the Therapeutic Target Database revealed the GRN was also enriched with drug target genes (P-value = 3.89 × 10⁻⁴), revealing repurposable candidates for MS treatment. These candidates included vorinostat (HDAC1 inhibitor) and sivelestat (ELANE inhibitor), which warrant further investigation.


Subjects
Epigenesis, Genetic; Multiple Sclerosis; Humans; Epigenesis, Genetic/genetics; Gene Regulatory Networks; Genome-Wide Association Study; Multiple Sclerosis/drug therapy; Multiple Sclerosis/genetics; DNA Methylation/genetics; Quantitative Trait Loci/genetics
15.
Mol Oncol ; 17(4): 564-581, 2023 04.
Article in English | MEDLINE | ID: mdl-36495164

ABSTRACT

The incidence of bladder cancer and patient survival vary greatly among different populations, but the influence of the associated molecular features and evolutionary processes on its clinical treatment and prognostication remains unknown. Here, we analyze the genomic architectures of 505 bladder cancer patients from Asian/Black/White populations. We identify a previously unknown association between AHNAK mutations and activity of the APOBEC-a mutational signature, the activity of which varied substantially across populations. All significantly mutated genes but only half of arm-level somatic copy number alterations (SCNAs) are enriched with clonal events, indicating large-scale SCNAs as rich sources of bladder cancer clonal diversities. The prevalence of TP53 and ATM clonal mutations as well as the associated burden of SCNAs is significantly higher in Whites/Blacks than in Asians. We identify a trans-ancestry prognostic subtype of bladder cancer characterized by enrichment of non-muscle-invasive patients and muscle-invasive patients with good prognosis, increased CREBBP/FGFR3/HRAS/NFE2L2 mutations, decreased intra-tumor heterogeneity and genome instability, and an activated tumor microenvironment.


Subjects
Urinary Bladder Neoplasms; Humans; Prognosis; Urinary Bladder Neoplasms/genetics; Urinary Bladder Neoplasms/pathology; Mutation/genetics; Genomic Instability; Genomics; Tumor Microenvironment
16.
Nucleic Acids Res ; 51(D1): D835-D844, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36243988

ABSTRACT

A broad range of complex phenotypes are related to dysfunctions in brain (hereafter referred to as brain-related traits), including various mental and behavioral disorders and diseases of the nervous system. These traits in general share overlapping symptoms, pathogenesis, and genetic components. Here, we present Brain Catalog (https://ngdc.cncb.ac.cn/braincatalog), a comprehensive database aiming to delineate the genetic components of more than 500 GWAS summary statistics datasets for brain-related traits from multiple aspects. First, Brain Catalog provides results of candidate causal variants, causal genes, and functional tissues and cell types for each trait identified by multiple methods using comprehensive annotation datasets (58 QTL datasets spanning 6 types of QTLs). Second, Brain Catalog estimates the SNP-based heritability, the partitioning heritability based on functional annotations, and genetic correlations among traits. Finally, through bidirectional Mendelian randomization analyses, Brain Catalog presents inference of risk factors that are likely causal to each trait. In conclusion, Brain Catalog presents a one-stop shop for the genetic components of brain-related traits, potentially serving as a valuable resource for worldwide researchers to advance the understanding of how GWAS signals may contribute to the biological etiology of brain-related traits.


Subjects
Brain; Databases, Genetic; Mental Disorders; Brain/physiopathology; Phenotype; Quantitative Trait Loci; Mental Disorders/genetics
17.
Genomics Proteomics Bioinformatics ; 21(2): 370-384, 2023 04.
Article in English | MEDLINE | ID: mdl-35470070

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is revolutionizing the study of complex and dynamic cellular mechanisms. However, cell type annotation remains a main challenge as it largely relies on a priori knowledge and manual curation, which is cumbersome and subjective. The increasing number of scRNA-seq datasets, as well as numerous published genetic studies, has motivated us to build a comprehensive human cell type reference atlas. Here, we present decoding Cell type Specificity (deCS), an automatic cell type annotation method augmented by a comprehensive collection of human cell type expression profiles and marker genes. We used deCS to annotate scRNA-seq data from various tissue types and systematically evaluated the annotation accuracy under different conditions, including reference panels, sequencing depth, and feature selection strategies. Our results demonstrate that expanding the references is critical for improving annotation accuracy. Compared to many existing state-of-the-art annotation tools, deCS significantly reduced computation time and increased accuracy. deCS can be integrated into the standard scRNA-seq analytical pipeline to enhance cell type annotation. Finally, we demonstrated the broad utility of deCS to identify trait-cell type associations in 51 human complex traits, providing deep insights into the cellular mechanisms underlying disease pathogenesis. All documents for deCS, including source code, user manual, demo data, and tutorials, are freely available at https://github.com/bsml320/deCS.


Subjects
Gene Expression Profiling; Single-Cell Analysis; Humans; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Software
19.
Genome Biol ; 23(1): 220, 2022 10 17.
Article in English | MEDLINE | ID: mdl-36253801

ABSTRACT

BACKGROUND: The rapid accumulation of single-cell RNA sequencing (scRNA-seq) data presents unique opportunities to decode the genetically mediated cell-type specificity in complex diseases. Here, we develop a new method, scGWAS, which effectively leverages scRNA-seq data to achieve two goals: (1) to infer the cell types in which the disease-associated genes manifest and (2) to construct cellular modules which imply disease-specific activation of different processes. RESULTS: scGWAS only utilizes the average gene expression for each cell type followed by virtual search processes to construct the null distributions of module scores, making it scalable to large scRNA-seq datasets. We demonstrated scGWAS in 40 genome-wide association studies (GWAS) datasets (average sample size N ≈ 154,000) using 18 scRNA-seq datasets from nine major human/mouse tissues (totaling 1.08 million cells) and identified 2533 trait and cell-type associations, each with significant modules for further investigation. The module genes were validated using disease or clinically annotated references from ClinVar, OMIM, and pLI variants. CONCLUSIONS: We showed that the trait-cell type associations identified by scGWAS, while generally constrained to trait-tissue associations, could recapitulate many well-studied relationships and also reveal novel relationships, providing insights into the unsolved trait-tissue associations. Moreover, in each specific cell type, the associations with different traits were often mediated by different sets of risk genes, implying disease-specific activation of driving processes. In summary, scGWAS is a powerful tool for exploring the genetic basis of complex diseases at the cell type level using single-cell expression data.


Subjects
Genome-Wide Association Study; Transcriptome; Animals; Humans; Mice; Phenotype; Single-Cell Analysis/methods
20.
Cells ; 11(14)2022 07 16.
Article in English | MEDLINE | ID: mdl-35883662

ABSTRACT

BACKGROUND: Genome-wide association studies have successfully identified variants associated with multiple conditions. However, generalizing discoveries across diverse populations remains challenging due to large variations in genetic composition. Methods that perform gene expression imputation have attempted to address the transferability of gene discoveries across populations, but with limited success. METHODS: Here, we introduce a pipeline that combines gene expression imputation with gene module discovery, including a dense gene module search and a gene set variation analysis, to address the transferability issue. Our method feeds association probabilities of imputed gene expression with a selected phenotype into tissue-specific gene-module discovery over protein interaction networks to create higher-level gene modules. RESULTS: We demonstrate our method's utility in three case-control studies of Alzheimer's disease (AD) for three different race/ethnic populations (Whites, African descent and Hispanics). We discovered 182 AD-associated genes from gene modules shared between these populations, highlighting new gene modules associated with AD. CONCLUSIONS: Our innovative framework has the potential to identify robust discoveries across populations based on gene modules, as demonstrated in AD.


Subjects
Alzheimer Disease; Gene Regulatory Networks; Alzheimer Disease/genetics; Alzheimer Disease/metabolism; Genome-Wide Association Study/methods; Genotype; Humans; Phenotype