Results 1 - 20 of 81
1.
Am J Hum Genet ; 109(5): 838-856, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35460606

ABSTRACT

Isolating the causal genes from numerous genetic association signals in genome-wide association studies (GWASs) of complex phenotypes remains an open and challenging question. In the present study, we proposed a statistical approach, the effective-median-based Mendelian randomization (MR) framework, for inferring the causal genes of complex phenotypes with the GWAS summary statistics (named EMIC). The effective-median method solved the high false-positive issue in the existing MR methods due to either correlation among instrumental variables or noise in approximated linkage disequilibrium (LD). EMIC can further perform a pleiotropy fine-mapping analysis to remove possible false-positive estimates. With the usage of multiple cis-expression quantitative trait loci (eQTLs), EMIC was also more powerful than the alternative methods for the causal gene inference in the simulated datasets. Furthermore, EMIC rediscovered many known causal genes of complex phenotypes (schizophrenia, bipolar disorder, and total cholesterol) and reported many new and promising candidate causal genes. In sum, this study provided an efficient solution to discriminate the candidate causal genes from vast amounts of GWAS signals with eQTLs. EMIC has been implemented in our integrative software platform KGGSEE.


Subject(s)
Genome-Wide Association Study , Mendelian Randomization Analysis , Genome-Wide Association Study/methods , Humans , Linkage Disequilibrium , Mendelian Randomization Analysis/methods , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics
2.
EMBO Rep ; 24(7): e56212, 2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37154297

ABSTRACT

A previous genome-wide association study (GWAS) revealed an association of the noncoding SNP rs1663689 with susceptibility to lung cancer in the Chinese population. However, the underlying mechanism is unknown. In this study, using allele-specific 4C-seq in heterozygous lung cancer cells combined with epigenetic information from CRISPR/Cas9-edited cell lines, we show that the rs1663689 C/C variant represses the expression of ADGRG6, a gene located on a separate chromosome, through an interchromosomal interaction of the rs1663689 bearing region with the ADGRG6 promoter. This reduces downstream cAMP-PKA signaling and subsequently tumor growth both in vitro and in xenograft models. Using patient-derived organoids, we show that rs1663689 T/T-but not C/C-bearing lung tumors are sensitive to the PKA inhibitor H89, potentially informing therapeutic strategies. Our study identifies a genetic variant-mediated interchromosomal interaction underlying ADGRG6 regulation and suggests that targeting the cAMP-PKA signaling pathway may be beneficial in lung cancer patients bearing the homozygous risk genotype at rs1663689.


Subject(s)
Genome-Wide Association Study , Lung Neoplasms , Humans , Lung Neoplasms/genetics , Lung , Receptors, G-Protein-Coupled/genetics , Gene Expression Regulation
3.
Nucleic Acids Res ; 51(D1): D1122-D1128, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36330927

ABSTRACT

Deciphering the fine-scale molecular mechanisms that shape the genetic effects at disease-associated loci from genome-wide association studies (GWAS) remains challenging. The key avenue is to identify the essential molecular phenotypes that mediate the causal variant and disease under particular biological conditions. Therefore, integrating GWAS signals with context-specific quantitative trait loci (QTLs) (such as different tissue/cell types, disease states, and perturbations) from extensive molecular phenotypes would present important strategies for full understanding of disease genetics. Via persistent curation and systematic data processing of large-scale human molecular trait QTLs (xQTLs), we updated our previous QTLbase database (now QTLbase2, http://mulinlab.org/qtlbase) to comprehensively analyze and visualize context-specific QTLs across 22 molecular phenotypes and over 95 tissue/cell types. Overall, the resource features the following major updates and novel functions: (i) 960 more genome-wide QTL summary statistics from 146 independent studies; (ii) new data for 10 previously uncompiled QTL types; (iii) variant query scope expanded to fit 195 QTL datasets based on whole-genome sequencing; (iv) supports filtering and comparison of QTLs for different biological conditions, such as stimulation types and disease states; (v) a new linkage disequilibrium viewer to facilitate variant prioritization across tissue/cell types and QTL types.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Humans , Chromosome Mapping , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Catalogs as Topic
4.
Nucleic Acids Res ; 51(21): 11668-11687, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-37831098

ABSTRACT

Unscheduled R-loops are a major source of replication stress and DNA damage. R-loop-induced replication defects are sensed and suppressed by ATR kinase, whereas it is not known whether the R-loop itself is actively involved in ATR activation and, if so, how this is achieved. Here, we report that the nuclear form of RNA-editing enzyme ADAR1 promotes ATR activation and resolves genome-wide R-loops, a process that requires its double-stranded RNA-binding domains. Mechanistically, ADAR1 interacts with TOPBP1 and facilitates its loading on perturbed replication forks by enhancing the association of TOPBP1 with RAD9 of the 9-1-1 complex. When replication is inhibited, DNA-RNA hybrid competes with TOPBP1 for ADAR1 binding to promote the translocation of ADAR1 from damaged fork to accumulate at R-loop region. There, ADAR1 recruits RNA helicases DHX9 and DDX21 to unwind R-loops, simultaneously allowing TOPBP1 to stimulate ATR more efficiently. Collectively, we propose that the tempo-spatially regulated assembly of ADAR1-nucleated protein complexes links R-loop clearance and ATR activation, while R-loops crosstalk with blocked replication forks by transposing ADAR1 to fine-tune ATR activity and safeguard the genome.


Subject(s)
DNA-Binding Proteins , R-Loop Structures , Ataxia Telangiectasia Mutated Proteins/genetics , Ataxia Telangiectasia Mutated Proteins/metabolism , Cell Cycle Proteins/metabolism , DNA Replication , DNA-Binding Proteins/genetics , RNA/genetics , Humans , Animals , Mice
5.
J Allergy Clin Immunol ; 153(6): 1668-1680, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38191060

ABSTRACT

BACKGROUND: CLEC16A intron 19 has been identified as a candidate locus for common variable immunodeficiency (CVID). OBJECTIVES: This study sought to elucidate the molecular mechanism by which variants at the CLEC16A intronic locus may contribute to the pathogenesis of CVID. METHODS: The investigators performed fine-mapping of the CLEC16A locus in a CVID cohort, then deleted the candidate functional SNP in T-cell lines by the CRISPR-Cas9 technique and conducted RNA-sequencing to identify target gene(s). The interactions between the CLEC16A locus and its target genes were identified using circular chromosome conformation capture. The transcription factor complexes mediating the chromatin interactions were determined by proteomic approach. The molecular pathways regulated by the CLEC16A locus were examined by RNA-sequencing and reverse phase protein array. RESULTS: This study showed that the CLEC16A locus is an enhancer regulating expression of multiple target genes including a distant gene ATF7IP2 through chromatin interactions. Distinct transcription factor complexes mediate the chromatin interactions in an allele-specific manner. Disruption of the CLEC16A locus affects the AKT signaling pathway, as well as the molecular response of CD4+ T cells to immune stimulation. CONCLUSIONS: Through multiomics and targeted experimental approaches, this study elucidated the underlying target genes and signaling pathways involved in the genetic association of CLEC16A with CVID, and highlighted plausible molecular targets for developing novel therapeutics.


Subject(s)
Common Variable Immunodeficiency , Introns , Lectins, C-Type , Monosaccharide Transport Proteins , Humans , Lectins, C-Type/genetics , Introns/genetics , Monosaccharide Transport Proteins/genetics , Common Variable Immunodeficiency/genetics , Common Variable Immunodeficiency/immunology , Polymorphism, Single Nucleotide , Gene Expression Regulation , Female , Male , Signal Transduction/genetics , CD4-Positive T-Lymphocytes/immunology , Adult
6.
Brain Behav Immun ; 119: 767-780, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38677625

ABSTRACT

The co-occurrence and familial clustering of neurodevelopmental disorders and immune disorders suggest shared genetic risk factors. Based on genome-wide association summary statistics from five neurodevelopmental disorders and four immune disorders, we conducted genome-wide, local genetic correlation and polygenic overlap analysis. We further performed a cross-trait GWAS meta-analysis. Pleiotropic loci shared between the two categories of diseases were mapped to candidate genes using multiple algorithms and approaches. Significant genetic correlations were observed between neurodevelopmental disorders and immune disorders, including both positive and negative correlations. Neurodevelopmental disorders exhibited higher polygenicity compared to immune disorders. Around 50%-90% of genetic variants of the immune disorders were shared with neurodevelopmental disorders. The cross-trait meta-analysis revealed 154 genome-wide significant loci, including 8 novel pleiotropic loci. Significant associations were observed for 30 loci with both types of diseases. Pathway analysis on the candidate genes at these loci revealed common pathways shared by the two types of diseases, including neural signaling, inflammatory response, and PI3K-Akt signaling pathway. In addition, 26 of the 30 lead SNPs were associated with blood cell traits. Neurodevelopmental disorders exhibit complex polygenic architecture, with a subset of individuals being at a heightened genetic risk for both neurodevelopmental and immune disorders. The identification of pleiotropic loci has important implications for exploring opportunities for drug repurposing, enabling more accurate patient stratification, and advancing genomics-informed precision in the medical field of neurodevelopmental disorders.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Immune System Diseases , Multifactorial Inheritance , Neurodevelopmental Disorders , Polymorphism, Single Nucleotide , Humans , Neurodevelopmental Disorders/genetics , Immune System Diseases/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Multifactorial Inheritance/genetics
7.
Nucleic Acids Res ; 50(D1): D1123-D1130, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34669946

ABSTRACT

The development of transcriptome-wide association studies (TWAS) has enabled researchers to better identify and interpret causal genes in many diseases. However, there are currently no resources providing a comprehensive listing of gene-disease associations discovered by TWAS from published GWAS summary statistics. TWAS analyses are also difficult to conduct due to the complexity of TWAS software pipelines. To address these issues, we introduce a new resource called webTWAS, which integrates a database of the most comprehensive disease GWAS datasets currently available with credible sets of potential causal genes identified by multiple TWAS software packages. Specifically, a total of 235 064 gene-disease associations for a wide range of human diseases are prioritized from 1298 high-quality downloadable European GWAS summary statistics. Associations are calculated with seven different statistical models based on three popular and representative TWAS software packages. Users can explore associations at the gene or disease level, and easily search for related studies or diseases using the MeSH disease tree. Since the effects of diseases are highly tissue-specific, webTWAS applies tissue-specific enrichment analysis to identify significant tissues. A user-friendly web server is also available to run custom TWAS analyses on user-provided GWAS summary statistics data. webTWAS is freely available at http://www.webtwas.net.


Subject(s)
Databases, Genetic , Genetic Diseases, Inborn/classification , Genetic Predisposition to Disease , Transcriptome/genetics , Gene Expression Profiling , Genetic Association Studies , Genetic Diseases, Inborn/genetics , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Software
8.
Nucleic Acids Res ; 50(D1): D1408-D1416, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34570217

ABSTRACT

Interpreting the molecular mechanism of genomic variations and their causal relationship with diseases/traits are important and challenging problems in human genetic studies. To provide comprehensive and context-specific variant annotations for biologists and clinicians, here, by systematically integrating over 4TB genomic/epigenomic profiles and frequently-used annotation databases from various biological domains, we develop a variant annotation database, called VannoPortal. In general, the database has the following major features: (i) systematically integrates 40 genome-wide variant annotations and prediction scores regarding allele frequency, linkage disequilibrium, evolutionary signature, disease/trait association, tissue/cell type-specific epigenome, base-wise functional prediction, allelic imbalance and pathogenicity; (ii) is equipped with our recent novel index system and parallel random-sweep searching algorithms for efficient management of backend databases and information extraction; (iii) greatly expands context-dependent variant annotation to incorporate large-scale epigenomic maps and regulatory profiles (such as EpiMap) across over 33 tissue/cell types; (iv) compiles many genome-scale base-wise prediction scores for regulatory/pathogenic variant classification beyond protein-coding region; (v) enables fast retrieval and direct comparison of functional evidence among linked variants using highly interactive web panel in addition to plain table; (vi) introduces many visualization functions for more efficient identification and interpretation of functional variants in single web page. VannoPortal is freely available at http://mulinlab.org/vportal.


Subject(s)
Databases, Genetic , Genetic Diseases, Inborn/genetics , Genetic Variation/genetics , Molecular Sequence Annotation , Algorithms , Epigenome/genetics , Genetic Diseases, Inborn/classification , Genome, Human/genetics , Genome-Wide Association Study , Genotype , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Software
9.
Nucleic Acids Res ; 50(6): e34, 2022 04 08.
Article in English | MEDLINE | ID: mdl-34931221

ABSTRACT

Identifying rare variants that contribute to complex diseases is challenging because of the low statistical power in current tests comparing cases with controls. Here, we propose a novel and powerful rare variants association test based on the deviation of the observed mutation burden of a gene in cases from a baseline predicted by a weighted recursive truncated negative-binomial regression (RUNNER) on genomic features available from public data. Simulation studies show that RUNNER is substantially more powerful than state-of-the-art rare variant association tests and has reasonable type 1 error rates even for stratified populations or in small samples. Applied to real case-control data, RUNNER recapitulates known genes of Hirschsprung disease and Alzheimer's disease missed by current methods and detects promising new candidate genes for both disorders. In a case-only study, RUNNER successfully detected a known causal gene of amyotrophic lateral sclerosis. The present study provides a powerful and robust method to identify susceptibility genes with rare risk variants for complex diseases.


Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Models, Genetic , Software , Case-Control Studies , Computer Simulation , Humans , Mutation
10.
PLoS Genet ; 17(2): e1009363, 2021 02.
Article in English | MEDLINE | ID: mdl-33630843

ABSTRACT

Genome-wide association studies (GWASs) have identified multiple susceptibility loci for Alzheimer's disease (AD), which is characterized by early and progressive damage to the hippocampus. However, the association of hippocampal gene expression with AD and the underlying neurobiological pathways remain largely unknown. Based on the genomic and transcriptomic data of 111 hippocampal samples and the summary data of two large-scale meta-analyses of GWASs, a transcriptome-wide association study (TWAS) was performed to identify genes with significant associations between hippocampal expression and AD. We identified 54 significantly associated genes using an AD-GWAS meta-analysis of 455,258 individuals; 36 of the genes were confirmed in another AD-GWAS meta-analysis of 63,926 individuals. Fine-mapping models further prioritized 24 AD-related genes whose effects on AD were mediated by hippocampal expression, including APOE and two novel genes (PTPN9 and PCDHA4). These genes are functionally related to amyloid-beta formation, phosphorylation/dephosphorylation, neuronal apoptosis, neurogenesis and telomerase-related processes. By integrating the predicted hippocampal expression and neuroimaging data, we found that the hippocampal expression of QPCTL and ERCC2 showed significant difference between AD patients and cognitively normal elderly individuals as well as correlated with hippocampal volume. Mediation analysis further demonstrated that hippocampal volume mediated the effect of hippocampal gene expression (QPCTL and ERCC2) on AD. This study identifies two novel genes associated with AD by integrating hippocampal gene expression and genome-wide association data and reveals candidate hippocampus-mediated neurobiological pathways from gene expression to AD.


Subject(s)
Alzheimer Disease/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Hippocampus/metabolism , Polymorphism, Single Nucleotide , Transcriptome/genetics , Aged , Aged, 80 and over , Alzheimer Disease/diagnostic imaging , Female , Gene Regulatory Networks/genetics , Genomics/methods , Hippocampus/diagnostic imaging , Humans , Magnetic Resonance Imaging/methods , Male , Whole Genome Sequencing/methods
11.
Int J Cancer ; 153(1): 111-119, 2023 07 01.
Article in English | MEDLINE | ID: mdl-36840614

ABSTRACT

Enhancers are key regulatory elements that exert crucial roles in diverse biological processes, including tumorigenesis and cancer development. Active enhancers could produce transcripts termed enhancer RNAs (eRNAs), which could be used as an index of enhancer activity. Here, we present a versatile data portal, enhancer activity quantitative trait loci database (eaQTLdb; http://www.bioailab.com:3838/eaQTLdb), for exploring the effects of genetic variants on enhancer activity and prioritizing candidate variants across different cancer types. By leveraging the accumulated multiomics data, we systematically identified genetic variants which influence enhancer activity in different cancer types, termed eaQTLs. We have linked the eaQTLs to hallmarks of cancer and patients' overall survival to illustrate their potential biological roles in cancer development and progression. Notably, eaQTLs associated with the infiltration abundance of 24 different immune cell types were identified and incorporated into eaQTLdb. In addition, we applied colocalization analyses to examine 59 complex diseases and traits to identify eaQTLs colocalized with diseases/traits GWAS signals. Overall, eaQTLdb, incorporating a rich resource for exploration of eaQTLs in different cancer types, will not only benefit users in prioritizing candidate genetic variants and enhancers, but also help researchers decipher the roles of eaQTLs in the dysregulated pathways of cancer and tumor immune microenvironment, opening new diagnostic and therapeutic avenues in precision medicine.


Subject(s)
Neoplasms , Quantitative Trait Loci , Humans , Enhancer Elements, Genetic/genetics , RNA , Promoter Regions, Genetic , Neoplasms/genetics , Tumor Microenvironment
12.
Hum Genet ; 142(4): 507-522, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36917350

ABSTRACT

Age-related macular degeneration (AMD), cataract, and glaucoma are leading causes of blindness worldwide. Previous genome-wide association studies (GWASs) have revealed a variety of susceptibility loci associated with age-related ocular disorders, yet the genetic pleiotropy and causal genes across these diseases remain poorly understood. By leveraging large-scale genetic and observational data from ocular disease GWASs and UK Biobank (UKBB), we found significant pairwise genetic correlations and consistent epidemiological associations among these ocular disorders. Cross-disease meta-analysis uncovered seven pleiotropic loci, three of which were replicated in an additional cohort. Integration of variants in pleiotropic loci and multiple single-cell omics data identified that Müller cells and astrocytes were likely trait-related cell types underlying ocular comorbidity. In addition, we comprehensively integrated eye-specific expression quantitative trait loci (eQTLs), epigenomic profiling, and 3D genome data to prioritize causal pleiotropic genes. We found that pleiotropic genes were essential in nerve development and eye pigmentation, and targetable by aflibercept and pilocarpine for the treatment of AMD and glaucoma. These findings will not only facilitate the mechanistic research of ocular comorbidities but also benefit the therapeutic optimization of age-related ocular diseases.


Subject(s)
Glaucoma , Macular Degeneration , Humans , Genetic Pleiotropy , Genome-Wide Association Study , Genetic Predisposition to Disease , Macular Degeneration/genetics , Glaucoma/genetics , Polymorphism, Single Nucleotide
13.
Genome Res ; 30(12): 1789-1801, 2020 12.
Article in English | MEDLINE | ID: mdl-33060171

ABSTRACT

The advances of large-scale genomics studies have enabled compilation of cell type-specific, genome-wide DNA functional elements at high resolution. With the growing volume of functional annotation data and sequencing variants, existing variant annotation algorithms lack the efficiency and scalability to process big genomic data, particularly when annotating whole-genome sequencing variants against a huge database with billions of genomic features. Here, we develop VarNote to rapidly annotate genome-scale variants in large and complex functional annotation resources. Equipped with a novel index system and a parallel random-sweep searching algorithm, VarNote shows substantial performance improvements (two to three orders of magnitude) over existing algorithms at different scales. It supports both region-based and allele-specific annotations and introduces advanced functions for the flexible extraction of annotations. By integrating massive base-wise and context-dependent annotations in the VarNote framework, we introduce three efficient and accurate pipelines to prioritize the causal regulatory variants for common diseases, Mendelian disorders, and cancers.


Subject(s)
Computational Biology/methods , Genetic Predisposition to Disease/genetics , Algorithms , Databases, Genetic , Genetic Variation , Genome, Human , Humans , Molecular Sequence Annotation , Whole Genome Sequencing
14.
Mol Psychiatry ; 27(11): 4432-4445, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36195640

ABSTRACT

Human hippocampal volume has been separately associated with single nucleotide polymorphisms (SNPs), DNA methylation and gene expression, but their causal relationships remain largely unknown. Here, we aimed at identifying the causal relationships of SNPs, DNA methylation, and gene expression that are associated with hippocampal volume by integrating cross-omics analyses with genome editing, overexpression and causality inference. Based on structural neuroimaging data and blood-derived genome, transcriptome and methylome data, we prioritized a possibly causal association across multiple molecular phenotypes: rs1053218 mutation leads to cg26741686 hypermethylation, which in turn leads to overactivation of the associated ANKRD37 gene expression in blood, a gene involved in hypoxia, which may result in the reduction of human hippocampal volume. The possibly causal relationships from rs1053218 to cg26741686 methylation to ANKRD37 expression obtained from peripheral blood were replicated in human hippocampal tissue. To confirm causality, we performed CRISPR-based genome and epigenome-editing of rs1053218 homologous alleles and cg26741686 methylation in mouse neural stem cell differentiation models, and overexpressed ANKRD37 in mouse hippocampus. These in-vitro and in-vivo experiments confirmed that rs1053218 mutation caused cg26741686 hypermethylation and ANKRD37 overexpression, and cg26741686 hypermethylation favored ANKRD37 overexpression, and ANKRD37 overexpression reduced hippocampal volume. The pairwise relationships of rs1053218 with hippocampal volume, rs1053218 with cg26741686 methylation, cg26741686 methylation with ANKRD37 expression, and ANKRD37 expression with hippocampal volume could be replicated in an independent healthy young (n = 443) dataset and observed in elderly people (n = 194), and were more significant in patients with late-onset Alzheimer's disease (n = 76). This study revealed a novel causal molecular association mechanism of ANKRD37 with human hippocampal volume, which may facilitate the design of prevention and treatment strategies for hippocampal impairment.


Subject(s)
DNA Methylation , Hippocampus , Aged , Animals , Humans , Mice , Alleles , Alzheimer Disease/genetics , DNA Methylation/genetics , Epigenome , Hippocampus/metabolism , Polymorphism, Single Nucleotide/genetics
15.
Brief Bioinform ; 21(6): 1886-1903, 2020 12 01.
Article in English | MEDLINE | ID: mdl-31750520

ABSTRACT

In clinical cancer treatment, genomic alterations often affect the response of patients to anticancer drugs. Studies have shown that molecular features of tumors could be biomarkers predictive of sensitivity or resistance to anticancer agents, but the identification of actionable mutations is often constrained by the incomplete understanding of cancer genomes. Recent progress in next-generation sequencing technology greatly facilitates the extensive molecular characterization of tumors and promotes precision medicine in cancers. More and more clinical studies, cancer cell line studies, CRISPR screening studies as well as patient-derived model studies have been performed to identify potential actionable mutations predictive of drug response, which provide rich resources of molecularly and pharmacologically profiled cancer samples at different levels. Such abundance of data also enables the development of various computational models and algorithms to solve the problems of drug sensitivity prediction, biomarker identification and in silico drug prioritization by the integration of multiomics data. Here, we review the recent development of methods and resources that identify mutation-dependent effects for cancer treatment in clinical studies, functional genomics studies and computational studies and discuss the remaining gaps and future directions in this area.


Subject(s)
Antineoplastic Agents , High-Throughput Nucleotide Sequencing , Neoplasms , Precision Medicine , Antineoplastic Agents/therapeutic use , Genomics , Humans , Molecular Targeted Therapy , Mutation , Neoplasms/genetics , Neoplasms/therapy , Precision Medicine/methods
16.
Bioinformatics ; 37(13): 1915-1917, 2021 07 27.
Article in English | MEDLINE | ID: mdl-33270826

ABSTRACT

SUMMARY: Sampling of control variants having matched properties with input variants is widely used in enrichment analysis of genome-wide association studies/quantitative trait loci and negative data construction for pathogenic/regulatory variant prediction methods. Spurious enrichment results because of confounding factors, such as minor allele frequency and linkage disequilibrium pattern, can be avoided by calibration of statistical significance based on matched controls. Here, we presented vSampler which can generate sets of randomly drawn variants with comprehensive choices of matching properties, such as tissue/cell type-specific epigenomic features. Importantly, the development of a novel data structure and sampling algorithms for vSampler makes it significantly faster than existing tools. AVAILABILITY AND IMPLEMENTATION: vSampler web server and local program are available at http://mulinlab.org/vsampler. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study , Software , Humans , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics
17.
Nucleic Acids Res ; 48(12): 6563-6582, 2020 07 09.
Article in English | MEDLINE | ID: mdl-32459350

ABSTRACT

Functional crosstalk between histone modifications and chromatin remodeling has emerged as a key regulatory mode of transcriptional control during cell fate decisions, but the underlying mechanisms are not fully understood. Here we discover an HRP2-DPF3a-BAF epigenetic pathway that coordinates methylated histone H3 lysine 36 (H3K36me) and ATP-dependent chromatin remodeling to regulate chromatin dynamics and gene transcription during myogenic differentiation. Using siRNA screening targeting epigenetic modifiers, we identify hepatoma-derived growth factor-related protein 2 (HRP2) as a key regulator of myogenesis. Knockout of HRP2 in mice leads to impaired muscle regeneration. Mechanistically, through its HIV integrase binding domain (IBD), HRP2 associates with the BRG1/BRM-associated factor (BAF) chromatin remodeling complex by interacting directly with the BAF45c (DPF3a) subunit. Through its Pro-Trp-Trp-Pro (PWWP) domain, HRP2 preferentially binds to H3K36me2. Consistent with the biochemical studies, ChIP-seq analyses show that HRP2 colocalizes with DPF3a across the genome and that the recruitment of HRP2/DPF3a to chromatin is dependent on H3K36me2. Integrative transcriptomic and cistromic analyses, coupled with ATAC-seq, reveal that HRP2 and DPF3a activate myogenic genes by increasing chromatin accessibility through recruitment of BRG1, the ATPase subunit of the BAF complex. Taken together, these results illuminate a key role for the HRP2-DPF3a-BAF complex in the epigenetic coordination of gene transcription during myogenic differentiation.


Subject(s)
Cell Cycle Proteins/metabolism , Chromatin Assembly and Disassembly , DNA-Binding Proteins/metabolism , Histone Code , Myoblasts/metabolism , Transcription Factors/metabolism , Animals , Binding Sites , Cell Cycle Proteins/genetics , Cell Differentiation , DNA-Binding Proteins/genetics , HEK293 Cells , Humans , Male , Mice , Muscle Development , Myoblasts/cytology , Protein Binding , Transcription Factors/genetics
18.
Nucleic Acids Res ; 48(D1): D983-D991, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31598699

ABSTRACT

Recent advances in genome sequencing and functional genomic profiling have promoted many large-scale quantitative trait locus (QTL) studies, which connect genotypes with tissue/cell type-specific cellular functions from transcriptional to post-translational level. However, no comprehensive resource can perform QTL lookup across multiple molecular phenotypes and investigate the potential cascade effect of functional variants. We developed a versatile resource, named QTLbase, for interpreting the possible molecular functions of genetic variants, as well as their tissue/cell-type specificity. Overall, QTLbase has five key functions: (i) curating and compiling genome-wide QTL summary statistics for 13 human molecular traits from 233 independent studies; (ii) mapping QTL-relevant tissue/cell types to 78 unified terms according to a standard anatomogram; (iii) normalizing variant and trait information uniformly, yielding >170 million significant QTLs; (iv) providing a rich web client that enables phenome- and tissue-wise visualization; and (v) integrating the most comprehensive genomic features and functional predictions to annotate the potential QTL mechanisms. QTLbase provides a one-stop shop for QTL retrieval and comparison across multiple tissues and multiple layers of molecular complexity, and will greatly help researchers interrogate the biological mechanism of causal variants and guide the direction of functional validation. QTLbase is freely available at http://mulinlab.org/qtlbase.


Subject(s)
Databases, Genetic , Genome-Wide Association Study , Genomics , Genotype , Phenotype , Quantitative Trait Loci , Quantitative Trait, Heritable , Computational Biology/methods , Genomics/methods , Humans , Software , Web Browser
19.
Nucleic Acids Res ; 48(D1): D807-D816, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691819

ABSTRACT

Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow visualization of credible sets in a single web page more efficiently; (v) enables online comparison of causal relations on variant-, gene- and trait-levels among studies with different sample sizes or populations and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.


Subject(s)
Chromosome Mapping , Databases, Genetic , Disease/genetics , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Linkage Disequilibrium , Quantitative Trait Loci
20.
Mol Psychiatry ; 25(3): 517-529, 2020 03.
Article in English | MEDLINE | ID: mdl-31827248

ABSTRACT

The Chinese Imaging Genetics (CHIMGEN) study establishes the largest Chinese neuroimaging genetics cohort and aims to identify genetic and environmental factors and their interactions that are associated with neuroimaging and behavioral phenotypes. This study prospectively collected genomic, neuroimaging, environmental, and behavioral data from more than 7000 healthy Chinese Han participants aged 18-30 years. As a pioneer of large-sample neuroimaging genetics cohorts of non-Caucasian populations, this cohort can provide new insights into ethnic differences in genetic-neuroimaging associations by being compared with Caucasian cohorts. In addition to micro-environmental measurements, this study also collects hundreds of quantitative macro-environmental measurements from remote sensing and national survey databases based on the locations of each participant from birth to present, which will facilitate discoveries of new environmental factors associated with neuroimaging phenotypes. With lifespan environmental measurements, this study can also provide insights on the macro-environmental exposures that affect the human brain as well as their timing and mechanisms of action.


Subject(s)
Asian People/genetics , Brain/diagnostic imaging , Brain/physiology , Adult , Brain/metabolism , China , Cohort Studies , Ethnicity/genetics , Female , Genomics/methods , Healthy Volunteers , Humans , Male , Neuroimaging/methods , Prospective Studies , Research