Results 1 - 20 of 180
1.
PLoS Genet ; 20(1): e1011037, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38206971

ABSTRACT

Explicitly sharing individual-level data in genomics studies has many merits compared with sharing summary statistics, including stricter QC, common statistical analyses, relative identification and improved statistical power in GWAS, but it is hampered by privacy or ethical constraints. In this study, we developed encG-reg, a regression approach that can detect relatives of various degrees based on encrypted genomic data, which is free of ethical constraints. The encryption properties of encG-reg are based on random matrix theory: the original genotypic matrix is masked without sacrificing the precision of individual-level genotype data. We established a connection between the dimension of the random matrix that masks the genotype matrices and the required precision of a study for encrypted genotype data. encG-reg has false positive and false negative rates equivalent to sharing original individual-level data, and is computationally efficient when searching for relatives. We split the UK Biobank into its respective centers, and then encrypted the genotype data. We observed that the relatives estimated using encG-reg were as accurate as those estimated by KING, which is a widely used software package but requires original genotype data. In a more complex application, we launched a finely devised multi-center collaboration across 5 research institutes in China, covering 9 cohorts of 54,092 GWAS samples. encG-reg again identified true relatives across the cohorts, even with different ethnic backgrounds and genotypic qualities. Our study clearly demonstrates that encrypted genomic data can be used for data sharing without loss of information or data-sharing barriers.
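The masking idea in this abstract can be illustrated with a minimal sketch. It is not the encG-reg implementation: it assumes standardized genotype vectors, a shared random matrix `S` with N(0, 1/k) entries, and hypothetical helper names (`standardize`, `mask`, `relatedness`, `demo`). Under those assumptions the inner product of two masked vectors approximates the inner product of the originals, and the noise shrinks as the masking dimension `k` grows.

```python
import random


def standardize(genotypes, freqs):
    """Scale 0/1/2 allele counts to mean 0 and variance 1 per SNP."""
    return [(g - 2 * p) / (2 * p * (1 - p)) ** 0.5
            for g, p in zip(genotypes, freqs)]


def mask(x, S):
    """Project a length-m genotype vector through the m x k random matrix S."""
    k = len(S[0])
    return [sum(x[i] * S[i][j] for i in range(len(x))) for j in range(k)]


def relatedness(a, b, m):
    """Relatedness score from two masked vectors, scaled by the SNP count m."""
    return sum(u * v for u, v in zip(a, b)) / m


def demo(m=500, k=400, seed=0):
    """Simulated toy data only; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.5) for _ in range(m)]

    def draw_person():
        # Two independent allele draws per SNP give a 0/1/2 genotype.
        return [(rng.random() < p) + (rng.random() < p) for p in freqs]

    # Shared random matrix with N(0, 1/k) entries, so E[S S^T] = I.
    S = [[rng.gauss(0.0, k ** -0.5) for _ in range(k)] for _ in range(m)]

    person_a = standardize(draw_person(), freqs)
    person_b = standardize(draw_person(), freqs)

    masked_a_site1 = mask(person_a, S)  # sample held by cohort 1
    masked_a_site2 = mask(person_a, S)  # same person enrolled in cohort 2
    masked_b_site2 = mask(person_b, S)  # unrelated person in cohort 2

    same = relatedness(masked_a_site1, masked_a_site2, m)
    unrelated = relatedness(masked_a_site1, masked_b_site2, m)
    return same, unrelated


if __name__ == "__main__":
    print(demo())
```

In this toy setting the score for the duplicated sample should sit near 1 and the score for the unrelated pair near 0; only the masked vectors, never the genotypes, are compared.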


Subjects
Genome-Wide Association Study , Privacy , Humans , Genome-Wide Association Study/methods , Genotype , Software , Genomics
2.
Nucleic Acids Res ; 52(D1): D972-D979, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37831083

ABSTRACT

Leveraging genetic insights to promote drug repurposing has become a promising and active strategy in pharmacology. Indeed, among the 50 drugs approved by the FDA in 2021, two-thirds have genetically supported evidence. In this regard, the increasing number of widely available genome-wide association studies (GWAS) datasets has provided substantial opportunities for drug repurposing based on genetic discoveries. Here, we developed PharmGWAS, a comprehensive knowledgebase designed to identify candidate drugs through the integration of GWAS data. PharmGWAS focuses on novel connections between diseases and small-molecule compounds derived using a reverse relationship between the genetically regulated expression signature and the drug-induced signature. Specifically, we collected and processed 1929 GWAS datasets across a diverse spectrum of diseases and 724 485 perturbation signatures pertaining to 33 609 molecular compounds. To obtain reliable and robust predictions for the reverse connections, we implemented six distinct connectivity methods. In the current version, PharmGWAS deposits a total of 740 227 genetically informed disease-drug pairs derived from drug-perturbation signatures, presenting a valuable and comprehensive catalog. Further equipped with its user-friendly web design, PharmGWAS is expected to greatly aid the discovery of novel drugs, the exploration of drug combination therapies and the identification of drug resistance or side effects. PharmGWAS is available at https://ngdc.cncb.ac.cn/pharmgwas.
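The "reverse relationship" between a disease signature and a drug-induced signature can be sketched with a rank correlation. This is an illustrative assumption, not one of the six connectivity methods the abstract refers to (they are not named there); the gene labels and values below are hypothetical. A strongly negative score suggests the drug pushes expression in the opposite direction to the disease.

```python
def ranks(values):
    """Rank positions (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean = (n - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)  # identical for rx and ry without ties
    return cov / var


def connectivity(disease_sig, drug_sig):
    """Correlate the two signatures over the genes they share."""
    genes = sorted(set(disease_sig) & set(drug_sig))
    return spearman([disease_sig[g] for g in genes],
                    [drug_sig[g] for g in genes])


# Hypothetical signatures: gene -> expression change score.
disease = {"G1": 2.1, "G2": 1.3, "G3": -0.4, "G4": -1.8}
drug_reverser = {"G1": -1.9, "G2": -0.7, "G3": 0.2, "G4": 1.5}
drug_mimic = {"G1": 1.0, "G2": 0.5, "G3": -0.1, "G4": -2.0}

print(connectivity(disease, drug_reverser))  # negative: candidate reverser
print(connectivity(disease, drug_mimic))     # positive: mimics the disease
```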


Subjects
Databases, Pharmaceutical , Drug Repositioning , Genome-Wide Association Study , Drug Repositioning/methods , Genome-Wide Association Study/methods
3.
Nucleic Acids Res ; 52(D1): D871-D881, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37941154

ABSTRACT

Large-scale genome-wide association studies (GWAS) have provided profound insights into complex traits and diseases. Yet, deciphering the fine-scale molecular mechanisms of how genetic variants manifest to cause the phenotypes remains a daunting task. Here, we present COLOCdb (https://ngdc.cncb.ac.cn/colocdb), a comprehensive genetic colocalization database by integrating more than 3000 GWAS summary statistics and 13 types of xQTL to date. By employing two representative approaches for the colocalization analysis, COLOCdb deposits results from three key components: (i) GWAS-xQTL, pair-wise colocalization between GWAS loci and different types of xQTL, (ii) GWAS-GWAS, pair-wise colocalization between the trait-associated genetic loci from GWASs and (iii) xQTL-xQTL, pair-wise colocalization between the genetic loci associated with molecular phenotypes in xQTLs. These results together represent the most comprehensive colocalization analysis, which also greatly expands the list of shared variants with genetic pleiotropy. We expect that COLOCdb can serve as a unique and useful resource in advancing the discovery of new biological mechanisms and benefit future functional studies.


Subjects
Genome-Wide Association Study , Multifactorial Inheritance , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Quantitative Trait Loci , Phenotype , Genetic Pleiotropy , Polymorphism, Single Nucleotide
4.
Nucleic Acids Res ; 52(D1): D1072-D1081, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37870478

ABSTRACT

Annotating genetic variants to their target genes is of great importance in unraveling the causal variants and genetic mechanisms that underlie complex diseases. However, disease-associated genetic variants are often located in non-coding regions and manifest context-specific effects, making it challenging to accurately identify the target genes and regulatory mechanisms. Here, we present TargetGene (https://ngdc.cncb.ac.cn/targetgene/), a comprehensive database reporting target genes for human genetic variants from various aspects. Specifically, we collected a comprehensive catalog of multi-omics data at the single-cell and bulk levels and from various human tissues, cell types and developmental stages. To facilitate the identification of single nucleotide polymorphism (SNP)-to-gene connections, we have implemented multiple analytical tools based on chromatin co-accessibility, 3D interaction, enhancer activities and quantitative trait loci, among others. We applied the pipeline to evaluate variants from nearly 1300 genome-wide association studies (GWAS) and assembled a comprehensive atlas of multiscale regulation of genetic variants. TargetGene is equipped with user-friendly web interfaces that enable intuitive searching, navigation and browsing through the results. Overall, TargetGene provides a unique resource to empower researchers to study the regulatory mechanisms of genetic variants in complex human traits.


Subjects
Databases, Genetic , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Chromatin/genetics , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide
5.
PLoS Genet ; 19(7): e1010786, 2023 07.
Article in English | MEDLINE | ID: mdl-37459304

ABSTRACT

Human ear morphology, a complex anatomical structure represented by a multidimensional set of correlated and heritable phenotypes, has a poorly understood genetic architecture. In this study, we quantitatively assessed 136 ear morphology traits using deep learning analysis of digital face images in 14,921 individuals from five different cohorts in Europe, Asia, and Latin America. Through GWAS meta-analysis and C-GWASs, a recently introduced method to effectively combine GWASs of many traits, we identified 16 genetic loci involved in various ear phenotypes, eight of which have not been previously associated with human ear features. Our findings suggest that ear morphology shares genetic determinants with other surface ectoderm-derived traits such as facial variation, mono eyebrow, and male pattern baldness. Our results enhance the genetic understanding of human ear morphology and shed light on the shared genetic contributors of different surface ectoderm-derived phenotypes. Additionally, gene editing experiments in mice have demonstrated that knocking out the newly ear-associated gene (Intu) and a previously ear-associated gene (Tbx15) causes deviating mouse ear morphology.


Subjects
Genetic Loci , Genome-Wide Association Study , Humans , Male , Animals , Mice , Genome-Wide Association Study/methods , Phenotype , Asia , Polymorphism, Single Nucleotide/genetics
6.
Hum Mol Genet ; 32(6): 998-1009, 2023 03 06.
Article in English | MEDLINE | ID: mdl-36282535

ABSTRACT

Multiple sclerosis (MS) is a complex dysimmune disorder of the central nervous system. Genome-wide association studies (GWAS) have identified 233 genetic variations associated with MS at the genome-wide significant level. Epigenetic studies have pinpointed differentially methylated CpG sites in MS patients. However, the interplay between genetic risk factors and epigenetic regulation remains elusive. Here, we employed a network model to integrate GWAS summary statistics of 14 802 MS cases and 26 703 controls with DNA methylation profiles from 140 MS cases and 139 controls and the human interactome. We identified differentially methylated genes by aggregating additive effects of differentially methylated CpG sites within promoter regions. We reconstructed a gene regulatory network (GRN) using literature-curated transcription factor knowledge. Colocalization of the MS GWAS and methylation quantitative trait loci (mQTL) was performed to assess the GRN. The resultant MS-associated GRN highlighted several single nucleotide polymorphisms with GWAS-mQTL colocalization: rs6032663, rs6065926 and rs2024568 of the CD40 locus, rs9913597 of the STAT3 locus, and rs887864 and rs741175 of the CIITA locus. Moreover, synergistic mQTL and expression QTL signals were identified in CD40, suggesting that gene expression alteration was likely induced by epigenetic changes. Web-based Cell-type Specific Enrichment Analysis of Genes (WebCSEA) indicated that the GRN was enriched in T follicular helper cells (P-value = 0.0016). Drug target enrichment analysis of annotations from the Therapeutic Target Database revealed that the GRN was also enriched with drug target genes (P-value = 3.89 × 10⁻⁴), revealing repurposable candidates for MS treatment. These candidates included vorinostat (HDAC1 inhibitor) and sivelestat (ELANE inhibitor), which warrant further investigation.


Subjects
Epigenesis, Genetic , Multiple Sclerosis , Humans , Epigenesis, Genetic/genetics , Gene Regulatory Networks , Genome-Wide Association Study , Multiple Sclerosis/drug therapy , Multiple Sclerosis/genetics , DNA Methylation/genetics , Quantitative Trait Loci/genetics
7.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36961310

ABSTRACT

Prediction of therapy response has been a major challenge in cancer precision medicine due to extensive tumor heterogeneity. Recently, several deep learning methods have been developed to predict drug response by utilizing various omics data. Most of them train models by using the drug-response screening data generated from cell lines and then use these models to predict response in cancer patient data. In this study, we focus on and evaluate deep learning methods using transcriptome data for the long-standing question of personalized drug-response prediction. We developed an embedding-based approach for drug-response prediction and benchmarked similar methods for their performance. For all methods, we used pretreatment transcriptome data to train models and then conducted a comprehensive evaluation and comparison of the models using cross-panels, cross-datasets and target genes. We further validated the methods using three independent datasets assessing multiple compounds for their predictive capability of drug response, survival outcome and cell line status. As a result, the methods building on gene embeddings had an overall competitive performance with reduced overfitting when we applied evaluation parameters for model fitting as well as the correlation with clinical outcomes in the validation data. We further developed an ensemble model to combine the results from the three most competitive methods for an overall prediction. Finally, we developed DrVAEN (https://bioinfo.uth.edu/drvaen), a user-friendly and easily accessible web server that hosts all these methods for drug-response prediction and model comparison for broad use in cancer research, method evaluation and drug development.


Subjects
Benchmarking , Neoplasms , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Precision Medicine/methods
8.
Nucleic Acids Res ; 51(D1): D835-D844, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36243988

ABSTRACT

A broad range of complex phenotypes are related to dysfunctions in the brain (hereafter referred to as brain-related traits), including various mental and behavioral disorders and diseases of the nervous system. These traits in general share overlapping symptoms, pathogenesis, and genetic components. Here, we present Brain Catalog (https://ngdc.cncb.ac.cn/braincatalog), a comprehensive database aiming to delineate the genetic components of more than 500 GWAS summary statistics datasets for brain-related traits from multiple aspects. First, Brain Catalog provides results of candidate causal variants, causal genes, and functional tissues and cell types for each trait identified by multiple methods using comprehensive annotation datasets (58 QTL datasets spanning 6 types of QTLs). Second, Brain Catalog estimates the SNP-based heritability, the partitioned heritability based on functional annotations, and genetic correlations among traits. Finally, through bidirectional Mendelian randomization analyses, Brain Catalog presents inference of risk factors that are likely causal to each trait. In conclusion, Brain Catalog presents a one-stop shop for the genetic components of brain-related traits, potentially serving as a valuable resource for worldwide researchers to advance the understanding of how GWAS signals may contribute to the biological etiology of brain-related traits.


Subjects
Brain , Databases, Genetic , Mental Disorders , Brain/physiopathology , Phenotype , Quantitative Trait Loci , Mental Disorders/genetics
9.
Proc Natl Acad Sci U S A ; 119(16): e2117857119, 2022 04 19.
Article in English | MEDLINE | ID: mdl-35412907

ABSTRACT

The RB1 gene is frequently mutated in human cancers but its role in tumorigenesis remains incompletely defined. Using an induced pluripotent stem cell (iPSC) model of hereditary retinoblastoma (RB), we report that the spliceosome is an up-regulated target responding to oncogenic stress in RB1-mutant cells. By investigating transcriptomes and genome occupancies in RB iPSC-derived osteoblasts (OBs), we discover that both E2F3a, which mediates spliceosomal gene expression, and pRB, which antagonizes E2F3a, coregulate more than one-third of spliceosomal genes by cobinding to their promoters or enhancers. Pharmacological inhibition of the spliceosome in RB1-mutant cells leads to global intron retention, decreased cell proliferation, and impaired tumorigenesis. Tumor specimen studies and genome-wide TCGA (The Cancer Genome Atlas) expression profile analyses support the clinical relevance of pRB and E2F3a in modulating spliceosomal gene expression in multiple cancer types including osteosarcoma (OS). High levels of pRB/E2F3a-regulated spliceosomal genes are associated with poor OS patient survival. Collectively, these findings reveal an undiscovered connection between pRB, E2F3a, the spliceosome, and tumorigenesis, pointing to the spliceosomal machinery as a potentially widespread therapeutic vulnerability of pRB-deficient cancers.


Subjects
Bone Neoplasms , Carcinogenesis , E2F3 Transcription Factor , Gene Expression Regulation, Neoplastic , Induced Pluripotent Stem Cells , Osteosarcoma , Retinoblastoma Binding Proteins , Spliceosomes , Ubiquitin-Protein Ligases , Bone Neoplasms/genetics , Bone Neoplasms/pathology , Carcinogenesis/genetics , E2F3 Transcription Factor/genetics , E2F3 Transcription Factor/metabolism , Genes, Retinoblastoma , Humans , Induced Pluripotent Stem Cells/metabolism , Mutation , Osteosarcoma/genetics , Osteosarcoma/pathology , Retinal Neoplasms/genetics , Retinoblastoma/genetics , Retinoblastoma Binding Proteins/genetics , Retinoblastoma Binding Proteins/metabolism , Spliceosomes/genetics , Spliceosomes/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism
10.
Hum Mol Genet ; 31(19): 3341-3354, 2022 09 29.
Article in English | MEDLINE | ID: mdl-35640139

ABSTRACT

Genome-wide association studies (GWAS) have identified more than 75 genetic variants associated with Alzheimer's disease (AD). However, how these variants function and impact protein expression in brain regions remains elusive. Large-scale proteomic datasets of AD postmortem brain tissues have become available recently. In this study, we used these datasets to investigate brain region-specific molecular pathways underlying AD pathogenesis and explore their potential drug targets. We applied our new network-based tool, Edge-Weighted Dense Module Search of GWAS (EW_dmGWAS), to integrate AD GWAS statistics of 472 868 individuals with proteomic profiles from two brain regions from two large-scale AD cohorts [parahippocampal gyrus (PHG), sample size n = 190; dorsolateral prefrontal cortex (DLPFC), n = 192]. The resulting network modules were evaluated using a scale-free network index, followed by a cross-region consistency evaluation. Our EW_dmGWAS analyses prioritized 52 top module genes (TMGs) specific in PHG and 58 TMGs in DLPFC, of which four genes (CLU, PICALM, PRRC2A and NDUFS3) overlapped. Those four genes were significantly associated with AD (GWAS gene-level false discovery rate < 0.05). To explore the impact of these genetic components on TMGs, we further examined their differentially co-expressed genes at the proteomic level and compared them with investigational drug targets. We pinpointed three potential drug target genes, APP, SNCA and VCAM1, specifically in PHG. Gene set enrichment analyses of TMGs in PHG and DLPFC revealed region-specific biological processes, tissue-cell type signatures and enriched drug signatures, suggesting potential region-specific drug repurposing targets for AD.


Subjects
Alzheimer Disease , Alzheimer Disease/drug therapy , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Brain/metabolism , Drugs, Investigational/metabolism , Genome-Wide Association Study , Humans , Proteomics
11.
Genome Res ; 31(1): 146-158, 2021 01.
Article in English | MEDLINE | ID: mdl-33272935

ABSTRACT

As the most complex organ of the human body, the brain is composed of diverse regions, each consisting of distinct cell types and their respective cellular interactions. Human brain development involves a finely tuned cascade of interactive events. These include spatiotemporal gene expression changes and dynamic alterations in cell-type composition. However, our understanding of this process is still largely incomplete owing to the difficulty of brain spatiotemporal transcriptome collection. In this study, we developed a tensor-based approach to impute gene expression on a transcriptome-wide level. After rigorous computational benchmarking, we applied our approach to infer missing data points in the widely used BrainSpan resource and completed the entire grid of spatiotemporal transcriptomics. Next, we conducted deconvolutional analyses to comprehensively characterize major cell-type dynamics across the entire BrainSpan resource to estimate the cellular temporal changes and distinct neocortical areas across development. Moreover, integration of these results with GWAS summary statistics for 13 brain-associated traits revealed multiple novel trait-cell-type associations and trait-spatiotemporal relationships. In summary, our imputed BrainSpan transcriptomic data provide a valuable resource for the research community and our findings help further studies of the transcriptional and cellular dynamics of the human brain and related diseases.


Subjects
Brain Diseases , Brain , Gene Expression Profiling , Humans , Phenotype , Transcriptome
12.
Nucleic Acids Res ; 50(D1): D1164-D1171, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34634794

ABSTRACT

Drug response in many diseases varies dramatically owing to complex genomic and functional features and contexts. Cellular diversity of human tissues, especially tumors, is one of the major contributing factors to the different drug response in different samples. With the accumulation of single-cell RNA sequencing (scRNA-seq) data, it is now possible to study the drug response to different treatments at single-cell resolution. Here, we present CeDR Atlas (available at https://ngdc.cncb.ac.cn/cedr), a knowledgebase reporting computational inference of cellular drug response for hundreds of cell types from various tissues. We took advantage of the high-throughput profiling of drug-induced gene expression available through the Connectivity Map resource (CMap) as well as hundreds of scRNA-seq datasets covering cells from a wide variety of organs/tissues, diseases, and conditions. Currently, CeDR maintains the results for more than 582 single-cell data objects for human, mouse and cell lines, including about 140 phenotypes and 1250 tissue-cell combination types. All the results can be explored and searched by keywords for drugs, cell types, tissues, diseases, and signature genes. Overall, CeDR fine-maps drug response at cellular resolution and sheds light on the design of combinatorial treatments, drug resistance and even drug side effects.


Subjects
Biomarkers, Pharmacological , Databases, Factual , Neoplasms/drug therapy , Software , Animals , Gene Expression Profiling/classification , Humans , Knowledge Bases , Mice , Neoplasms/classification , RNA-Seq/classification , Single-Cell Analysis/classification , Exome Sequencing/classification
13.
Nucleic Acids Res ; 50(W1): W782-W790, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35610053

ABSTRACT

Human complex traits and common diseases show tissue- and cell-type specificity. Recently, single-cell RNA sequencing (scRNA-seq) technology has successfully depicted cellular heterogeneity in human tissue, providing an unprecedented opportunity to understand the context-specific expression of complex trait-associated genes in human tissue-cell types (TCs). Here, we present the first web-based application to quickly assess the cell-type specificity of genes, named Web-based Cell-type Specific Enrichment Analysis of Genes (WebCSEA, available at https://bioinfo.uth.edu/webcsea/). Specifically, we curated a total of 111 scRNA-seq panels of human tissues and 1,355 TCs from 61 different general tissues across 11 human organ systems. We adapted our previous decoding tissue-specificity (deTS) algorithm to measure the enrichment for each tissue-cell type (TC). To overcome the potential bias from the number of signature genes between different TCs, we further developed a permutation-based method that accurately estimates the TC-specificity of a given inquiry gene list. WebCSEA also provides an interactive heatmap that displays the cell-type specificity across 1,355 human TCs, and other interactive and static visualizations of cell-type specificity by human organ system, developmental stage, and top-ranked tissues and cell types. In short, WebCSEA is a one-click application that provides a comprehensive exploration of the TC-specificity of genes across the major human TC map.
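The general idea of a permutation-based specificity estimate can be sketched as follows. This is not the WebCSEA or deTS procedure, whose details the abstract does not give; it assumes a simple overlap statistic, and the function name `permutation_p` and the toy gene identifiers are hypothetical. Random gene lists of the same size as the query supply the null distribution, so the resulting p-value is calibrated per signature set rather than biased by how many signature genes the set contains.

```python
import random


def permutation_p(query, signature, background, n_perm=2000, seed=0):
    """Empirical p-value for the overlap between a query list and a signature set.

    The null distribution comes from random gene lists of the same size as
    the query, drawn without replacement from the background.
    """
    rng = random.Random(seed)
    signature = set(signature)
    observed = len(set(query) & signature)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(background, len(query))
        if len(signature.intersection(sample)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (1 + hits) / (1 + n_perm)


# Toy data: 1000 background genes, the first 50 form one cell-type signature.
background = [f"gene{i}" for i in range(1000)]
signature = background[:50]
enriched_query = background[:20]      # every query gene is a signature gene
unrelated_query = background[500:520]  # no query gene is a signature gene

p_enriched = permutation_p(enriched_query, signature, background)
p_unrelated = permutation_p(unrelated_query, signature, background)
print(p_enriched, p_unrelated)
```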


Subjects
Gene Expression Profiling , Single-Cell Analysis , Software , Humans , Algorithms , Gene Expression Profiling/methods , Internet , Multifactorial Inheritance , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
14.
BMC Musculoskelet Disord ; 25(1): 432, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38831438

ABSTRACT

BACKGROUND: Osteoporotic vertebral compression fractures (OVCF) in the elderly increase refracture risk post-surgery, leading to higher mortality rates. Genome-wide association studies (GWAS) have identified susceptibility genes for osteoporosis, but the phenotypic variance explained by these genes has been limited, indicating the need to explore additional causal factors. Epigenetic modifications, such as DNA methylation, may influence osteoporosis and refracture risk. However, prospective cohorts for assessing epigenetic alterations in Chinese elderly patients are lacking. Here, we propose to conduct a prospective cohort study to investigate the causal network of DNA polymorphisms, DNA methylation, and environmental factors on the development of osteoporosis and the risk of refracture. METHODS: We will collect vertebral and peripheral blood from 500 elderly OVCF patients undergoing surgery, extract DNA, and generate whole genome genotype data and DNA methylation data. Observation indicators will be collected and combined with one-year follow-up data. A healthy control group will be selected from a natural population cohort. Epigenome-wide association studies (EWAS) of osteoporosis and bone mineral density will be conducted. Differential methylation analysis will compare candidate gene methylation patterns in patients with and without refracture. Multi-omics prediction models using genetic variants and DNA methylation sites will be built to predict OVCF risk. DISCUSSION: This study will be the first large-scale population-based study of osteoporosis and bone mineral density phenotypes based on genome-wide data, multi-time point methylation data, and phenotype data. By analyzing methylation changes related to osteoporosis and bone mineral density in OVCF patients, the study will explore the feasibility of DNA methylation in evaluating postoperative osteoporosis intervention effects. The findings may identify new molecular markers for effective anti-osteoporosis treatment and inform individualized prevention and treatment strategies. TRIAL REGISTRATION: chictr.org.cn ChiCTR2200065316, 02/11/2022.


Subjects
DNA Methylation , Osteoporosis , Osteoporotic Fractures , Spinal Fractures , Humans , Prospective Studies , Aged , Female , Osteoporosis/genetics , Male , Osteoporotic Fractures/genetics , Spinal Fractures/genetics , Genome-Wide Association Study , Bone Density/genetics , Fractures, Compression/genetics , Middle Aged , Epigenesis, Genetic , Recurrence , Aged, 80 and over , China/epidemiology
15.
Am J Hum Genet ; 107(5): 849-863, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33031748

ABSTRACT

Variation in levels of the human metabolome reflects changes in homeostasis, providing a window into health and disease. The genetic impact on circulating metabolites in Hispanics, a population with high cardiometabolic disease burden, is largely unknown. We conducted genome-wide association analyses on 640 circulating metabolites in 3,926 Hispanic Community Health Study/Study of Latinos participants. The estimated heritability for the 640 metabolites ranged from 0% to 54%, with a median of 2.5%. We discovered 46 variant-metabolite pairs (p value < 1.2 × 10⁻¹⁰, minor allele frequency ≥ 1%, proportion of variance explained [PEV] mean = 3.4%, PEV range = 1%-22%) with generalized effects in two population-based studies and confirmed 301 known locus-metabolite associations. Half of the identified variants with generalized effect were located in genes, including five nonsynonymous variants. We identified co-localization with the expression quantitative trait loci at 105 discovered and 151 known locus-metabolite sets. rs5855544, upstream of SLC51A, was associated with higher levels of three steroid sulfates and co-localized with expression levels of SLC51A in several tissues. Mendelian randomization (MR) analysis identified several metabolites associated with coronary heart disease (CHD) and type 2 diabetes. For example, two variants located in or near CYP4F2 (rs2108622 and rs79400241, respectively), involved in vitamin E metabolism, were associated with the levels of octadecanedioate and vitamin E metabolites (gamma-CEHC and gamma-CEHC glucuronide); MR analysis showed that genetically high levels of these metabolites were associated with lower odds of CHD. Our findings document the genetic architecture of circulating metabolites in an underrepresented Hispanic/Latino community, shedding light on disease etiology.


Subjects
Coronary Disease/genetics , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Metabolome/genetics , Quantitative Trait Loci , Adult , Chromans/metabolism , Cohort Studies , Coronary Disease/diagnosis , Coronary Disease/ethnology , Coronary Disease/metabolism , Cytochrome P450 Family 4/genetics , Cytochrome P450 Family 4/metabolism , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/ethnology , Diabetes Mellitus, Type 2/metabolism , Female , Gene Expression , Genome-Wide Association Study , Hispanic or Latino , Humans , Male , Membrane Transport Proteins/genetics , Membrane Transport Proteins/metabolism , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , Propionates/metabolism , Public Health , Quantitative Trait, Heritable , Vitamin E/metabolism
16.
Development ; 147(24)2020 12 24.
Article in English | MEDLINE | ID: mdl-33234712

ABSTRACT

Craniofacial development is regulated through dynamic and complex mechanisms that involve various signaling cascades and gene regulations. Disruption of such regulations can result in craniofacial birth defects. Here, we propose the first developmental stage-specific network approach by integrating two crucial regulators, transcription factors (TFs) and microRNAs (miRNAs), to study their co-regulation during craniofacial development. Specifically, we used TFs, miRNAs and non-TF genes to form feed-forward loops (FFLs) using genomic data covering mouse embryonic days E10.5 to E14.5. We identified key novel regulators (TFs Foxm1, Hif1a, Zbtb16, Myog, Myod1 and Tcf7, and miRNAs miR-340-5p and miR-129-5p) and target genes (Col1a1, Sgms2 and Slc8a3) whose expression changed in a developmental stage-dependent manner. We found that the Wnt-FoxO-Hippo pathway (from E10.5 to E11.5), tissue remodeling (from E12.5 to E13.5) and miR-129-5p-mediated Col1a1 regulation (from E10.5 to E14.5) might play crucial roles in craniofacial development. Enrichment analyses further suggested their functions. Our experiments validated the regulatory roles of miR-340-5p and Foxm1 in the Wnt-FoxO-Hippo subnetwork, as well as the role of miR-129-5p in the miR-129-5p-Col1a1 subnetwork. Thus, our study helps to elucidate the comprehensive regulatory mechanisms of craniofacial development.
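A feed-forward loop of the kind described here can be sketched as a three-node pattern in a directed regulatory graph: one regulator targets the other, and both target the same non-TF gene. The enumeration below is a minimal illustration under that assumption, not the authors' pipeline; the function name `find_ffls` and the node names are hypothetical placeholders, not regulatory relationships reported in the study.

```python
def find_ffls(edges, tfs, mirnas, genes):
    """Enumerate feed-forward loops (A, B, C): A -> B, A -> C and B -> C.

    A and B are one TF and one miRNA (in either order); C is a non-TF gene.
    """
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)

    loops = []
    for a in sorted(tfs | mirnas):
        for b in sorted(targets.get(a, ())):
            mixed_pair = (a in tfs and b in mirnas) or (a in mirnas and b in tfs)
            if not mixed_pair:
                continue
            shared = targets.get(a, set()) & targets.get(b, set())
            for c in sorted(shared):
                if c in genes:
                    loops.append((a, b, c))
    return loops


# Hypothetical regulatory edges (regulator, target).
edges = [
    ("TF1", "miR-x"), ("TF1", "GeneA"), ("miR-x", "GeneA"),  # closes a loop
    ("TF1", "GeneB"), ("miR-y", "GeneB"),                    # no TF-miRNA edge
]
loops = find_ffls(edges, tfs={"TF1"}, mirnas={"miR-x", "miR-y"},
                  genes={"GeneA", "GeneB"})
print(loops)  # [('TF1', 'miR-x', 'GeneA')]
```

GeneB is regulated by both TF1 and miR-y, but without an edge between the two regulators it does not form a loop.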


Subjects
Facial Bones/growth & development , MicroRNAs/genetics , Skull/growth & development , Transcription Factors/genetics , Animals , Forkhead Box Protein M1/genetics , Gene Expression Regulation, Neoplastic/genetics , Hepatocyte Nuclear Factor 1-alpha/genetics , Humans , Hypoxia-Inducible Factor 1, alpha Subunit/genetics , Mice , MyoD Protein/genetics , Myogenin/genetics , Promyelocytic Leukemia Zinc Finger Protein/genetics , Transcription Factors/classification , Wnt Signaling Pathway/genetics
17.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32578842

ABSTRACT

DNA N4-methylcytosine (4mC) modification represents a novel form of epigenetic regulation. It is involved in various cellular processes, including DNA replication, cell cycle and gene expression, among others. In addition to experimental identification of 4mC sites, in silico prediction of 4mC sites in the genome has emerged as a promising alternative approach. In this study, we first reviewed the current progress in the computational prediction of 4mC sites and systematically evaluated the predictive capacity of eight conventional machine learning algorithms as well as 12 feature types commonly used in previous studies in six species. Using a representative benchmark dataset, we investigated the contribution of feature selection and the stacking approach to model construction, and found that feature optimization and proper reinforcement learning could improve the performance. We next collected newly added 4mC sites in the six species' genomes and developed a novel deep learning-based 4mC site predictor, namely Deep4mC. Deep4mC applies convolutional neural networks with four representative features. For species with small numbers of samples, we extended our deep learning framework with a bootstrapping method. Our evaluation indicated that Deep4mC could obtain high accuracy and robust performance, with average area under curve (AUC) values greater than 0.9 in all species (range: 0.9005-0.9722). Compared with previous tools, Deep4mC improved the AUC value by 10.14% to 46.21% in these six species. A user-friendly web server (https://bioinfo.uth.edu/Deep4mC) was built for predicting putative 4mC sites in a genome.
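
Since the evaluation above is reported as AUC, a short sketch of the metric itself may help. The function below computes AUC as the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name, labels and scores are hypothetical toy values chosen for illustration; this is not Deep4mC's evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels; scores: predicted scores, same length.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: two positive and two negative sites with made-up scores.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_auc)
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; a rank-based implementation would be the usual choice for large benchmark sets.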


Subjects
Computational Biology , DNA Methylation , Deep Learning , Epigenesis, Genetic , Sequence Analysis, DNA , Software
18.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34086851

ABSTRACT

Different spatiotemporal abnormalities have been implicated in different neuropsychiatric disorders and anthropometric social traits, yet an investigation of temporal network modularity using brain tissue transcriptomics has been lacking. We developed a supervised network approach to investigate genome-wide association study (GWAS) results in spatial and temporal contexts and demonstrated it in 20 brain disorders and anthropometric social traits. BrainSpan transcriptome profiles were used to discover significant modules enriched with trait susceptibility genes in a developmental stage-stratified manner. We investigated whether, and in which developmental stages, GWAS-implicated genes are coordinately expressed in the brain transcriptome. We identified significant network modules for each disorder and trait at different developmental stages, providing a systematic view of network modularity at specific developmental stages for a myriad of brain disorders and traits. Specifically, we observed a strong pattern of fetal origin for most psychiatric disorders and traits [such as schizophrenia (SCZ), bipolar disorder, obsessive-compulsive disorder and neuroticism], whereas increased co-expression activities of genes were more strongly associated with neurological diseases [such as Alzheimer's disease (AD) and amyotrophic lateral sclerosis] and anthropometric traits (such as college completion, education and subjective well-being) in postnatal brains. Further analyses revealed enriched cell types and functional features that corroborated prior knowledge of specific brain disorders, such as clathrin-mediated endocytosis in AD, myelin sheath in multiple sclerosis and regulation of synaptic plasticity in both college completion and education. Our study provides a landscape view of the spatiotemporal features in a myriad of brain-related disorders and traits.
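
The abstract does not state which statistic was used to call a module "enriched" with trait susceptibility genes. As a hedged illustration of one common choice, the sketch below computes a one-sided hypergeometric p-value for the overlap between a module and a trait gene set. The function name, parameter names and counts are assumptions made for this example and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_trait, n_module, n_overlap):
    """P(X >= n_overlap) for a module of n_module genes drawn from n_background
    genes, of which n_trait are trait susceptibility genes."""
    upper = min(n_trait, n_module)
    total = comb(n_background, n_module)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_trait, k) * comb(n_background - n_trait, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy counts: 10 background genes, 3 trait genes, a 3-gene module
# in which all 3 members are trait genes.
p_value = hypergeom_enrichment_p(10, 3, 3, 3)
print(p_value)
```

In a stage-stratified analysis, a test like this would be repeated for each module at each developmental stage, so the resulting p-values would need multiple-testing correction.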


Subjects
Biomarkers , Brain Diseases/etiology , Brain/metabolism , Computational Biology , Gene Expression Profiling , Quantitative Trait, Heritable , Transcriptome , Brain/pathology , Brain/physiopathology , Brain Diseases/metabolism , Brain Diseases/pathology , Brain Diseases/physiopathology , Computational Biology/methods , Disease Susceptibility , Gene Expression Profiling/methods , Gene Expression Regulation , Gene Ontology , Gene Regulatory Networks , Humans , Phenotype
19.
Nucleic Acids Res ; 49(W1): W131-W139, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34048560

ABSTRACT

More than 90% of the genetic variants identified from genome-wide association studies (GWAS) are located in non-coding regions of the human genome. Here, we present a user-friendly web server, DeepFun (https://bioinfo.uth.edu/deepfun/), to assess the functional activity of non-coding genetic variants. This new server is built on a convolutional neural network (CNN) framework that has been extensively evaluated. Specifically, we collected chromatin profiles from the ENCODE and Roadmap projects to construct the feature space, including 1548 DNase I accessibility, 1536 histone mark, and 4795 transcription factor binding profiles covering 225 tissues or cell types. With such comprehensive epigenomic annotations, DeepFun expands the functionality of existing non-coding variant prioritization tools to provide a more specific functional assessment of non-coding variants in a tissue- and cell type-specific manner. Using datasets from various GWAS, we conducted independent validations and demonstrated the functions of the DeepFun web server in predicting the effect of a non-coding variant in a specific tissue or cell type, as well as visualizing the potential motifs in the region around variants. We expect our server will be widely used in genetics, functional genomics, and disease studies.


Subjects
Genetic Variation , Software , Chromatin/metabolism , Computer Simulation , Deep Learning , Genome, Human , Genome-Wide Association Study , Histone Code , Humans , Mutagenesis , Organ Specificity , Transcription Factors/metabolism
20.
Nucleic Acids Res ; 49(D1): D552-D561, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137204

ABSTRACT

Mutations in kinases are abundant and critical to the study of signaling pathways and regulatory roles in human disease, especially cancer. Somatic mutations in kinase genes can affect both sensitivity and resistance to clinically used kinase inhibitors. Here, we present a newly constructed database, KinaseMD (kinase mutations and drug response), to structurally and functionally annotate kinase mutations. KinaseMD integrates 679 374 somatic mutations, 251 522 network-rewiring events, and 390 460 drug response records curated from various sources for 547 kinases. We uniquely annotate the mutations and kinase inhibitor response in four types of protein substructures (gatekeeper, A-loop, G-loop and αC-helix) that are linked to kinase inhibitor resistance in the literature. In addition, we annotate functional mutations that may rewire the kinase regulatory network and report four phosphorylation signals (gain, loss, up-regulation and down-regulation). Overall, KinaseMD provides up-to-date information on mutations, unique annotations of drug response, especially drug resistance, and functional sites of kinases. KinaseMD is accessible at https://bioinfo.uth.edu/kmd/ and supports searching, browsing and downloading data. To our knowledge, there has been no systematic annotation of these structural mutations linked to kinase inhibitor response. In summary, KinaseMD is a centralized database for kinase mutations and drug response.


Subjects
Databases, Genetic , Mutation/genetics , Phosphotransferases/genetics , Protein Kinase Inhibitors/pharmacology , Drug Resistance, Neoplasm/genetics , Molecular Sequence Annotation , Phosphorylation/drug effects , Phosphotransferases/chemistry , Protein Kinase Inhibitors/pharmacokinetics , User-Computer Interface