Search | BVS Research Portal

Prioritization of regulatory variants with tissue-specific function in the non-coding regions of human genome.

Dong, Shengcheng; Boyle, Alan P.

Nucleic Acids Res ; 50(1): e6, 2022 01 11.

Article in English | MEDLINE | ID: mdl-34648033

ABSTRACT

Understanding the functional consequences of genetic variation in the non-coding regions of the human genome remains a challenge. We introduce here a computational tool, TURF, to prioritize regulatory variants with tissue-specific function by leveraging evidence from functional genomics experiments, including over 3000 functional genomics datasets from the ENCODE project provided in the RegulomeDB database. TURF is able to generate prediction scores at both organism and tissue/organ-specific levels for any non-coding variant in the genome. We show that TURF achieves top overall prediction performance on validated variants from MPRA experiments. We also demonstrate how TURF can pick out the regulatory variants with tissue-specific function from a candidate list from association studies. Furthermore, we found that various GWAS traits showed enrichment of regulatory variants predicted by TURF scores in the trait-relevant organs, which indicates that these variants can be a valuable source for future studies.

Subjects

Genome, Human; Genomics/methods; Software; Cell Line; Data Analysis; Humans

Predicting the effects of SNPs on transcription factor binding affinity.

Nishizaki, Sierra S; Ng, Natalie; Dong, Shengcheng; Porter, Robert S; Morterud, Cody; Williams, Colten; Asman, Courtney; Switzenberg, Jessica A; Boyle, Alan P.

Bioinformatics ; 36(2): 364-372, 2020 01 15.

Article in English | MEDLINE | ID: mdl-31373606

ABSTRACT

MOTIVATION: Genome-wide association studies have revealed that 88% of disease-associated single-nucleotide polymorphisms (SNPs) reside in noncoding regions. However, noncoding SNPs remain understudied, partly because they are challenging to prioritize for experimental validation. To address this deficiency, we developed the SNP effect matrix pipeline (SEMpl). RESULTS: SEMpl estimates transcription factor-binding affinity by observing differences in chromatin immunoprecipitation followed by deep sequencing signal intensity for SNPs within functional transcription factor-binding sites (TFBSs) genome-wide. By cataloging the effects of every possible mutation within the TFBS motif, SEMpl can predict the consequences of SNPs to transcription factor binding. This knowledge can be used to identify potential disease-causing regulatory loci. AVAILABILITY AND IMPLEMENTATION: SEMpl is available from https://github.com/Boyle-Lab/SEM_CPP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genome-Wide Association Study; Polymorphism, Single Nucleotide; Binding Sites; Chromatin Immunoprecipitation; Protein Binding; Transcription Factors

Predicting functional variants in enhancer and promoter elements using RegulomeDB.

Dong, Shengcheng; Boyle, Alan P.

Hum Mutat ; 40(9): 1292-1298, 2019 09.

Article in English | MEDLINE | ID: mdl-31228310

ABSTRACT

Here we present a computational model, Score of Unified Regulatory Features (SURF), that predicts functional variants in enhancer and promoter elements. SURF is trained on data from massively parallel reporter assays and predicts the effect of variants on reporter expression levels. It achieved the top performance in the Fifth Critical Assessment of Genome Interpretation "Regulation Saturation" challenge. We also show that features queried through RegulomeDB, which are direct annotations from functional genomics data, help improve prediction accuracy beyond transfer learning features from DNA sequence-based deep learning models. Some of the most important features include DNase footprints, especially when coupled with complementary ChIP-seq data. Furthermore, we found our model achieved good performance in predicting allele-specific transcription factor binding events. As an extension to the current scoring system in RegulomeDB, we expect our computational model to prioritize variants in regulatory regions, thus helping to understand functional variants in noncoding regions that lead to disease.

Subjects

Enhancer Elements, Genetic; Genetic Variation; Genomics/methods; Promoter Regions, Genetic; Deep Learning; Genetic Predisposition to Disease; Genome, Human; Humans; Models, Genetic; Sequence Analysis, DNA/methods

Integration of multiple epigenomic marks improves prediction of variant impact in saturation mutagenesis reporter assay.

Shigaki, Dustin; Adato, Orit; Adhikari, Aashish N; Dong, Shengcheng; Hawkins-Hooker, Alex; Inoue, Fumitaka; Juven-Gershon, Tamar; Kenlay, Henry; Martin, Beth; Patra, Ayoti; Penzar, Dmitry D; Schubach, Max; Xiong, Chenling; Yan, Zhongxia; Boyle, Alan P; Kreimer, Anat; Kulakovskiy, Ivan V; Reid, John; Unger, Ron; Yosef, Nir; Shendure, Jay; Ahituv, Nadav; Kircher, Martin; Beer, Michael A.

Hum Mutat ; 40(9): 1280-1291, 2019 09.

Article in English | MEDLINE | ID: mdl-31106481

ABSTRACT

The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.

Subjects

DNA/chemistry; Epigenomics/methods; Point Mutation; Binding Sites; Cell Line; Chromatin/genetics; DNA/metabolism; Enhancer Elements, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Promoter Regions, Genetic; Transcription Factors/metabolism

Insights from multidimensional analyses of the pan-cancer DNA methylome heterogeneity and the uncanonical CpG-gene associations.

Liu, Yang; Huang, Rongyao; Liu, Yu; Song, Wanlu; Wang, Yuting; Yang, Yang; Dong, Shengcheng; Yang, Xuerui.

Int J Cancer ; 143(11): 2814-2827, 2018 12 01.

Article in English | MEDLINE | ID: mdl-30121964

ABSTRACT

Although the DNA methylome profiles have been available in large cancer cohorts such as The Cancer Genome Atlas (TCGA), integrative analysis of the DNA methylome architectures in a pan-cancer manner remains limited. In the present study, we aimed to systematically dissect the insightful features related to the inter-tumoral DNA methylome heterogeneity in a pan-cancer context of 21 cancers in TCGA. First, pan-cancer clustering of the DNA methylomes revealed convergence of cancers and, meanwhile, new classifications of cancer subtypes, which are often associated with prognostic differences. Next, within each type of cancer, we showed that the transcription factor (TF) genes tend to bear more dynamic promoter DNA methylation profiles than the other genes, which serves as a potential source of the transcriptome heterogeneity in cancers. Finally, we found unanticipated significant numbers of the non-canonical promoter CpG sites that are positively correlated with the gene expression. Distribution patterns of these CpG sites in the CpG islands, ChIP-seq, DNaseI-seq, PMD regions and histone modification landscapes argued against a pervasive mechanism of transcriptional activation due to mCpG-dependent binding of TFs, which is not in complete agreement with previous hypotheses. In summary, our deep mining of the highly heterogeneous DNA methylome data in a pan-cancer context generated novel insights into the architecture of cancer epigenetics and provided a series of resources for further investigations in the related fields of cancer genomics and epigenetics.

Subjects

CpG Islands/genetics; DNA Methylation/genetics; DNA, Neoplasm/genetics; Neoplasms/genetics; Cluster Analysis; Epigenesis, Genetic/genetics; Epigenomics/methods; Gene Expression Profiling/methods; Genomics/methods; Humans; Promoter Regions, Genetic/genetics; Transcriptional Activation/genetics; Transcriptome/genetics

Dependency of the Cancer-Specific Transcriptional Regulation Circuitry on the Promoter DNA Methylome.

Liu, Yu; Liu, Yang; Huang, Rongyao; Song, Wanlu; Wang, Jiawei; Xiao, Zhengtao; Dong, Shengcheng; Yang, Yang; Yang, Xuerui.

Cell Rep ; 26(12): 3461-3474.e5, 2019 03 19.

Article in English | MEDLINE | ID: mdl-30893615

ABSTRACT

Dynamic dysregulation of the promoter DNA methylome is a signature of cancer. However, a comprehensive understanding of how the DNA methylome is incorporated in the transcriptional regulation circuitry and involved in regulating the gene expression abnormality in cancers is still missing. We introduce an integrative analysis pipeline based on mutual information theory and tailored for the multi-omics profiling data in The Cancer Genome Atlas (TCGA) to systematically find dependencies of transcriptional regulation circuits on promoter CpG methylation profiles for each of 21 cancer types. By coupling transcription factors with CpG sites, this cancer type-specific transcriptional regulation circuitry recovers a significant layer of expression regulation for many cancer-related genes. The coupled CpG sites and transcription factors also serve as markers for classifications of cancer subtypes with different prognoses, suggesting physiological relevance of such regulation machinery recapitulated here. Our results therefore generate a resource for further studies of the epigenetic scheme in gene expression dysregulations in cancers.

Subjects

CpG Islands; DNA Methylation; DNA, Neoplasm/metabolism; Epigenesis, Genetic; Gene Expression Regulation, Neoplastic; Neoplasms/metabolism; Transcription, Genetic; Databases, Nucleic Acid; Humans; Neoplasms/pathology

Annotating and prioritizing human non-coding variants with RegulomeDB v.2.

Dong, Shengcheng; Zhao, Nanxiang; Spragins, Emma; Kagda, Meenakshi S; Li, Mingjie; Assis, Pedro; Jolanki, Otto; Luo, Yunhai; Cherry, J Michael; Boyle, Alan P; Hitz, Benjamin C.

Nat Genet ; 55(5): 724-726, 2023 05.

Article in English | MEDLINE | ID: mdl-37173523

Subjects

Genetic Variation; Genomics; Humans; Genetic Variation/genetics
