Results 1 - 20 of 22
1.
Sci Adv ; 10(21): eadj4452, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781344

ABSTRACT

Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.


Subject(s)
Brain , Epigenesis, Genetic , Regulatory Sequences, Nucleic Acid , Humans , Brain/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Animals , Evolution, Molecular , Mental Disorders/genetics , Regulatory Elements, Transcriptional/genetics , Neurons/metabolism , Gene Expression Regulation , Transcription Factors/genetics , Transcription Factors/metabolism
2.
Science ; 384(6698): eadh0829, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781368

ABSTRACT

Neuropsychiatric genome-wide association studies (GWASs), including those for autism spectrum disorder and schizophrenia, show strong enrichment for regulatory elements in the developing brain. However, prioritizing risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brains, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci, mapping 3739 to cellular contexts. Gene expression heritability drops during development, likely reflecting both increasing cellular heterogeneity and the intrinsic properties of neuronal maturation. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Through colocalization, we prioritized mechanisms for about 60% of GWAS loci across five disorders, exceeding adult brain findings. Finally, we contextualized results within gene and isoform coexpression networks, revealing the comprehensive landscape of transcriptome regulation in development and disease.


Subject(s)
Alternative Splicing , Brain , Gene Expression Regulation, Developmental , Mental Disorders , Humans , Atlases as Topic , Autism Spectrum Disorder/genetics , Brain/metabolism , Brain/growth & development , Brain/embryology , Gene Regulatory Networks , Genome-Wide Association Study , Protein Isoforms/genetics , Protein Isoforms/metabolism , Quantitative Trait Loci , Schizophrenia/genetics , Transcriptome , Mental Disorders/genetics
3.
Hepatol Commun ; 7(10), 2023 10 01.
Article in English | MEDLINE | ID: mdl-37756045

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have identified 30 risk loci for primary sclerosing cholangitis (PSC). Variants within these loci are found predominantly in noncoding regions of DNA making their mechanisms of conferring risk hard to define. Epigenomic studies have shown noncoding variants broadly impact regulatory element activity. The possible association of noncoding PSC variants with regulatory element activity has not been studied. We aimed to (1) determine if the noncoding risk variants in PSC impact regulatory element function and (2) if so, assess the role these regulatory elements have in explaining the genetic risk for PSC. METHODS: Available epigenomic datasets were integrated to build a comprehensive atlas of cell type-specific regulatory elements, emphasizing PSC-relevant cell types. RNA-seq and ATAC-seq were performed on peripheral CD4+ T cells from 10 PSC patients and 11 healthy controls. Computational techniques were used to (1) study the enrichment of PSC-risk variants within regulatory elements, (2) correlate risk genotype with differences in regulatory element activity, and (3) identify regulatory elements differentially active and genes differentially expressed between PSC patients and controls. RESULTS: Noncoding PSC-risk variants are strongly enriched within immune-specific enhancers, particularly ones involved in T-cell response to antigenic stimulation. In total, 250 genes and >10,000 regulatory elements were identified that are differentially active between patients and controls. CONCLUSIONS: Mechanistic effects are proposed for variants at 6 PSC-risk loci where genotype was linked with differential T-cell regulatory element activity. Regulatory elements are shown to play a key role in PSC pathophysiology.


Subject(s)
Cholangitis, Sclerosing , Genome-Wide Association Study , Humans , Cholangitis, Sclerosing/genetics , Chromatin Immunoprecipitation Sequencing , Genotype
4.
Science ; 380(6643): eabn7930, 2023 04 28.
Article in English | MEDLINE | ID: mdl-37104580

ABSTRACT

Understanding the regulatory landscape of the human genome is a long-standing objective of modern biology. Using the reference-free alignment across 241 mammalian genomes produced by the Zoonomia Consortium, we charted evolutionary trajectories for 0.92 million human candidate cis-regulatory elements (cCREs) and 15.6 million human transcription factor binding sites (TFBSs). We identified 439,461 cCREs and 2,024,062 TFBSs under evolutionary constraint. Genes near constrained elements perform fundamental cellular processes, whereas genes near primate-specific elements are involved in environmental interaction, including odor perception and immune response. About 20% of TFBSs are transposable element-derived and exhibit intricate patterns of gains and losses during primate evolution whereas sequence variants associated with complex traits are enriched in constrained TFBSs. Our annotations illuminate the regulatory functions of the human genome.


Subject(s)
Evolution, Molecular , Genome, Human , Mammals , Regulatory Elements, Transcriptional , Transcription Factors , Animals , Humans , Binding Sites , DNA Transposable Elements , Mammals/classification , Mammals/genetics , Primates/classification , Primates/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Phylogeny
5.
medRxiv ; 2023 Mar 06.
Article in English | MEDLINE | ID: mdl-36945630

ABSTRACT

Genomic regulatory elements active in the developing human brain are notably enriched in genetic risk for neuropsychiatric disorders, including autism spectrum disorder (ASD), schizophrenia, and bipolar disorder. However, prioritizing the specific risk genes and candidate molecular mechanisms underlying these genetic enrichments has been hindered by the lack of a single unified large-scale gene regulatory atlas of human brain development. Here, we uniformly process and systematically characterize gene, isoform, and splicing quantitative trait loci (xQTLs) in 672 fetal brain samples from unique subjects across multiple ancestral populations. We identify 15,752 genes harboring a significant xQTL and map 3,739 eQTLs to a specific cellular context. We observe a striking drop in gene expression and splicing heritability as the human brain develops. Isoform-level regulation, particularly in the second trimester, mediates the greatest proportion of heritability across multiple psychiatric GWAS, compared with eQTLs. Via colocalization and TWAS, we prioritize biological mechanisms for ~60% of GWAS loci across five neuropsychiatric disorders, nearly two-fold that observed in the adult brain. Finally, we build a comprehensive set of developmentally regulated gene and isoform co-expression networks capturing unique genetic enrichments across disorders. Together, this work provides a comprehensive view of genetic regulation across human brain development as well as the stage- and cell type-informed mechanistic underpinnings of neuropsychiatric disorders.

6.
Hum Mol Genet ; 31(R1): R114-R122, 2022 10 20.
Article in English | MEDLINE | ID: mdl-36083269

ABSTRACT

Every cell in the human body inherits a copy of the same genetic information. The three billion base pairs of DNA in the human genome, and the roughly 50 000 coding and non-coding genes they contain, must thus encode all the complexity of human development and cell and tissue type diversity. Differences in gene regulation, or the modulation of gene expression, enable individual cells to interpret the genome differently to carry out their specific functions. Here we discuss recent and ongoing efforts to build gene regulatory maps, which aim to characterize the regulatory roles of all sequences in a genome. Many researchers and consortia have identified such regulatory elements using functional assays and evolutionary analyses; we discuss the results, strengths and shortcomings of their approaches. We also discuss new techniques the field can leverage and emerging challenges it will face while striving to build gene regulatory maps of ever-increasing resolution and comprehensiveness.


Subject(s)
Gene Expression Regulation , Regulatory Sequences, Nucleic Acid , Humans , Gene Expression Regulation/genetics , Genome, Human/genetics , Chromosome Mapping , DNA/genetics
7.
PLoS One ; 17(4): e0264799, 2022.
Article in English | MEDLINE | ID: mdl-35482762

ABSTRACT

MafB (a bZIP transcription factor), β-catenin (the ultimate target of the Wnt signal transduction pathway that acts as a transcriptional co-activator of LEF/TCF proteins), and WDR77 (a transcriptional co-activator of multiple hormone receptors) are important for breast cellular transformation. Unexpectedly, these proteins interact directly with each other, and they have similar genomic binding profiles. Furthermore, while some of these common target sites coincide with those bound by LEF/TCF, the majority are located just downstream of transcription initiation sites at a position near paused RNA polymerase (Pol II) and the +1 nucleosome. Occupancy levels of these factors at these promoter-proximal sites are strongly correlated with the level of paused Pol II and transcriptional activity.


Subject(s)
Catenins , beta Catenin , Catenins/metabolism , Promoter Regions, Genetic , RNA Polymerase II/metabolism , Transcription Factors/metabolism , Wnt Signaling Pathway/genetics , beta Catenin/genetics , beta Catenin/metabolism
9.
Nucleic Acids Res ; 50(D1): D141-D149, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34755879

ABSTRACT

The human genome contains ∼2000 transcriptional regulatory proteins, including ∼1600 DNA-binding transcription factors (TFs) recognizing characteristic sequence motifs to exert regulatory effects on gene expression. The binding specificities of these factors have been profiled both in vitro, using techniques such as HT-SELEX, and in vivo, using techniques including ChIP-seq. We previously developed Factorbook, a TF-centric database of annotations, motifs, and integrative analyses based on ChIP-seq data from Phase II of the ENCODE Project. Here we present an update to Factorbook which significantly expands the breadth of cell type and TF coverage. The update includes an expanded motif catalog derived from thousands of ENCODE Phase II and III ChIP-seq experiments and HT-SELEX experiments; this motif catalog is integrated with the ENCODE registry of candidate cis-regulatory elements to annotate a comprehensive collection of genome-wide candidate TF binding sites. The database also offers novel tools for applying the motif models within machine learning frameworks and using these models for integrative analysis, including annotation of variants and disease and trait heritability. Factorbook is publicly available at www.factorbook.org; we will continue to expand the resource as ENCODE Phase IV data are released.


Subject(s)
Databases, Genetic , Nucleotide Motifs/genetics , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/genetics , Binding Sites/genetics , Gene Expression Regulation/genetics , Humans , Transcription Factors/classification
10.
Genome Res ; 32(2): 389-402, 2022 02.
Article in English | MEDLINE | ID: mdl-34949670

ABSTRACT

Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated nor do they contain information on cell type-specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks are primarily proximal to GENCODE-annotated TSSs and are concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3' ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations are supported by epigenomic and other transcriptomic data sets. To show the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association study (GWAS) catalog and identified new candidate GWAS genes. Overall, our work shows the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.


Subject(s)
Gene Expression Regulation , Genome-Wide Association Study , Humans , Promoter Regions, Genetic , Transcription Initiation Site
11.
Elife ; 10, 2021 08 31.
Article in English | MEDLINE | ID: mdl-34463254

ABSTRACT

The YAP and TAZ paralogs are transcriptional co-activators recruited to target sites by TEAD proteins. Here, we show that YAP and TAZ are also recruited by JUNB (a member of the AP-1 family) and STAT3, key transcription factors that mediate an epigenetic switch linking inflammation to cellular transformation. YAP and TAZ directly interact with JUNB and STAT3 via a WW domain important for transformation, and they stimulate transcriptional activation by AP-1 proteins. JUNB, STAT3, and TEAD co-localize at virtually all YAP/TAZ target sites, yet many target sites only contain individual AP-1, TEAD, or STAT3 motifs. This observation and differences in relative crosslinking efficiencies of JUNB, TEAD, and STAT3 at YAP/TAZ target sites suggest that YAP/TAZ is recruited by different forms of an AP-1/STAT3/TEAD complex depending on the recruiting motif. The different classes of YAP/TAZ target sites are associated with largely non-overlapping genes with distinct functions. A small minority of target sites are YAP- or TAZ-specific, and they are associated with different sequence motifs and gene classes from shared YAP/TAZ target sites. Genes containing either the AP-1 or TEAD class of YAP/TAZ sites are associated with poor survival of breast cancer patients with the triple-negative form of the disease.


Subject(s)
Adaptor Proteins, Signal Transducing/metabolism , Cell Transformation, Neoplastic/metabolism , Intracellular Signaling Peptides and Proteins/metabolism , STAT3 Transcription Factor/metabolism , Transcription Factor AP-1/metabolism , Transcription Factors/metabolism , Transcriptional Activation , Triple Negative Breast Neoplasms/metabolism , Adaptor Proteins, Signal Transducing/genetics , Cell Line, Tumor , Cell Transformation, Neoplastic/genetics , Cell Transformation, Neoplastic/pathology , Databases, Genetic , Female , Gene Expression Regulation, Neoplastic , Humans , Intracellular Signaling Peptides and Proteins/genetics , Protein Binding , Protein Interaction Domains and Motifs , STAT3 Transcription Factor/genetics , Signal Transduction , Transcription Factor AP-1/genetics , Transcription Factors/genetics , Transcriptional Coactivator with PDZ-Binding Motif Proteins , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology , YAP-Signaling Proteins
12.
Prog Mol Biol Transl Sci ; 181: 31-43, 2021.
Article in English | MEDLINE | ID: mdl-34127199

ABSTRACT

The clustered regularly interspaced short palindromic repeats (CRISPR) technology is revolutionizing biological studies and holds tremendous promise for treating human diseases. However, a significant limitation of this technology is that modifications can occur on off-target sites lacking perfect complementarity to the single guide RNA (sgRNA) or canonical protospacer-adjacent motif (PAM) sequence. Several in vivo and in vitro genome-wide off-target profiling approaches have been developed to inform on the fidelity of gene editing. Of these, GUIDE-seq has become one of the most widely adopted and reproducible methods. To allow users to easily analyze GUIDE-seq data generated on any sequencing platform, we developed an open-source pipeline, GS-Preprocess, that takes standard base-call output in bcl format and generates all required input data for off-target identification using the Bioconductor package GUIDEseq. Furthermore, we created a Docker image with GS-Preprocess, GUIDE-seq, and all its R and system dependencies already installed. The bundled pipeline will empower end users to streamline the analysis of GUIDE-seq data and motivate their use of higher throughput sequencing with increased multiplexing for GUIDE-seq experiments.


Subject(s)
CRISPR-Cas Systems , RNA, Guide, Kinetoplastida , CRISPR-Cas Systems/genetics , Gene Editing , High-Throughput Nucleotide Sequencing , Humans
13.
Annu Rev Genomics Hum Genet ; 22: 199-218, 2021 08 31.
Article in English | MEDLINE | ID: mdl-33792357

ABSTRACT

Short interspersed nuclear elements (SINEs) are nonautonomous retrotransposons that occupy approximately 13% of the human genome. They are transcribed by RNA polymerase III and can be retrotranscribed and inserted back into the genome with the help of other autonomous retroelements. Because they are preferentially located close to or within gene-rich regions, they can regulate gene expression by various mechanisms that act at both the DNA and the RNA levels. In this review, we summarize recent findings on the involvement of SINEs in different types of gene regulation and discuss the potential regulatory functions of SINEs that are in close proximity to genes, Pol III-transcribed SINE RNAs, and embedded SINE sequences within Pol II-transcribed genes in the human genome. These discoveries illustrate how the human genome has exapted some SINEs into functional regulatory elements.


Subject(s)
Genome, Human , Transcription, Genetic , Gene Expression Regulation , Humans , RNA Polymerase III/genetics , Short Interspersed Nucleotide Elements/genetics
14.
Commun Biol ; 4(1): 239, 2021 02 22.
Article in English | MEDLINE | ID: mdl-33619351

ABSTRACT

The morphologically and functionally distinct cell types of a multicellular organism are maintained by their unique epigenomes and gene expression programs. Phase III of the ENCODE Project profiled 66 mouse epigenomes across twelve tissues at daily intervals from embryonic day 11.5 to birth. Applying the ChromHMM algorithm to these epigenomes, we annotated eighteen chromatin states with characteristics of promoters, enhancers, transcribed regions, repressed regions, and quiescent regions. Our integrative analyses delineate the tissue specificity and developmental trajectory of the loci in these chromatin states. Approximately 0.3% of each epigenome is assigned to a bivalent chromatin state, which harbors both active marks and the repressive mark H3K27me3. Highly evolutionarily conserved, these loci are enriched in silencers bound by polycomb repressive complex proteins, and the transcription start sites of their silenced target genes. This collection of chromatin state assignments provides a useful resource for studying mammalian development.


Subject(s)
Chromatin Assembly and Disassembly , Epigenesis, Genetic , Epigenome , Animals , Binding Sites , DNA Methylation , Epigenomics , Gene Expression Regulation, Developmental , Gestational Age , Histones/metabolism , Mice, Inbred C57BL , Polycomb Repressive Complex 2/genetics , Polycomb Repressive Complex 2/metabolism , Promoter Regions, Genetic
15.
Hepatology ; 73(3): 1011-1027, 2021 03.
Article in English | MEDLINE | ID: mdl-32452550

ABSTRACT

BACKGROUND AND AIMS: Despite surgical and chemotherapeutic advances, the 5-year survival rate for stage IV hepatoblastoma (HB), the predominant pediatric liver tumor, remains at 27%. Yes-associated protein 1 (YAP1) and β-catenin co-activation occurs in 80% of children's HB; however, a lack of conditional genetic models precludes tumor maintenance exploration. Thus, the need for a targeted therapy remains unmet. Given the predominance of YAP1 and β-catenin activation in HB, we sought to evaluate YAP1 as a therapeutic target in HB. APPROACH AND RESULTS: We engineered the conditional HB murine model using hydrodynamic injection to deliver transposon plasmids encoding inducible YAP1S127A, constitutive β-cateninDelN90, and a luciferase reporter to murine liver. Tumor regression was evaluated using bioluminescent imaging, tumor landscape characterized using RNA and ATAC sequencing, and DNA footprinting. Here we show that YAP1S127A withdrawal mediates more than 90% tumor regression with survival for 230+ days in mice. YAP1S127A withdrawal promotes apoptosis in a subset of tumor cells, and in remaining cells induces a cell fate switch that drives therapeutic differentiation of HB tumors into Ki-67-negative hepatocyte-like HB cells ("hbHeps") with hepatocyte-like morphology and mature hepatocyte gene expression. YAP1S127A withdrawal drives the formation of hbHeps by modulating liver differentiation transcription factor occupancy. Indeed, tumor-derived hbHeps, consistent with their reprogrammed transcriptional landscape, regain partial hepatocyte function and rescue liver damage in mice. CONCLUSIONS: YAP1S127A withdrawal, without silencing oncogenic β-catenin, significantly regresses hepatoblastoma, providing in vivo data to support YAP1 as a therapeutic target for HB.
YAP1S127A withdrawal alone sufficiently drives long-term regression in HB, as it promotes cell death in a subset of tumor cells and modulates transcription factor occupancy to reverse the fate of residual tumor cells to mimic functional hepatocytes.


Subject(s)
Adaptor Proteins, Signal Transducing/metabolism , Hepatoblastoma/metabolism , Hepatocytes/metabolism , Liver Neoplasms/metabolism , Transcription Factors/metabolism , Animals , Cell Differentiation , Chromatin/metabolism , Genetic Engineering , Hepatoblastoma/therapy , Humans , Liver Neoplasms/therapy , Mice , YAP-Signaling Proteins
16.
Nature ; 583(7818): 699-710, 2020 07.
Article in English | MEDLINE | ID: mdl-32728249

ABSTRACT

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE (ref. 1) and Roadmap Epigenomics (ref. 2) data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.


Subject(s)
DNA/genetics , Databases, Genetic , Genome/genetics , Genomics , Molecular Sequence Annotation , Registries , Regulatory Sequences, Nucleic Acid/genetics , Animals , Chromatin/genetics , Chromatin/metabolism , DNA/chemistry , DNA Footprinting , DNA Methylation/genetics , DNA Replication Timing , Deoxyribonuclease I/metabolism , Genome, Human , Histones/metabolism , Humans , Mice , Mice, Transgenic , RNA-Binding Proteins/genetics , Transcription, Genetic/genetics , Transposases/metabolism
17.
Bioinformatics ; 36(11): 3573-3575, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32181813

ABSTRACT

SUMMARY: Sequence logos were introduced nearly 30 years ago as a human-readable format for representing consensus sequences, and they remain widely used. As new experimental and computational techniques have developed, logos have been extended: extra symbols represent covalent modifications to nucleotides, logos with multiple letters at each position illustrate models with multi-nucleotide features, and symbols extending below the x-axis may represent a binding energy penalty for a residue or a negative weight output from a neural network. Web-based visualization tools for genomic data are increasingly taking advantage of modern web technology to offer dynamic, interactive figures to users, but support for sequence logos remains limited. Here, we present LogoJS, a JavaScript package for rendering customizable, interactive, vector-graphic sequence logos and embedding them in web applications. LogoJS supports all the aforementioned logo extensions and is bundled with a companion web application for creating and sharing logos. AVAILABILITY AND IMPLEMENTATION: LogoJS is implemented both in plain JavaScript and ReactJS, a popular user-interface framework. The web application is hosted at logojs.wenglab.org. All major browsers and operating systems are supported. The package and application are open-source; code is available at GitHub. CONTACT: zhiping.weng@umassmed.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
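
As a rough illustration of what a classic sequence logo encodes, the sketch below computes letter heights for a single DNA position from its base frequencies. It assumes the standard information-content formulation (2 bits minus the Shannon entropy, with each letter scaled by its frequency) and omits the small-sample correction. The function name `logo_heights` is hypothetical; this is not LogoJS code and does not reflect its API.

```python
import math


def logo_heights(freqs: dict[str, float]) -> dict[str, float]:
    """Letter heights (in bits) for one DNA position of a classic logo.

    Hypothetical helper for illustration only; assumes the frequencies
    sum to 1 and ignores the small-sample correction.
    """
    # Shannon entropy of the position; zero-frequency bases contribute nothing.
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    # Information content: the maximum for a 4-letter alphabet is log2(4) = 2 bits.
    information = math.log2(4) - entropy
    # Each letter's height is its frequency times the position's information.
    return {base: p * information for base, p in freqs.items()}


# A position split evenly between A and C carries 1 bit in total.
print(logo_heights({"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0}))
```

A fully conserved position yields a single letter of height 2 bits, while a uniform position yields letters of height 0, which is why conserved motif positions stand out visually in a logo.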


Subject(s)
Genomics , Software , Humans , Internet , Nucleotides , Position-Specific Scoring Matrices
18.
Genome Biol ; 21(1): 17, 2020 01 22.
Article in English | MEDLINE | ID: mdl-31969180

ABSTRACT

BACKGROUND: Many genome-wide collections of candidate cis-regulatory elements (cCREs) have been defined using genomic and epigenomic data, but it remains a major challenge to connect these elements to their target genes. RESULTS: To facilitate the development of computational methods for predicting target genes, we develop a Benchmark of candidate Enhancer-Gene Interactions (BENGI) by integrating the recently developed Registry of cCREs with experimentally derived genomic interactions. We use BENGI to test several published computational methods for linking enhancers with genes, including signal correlation and the TargetFinder and PEP supervised learning methods. We find that while TargetFinder is the best-performing method, it is only modestly better than a baseline distance method for most benchmark datasets when trained and tested with the same cell type and that TargetFinder often does not outperform the distance method when applied across cell types. CONCLUSIONS: Our results suggest that current computational methods need to be improved and that BENGI presents a useful framework for method development and testing.
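
To make the idea of a baseline distance method concrete, here is a minimal sketch. It assumes the simplest reading of such a baseline: each enhancer is linked to the gene whose transcription start site (TSS) lies closest to the enhancer midpoint on the same chromosome. The types, the function name `nearest_gene`, and all coordinates are hypothetical and are not taken from BENGI.

```python
from typing import NamedTuple, Optional


class Enhancer(NamedTuple):
    chrom: str
    start: int
    end: int


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int


def nearest_gene(enhancer: Enhancer, genes: list[Gene]) -> Optional[Gene]:
    """Return the same-chromosome gene whose TSS is closest to the enhancer midpoint."""
    midpoint = (enhancer.start + enhancer.end) // 2
    # Only genes on the enhancer's chromosome are candidate targets.
    candidates = [g for g in genes if g.chrom == enhancer.chrom]
    if not candidates:
        return None
    return min(candidates, key=lambda g: abs(g.tss - midpoint))


# Toy data (made-up names and coordinates).
genes = [
    Gene("GENE_A", "chr1", 1_000),
    Gene("GENE_B", "chr1", 50_000),
    Gene("GENE_C", "chr2", 1_200),
]
print(nearest_gene(Enhancer("chr1", 40_000, 40_500), genes).name)  # GENE_B
```

A baseline like this needs no training data, which is what makes it a useful reference point when judging whether supervised methods add predictive value.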


Subject(s)
Enhancer Elements, Genetic , Benchmarking , Data Curation , Gene Expression Regulation , Genomics , Machine Learning
19.
Nucleic Acids Res ; 46(21): 11184-11201, 2018 11 30.
Article in English | MEDLINE | ID: mdl-30137428

ABSTRACT

Enhancers are distal cis-regulatory elements that modulate gene expression. They are depleted of nucleosomes and enriched in specific histone modifications; thus, calling DNase-seq and histone mark ChIP-seq peaks can predict enhancers. We evaluated nine peak-calling algorithms for predicting enhancers validated by transgenic mouse assays. DNase and H3K27ac peaks were consistently more predictive than H3K4me1/2/3 and H3K9ac peaks. DFilter and Hotspot2 were the best DNase peak callers, while HOMER, MUSIC, MACS2, DFilter and F-seq were the best H3K27ac peak callers. We observed that the differential DNase or H3K27ac signals between two distant tissues increased the area under the precision-recall curve (PR-AUC) of DNase peaks by 17.5-166.7% and that of H3K27ac peaks by 7.1-22.2%. We further improved this differential signal method using multiple contrast tissues. Evaluated using a blind test, the differential H3K27ac signal method substantially improved PR-AUC from 0.48 to 0.75 for predicting heart enhancers. We further validated our approach using postnatal retina and cerebral cortex enhancers identified by massively parallel reporter assays, and observed improvements for both tissues. In summary, we compared nine peak callers and devised a superior method for predicting tissue-specific mouse developmental enhancers by reranking the called peaks.


Subject(s)
Algorithms , Chromatin/genetics , Computational Biology/methods , Enhancer Elements, Genetic/genetics , Histone Code/genetics , Animals , Binding Sites , Chromatin/metabolism , Histones/metabolism , Mice, Transgenic , Organ Specificity , Protein Processing, Post-Translational , Transcription Factors/metabolism
20.
Front Microbiol ; 8: 240, 2017.
Article in English | MEDLINE | ID: mdl-28265266

ABSTRACT

Flaviviral infections including dengue virus are an increasing clinical problem worldwide. Dengue infection triggers host production of the type 1 IFN, IFN alpha, one of the strongest and broadest acting antivirals known. However, dengue virus subverts host IFN signaling at early steps of IFN signal transduction. This subversion allows unbridled viral replication which subsequently triggers ongoing production of IFN which, again, is subverted. Identification of downstream IFN antiviral effectors will provide targets which could be activated to restore broad acting antiviral activity, stopping the signal to produce endogenous IFN at toxic levels. To this end, we performed a targeted functional genomic screen for IFN antiviral effector genes (IEGs), identifying 56 IEGs required for antiviral effects of IFN against fully infectious dengue virus. Dengue IEGs were enriched for genes encoding nuclear receptor interacting proteins, including HELZ2, MAP2K4, SLC27A2, HSP90AA1, and HSP90AB1. We focused on HELZ2 (Helicase With Zinc Finger 2), an IFN stimulated gene and IEG which encodes a promiscuous nuclear factor coactivator that exists in two isoforms. The two unique HELZ2 isoforms are both IFN responsive, contain ISRE elements, and gene products increase in the nucleus upon IFN stimulation. Chromatin immunoprecipitation-sequencing revealed that the HELZ2 complex interacts with triglyceride-regulator LMF1. Mass spectrometry revealed that HELZ2 knockdown cells are depleted of triglyceride subsets. We thus sought to determine whether HELZ2 interacts with a nuclear receptor known to regulate immune response and lipid metabolism, AHR, and identified HELZ2:AHR interactions via co-immunoprecipitation, found that AHR is a dengue IEG, and that an AHR ligand, FICZ, exhibits anti-dengue activity. Primary bone marrow derived macrophages from HELZ2 knockout mice, compared to wild type controls, exhibit enhanced dengue infectivity. 
Overall, these findings reveal that IFN antiviral response is mediated by HELZ2 transcriptional upregulation, enrichment of HELZ2 protein levels in the nucleus, and activation of a transcriptional program that appears to modulate intracellular lipid state. IEGs identified in this study may serve as both (1) potential targets for host-directed antiviral design, downstream of the common flaviviral subversion point, and (2) possible biomarkers whose variation, natural or iatrogenic, could affect host response to viral infections.
