Results 1 - 13 of 13
1.
Sci Adv ; 10(21): eadj4452, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781344

ABSTRACT

Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.


Subject(s)
Brain , Genetic Epigenesis , Nucleic Acid Regulatory Sequences , Humans , Brain/metabolism , Nucleic Acid Regulatory Sequences/genetics , Animals , Molecular Evolution , Mental Disorders/genetics , Transcriptional Regulatory Elements/genetics , Neurons/metabolism , Gene Expression Regulation , Transcription Factors/genetics , Transcription Factors/metabolism
2.
Hepatol Commun ; 7(10)2023 10 01.
Article in English | MEDLINE | ID: mdl-37756045

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have identified 30 risk loci for primary sclerosing cholangitis (PSC). Variants within these loci are found predominantly in noncoding regions of DNA, making their mechanisms of conferring risk hard to define. Epigenomic studies have shown that noncoding variants broadly impact regulatory element activity. The possible association of noncoding PSC variants with regulatory element activity has not been studied. We aimed to (1) determine whether the noncoding risk variants in PSC impact regulatory element function and (2) if so, assess the role these regulatory elements have in explaining the genetic risk for PSC. METHODS: Available epigenomic datasets were integrated to build a comprehensive atlas of cell type-specific regulatory elements, emphasizing PSC-relevant cell types. RNA-seq and ATAC-seq were performed on peripheral CD4+ T cells from 10 PSC patients and 11 healthy controls. Computational techniques were used to (1) study the enrichment of PSC-risk variants within regulatory elements, (2) correlate risk genotype with differences in regulatory element activity, and (3) identify regulatory elements differentially active and genes differentially expressed between PSC patients and controls. RESULTS: Noncoding PSC-risk variants are strongly enriched within immune-specific enhancers, particularly ones involved in T-cell response to antigenic stimulation. In total, 250 genes and >10,000 regulatory elements were identified that are differentially active between patients and controls. CONCLUSIONS: Mechanistic effects are proposed for variants at 6 PSC-risk loci where genotype was linked with differential T-cell regulatory element activity. Regulatory elements are shown to play a key role in PSC pathophysiology.


Subject(s)
Sclerosing Cholangitis , Genome-Wide Association Study , Humans , Sclerosing Cholangitis/genetics , Chromatin Immunoprecipitation Sequencing , Genotype
3.
Science ; 380(6643): eabn7930, 2023 04 28.
Article in English | MEDLINE | ID: mdl-37104580

ABSTRACT

Understanding the regulatory landscape of the human genome is a long-standing objective of modern biology. Using the reference-free alignment across 241 mammalian genomes produced by the Zoonomia Consortium, we charted evolutionary trajectories for 0.92 million human candidate cis-regulatory elements (cCREs) and 15.6 million human transcription factor binding sites (TFBSs). We identified 439,461 cCREs and 2,024,062 TFBSs under evolutionary constraint. Genes near constrained elements perform fundamental cellular processes, whereas genes near primate-specific elements are involved in environmental interaction, including odor perception and immune response. About 20% of TFBSs are transposable element-derived and exhibit intricate patterns of gains and losses during primate evolution whereas sequence variants associated with complex traits are enriched in constrained TFBSs. Our annotations illuminate the regulatory functions of the human genome.


Subject(s)
Molecular Evolution , Human Genome , Mammals , Transcriptional Regulatory Elements , Transcription Factors , Animals , Humans , Binding Sites , DNA Transposable Elements , Mammals/classification , Mammals/genetics , Primates/classification , Primates/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Phylogeny
4.
Hum Mol Genet ; 31(R1): R114-R122, 2022 10 20.
Article in English | MEDLINE | ID: mdl-36083269

ABSTRACT

Every cell in the human body inherits a copy of the same genetic information. The three billion base pairs of DNA in the human genome, and the roughly 50 000 coding and non-coding genes they contain, must thus encode all the complexity of human development and cell and tissue type diversity. Differences in gene regulation, or the modulation of gene expression, enable individual cells to interpret the genome differently to carry out their specific functions. Here we discuss recent and ongoing efforts to build gene regulatory maps, which aim to characterize the regulatory roles of all sequences in a genome. Many researchers and consortia have identified such regulatory elements using functional assays and evolutionary analyses; we discuss the results, strengths and shortcomings of their approaches. We also discuss new techniques the field can leverage and emerging challenges it will face while striving to build gene regulatory maps of ever-increasing resolution and comprehensiveness.


Subject(s)
Gene Expression Regulation , Nucleic Acid Regulatory Sequences , Humans , Gene Expression Regulation/genetics , Human Genome/genetics , Chromosome Mapping , DNA/genetics
6.
Nucleic Acids Res ; 50(D1): D141-D149, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34755879

ABSTRACT

The human genome contains ∼2000 transcriptional regulatory proteins, including ∼1600 DNA-binding transcription factors (TFs) recognizing characteristic sequence motifs to exert regulatory effects on gene expression. The binding specificities of these factors have been profiled both in vitro, using techniques such as HT-SELEX, and in vivo, using techniques including ChIP-seq. We previously developed Factorbook, a TF-centric database of annotations, motifs, and integrative analyses based on ChIP-seq data from Phase II of the ENCODE Project. Here we present an update to Factorbook which significantly expands the breadth of cell type and TF coverage. The update includes an expanded motif catalog derived from thousands of ENCODE Phase II and III ChIP-seq experiments and HT-SELEX experiments; this motif catalog is integrated with the ENCODE registry of candidate cis-regulatory elements to annotate a comprehensive collection of genome-wide candidate TF binding sites. The database also offers novel tools for applying the motif models within machine learning frameworks and using these models for integrative analysis, including annotation of variants and disease and trait heritability. Factorbook is publicly available at www.factorbook.org; we will continue to expand the resource as ENCODE Phase IV data are released.


Subject(s)
Genetic Databases , Nucleotide Motifs/genetics , Nucleic Acid Regulatory Sequences/genetics , Transcription Factors/genetics , Binding Sites/genetics , Gene Expression Regulation/genetics , Humans , Transcription Factors/classification
7.
Genome Res ; 32(2): 389-402, 2022 02.
Article in English | MEDLINE | ID: mdl-34949670

ABSTRACT

Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated nor do they contain information on cell type-specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks are primarily proximal to GENCODE-annotated TSSs and are concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3' ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations are supported by epigenomic and other transcriptomic data sets. To show the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association study (GWAS) catalog and identified new candidate GWAS genes. Overall, our work shows the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.
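
The intersection of a peak collection with a variant catalog, as described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the tuple layout, the 0-based half-open coordinates, and the assumption that peaks on one chromosome do not overlap are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def index_peaks(peaks):
    """Group peaks by chromosome and sort them by start coordinate.

    Each peak is (chrom, start, end, name) with 0-based, half-open
    coordinates. Peaks on one chromosome are assumed not to overlap
    (an assumption of this sketch, not a property stated in the abstract).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in peaks:
        by_chrom[chrom].append((start, end, name))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [start for start, _, _ in intervals]
        index[chrom] = (starts, intervals)
    return index


def peak_for_variant(index, chrom, pos):
    """Return the name of the peak containing position pos, or None."""
    if chrom not in index:
        return None
    starts, intervals = index[chrom]
    # Rightmost peak whose start is <= pos is the only possible container.
    i = bisect_right(starts, pos) - 1
    if i < 0:
        return None
    start, end, name = intervals[i]
    return name if start <= pos < end else None


# Hypothetical peaks and variant positions, for illustration only.
example_peaks = [
    ("chr1", 100, 200, "peakA"),
    ("chr1", 500, 650, "peakB"),
    ("chr2", 50, 80, "peakC"),
]
example_index = index_peaks(example_peaks)
for chrom, pos in [("chr1", 150), ("chr1", 200), ("chr2", 60)]:
    print(chrom, pos, peak_for_variant(example_index, chrom, pos))
```

Sorting once and using binary search keeps each lookup logarithmic in the number of peaks per chromosome; overlapping peaks would need an interval tree or a sweep instead.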


Subject(s)
Gene Expression Regulation , Genome-Wide Association Study , Humans , Genetic Promoter Regions , Transcription Initiation Site
8.
Prog Mol Biol Transl Sci ; 181: 31-43, 2021.
Article in English | MEDLINE | ID: mdl-34127199

ABSTRACT

The clustered regularly interspaced short palindromic repeats (CRISPR) technology is revolutionizing biological studies and holds tremendous promise for treating human diseases. However, a significant limitation of this technology is that modifications can occur on off-target sites lacking perfect complementarity to the single guide RNA (sgRNA) or canonical protospacer-adjacent motif (PAM) sequence. Several in vivo and in vitro genome-wide off-target profiling approaches have been developed to inform on the fidelity of gene editing. Of these, GUIDE-seq has become one of the most widely adopted and reproducible methods. To allow users to easily analyze GUIDE-seq data generated on any sequencing platform, we developed an open-source pipeline, GS-Preprocess, that takes standard base-call output in bcl format and generates all required input data for off-target identification using the Bioconductor package GUIDEseq. Furthermore, we created a Docker image with GS-Preprocess, GUIDE-seq, and all its R and system dependencies already installed. The bundled pipeline will empower end users to streamline the analysis of GUIDE-seq data and motivate their use of higher throughput sequencing with increased multiplexing for GUIDE-seq experiments.


Subject(s)
CRISPR-Cas Systems , Kinetoplastida Guide RNA , CRISPR-Cas Systems/genetics , Gene Editing , High-Throughput Nucleotide Sequencing , Humans
9.
Commun Biol ; 4(1): 239, 2021 02 22.
Article in English | MEDLINE | ID: mdl-33619351

ABSTRACT

The morphologically and functionally distinct cell types of a multicellular organism are maintained by their unique epigenomes and gene expression programs. Phase III of the ENCODE Project profiled 66 mouse epigenomes across twelve tissues at daily intervals from embryonic day 11.5 to birth. Applying the ChromHMM algorithm to these epigenomes, we annotated eighteen chromatin states with characteristics of promoters, enhancers, transcribed regions, repressed regions, and quiescent regions. Our integrative analyses delineate the tissue specificity and developmental trajectory of the loci in these chromatin states. Approximately 0.3% of each epigenome is assigned to a bivalent chromatin state, which harbors both active marks and the repressive mark H3K27me3. Highly evolutionarily conserved, these loci are enriched in silencers bound by polycomb repressive complex proteins, and the transcription start sites of their silenced target genes. This collection of chromatin state assignments provides a useful resource for studying mammalian development.


Subject(s)
Chromatin Assembly and Disassembly , Genetic Epigenesis , Epigenome , Animals , Binding Sites , DNA Methylation , Epigenomics , Developmental Gene Expression Regulation , Gestational Age , Histones/metabolism , Inbred C57BL Mice , Polycomb Repressive Complex 2/genetics , Polycomb Repressive Complex 2/metabolism , Genetic Promoter Regions
10.
Nature ; 583(7818): 699-710, 2020 07.
Article in English | MEDLINE | ID: mdl-32728249

ABSTRACT

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.


Subject(s)
DNA/genetics , Genetic Databases , Genome/genetics , Genomics , Molecular Sequence Annotation , Registries , Nucleic Acid Regulatory Sequences/genetics , Animals , Chromatin/genetics , Chromatin/metabolism , DNA/chemistry , DNA Footprinting , DNA Methylation/genetics , DNA Replication Timing , Deoxyribonuclease I/metabolism , Human Genome , Histones/metabolism , Humans , Mice , Transgenic Mice , RNA-Binding Proteins/genetics , Genetic Transcription/genetics , Transposases/metabolism
11.
Genome Biol ; 21(1): 17, 2020 01 22.
Article in English | MEDLINE | ID: mdl-31969180

ABSTRACT

BACKGROUND: Many genome-wide collections of candidate cis-regulatory elements (cCREs) have been defined using genomic and epigenomic data, but it remains a major challenge to connect these elements to their target genes. RESULTS: To facilitate the development of computational methods for predicting target genes, we develop a Benchmark of candidate Enhancer-Gene Interactions (BENGI) by integrating the recently developed Registry of cCREs with experimentally derived genomic interactions. We use BENGI to test several published computational methods for linking enhancers with genes, including signal correlation and the TargetFinder and PEP supervised learning methods. We find that while TargetFinder is the best-performing method, it is only modestly better than a baseline distance method for most benchmark datasets when trained and tested with the same cell type and that TargetFinder often does not outperform the distance method when applied across cell types. CONCLUSIONS: Our results suggest that current computational methods need to be improved and that BENGI presents a useful framework for method development and testing.
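
To make the idea of a baseline distance method concrete, here is a minimal sketch that ranks candidate target genes for one enhancer by linear distance between the enhancer midpoint and each transcription start site. The abstract does not specify how the distance baseline is computed, so the midpoint convention, the tuple layouts, and the function name are assumptions for illustration only.

```python
def rank_genes_by_distance(enhancer, tss_list):
    """Rank genes on the enhancer's chromosome by distance to its midpoint.

    enhancer: (chrom, start, end)
    tss_list: iterable of (gene, chrom, tss_position)
    Returns gene names ordered from nearest to farthest. Genes on other
    chromosomes are ignored (an assumption of this sketch).
    """
    chrom, start, end = enhancer
    midpoint = (start + end) // 2
    candidates = [
        (abs(position - midpoint), gene)
        for gene, gene_chrom, position in tss_list
        if gene_chrom == chrom
    ]
    return [gene for _, gene in sorted(candidates)]


# Hypothetical enhancer and TSS coordinates, for illustration only.
example_enhancer = ("chr1", 1000, 1200)
example_tss = [
    ("geneA", "chr1", 5000),
    ("geneB", "chr1", 900),
    ("geneC", "chr2", 1100),
    ("geneD", "chr1", 2000),
]
print(rank_genes_by_distance(example_enhancer, example_tss))
```

A baseline this simple is useful mainly as a reference point: a supervised method only adds value where its ranking beats the nearest-gene ordering on held-out pairs.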


Subject(s)
Genetic Enhancer Elements , Benchmarking , Data Curation , Gene Expression Regulation , Genomics , Machine Learning
12.
Nucleic Acids Res ; 46(21): 11184-11201, 2018 11 30.
Article in English | MEDLINE | ID: mdl-30137428

ABSTRACT

Enhancers are distal cis-regulatory elements that modulate gene expression. They are depleted of nucleosomes and enriched in specific histone modifications; thus, calling DNase-seq and histone mark ChIP-seq peaks can predict enhancers. We evaluated nine peak-calling algorithms for predicting enhancers validated by transgenic mouse assays. DNase and H3K27ac peaks were consistently more predictive than H3K4me1/2/3 and H3K9ac peaks. DFilter and Hotspot2 were the best DNase peak callers, while HOMER, MUSIC, MACS2, DFilter and F-seq were the best H3K27ac peak callers. We observed that the differential DNase or H3K27ac signals between two distant tissues increased the area under the precision-recall curve (PR-AUC) of DNase peaks by 17.5-166.7% and that of H3K27ac peaks by 7.1-22.2%. We further improved this differential signal method using multiple contrast tissues. Evaluated using a blind test, the differential H3K27ac signal method substantially improved PR-AUC from 0.48 to 0.75 for predicting heart enhancers. We further validated our approach using postnatal retina and cerebral cortex enhancers identified by massively parallel reporter assays, and observed improvements for both tissues. In summary, we compared nine peak callers and devised a superior method for predicting tissue-specific mouse developmental enhancers by reranking the called peaks.
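
The reranking idea in the abstract above can be sketched as follows: score each called peak by the difference between its signal in the target tissue and in a contrast tissue, then compare rankings by a precision-recall summary. This sketch makes several assumptions not stated in the abstract: a simple subtraction as the differential score, a single contrast tissue, and average precision as the estimator of PR-AUC. The signal values and labels are invented for illustration.

```python
def average_precision(scores, labels):
    """Average precision of a ranking (one common estimator of PR-AUC).

    scores: per-peak scores, higher means ranked earlier.
    labels: 1 for a validated enhancer, 0 otherwise.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / hits if hits else 0.0


def differential_scores(target_signal, contrast_signal):
    """Score each peak by target-tissue signal minus contrast-tissue signal."""
    return [t - c for t, c in zip(target_signal, contrast_signal)]


# Hypothetical signals for four peaks; peaks 2 and 4 are true enhancers.
target = [9.0, 8.0, 7.0, 6.0]
contrast = [8.5, 1.0, 6.5, 0.5]
labels = [0, 1, 0, 1]

print("raw signal AP:", average_precision(target, labels))
print("differential AP:", average_precision(differential_scores(target, contrast), labels))
```

In this toy case the peaks with high signal in both tissues drop in the ranking once the contrast signal is subtracted, which is the intuition behind favoring tissue-specific signal.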


Subject(s)
Algorithms , Chromatin/genetics , Computational Biology/methods , Genetic Enhancer Elements/genetics , Histone Code/genetics , Animals , Binding Sites , Chromatin/metabolism , Histones/metabolism , Transgenic Mice , Organ Specificity , Post-Translational Protein Processing , Transcription Factors/metabolism
13.
J Immunol ; 190(11): 5578-87, 2013 Jun 01.
Article in English | MEDLINE | ID: mdl-23616578

ABSTRACT

Profiling studies of mRNA and microRNA, particularly microarray-based studies, have been extensively used to create compendia of genes that are preferentially expressed in the immune system. In some instances, functional studies have been subsequently pursued. Recent efforts such as the Encyclopedia of DNA Elements have demonstrated the benefit of coupling RNA sequencing analysis with information from expressed sequence tags (ESTs) for transcriptomic analysis. However, the full characterization and identification of transcripts that function as modulators of human immune responses remains incomplete. In this study, we demonstrate that an integrated analysis of human ESTs provides a robust platform to identify the immune transcriptome. Beyond recovering a reference set of immune-enriched genes and providing large-scale cross-validation of previous microarray studies, we discovered hundreds of novel genes preferentially expressed in the immune system, including noncoding RNAs. As a result, we have established the Immunogene database, representing an integrated EST road map of gene expression in human immune cells, which can be used to further investigate the function of coding and noncoding genes in the immune system. Using this approach, we have uncovered a unique metabolic gene signature of human macrophages and identified PRDM15 as a novel overexpressed gene in human lymphomas. Thus, we demonstrate the utility of EST profiling as a basis for further deconstruction of physiologic and pathologic immune processes.


Subject(s)
Expressed Sequence Tags , Gene Expression Profiling , Genome-Wide Association Study , Immune System/metabolism , Animals , Cluster Analysis , Computational Biology/methods , DNA-Binding Proteins/genetics , Nucleic Acid Databases , Gene Regulatory Networks , Genomics , Humans , Immune System Diseases/genetics , B-Cell Lymphoma/genetics , Mice , Molecular Sequence Annotation , Long Noncoding RNA/genetics , Reproducibility of Results , Transcription Factors/genetics , Transcriptome