Results 1 - 13 of 13
1.
J Neurosci ; 41(43): 9008-9030, 2021 10 27.
Article in English | MEDLINE | ID: mdl-34462306

ABSTRACT

Recent large genome-wide association studies have identified multiple confident risk loci linked to addiction-associated behavioral traits. Most genetic variants linked to addiction-associated traits lie in noncoding regions of the genome, likely disrupting cis-regulatory element (CRE) function. CREs tend to be highly cell type-specific and may contribute to the functional development of the neural circuits underlying addiction. Yet, a systematic approach for predicting the impact of risk variants on the CREs of specific cell populations is lacking. To dissect the cell types and brain regions underlying addiction-associated traits, we applied stratified linkage disequilibrium score regression to compare genome-wide association studies to genomic regions collected from human and mouse assays for open chromatin, which is associated with CRE activity. We found enrichment of addiction-associated variants in putative CREs marked by open chromatin in neuronal (NeuN+) nuclei collected from multiple prefrontal cortical areas and striatal regions known to play major roles in reward and addiction. To further dissect the cell type-specific basis of addiction-associated traits, we also identified enrichments in human orthologs of open chromatin regions of female and male mouse neuronal subtypes: cortical excitatory, D1, D2, and PV. Last, we developed machine learning models to predict mouse cell type-specific open chromatin, enabling us to further categorize human NeuN+ open chromatin regions into cortical excitatory or striatal D1 and D2 neurons and predict the functional impact of addiction-associated genetic variants. 
Our results suggest that different neuronal subtypes within the reward system play distinct roles in the variety of traits that contribute to addiction. SIGNIFICANCE STATEMENT: We combine statistical genetic and machine learning techniques to find that the predisposition to nicotine, alcohol, and cannabis use behaviors can be partially explained by genetic variants in conserved regulatory elements within specific brain regions and neuronal subtypes of the reward system. Our computational framework can flexibly integrate open chromatin data across species to screen for putative causal variants in a cell type- and tissue-specific manner for numerous complex traits.


Subject(s)
Behavior, Addictive/genetics , Brain/physiology , Genetic Predisposition to Disease/genetics , Genetic Variation/physiology , Neurons/physiology , Regulatory Elements, Transcriptional/physiology , Animals , Behavior, Addictive/pathology , Brain/pathology , Databases, Genetic , Female , Humans , Male , Mice , Mice, Inbred C57BL , Mice, Transgenic , Neurons/pathology , Quantitative Trait Loci/genetics
2.
BMC Genomics ; 23(1): 295, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35410161

ABSTRACT

BACKGROUND: Many transcription factors (TFs), such as multi zinc-finger (ZF) TFs, have multiple DNA binding domains (DBDs), and deciphering the DNA binding motifs of individual DBDs is a major challenge. One example of such a TF is CCCTC-binding factor (CTCF), a TF with eleven ZFs that plays a variety of roles in transcriptional regulation, most notably anchoring DNA loops. Previous studies found that CTCF ZFs 3-7 bind CTCF's core motif and ZFs 9-11 bind a specific upstream motif, but the motifs of ZFs 1-2 have yet to be identified. RESULTS: We developed a new approach to identifying the binding motifs of individual DBDs of a TF through analyzing chromatin immunoprecipitation sequencing (ChIP-seq) experiments in which a single DBD is mutated: we train a deep convolutional neural network to predict whether wild-type TF binding sites are preserved in the mutant TF dataset and interpret the model. We applied this approach to mouse CTCF ChIP-seq data and identified the known binding preferences of CTCF ZFs 3-11 as well as a putative GAG binding motif for ZF 1. We analyzed other CTCF datasets to provide additional evidence that ZF 1 is associated with binding at the motif we identified, and we found that the presence of the motif for ZF 1 is associated with CTCF ChIP-seq peak strength. CONCLUSIONS: Our approach can be applied to any TF for which in vivo binding data from both the wild-type and mutated versions of the TF are available, and our findings provide new potential insights into the binding preferences of CTCF's DBDs.


Subject(s)
Transcription Factors , Zinc , Animals , Binding Sites , CCCTC-Binding Factor/metabolism , DNA/metabolism , Mice , Neural Networks, Computer , Protein Binding , Transcription Factors/genetics , Transcription Factors/metabolism , Zinc/metabolism , Zinc Fingers/genetics
3.
BMC Genomics ; 23(1): 291, 2022 Apr 11.
Article in English | MEDLINE | ID: mdl-35410163

ABSTRACT

BACKGROUND: Evolutionary conservation is an invaluable tool for inferring functional significance in the genome, including regions that are crucial across many species and those that have undergone convergent evolution. Computational methods to test for sequence conservation are dominated by algorithms that examine the ability of one or more nucleotides to align across large evolutionary distances. While these nucleotide alignment-based approaches have proven powerful for protein-coding genes and some non-coding elements, they fail to capture conservation of many enhancers, distal regulatory elements that control spatial and temporal patterns of gene expression. The function of enhancers is governed by a complex, often tissue- and cell type-specific code that links combinations of transcription factor binding sites and other regulation-related sequence patterns to regulatory activity. Thus, function of orthologous enhancer regions can be conserved across large evolutionary distances, even when nucleotide turnover is high. RESULTS: We present a new machine learning-based approach for evaluating enhancer conservation that leverages the combinatorial sequence code of enhancer activity rather than relying on the alignment of individual nucleotides. We first train a convolutional neural network model that can predict tissue-specific open chromatin, a proxy for enhancer activity, across mammals. Next, we apply that model to distinguish instances where the genome sequence would predict conserved function versus a loss of regulatory activity in that tissue. We present criteria for systematically evaluating model performance for this task and use them to demonstrate that our models accurately predict tissue-specific conservation and divergence in open chromatin between primate and rodent species, vastly out-performing leading nucleotide alignment-based approaches. 
We then apply our models to predict open chromatin at orthologs of brain and liver open chromatin regions across hundreds of mammals and find that brain enhancers associated with neuron activity have a stronger tendency than the general population to have predicted lineage-specific open chromatin. CONCLUSION: The framework presented here provides a mechanism to annotate tissue-specific regulatory function across hundreds of genomes and to study enhancer evolution using predicted regulatory differences rather than nucleotide-level conservation measurements.


Subject(s)
Chromatin , Enhancer Elements, Genetic , Animals , Chromatin/genetics , Humans , Mammals/genetics , Neural Networks, Computer , Nucleotides
4.
Bioinformatics ; 36(15): 4339-4340, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32407523

ABSTRACT

SUMMARY: Diverse traits have evolved through cis-regulatory changes in genome sequence that influence the magnitude, timing and cell type-specificity of gene expression. Advances in high-throughput sequencing and regulatory genomics have led to the identification of regulatory elements in individual species, but these genomic regions remain difficult to align across taxonomic orders due to their lack of sequence conservation relative to protein coding genes. The groundwork for tracing the evolution of regulatory elements is provided by the recent assembly of hundreds of genomes, the generation of reference-free Cactus multiple sequence alignments of these genomes, and the development of the halLiftover tool for mapping regions across these alignments. We present halLiftover Post-processing for the Evolution of Regulatory Elements (HALPER), a tool for constructing contiguous regulatory element orthologs from the outputs of halLiftover. We anticipate that this tool will enable users to efficiently identify orthologs of regulatory elements across hundreds of species, providing novel insights into the evolution of traits that have evolved through gene expression. AVAILABILITY AND IMPLEMENTATION: HALPER is implemented in Python and available on GitHub: https://github.com/pfenninglab/halLiftover-postprocessing. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Genome , Regulatory Sequences, Nucleic Acid/genetics , Sequence Alignment
5.
Genome Res ; 25(6): 907-17, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25910490

ABSTRACT

DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
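
As a rough illustration of how an association between a genetic variant and methylation might be tested from pooled reads, the sketch below counts reads that cover both a SNP and a nearby CpG and applies a two-sided Fisher's exact test to the resulting 2x2 table. The read counts, the function name, and the choice of Fisher's exact test are assumptions made for illustration only; the abstract does not state which statistical test the authors used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical pooled read counts at one SNP-CpG pair (not data from the study):
# rows = allele carried by the read, columns = (methylated, unmethylated).
ref_meth, ref_unmeth = 18, 2
alt_meth, alt_unmeth = 3, 17

p_value = fisher_exact_two_sided(ref_meth, ref_unmeth, alt_meth, alt_unmeth)
print(f"p = {p_value:.2e}")
```

Because each read reports the allele and the methylation state together, a table like this can be built without individual-level genotype or methylation data, which is the property the abstract emphasizes.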


Subject(s)
Chromosome Mapping , DNA Methylation , Polymorphism, Single Nucleotide , Alleles , Cell Line , Computational Biology , Computer Simulation , Databases, Genetic , Epigenesis, Genetic , Gene Expression Regulation , Gene Library , Genetic Association Studies , Genome, Human , Genotype , Humans , Phenotype , Quantitative Trait Loci , Reproducibility of Results , Sequence Analysis, DNA
6.
Science ; 383(6690): eabn3263, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38422184

ABSTRACT

Vocal production learning ("vocal learning") is a convergently evolved trait in vertebrates. To identify brain genomic elements associated with mammalian vocal learning, we integrated genomic, anatomical, and neurophysiological data from the Egyptian fruit bat (Rousettus aegyptiacus) with analyses of the genomes of 215 placental mammals. First, we identified a set of proteins evolving more slowly in vocal learners. Then, we discovered a vocal motor cortical region in the Egyptian fruit bat, an emergent vocal learner, and leveraged that knowledge to identify active cis-regulatory elements in the motor cortex of vocal learners. Machine learning methods applied to motor cortex open chromatin revealed 50 enhancers robustly associated with vocal learning whose activity tended to be lower in vocal learners. Our research implicates convergent losses of motor cortex regulatory elements in mammalian vocal learning evolution.


Subject(s)
Enhancer Elements, Genetic , Eutheria , Evolution, Molecular , Gene Expression Regulation , Motor Cortex , Motor Neurons , Proteins , Vocalization, Animal , Animals , Chiroptera/genetics , Chiroptera/physiology , Vocalization, Animal/physiology , Motor Cortex/cytology , Motor Cortex/physiology , Chromatin/metabolism , Motor Neurons/physiology , Larynx/physiology , Epigenesis, Genetic , Genome , Proteins/genetics , Proteins/metabolism , Amino Acid Sequence , Eutheria/genetics , Eutheria/physiology , Machine Learning
7.
Science ; 380(6643): eabm7993, 2023 04 28.
Article in English | MEDLINE | ID: mdl-37104615

ABSTRACT

Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.


Subject(s)
Enhancer Elements, Genetic , Genetic Variation , Machine Learning , Mammals , Animals , Mammals/genetics , Phenotype
8.
Science ; 380(6643): eabn3943, 2023 04 28.
Article in English | MEDLINE | ID: mdl-37104599

ABSTRACT

Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.


Subject(s)
Eutheria , Evolution, Molecular , Animals , Female , Humans , Conserved Sequence/genetics , Eutheria/genetics , Genome, Human
9.
Elife ; 11, 2022 05 16.
Article in English | MEDLINE | ID: mdl-35576146

ABSTRACT

Recent discoveries of extreme cellular diversity in the brain warrant rapid development of technologies to access specific cell populations within heterogeneous tissue. Available approaches for engineering targeted technologies for new neuron subtypes are low yield, involving intensive transgenic strain or virus screening. Here, we present Specific Nuclear-Anchored Independent Labeling (SNAIL), an improved virus-based strategy for cell labeling and nuclear isolation from heterogeneous tissue. SNAIL works by leveraging machine learning and other computational approaches to identify DNA sequence features that confer cell type-specific gene activation and then make a probe that drives an affinity purification-compatible reporter gene. As a proof of concept, we designed and validated two novel SNAIL probes that target parvalbumin-expressing (PV+) neurons. Nuclear isolation using SNAIL in wild-type mice is sufficient to capture characteristic open chromatin features of PV+ neurons in the cortex, striatum, and external globus pallidus. The SNAIL framework also has high utility for multispecies cell probe engineering; expression from a mouse PV+ SNAIL enhancer sequence was enriched in PV+ neurons of the macaque cortex. Expansion of this technology has broad applications in cell type-specific observation, manipulation, and therapeutics across species and disease models.


Subject(s)
Enhancer Elements, Genetic , Machine Learning , Neurons , Sequence Analysis, DNA , Animals , Cerebral Cortex/metabolism , Computational Biology/methods , Enhancer Elements, Genetic/genetics , Globus Pallidus , Mice , Neurons/metabolism , Parvalbumins/metabolism , Sequence Analysis, DNA/methods
10.
Science ; 374(6564): 201-206, 2021 Oct 08.
Article in English | MEDLINE | ID: mdl-34618556

ABSTRACT

Symptoms of neurological diseases emerge through the dysfunction of neural circuits whose diffuse and intertwined architectures pose serious challenges for delivering therapies. Deep brain stimulation (DBS) improves Parkinson's disease symptoms acutely but does not differentiate between neuronal circuits, and its effects decay rapidly if stimulation is discontinued. Recent findings suggest that optogenetic manipulation of distinct neuronal subpopulations in the external globus pallidus (GPe) provides long-lasting therapeutic effects in dopamine-depleted (DD) mice. We used synaptic differences to excite parvalbumin-expressing GPe neurons and inhibit lim-homeobox-6-expressing GPe neurons simultaneously using brief bursts of electrical stimulation. In DD mice, circuit-inspired DBS provided long-lasting therapeutic benefits that far exceeded those induced by conventional DBS, extending several hours after stimulation. These results establish the feasibility of transforming knowledge of circuit architecture into translatable therapeutic approaches.


Subject(s)
Deep Brain Stimulation/methods , Dopamine/deficiency , Globus Pallidus/physiopathology , Neurons/physiology , Parkinson Disease/therapy , Transcutaneous Electric Nerve Stimulation/methods , Animals , Disease Models, Animal , Dopamine/genetics , Female , Globus Pallidus/cytology , Male , Mice , Mice, Inbred C57BL , Optogenetics , Parkinson Disease/physiopathology , Subthalamic Nucleus/cytology , Subthalamic Nucleus/physiopathology , Synapses/physiology
11.
Bioinformatics ; 25(12): i21-9, 2009 Jun 15.
Article in English | MEDLINE | ID: mdl-19477990

ABSTRACT

MOTIVATION: Genome-wide association studies are commonly used to identify possible associations between genetic variations and diseases. These studies mainly focus on identifying individual single nucleotide polymorphisms (SNPs) potentially linked with one disease of interest. In this work, we introduce a novel methodology that identifies similarities between diseases using information from a large number of SNPs. We separate the diseases for which we have individual genotype data into one reference disease and several query diseases. We train a classifier that distinguishes between individuals that have the reference disease and a set of control individuals. This classifier is then used to classify the individuals that have the query diseases. We can then rank query diseases according to the average classification of the individuals in each disease set, and identify which of the query diseases are more similar to the reference disease. We repeat these classification and comparison steps so that each disease is used once as reference disease. RESULTS: We apply this approach using a decision tree classifier to the genotype data of seven common diseases and two shared control sets provided by the Wellcome Trust Case Control Consortium. We show that this approach identifies the known genetic similarity between type 1 diabetes and rheumatoid arthritis, and identifies a new putative similarity between bipolar disease and hypertension.
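
The reference/query procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a one-level decision stump stands in for the decision tree classifier, the genotypes are tiny invented vectors of minor-allele counts, and the names `train_stump`, `rank_query_diseases`, `query_A`, and `query_B` are hypothetical. Only a single round with one reference disease is shown; the abstract repeats the procedure so that each disease serves once as the reference.

```python
from statistics import mean


def train_stump(cases, controls):
    """Fit a one-level decision stump: choose the SNP and threshold that best
    separate reference-disease cases (label 1) from controls (label 0)."""
    samples = [(g, 1) for g in cases] + [(g, 0) for g in controls]
    best = None  # (accuracy, snp_index, threshold, polarity)
    for snp in range(len(samples[0][0])):
        for threshold in (0, 1):
            for polarity in (1, 0):  # label predicted when genotype > threshold
                correct = sum(
                    (polarity if g[snp] > threshold else 1 - polarity) == label
                    for g, label in samples
                )
                candidate = (correct / len(samples), snp, threshold, polarity)
                if best is None or candidate[0] > best[0]:
                    best = candidate
    _, snp, threshold, polarity = best
    return lambda g: polarity if g[snp] > threshold else 1 - polarity


def rank_query_diseases(classifier, query_sets):
    """Rank query diseases by the mean predicted label of their individuals."""
    scores = {
        name: mean(classifier(g) for g in people)
        for name, people in query_sets.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented genotypes: each individual is a list of minor-allele counts per SNP.
reference_cases = [[2, 0, 1], [2, 1, 0], [1, 0, 2], [2, 2, 1]]
controls = [[0, 0, 1], [0, 1, 0], [0, 2, 2], [0, 1, 1]]
query_sets = {
    "query_A": [[2, 0, 0], [1, 1, 1], [2, 2, 2]],
    "query_B": [[0, 0, 0], [0, 1, 2], [1, 0, 1]],
}

classifier = train_stump(reference_cases, controls)
ranking = rank_query_diseases(classifier, query_sets)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A query disease whose individuals are mostly classified as reference cases ends up at the top of the ranking, which is the signal the method interprets as genetic similarity to the reference disease.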


Subject(s)
Arthritis, Rheumatoid/genetics , Computational Biology/methods , Diabetes Mellitus, Type 1/genetics , Genetic Predisposition to Disease/genetics , Arthritis, Rheumatoid/classification , Diabetes Mellitus, Type 1/classification , Gene Expression Profiling , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Polymorphism, Single Nucleotide
12.
Nat Commun ; 10(1): 4063, 2019 09 06.
Article in English | MEDLINE | ID: mdl-31492858

ABSTRACT

Pooled CRISPR-Cas9 screens are a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we investigate Cas9, dCas9, and CRISPRi/a off-target activity in screens for essential regulatory elements. The single guide RNAs (sgRNAs) with the largest effects in genome-scale screens for essential CTCF loop anchors in K562 cells were not sgRNAs that disrupted gene expression near the on-target CTCF anchor. Rather, these sgRNAs had high off-target activity that, while only weakly correlated with absolute off-target site number, could be predicted by the recently developed GuideScan specificity score. Screens conducted in parallel with CRISPRi/a, which do not induce double-stranded DNA breaks, revealed that a distinct set of off-targets also cause strong confounding fitness effects with these epigenome-editing tools. Promisingly, filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and enabled identification of essential regulatory elements.
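
The library-filtering step mentioned at the end of the abstract amounts to dropping guides whose specificity score falls below a cutoff. The sketch below shows that idea only; the `Guide` record, the guide names and sequences, the scores, and the cutoff value are all hypothetical, and the code assumes a score in which higher values indicate fewer predicted off-target sites. The abstract does not give the cutoff the authors applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    name: str
    sequence: str
    specificity: float  # assumed: higher means fewer predicted off-targets


def filter_guides(guides, min_specificity):
    """Keep only guides whose specificity score meets the cutoff."""
    return [g for g in guides if g.specificity >= min_specificity]


# Invented library entries for illustration only.
library = [
    Guide("anchor1_g1", "GACCTGAACTCCGATGTCAA", 0.82),
    Guide("anchor1_g2", "GGGAGGCTGAGGCAGGAGAA", 0.03),
    Guide("anchor2_g1", "TCGATTACGCCATGGTACTG", 0.41),
]

HYPOTHETICAL_CUTOFF = 0.2  # placeholder value, not taken from the abstract
kept = filter_guides(library, HYPOTHETICAL_CUTOFF)
print([g.name for g in kept])
```

Applying the filter before analysis, rather than after, keeps low-specificity guides from contributing to the element-level effect estimates at all.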


Subject(s)
CRISPR-Cas Systems , Gene Expression Regulation, Neoplastic , Genome, Human/genetics , RNA, Guide, Kinetoplastida/genetics , Regulatory Elements, Transcriptional/genetics , Computational Biology/methods , Epigenesis, Genetic/genetics , Epigenomics/methods , Gene Editing/methods , HEK293 Cells , Humans , K562 Cells