ABSTRACT
CRISPR-based gene activation (CRISPRa) is a strategy for upregulating gene expression by targeting promoters or enhancers in a tissue- or cell-type-specific manner. Here, we describe an experimental framework that combines highly multiplexed perturbations with single-cell RNA sequencing (scRNA-seq) to identify cell-type-specific, CRISPRa-responsive cis-regulatory elements and the gene(s) they regulate. Random combinations of many gRNAs are introduced to each of many cells, which are then profiled and partitioned into test and control groups to test for effect(s) of CRISPRa perturbations of both enhancers and promoters on the expression of neighboring genes. Applying this method to a library of 493 gRNAs targeting candidate cis-regulatory elements in both K562 cells and iPSC-derived excitatory neurons, we identify gRNAs capable of specifically upregulating intended target genes and no other neighboring genes within 1 Mb, including gRNAs yielding upregulation of six autism spectrum disorder (ASD) and neurodevelopmental disorder (NDD) risk genes in neurons. A consistent pattern is that the responsiveness of individual enhancers to CRISPRa is restricted by cell type, implying a dependency on chromatin landscape and/or additional trans-acting factors for successful gene activation. The approach outlined here may facilitate large-scale screens for gRNAs that activate genes in a cell-type-specific manner.
Subject(s)
CRISPR-Cas Systems; Enhancer Elements, Genetic; Single-Cell Analysis; Humans; Single-Cell Analysis/methods; K562 Cells; Enhancer Elements, Genetic/genetics; Promoter Regions, Genetic/genetics; RNA, Guide, CRISPR-Cas Systems/genetics; Autism Spectrum Disorder/genetics; Neurons/metabolism; Induced Pluripotent Stem Cells/metabolism; Induced Pluripotent Stem Cells/cytology; Clustered Regularly Interspaced Short Palindromic Repeats/genetics
ABSTRACT
BACKGROUND: Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of the regulatory programs this variation affects can shed light on the apparatuses of human diseases. RESULTS: We collect epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we construct networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks serve as the base for a rich series of analyses, through which we demonstrate their temporal dynamics and enrichment for various disease-associated variants. We apply the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrate methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. CONCLUSIONS: Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes; this includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
Subject(s)
Enhancer Elements, Genetic; Gene Regulatory Networks; Promoter Regions, Genetic; Humans; Neurogenesis/genetics; Cell Differentiation; Transcription Factors/metabolism; Transcription Factors/genetics; Models, Genetic; Neurons/metabolism
ABSTRACT
Spermatogenesis is a complex process that can be disrupted by genetic and epigenetic changes, potentially leading to male infertility. Recent research has rapidly increased the number of protein-coding mutations causally linked to impaired spermatogenesis in humans and mice. However, the role of non-coding mutations remains largely unexplored. As a case study to evaluate the effects of non-coding mutations on spermatogenesis, we first identified an evolutionarily conserved topologically associating domain (TAD) boundary near two genes with important roles in mammalian testis function: Dmrtb1 and Lrp8. We then used CRISPR-Cas9 to generate a mouse line in which 26 kb of the boundary was removed, including a strong and evolutionarily conserved CTCF binding site. ChIP-seq and Hi-C experiments confirmed the removal of the CTCF site and a resulting increase in DNA-DNA interactions across the domain boundary. Mutant mice displayed significant changes in testis gene expression, abnormal testis histology, a 35% drop in the estimated efficiency of spermatogenesis and a 28% decrease in daily sperm production compared to littermate controls. Despite these quantitative changes in testis function, mutant mice showed no significant changes in fertility. This suggests that non-coding deletions affecting testis gene regulation may have smaller effects on fertility than coding mutations of the same genes. Our results demonstrate that disruption of a TAD boundary can have a negative impact on sperm production and highlight the importance of considering non-coding mutations in the analysis of patients with male infertility.
ABSTRACT
Background: Increasing evidence suggests that a substantial proportion of disease-associated mutations occur in enhancers, regions of non-coding DNA essential to gene regulation. Understanding the structures and mechanisms of regulatory programs this variation affects can shed light on the apparatuses of human diseases. Results: We collected epigenetic and gene expression datasets from seven early time points during neural differentiation. Focusing on this model system, we constructed networks of enhancer-promoter interactions, each at an individual stage of neural induction. These networks served as the base for a rich series of analyses, through which we demonstrated their temporal dynamics and enrichment for various disease-associated variants. We applied the Girvan-Newman clustering algorithm to these networks to reveal biologically relevant substructures of regulation. Additionally, we demonstrated methods to validate predicted enhancer-promoter interactions using transcription factor overexpression and massively parallel reporter assays. Conclusions: Our findings suggest a generalizable framework for exploring gene regulatory programs and their dynamics across developmental processes. This includes a comprehensive approach to studying the effects of disease-associated variation on transcriptional networks. The techniques applied to our networks have been published alongside our findings as a computational tool, E-P-INAnalyzer. Our procedure can be utilized across different cellular contexts and disorders.
ABSTRACT
Nucleotide variants in cell type-specific gene regulatory elements in the human brain are risk factors for human disease. We measured chromatin accessibility in 1932 aliquots of sorted neurons and non-neurons from 616 human postmortem brains and identified 34,539 open chromatin regions with chromatin accessibility quantitative trait loci (caQTLs). Only 10.4% of caQTLs are shared between neurons and non-neurons, which supports cell type-specific genetic regulation of the brain regulome. Incorporating allele-specific chromatin accessibility improves statistical fine-mapping and refines molecular mechanisms that underlie disease risk. Using massively parallel reporter assays in induced excitatory neurons, we screened 19,893 brain QTLs and identified the functional impact of 476 regulatory variants. Combined, this comprehensive resource captures variation in the human brain regulome and provides insights into disease etiology.
Subject(s)
Brain Diseases; Brain; Chromatin; Gene Expression Regulation; Regulatory Elements, Transcriptional; Humans; Alleles; Brain/metabolism; Brain Diseases/genetics; Chromatin/metabolism; Neurons/metabolism; Quantitative Trait Loci; Male; Female
ABSTRACT
Nucleotide changes in gene regulatory elements are important determinants of neuronal development and diseases. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 open chromatin regions, including thousands of sequences with cell type-specific accessibility and variants associated with brain gene regulation. In primary cells, we identified 46,802 active enhancer sequences and 164 variants that alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.
Subject(s)
Cerebral Cortex; Neurogenesis; Organoids; Humans; Cerebral Cortex/embryology; Cerebral Cortex/metabolism; Chromatin/metabolism; Chromatin/genetics; Deep Learning; Enhancer Elements, Genetic; Gene Expression Regulation, Developmental; Neurogenesis/genetics; Neurons/metabolism; Organoids/metabolism; Regulatory Sequences, Nucleic Acid; Promoter Regions, Genetic; Regulatory Elements, Transcriptional
ABSTRACT
Genetic studies find hundreds of thousands of noncoding variants associated with psychiatric disorders. Massively parallel reporter assays (MPRAs) and in vivo transgenic mouse assays can be used to assay the impact of these variants. However, the relevance of MPRAs to in vivo function is unknown and transgenic assays suffer from low throughput. Here, we studied the utility of combining the two assays to study the impact of non-coding variants. We carried out an MPRA on over 50,000 sequences derived from enhancers validated in transgenic mouse assays and from multiple fetal neuronal ATAC-seq datasets. We also tested over 20,000 variants, including synthetic mutations in highly active neuronal enhancers and 177 common variants associated with psychiatric disorders. Variants with a high impact on MPRA activity were further tested in mice. We found a strong and specific correlation between MPRA and mouse neuronal enhancer activity including changes in neuronal enhancer activity in mouse embryos for variants with strong MPRA effects. Mouse assays also revealed pleiotropic variant effects that could not be observed in MPRA. Our work provides a large catalog of functional neuronal enhancers and variant effects and highlights the effectiveness of combining MPRAs and mouse transgenic assays.
ABSTRACT
The human genome contains millions of retrotransposons, several of which could become active due to somatic mutations, with phenotypic consequences including disease. However, it is not thoroughly understood how nucleotide changes in retrotransposons affect their jumping activity. Here, we developed a novel massively parallel jumping assay (MPJA) that can test the jumping potential of thousands of transposons en masse. We generated a nucleotide variant library of four selected Alu retrotransposons containing 165,087 different haplotypes and tested them for their jumping ability using MPJA. We found 66,821 unique jumping haplotypes, allowing us to pinpoint domains and variants vital for transposition. Mapping these variants to the Alu-RNA secondary structure revealed stem-loop features that contribute to jumping potential. Combined, our work provides a novel high-throughput assay that assesses the ability of retrotransposons to jump and identifies nucleotide changes that have the potential to reactivate them in the human genome.
ABSTRACT
Adolescent idiopathic scoliosis (AIS), a sideways curvature of the spine, is sexually dimorphic, with increased incidence in females. A genome-wide association study identified a female-specific AIS susceptibility locus near the PAX1 gene. Here, we use mouse enhancer assays, three mouse enhancer knockouts, and subsequent phenotypic analyses to characterize this region. Using mouse enhancer assays, we characterize a sequence, PEC7, which overlaps the AIS-associated variant, and find it to be active in the tail tip and intervertebral disc. Removal of PEC7 or Xe1, a known sclerotome enhancer nearby, or deletion of both sequences leads to a kinky tail phenotype only in the Xe1 and combined (Xe1+PEC7) knockouts, with only the latter showing a female sex-dimorphic phenotype. Extensive phenotypic characterization of these mouse lines implicates several differentially expressed genes and estrogen signaling in the sex-dimorphic bias. In summary, our work functionally characterizes an AIS-associated locus and dissects the mechanism for its sexual dimorphism.
Subject(s)
Scoliosis; Animals; Female; Mice; Genetic Predisposition to Disease; Genome-Wide Association Study; Scoliosis/genetics; Scoliosis/epidemiology; Tail; Transcription Factors/genetics
ABSTRACT
Regulation of gene expression through enhancers is one of the major processes shaping the structure and function of the human brain during development. High-throughput assays have predicted thousands of enhancers involved in neurodevelopment, and confirming their activity through orthogonal functional assays is crucial. Here, we utilized massively parallel reporter assays (MPRAs) in stem cells and forebrain organoids to evaluate the activity of ~7,000 gene-linked enhancers previously identified in human fetal tissues and brain organoids. We used a Gaussian mixture model to evaluate the contribution of background noise to the measured activity signal and confirmed the activity of ~35% of the tested enhancers, with most showing temporal-specific activity, suggesting their evolving role in neurodevelopment. The temporal specificity was further supported by the correlation of activity with gene expression. Our findings provide a valuable gene regulatory resource to the scientific community.
Subject(s)
Gene Expression Regulation; Regulatory Sequences, Nucleic Acid; Humans; Organoids; Prosencephalon; Enhancer Elements, Genetic
ABSTRACT
Children diagnosed with autism spectrum disorder (ASD) commonly present with sensory hypersensitivity or abnormally strong reactions to sensory stimuli. Such hypersensitivity can be overwhelming, causing high levels of distress that contribute markedly to the negative aspects of the disorder. Here, we identify a mechanism that underlies hypersensitivity in a sensorimotor reflex found to be altered in humans and in mice with loss of function in the ASD risk-factor gene SCN2A. The cerebellum-dependent vestibulo-ocular reflex (VOR), which helps maintain one's gaze during movement, was hypersensitized due to deficits in cerebellar synaptic plasticity. Heterozygous loss of SCN2A-encoded NaV1.2 sodium channels in granule cells impaired high-frequency transmission to Purkinje cells and long-term potentiation, a form of synaptic plasticity important for modulating VOR gain. VOR plasticity could be rescued in mice via a CRISPR-activator approach that increases Scn2a expression, demonstrating that evaluation of a simple reflex can be used to assess and quantify successful therapeutic intervention.
Subject(s)
Autism Spectrum Disorder; Cerebellum; NAV1.2 Voltage-Gated Sodium Channel; Neuronal Plasticity; Animals; NAV1.2 Voltage-Gated Sodium Channel/genetics; NAV1.2 Voltage-Gated Sodium Channel/metabolism; Mice; Neuronal Plasticity/physiology; Cerebellum/metabolism; Autism Spectrum Disorder/genetics; Autism Spectrum Disorder/physiopathology; Humans; Reflex, Vestibulo-Ocular/physiology; Male; Purkinje Cells/metabolism; Mice, Inbred C57BL
ABSTRACT
Skin color is highly variable in Africans, yet little is known about the underlying molecular mechanism. Here we applied massively parallel reporter assays to screen 1,157 candidate variants influencing skin pigmentation in Africans and identified 165 single-nucleotide polymorphisms showing differential regulatory activities between alleles. We combine Hi-C, genome editing and melanin assays to identify regulatory elements for MFSD12, HMG20B, OCA2, MITF, LEF1, TRPS1, BLOC1S6 and CYB561A3 that impact melanin levels in vitro and modulate human skin color. We found that independent mutations in an OCA2 enhancer contribute to the evolution of human skin color diversity and detect signals of local adaptation at enhancers of MITF, LEF1 and TRPS1, which may contribute to the light skin color of Khoesan-speaking populations from Southern Africa. Additionally, we identified CYB561A3 as a novel pigmentation regulator that impacts genes involved in oxidative phosphorylation and melanogenesis. These results provide insights into the mechanisms underlying human skin color diversity and adaptive evolution.
Subject(s)
Albinism, Oculocutaneous; Melanins; Skin Pigmentation; Humans; Skin Pigmentation/genetics; Melanins/genetics; Alleles; Genomics; Pigmentation/genetics; Polymorphism, Single Nucleotide/genetics; Repressor Proteins/genetics
ABSTRACT
Adolescent idiopathic scoliosis (AIS) is a common and progressive spinal deformity in children that exhibits striking sexual dimorphism, with girls at more than fivefold greater risk of severe disease compared to boys. Despite its medical impact, the molecular mechanisms that drive AIS are largely unknown. We previously defined a female-specific AIS genetic risk locus in an enhancer near the PAX1 gene. Here, we sought to define the roles of PAX1 and newly identified AIS-associated genes in the developmental mechanism of AIS. In a genetic study of 10,519 individuals with AIS and 93,238 unaffected controls, significant association was identified with a variant in COL11A1 encoding collagen (α1) XI (rs3753841; NM_080629.2_c.4004C>T; p.(Pro1335Leu); p=7.07E-11, OR = 1.118). Using CRISPR mutagenesis we generated Pax1 knockout mice (Pax1-/-). In postnatal spines we found that PAX1 and collagen (α1) XI protein both localize within the intervertebral disc-vertebral junction region encompassing the growth plate, with less collagen (α1) XI detected in Pax1-/- spines compared to wild-type. By genetic targeting we found that wild-type Col11a1 expression in costal chondrocytes suppresses expression of Pax1 and of Mmp3, encoding the matrix metalloproteinase 3 enzyme implicated in matrix remodeling. However, the latter suppression was abrogated in the presence of the AIS-associated COL11A1 P1335L mutant. Further, we found that either knockdown of the estrogen receptor gene Esr2 or tamoxifen treatment significantly altered Col11a1 and Mmp3 expression in chondrocytes. We propose a new molecular model of AIS pathogenesis wherein genetic variation and estrogen signaling increase disease susceptibility by altering a PAX1-COL11a1-MMP3 signaling axis in spinal chondrocytes.
Adolescent idiopathic scoliosis (AIS) is a twisting deformity of the spine that occurs during periods of rapid growth in children worldwide. Children with severe cases of AIS require surgery to stop it from getting worse, presenting a significant financial burden to health systems and families. Although AIS is known to cluster in families, its genetic causes and its inheritance pattern have remained elusive. Additionally, AIS is known to be more prevalent in females, a bias that has not been explained. Advances in techniques to study the genetics underlying diseases have revealed that certain variations that increase the risk of AIS affect cartilage and connective tissue. In humans, one such variation is near a gene called Pax1, and it is female-specific. The extracellular matrix is a network of proteins and other molecules in the space between cells that help connect tissues together, and it is particularly important in cartilage and other connective tissues. One of the main components of the extracellular matrix is collagen. Yu, Kanshour, Ushiki et al. hypothesized that changes in the extracellular matrix could affect the cartilage and connective tissues of the spine, leading to AIS. To show this, the scientists screened over 100,000 individuals and found that AIS is associated with variants in two genes coding for extracellular matrix proteins. One of these variants was found in a gene called Col11a1, which codes for one of the proteins that makes up collagen. To understand the relationship between Pax1 and Col11a1, Yu, Kanshour, Ushiki et al. genetically modified mice so that they would lack the Pax1 gene. In these mice, the activation of Col11a1 was reduced in the mouse spine. They also found that the form of Col11a1 associated with AIS could not suppress the activation of a gene called Mmp3 in mouse cartilage cells as effectively as unmutated Col11a1. 
Going one step further, the researchers found that lowering the levels of an estrogen receptor altered the activation patterns of Pax1, Col11a1, and Mmp3 in mouse cartilage cells. These findings suggest a possible mechanism for AIS, particularly in females. The findings of Yu, Kanshour, Ushiki et al. highlight that cartilage cells in the spine are particularly relevant in AIS. The results also point to specific molecules within the extracellular matrix as important for maintaining proper alignment in the spine when children are growing rapidly. This information may guide future therapies aimed at maintaining healthy spinal cells in adolescent children, particularly girls.
Subject(s)
Scoliosis; Male; Animals; Child; Mice; Humans; Female; Adolescent; Scoliosis/genetics; Matrix Metalloproteinase 3/genetics; Spine; Transcription Factors/genetics; Collagen/genetics; Genetic Variation; Collagen Type XI/genetics
ABSTRACT
The advent of the perturbation-based massively parallel reporter assay (MPRA) technique has facilitated the delineation of the roles of non-coding regulatory elements in orchestrating gene expression. However, computational efforts to evaluate and establish guidelines for sequence design strategies for perturbation MPRAs remain scant. In this study, we propose a framework for evaluating and comparing various perturbation strategies for MPRA experiments. Within this framework, we benchmark three different perturbation approaches from the perspectives of alteration in motif-based profiles, consistency of MPRA outputs, and robustness of models that predict the activities of putative regulatory motifs. While our analyses show very similar results across multiple benchmarking metrics, the predictive modeling for the approach involving random nucleotide shuffling shows significantly greater robustness than that for the other two approaches. Thus, we recommend designing sequences by randomly shuffling the nucleotides of the perturbed site in perturbation-MPRA, followed by a coherence check to prevent the introduction of other variations of the target motifs. In summary, our evaluation framework and the benchmarking findings create a resource of computational pipelines and highlight the potential of perturbation-MPRA in predicting non-coding regulatory activities.
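The recommended design step (shuffle the perturbed site, then run a coherence check) can be illustrated with a minimal Python sketch. The abstract does not specify how the coherence check is implemented, so everything here is an assumption made for illustration: the function names, the use of a Hamming-distance threshold on both strands as the "motif variant" test, and the example sequence.

```python
import random

# Hypothetical helper names; the abstract does not define an API.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def introduces_motif_variant(seq, start, motif, max_mismatches=1):
    """Assumed coherence check: scan every window overlapping the perturbed
    site, on both strands, for a sequence within `max_mismatches` of the motif."""
    k = len(motif)
    lo = max(0, start - k + 1)
    hi = min(len(seq), start + 2 * k - 1)
    region = seq[lo:hi]
    for strand in (region, reverse_complement(region)):
        for i in range(len(strand) - k + 1):
            if hamming(strand[i:i + k], motif) <= max_mismatches:
                return True
    return False


def shuffle_site(seq, start, motif, max_mismatches=1, max_tries=1000, rng=None):
    """Randomly shuffle the nucleotides of the motif site, retrying until the
    coherence check passes. Flanking sequence is left untouched."""
    rng = rng or random.Random()
    end = start + len(motif)
    if seq[start:end] != motif:
        raise ValueError("motif not found at the given position")
    site = list(motif)
    for _ in range(max_tries):
        rng.shuffle(site)
        candidate = seq[:start] + "".join(site) + seq[end:]
        if not introduces_motif_variant(candidate, start, motif, max_mismatches):
            return candidate
    raise RuntimeError("no acceptable shuffle found")


# Illustrative input only: an AP-1-like site between arbitrary flanks.
original = "AAAAAAAA" + "TGACTCA" + "CCCCCCCC"
perturbed = shuffle_site(original, 8, "TGACTCA", rng=random.Random(0))
```

Restricting the check to windows that overlap the perturbed site means a second, untouched copy of the motif elsewhere in the sequence does not cause every shuffle to be rejected; whether the published pipeline makes the same choice is not stated in the abstract.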
Subject(s)
Genetic Techniques; Regulatory Sequences, Nucleic Acid; Nucleotides
ABSTRACT
Frugivory evolved multiple times in mammals, including bats. However, the cellular and molecular components driving it remain largely unknown. Here, we use integrative single-cell sequencing (scRNA-seq and scATAC-seq) on insectivorous (Eptesicus fuscus; big brown bat) and frugivorous (Artibeus jamaicensis; Jamaican fruit bat) bat kidneys and pancreases and identify key cell population, gene expression and regulatory differences associated with the Jamaican fruit bat that also relate to human disease, particularly diabetes. We find a decrease in loop of Henle and an increase in collecting duct cells, and differentially active genes and regulatory elements involved in fluid and electrolyte balance in the Jamaican fruit bat kidney. The Jamaican fruit bat pancreas shows an increase in endocrine and a decrease in exocrine cells, and differences in genes and regulatory elements involved in insulin regulation. We also find that these frugivorous bats share several molecular characteristics with human diabetes. Combined, our work provides insights from a frugivorous mammal that could be leveraged for therapeutic purposes.
Subject(s)
Chiroptera; Diabetes Mellitus; Humans; Animals; Pancreas; Kidney; Epithelial Cells
ABSTRACT
Massively parallel reporter assays (MPRAs) represent a set of high-throughput technologies that measure the functional effects of thousands of sequences/variants on gene regulatory activity. There are several different variations of MPRA technology and they are used for numerous applications, including regulatory element discovery, variant effect measurement, saturation mutagenesis, synthetic regulatory element generation or characterization of evolutionary gene regulatory differences. Despite their many designs and uses, there is no comprehensive database that incorporates the results of these experiments. To address this, we developed MPRAbase, a manually curated database that currently harbors 129 experiments, encompassing 17,718,677 elements tested across 35 cell types and 4 organisms. The MPRAbase web interface (http://www.mprabase.com) serves as a centralized user-friendly repository to download existing MPRA data for independent analysis and is designed with the ability to allow researchers to share their published data for rapid dissemination to the community.
ABSTRACT
Topologically associating domains (TADs) are self-interacting genomic units crucial for shaping gene regulation patterns. Despite their importance, the extent of their evolutionary conservation and its functional implications remain largely unknown. In this study, we generate Hi-C and ChIP-seq data, compare TAD organization across four primate and four rodent species, and characterize the genetic and epigenetic properties of TAD boundaries in relation to their evolutionary conservation. We find 14% of all human TAD boundaries to be shared among all eight species (ultraconserved), while 15% are human-specific. Ultraconserved TAD boundaries have stronger insulation strength, CTCF binding, and enrichment of older retrotransposons compared to species-specific boundaries. CRISPR-Cas9 knockouts of an ultraconserved boundary in a mouse model lead to tissue-specific gene expression changes and morphological phenotypes. Deletion of a human-specific boundary near the autism-related AUTS2 gene results in the upregulation of this gene in neurons. Overall, our study provides pertinent TAD boundary evolutionary conservation annotations and showcases the functional importance of TAD evolution.