Abstract
Approximately a quarter of the human genome consists of gene deserts, large regions devoid of genes often located adjacent to developmental genes and thought to contribute to their regulation. However, defining the regulatory functions embedded within these deserts is challenging due to their large size. Here, we explore the cis-regulatory architecture of a gene desert flanking the Shox2 gene, which encodes a transcription factor indispensable for proximal limb, craniofacial, and cardiac pacemaker development. We identify the gene desert as a regulatory hub containing more than 15 distinct enhancers recapitulating anatomical subdomains of Shox2 expression. Ablation of the gene desert leads to embryonic lethality due to Shox2 depletion in the cardiac sinus venosus, caused in part by the loss of a specific distal enhancer. The gene desert is also required for stylopod morphogenesis, mediated via distributed proximal limb enhancers. In summary, our study establishes a multi-layered role of the Shox2 gene desert in orchestrating pleiotropic developmental expression through modular arrangement and coordinated dynamics of tissue-specific enhancers.
Subjects
Enhancer Elements, Genetic; Gene Expression Regulation, Developmental; Homeodomain Proteins; Animals; Humans; Mice; Homeodomain Proteins/genetics; Homeodomain Proteins/metabolism; Morphogenesis
Abstract
Regulatory elements (enhancers) are major drivers of gene expression in mammals and harbor many genetic variants associated with human diseases. Here, we present an updated VISTA Enhancer Browser (https://enhancer.lbl.gov), a database of transgenic enhancer assays conducted in developing mouse embryos in vivo. Since the original publication in 2007, the database has grown nearly 20-fold from 250 to over 4,500 experiments and currently harbors over 23,500 images. The updated database provides structured information on experiments conducted at different stages of embryonic development, including enhancer activities of human pathogenic and synthetic variants and sequences derived from a variety of species. In addition to manually curated results of thousands of individual experiments, the new database also features hundreds of manually curated comparisons between alleles. The VISTA Enhancer Browser provides a crucial resource for the study of human genetic variation, gene regulation and developmental biology.
Abstract
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generate single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. We evaluate enhancer activity for 59 elements using an in vivo transgenic assay and validate 44 (75%), demonstrating that single cell accessibility can be a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieve significant reduction in our variant search space and nominate candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work delivers non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Subjects
Enhancer Elements, Genetic; Animals; Mice; Humans; Enhancer Elements, Genetic/genetics; Motor Neurons/metabolism; Chromatin/metabolism; Chromatin/genetics; Male; Single-Cell Analysis; Epigenomics/methods; Female; Pedigree
Abstract
Distant-acting enhancers are central to human development. However, our limited understanding of their functional sequence features prevents the interpretation of enhancer mutations in disease. Here, we determined the functional sensitivity to mutagenesis of human developmental enhancers in vivo. Focusing on seven enhancers active in the developing brain, heart, limb and face, we created over 1700 transgenic mice for over 260 mutagenized enhancer alleles. Systematic mutation of 12-basepair blocks collectively altered each sequence feature in each enhancer at least once. We show that 69% of all blocks are required for normal in vivo activity, with mutations more commonly resulting in loss (60%) than in gain (9%) of function. Using predictive modeling, we annotated critical nucleotides at base-pair resolution. The vast majority of motifs predicted by these machine learning models (88%) coincided with changes to in vivo function, and the models showed considerable sensitivity, identifying 59% of all functional blocks. Taken together, our results reveal that human enhancers contain a high density of sequence features required for their normal in vivo function and provide a rich resource for further exploration of human enhancer logic.
Abstract
Little is known about the role of non-coding regions in the etiology of autism spectrum disorder (ASD). We examined three classes of non-coding regions: human accelerated regions (HARs), which show signatures of positive selection in humans; experimentally validated neural VISTA enhancers (VEs); and conserved regions predicted to act as neural enhancers (CNEs). Targeted and whole-genome analysis of >16,600 samples and >4,900 ASD probands revealed that likely recessive, rare, inherited variants in HARs, VEs, and CNEs substantially contribute to ASD risk in probands whose parents share ancestry, which enriches for recessive contributions, but modestly contribute, if at all, in simplex family structures. We identified multiple patient variants in HARs near IL1RAPL1 and in VEs near OTX1 and SIM1 and showed that they change enhancer activity. Our results implicate both human-evolved and evolutionarily conserved non-coding regions in ASD risk and suggest potential mechanisms of how regulatory changes can modulate social behavior.
Subjects
Autism Spectrum Disorder; Humans; Autism Spectrum Disorder/genetics; Autism Spectrum Disorder/epidemiology; Genetic Predisposition to Disease; Enhancer Elements, Genetic/genetics; Male; Evolution, Molecular; Female
Abstract
Chondrocyte differentiation controls skeleton development and stature. Here we provide a comprehensive map of chondrocyte-specific enhancers and show that they provide a mechanistic framework through which non-coding genetic variants can influence skeletal development and human stature. Working with fetal chondrocytes isolated from mice bearing a Col2a1 fluorescent regulatory sensor, we identify 780 genes and 2,704 putative enhancers specifically active in chondrocytes using a combination of RNA-seq, ATAC-seq and H3K27ac ChIP-seq. Most of these enhancers (74%) show pan-chondrogenic activity, with smaller populations being restricted to limb (18%) or trunk (8%) chondrocytes only. Notably, genetic variations overlapping these enhancers better explain height differences than those overlapping non-chondrogenic enhancers. Finally, targeted deletions of identified enhancers at the Fgfr3, Col2a1, Hhip, and Nkx3-2 loci confirm their role in regulating cognate genes. This enhancer map provides a framework for understanding how genes and non-coding variations influence bone development and diseases.
Subjects
Chondrocytes; Chondrogenesis; Enhancer Elements, Genetic; Receptor, Fibroblast Growth Factor, Type 3; Animals; Enhancer Elements, Genetic/genetics; Humans; Chondrocytes/metabolism; Chondrocytes/cytology; Mice; Chondrogenesis/genetics; Receptor, Fibroblast Growth Factor, Type 3/genetics; Receptor, Fibroblast Growth Factor, Type 3/metabolism; Collagen Type II/genetics; Collagen Type II/metabolism; Gene Expression Regulation, Developmental; Bone Development/genetics; Extremities/embryology; Male; Cell Differentiation/genetics; Transcription Factors/genetics; Transcription Factors/metabolism; Female
Abstract
Genetic studies find hundreds of thousands of noncoding variants associated with psychiatric disorders. Massively parallel reporter assays (MPRAs) and in vivo transgenic mouse assays can be used to assay the impact of these variants. However, the relevance of MPRAs to in vivo function is unknown and transgenic assays suffer from low throughput. Here, we studied the utility of combining the two assays to study the impact of non-coding variants. We carried out an MPRA on over 50,000 sequences derived from enhancers validated in transgenic mouse assays and from multiple fetal neuronal ATAC-seq datasets. We also tested over 20,000 variants, including synthetic mutations in highly active neuronal enhancers and 177 common variants associated with psychiatric disorders. Variants with a high impact on MPRA activity were further tested in mice. We found a strong and specific correlation between MPRA and mouse neuronal enhancer activity including changes in neuronal enhancer activity in mouse embryos for variants with strong MPRA effects. Mouse assays also revealed pleiotropic variant effects that could not be observed in MPRA. Our work provides a large catalog of functional neuronal enhancers and variant effects and highlights the effectiveness of combining MPRAs and mouse transgenic assays.
Abstract
The genetic basis of human facial variation and craniofacial birth defects remains poorly understood. Distant-acting transcriptional enhancers control the fine-tuned spatiotemporal expression of genes during critical stages of craniofacial development. However, a lack of accurate maps of the genomic locations and cell type-resolved activities of craniofacial enhancers prevents their systematic exploration in human genetics studies. Here, we combine histone modification, chromatin accessibility, and gene expression profiling of human craniofacial development with single-cell analyses of the developing mouse face to define the regulatory landscape of facial development at tissue- and single cell-resolution. We provide temporal activity profiles for 14,000 human developmental craniofacial enhancers. We find that 56% of human craniofacial enhancers share chromatin accessibility in the mouse and we provide cell population- and embryonic stage-resolved predictions of their in vivo activity. Taken together, our data provide an expansive resource for genetic and developmental studies of human craniofacial development.
Subjects
Chromatin; Regulatory Sequences, Nucleic Acid; Humans; Animals; Mice; Chromatin/genetics; Gene Expression Profiling; Genomics; Protein Processing, Post-Translational
Abstract
Little is known about the role of noncoding regions in the etiology of autism spectrum disorder (ASD). We examined three classes of noncoding regions: Human Accelerated Regions (HARs), which show signatures of positive selection in humans; experimentally validated neural Vista Enhancers (VEs); and conserved regions predicted to act as neural enhancers (CNEs). Targeted and whole genome analysis of >16,600 samples and >4900 ASD probands revealed that likely recessive, rare, inherited variants in HARs, VEs, and CNEs substantially contribute to ASD risk in probands whose parents share ancestry, which enriches for recessive contributions, but modestly, if at all, in simplex family structures. We identified multiple patient variants in HARs near IL1RAPL1 and in a VE near SIM1 and showed that they change enhancer activity. Our results implicate both human-evolved and evolutionarily conserved noncoding regions in ASD risk and suggest potential mechanisms of how changes in regulatory regions can modulate social behavior.
Abstract
The genome engineering capability of the CRISPR/Cas system depends on the DNA repair machinery to generate the final outcome. Several genes can have an impact on mutations created, but their exact function and contribution to the result of the repair are not completely characterised. This lack of knowledge has limited the ability to comprehend and regulate the editing outcomes. Here, we measure how the absence of 21 repair genes changes the mutation outcomes of Cas9-generated cuts at 2,812 synthetic target sequences in mouse embryonic stem cells. Absence of key non-homologous end joining genes Lig4, Xrcc4, and Xlf abolished small insertions and deletions, while disabling key microhomology-mediated repair genes Nbn and Polq reduced frequency of longer deletions. Complex alleles of combined insertion and deletions were preferentially generated in the absence of Xrcc6. We further discover finer structure in the outcome frequency changes for single nucleotide insertions and deletions between large microhomologies that are differentially modulated by the knockouts. We use the knowledge of the reproducible variation across repair milieus to build predictive models of Cas9 editing results that outperform the current standards. This work improves our understanding of DNA repair gene function, and provides avenues for more precise modulation of CRISPR/Cas9-generated mutations.
Abstract
The genetic basis of craniofacial birth defects and general variation in human facial shape remains poorly understood. Distant-acting transcriptional enhancers are a major category of non-coding genome function and have been shown to control the fine-tuned spatiotemporal expression of genes during critical stages of craniofacial development (refs. 1-3). However, a lack of accurate maps of the genomic location and cell type-specific in vivo activities of all craniofacial enhancers prevents their systematic exploration in human genetics studies. Here, we combined histone modification and chromatin accessibility profiling from different stages of human craniofacial development with single-cell analyses of the developing mouse face to create a comprehensive catalogue of the regulatory landscape of facial development at tissue- and single cell-resolution. In total, we identified approximately 14,000 enhancers across seven developmental stages from weeks 4 through 8 of human embryonic face development. We used transgenic mouse reporter assays to determine the in vivo activity patterns of human face enhancers predicted from these data. Across 16 in vivo validated human enhancers, we observed a rich diversity of craniofacial subregions in which these enhancers are active in vivo. To annotate the cell type specificities of human-mouse conserved enhancers, we performed single-cell RNA-seq and single-nucleus ATAC-seq of mouse craniofacial tissues from embryonic days e11.5 to e15.5. By integrating these data across species, we find that the majority (56%) of human craniofacial enhancers are functionally conserved in mice, providing cell type- and embryonic stage-resolved predictions of their in vivo activity profiles. Using retrospective analysis of known craniofacial enhancers in combination with single cell-resolved transgenic reporter assays, we demonstrate the utility of these data for predicting the in vivo cell type specificity of enhancers. Taken together, our data provide an expansive resource for genetic and developmental studies of human craniofacial development.
Abstract
Hereditary congenital facial paresis type 1 (HCFP1) is an autosomal dominant disorder of absent or limited facial movement that maps to chromosome 3q21-q22 and is hypothesized to result from facial branchial motor neuron (FBMN) maldevelopment. In the present study, we report that HCFP1 results from heterozygous duplications within a neuron-specific GATA2 regulatory region that includes two enhancers and one silencer, and from noncoding single-nucleotide variants (SNVs) within the silencer. Some SNVs impair binding of NR2F1 to the silencer in vitro and in vivo and attenuate in vivo enhancer reporter expression in FBMNs. Gata2 and its effector Gata3 are essential for inner-ear efferent neuron (IEE) but not FBMN development. A humanized HCFP1 mouse model extends Gata2 expression, favors the formation of IEEs over FBMNs and is rescued by conditional loss of Gata3. These findings highlight the importance of temporal gene regulation in development and of noncoding variation in rare mendelian disease.
Subjects
Facial Paralysis; Animals; Mice; Facial Paralysis/genetics; Facial Paralysis/congenital; Facial Paralysis/metabolism; GATA2 Transcription Factor/genetics; GATA2 Transcription Factor/metabolism; Motor Neurons/metabolism; Neurogenesis; Neurons, Efferent
Abstract
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Abstract
Heart disease is associated with re-expression of key transcription factors normally active only during prenatal development of the heart. However, the impact of this reactivation on the regulatory landscape in heart disease is unclear. Here, we use RNA-seq and ChIP-seq targeting a histone modification associated with active transcriptional enhancers to generate genome-wide enhancer maps from left ventricle tissue from up to 26 healthy controls, 18 individuals with idiopathic dilated cardiomyopathy (DCM), and five fetal hearts. Healthy individuals have a highly reproducible epigenomic landscape, consisting of more than 33,000 predicted heart enhancers. In contrast, we observe reproducible disease-associated changes in activity at 6,850 predicted heart enhancers. Combined analysis of adult and fetal samples reveals that the heart disease epigenome and transcriptome both acquire fetal-like characteristics, with 3,400 individual enhancers sharing fetal regulatory properties. We also provide a comprehensive data resource (http://heart.lbl.gov) for the mechanistic exploration of DCM etiology.
Subjects
Cardiomyopathy, Dilated; Enhancer Elements, Genetic; Adult; Enhancer Elements, Genetic/genetics; Epigenome; Epigenomics; Humans; Transcription Factors
Abstract
Repair of Cas9-induced double-stranded breaks results primarily in formation of small insertions and deletions (indels), but can also cause potentially harmful large deletions. While mechanisms leading to the creation of small indels are relatively well understood, very little is known about the origins of large deletions. Using a library of clonal NGS-validated mouse embryonic stem cells deficient for 32 DNA repair genes, we have shown that large deletion frequency increases in cells impaired for non-homologous end joining and decreases in cells deficient for the central resection gene Nbn and the microhomology-mediated end joining gene Polq. Across deficient clones, increase in large deletion frequency was closely correlated with the increase in the extent of microhomology and the size of small indels, implying a continuity of repair processes across different genomic scales. Furthermore, by targeting diverse genomic sites, we identified examples of repair processes that were highly locus-specific, discovering a role for exonuclease Trex1. Finally, we present evidence that indel sizes increase with the overall efficiency of Cas9 mutagenesis. These findings may have impact on both basic research and clinical use of CRISPR-Cas9, in particular in conjunction with repair pathway modulation.
Subjects
CRISPR-Cas Systems; DNA Breaks, Double-Stranded; Animals; DNA End-Joining Repair/genetics; DNA Repair/genetics; INDEL Mutation; Mice
Abstract
The DNA mutation produced by cellular repair of a CRISPR-Cas9-generated double-strand break determines its phenotypic effect. It is known that the mutational outcomes are not random, but depend on DNA sequence at the targeted location. Here we systematically study the influence of flanking DNA sequence on repair outcome by measuring the edits generated by >40,000 guide RNAs (gRNAs) in synthetic constructs. We performed the experiments in a range of genetic backgrounds and using alternative CRISPR-Cas9 reagents. In total, we gathered data for >10^9 mutational outcomes. The majority of reproducible mutations are insertions of a single base, short deletions or longer microhomology-mediated deletions. Each gRNA has an individual cell-line-dependent bias toward particular outcomes. We uncover sequence determinants of the mutations produced and use these to derive a predictor of Cas9 editing outcomes. Improved understanding of sequence repair will allow better design of gene editing experiments.
Abstract
CRISPR-Cas9 is poised to become the gene editing tool of choice in clinical contexts. Thus far, exploration of Cas9-induced genetic alterations has been limited to the immediate vicinity of the target site and distal off-target sequences, leading to the conclusion that CRISPR-Cas9 was reasonably specific. Here we report significant on-target mutagenesis, such as large deletions and more complex genomic rearrangements at the targeted sites in mouse embryonic stem cells, mouse hematopoietic progenitors and a human differentiated cell line. Using long-read sequencing and long-range PCR genotyping, we show that DNA breaks introduced by single-guide RNA/Cas9 frequently resolved into deletions extending over many kilobases. Furthermore, lesions distal to the cut site and crossover events were identified. The observed genomic damage in mitotically active cells caused by CRISPR-Cas9 editing may have pathogenic consequences.
Subjects
CRISPR-Cas Systems; DNA Breaks, Double-Stranded; Sequence Deletion; Animals; Genotype; Humans; Mice; Mutagenesis; Polymerase Chain Reaction/methods
Abstract
The introduction of CRISPR/Cas9 gene editing in mammalian cells is a scientific breakthrough, which has greatly affected basic research and gene therapy. The simplicity of and general access to CRISPR/Cas9 reagents have, in an unprecedented manner, "democratized" gene targeting in biomedical research, enabling genetic engineering of any gene in any cell, tissue, organ, and organism. The ability to profile the double-stranded break-induced insertions and deletions (indels) mediated by any of the available programmable nucleases quickly, precisely, and efficiently is paramount to any given gene targeting approach. In this study, we review the most commonly used indel detection methods. Using a robust, sensitive, and cost-efficient Indel Detection by Amplicon Analysis method, we investigate the impact of the most commonly used CRISPR/Cas9 delivery formats, including lentivirus transduction, plasmid lipofection, and ribonucleoprotein (RNP) electroporation, on the dynamics of indel profile formation. We observe rapid indel formation using RNP electroporation, especially with synthetic stabilized gRNA, as well as a long-term decline in overall indel frequency with lipofectamine-based plasmid transfection methods. Most methods reach peak editing on day 2-3 postdelivery. Furthermore, we find a relative increase in the frequency of larger indels (>6 bp) under conditions of persistent editing using stably integrated lentiviral gRNA and Cas9 vectors.