ABSTRACT
Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.
Subjects
DNA Copy Number Variations; Genome, Human; DNA Copy Number Variations/genetics; Gene Dosage; Haploinsufficiency/genetics; Humans
ABSTRACT
Point mutations and structural variants that directly disrupt the coding sequence of MEF2C have been associated with a spectrum of neurodevelopmental disorders (NDDs). However, the impact of MEF2C haploinsufficiency on neurodevelopmental pathways and synaptic processes is not well understood, nor are the complex mechanisms that govern its regulation. To explore the functional changes associated with structural variants that alter MEF2C expression and/or regulation, we generated an allelic series of 204 isogenic human induced pluripotent stem cell (hiPSC)-derived neural stem cells and glutamatergic induced neurons. These neuronal models harbored CRISPR-engineered mutations that involved direct deletion of MEF2C or deletion of the boundary points for topologically associating domains (TADs) and chromatin loops encompassing MEF2C. Systematic profiling of mutation-specific alterations, contrasted to unedited controls that were exposed to the same guide RNAs for each edit, revealed that deletion of MEF2C caused differential expression of genes associated with neurodevelopmental pathways and synaptic function. We also discovered significant reduction in synaptic activity measured by multielectrode arrays (MEAs) in neuronal cells. By contrast, we observed robust buffering against MEF2C regulatory disruption following deletion of a distal 5q14.3 TAD and loop boundary, whereas homozygous loss of a proximal loop boundary resulted in down-regulation of MEF2C expression and reduced electrophysiological activity on MEA that was comparable to direct gene disruption. Collectively, these studies highlight the considerable functional impact of MEF2C deletion in neuronal cells and systematically characterize the complex interactions that challenge a priori predictions of regulatory consequences from structural variants that disrupt three-dimensional genome organization.
Subjects
Induced Pluripotent Stem Cells; Neural Stem Cells; Humans; Genome; Haploinsufficiency; MEF2 Transcription Factors/genetics; Neurons; Transcription, Genetic
ABSTRACT
Chromosome 16p11.2 reciprocal genomic disorder, resulting from recurrent copy-number variants (CNVs), involves intellectual disability, autism spectrum disorder (ASD), and schizophrenia, but the responsible mechanisms are not known. To systematically dissect molecular effects, we performed transcriptome profiling of 350 libraries from six tissues (cortex, cerebellum, striatum, liver, brown fat, and white fat) in mouse models harboring CNVs of the syntenic 7qF3 region, as well as cellular, transcriptional, and single-cell analyses in 54 isogenic neural stem cell, induced neuron, and cerebral organoid models of CRISPR-engineered 16p11.2 CNVs. Transcriptome-wide differentially expressed genes were largely tissue-, cell-type-, and dosage-specific, although more effects were shared between deletion and duplication and across tissue than expected by chance. The broadest effects were observed in the cerebellum (2,163 differentially expressed genes), and the greatest enrichments were associated with synaptic pathways in mouse cerebellum and human induced neurons. Pathway and co-expression analyses identified energy and RNA metabolism as shared processes and enrichment for ASD-associated, loss-of-function constraint, and fragile X messenger ribonucleoprotein target gene sets. Intriguingly, reciprocal 16p11.2 dosage changes resulted in consistent decrements in neurite and electrophysiological features, and single-cell profiling of organoids showed reciprocal alterations to the proportions of excitatory and inhibitory GABAergic neurons. Changes both in neuronal ratios and in gene expression in our organoid analyses point most directly to calretinin GABAergic inhibitory neurons and the excitatory/inhibitory balance as targets of disruption that might contribute to changes in neurodevelopmental and cognitive function in 16p11.2 carriers. Collectively, our data indicate the genomic disorder involves disruption of multiple contributing biological processes and that this disruption has relative impacts that are context specific.
Subjects
Autism Spectrum Disorder; Chromosome Disorders; Intellectual Disability; Animals; Autism Spectrum Disorder/genetics; Calbindin 2/genetics; Cerebral Cortex; Chromosome Deletion; Chromosome Disorders/genetics; Chromosomes, Human, Pair 16/genetics; DNA Copy Number Variations; Genomics; Humans; Intellectual Disability/genetics; Mice; Neurons; RNA
ABSTRACT
Hennekam lymphangiectasia-lymphedema syndrome is an autosomal recessive disorder characterized by congenital lymphedema, intestinal lymphangiectasia, facial dysmorphism, and variable intellectual disability. Known disease genes include CCBE1, FAT4, and ADAMTS3. In a patient with clinically diagnosed Hennekam syndrome but without mutations or copy-number changes in the three known disease genes, we identified a homozygous single-exon deletion affecting FBXL7. Specifically, exon 3, which encodes the F-box domain and several leucine-rich repeats of FBXL7, is eliminated. Our analyses of databases representing >100,000 control individuals failed to identify biallelic loss-of-function variants in FBXL7. Published studies in Drosophila indicate Fbxl7 interacts with Fat, of which human FAT4 is an ortholog, and mutation of either gene yields similar morphological consequences. These data suggest that FBXL7 may be the fourth gene for Hennekam syndrome, acting via a shared pathway with FAT4.
Subjects
Craniofacial Abnormalities/genetics; F-Box Proteins/genetics; Genetic Predisposition to Disease; Lymphangiectasis, Intestinal/genetics; Lymphedema/genetics; ADAMTS Proteins/genetics; Alleles; Animals; Child, Preschool; Craniofacial Abnormalities/complications; Craniofacial Abnormalities/pathology; Drosophila melanogaster/genetics; Genotype; Homozygote; Humans; Lymphangiectasis, Intestinal/complications; Lymphangiectasis, Intestinal/pathology; Lymphedema/complications; Lymphedema/pathology; Male; Molecular Diagnostic Techniques/methods; Mutation/genetics; Pedigree; Phenotype; Procollagen N-Endopeptidase/genetics
ABSTRACT
Recurrent rearrangements of Chromosome 8p23.1 are associated with congenital heart defects and developmental delay. The complexity of this region has led to inconsistencies in the current reference assembly, confounding studies of genetic variation. Using comparative sequence-based approaches, we generated a high-quality 6.3-Mbp alternate reference assembly of an inverted Chromosome 8p23.1 haplotype. Comparison with nonhuman primates reveals a 746-kbp duplicative transposition and two separate inversion events that arose in the last million years of human evolution. The breakpoints associated with these rearrangements map to an ape-specific interchromosomal core duplicon that clusters at sites of evolutionary inversion (P = 7.8 × 10⁻⁵). Refinement of microdeletion breakpoints identifies a subgroup of patients that map to the same interchromosomal core involved in the evolutionary formation of the duplication blocks. Our results define a higher-order genomic instability element that has shaped the structure of specific chromosomes during primate evolution contributing to rearrangements associated with inversion and disease.
Subjects
Evolution, Molecular; Genetic Predisposition to Disease; Genomic Instability; Segmental Duplications, Genomic; Animals; Chromosome Breakpoints; Chromosome Deletion; Chromosomes, Human, Pair 8/genetics; Humans; Primates/genetics
ABSTRACT
We searched for disruptive, genic rare copy-number variants (CNVs) among 411 families affected by sporadic autism spectrum disorder (ASD) from the Simons Simplex Collection by using available exome sequence data and CoNIFER (Copy Number Inference from Exome Reads). Compared to high-density SNP microarrays, our approach yielded ~2× more smaller genic rare CNVs. We found that affected probands inherited more CNVs than did their siblings (453 versus 394, p = 0.004; odds ratio [OR] = 1.19) and that the probands' CNVs affected more genes (921 versus 726, p = 0.02; OR = 1.30). These smaller CNVs (median size 18 kb) were transmitted preferentially from the mother (136 maternal versus 100 paternal, p = 0.02), although this bias occurred irrespective of affected status. The excess burden of inherited CNVs among probands was driven primarily by sibling pairs with discordant social-behavior phenotypes (p < 0.0002, measured by Social Responsiveness Scale [SRS] score), which contrasts with families where the phenotypes were more closely matched or less extreme (p > 0.5). Finally, we found enrichment of brain-expressed genes unique to probands, especially in the SRS-discordant group (p = 0.0035). In a combined model, our inherited CNVs, de novo CNVs, and de novo single-nucleotide variants all independently contributed to the risk of autism (p < 0.05). Taken together, these results suggest that small transmitted rare CNVs play a role in the etiology of simplex autism. Importantly, the small size of these variants aids in the identification of specific genes as additional risk factors associated with ASD.
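The maternal transmission bias reported above (136 maternal versus 100 paternal events) can be checked with an exact two-sided binomial test against a 50:50 null. The abstract does not name the test the authors used, so this is only a sketch under that assumption; the function name `two_sided_binomial_p` is hypothetical.

```python
from math import comb


def two_sided_binomial_p(successes: int, trials: int) -> float:
    """Exact two-sided binomial p-value against a 50:50 null.

    The null distribution is symmetric, so the two-sided p-value is
    twice the probability of the more extreme tail (capped at 1).
    """
    k = max(successes, trials - successes)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


# Counts taken from the abstract; the choice of test is an assumption.
maternal, paternal = 136, 100
p_value = two_sided_binomial_p(maternal, maternal + paternal)
print(f"maternal fraction = {maternal / (maternal + paternal):.3f}, p = {p_value:.3f}")
```

Under this assumed test the p-value should land near the reported 0.02. The other figures in the abstract, such as the odds ratios, depend on per-family data that the abstract does not give, so they are not recomputed here.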
Subjects
Child Development Disorders, Pervasive/genetics; DNA Copy Number Variations; Linkage Disequilibrium; Child; Exome; Female; Gene Expression; Genetic Predisposition to Disease; Humans; Male; Phenotype; Polymorphism, Single Nucleotide; Risk; Risk Factors; Siblings
ABSTRACT
Copy number variation (CNV) contributes to disease and has restructured the genomes of great apes. The diversity and rate of this process, however, have not been extensively explored among great ape lineages. We analyzed 97 deeply sequenced great ape and human genomes and estimate 16% (469 Mb) of the hominid genome has been affected by recent CNV. We identify a comprehensive set of fixed gene deletions (n = 340) and duplications (n = 405) as well as >13.5 Mb of sequence that has been specifically lost on the human lineage. We compared the diversity and rates of copy number and single nucleotide variation across the hominid phylogeny. We find that CNV diversity partially correlates with single nucleotide diversity (r² = 0.5) and recapitulates the phylogeny of apes with few exceptions. Duplications significantly outpace deletions (2.8-fold). The load of segregating duplications remains significantly higher in bonobos, Western chimpanzees, and Sumatran orangutans, populations that have experienced recent genetic bottlenecks (P = 0.0014, 0.02, and 0.0088, respectively). The rate of fixed deletion has been more clocklike with the exception of the chimpanzee lineage, where we observe a twofold increase in the chimpanzee-bonobo ancestor (P = 4.79 × 10⁻⁹) and increased deletion load among Western chimpanzees (P = 0.002). The latter includes the first genomic disorder in a chimpanzee with features resembling Smith-Magenis syndrome mediated by a chimpanzee-specific increase in segmental duplication complexity. We hypothesize that demographic effects, such as bottlenecks, have contributed to larger and more gene-rich segments being deleted in the chimpanzee lineage and that this effect, more generally, may account for episodic bursts in CNV during hominid evolution.
Subjects
DNA Copy Number Variations; Evolution, Molecular; Hominidae/genetics; Phylogeny; Animals; Base Sequence; Gene Deletion; Gene Duplication; Genetic Load; Genome, Human; Humans; Molecular Sequence Data; Pedigree; Polymorphism, Single Nucleotide; Sequence Analysis, DNA
ABSTRACT
BACKGROUND: IgE is a key mediator of allergic inflammation, and its levels are frequently increased in patients with allergic disorders. OBJECTIVE: We sought to identify genetic variants associated with IgE levels in Latinos. METHODS: We performed a genome-wide association study and admixture mapping of total IgE levels in 3334 Latinos from the Genes-environments & Admixture in Latino Americans (GALA II) study. Replication was evaluated in 454 Latinos, 1564 European Americans, and 3187 African Americans from independent studies. RESULTS: We confirmed associations of 6 genes identified by means of previous genome-wide association studies and identified a novel genome-wide significant association of a polymorphism in the zinc finger protein 365 gene (ZNF365) with total IgE levels (rs200076616, P = 2.3 × 10⁻⁸). We next identified 4 admixture mapping peaks (6p21.32-p22.1, 13p22-31, 14q23.2, and 22q13.1) at which local African, European, and/or Native American ancestry was significantly associated with IgE levels. The most significant peak was 6p21.32-p22.1, where Native American ancestry was associated with lower IgE levels (P = 4.95 × 10⁻⁸). All but 22q13.1 were replicated in an independent sample of Latinos, and 2 of the peaks were replicated in African Americans (6p21.32-p22.1 and 14q23.2). Fine mapping of 6p21.32-p22.1 identified 6 genome-wide significant single nucleotide polymorphisms in Latinos, 2 of which replicated in European Americans. Another single nucleotide polymorphism was peak-wide significant within 14q23.2 in African Americans (rs1741099, P = 3.7 × 10⁻⁶) and replicated in non-African American samples (P = .011). CONCLUSION: We confirmed genetic associations at 6 genes and identified novel associations within ZNF365, HLA-DQA1, and 14q23.2. Our results highlight the importance of studying diverse multiethnic populations to uncover novel loci associated with total IgE levels.
Subjects
Genetic Loci; Genome-Wide Association Study; Genotype; Hispanic or Latino; Immunoglobulin E/genetics; Polymorphism, Single Nucleotide; Adolescent; Adult; Black or African American; Child; Chromosome Mapping; Chromosomes, Human, Pair 14/chemistry; DNA-Binding Proteins/genetics; Female; Genome, Human; HLA-DQ alpha-Chains/genetics; Humans; Male; Transcription Factors/genetics; White People
ABSTRACT
New technologies and large-cohort studies have enabled novel variant discovery and association at unprecedented scale, yet functional characterization of these variants remains paramount to deciphering disease mechanisms. Approaches that facilitate parallelized genome editing of cells of interest or induced pluripotent stem cells (iPSCs) have become critical tools toward this goal. Here, we developed an approach that incorporates libraries of CRISPR-Cas9 guide RNAs (gRNAs) together with inducible Cas9 into a piggyBac (PB) transposon system to engineer dozens to hundreds of genomic variants in parallel against isogenic cellular backgrounds. This method empowers loss-of-function (LoF) studies through the introduction of insertions or deletions (indels) and copy-number variants (CNVs), though generating specific nucleotide changes is possible with prime editing. The ability to rapidly establish high-quality mutational models at scale will facilitate the development of isogenic cellular collections and catalyze comparative functional genomic studies investigating the roles of hundreds of genes and mutations in development and disease.
Subjects
CRISPR-Cas Systems; Induced Pluripotent Stem Cells; Humans; Gene Editing/methods; Mutation; Genomics
ABSTRACT
Nuclear compartments are prominent features of 3D chromatin organization, but sequencing depth limitations have impeded investigation at ultra fine-scale. CTCF loops are generally studied at a finer scale, but the impact of looping on proximal interactions remains enigmatic. Here, we critically examine nuclear compartments and CTCF loop-proximal interactions using a combination of in situ Hi-C at unparalleled depth, algorithm development, and biophysical modeling. Producing a large Hi-C map with 33 billion contacts in conjunction with an algorithm for performing principal component analysis on sparse, super massive matrices (POSSUMM), we resolve compartments to 500 bp. Our results demonstrate that essentially all active promoters and distal enhancers localize in the A compartment, even when flanking sequences do not. Furthermore, we find that the TSS and TTS of paused genes are often segregated into separate compartments. We then identify diffuse interactions that radiate from CTCF loop anchors, which correlate with strong enhancer-promoter interactions and proximal transcription. We also find that these diffuse interactions depend on CTCF's RNA binding domains. In this work, we demonstrate features of fine-scale chromatin organization consistent with a revised model in which compartments are more precise than commonly thought while CTCF loops are more protracted.
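The abstract above describes principal component analysis on a sparse contact matrix that is too large to store in dense form. As a rough, generic illustration of that idea, the sketch below runs power iteration on a mean-centered sparse symmetric matrix without ever building the dense matrix. It is not POSSUMM, whose internals the abstract does not describe; the function name, the toy contact map, and the simple mean-centering step are all assumptions made for illustration.

```python
import math


def leading_eigenvector(entries, n, iters=100):
    """Power iteration on (A - mu * 1 1^T) for a sparse symmetric matrix A.

    entries: dict {(i, j): value} holding only nonzero entries with i <= j.
    Neither the dense matrix nor the dense centering term is ever built.
    """
    total = sum(v if i == j else 2 * v for (i, j), v in entries.items())
    mu = total / (n * n)  # mean over all n*n cells, zeros included
    vec = [math.sin(k + 1) for k in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        out = [0.0] * n
        for (i, j), v in entries.items():  # sparse matrix-vector product
            out[i] += v * vec[j]
            if i != j:
                out[j] += v * vec[i]
        shift = mu * sum(vec)  # effect of the rank-one centering term
        out = [x - shift for x in out]
        norm = math.sqrt(sum(x * x for x in out))
        if norm == 0.0:
            break
        vec = [x / norm for x in out]
    return vec


# Hypothetical toy map: bins 0-2 contact each other, bins 3-5 contact each other.
toy = {(i, j): 4.0 for block in ((0, 1, 2), (3, 4, 5))
       for i in block for j in block if i <= j}
pc1 = leading_eigenvector(toy, 6)
# The sign of an eigenvector is arbitrary, so labels are set relative to bin 0.
labels = ["A" if x * pc1[0] > 0 else "B" for x in pc1]
print(labels)
```

The cost of each iteration scales with the number of nonzero contacts rather than with the square of the number of bins. In real analyses the A/B orientation is usually fixed with an external signal such as GC content or gene density, not by the first bin as done here.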
Subjects
Chromatin; Enhancer Elements, Genetic; Chromatin/genetics; CCCTC-Binding Factor/genetics; CCCTC-Binding Factor/metabolism; Enhancer Elements, Genetic/genetics; Cell Nucleus/genetics; Cell Nucleus/metabolism; Promoter Regions, Genetic
ABSTRACT
To assess the relative impact of inherited and de novo variants on autism risk, we generated a comprehensive set of exonic single-nucleotide variants (SNVs) and copy number variants (CNVs) from 2,377 families with autism. We find that private, inherited truncating SNVs in conserved genes are enriched in probands (odds ratio = 1.14, P = 0.0002) in comparison to unaffected siblings, an effect involving significant maternal transmission bias to sons. We also observe a bias for inherited CNVs, specifically for small (<100 kb), maternally inherited events (P = 0.01) that are enriched in CHD8 target genes (P = 7.4 × 10⁻³). Using a logistic regression model, we show that private truncating SNVs and rare, inherited CNVs are statistically independent risk factors for autism, with odds ratios of 1.11 (P = 0.0002) and 1.23 (P = 0.01), respectively. This analysis identifies a second class of candidate genes (for example, RIMS1, CUL7 and LZTR1) where transmitted mutations may create a sensitized background but are unlikely to be completely penetrant.
Subjects
Autistic Disorder/genetics; Codon, Nonsense; DNA Copy Number Variations; Exome; Female; Genetic Association Studies; Genetic Predisposition to Disease; Humans; Linkage Disequilibrium; Male; Polymorphism, Single Nucleotide; Risk
ABSTRACT
Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS.