1.
Nat Genet ; 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741017

ABSTRACT

Human pluripotent stem (hPS) cells can, in theory, be differentiated into any cell type, making them a powerful in vitro model for human biology. Recent technological advances have facilitated large-scale hPS cell studies that allow investigation of the genetic regulation of molecular phenotypes and their contribution to high-order phenotypes such as human disease. Integrating hPS cells with single-cell sequencing makes identifying context-dependent genetic effects during cell development or upon experimental manipulation possible. Here we discuss how the intersection of stem cell biology, population genetics and cellular genomics can help resolve the functional consequences of human genetic variation. We examine the critical challenges of integrating these fields and approaches to scaling them cost-effectively and practically. We highlight two areas of human biology that can particularly benefit from population-scale hPS cell studies, elucidating mechanisms underlying complex disease risk loci and evaluating relationships between common genetic variation and pharmacotherapeutic phenotypes.

2.
medRxiv ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38585811

ABSTRACT

Purpose: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). Methods: We coupled phenotyping with exome or genome sequencing of 467 pedigrees with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. Results: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. Conclusion: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.

3.
bioRxiv ; 2024 May 03.
Article in English | MEDLINE | ID: mdl-38645134

ABSTRACT

Missense variants can have a range of functional impacts depending on factors such as the specific amino acid substitution and location within the gene. To interpret their deleteriousness, studies have sought to identify regions within genes that are specifically intolerant of missense variation [1-12]. Here, we leverage the patterns of rare missense variation in 125,748 individuals in the Genome Aggregation Database (gnomAD) [13] against a null mutational model to identify transcripts that display regional differences in missense constraint. Missense-depleted regions are enriched for ClinVar [14] pathogenic variants, de novo missense variants from individuals with neurodevelopmental disorders (NDDs) [15,16], and complex trait heritability. Following ClinGen calibration recommendations for the ACMG/AMP guidelines, we establish that regions with less than 20% of their expected missense variation achieve moderate support for pathogenicity. We create a missense deleteriousness metric (MPC) that incorporates regional constraint and outperforms other deleteriousness scores at stratifying case and control de novo missense variation, with a strong enrichment in NDDs. These results provide additional tools to aid in missense variant interpretation.
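
The 20% threshold mentioned in this abstract can be read as a cutoff on an observed/expected ratio. The sketch below is only an illustration of that reading: the function names and the example counts are hypothetical, and the abstract does not describe how the expected count is derived from the mutational model.

```python
def missense_oe(observed: int, expected: float) -> float:
    """Observed/expected ratio of missense variants in a region.

    `expected` is assumed to come from some null mutational model;
    computing it is outside the scope of this sketch.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def is_strongly_depleted(observed: int, expected: float,
                         threshold: float = 0.2) -> bool:
    """True when the region has less than `threshold` (20% by default)
    of its expected missense variation."""
    return missense_oe(observed, expected) < threshold


# Hypothetical regions: 3 observed vs 25 expected, and 12 observed vs 25 expected.
print(is_strongly_depleted(3, 25.0))
print(is_strongly_depleted(12, 25.0))
```

A region with 3 observed against 25 expected variants (ratio 0.12) falls under the cutoff, while 12 against 25 (ratio 0.48) does not.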

4.
Genome Biol ; 25(1): 111, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685090

ABSTRACT

BACKGROUND: Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. RESULTS: We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. CONCLUSIONS: Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.


Subjects
5' Untranslated Regions , Open Reading Frames , Protein Biosynthesis , Humans , Gene Dosage , Gene Expression Regulation , Transcription Initiation Site , Alternative Splicing , Nucleic Acid Conformation
5.
Ann Clin Transl Neurol ; 11(5): 1250-1266, 2024 May.
Article in English | MEDLINE | ID: mdl-38544359

ABSTRACT

OBJECTIVE: Most families with heritable neuromuscular disorders do not receive a molecular diagnosis. Here we evaluate diagnostic utility of exome, genome, RNA sequencing, and protein studies and provide evidence-based recommendations for their integration into practice. METHODS: In total, 247 families with suspected monogenic neuromuscular disorders who remained without a genetic diagnosis after standard diagnostic investigations underwent research-led massively parallel sequencing: neuromuscular disorder gene panel, exome, genome, and/or RNA sequencing to identify causal variants. Protein and RNA studies were also deployed when required. RESULTS: Integration of exome sequencing and auxiliary genome, RNA and/or protein studies identified causal or likely causal variants in 62% (152 out of 247) of families. Exome sequencing alone informed 55% (83 out of 152) of diagnoses, with remaining diagnoses (45%; 69 out of 152) requiring genome sequencing, RNA and/or protein studies to identify variants and/or support pathogenicity. Arrestingly, novel disease genes accounted for <4% (6 out of 152) of diagnoses while 36.2% of solved families (55 out of 152) harbored at least one splice-altering or structural variant in a known neuromuscular disorder gene. We posit that contemporary neuromuscular disorder gene-panel sequencing could likely provide 66% (100 out of 152) of our diagnoses today. INTERPRETATION: Our results emphasize thorough clinical phenotyping to enable deep scrutiny of all rare genetic variation in phenotypically consistent genes. Post-exome auxiliary investigations extended our diagnostic yield by 81% overall (34-62%). We present a diagnostic algorithm that details deployment of genomic and auxiliary investigations to obtain these diagnoses today most effectively. We hope this provides a practical guide for clinicians as they gain greater access to clinical genome and transcriptome sequencing.
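
The percentages in this abstract follow from the family counts it reports. As a minimal check, the sketch below recomputes them; the variable names are illustrative, and only the counts 247, 152, 83, 69, 55 and 100 are taken from the text.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage of numerator over denominator."""
    return 100.0 * numerator / denominator


families = 247          # families studied
solved = 152            # families with a causal or likely causal variant
exome_only = 83         # diagnoses informed by exome sequencing alone
auxiliary = 69          # diagnoses needing genome, RNA and/or protein studies

overall_yield = pct(solved, families)        # about 61.5, reported as 62%
exome_yield = pct(exome_only, families)      # about 33.6, reported as 34%
exome_share = pct(exome_only, solved)        # about 54.6, reported as 55%
auxiliary_share = pct(auxiliary, solved)     # about 45.4, reported as 45%
splice_or_sv_share = pct(55, solved)         # about 36.2
panel_share = pct(100, solved)               # about 65.8, reported as 66%

# Relative gain from auxiliary investigations over exome sequencing alone.
relative_gain = pct(solved - exome_only, exome_only)

print(f"{overall_yield:.1f} {exome_yield:.1f} {relative_gain:.1f}")
```

From the raw counts the relative gain comes out at roughly 83%, in the same range as the 81% quoted in the abstract; the abstract does not state exactly how that figure was derived.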


Subjects
Exome Sequencing , Neuromuscular Diseases , Humans , Neuromuscular Diseases/genetics , Neuromuscular Diseases/diagnosis , Male , Female , Adult , RNA Sequence Analysis/methods , Child , Adolescent , Exome/genetics , Middle Aged , Young Adult , Preschool Child , High-Throughput Nucleotide Sequencing , Infant , Genetic Testing/methods
6.
medRxiv ; 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38496558

ABSTRACT

Genes encoding long non-coding RNAs (lncRNAs) comprise a large fraction of the human genome, yet haploinsufficiency of a lncRNA has not been shown to cause a Mendelian disease. CHASERR is a highly conserved human lncRNA adjacent to CHD2, a coding gene in which de novo loss-of-function variants cause developmental and epileptic encephalopathy. Here we report three unrelated individuals each harboring an ultra-rare heterozygous de novo deletion in the CHASERR locus. We report similarities in severe developmental delay, facial dysmorphisms, and cerebral dysmyelination in these individuals, distinguishing them from the phenotypic spectrum of CHD2 haploinsufficiency. We demonstrate reduced CHASERR mRNA expression and corresponding increased CHD2 mRNA and protein in whole blood and patient-derived cell lines, specifically increased expression of the CHD2 allele in cis with the CHASERR deletion, as predicted from a prior mouse model of Chaserr haploinsufficiency. We show for the first time that de novo structural variants facilitated by Alu-mediated non-allelic homologous recombination led to deletion of a non-coding element (the lncRNA CHASERR) to cause a rare syndromic neurodevelopmental disorder. We also demonstrate that CHD2 has bidirectional dosage sensitivity in human disease. This work highlights the need to carefully evaluate other lncRNAs, particularly those upstream of genes associated with Mendelian disorders.

7.
Nat Rev Genet ; 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38366034

ABSTRACT

Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.
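
The definition of an STR given here (repeated copies of a 1-6-bp motif) can be illustrated with a naive scan of a plain sequence string. This is only a toy sketch: the function name and the minimum of three copies are assumptions, and it is not one of the read-based genotyping methods the review compares.

```python
import re

# A 1-6 bp motif (shortest possible first) followed by at least two more copies.
STR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{2,}")


def find_tandem_repeats(seq: str):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    hits = []
    for match in STR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# A CAG repeat (the motif expanded in Huntington disease) flanked by other bases.
print(find_tandem_repeats("TTCAGCAGCAGCAGTT"))
```

Real genotypers must also handle interrupted repeats, sequencing errors and reads that do not span the whole repeat, which this exact-match scan ignores.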

9.
Nat Genet ; 56(1): 152-161, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057443

ABSTRACT

Recessive diseases arise when both copies of a gene are impacted by a damaging genetic variant. When a patient carries two potentially causal variants in a gene, accurate diagnosis requires determining that these variants occur on different copies of the chromosome (that is, are in trans) rather than on the same copy (that is, in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. Here we developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in the Genome Aggregation Database (v2, n = 125,748 exomes). Our approach estimates phase with 96% accuracy, both in trio data and in patients with Mendelian conditions and presumed causal compound heterozygous variants. We provide a public resource of phasing estimates for coding variants and counts per gene of rare variants in trans that can aid interpretation of rare co-occurring variants in the context of recessive disease.
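
One common way to infer phase from population genotypes is to estimate the four haplotype frequencies for a variant pair with an expectation-maximization (EM) step and compare the "trans" and "cis" configurations. The sketch below assumes that approach; the abstract does not spell out the algorithm, so the function name, the input layout and the iteration count are all hypothetical.

```python
def estimate_p_trans(counts, n_iter=100):
    """Estimate the probability that two rare variants lie in trans.

    counts[i][j] = number of individuals carrying i alternate alleles at
    variant 1 and j alternate alleles at variant 2 (i, j in 0..2).
    Returns None when the data carry no information about phase.
    """
    n_total = sum(sum(row) for row in counts)
    if n_total == 0:
        return None

    # Haplotype order: (ref,ref), (ref,alt), (alt,ref), (alt,alt).
    # Every genotype except the double heterozygote has unambiguous haplotypes.
    base = [
        2 * counts[0][0] + counts[0][1] + counts[1][0],
        2 * counts[0][2] + counts[0][1] + counts[1][2],
        2 * counts[2][0] + counts[1][0] + counts[2][1],
        2 * counts[2][2] + counts[1][2] + counts[2][1],
    ]
    double_het = counts[1][1]

    freqs = [0.25] * 4
    for _ in range(n_iter):
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        # E-step: share of double heterozygotes expected to be in trans.
        w_trans = trans / (cis + trans) if cis + trans > 0 else 0.5
        expected = [
            base[0] + double_het * (1 - w_trans),
            base[1] + double_het * w_trans,
            base[2] + double_het * w_trans,
            base[3] + double_het * (1 - w_trans),
        ]
        # M-step: haplotype frequencies over 2N chromosomes.
        freqs = [e / (2 * n_total) for e in expected]

    cis = freqs[0] * freqs[3]
    trans = freqs[1] * freqs[2]
    if cis + trans == 0:
        return None
    return trans / (cis + trans)


# Variants never seen together in one individual: consistent with trans.
never_together = [[9998, 1, 0], [1, 0, 0], [0, 0, 0]]
# Variants only ever seen together: consistent with cis.
always_together = [[9990, 0, 0], [0, 10, 0], [0, 0, 0]]
print(estimate_p_trans(never_together), estimate_p_trans(always_together))
```

The intuition is that two variants sitting on the same haplotype tend to be observed together in the population, whereas variants on different haplotypes are usually seen in different individuals.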


Subjects
Exome , High-Throughput Nucleotide Sequencing , Humans , Exome/genetics , Exome Sequencing , Genotype
10.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subjects
Human Genome , Genomics , Genetic Models , Mutation , Humans , Access to Information , Genetic Databases , Datasets as Topic , Gene Frequency , Human Genome/genetics , Mutation/genetics , Genetic Selection
11.
Am J Hum Genet ; 110(9): 1496-1508, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37633279

ABSTRACT

Predicted loss of function (pLoF) variants are often highly deleterious and play an important role in disease biology, but many pLoF variants may not result in loss of function (LoF). Here we present a framework that advances interpretation of pLoF variants in research and clinical settings by considering three categories of LoF evasion: (1) predicted rescue by secondary sequence properties, (2) uncertain biological relevance, and (3) potential technical artifacts. We also provide recommendations on adjustments to ACMG/AMP guidelines' PVS1 criterion. Applying this framework to all high-confidence pLoF variants in 22 genes associated with autosomal-recessive disease from the Genome Aggregation Database (gnomAD v.2.1.1) revealed predicted LoF evasion or potential artifacts in 27.3% (304/1,113) of variants. The major reasons were location in the last exon, in a homopolymer repeat, in a low proportion expressed across transcripts (pext) scored region, or the presence of cryptic in-frame splice rescues. Variants predicted to evade LoF or to be potential artifacts were enriched for ClinVar benign variants. PVS1 was downgraded in 99.4% (162/163) of pLoF variants predicted as likely not LoF/not LoF, with 17.2% (28/163) downgraded as a result of our framework, adding to previous guidelines. Variant pathogenicity was affected (mostly from likely pathogenic to VUS) in 20 (71.4%) of these 28 variants. This framework guides assessment of pLoF variants beyond standard annotation pipelines and substantially reduces false positive rates, which is key to ensure accurate LoF variant prediction in both a research and clinical setting.


Subjects
Inheritance Patterns , Humans , Exons , Uncertainty
12.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subjects
Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , First Pregnancy Trimester , Prenatal Ultrasonography , Chromosome Mapping , Exome
13.
Am J Hum Genet ; 110(9): 1482-1495, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37652022

ABSTRACT

Understanding the penetrance of pathogenic variants identified as secondary findings (SFs) is of paramount importance with the growing availability of genetic testing. We estimated penetrance through large-scale analyses of individuals referred for diagnostic sequencing for hypertrophic cardiomyopathy (HCM; 10,400 affected individuals, 1,332 variants) and dilated cardiomyopathy (DCM; 2,564 affected individuals, 663 variants), using a cross-sectional approach comparing allele frequencies against reference populations (293,226 participants from UK Biobank and gnomAD). We generated updated prevalence estimates for HCM (1:543) and DCM (1:220). In aggregate, the penetrance by late adulthood of rare, pathogenic variants (23% for HCM, 35% for DCM) and likely pathogenic variants (7% for HCM, 10% for DCM) was substantial for dominant cardiomyopathy (CM). Penetrance was significantly higher for variant subgroups annotated as loss of function or ultra-rare and for males compared to females for variants in HCM-associated genes. We estimated variant-specific penetrance for 316 recurrent variants most likely to be identified as SFs (found in 51% of HCM- and 17% of DCM-affected individuals). 49 variants were observed at least ten times (14% of affected individuals) in HCM-associated genes. Median penetrance was 14.6% (±14.4% SD). We explore estimates of penetrance by age, sex, and ancestry and simulate the impact of including future cohorts. This dataset reports penetrance of individual variants at scale and will inform the management of individuals undergoing genetic screening for SFs. While most variants had low penetrance and the costs and harms of screening are unclear, some individuals with highly penetrant variants may benefit from SFs.
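
Comparing variant frequencies in affected individuals against a reference population lends itself to a Bayes-rule estimate: penetrance = prevalence × P(variant | affected) / P(variant in population). The sketch below assumes that formulation, which the abstract does not state explicitly. The carrier counts are invented for illustration; only the HCM prevalence (1:543) and the cohort sizes (10,400 and 293,226) come from the text.

```python
def estimate_penetrance(prevalence: float,
                        case_carriers: int, n_cases: int,
                        population_carriers: int, n_population: int) -> float:
    """Bayes-rule penetrance: P(disease | variant).

    P(D | G) = P(D) * P(G | D) / P(G), capped at 1.0 because sampling
    noise can push the raw ratio above a valid probability.
    """
    if population_carriers == 0:
        raise ValueError("variant absent from the reference population")
    freq_in_cases = case_carriers / n_cases
    freq_in_population = population_carriers / n_population
    return min(1.0, prevalence * freq_in_cases / freq_in_population)


# Hypothetical variant: 20 carriers among 10,400 HCM cases and
# 15 carriers among 293,226 reference participants.
hcm_prevalence = 1 / 543
penetrance = estimate_penetrance(hcm_prevalence, 20, 10_400, 15, 293_226)
print(f"{penetrance:.3f}")
```

With these invented counts the estimate is about 7%. A real analysis would also need confidence intervals, since both carrier frequencies rest on small counts for rare variants.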


Subjects
Cardiomyopathies , Dilated Cardiomyopathy , Hypertrophic Cardiomyopathy , Female , Male , Humans , Adult , Penetrance , Cardiomyopathies/genetics , Dilated Cardiomyopathy/genetics , Gene Frequency
14.
Neuron ; 111(18): 2800-2810.e5, 2023 09 20.
Article in English | MEDLINE | ID: mdl-37463579

ABSTRACT

Genetic association studies have made significant contributions to our understanding of the etiology of neurodevelopmental disorders (NDDs). However, these studies rarely focused on the African continent. The NeuroDev Project aims to address this diversity gap through detailed phenotypic and genetic characterization of children with NDDs from Kenya and South Africa. We present results from NeuroDev's first year of data collection, including phenotype data from 206 cases and clinical genetic analyses of 99 parent-child trios. Most cases met criteria for global developmental delay/intellectual disability (GDD/ID, 80.3%). Approximately half of the children with GDD/ID also met criteria for autism. Analysis of exome-sequencing data identified a pathogenic or likely pathogenic variant in 13 (17%) of the 75 cases from South Africa and 9 (38%) of the 24 cases from Kenya. Data from the trio pilot are publicly available, and the NeuroDev Project will continue to develop resources for the global genetics community.


Subjects
Autistic Disorder , Intellectual Disability , Neurodevelopmental Disorders , Humans , Child , Neurodevelopmental Disorders/genetics , Phenotype , Intellectual Disability/genetics , Autistic Disorder/genetics , Exome , Developmental Disabilities/genetics
16.
Genome Res ; 33(6): 999-1005, 2023 06.
Article in English | MEDLINE | ID: mdl-37253541

ABSTRACT

Large-scale high-throughput sequencing data sets have been transformative for informing clinical variant interpretation and for use as reference panels for statistical and population genetic efforts. Although such resources are often treated as ground truth, we find that in widely used reference data sets such as the Genome Aggregation Database (gnomAD), some variants pass gold-standard filters, yet are systematically different in their genotype calls across genotype discovery approaches. The inclusion of such discordant sites in study designs involving multiple genotype discovery strategies could bias results and lead to false-positive hits in association studies owing to technological artifacts rather than a true relationship to the phenotype. Here, we describe this phenomenon of discordant genotype calls across genotype discovery approaches, characterize the error mode of wrong calls, provide a list of discordant sites identified in gnomAD that should be treated with caution in analyses, and present a metric and machine learning classifier trained on gnomAD data to identify likely discordant variants in other data sets. We find that different genotype discovery approaches have different sets of variants at which this problem occurs, but there are characteristic variant features that can be used to predict discordant behavior. Discordant sites are largely shared across ancestry groups, although different populations are powered for the discovery of different variants. We find that the most common error mode is that of a variant being heterozygous for one approach and homozygous for the other, with heterozygous in the genomes and homozygous reference in the exomes making up the majority of miscalls.


Subjects
Exome , Population Genetics , Genotype , Heterozygote , Phenotype , Single Nucleotide Polymorphism
17.
Nat Rev Genet ; 24(8): 535-549, 2023 08.
Article in English | MEDLINE | ID: mdl-37085594

ABSTRACT

Single-cell genomic technologies are revealing the cellular composition, identities and states in tissues at unprecedented resolution. They have now scaled to the point that it is possible to query samples at the population level, across thousands of individuals. Combining single-cell information with genotype data at this scale provides opportunities to link genetic variation to the cellular processes underpinning key aspects of human biology and disease. This strategy has potential implications for disease diagnosis, risk prediction and development of therapeutic solutions. But, effectively integrating large-scale single-cell genomic data, genetic variation and additional phenotypic data will require advances in data generation and analysis methods. As single-cell genetics begins to emerge as a field in its own right, we review its current state and the challenges and opportunities ahead.


Subjects
Genome , Genomics , Humans , Genomics/methods , Genotype , Human Genetics
18.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-36993580

ABSTRACT

Recessive diseases arise when both the maternal and the paternal copies of a gene are impacted by a damaging genetic variant in the affected individual. When a patient carries two different potentially causal variants in a gene for a given disorder, accurate diagnosis requires determining that these two variants occur on different copies of the chromosome (i.e., are in trans) rather than on the same copy (i.e., in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. We developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in exome sequencing data from the Genome Aggregation Database (gnomAD v2, n=125,748). When applied to trio data where phase can be determined by transmission, our approach estimates phase with 95.7% accuracy and remains accurate even for very rare variants (allele frequency < 1×10^-4). We also correctly phase 95.9% of variant pairs in a set of 293 patients with Mendelian conditions carrying presumed causal compound heterozygous variants. We provide a public resource of phasing estimates from gnomAD, including phasing estimates for coding variants across the genome and counts per gene of rare variants in trans, that can aid interpretation of rare co-occurring variants in the context of recessive disease.

19.
medRxiv ; 2023 Mar 09.
Article in English | MEDLINE | ID: mdl-36945502

ABSTRACT

Predicted loss of function (pLoF) variants are highly deleterious and play an important role in disease biology, but many of these variants may not actually result in loss of function. Here we present a framework that advances interpretation of pLoF variants in research and clinical settings by considering three categories of LoF evasion: (1) predicted rescue by secondary sequence properties, (2) uncertain biological relevance, and (3) potential technical artifacts. We also provide recommendations on adjustments to the ACMG/AMP guidelines' PVS1 criterion. Applying this framework to all high-confidence pLoF variants in 22 autosomal recessive disease genes from the Genome Aggregation Database (gnomAD, v2.1.1) revealed predicted LoF evasion or potential artifacts in 27.3% (304/1,113) of variants. The major reasons were location in the last exon, in a homopolymer repeat, in low per-base expression (pext) score regions, or the presence of cryptic splice rescues. Variants predicted to be potential artifacts or to evade LoF were enriched for ClinVar benign variants. PVS1 was downgraded in 99.4% (162/163) of LoF-evading variants assessed, with 17.2% (28/163) downgraded as a result of our framework, adding to previous guidelines. Variant pathogenicity was affected (mostly from likely pathogenic to VUS) in 20 (71.4%) of these 28 variants. This framework guides assessment of pLoF variants beyond standard annotation pipelines and substantially reduces false positive rates, which is key to ensuring accurate LoF variant prediction in both research and clinical settings.

20.
medRxiv ; 2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38234731

ABSTRACT

Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
