ABSTRACT
Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here, we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (130 Mbp median continuity), closing 92% of all previous assembly gaps [1,2] and reaching telomere-to-telomere (T2T) status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8, and AMY1/AMY2, and fully resolve 1,852 complex structural variants (SVs). In addition, we completely assemble and validate 1,246 human centromeres. We find up to 30-fold variation in α-satellite high-order repeat (HOR) array length and characterize the pattern of mobile element insertions into α-satellite HOR arrays. While most centromeres predict a single site of kinetochore attachment, epigenetic analysis suggests the presence of two hypomethylated regions for 7% of centromeres. Combining our data with the draft pangenome reference [1] significantly enhances genotyping accuracy from short-read data, enabling whole-genome inference [3] to a median quality value (QV) of 45. Using this approach, 26,115 SVs per sample are detected, substantially increasing the number of SVs now amenable to downstream disease association studies.
ABSTRACT
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generate single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. We evaluate enhancer activity for 59 elements using an in vivo transgenic assay and validate 44 (75%), demonstrating that single cell accessibility can be a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieve significant reduction in our variant search space and nominate candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work delivers non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
Subject(s)
Enhancer Elements, Genetic , Animals , Mice , Humans , Enhancer Elements, Genetic/genetics , Motor Neurons/metabolism , Chromatin/metabolism , Chromatin/genetics , Male , Single-Cell Analysis , Epigenomics/methods , Female , Pedigree
ABSTRACT
PURPOSE: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). METHODS: We coupled phenotyping with exome or genome sequencing of 467 probands (550 affected and 1108 total individuals) with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. RESULTS: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. CONCLUSION: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.
ABSTRACT
Underrepresented populations are often excluded from genomic studies owing in part to a lack of resources supporting their analyses. The 1000 Genomes Project (1kGP) and Human Genome Diversity Project (HGDP), which have recently been sequenced to high coverage, are valuable genomic resources because of the global diversity they capture and their open data sharing policies. Here, we harmonized a high-quality set of 4094 whole genomes from 80 populations in the HGDP and 1kGP with data from the Genome Aggregation Database (gnomAD) and identified over 153 million high-quality SNVs, indels, and SVs. We performed a detailed ancestry analysis of this cohort, characterizing population structure and patterns of admixture across populations, analyzing site frequency spectra, and measuring variant counts at global and subcontinental levels. We also show substantial added value from this data set compared with the prior versions of the component resources, typically combined via liftOver and variant intersection; for example, we catalog millions of new genetic variants, mostly rare, compared with previous releases. In addition to unrestricted individual-level public release, we provide detailed tutorials for conducting many of the most common quality-control steps and analyses with these data in a scalable cloud-computing environment and publicly release this new phased joint callset for use as a haplotype resource in phasing and imputation pipelines. This jointly called reference panel will serve as a key resource to support research of diverse ancestry populations.
Subject(s)
Databases, Genetic , Genome, Human , Humans , Human Genome Project , High-Throughput Nucleotide Sequencing/methods , Genetic Variation , Genomics/methods
ABSTRACT
Purpose: To identify genetic etiologies and genotype/phenotype associations for unsolved ocular congenital cranial dysinnervation disorders (oCCDDs). Methods: We coupled phenotyping with exome or genome sequencing of 467 pedigrees with genetically unsolved oCCDDs, integrating analyses of pedigrees, human and animal model phenotypes, and de novo variants to identify rare candidate single nucleotide variants, insertion/deletions, and structural variants disrupting protein-coding regions. Prioritized variants were classified for pathogenicity and evaluated for genotype/phenotype correlations. Results: Analyses elucidated phenotypic subgroups, identified pathogenic/likely pathogenic variant(s) in 43/467 probands (9.2%), and prioritized variants of uncertain significance in 70/467 additional probands (15.0%). These included known and novel variants in established oCCDD genes, genes associated with syndromes that sometimes include oCCDDs (e.g., MYH10, KIF21B, TGFBR2, TUBB6), genes that fit the syndromic component of the phenotype but had no prior oCCDD association (e.g., CDK13, TGFB2), genes with no reported association with oCCDDs or the syndromic phenotypes (e.g., TUBA4A, KIF5C, CTNNA1, KLB, FGF21), and genes associated with oCCDD phenocopies that had resulted in misdiagnoses. Conclusion: This study suggests that unsolved oCCDDs are clinically and genetically heterogeneous disorders often overlapping other Mendelian conditions and nominates many candidates for future replication and functional studies.
ABSTRACT
Underrepresented populations are often excluded from genomic studies due in part to a lack of resources supporting their analyses. The 1000 Genomes Project (1kGP) and Human Genome Diversity Project (HGDP), which have recently been sequenced to high coverage, are valuable genomic resources because of the global diversity they capture and their open data sharing policies. Here, we harmonized a high quality set of 4,094 whole genomes from HGDP and 1kGP with data from the Genome Aggregation Database (gnomAD) and identified over 153 million high-quality SNVs, indels, and SVs. We performed a detailed ancestry analysis of this cohort, characterizing population structure and patterns of admixture across populations, analyzing site frequency spectra, and measuring variant counts at global and subcontinental levels. We also demonstrate substantial added value from this dataset compared to the prior versions of the component resources, typically combined via liftover and variant intersection; for example, we catalog millions of new genetic variants, mostly rare, compared to previous releases. In addition to unrestricted individual-level public release, we provide detailed tutorials for conducting many of the most common quality control steps and analyses with these data in a scalable cloud-computing environment and publicly release this new phased joint callset for use as a haplotype resource in phasing and imputation pipelines. This jointly called reference panel will serve as a key resource to support research of diverse ancestry populations.
ABSTRACT
Developing efficient and safe antibacterial agents to inhibit pathogens including Physalospora piricola and Staphylococcus aureus is of great importance. Herein, a novel compound composed of Rosa roxburghii procyanidin, chitosan, and selenium nanoparticles (RC-SeNP) was biosynthesized, with an average diameter and zeta potential of 84.56 nm and -25.60 mV, respectively. The inhibition diameters of the RC-SeNP against P. piricola and S. aureus reached 18.67 mm and 13.13 mm, and the maximum scavenging activities against DPPH and ABTS reached 96.02% and 98.92%, respectively. Moreover, the RC-SeNP completely inhibited the propagation of P. piricola and S. aureus on actual apples, suggesting excellent in vivo antimicrobial capacity. Transcriptome analysis and electron microscopy indicated that the antibacterial activity could be attributed to adhesion to and cracking of the cell walls, as well as damage to the cytomembrane and nucleus. Moreover, the RC-SeNP effectively maintained the vitamin C, total acid, and water contents of red bayberry, demonstrating potential application in fruit preservation. Finally, the RC-SeNP showed no cell toxicity and left only trace selenium residues (0.03 mg/kg on apple, 0.12 mg/kg on red bayberry). This study may inform the future development of novel nano-bioantibacterial agents for sustainable agriculture.
Subject(s)
Chitosan , Nanoparticles , Rosa , Selenium , Antioxidants/pharmacology , Antioxidants/chemistry , Selenium/chemistry , Chitosan/chemistry , Staphylococcus aureus , Nanoparticles/chemistry , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/chemistry , Plant Extracts/pharmacology
ABSTRACT
The degradation of organic pollutants by sulfur-modified nano zero-valent iron (S-nZVI) combined with advanced oxidation systems has been extensively studied. However, the low utilization of nZVI and the low reactive oxygen species (ROS) yield in the system have limited its wide application. Herein, a natural organic acid commonly found in citrus fruits, citric acid (CA), was combined with the conventional S-nZVI@Ps system to enhance the degradation of norfloxacin (NOR). The addition of CA increased NOR removal by about 31% compared with the conventional S-nZVI@Ps system under the same experimental conditions. The enhancing effect of CA is mainly reflected in its ability to promote the release of Fe2+ and accelerate the cycling of Fe2+ and Fe3+, which further improves the utilization of nZVI and the generation of ROS; it also promotes the dissolution of the active substance (FeS) on the surface of S-nZVI, which further improves the degradation rate of NOR. More importantly, the chelate of CA and Fe2+ (CA-Fe2+) had higher reactivity than Fe2+ alone. Free radical quenching and electron spin resonance (ESR) experiments indicated that the main ROS for the degradation of NOR in the CA/S-nZVI@Ps system were SO4•- and OH•. The effects of CA combined with sulfur modification on NOR degradation were systematically investigated, and the degradation mechanism of NOR in the CA/S-nZVI@Ps system was explored by various techniques. Additionally, the effect of common anions in the water matrix on the degradation of NOR in the CA/S-nZVI@Ps system, and the system's degradation of various other pollutants, were also studied. This study provides a new perspective for enhancing the degradation of pollutants by S-nZVI combined with advanced oxidation systems, which can help to solve the application boundary problem of S-nZVI.
Subject(s)
Environmental Pollutants , Water Pollutants, Chemical , Norfloxacin , Citric Acid , Reactive Oxygen Species , Water Pollutants, Chemical/analysis , Citrates , Sulfur
ABSTRACT
Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.
Subject(s)
Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Pregnancy Trimester, First , Ultrasonography, Prenatal , Chromosome Mapping , Exome
ABSTRACT
Copy number variants (CNVs) are major contributors to genetic diversity and disease. While standardized methods, such as the genome analysis toolkit (GATK), exist for detecting short variants, technical challenges have confounded uniform large-scale CNV analyses from whole-exome sequencing (WES) data. Given the profound impact of rare and de novo coding CNVs on genome organization and human disease, we developed GATK-gCNV, a flexible algorithm to discover rare CNVs from sequencing read-depth information, complete with open-source distribution via GATK. We benchmarked GATK-gCNV in 7,962 exomes from individuals in quartet families with matched genome sequencing and microarray data, finding up to 95% recall of rare coding CNVs at a resolution of more than two exons. We used GATK-gCNV to generate a reference catalog of rare coding CNVs in WES data from 197,306 individuals in the UK Biobank, and observed strong correlations between per-gene CNV rates and measures of mutational constraint, as well as rare CNV associations with multiple traits. In summary, GATK-gCNV is a tunable approach for sensitive and specific CNV discovery in WES data, with broad applications.
Subject(s)
DNA Copy Number Variations , Exome , Humans , Exome/genetics , Exome Sequencing , DNA Copy Number Variations/genetics , Chromosome Mapping , Exons
ABSTRACT
We characterized the role of structural variants, a largely unexplored type of genetic variation, in two non-Alzheimer's dementias, namely Lewy body dementia (LBD) and frontotemporal dementia (FTD)/amyotrophic lateral sclerosis (ALS). To do this, we applied an advanced structural variant calling pipeline (GATK-SV) to short-read whole-genome sequence data from 5,213 European-ancestry cases and 4,132 controls. We discovered, replicated, and validated a deletion in TPCN1 as a novel risk locus for LBD and detected the known structural variants at the C9orf72 and MAPT loci as associated with FTD/ALS. We also identified rare pathogenic structural variants in both LBD and FTD/ALS. Finally, we assembled a catalog of structural variants that can be mined for new insights into the pathogenesis of these understudied forms of dementia.
ABSTRACT
PURPOSE: Orofacial clefts (OFCs) are common birth defects including cleft lip, cleft lip and palate, and cleft palate. OFCs have heterogeneous etiologies, complicating clinical diagnostics because it is not always apparent if the cause is Mendelian, environmental, or multifactorial. Sequencing is not currently performed for isolated or sporadic OFCs; therefore, we estimated the diagnostic yield for 418 genes in 841 cases and 294 controls. METHODS: We evaluated 418 genes using genome sequencing and curated variants to assess their pathogenicity using American College of Medical Genetics criteria. RESULTS: 9.04% of cases and 1.02% of controls had "likely pathogenic" variants (P < .0001), which was almost exclusively driven by heterozygous variants in autosomal genes. Cleft palate (17.6%) and cleft lip and palate (9.09%) cases had the highest yield, whereas cleft lip cases had a 2.80% yield. Out of 39 genes with likely pathogenic variants, 9 genes, including CTNND1 and IRF6, accounted for more than half of the yield (4.64% of cases). Most variants (61.8%) were "variants of uncertain significance", occurring more frequently in cases (P = .004), but no individual gene showed a significant excess of variants of uncertain significance. CONCLUSION: These results underscore the etiological heterogeneity of OFCs and suggest sequencing could reduce the diagnostic gap in OFCs.
Subject(s)
Cleft Lip , Cleft Palate , Humans , Cleft Lip/diagnosis , Cleft Lip/genetics , Cleft Palate/diagnosis , Cleft Palate/genetics , Alleles , Chromosome Mapping , Interferon Regulatory Factors/genetics
ABSTRACT
Introduction: Probiotic Lactiplantibacillus plantarum MC5 produces large amounts of exopolysaccharides (EPS), and its use as a compound fermentor can greatly improve the quality of fermented milk. Methods: To gain insight into the genomic characteristics of probiotic MC5 and reveal the relationship between its EPS biosynthetic phenotype and genotype, we analyzed the carbohydrate metabolic capacity, nucleotide sugar formation pathways, and EPS biosynthesis-related gene clusters of strain MC5 based on its whole genome sequence. Finally, we performed validation tests on the monosaccharides and disaccharides that strain MC5 may metabolize. Results: Genomic analysis showed that MC5 has seven nucleotide sugar biosynthesis pathways and 11 sugar-specific phosphate transport systems, suggesting that the strain can metabolize mannose, fructose, sucrose, cellobiose, glucose, lactose, and galactose. Validation results showed that strain MC5 can metabolize these seven sugars and produce significant amounts of EPS (> 250 mg/L). In addition, strain MC5 possesses two typical eps biosynthesis gene clusters, which include the conserved genes epsABCDE, wzx, and wzy, six key genes for polysaccharide biosynthesis, and one MC5-specific epsG gene. Discussion: These insights into the mechanism of EPS-MC5 biosynthesis can be used to promote the production of EPS through genetic engineering.
ABSTRACT
This study optimized exopolysaccharide (EPS) production by Lactiplantibacillus plantarum MC5 (Lp. plantarum MC5) and evaluated the resistance of EPS-MC5 to simulated human digestive juices, its antioxidant activity in vitro, and its rheological properties. The results showed that a maximum EPS production of 345.98 mg/L (about 1.5-fold greater than the initial production) was obtained under the optimal conditions of inoculum size (4.0%), incubation time (30 h), incubation temperature (34.0 °C), and initial pH value (6.40). Furthermore, the digestion resistance of EPS-MC5 after 180 min in α-amylase, simulated gastric juice (pH 2.0, 3.0, 4.0), and simulated intestinal juice (pH 6.8) was 98.59%, 98.62%, 98.78%, 98.86%, and 98.74%, respectively. In addition, the scavenging rates of EPS-MC5 for the DPPH•, ABTS•, and •OH radicals and its ferric-iron reducing power (OD700) were 73.33%, 87.74%, 46.07%, and 1.20, respectively. Furthermore, rheological results showed that EPS-MC5 had a high apparent viscosity (3.01 Pa) and shear stress (41.78 Pa), with viscoelastic moduli of 84.02 and 161.02 Pa at a shear frequency of 100 Hz. These results provide new insight into the application of EPS in human health and functional foods, and could also provide theoretical guidance for the industrial application of EPS.
Subject(s)
Lactobacillus plantarum , Probiotics , Humans , Probiotics/chemistry , Antioxidants/chemistry , Viscosity , Rheology , Polysaccharides, Bacterial/chemistry , Lactobacillus plantarum/chemistry
ABSTRACT
OBJECTIVE: Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which represent only a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SVs, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 2023;93:1012-1022.
Subject(s)
Genome-Wide Association Study , Parkinson Disease , Humans , Parkinson Disease/genetics , Genome, Human , Whole Genome Sequencing , Genotype
ABSTRACT
Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies. Here, we present a single cell multi-omic framework integrating embryonic mouse chromatin accessibility, histone modification, and gene expression assays to discover cranial motor neuron (cMN) cis-regulatory elements and subsequently nominate candidate non-coding variants in the congenital cranial dysinnervation disorders (CCDDs), a set of Mendelian disorders altering cMN development. We generated single cell epigenomic profiles for ~86,000 cMNs and related cell types, identifying ~250,000 accessible regulatory elements with cognate gene predictions for ~145,000 putative enhancers. Seventy-five percent of elements (44 of 59) validated in an in vivo transgenic reporter assay, demonstrating that single cell accessibility is a strong predictor of enhancer activity. Applying our cMN atlas to 899 whole genome sequences from 270 genetically unsolved CCDD pedigrees, we achieved significant reduction in our variant search space and nominated candidate variants predicted to regulate known CCDD disease genes MAFB, PHOX2A, CHN1, and EBF3 - as well as new candidates in recurrently mutated enhancers through peak- and gene-centric allelic aggregation. This work provides novel non-coding variant discoveries of relevance to CCDDs and a generalizable framework for nominating non-coding variants of potentially high functional impact in other Mendelian disorders.
ABSTRACT
Puccinia triticina, the causative agent of wheat leaf rust, is widespread in China and most other wheat-growing countries around the globe. Cultivating resistant wheat cultivars is the most economical, effective, and environmentally friendly method for controlling yield losses caused by leaf rust. Exploring sources of resistance is very important in wheat resistance breeding programs. To explore more effective resistance sources for wheat leaf rust, the resistance of 112 wheat accessions introduced from the U.S. National Plant Germplasm System was evaluated using a mixture of pathogenic isolates of THTT, THTS, PHTT, THJT, and THJS, which are the most predominant races in China. All of these accessions showed high resistance at the seedling stage, and ninety-nine of them also exhibited resistance at the adult plant stage. Eleven molecular markers of eight leaf rust resistance genes that are effective in China were used to screen the 112 accessions. Seven of these genes (Lr9, Lr19, Lr24, Lr28, Lr29, Lr38, and Lr45) were detected; Lr47 was not. Twenty-three accessions had only one of those seven effective leaf rust resistance genes. Eleven accessions carried Lr24+Lr38, and seven accessions carried one of the combinations Lr9+Lr24+Lr38, Lr24+Lr38+Lr45, Lr24+Lr29+Lr38, or Lr19+Lr38+Lr45. The remaining seventy-one accessions had none of the eight effective leaf rust resistance genes. This study provides theoretical guidance for the rational use of these introduced wheat accessions, either directly or for breeding resistant wheat cultivars.
ABSTRACT
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.
Subject(s)
Genome, Human , Whole Genome Sequencing , Female , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation , Male , Polymorphism, Single Nucleotide
ABSTRACT
Pathogenic variants in the SRCAP (SNF2-related CREBBP activator protein) gene, which encodes a chromatin-remodeling ATPase, cause neurodevelopmental disorders including Floating Harbor syndrome (FLHS). Here, we report the discovery of a de novo transposon insertion in SRCAP exon 13 from trio genome sequencing in a 28-year-old female with failure to thrive, developmental delay, mood disorder and seizure disorder. The insertion was a full-length (~2.8 kb), antisense-oriented SVA insertion relative to the SRCAP transcript, bearing a 5' transduction and hallmarks of target-primed reverse transcription. The 20-bp 5' transduction allowed us to trace the source SVA element to an intron of a long non-coding RNA on chromosome 12, which is highly expressed in testis. RNA sequencing and qRT-PCR confirmed significant depletion of SRCAP expression and low-level exon skipping in the proband. This case highlights a novel disease-causing structural variant and the importance of transposon analysis in a clinical diagnostic setting.