Abstract
Pediatric solid tumors are rare malignancies that represent a leading cause of death by disease among children in developed countries. The early age of onset of these tumors suggests that germline genetic factors are involved, yet conventional germline testing for short coding variants in established predisposition genes only identifies pathogenic events in 10-15% of patients. Here, we examined the role of germline structural variants (SVs), an underexplored form of germline variation, in pediatric extracranial solid tumors using germline genome sequencing of 1,766 affected children, their 943 unaffected relatives, and 6,665 adult controls. We discovered a sex-biased association between very large (>1 megabase) germline chromosomal abnormalities and a four-fold increased risk of solid tumors in male children. The overall impact of germline SVs was greatest in neuroblastoma, where we revealed burdens of ultra-rare SVs that cause loss-of-function of highly expressed, mutationally intolerant, neurodevelopmental genes, as well as noncoding SVs predicted to disrupt three-dimensional chromatin domains in neural crest-derived tissues. Collectively, our results implicate rare germline SVs as a predisposing factor to pediatric solid tumors that may guide future studies and clinical practice.
Abstract
Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling (36.1% over ES without CNV discovery). Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS-unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.
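The fold differences quoted for ASD probands can be checked with a few lines of arithmetic. The sketch below uses only the four yields stated in the abstract; the function name `fold_difference` and the dictionary layout are illustrative assumptions, not part of the study's analysis.

```python
# Diagnostic yields (percent of ASD probands) as stated in the abstract.
ASD_YIELDS = {
    "GS": 7.8,
    "CMA": 4.3,
    "ES": 2.7,
    "ES + CNV calling": 7.4,
}


def fold_difference(yield_a: float, yield_b: float) -> float:
    """Ratio of two diagnostic yields (hypothetical helper for illustration)."""
    if yield_b <= 0:
        raise ValueError("reference yield must be positive")
    return yield_a / yield_b


if __name__ == "__main__":
    gs = ASD_YIELDS["GS"]
    for test, value in ASD_YIELDS.items():
        if test != "GS":
            # Print how many times higher the GS yield is than each comparator.
            print(f"GS vs {test}: {fold_difference(gs, value):.2f}-fold")
```

The ratios come out a little below 2 for CMA and a little below 3 for ES, consistent with the abstract's "almost 2-fold" and "3-fold" wording, and close to 1 once CNVs are called from the exome data.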
Subject(s)
Autism Spectrum Disorder; Female; Pregnancy; Humans; Autism Spectrum Disorder/diagnosis; Autism Spectrum Disorder/genetics; Pregnancy Trimester, First; Ultrasonography, Prenatal; Chromosome Mapping; Exome
Abstract
OBJECTIVE: Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SVs, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 2023;93:1012-1022.
Subject(s)
Genome-Wide Association Study; Parkinson Disease; Humans; Parkinson Disease/genetics; Genome, Human; Whole Genome Sequencing; Genotype
Abstract
Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms (CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions) on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.
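The MNV definition above (nearby variants on the same haplotype, here within 2 bp) and the idea of a "combined effect on protein sequence" can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the gnomAD pipeline: the `PhasedSNV` record, the 0/1 haplotype labels, the coordinates, and the four-entry codon table are all hypothetical and chosen only to make the example self-contained.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PhasedSNV:
    """Hypothetical phased single-nucleotide variant record."""
    pos: int        # 1-based genomic position
    ref: str
    alt: str
    haplotype: int  # 0 or 1; assumes phasing is already known


# Partial standard genetic code: only the codons used in this example.
CODON_TABLE = {"TGG": "Trp", "TAG": "Stop", "TGT": "Cys", "TAT": "Tyr"}


def find_mnv_pairs(variants, max_distance=2):
    """Return SNV pairs on the same haplotype at most max_distance bp apart.

    Uses a simple all-pairs scan, which is fine for a toy input.
    """
    ordered = sorted(variants, key=lambda v: v.pos)
    return [
        (a, b)
        for a, b in combinations(ordered, 2)
        if a.haplotype == b.haplotype and 0 < b.pos - a.pos <= max_distance
    ]


def apply_to_codon(codon, codon_start, variants):
    """Substitute each variant's alt allele into a codon starting at codon_start."""
    bases = list(codon)
    for v in variants:
        bases[v.pos - codon_start] = v.alt
    return "".join(bases)


if __name__ == "__main__":
    # Reference codon TGG (Trp) at hypothetical positions 100-102.
    snv_a = PhasedSNV(pos=101, ref="G", alt="A", haplotype=0)
    snv_b = PhasedSNV(pos=102, ref="G", alt="T", haplotype=0)
    snv_c = PhasedSNV(pos=103, ref="C", alt="T", haplotype=1)  # other haplotype

    for a, b in find_mnv_pairs([snv_a, snv_b, snv_c]):
        alone_a = CODON_TABLE[apply_to_codon("TGG", 100, [a])]
        alone_b = CODON_TABLE[apply_to_codon("TGG", 100, [b])]
        together = CODON_TABLE[apply_to_codon("TGG", 100, [a, b])]
        print(f"{a.pos}/{b.pos}: alone {alone_a}, {alone_b}; combined {together}")
```

In this toy case the first variant annotated alone would look like a stop-gain and the second like a Trp-to-Cys change, while the haplotype-aware reading is a single Trp-to-Tyr substitution. The variant on the other haplotype is not paired, even though it lies within 2 bp.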
Subject(s)
Exome; Genetic Variation; Genome, Human; CpG Islands; DNA Mutational Analysis; Databases, Genetic; Humans; Mutation
Abstract
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
Subject(s)
Exome/genetics; Genes, Essential/genetics; Genetic Variation/genetics; Genome, Human/genetics; Adult; Brain/metabolism; Cardiovascular Diseases/genetics; Cohort Studies; Databases, Genetic; Female; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study; Humans; Loss of Function Mutation/genetics; Male; Mutation Rate; Proprotein Convertase 9/genetics; RNA, Messenger/genetics; Reproducibility of Results; Exome Sequencing; Whole Genome Sequencing
Abstract
OBJECTIVE: Recently, the ASC-1 complex has been identified as a mechanistic link between amyotrophic lateral sclerosis and spinal muscular atrophy (SMA), and 3 mutations of the ASC-1 gene TRIP4 have been associated with SMA or congenital myopathy. Our goal was to define ASC-1 neuromuscular function and the phenotypical spectrum associated with TRIP4 mutations. METHODS: Clinical, molecular, histological, and magnetic resonance imaging studies were performed in 5 families with 7 novel TRIP4 mutations. Fluorescence-activated cell sorting and Western blot were performed in patient-derived fibroblasts and muscles and in Trip4-knockdown C2C12 cells. RESULTS: All mutations caused ASC-1 protein depletion. The clinical phenotype was purely myopathic, ranging from lethal neonatal to mild ambulatory adult patients. It included early onset axial and proximal weakness, scoliosis, rigid spine, dysmorphic facies, cutaneous involvement, respiratory failure, and in older cases, dilated cardiomyopathy. Muscle biopsies showed multiminicores, nemaline rods, cytoplasmic bodies, caps, central nuclei, rimmed fibers, and/or mild endomysial fibrosis. ASC-1 depletion in C2C12 and in patient-derived fibroblasts and muscles caused accelerated proliferation, altered expression of cell cycle proteins, and/or shortening of the G0/G1 cell cycle phase leading to cell size reduction. INTERPRETATION: Our results expand the phenotypical and molecular spectrum of TRIP4-associated disease to include mild adult forms with or without cardiomyopathy, associate ASC-1 depletion with isolated primary muscle involvement, and establish TRIP4 as a causative gene for several congenital muscle diseases, including nemaline, core, centronuclear, and cytoplasmic-body myopathies. They also identify ASC-1 as a novel cell cycle regulator with a key role in cell proliferation, and underline transcriptional coregulation defects as a novel pathophysiological mechanism. ANN NEUROL 2020;87:217-232.
Subject(s)
Amino Acid Transport System y+/physiology; Cell Cycle/physiology; Muscular Diseases/physiopathology; Transcription Factors/genetics; Adult; Amino Acid Transport System y+/metabolism; Cells, Cultured; Child; Child, Preschool; Female; Fibroblasts/physiology; Humans; Infant; Male; Middle Aged; Muscle Proteins/genetics; Muscle, Skeletal/pathology; Muscle, Skeletal/physiopathology; Muscular Diseases/genetics; Mutation; Pedigree; Phenotype
Abstract
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation, identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.