ABSTRACT
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.
Subject(s)
Human Genome , Whole Genome Sequencing , Female , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation , Male , Single Nucleotide Polymorphism
ABSTRACT
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals1. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
Subject(s)
Human Genome , Genomics , Humans , Diploidy , Human Genome/genetics , Haplotypes/genetics , DNA Sequence Analysis , Genomics/standards , Reference Standards , Cohort Studies , Alleles , Genetic Variation
ABSTRACT
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline1 to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
Subject(s)
Genetic Variation , Human Genome/genetics , Whole Genome Sequencing , Alleles , Case-Control Studies , Genetic Epigenesis , Female , Gene Dosage/genetics , Population Genetics , High-Throughput Nucleotide Sequencing , Humans , Male , Molecular Sequence Annotation , Quantitative Trait Loci , Racial Groups/genetics , Software
ABSTRACT
Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power.
Subject(s)
Exome Sequencing , Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Alleles , HDL Cholesterol/genetics , Cluster Analysis , Endpoint Determination , Finland , Geographic Mapping , Humans , Multifactorial Inheritance/genetics , Reproducibility of Results
ABSTRACT
An Amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10⁻⁵⁴) and is also associated with increased levels of total cholesterol (p = 1.22 × 10⁻²⁸) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10⁻²¹) and alanine (p = 6.14 × 10⁻¹²) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10⁻¹⁰), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10⁻³⁵). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.
Subject(s)
Cardiovascular Diseases/genetics , Genomic Structural Variation/genetics , Alleles , Cholesterol/blood , DNA Copy Number Variations/genetics , Female , Finland , Human Genome/genetics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Male , Mitochondrial Proteins/genetics , Genetic Promoter Regions/genetics , Pyruvate Dehydrogenase (Lipoamide)-Phosphatase/genetics , Pyruvic Acid/metabolism , Human Serum Albumin/genetics
ABSTRACT
SUMMARY: Large-scale human genetics studies are now employing whole genome sequencing with the goal of conducting comprehensive trait mapping analyses of all forms of genome variation. However, methods for structural variation (SV) analysis have lagged far behind those for smaller scale variants, and there is an urgent need to develop more efficient tools that scale to the size of human populations. Here, we present a fast and highly scalable software toolkit (svtools) and cloud-based pipeline for assembling high quality SV maps (including deletions, duplications, mobile element insertions, inversions and other rearrangements) in many thousands of human genomes. We show that this pipeline achieves similar variant detection performance to established per-sample methods (e.g. LUMPY), while providing fast and affordable joint analysis at the scale of ≥100 000 genomes. These tools will help enable the next generation of human genetics studies. AVAILABILITY AND IMPLEMENTATION: svtools is implemented in Python and freely available (MIT) from https://github.com/hall-lab/svtools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Human Genome , Software , Humans , Sequence Deletion , Whole Genome Sequencing
ABSTRACT
Summary: Here we present SVScore, a tool for in silico structural variation (SV) impact prediction. SVScore aggregates per-base single nucleotide polymorphism (SNP) pathogenicity scores across relevant genomic intervals for each SV in a manner that considers variant type, gene features and positional uncertainty. We show that the allele frequency spectrum of high-scoring SVs is strongly skewed toward lower frequencies, suggesting that they are under purifying selection, and that SVScore identifies deleterious variants more effectively than alternative methods. Notably, our results also suggest that duplications are under surprisingly strong selection relative to deletions, and that there are a similar number of strongly pathogenic SVs and SNPs in the human population. Availability and Implementation: SVScore is implemented in Perl and available freely at http://www.github.com/lganel/SVScore for use under the MIT license. Contact: ihall@wustl.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
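The idea of aggregating per-base scores over an SV interval can be illustrated with a minimal Python sketch. This is not SVScore's implementation (which is written in Perl): the function names, the choice of aggregation operations, and the way positional uncertainty is handled by widening the interval by a fixed margin `ci` are all assumptions made for illustration.

```python
from statistics import mean


def aggregate_scores(per_base_scores, start, end, op="max", top_n=10):
    """Aggregate per-base scores over the half-open interval [start, end).

    The operations offered here (max, mean, mean of the top N) are
    illustrative choices, not a description of any specific tool.
    """
    window = per_base_scores[start:end]
    if not window:
        raise ValueError("empty interval")
    if op == "max":
        return max(window)
    if op == "mean":
        return mean(window)
    if op == "top_mean":
        top = sorted(window, reverse=True)[:top_n]
        return mean(top)
    raise ValueError(f"unknown operation: {op}")


def score_deletion(per_base_scores, start, end, ci=0, op="max"):
    """Score a deletion, widening the interval by `ci` bases on each side
    as a simple (hypothetical) stand-in for breakpoint uncertainty."""
    lo = max(0, start - ci)
    hi = min(len(per_base_scores), end + ci)
    return aggregate_scores(per_base_scores, lo, hi, op=op)


# Toy per-base scores for a 7-base region (made-up values).
scores = [0.1, 0.5, 2.0, 8.0, 3.0, 0.2, 0.4]
deletion_score = score_deletion(scores, start=1, end=3, ci=1)
```

In this toy case the high-scoring base sits just outside the nominal deletion, so it only contributes once the interval is widened; that is the kind of effect positional uncertainty can have on an interval-based score.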
Subject(s)
Genomic Structural Variation , Software , Gene Frequency , Genomics/methods , Humans , Single Nucleotide Polymorphism , Sequence Deletion
ABSTRACT
Recurrent genomic mutations in uterine and non-uterine leiomyosarcomas have not been well established. Using a next generation sequencing (NGS) panel of common cancer-associated genes, 25 leiomyosarcomas arising from multiple sites were examined to explore genetic alterations, including single nucleotide variants (SNV), small insertions/deletions (indels), and copy number alterations (CNA). Sequencing showed 86 non-synonymous, coding region somatic variants within 151 gene targets in 21 cases, with a mean of 4.1 variants per case; 4 cases had no putative mutations in the panel of genes assayed. The most frequently altered genes were TP53 (36%), ATM and ATRX (16%), and EGFR and RB1 (12%). CNA were identified in 85% of cases, with the most frequent copy number losses observed in chromosomes 10 and 13 including PTEN and RB1; the most frequent gains were seen in chromosomes 7 and 17. Our data show that deletions in canonical cancer-related genes are common in leiomyosarcomas. Further, the spectrum of gene mutations observed shows that defects in DNA repair and chromosomal maintenance are central to the biology of leiomyosarcomas, and that activating mutations observed in other common cancer types are rare in leiomyosarcomas.
Subject(s)
Genetic Predisposition to Disease/genetics , High-Throughput Nucleotide Sequencing/methods , Leiomyosarcoma/genetics , Mutation , Adolescent , Adult , Aged , Ataxia Telangiectasia Mutated Proteins/genetics , DNA Copy Number Variations , DNA Helicases/genetics , ErbB Receptors/genetics , Female , Humans , INDEL Mutation , Leiomyosarcoma/pathology , Male , Middle Aged , Nuclear Proteins/genetics , Single Nucleotide Polymorphism , Retinoblastoma Protein/genetics , Tumor Suppressor Protein p53/genetics , X-linked Nuclear Protein , Young Adult
ABSTRACT
BACKGROUND: The Long Life Family Study (LLFS) is an international study to identify the genetic components of various healthy aging phenotypes. We hypothesized that pedigree-specific rare variants at longevity-associated genes could have a similar functional impact on healthy phenotypes. METHODS: We performed multiplexed, custom hybridization capture sequencing to identify the functional variants in 464 candidate genes for longevity or the major diseases of aging in 615 pedigrees (4,953 individuals) from the LLFS. Variants were analyzed individually or as a group across an entire gene for association with aging phenotypes using family-based tests. RESULTS: We found significant associations with three genes and nine single variants. Most notably, we found a novel variant in the 3' UTR of OBFC1, carried by 13 individuals from six pedigrees, that was significantly associated with exceptional survival. OBFC1 (chromosome 10) is involved in telomere maintenance, and falls within a linkage peak recently reported from an analysis of telomere length in LLFS families. Two different algorithms for single gene associations identified three genes with an enrichment of variation that was significantly associated with three phenotypes (GSK3B with the Healthy Aging Index, NOTCH1 with diastolic blood pressure and TP53 with serum HDL). CONCLUSIONS: Sequencing analysis of family-based associations for age-related phenotypes can identify rare or novel variants.
Subject(s)
Genetic Association Studies , High-Throughput Nucleotide Sequencing , Longevity/genetics , Pedigree , Phenotype , Aged , Female , Genetic Testing , Genetic Variation/genetics , Humans , Male
ABSTRACT
African Americans are admixed with genetic contributions from European and African ancestral populations. Admixture mapping leverages this information to map genes influencing differential disease risk across populations. We performed admixture and association mapping in 3,300 African American current or former smokers from the COPDGene Study. We analyzed estimated local ancestry and SNP genotype information to identify regions associated with FEV1/FVC, the ratio of forced expiratory volume in one second to forced vital capacity, measured by spirometry performed after bronchodilator administration. Global African ancestry was inversely associated with FEV1/FVC (P = 0.035). Genome-wide admixture analysis, controlling for age, gender, body mass index, current smoking status, pack-years smoked, and four principal components summarizing the genetic background of African Americans in the COPDGene Study, identified a region on chromosome 12q14.1 associated with FEV1/FVC (P = 2.1 × 10⁻⁶) when regressed on local ancestry. Allelic association in this region of chromosome 12 identified an intronic variant in FAM19A2 (rs348644) as associated with FEV1/FVC (P = 1.76 × 10⁻⁶). By combining admixture and association mapping, a marker on chromosome 12q14.1 was identified as being associated with reduced FEV1/FVC ratio among African Americans in the COPDGene Study.
Subject(s)
CC Chemokines/genetics , Chronic Obstructive Pulmonary Disease/genetics , Vital Capacity/genetics , Black or African American/genetics , Chromosome Mapping , Disease Susceptibility , Female , Forced Expiratory Volume/genetics , Gene Frequency , Genetic Association Studies , Genotype , Humans , Male , Middle Aged , Single Nucleotide Polymorphism , Chronic Obstructive Pulmonary Disease/physiopathology , Quantitative Trait Loci , Risk Factors , White People/genetics
ABSTRACT
With the advent of large-scale genomic analysis, the genetic landscape of glioblastoma (GBM) has become more clear, including characteristic genetic alterations in EGFR. In routine clinical practice, genetic alterations in GBMs are identified using several disparate techniques that consume already limited amounts of tissue and add to overall testing costs. In this study, we sought to determine if the full spectrum of EGFR mutations in GBMs could be detected using a single next generation sequencing (NGS) based oncology assay in 34 consecutive cases. Using a battery of informatics tools to identify single nucleotide variants, insertions and deletions, and amplification (including variants EGFRvIII and EGFRvV), twenty-one of the 34 (62%) individuals had at least one alteration in EGFR by sequencing, consistent with published datasets. Mutations detected include several single nucleotide variants, amplification (confirmed by fluorescence in situ hybridization), and the variants EGFRvIII and EGFRvV (confirmed by multiplex ligation-dependent probe amplification). Here we show that a single NGS assay can identify the full spectrum of relevant EGFR mutations. Overall, sequencing based diagnostics have the potential to maximize the amount of genetic information obtained from GBMs and simultaneously reduce the total time, required specimen material, and costs associated with current multimodality studies.
Subject(s)
Brain Neoplasms/genetics , ErbB Receptors/genetics , Glioblastoma/genetics , High-Throughput Nucleotide Sequencing/methods , Mutation , DNA Sequence Analysis/methods , Adolescent , Adult , Aged , Brain Neoplasms/diagnosis , Female , Glioblastoma/diagnosis , Humans , Male , Middle Aged , Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: T-cell receptor (TCR) clonality assessment is a principal diagnostic test in the management of mycosis fungoides (MF). However, current polymerase chain reaction-based methods may produce ambiguous results, often because of low abundance of clonal T lymphocytes, resulting in weak clonal peaks that cannot be size-resolved by contemporary capillary electrophoresis (CE). OBJECTIVE: We sought to determine if next-generation sequencing (NGS)-based detection has increased sensitivity for T-cell clonality over CE-based detection in MF. METHODS: Clonality was determined by an NGS-based method in which the TCR-γ variable region was polymerase chain reaction amplified and the products sequenced to establish the identity of rearranged variable and joining regions. RESULTS: Of the 35 MF cases tested, 29 (85%) showed a clonal T-cell rearrangement by NGS, compared with 15 (44%) by standard CE detection. Three patients with MF had follow-up testing that showed identical, clonal TCR sequences in subsequent skin biopsy specimens. LIMITATIONS: Clonal T-cell populations have been described in benign conditions; evidence of clonality alone, by any method, is not sufficient for diagnosis. CONCLUSION: TCR clonality assessment by NGS has superior sensitivity compared with CE-based detection. Further, NGS enables tracking of specific clones across multiple time points for more accurate identification of recurrent MF.
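As a rough illustration of how sequence-level identity of rearrangements lends itself to clonality assessment and clone tracking, the sketch below counts reads per rearrangement and reports the most frequent one. The abstract does not state how clonality was called, so the clonotype representation (a tuple of variable-region label, joining-region label and junction sequence), the labels themselves, and the use of a simple read fraction are all hypothetical.

```python
from collections import Counter


def dominant_clonotype(reads):
    """Return (clonotype, fraction of reads) for the most frequent rearrangement.

    `reads` is a list of hashable clonotype identifiers, one per sequenced
    read. This is an illustrative summary statistic only, not a diagnostic
    criterion.
    """
    if not reads:
        raise ValueError("no reads")
    counts = Counter(reads)
    clonotype, n = counts.most_common(1)[0]
    return clonotype, n / len(reads)


# Made-up reads: labels such as "V2" and "seqA" are placeholders.
reads = (
    [("V2", "J1", "seqA")] * 6
    + [("V8", "J2", "seqB")] * 3
    + [("V3", "J1", "seqC")]
)
top, fraction = dominant_clonotype(reads)
```

Because each clonotype is identified by its sequence rather than by fragment size, the same tuple can be looked up in a later specimen, which is the property the abstract relies on for tracking clones across time points.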
Subject(s)
Genetic Predisposition to Disease , Mycosis Fungoides/diagnosis , Mycosis Fungoides/genetics , T-Cell Antigen Receptors/genetics , Skin Neoplasms/diagnosis , Skin Neoplasms/genetics , Adult , Aged , Molecular Cloning/methods , Neoplasm DNA/genetics , Factual Databases , Electrophoresis/methods , Female , Humans , Male , Middle Aged , Molecular Sequence Data , Local Neoplasm Recurrence/genetics , Local Neoplasm Recurrence/pathology , Polymerase Chain Reaction/methods , Prognosis , Retrospective Studies
ABSTRACT
Merkel cell carcinoma is a highly aggressive cutaneous neuroendocrine tumor that has been associated with Merkel cell polyomavirus in up to 80% of cases. Merkel cell polyomavirus is believed to influence pathogenesis, at least in part, through expression of the large T antigen, which includes a retinoblastoma protein-binding domain. However, there appears to be significant clinical and morphological overlap between polyomavirus-positive and polyomavirus-negative Merkel cell carcinoma cases. Although much of the recent focus of Merkel cell carcinoma pathogenesis has been on polyomavirus, the pathogenesis of polyomavirus-negative cases is still poorly understood. We hypothesized that there are underlying human somatic mutations that unify Merkel cell carcinoma pathogenesis across polyomavirus status, and to investigate we performed whole exome sequencing on five polyomavirus-positive cases and three polyomavirus-negative cases. We found that there were no significant differences in the overall number of single-nucleotide variations, copy number variations, insertion/deletions, and chromosomal rearrangements when comparing polyomavirus-positive to polyomavirus-negative cases. However, we did find that the retinoblastoma pathway genes harbored a high number of mutations in Merkel cell carcinoma. Furthermore, the retinoblastoma gene (RB1) was found to have nonsense truncating protein mutations in all three polyomavirus-negative cases; no such mutations were found in the polyomavirus-positive cases. In all eight cases, the retinoblastoma pathway dysregulation was confirmed by immunohistochemistry. Although polyomavirus-positive Merkel cell carcinoma is believed to undergo retinoblastoma dysregulation through viral large T antigen expression, our findings demonstrate that somatic mutations in polyomavirus-negative Merkel cell carcinoma lead to retinoblastoma dysregulation through an alternative pathway. 
This novel finding suggests that the retinoblastoma pathway dysregulation leads to an overlapping Merkel cell carcinoma phenotype and that oncogenesis occurs through either a polyomavirus-dependent (viral large T antigen expression) or polyomavirus-independent (host somatic mutation) mechanism.
Subject(s)
Tumor Biomarkers/genetics , Merkel Cell Carcinoma/genetics , DNA Mutational Analysis/methods , Exome , Retinoblastoma Genes , Mutation , Polyomavirus Infections/genetics , Retinoblastoma Protein/genetics , Skin Neoplasms/genetics , Tumor Virus Infections/genetics , Aged , Aged 80 and over , Tumor Biomarkers/analysis , Merkel Cell Carcinoma/chemistry , Merkel Cell Carcinoma/virology , Female , Genetic Predisposition to Disease , Humans , Immunohistochemistry , Male , Merkel cell polyomavirus/isolation & purification , Middle Aged , Phenotype , Polyomavirus Infections/metabolism , Polyomavirus Infections/virology , Retinoblastoma Protein/analysis , Skin Neoplasms/chemistry , Skin Neoplasms/virology , Tumor Virus Infections/metabolism , Tumor Virus Infections/virology
ABSTRACT
Targeted next-generation sequencing (NGS) cancer panels have become a popular method for the identification of clinically predictive mutations in cancer. Such methods typically detect single nucleotide variants (SNVs) and small insertions/deletions (indels) in known cancer genes and can provide further information regarding diagnosis in challenging surgical pathology cases, as well as identify therapeutic targets and prognostically significant mutations. However, in addition to SNVs and indels, other mutation classes, including copy number variants (CNVs) and translocations, can be simultaneously detected from targeted NGS data. Here, as proof of methods, we present clinical data which demonstrate that targeted NGS panels can separate synchronous liver tumors based on CNV status, in the absence of distinct SNVs and indels. Such CNV-based analysis can be performed without additional cost using existing targeted cancer panel data and publicly available software.
Subject(s)
Neuroendocrine Carcinoma/genetics , Gene Dosage , High-Throughput Nucleotide Sequencing/methods , Liver Neoplasms/genetics , Liver Neoplasms/pathology , Large-Core Needle Biopsy , Neuroendocrine Carcinoma/pathology , Humans , Multiple Primary Neoplasms/genetics , Precision Medicine
ABSTRACT
Here, we characterize the DNA methylation phenotypes of bone marrow cells from mice with hematopoietic deficiency of Dnmt3a or Dnmt3b (or both enzymes) or expressing the dominant-negative Dnmt3aR878H mutation [R882H in humans; the most common DNMT3A mutation found in acute myeloid leukemia (AML)]. Using these cells as substrates, we defined DNA remethylation after overexpressing wild-type (WT) DNMT3A1, DNMT3B1, DNMT3B3 (an inactive splice isoform of DNMT3B), or DNMT3L (a catalytically inactive "chaperone" for DNMT3A and DNMT3B in early embryogenesis). Overexpression of DNMT3A for 2 weeks reverses the hypomethylation phenotype of Dnmt3a-deficient cells or cells expressing the R878H mutation. Overexpression of DNMT3L (which is minimally expressed in AML cells) also corrects the hypomethylation phenotype of Dnmt3aR878H/+ marrow, probably by augmenting the activity of WT DNMT3A encoded by the residual WT allele. DNMT3L reactivation may represent a previously unidentified approach for restoring DNMT3A activity in hematopoietic cells with reduced DNMT3A function.
Subject(s)
DNA (Cytosine-5-)-Methyltransferases , Acute Myeloid Leukemia , Humans , Mice , Animals , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA Methyltransferase 3A , DNA , Mutation , DNA Methylation , Acute Myeloid Leukemia/genetics
ABSTRACT
PURPOSE: Persistent molecular disease (PMD) after induction chemotherapy predicts relapse in AML. In this study, we used whole-exome sequencing (WES) and targeted error-corrected sequencing to assess the frequency and mutational patterns of PMD in 30 patients with AML. MATERIALS AND METHODS: The study cohort included 30 patients with adult AML younger than 65 years who were uniformly treated with standard induction chemotherapy. Tumor/normal WES was performed for all patients at presentation. PMD analysis was evaluated in bone marrow samples obtained during clinicopathologic remission using repeat WES and analysis of patient-specific mutations and error-corrected sequencing of 40 recurrently mutated AML genes (MyeloSeq). RESULTS: WES for patient-specific mutations detected PMD in 63% of patients (19/30) using a minimum variant allele fraction (VAF) of 2.5%. In comparison, MyeloSeq identified persistent mutations above 0.1% VAF in 77% of patients (23/30). PMD was usually present at relatively high levels (>2.5% VAFs), such that WES and MyeloSeq agreed for 73% of patients despite differences in detection limits. Mutations in DNMT3A, ASXL1, and TET2 (ie, DTA mutations) were persistent in 16 of 17 patients, but WES also detected non-DTA mutations in 14 of these patients, which for some patients distinguished residual AML cells from clonal hematopoiesis. Surprisingly, MyeloSeq detected additional variants not identified at presentation in 73% of patients that were consistent with new clonal cell populations after chemotherapy. CONCLUSION: PMD and clonal hematopoiesis are both common in patients with AML in first remission. These findings demonstrate the importance of baseline testing for accurate interpretation of mutation-based tumor monitoring assays for patients with AML and highlight the need for clinical trials to determine whether these complex mutation patterns correlate with clinical outcomes in AML.
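The effect of the two detection limits quoted above (2.5% VAF for WES, 0.1% VAF for error-corrected sequencing) can be sketched with a small Python example. The read counts and mutation names are made up, VAF is taken here simply as variant-supporting reads divided by total reads at the site, and a plain threshold on that ratio is an assumption for illustration; real assays, and error-corrected sequencing in particular, involve considerably more than this.

```python
def variant_allele_fraction(alt_reads, total_reads):
    """VAF as variant-supporting reads over total reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def persistent_mutations(remission_counts, min_vaf):
    """Return names of mutations whose remission VAF meets the threshold.

    `remission_counts` maps a mutation name to (alt_reads, total_reads).
    """
    return [
        name
        for name, (alt, total) in remission_counts.items()
        if variant_allele_fraction(alt, total) >= min_vaf
    ]


# Hypothetical remission-sample read counts for three baseline mutations.
counts = {
    "mutation_A": (30, 400),   # VAF 7.5%
    "mutation_B": (2, 1000),   # VAF 0.2%
    "mutation_C": (0, 500),    # not detected
}
wes_like = persistent_mutations(counts, min_vaf=0.025)   # 2.5% threshold
deep_like = persistent_mutations(counts, min_vaf=0.001)  # 0.1% threshold
```

With these toy numbers the higher threshold reports only `mutation_A`, while the lower one also reports `mutation_B`, mirroring how two assays can disagree on a patient purely because of their detection limits.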
Subject(s)
Acute Myeloid Leukemia , Humans , Adult , Acute Myeloid Leukemia/genetics , Exome , Prognosis , Local Neoplasm Recurrence/genetics , DNA Sequence Analysis
ABSTRACT
TP53-mutated myeloid malignancies are most frequently associated with complex cytogenetics. The presence of complex and extensive structural variants complicates detailed genomic analysis by conventional clinical techniques. We performed whole genome sequencing of 42 AML/MDS cases with paired normal tissue to characterize the genomic landscape of TP53-mutated myeloid malignancies. The vast majority of cases had multi-hit involvement at the TP53 genetic locus (94%), as well as aneuploidy and chromothripsis. Chromosomal patterns of aneuploidy differed significantly from TP53-mutated cancers arising in other tissues. Recurrent structural variants affected regions that include ETV6 on chr12p, RUNX1 on chr21, and NF1 on chr17q. Most notably for ETV6, transcript expression was low in cases of TP53-mutated myeloid malignancies both with and without structural rearrangements involving chromosome 12p. Telomeric content is increased in TP53-mutated AML/MDS compared with other AML subtypes, and telomeric content was detected adjacent to interstitial regions of chromosomes. The genomic landscape of TP53-mutated myeloid malignancies reveals recurrent structural variants affecting key hematopoietic transcription factors and telomeric repeats that are generally not detected by panel sequencing or conventional cytogenetic analyses. Key Points: WGS comprehensively determines TP53 mutation status, resulting in the reclassification of 12% of cases from mono-allelic to multi-hit. Chromothripsis is more frequent than previously appreciated, with a preference for specific chromosomes. ETV6 is deleted in 45% of cases, with evidence for epigenetic suppression in non-deleted cases. NF1 is mutated in 48% of cases, with multi-hit mutations in 17% of these cases. TP53-mutated AML/MDS is associated with altered telomere content compared with other AMLs.
ABSTRACT
TP53-mutated myeloid malignancies are associated with complex cytogenetics and extensive structural variants, which complicates detailed genomic analysis by conventional clinical techniques. We performed whole-genome sequencing (WGS) of 42 acute myeloid leukemia (AML)/myelodysplastic syndromes (MDS) cases with paired normal tissue to better characterize the genomic landscape of TP53-mutated AML/MDS. WGS accurately determines TP53 allele status, a key prognostic factor, resulting in the reclassification of 12% of cases from monoallelic to multihit. Although aneuploidy and chromothripsis are shared with most TP53-mutated cancers, the specific chromosome abnormalities are distinct to each cancer type, suggesting a dependence on the tissue of origin. ETV6 expression is reduced in nearly all cases of TP53-mutated AML/MDS, either through gene deletion or presumed epigenetic silencing. Within the AML cohort, mutations of NF1 are highly enriched, with deletions of 1 copy of NF1 present in 45% of cases and biallelic mutations in 17%. Telomere content is increased in TP53-mutated AMLs compared with other AML subtypes, and abnormal telomeric sequences were detected in the interstitial regions of chromosomes. These data highlight the unique features of TP53-mutated myeloid malignancies, including the high frequency of chromothripsis and structural variation, the frequent involvement of unique genes (including NF1 and ETV6) as cooperating events, and evidence for altered telomere maintenance.
Subject(s)
Chromothripsis , Acute Myeloid Leukemia , Myelodysplastic Syndromes , Myeloproliferative Disorders , Humans , Mutation , Chromosome Aberrations , Acute Myeloid Leukemia/genetics , Acute Myeloid Leukemia/pathology , Myeloproliferative Disorders/genetics , Myelodysplastic Syndromes/genetics , Myelodysplastic Syndromes/pathology , Genomics , Tumor Suppressor Protein p53/genetics
ABSTRACT
We summarize the contributions of Group 9 of Genetic Analysis Workshop 17. This group addressed the problems of linkage disequilibrium and other longer range forms of allelic association when evaluating the effects of genotypes on phenotypes. Issues raised by long-range associations, whether a result of selection, stratification, possible technical errors, or chance, were less expected but proved to be important. Most contributors focused on regression methods of various types to illustrate problematic issues or to develop adaptations for dealing with high-density genotype assays. Study design was also considered, as was graphical modeling. Although no method emerged as uniformly successful, most succeeded in reducing false-positive results either by considering clusters of loci within genes or by applying smoothing metrics that required results from adjacent loci to be similar. Two unexpected results that questioned our assumptions of what is required to model linkage disequilibrium were observed. The first was that correlations between loci separated by large genetic distances can greatly inflate single-locus test statistics, and, whether the result of selection, stratification, possible technical errors, or chance, these correlations seem overabundant. The second unexpected result was that applying principal components analysis to genome-wide genotype data can apparently control not only for population structure but also for linkage disequilibrium.