ABSTRACT
Germline determinants of gene expression in tumors are infrequently studied due to the complexity of transcript regulation caused by somatically acquired alterations. We performed expression quantitative trait locus (eQTL)-based analyses using the multi-level information provided in The Cancer Genome Atlas (TCGA). Of the factors we measured, cis-acting eQTLs accounted for 1.2% of the total variation of tumor gene expression, while somatic copy-number alteration and CpG methylation accounted for 7.3% and 3.3%, respectively. eQTL analyses of 15 previously reported breast cancer risk loci resulted in the discovery of three variants that are significantly associated with transcript levels (false discovery rate [FDR] < 0.1). Our trans-based analysis identified an additional three risk loci to act through ESR1, MYC, and KLF4. These findings provide a more comprehensive picture of gene expression determinants in breast cancer as well as insights into the underlying biology of breast cancer risk loci.
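As a rough illustration of what "proportion of variation accounted for" means for a single cis-eQTL, the sketch below computes the R² of a simple linear regression of expression on genotype dosage (0, 1, or 2 copies of the alternate allele). This is an assumption made for illustration only: the abstract does not describe the statistical model used, and the function name and inputs are hypothetical.

```python
def variance_explained(genotypes, expression):
    """R^2 of a simple linear regression of expression on genotype dosage.

    Illustrative only: the abstract does not specify the study's model.
    """
    n = len(genotypes)
    if n != len(expression) or n < 2:
        raise ValueError("need two equal-length inputs with at least 2 samples")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    # Sums of squares and cross-products around the means
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((e - mean_e) ** 2 for e in expression)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation in genotype or expression, nothing to explain
    # For one predictor, R^2 equals the squared Pearson correlation
    return (sxy * sxy) / (sxx * syy)
```

In a multi-factor setting such as the one described (eQTLs, copy number, methylation), each factor's contribution would normally be estimated jointly rather than one predictor at a time.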
Subjects
Breast Neoplasms/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Quantitative Trait Loci , Cell Line, Tumor , Gene Expression Profiling , Humans , Kruppel-Like Factor 4
ABSTRACT
Current tailored-therapy efforts in cancer are largely focused on a small number of highly recurrently mutated driver genes but therapeutic targeting of these oncogenes remains challenging. However, the vast number of genes mutated infrequently across cancers has received less attention, in part, due to a lack of understanding of their biological significance. We present SYSMut, an extendable systems biology platform that can robustly infer the biologic consequences of somatic mutations by integrating routine multiomics profiles in primary tumors. We establish SYSMut's improved performance vis-à-vis state-of-the-art driver gene identification methodologies by recapitulating the functional impact of known driver genes, while additionally identifying novel functionally impactful mutated genes across 29 cancers. Subsequent application of SYSMut on low-frequency gene mutations in head and neck squamous cell (HNSC) cancers, followed by molecular and pharmacogenetic validation, revealed the lipidogenic network as a novel therapeutic vulnerability in aggressive HNSC cancers. SYSMut is thus a robust scalable framework that enables the discovery of new targetable avenues in cancer.
Subjects
Neoplasms , Humans , Mutation , Neoplasms/genetics , Oncogenes , Systems Biology
ABSTRACT
Cytogenetic analysis provides important information on the genetic mechanisms of cancer. The Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer (Mitelman DB) is the largest catalog of acquired chromosome aberrations, presently comprising >70 000 cases across multiple cancer types. Although this resource has enabled the identification of chromosome abnormalities leading to specific cancers and cancer mechanisms, a large-scale, systematic analysis of these aberrations and their downstream implications has been difficult due to the lack of a standard, automated mapping from aberrations to genomic coordinates. We previously introduced CytoConverter as a tool that automates such conversions. CytoConverter has now been updated with improved interpretation of karyotypes and has been integrated with the Mitelman DB, providing a comprehensive mapping of the 70 000+ cases to genomic coordinates, as well as visualization of the frequencies of chromosomal gains and losses. Importantly, all CytoConverter-generated genomic coordinates are publicly available in Google BigQuery, a cloud-based data warehouse, facilitating data exploration and integration with other datasets hosted by the Institute for Systems Biology Cancer Gateway in the Cloud (ISB-CGC) Resource. We demonstrate the use of BigQuery for integrative analysis of Mitelman DB with other cancer datasets, including a comparison of the frequency of imbalances identified in Mitelman DB cases with those found in The Cancer Genome Atlas (TCGA) copy number datasets. This solution provides opportunities to leverage the power of cloud computing for low-cost, scalable, and integrated analysis of chromosome aberrations and gene fusions in cancer.
Subjects
Cloud Computing , Neoplasms , Humans , Chromosome Aberrations , Karyotyping , Neoplasms/genetics , Gene Fusion
ABSTRACT
BACKGROUND & AIMS: Mechanisms contributing to the onset and progression of Barrett's (BE)-associated esophageal adenocarcinoma (EAC) remain elusive. Here, we interrogated the major signaling pathways deregulated early in the development of Barrett's neoplasia. METHODS: Whole-transcriptome RNA sequencing analysis was performed in primary BE, EAC, normal esophageal squamous, and gastric biopsy tissues (n = 89). Select pathway components were confirmed by quantitative polymerase chain reaction in an independent cohort of premalignant and malignant biopsy tissues (n = 885). The functional impact of the selected pathway was interrogated using transcriptomic, proteomic, and pharmacogenetic analyses in mammalian esophageal organotypic and patient-derived BE/EAC cell line models, in vitro and/or in vivo. RESULTS: The vast majority of primary BE/EAC tissues and cell line models showed hyperactivation of EphB2 signaling. Transcriptomic/proteomic analyses identified EphB2 as an endogenous binding partner of MYC binding protein 2, and an upstream regulator of c-MYC. Knockdown of EphB2 significantly impeded the viability/proliferation of EAC and BE cells in vitro/in vivo. Activation of EphB2 in normal esophageal squamous 3-dimensional organotypes disrupted epithelial maturation and promoted columnar differentiation programs, notably including MYC. EphB2 and MYC showed selective induction in esophageal submucosal glands with acinar ductal metaplasia, and in a porcine model of BE-like esophageal submucosal gland spheroids. Clinically approved inhibitors of MEK, a protein kinase that regulates MYC, effectively suppressed EAC tumor growth in vivo. CONCLUSIONS: EphB2 signaling is frequently hyperactivated across the BE-EAC continuum. EphB2 is an upstream regulator of MYC, and activation of the EphB2-MYC axis likely precedes BE development. Targeting EphB2/MYC could be a promising therapeutic strategy for this often refractory and aggressive cancer.
Subjects
Barrett Esophagus , Carcinoma, Squamous Cell , Esophageal Neoplasms , Swine , Animals , Barrett Esophagus/pathology , Ephrin-B2/genetics , Proteomics , Esophageal Neoplasms/pathology , Carcinoma, Squamous Cell/pathology , Proto-Oncogenes , Protein-Tyrosine Kinases/genetics , Mitogen-Activated Protein Kinase Kinases/genetics , Mammals/genetics
ABSTRACT
Idiopathic aplastic anemia (IAA) is a rare autoimmune bone marrow failure (BMF) disorder initiated by a human leukocyte antigen (HLA)-restricted T-cell response to unknown antigens. As in other autoimmune disorders, the predilection for certain HLA profiles seems to represent an etiologic factor; however, the structure-function patterns involved in the self-presentation in this disease remain unclear. Herein, we analyzed the molecular landscape of HLA complexes of a cohort of 300 IAA patients and almost 3000 healthy and disease controls by deeply dissecting their genotypic configurations, functional divergence, self-antigen binding capabilities, and T-cell receptor (TCR) repertoire specificities. Specifically, analysis of the evolutionary divergence of HLA genotypes (HED) showed that IAA patients carried class II HLA molecules whose antigen-binding sites were characterized by a high level of structural homology, only partially explained by specific risk allele profiles. This pattern implies reduced HLA binding capabilities, confirmed by binding analysis of hematopoietic stem cell (HSC)-derived self-peptides. The IAA phenotype was associated with enrichment of a few amino acids at specific positions within the peptide-binding groove of DRB1 molecules, affecting the HLA-antigen-TCR β interface and potentially constituting the basis of T-cell dysfunction and autoreactivity. When analyzing associations with clinical outcomes, low HED was associated with risk of malignant progression and worse survival, underlying reduced tumor surveillance in clearing potential neoantigens derived from mechanisms of clonal hematopoiesis. Our data shed light on the immunogenetic risk associated with IAA etiology and clonal evolution and on general pathophysiological mechanisms potentially involved in other autoimmune disorders.
Subjects
Anemia, Aplastic/genetics , Genes, MHC Class II , HLA-D Antigens/genetics , Adult , Alleles , Cohort Studies , Female , Genotype , Humans , Male , Middle Aged
ABSTRACT
Patients with heritable cancer syndromes characterized by germline PTEN mutations (termed PTEN hamartoma tumor syndrome, PHTS) benefit from PTEN-enabled cancer risk assessment and clinical management. PTEN-wildtype patients (~50%) remain at increased risk of developing certain cancers. Existence of germline mutations in other known cancer susceptibility genes has not been explored in these patients, with implications for different medical management. We conducted a 4-year multicenter prospective study of incident patients with features of Cowden/Cowden-like (CS/CS-like) and Bannayan-Riley-Ruvalcaba syndromes (BRRS) without PTEN mutations. Exome sequencing and targeted analysis were performed including 59 clinically actionable genes from the American College of Medical Genetics and Genomics (ACMG) and 24 additional genes associated with inherited cancer syndromes. Pathogenic or likely pathogenic cancer susceptibility gene alterations were found in 7 of the 87 (8%) CS/CS-like and BRRS patients and included MUTYH, RET, TSC2, BRCA1, BRCA2, ERCC2 and HRAS. We found classic phenotypes associated with the identified genes in 5 of the 7 (71.4%) patients. Variant positive patients were enriched for the presence of second malignant neoplasms compared to patients without identified variants (OR = 6.101, 95% CI 1.143-35.98, p = 0.035). Germline variant spectrum and frequencies were compared to The Cancer Genome Atlas (TCGA), including 6 apparently sporadic cancers associated with PHTS. With comparable overall prevalence of germline variants, the spectrum of mutated genes was different in our patients compared to TCGA. Intriguingly, we also found notable enrichment of variants of uncertain significance (VUS) in our patients (OR = 2.3, 95% CI 1.5-3.5, p = 0.0002). Our data suggest that only a small subset of PTEN-wildtype CS/CS-like and BRRS patients could be accounted for by germline variants in some of the known cancer-related genes. 
Thus, the existence of alterations in other and more likely non-classic cancer-associated genes is plausible, reflecting the complexity of these heterogeneous hereditary cancer syndromes.
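The odds ratios quoted above compare how often a feature occurs in two groups. As a minimal sketch, the function below computes an odds ratio from a 2x2 table together with a Wald (log-scale) 95% confidence interval. The abstract does not state how its intervals were computed, so this is one common approach and not necessarily the authors' method; the function name and arguments are hypothetical.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: group 1 with the feature     b: group 1 without the feature
    c: group 2 with the feature     d: group 2 without the feature
    Illustrative only; exact methods are often preferred for small counts.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)
```

With small cell counts, as is likely in a cohort of 87 patients, the Wald interval can be unreliable, which is why exact methods are often used instead.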
Subjects
Germ-Line Mutation , Hamartoma Syndrome, Multiple/genetics , Oncogenes , PTEN Phosphohydrolase/genetics , Adolescent , Adult , Aged , Child , Child, Preschool , DNA Glycosylases/genetics , DNA Mutational Analysis , Female , Genes, BRCA1 , Genes, BRCA2 , Genetic Predisposition to Disease , Genetic Variation , Humans , Infant , Male , Middle Aged , Neoplasms, Multiple Primary/genetics , Prospective Studies , Proto-Oncogene Proteins c-ret/genetics , Proto-Oncogene Proteins p21(ras)/genetics , Tuberous Sclerosis Complex 2 Protein , Tumor Suppressor Proteins/genetics , Exome Sequencing , Xeroderma Pigmentosum Group D Protein/genetics , Young Adult
ABSTRACT
Several pediatric mitochondrial disorders, including Leigh syndrome (LS), impact mitochondrial (mt) genetics, development, and metabolism, leading to complex pathologies and energy failure. The extent to which pathogenic mtDNA variants regulate disease severity in LS is currently not well understood. To better understand this relationship, we computed a glycolytic bioenergetics health index (BHI) for measuring mitochondrial dysfunction in LS patient fibroblast cells harboring varying percentages of pathogenic mutant mtDNA (T8993G, T9185C) exhibiting deficiency in complex V or complex I (T10158C, T12706C). A high percentage (>90%) of pathogenic mtDNA in cells affecting complex V and a low percentage (<39%) of pathogenic mtDNA in cells affecting complex I were quantified. Levels of defective enzyme activities of the electron transport chain correlated with the percentage of pathogenic mtDNA. Subsequent bioenergetics assays showed that the cell lines relied on both OXPHOS and glycolysis to meet energy requirements. The results suggest that, although the precise mechanism of LS has not been elucidated, a multi-pronged approach taking into consideration the specific pathogenic mtDNA variant, glycolytic BHI, and the composite BHI (average ratio of OXPHOS to glycolysis) can aid in better understanding the factors influencing disease severity in LS.
Subjects
DNA, Mitochondrial/metabolism , Fibroblasts/metabolism , Glycolysis , Leigh Disease/metabolism , Mutation , Oxidative Phosphorylation , Adult , Child , Child, Preschool , DNA, Mitochondrial/genetics , Female , Humans , Infant , Leigh Disease/genetics , Male
ABSTRACT
BACKGROUND: Cytogenetic nomenclature is used to describe chromosomal aberrations (or lack thereof) in a collection of cells, referred to as the cells' karyotype. The nomenclature identifies locations on chromosomes using a system of cytogenetic bands, each with a unique name and region on a chromosome. Each band is microscopically visible after staining, and encompasses a large portion of the chromosome. More modern analyses employ genomic coordinates, which precisely specify a chromosomal location according to its distance from the end of the chromosome. Currently, there is no tool to convert cytogenetic nomenclature into genomic coordinates. Since locations of genes and other genomic features are usually specified by genomic coordinates, a conversion tool will facilitate the identification of the features that are harbored in the regions of chromosomal gain and loss that are implied by a karyotype. RESULTS: Our tool, termed CytoConverter, takes as input either a single karyotype or a file consisting of multiple karyotypes from several individuals. All net chromosomal gains and losses implied by the karyotype are returned in standard genomic coordinates, along with the numbers of cells harboring each aberration if included in the input. CytoConverter also returns graphical output detailing areas of gains and losses of chromosomes and chromosomal segments. CONCLUSIONS: CytoConverter is available as a web-based application at https://jxw773.shinyapps.io/Cytogenetic__software/ and as an R script at https://sourceforge.net/projects/cytoconverter/ . Supplemental Material detailing the underlying algorithms is available.
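The conversion described above can be pictured with a small sketch that parses one simple aberration, an interstitial deletion written as `del(5)(q13q33)`, and looks up the two bands in a band-to-coordinate table. This is not CytoConverter's algorithm: the function name, the regular expression, and the table layout are assumptions for illustration, and a real table would come from a cytoband annotation file for a specific genome build.

```python
import re

# Matches only simple two-band deletions such as del(5)(q13q33); hypothetical helper.
DEL_PATTERN = re.compile(
    r"^del\((\d+|X|Y)\)\(([pq]\d+(?:\.\d+)?)([pq]\d+(?:\.\d+)?)\)$"
)

def deletion_to_interval(aberration, band_table):
    """Return (chromosome, start, end) of the region lost in a simple deletion.

    band_table maps (chromosome, band) -> (start, end) in genomic coordinates.
    """
    match = DEL_PATTERN.match(aberration)
    if match is None:
        raise ValueError(f"unsupported aberration: {aberration}")
    chrom, band_a, band_b = match.groups()
    start_a, end_a = band_table[(chrom, band_a)]
    start_b, end_b = band_table[(chrom, band_b)]
    # The lost segment spans from the outer edge of one band to the other
    return chrom, min(start_a, start_b), max(end_a, end_b)
```

A full converter also has to handle translocations, derivative chromosomes, whole-chromosome gains and losses, and per-clone cell counts, none of which this sketch attempts.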
Subjects
Chromosome Aberrations , Cytogenetics/methods , Genomics/methods , Internet/instrumentation , Karyotype , Humans
ABSTRACT
Although mitochondrial genomes (mtDNA) accumulate elevated levels of mutations in cancer cells, the origin and functional impact of these mutations remain controversial. Here, we queried whole-genome sequence data from 1,916 patients across 24 cancer types to characterize patterns of mtDNA mutations and elucidate the selective constraints driving their fate. Given that mitochondrial genomes are polyploid, cells with advantageous levels of mtDNA mutations can be selected for depending on their cellular environment. Therefore, we tracked changes in per-cell abundances of mtDNA mutations from normal to tumor cells in the same patient. Tumor mitochondrial genomes show distinct mutational patterns and are disproportionately enriched for protein-altering changes. Moreover, protein-altering mtDNA variants that are initially present at low frequencies in normal cells preferentially expand in the altered tumor environment, suggesting selective advantage. We also perform these analyses with attention to the cancer's tissue of origin, which revealed tissue-specific differences in selective signals. The mitochondrial genomes in renal chromophobe and thyroid cancers show particularly strong signals of positive selection, indicated by higher proportions and per-cell abundances of truncating variants. Dramatic tumor- and tissue-specific variations in selective pressures suggest that cancer cells with advantageous levels of damaged mitochondrial genomes will selectively proliferate to facilitate the tumorigenic process.
Subjects
DNA, Mitochondrial/genetics , Neoplasms/genetics , DNA, Mitochondrial/metabolism , Databases, Nucleic Acid , Female , Genome, Mitochondrial/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Male , Mitochondria/genetics , Mutation , Whole Genome Sequencing
ABSTRACT
Cowden syndrome (CS) is an autosomal dominant disorder that predisposes to breast, thyroid, and other epithelial cancers. Differentiated thyroid carcinoma (DTC), as one of the major component cancers of CS, is the fastest rising incident cancer in the USA, and the most familial of all solid tumours. To identify additional candidate genes of CS and potentially DTC, we analysed a multi-generation CS-like family with papillary thyroid cancer (PTC), applying a combined linkage-based and whole-genome sequencing strategy and identified an in-frame germline compound heterozygous deletion, p.[Gln1478del];[Gln1476-Gln1478del] in USF3 (previously known as KIAA2018). Among 90 unrelated CS/CS-like individuals, 29% were found to have p.[Gln1478del];[Gln1476-Gln1478del]. Of 497 TCGA PTC individuals, 138 (27%) were found to carry this germline compound deletion, with somatically decreased tumour USF3 expression. We demonstrate an increased migration phenotype along with an enhanced epithelial-to-mesenchymal transition (EMT) signature after USF3 knockdown or USF3 p.[Gln1478del];[Gln1476-Gln1478del] overexpression, which sensitizes cells to endoplasmic reticulum (ER) stress. Loss of USF3 function induced cell necrosis-like features and impaired respiratory capacity while providing a glutamine-dependent cell survival advantage, strongly suggesting a metabolic survival and migration-favouring microenvironment for carcinogenesis. Therefore, USF3 may be involved in the predisposition to thyroid cancer. Importantly, the glutamine-dependent survival and sensitivity to ER stress of USF3-deficient cells provide avenues for therapeutic and adjunct preventive interventions for both sporadic cancer and cancer predisposition syndromes with similar mechanisms.
Subjects
Basic Helix-Loop-Helix Transcription Factors/genetics , Carcinoma/genetics , Genetic Predisposition to Disease , Hamartoma Syndrome, Multiple/genetics , Thyroid Neoplasms/genetics , Upstream Stimulatory Factors/genetics , Carcinoma/pathology , Carcinoma, Papillary , Cell Movement , Endoplasmic Reticulum Stress/genetics , Epithelial-Mesenchymal Transition/genetics , Female , Genome, Human , Genotype , Germ-Line Mutation , Hamartoma Syndrome, Multiple/pathology , Heterozygote , High-Throughput Nucleotide Sequencing , Humans , Male , Pedigree , Peptides/genetics , Sequence Deletion , Thyroid Cancer, Papillary , Thyroid Gland/pathology , Thyroid Neoplasms/pathology , Tumor Microenvironment/genetics
ABSTRACT
Compound heterozygous germline mutations in the CTC1 gene have been found in patients with atypical dyskeratosis congenita (DC), whereas heterozygous carriers are unaffected. Through screening of a large cohort of adult patients with acquired bone marrow failure syndromes, in addition to a DC case, we have also found extremely rare or novel heterozygous deleterious germline variants of CTC1 in patients with aplastic anaemia (AA; n = 5), paroxysmal nocturnal haemoglobinuria (PNH; n = 3) and myelodysplastic syndrome (MDS; n = 2). A compound heterozygous case of AA showed clonal evolution. Our results suggest that some of the inherited CTC1 variants may represent predisposition factors for acquired bone marrow failure.
Subjects
Bone Marrow Failure Disorders/genetics , Germ-Line Mutation , Telomere-Binding Proteins/genetics , Telomere/genetics , Adult , Aged , Aged, 80 and over , Bone Marrow Failure Disorders/metabolism , Bone Marrow Failure Disorders/pathology , Female , Humans , Male , Middle Aged , Telomere/metabolism , Telomere/pathology , Telomere-Binding Proteins/metabolism
ABSTRACT
BACKGROUND: Chromosomal deletions represent an important class of human genetic variation. Various methods have been developed to mine "next-generation" sequencing (NGS) data to detect deletions and quantify their clonal abundances. These methods have focused almost exclusively on the nuclear genome, ignoring the mitochondrial chromosome (mtDNA). Detecting mtDNA deletions requires special care. First, the chromosome's relatively small size (16,569 bp) necessitates the ability to detect extremely focal events. Second, the chromosome can be present at thousands of copies in a single cell (in contrast to two copies of nuclear chromosomes), and mtDNA deletions may be present on only a very small percentage of chromosomes. Here we present a method, termed MitoDel, to detect mtDNA deletions from NGS data. RESULTS: We validate the method on simulated and real data, and show that MitoDel can detect novel and previously-reported mtDNA deletions. We establish that MitoDel can find deletions such as the "common deletion" at heteroplasmy levels well below 1%. CONCLUSIONS: MitoDel is a tool for detecting large mitochondrial deletions at low heteroplasmy levels. The tool can be downloaded at http://mendel.gene.cwru.edu/laframboiselab/ .
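To make the notion of a deletion's heteroplasmy level concrete, the sketch below estimates it as the fraction of informative reads that support the deletion junction, given counts of junction-spanning reads and of reads covering the intact sequence at the same breakpoint. This is an assumed, simplified estimator and not MitoDel's published procedure; the function names and the minimum-read threshold are hypothetical.

```python
def deletion_heteroplasmy(junction_reads, wildtype_reads):
    """Fraction of mtDNA copies carrying the deletion, estimated from read counts.

    junction_reads: reads split across the deletion junction
    wildtype_reads: reads covering the intact sequence at the breakpoint
    """
    total = junction_reads + wildtype_reads
    if total == 0:
        raise ValueError("no informative reads at this breakpoint")
    return junction_reads / total

def is_detectable(junction_reads, min_supporting_reads=3):
    """Crude call filter: require a minimum number of junction-supporting reads."""
    return junction_reads >= min_supporting_reads
```

Because mtDNA is typically sequenced at very high depth, a handful of junction reads among thousands of wild-type reads still corresponds to a heteroplasmy well below 1%, the regime the abstract highlights.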
Subjects
DNA, Mitochondrial/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Deletion , Adult , Aged , Brain/metabolism , Computer Simulation , Genetic Variation , Genome, Mitochondrial , Humans , Mitochondria/genetics , Time Factors
ABSTRACT
Myelodysplastic syndromes are typically diseases of older adults. Patients in whom the onset is early may have distinct molecular and clinical features or reflect a demographic continuum. The identification of differences between "early onset" patients and those diagnosed at a traditional age has the potential to advance understanding of the pathogenesis of myelodysplasia and may lead to formation of distinct morphological subcategories. We studied a cohort of 634 patients with various subcategories of myelodysplastic syndrome and secondary acute myeloid leukemia, stratifying them based on age at presentation and clinical parameters. We then characterized molecular abnormalities detected by next-generation deep sequencing of 60 genes that are commonly mutated in myeloid malignancies. The number of mutations increased linearly with age and on average, patients >50 years of age had more mutations. TET2, SRSF2, and DNMT3A were more commonly mutated in patients >50 years old compared to patients ≤50 years old. In general, patients >50 years of age also had more mutations in spliceosomal, epigenetic modifier, and RAS gene families. Although there are age-related differences in molecular features among patients with myelodysplasia, most notably in the incidence of SRSF2 mutations, our results suggest that patients ≤50 years old belong to a disease continuum with a distinct pattern of early onset ancestral events.
Subjects
Age of Onset , Mutation , Myelodysplastic Syndromes/genetics , Adult , Age Factors , Aged , Aged, 80 and over , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA Methyltransferase 3A , DNA Mutational Analysis , DNA-Binding Proteins/genetics , Dioxygenases , Female , High-Throughput Nucleotide Sequencing , Humans , Male , Middle Aged , Proto-Oncogene Proteins/genetics , Serine-Arginine Splicing Factors/genetics , Young Adult
ABSTRACT
Systematic efforts are underway to decipher the genetic changes associated with tumor initiation and progression. However, widespread clinical application of this information is hampered by an inability to identify critical genetic events across the spectrum of human tumors with adequate sensitivity and scalability. Here, we have adapted high-throughput genotyping to query 238 known oncogene mutations across 1,000 human tumor samples. This approach established robust mutation distributions spanning 17 cancer types. Of 17 oncogenes analyzed, we found 14 to be mutated at least once, and 298 (30%) samples carried at least one mutation. Moreover, we identified previously unrecognized oncogene mutations in several tumor types and observed an unexpectedly high number of co-occurring mutations. These results offer a new dimension in tumor genetics, where mutations involving multiple cancer genes may be interrogated simultaneously and in 'real time' to guide cancer classification and rational therapeutic intervention.
Subjects
DNA Mutational Analysis/methods , Mutation , Neoplasms/genetics , Oncogenes , Gene Expression Profiling , Genome, Human , Genotype , Humans
ABSTRACT
During tumor initiation and progression, cancer cells acquire a selective advantage, allowing them to outcompete their normal counterparts. Identification of the genetic changes that underlie these tumor acquired traits can provide deeper insights into the biology of tumorigenesis. Regions of copy number alterations and germline DNA variants are some of the elements subject to selection during tumor evolution. Integrated examination of inherited variation and somatic alterations holds the potential to reveal specific nucleotide alleles that a tumor "prefers" to have amplified. Next-generation sequencing of tumor and matched normal tissues provides a high-resolution platform to identify and analyze such somatic amplicons. Within an amplicon, examination of informative (e.g., heterozygous) sites deviating from a 1:1 ratio may suggest selection of that allele. A naive approach examines the reads for each heterozygous site in isolation; however, this ignores available valuable linkage information across sites. We, therefore, present a novel hidden Markov model-based method-Haplotype Amplification in Tumor Sequences (HATS)-that analyzes tumor and normal sequence data, along with training data for phasing purposes, to infer amplified alleles and haplotypes in regions of copy number gain. Our method is designed to handle rare variants and biases in read data. We assess the performance of HATS using simulated amplified regions generated from varying copy number and coverage levels, followed by amplicons in real data. We demonstrate that HATS infers the amplified alleles more accurately than does the naive approach, especially at low to intermediate coverage levels and in cases (including high coverage) possessing stromal contamination or allelic bias.
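The naive per-site approach mentioned above can be sketched as an exact two-sided binomial test of the read counts at one heterozygous site against the expected 1:1 ratio, calling the allele with more reads as amplified when the deviation is significant. This illustrates only the baseline being compared against, not HATS itself; the function names and the significance threshold are assumptions.

```python
from math import comb

def binomial_two_sided_p(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for a 1:1 allelic ratio at one site."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    k = min(ref_reads, alt_reads)
    # With p = 0.5 the distribution is symmetric, so double the lower tail
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

def naive_amplified_allele(ref_reads, alt_reads, alpha=0.05):
    """Return 'ref', 'alt', or None if the site shows no significant imbalance."""
    if binomial_two_sided_p(ref_reads, alt_reads) >= alpha:
        return None
    return "ref" if ref_reads > alt_reads else "alt"
```

Treating each site independently in this way discards the linkage between neighbouring sites on the same haplotype, which is the information the hidden Markov model is described as exploiting.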
Subjects
Computational Biology/methods , Gene Amplification , Haplotypes , Neoplasms/genetics , Computer Simulation , Humans , Models, Genetic , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
PURPOSE: To date, the standard nosology and prognostic schemes for myeloid neoplasms have been based on morphologic and cytogenetic criteria. We sought to test the hypothesis that a comprehensive, unbiased analysis of somatic mutations may allow for an improved classification of these diseases to predict outcome (overall survival). EXPERIMENTAL DESIGN: We performed whole-exome sequencing (WES) of 274 myeloid neoplasms, including myelodysplastic syndrome (MDS, N=75), myelodysplastic/myeloproliferative neoplasia (MDS/MPN, N=33), and acute myeloid leukemia (AML, N=22), augmenting the resulting mutational data with public WES results from AML (N=144). We fit random survival forests (RSFs) to the patient survival and clinical/cytogenetic data, with and without gene mutation information, to build prognostic classifiers. A targeted sequencing assay was used to sequence predictor genes in an independent cohort of 507 patients, whose accompanying data were used to evaluate performance of the risk classifiers. RESULTS: We show that gene mutations modify the impact of standard clinical variables on patient outcome, and therefore their incorporation hones the accuracy of prediction. The mutation-based classification scheme robustly predicted patient outcome in the validation set (log rank P=6.77 × 10⁻²¹; poor prognosis vs. good prognosis categories HR 10.4, 95% CI 3.21-33.6). The RSF-based approach also compares favorably with recently published efforts to incorporate mutational information for MDS prognosis. CONCLUSION: The results presented here support the inclusion of mutational information in prognostic classification of myeloid malignancies. Our classification scheme is implemented in a publicly available web-based tool (http://myeloid-risk.case.edu/).
Subjects
Bone Marrow Neoplasms/genetics , Exome , Bone Marrow Neoplasms/classification , Bone Marrow Neoplasms/physiopathology , Cohort Studies , Prognosis
ABSTRACT
The presence of mitochondrial DNA (mtDNA) mutations in human cancer has long been recognized, but their functional significance has remained obscure. Debate persists as to whether the mutations help drive the tumor, or are bystander events. Here, we analyze next-generation mtDNA sequence data from 99 breast cancer patients. High depth coverage enables detection of even low-level heteroplasmic variants, and data from matched normal tissue allow us to distinguish between shifts in heteroplasmy and acquired mutations. Somatic mtDNA mutations are found in 73 (73.7%) of patient tumors, and dramatic shifts from the initial germline allele proportions are observed for many heteroplasmies. Clustering of somatic mutations in promoter and replication regions, and also in genes coding for electron transport chain complex I, suggest selection for mutations affecting critical mitochondrial processes. Furthermore, statistical tests for Darwinian selection reveal evidence for positive and relaxed negative selection for somatic missense mutations. We also observe a dramatic decrease in per-cell mtDNA content in tumor tissues, as well as a surprising positive correlation between somatic mtDNA mutational burden and patient survival. Taken together, our results support the view that somatic mtDNA mutations are not solely bystander events, but have significance in cancer from both biological and clinical perspectives. We also anticipate that the catalog of heteroplasmies and somatic mutations presented here will serve as a reference for future studies of cancer mitochondrial genomes.
Subjects
Breast Neoplasms/genetics , Genome, Mitochondrial , Mutation , Adult , Aged , Aged, 80 and over , Breast Neoplasms/diagnosis , Breast Neoplasms/mortality , Cell Transformation, Neoplastic/genetics , DNA, Mitochondrial , Female , Gene Dosage , Genome-Wide Association Study , Genomics , Germ-Line Mutation , High-Throughput Nucleotide Sequencing , Humans , Middle Aged , Neoplasm Staging , Patient Outcome Assessment , RNA, Transfer/genetics
ABSTRACT
BACKGROUND: With the advent of paired-end high-throughput sequencing, it is now possible to identify various types of structural variation on a genome-wide scale. Although many methods have been proposed for structural variation detection, most do not provide precise boundaries for identified variants. In this paper, we propose a new method, Distribution Based detection of Duplication Boundaries (DB2), for accurate detection of tandem duplication breakpoints, an important class of structural variation, with high precision and recall. RESULTS: Our computational experiments on simulated data show that DB2 outperforms state-of-the-art methods in terms of finding breakpoints of tandem duplications, with a higher positive predictive value (precision) in calling the duplications' presence. In particular, DB2's prediction of tandem duplications is correct 99% of the time even for very noisy data, while narrowing down the space of possible breakpoints to within a margin of 15 to 20 bp on average. Most of the existing methods provide boundaries in ranges that extend to hundreds of bases, with lower precision values. Our method is also highly robust to varying properties of the sequencing library and to the sizes of the tandem duplications, as shown by its stable precision, recall and mean boundary mismatch performance. We demonstrate our method's efficacy using both simulated paired-end reads and reads generated from a melanoma sample and two ovarian cancer samples. Newly discovered tandem duplications are validated using PCR and Sanger sequencing. CONCLUSIONS: Our method, DB2, uses discordantly aligned reads, taking into account the distribution of fragment length to predict tandem duplications along with their breakpoints on a donor genome. The proposed method fine-tunes the breakpoint calls by applying a novel probabilistic framework that incorporates the empirical fragment length distribution to score each feasible breakpoint. DB2 is implemented in the Java programming language and is freely available at http://mendel.gene.cwru.edu/laframboiselab/software.php.
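The idea of scoring a feasible breakpoint against the empirical fragment length distribution can be illustrated with a minimal sketch. This is not DB2's implementation (DB2 is written in Java, and its actual scoring model is not given in the abstract). The function names, the coordinate convention, the pseudo-probability and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def make_log_pmf(observed_lengths, pseudo=1e-6):
    """Build log P(length) from an empirical sample of fragment lengths."""
    counts = Counter(observed_lengths)
    total = len(observed_lengths)

    def log_p(length):
        # The pseudo-probability keeps unseen lengths finite but heavily penalised.
        return math.log(counts.get(length, 0) / total + pseudo)

    return log_p


def implied_length(fwd_start, rev_end, dup_start, dup_end):
    """Fragment length if a pair spans the junction of a tandem duplication.

    Assumed model: the fragment runs from fwd_start to dup_end in the first
    copy, then from dup_start to rev_end in the second copy (1-based, inclusive).
    """
    return (dup_end - fwd_start + 1) + (rev_end - dup_start + 1)


def score_candidate(pairs, dup_start, dup_end, log_p):
    """Sum the log-probabilities of implied lengths; -inf if a pair cannot fit."""
    total = 0.0
    for fwd_start, rev_end in pairs:
        if fwd_start > dup_end or rev_end < dup_start:
            return float("-inf")
        total += log_p(implied_length(fwd_start, rev_end, dup_start, dup_end))
    return total


# Hypothetical toy data: a library centred on 300 bp and three discordant
# pairs given as (forward read start, reverse read end).
library = [300] * 50 + [290] * 20 + [310] * 20 + [280] * 5 + [320] * 5
log_p = make_log_pmf(library)
pairs = [(1901, 1199), (1851, 1139), (1951, 1259)]

true_score = score_candidate(pairs, 1000, 2000, log_p)
shifted_score = score_candidate(pairs, 1000, 2030, log_p)
```

In this toy setting the candidate (1000, 2000) implies fragment lengths of 300, 290 and 310 bp, all common in the library, so it scores higher than a candidate whose end is shifted by 30 bp. Under this simplified model the likelihood depends only on the duplication span plus the feasibility check, so pinning down each boundary individually would need further evidence that the sketch does not model.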
Subjects
Algorithms , DNA Breaks , Gene Duplication , Genome , Base Pair Mismatch , Female , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Sequence Analysis, DNA , Software , fms-Like Tyrosine Kinase 3/genetics
ABSTRACT
BACKGROUND: Sequences up to several megabases in length have been found to be present in individual genomes but absent from the human reference genome. These sequences may be common in populations, and their absence from the reference genome may indicate rare variants in the genomes of individuals who served as donors for the human genome project. As the reference genome is used in probe design for microarray technology and in mapping short reads in next-generation sequencing (NGS), this missing sequence could be a source of bias in functional genomic studies and variant analysis. One End Anchor (OEA) and/or orphan reads from paired-end sequencing have been used to identify novel sequences that are absent from the reference genome. However, no study has investigated the distribution, evolution and functionality of those sequences in human populations. RESULTS: To systematically identify and study the missing common sequences (micSeqs), we extended the previous method by pooling OEA reads from a large number of individuals and applying strict filtering methods to remove false sequences. The pipeline was applied to data from phase 1 of the 1000 Genomes Project. We identified 309 micSeqs that are present in at least 1% of the human population, but absent from the reference genome. We confirmed 76% of these 309 micSeqs by comparison to other primate genomes, individual human genomes, and gene expression data. Furthermore, we randomly selected fifteen micSeqs and confirmed their presence using PCR validation in 38 additional individuals. Functional analysis using published RNA-seq and ChIP-seq data showed that eleven micSeqs are highly expressed in human brain and three micSeqs contain transcription factor (TF) binding regions, suggesting they are functional elements. In addition, the identified micSeqs are absent in non-primates and show dynamic acquisition during primate evolution, culminating in most micSeqs being present in Africans, suggesting some micSeqs may be important sources of human diversity. CONCLUSIONS: 76% of micSeqs were confirmed by a comparative genomics approach. Fourteen micSeqs are expressed in human brain or contain TF binding regions. Some micSeqs are primate-specific, conserved and may play a role in the evolution of primates.
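The frequency criterion in the RESULTS (sequences present in at least 1% of individuals) can be sketched as a simple filter. This is only an illustration and not the authors' pipeline. It assumes that an earlier step, not shown, has already assembled candidate sequences from pooled OEA reads and recorded which individuals carry each one. The function name, the input layout and the toy data are hypothetical, and sample frequency stands in for population frequency.

```python
def common_sequences(carriers_by_seq, n_individuals, min_freq=0.01):
    """Return {sequence ID: carrier frequency} for sequences at or above min_freq.

    carriers_by_seq maps a candidate sequence ID to the IDs of the individuals
    in whom it was detected (hypothetical input layout).
    """
    kept = {}
    for seq_id, carriers in carriers_by_seq.items():
        # Count each individual once, even if listed repeatedly.
        freq = len(set(carriers)) / n_individuals
        if freq >= min_freq:
            kept[seq_id] = freq
    return kept


# Hypothetical toy cohort of 200 individuals.
toy_carriers = {
    "seqA": ["ind%03d" % i for i in range(5)],  # 5 of 200 carriers
    "seqB": ["ind007"],                          # 1 of 200 carriers
}
kept = common_sequences(toy_carriers, n_individuals=200)
```

With these toy inputs only `seqA` passes the 1% threshold. A real analysis would also need the strict filtering for false sequences that the abstract mentions, which is not modelled here.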