ABSTRACT
BACKGROUND: Thirty to forty percent of patients with Diffuse Large B-cell Lymphoma (DLBCL) have an adverse clinical evolution. The increased understanding of DLBCL biology has shed light on the clinical evolution of this pathology, leading to the discovery of prognostic factors based on gene expression data, genomic rearrangements and mutational subgroups. Nevertheless, additional efforts are needed in order to enable survival predictions at the patient level. In this study we investigated new machine learning-based models of survival using transcriptomic and clinical data. METHODS: Gene expression profiling (GEP) data from 2 publicly available retrospective DLBCL cohorts were analyzed. Cox regression and unsupervised clustering were performed on the larger cohort in order to identify probes associated with overall survival. Random forests were created to model survival using combinations of GEP data, cell-of-origin (COO) classification and clinical information. Cross-validation was used to compare model results in the training set, and Harrell's concordance index (c-index) was used to assess each model's predictive performance. Results were validated in an independent test set. RESULTS: Two hundred thirty-three and sixty-four patients were included in the training and test sets, respectively. Initially we derived and validated a 4-gene expression clustering that was independently associated with lower survival in 20% of patients. This pattern included the following genes: TNFRSF9, BIRC3, BCL2L1 and G3BP2. Thereafter, we applied machine-learning models to predict survival. A set of 102 genes was highly predictive of disease outcome, outperforming available clinical information and COO classification. The final best model integrated clinical information, COO classification, 4-gene-based clustering and the expression levels of 50 individual genes (training set c-index, 0.8404; test set c-index, 0.7942). CONCLUSION: Our results indicate that DLBCL survival models based on the application of machine learning algorithms to gene expression and clinical data can largely outperform other important prognostic variables such as disease stage and COO. Head-to-head comparisons with other risk stratification models are needed to assess their usefulness.
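As a reading aid for the c-index values quoted above, the following is a minimal sketch of how Harrell's concordance index can be computed for right-censored survival data. It is not the authors' code: the function name `harrell_c_index`, the toy data, and the simplification of skipping pairs with tied survival times are all assumptions made for illustration.

```python
def harrell_c_index(times, events, risk_scores):
    """Illustrative Harrell's c-index for right-censored data.

    times:       observed follow-up time per patient
    events:      1 if death was observed, 0 if censored
    risk_scores: model output, higher = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when patient i died strictly before time j
            # (tied times are skipped in this simplified sketch).
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
print(harrell_c_index(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the training and test figures above should be read.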
Subject(s)
Biomarkers, Tumor/genetics , Computational Biology/methods , Gene Expression Profiling/methods , Lymphoma, Large B-Cell, Diffuse/mortality , Adaptor Proteins, Signal Transducing/genetics , Baculoviral IAP Repeat-Containing 3 Protein/genetics , Female , Gene Expression Regulation, Neoplastic , Humans , Lymphoma, Large B-Cell, Diffuse/genetics , Male , Microarray Analysis , Middle Aged , Prognosis , RNA-Binding Proteins/genetics , Retrospective Studies , Survival Analysis , Tumor Necrosis Factor Receptor Superfamily, Member 9/genetics , Unsupervised Machine Learning , bcl-X Protein/genetics
ABSTRACT
BACKGROUND: Chronic Lymphocytic Leukemia (CLL) is the most frequent lymphoproliferative disorder in western countries and is characterized by a remarkable clinical heterogeneity. During the last decade, multiple genomic studies have identified a myriad of somatic events driving CLL proliferation and aggressiveness. Nevertheless, and despite the mounting evidence of inherited risk for CLL development, the existence of germline variants associated with clinical outcomes has not been addressed in depth. METHODS: Exome sequencing data from control leukocytes of CLL patients involved in the International Cancer Genome Consortium (ICGC) were used for genotyping. Cox regression was used to detect variants associated with clinical outcomes. Gene- and pathway-level associations were also calculated. RESULTS: Single nucleotide polymorphisms in PPP4R2 and MAP3K4 were associated with earlier treatment need. A gene-level analysis revealed a significant association of RIPK3 with both treatment need and survival. Furthermore, germline variability in pathways such as apoptosis, cell-cycle, pentose phosphate, GNα13 and nitric oxide was associated with overall survival. CONCLUSION: Our results support the existence of inherited determinants of CLL evolution and point towards genes and pathways that may prove useful as biomarkers of disease outcome. More research is needed to validate these findings.
Subject(s)
Biomarkers, Tumor/genetics , Exome Sequencing/methods , Germ-Line Mutation , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Female , GTP-Binding Protein alpha Subunits, G12-G13/genetics , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans , MAP Kinase Kinase Kinase 4/genetics , Male , Phosphoprotein Phosphatases/genetics , Survival Analysis
ABSTRACT
Acute lymphoblastic leukemia (ALL) is the most prevalent cancer in children, and despite considerable progress in treatment outcomes, relapses still pose significant risks of mortality and long-term complications. To address this challenge, we employed a supervised machine learning technique, specifically random survival forests, to predict the risk of relapse and mortality using array-based DNA methylation data from a cohort of 763 pediatric ALL patients treated in Nordic countries. The relapse risk predictor (RRP) was constructed based on 16 CpG sites, demonstrating c-indexes of 0.667 and 0.677 in the training and test sets, respectively. The mortality risk predictor (MRP), comprising 53 CpG sites, exhibited c-indexes of 0.751 and 0.754 in the training and test sets, respectively. To validate the prognostic value of the predictors, we further analyzed two independent cohorts of Canadian (n = 42) and Nordic (n = 384) ALL patients. The external validation confirmed our findings, with the RRP achieving a c-index of 0.667 in the Canadian cohort, and the RRP and MRP achieving c-indexes of 0.529 and 0.621, respectively, in an independent Nordic cohort. The precision of the RRP and MRP models improved when incorporating traditional risk group data, underscoring the potential for synergistic integration of clinical prognostic factors. The MRP model also enabled the definition of a risk group with high rates of relapse and mortality. Our results demonstrate the potential of DNA methylation as a prognostic factor and a tool to refine risk stratification in pediatric ALL. This may lead to personalized treatment strategies based on epigenetic profiling.
Subject(s)
DNA Methylation , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Child , Humans , Canada , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Treatment Outcome , Prognosis , Recurrence
ABSTRACT
[This corrects the article DOI: 10.3389/fonc.2022.968340.].
ABSTRACT
Diffuse Large B-cell Lymphoma (DLBCL) is the most common type of aggressive lymphoma. Approximately 60% of fit patients are cured with immunochemotherapy, but the remaining patients relapse or have refractory disease, which predicts a short survival. Traditionally, risk stratification in DLBCL has been based on scores that combine clinical variables. Other methodologies have been developed based on the identification of novel molecular features, such as mutational profiles and gene expression signatures. Recently, we developed the LymForest-25 profile, which provides a personalized survival risk prediction based on the integration of transcriptomic and clinical features using an artificial intelligence system. In the present report, we studied the relationship between the molecular variables included in LymForest-25 in the context of the data released by the REMoDL-B trial, which evaluated the addition of bortezomib to the standard treatment (R-CHOP) in the upfront setting of DLBCL. For this, we retrained the machine learning model of survival on the group of patients treated with R-CHOP (N=469) and then made survival predictions for those patients treated with bortezomib plus R-CHOP (N=459). According to these results, the RB-CHOP scheme achieved a 30% reduction in the risk of progression or death for the 50% of DLBCL patients at higher molecular risk (p-value 0.03), potentially expanding the effectiveness of this treatment to a wider patient population as compared with other previously defined risk groups.
ABSTRACT
A growing need to evaluate risk-adapted treatments in multiple myeloma (MM) exists. Several clinical and molecular scores have been developed in the last decades, which individually explain some of the variability in the heterogeneous clinical behavior of this neoplasm. Recently, we presented Iacobus-50 (IAC-50), which is a machine learning-based survival model based on clinical, biochemical, and genomic data capable of risk-stratifying newly diagnosed MM patients and predicting the optimal upfront treatment scheme. In the present study, we evaluated the prognostic value of the IAC-50 gene expression signature in an external cohort composed of patients from the Total Therapy trials 3, 4, and 5. The prognostic value of IAC-50 was validated, and additionally we observed a better performance in terms of progression-free survival and overall survival prediction compared with the UAMS70 gene expression signature. The combination of the IAC-50 gene expression signature with traditional prognostic variables (International Staging System [ISS] score, baseline B2-microglobulin, and age) improved the performance well above the predictability of the ISS score. IAC-50 emerges as a powerful risk stratification model which might be considered for risk stratification in newly diagnosed myeloma patients, in the context of clinical trials but also in real life.
ABSTRACT
Risk stratification in acute myeloid leukemia (AML) has been extensively improved thanks to the incorporation of recurrent cytogenomic alterations into risk stratification guidelines. However, mortality rates among fit patients assigned to low or intermediate risk groups are still high. Therefore, significant room exists for the improvement of AML prognostication. In a previous work, we presented the Stellae-123 gene expression signature, which achieved a high accuracy in the prognostication of adult patients with AML. Stellae-123 was particularly accurate in restratifying patients bearing high-risk mutations, such as ASXL1, RUNX1 and TP53. The aim of the present work was to evaluate the prognostic performance of Stellae-123 in external cohorts using RNAseq technology. For this, we evaluated the signature in 3 different AML cohorts (2 adult and 1 pediatric). Our results indicate that the prognostic performance of the Stellae-123 signature is reproducible in the 3 cohorts of patients. Additionally, we showed that the signature was superior to the European LeukemiaNet 2017 and the pediatric clinical risk scores in the prediction of survival at most of the evaluated time points. Furthermore, integration with age substantially enhanced the accuracy of the model. In conclusion, Stellae-123 is a reproducible machine learning algorithm based on a gene expression signature with promising utility in the field of AML.
ABSTRACT
Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma. Despite notable therapeutic advances in the last decades, 30%-40% of affected patients develop relapsed or refractory disease that frequently portends a dismal outcome. With the advent of new therapeutic options, it becomes necessary to predict responses to the standard treatment based on rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP). In a recent communication, we presented a new machine learning model (LymForest-25) that was based on 25 clinical, biochemical, and gene expression variables. LymForest-25 achieved high survival prediction accuracy in patients with DLBCL treated with upfront immunochemotherapy. In this study, we aimed to evaluate the performance of the different features that compose LymForest-25 in a new UK-based cohort, which contained 481 patients treated with upfront R-CHOP for whom clinical, biochemical and gene expression information for 17 out of 19 transcripts was available. Additionally, we explored potential improvements based on the integration of other gene expression signatures and mutational clusters. The validity of the LymForest-25 gene expression signature was confirmed, and it achieved a substantially greater precision in the estimation of mortality at 6 months and 1, 2, and 5 years compared with the cell-of-origin (COO) plus molecular high-grade (MHG) classification. Indeed, this signature was predictive of survival within the MHG and all COO subgroups, with a particularly high accuracy in the "unclassified" group. Integration of this signature with the International Prognostic Index (IPI) score provided the best survival predictions. However, the increased performance of molecular models with the IPI score was almost exclusively restricted to younger patients (<70 y). Finally, we observed a tendency towards an improved performance by combining LymForest-25 with the LymphGen mutation-based classification. In summary, we have validated the predictive capacity of LymForest-25 and expanded the potential for improvement with mutation-based prognostic classifications.
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pone.0247093.].
ABSTRACT
The International Staging System (ISS) and the Revised International Staging System (R-ISS) are commonly used prognostic scores in multiple myeloma (MM). These methods have significant gaps, particularly among intermediate-risk groups. The aim of this study was to improve risk stratification in newly diagnosed MM patients using data from three different trials developed by the Spanish Myeloma Group. For this, we applied an unsupervised machine learning clustering technique on a set of clinical, biochemical and cytogenetic variables, and we identified two novel clusters of patients with significantly different survival. The prognostic precision of this clustering was superior to that of the ISS and R-ISS scores, and appeared to be particularly useful to improve risk stratification among R-ISS 2 patients. Additionally, patients assigned to the low-risk cluster in the GEM05 over 65 years trial had a significant survival benefit when treated with VMP as compared with VTD. In conclusion, we describe a simple prognostic model for newly diagnosed MM whose predictions are independent of the ISS and R-ISS scores. Notably, the model is particularly useful for reclassifying R-ISS score 2 patients into 2 different prognostic subgroups. The combination of ISS, R-ISS and unsupervised machine learning clustering offers a promising approach to improve MM risk stratification.
Subject(s)
Multiple Myeloma , Humans , Multiple Myeloma/diagnosis , Multiple Myeloma/drug therapy , Multiple Myeloma/epidemiology , Neoplasm Staging , Prognosis , Risk Assessment , Unsupervised Machine Learning
ABSTRACT
Multiple myeloma (MM) remains mostly an incurable disease with a heterogeneous clinical evolution. Despite the availability of several prognostic scores, substantial room for improvement still exists. Promising results have been obtained by integrating clinical and biochemical data with gene expression profiling (GEP). In this report, we applied machine learning algorithms to MM clinical and RNAseq data collected by the CoMMpass consortium. We created a 50-variable random forests model (IAC-50) that could predict overall survival with high concordance in both the training and validation sets (c-indexes, 0.818 and 0.780, respectively). This model included the following covariates: patient age, ISS stage, serum B2-microglobulin, first-line treatment, and the expression of 46 genes. Survival predictions for each patient considering the first line of treatment showed that those individuals treated with the best-predicted drug combination were significantly less likely to die than patients treated with other schemes. This was particularly important among patients treated with a triplet combination including bortezomib, an immunomodulatory drug (IMiD), and dexamethasone. Finally, the model showed a trend to retain its predictive value in patients with high-risk cytogenetics. In conclusion, we report a predictive model for MM survival based on the integration of clinical, biochemical, and gene expression data with machine learning tools.
Subject(s)
Algorithms , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , Machine Learning , Multiple Myeloma/mortality , Cohort Studies , Female , Follow-Up Studies , Gene Expression Profiling , Humans , Male , Middle Aged , Multiple Myeloma/genetics , Multiple Myeloma/pathology , Prognosis , Survival Rate
ABSTRACT
Acute Myeloid Leukemia (AML) is a heterogeneous neoplasm characterized by cytogenetic and molecular alterations that drive patient prognosis. Currently established risk stratification guidelines show a moderate predictive accuracy, and newer tools that integrate multiple molecular variables have proven to provide better results. In this report, we aimed to create a new machine learning model of AML survival using gene expression data. We used gene expression data from two publicly available cohorts in order to create and validate a random forest predictor of survival, which we named ST-123. The most important variables in the model were age and the expression of KDM5B and LAPTM4B, two genes previously associated with the biology and prognostication of myeloid neoplasms. This classifier achieved high concordance indexes in the training and validation sets (0.7228 and 0.6988, respectively), and predictions were particularly accurate in patients at the highest risk of death. Additionally, ST-123 provided significant prognostic improvements in patients with high-risk mutations. Our results indicate that survival of patients with AML can be predicted to a great extent by applying machine learning tools to transcriptomic data, and that such predictions are particularly precise among patients with high-risk mutations.
ABSTRACT
There is growing evidence indicating the implication of germline variation in cancer predisposition and prognostication. Here, we describe an analysis of likely disruptive rare variants across the genomes of 726 patients with B-cell lymphoid neoplasms. We discovered a significant enrichment for two genes in rare dysfunctional variants, both of which participate in the regulation of oxidative stress pathways (CHMP6 and GSTA4). Additionally, we detected 1675 likely disrupting variants in genes associated with cancer, of which 44.75% were novel events and 7.88% were protein-truncating variants. Among these, the most frequently affected genes were ATM, BIRC6, CLTCL1A, and TSC2. Homozygous or germline double-hit variants were detected in 28 cases, and coexisting somatic events were observed in 17 patients, some of which affected key lymphoma drivers such as ATM, KMT2D, and MYC. Finally, we observed that variants in six different genes were independently associated with shorter survival in CLL. Our study results support an important role for rare germline variation in the pathogenesis and prognosis of B-cell lymphoid neoplasms.
ABSTRACT
B-cell lymphoproliferative disorders exhibit a diverse spectrum of diagnostic entities with heterogeneous behaviour. Multiple efforts have focused on the determination of the genomic drivers of B-cell lymphoma subtypes. In the meantime, the aggregation of diverse tumors in pan-cancer genomic studies has become a useful tool to detect new driver genes, while enabling the comparison of mutational patterns across tumors. Here we present an integrated analysis of 354 B-cell lymphoid disorders. 112 recurrently mutated genes were discovered, of which KMT2D, CREBBP, IGLL5 and BCL2 were the most frequent, and 31 genes were putative new drivers. Mutations in CREBBP, TNFRSF14 and KMT2D predominated in follicular lymphoma, whereas those in BTG2, HTA-A and PIM1 were more frequent in diffuse large B-cell lymphoma. Additionally, we discovered 31 significantly mutated protein networks, reinforcing the role of genes such as CREBBP, EEF1A1, STAT6, GNA13 and TP53, but also pointing towards a myriad of infrequent players in lymphomagenesis. Finally, we report aberrant expression of oncogenes and tumor suppressors associated with novel noncoding mutations (DTX1 and S1PR2), and new recurrent copy number aberrations affecting immune check-point regulators (CD83, PVR) and B-cell specific genes (TNFRSF13C). Our analysis expands the number of mutational drivers of B-cell lymphoid neoplasms, and identifies several differential somatic events between disease subtypes.
Subject(s)
Genome, Human , Leukemia, B-Cell/genetics , Lymphoma, B-Cell/genetics , Mutation , CREB-Binding Protein/genetics , DNA-Binding Proteins/genetics , GTP-Binding Protein alpha Subunits, G12-G13/genetics , Gene Regulatory Networks , Humans , Neoplasm Proteins/genetics , Proto-Oncogene Proteins c-bcl-2/genetics , Receptors, Tumor Necrosis Factor, Member 14/genetics , STAT6 Transcription Factor/genetics , Tumor Suppressor Protein p53/genetics
ABSTRACT
Follicular Lymphoma (FL) has a 10-year mortality rate of 20%, and this is mostly related to lymphoma progression and transformation to higher grades. In the era of personalized medicine it has become increasingly important to provide patients with an optimal prediction about their expected outcomes. The objective of this work was to apply machine learning (ML) tools on gene expression data in order to create individualized predictions about survival in patients with FL. Using data from two different studies, we were able to create a model which achieved good prediction accuracies in both cohorts (c-indexes of 0.793 and 0.662 in the training and test sets, respectively). Integration of this model with m7-FLIPI and age rendered high prediction accuracies in the test set (Cox c-index 0.79), and a simplified approach identified 4 groups with remarkably different outcomes in terms of survival. Importantly, one of the groups comprised 27.35% of patients and had a median survival of 4.64 years. In summary, we have created a gene expression-based individualized predictor of overall survival in FL that can improve the predictions of the m7-FLIPI score.
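A median survival such as the one quoted above is commonly read off a Kaplan-Meier curve as the first time at which estimated survival drops to 50% or below. The abstract does not state how its figure was obtained, so the sketch below is only an assumed illustration of that general technique; the function names and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death occurred.

    times:  follow-up time per patient
    events: 1 if death observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(times, events):
    """First time at which the Kaplan-Meier estimate is <= 0.5, else None."""
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical toy group: five patients, two of them censored.
print(median_survival([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]))
```

When fewer than half of the patients in a group have died by the end of follow-up, the median is not reached and the sketch returns `None`.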
ABSTRACT
BACKGROUND: FLT3 mutation is present in 25-30% of all acute myeloid leukemias (AML), and it is associated with adverse outcome. FLT3 inhibitors have shown improved survival results in AML both as upfront treatment and in relapsed/refractory disease. Curiously, a variable proportion of wild-type FLT3 patients also responded to these drugs. METHODS: We analyzed 6 different transcriptomic datasets of AML cases. Differential expression between mutated and wild-type FLT3 AMLs was assessed with the Wilcoxon rank-sum test. Hierarchical clustering was used to identify FLT3 mutation-like AMLs. Finally, enrichment in recurrent mutations was assessed with Fisher's exact test. RESULTS: A FLT3 mutation-like gene expression pattern was identified among wild-type FLT3 AMLs. This pattern was highly enriched in NPM1 and DNMT3A mutants, and particularly in combined NPM1/DNMT3A mutants. CONCLUSIONS: We identified a FLT3 mutation-like gene expression pattern in AML which was highly enriched in NPM1 and DNMT3A mutations. Future analysis of the predictive role of this biomarker among wild-type FLT3 patients treated with FLT3 inhibitors is envisaged.
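To illustrate the kind of enrichment test mentioned in the methods above, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table (for example, mutated versus wild-type cases inside and outside an expression cluster). The function name, the choice of a one-sided alternative and the toy counts are assumptions for illustration; the abstract does not specify these details.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` cases in the top-left cell by chance alone.
    """
    row1 = a + b          # e.g. cases inside the cluster
    col1 = a + c          # e.g. cases carrying the mutation
    n = a + b + c + d     # all cases
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 4 cluster cases mutated vs 1 of 4 outside it.
print(fisher_exact_greater(3, 1, 1, 3))
```

A small p-value would indicate that the mutation is over-represented in the cluster; with counts as small as these toy ones, the test has little power.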
Subject(s)
Leukemia, Myeloid, Acute/genetics , Leukemia/genetics , Mutation/genetics , fms-Like Tyrosine Kinase 3/genetics , Biomarkers/metabolism , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA Methyltransferase 3A , Gene Expression Profiling/methods , Humans , Nuclear Proteins/genetics , Nucleophosmin , Staurosporine/analogs & derivatives , Staurosporine/pharmacology , fms-Like Tyrosine Kinase 3/antagonists & inhibitors
ABSTRACT
Mutations in non-coding DNA regions are increasingly recognized as cancer drivers. These mutations can modify gene expression in cis or by inducing high-order chromatin structure modifications with long-range effects. Previous analyses reported the detection of recurrent and functional non-coding DNA mutations in the chronic lymphocytic leukemia (CLL) genome, such as those in the 3' untranslated region of NOTCH1 and in the PAX5 super-enhancer. In this report, we used whole genome sequencing data produced by the International Cancer Genome Consortium in order to analyze regions with previously reported regulatory activity. This approach enabled the identification of numerous recurrently mutated regions that were frequently positioned in the proximity of genes involved in immune and oncogenic pathways. By correlating these mutations with expression of their nearest genes, we detected significant transcriptional changes in genes such as PHF2 and S1PR2. More research is needed to clarify the function of these mutations in CLL, particularly those found in intergenic regions.
Subject(s)
Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Mutation , Regulatory Sequences, Nucleic Acid , 3' Untranslated Regions , DNA Mutational Analysis , DNA, Intergenic/genetics , Homeodomain Proteins/genetics , Humans , PAX5 Transcription Factor/genetics , Receptor, Notch1/genetics , Sphingosine-1-Phosphate Receptors/genetics , Whole Genome Sequencing
ABSTRACT
Mutations in the FMS-like tyrosine kinase 3 (FLT3) gene arise in 25-30% of all acute myeloid leukemia (AML) patients. These mutations lead to constitutive activation of the protein product and are divided into two broad types: internal tandem duplication (ITD) of the juxtamembrane domain (25% of cases) and point mutations in the tyrosine kinase domain (TKD). Patients with FLT3 ITD mutations have a high relapse risk and inferior cure rates, whereas the role of FLT3 TKD mutations still remains to be clarified. Additionally, growing research indicates that FLT3 status evolves through a disease continuum (clonal evolution), where AML cases can acquire FLT3 mutations at relapse that were not present at the time of diagnosis. Several FLT3 inhibitors have been tested in patients with FLT3-mutated AML. These drugs exhibit different kinase inhibitory profiles, pharmacokinetics and adverse events. First-generation multi-kinase inhibitors (sorafenib, midostaurin, lestaurtinib) are characterized by a broad spectrum of drug targets, whereas second-generation inhibitors (quizartinib, crenolanib, gilteritinib) show more potent and specific FLT3 inhibition, and are thereby accompanied by fewer toxic effects. Nevertheless, all FLT3 inhibitors face primary and acquired mechanisms of resistance, and therefore their combination with other drugs (standard chemotherapy, hypomethylating agents, checkpoint inhibitors) and their application in different clinical settings (upfront therapy, maintenance, relapsed or refractory disease) are under study in a myriad of clinical trials. This review focuses on the role of FLT3 mutations in AML, pharmacological features of FLT3 inhibitors, known mechanisms of drug resistance and accumulated evidence for the use of FLT3 inhibitors in different clinical settings.
Subject(s)
Antineoplastic Agents/pharmacology , Leukemia, Myeloid, Acute/drug therapy , Protein Kinase Inhibitors/pharmacology , Sorafenib/pharmacology , fms-Like Tyrosine Kinase 3/antagonists & inhibitors , fms-Like Tyrosine Kinase 3/genetics , Aniline Compounds/pharmacology , Benzimidazoles/pharmacology , Benzothiazoles/pharmacology , Carbazoles/pharmacology , Drug Resistance, Multiple , Drug Resistance, Neoplasm , Forecasting , Furans , Hematopoietic Stem Cell Transplantation , Humans , Imidazoles/pharmacology , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/therapy , Maintenance Chemotherapy/methods , Mutation , Phenylurea Compounds/pharmacology , Piperidines/pharmacology , Point Mutation , Pyrazines/pharmacology , Pyridazines/pharmacology , Recurrence , Staurosporine/analogs & derivatives , Staurosporine/pharmacology
ABSTRACT
Chronic lymphocytic leukemia (CLL) is a lymphoproliferative disorder characterized by its heterogeneous clinical evolution. Despite the discovery of the most frequent cytogenomic drivers of disease during the last decade, new efforts are needed in order to improve prognostication. In this study, we used gene expression data of CLL samples in order to discover novel transcriptomic patterns associated with patient survival. We observed that a 3-gene expression signature composed of SCGB2A1, KLF4, and PPP1R14B differentiates a group of approximately 5% of cases with short survival. This effect was independent of the main cytogenetic markers of adverse prognosis. Finally, this finding was reproduced in an independent retrospective cohort. We believe that this small gene expression pattern will be useful for CLL prognostication, and its association with CLL response to novel drugs should be explored in the future.
ABSTRACT
Chronic lymphocytic leukemia (CLL) is the most frequent lymphoproliferative syndrome in Western countries, and it is characterized by recurrent large genomic rearrangements. During the last decades, array techniques have expanded our knowledge about CLL's karyotypic aberrations. The advent of large sequencing databases expanded our knowledge of cancer genomics to an unprecedented resolution and enabled the detection of small-scale structural aberrations in the cancer genome. In this study, we have performed exome-sequencing-based copy number aberration (CNA) and loss of heterozygosity (LOH) analysis in order to detect new recurrent structural aberrations. We describe 54 recurrent focal CNAs enriched in cancer-related pathways, and their association with gene expression and clinical evolution. Furthermore, we discovered recurrent large copy number neutral LOH events affecting key driver genes, and we recapitulate most of the large CNAs that characterize the CLL genome. These results provide "proof-of-concept" evidence supporting the existence of new genes involved in the pathogenesis of CLL.