ABSTRACT
Great efforts have been made to develop precision medicine-based treatments using machine learning. In this field, where the goal is to provide the optimal treatment for each patient based on his or her medical history and genomic characteristics, it is not sufficient to make excellent predictions. The challenge is to understand and trust the model's decisions while also being able to implement it easily. However, one of the issues with machine learning algorithms, particularly deep learning, is their lack of interpretability. This review compares six different machine learning methods to provide guidance for defining interpretability by focusing on accuracy, multi-omics capability, explainability and implementability. Our selection of algorithms includes tree-, regression- and kernel-based methods, which we selected for their ease of interpretation for the clinician. We also included two novel explainable methods in the comparison. No significant differences in accuracy were observed between the methods, but accuracy improved when gene expression was used as input instead of mutational status. We concentrated on the current challenge: model comprehension and ease of use. Our comparison suggests that the tree-based methods are the most interpretable of those tested.
Subjects
Medical Oncology , Neoplasms , Female , Humans , Male , Neoplasms/genetics , Algorithms , Genomics , Machine Learning
ABSTRACT
Artificial intelligence (AI) can unveil novel personalized treatments based on drug screening and whole-exome sequencing (WES) experiments. However, the "black box" nature of AI limits the potential of this approach to be translated into clinical practice. In contrast, explainable AI (XAI) focuses on making AI results understandable to humans. Here, we present a novel XAI method, called multi-dimensional module optimization (MOM), that associates drug screening with genetic events while guaranteeing that predictions are interpretable and robust. We applied MOM to an acute myeloid leukemia (AML) cohort of 319 ex vivo tumor samples with 122 screened drugs and WES. MOM returned a therapeutic strategy based on the FLT3, CBFβ-MYH11, and NRAS status, which predicted AML patient response to Quizartinib, Trametinib, Selumetinib, and Crizotinib. We successfully validated the results in three different large-scale screening experiments. We believe that XAI will help healthcare providers and drug regulators better understand AI medical decisions.
Subjects
Artificial Intelligence , Acute Myeloid Leukemia , Crizotinib/therapeutic use , Humans , Acute Myeloid Leukemia/drug therapy , Acute Myeloid Leukemia/genetics , Acute Myeloid Leukemia/pathology , Precision Medicine/methods
ABSTRACT
Alternative splicing (AS) plays a key role in cancer: all its hallmarks have been associated with different mechanisms of abnormal AS. The improvement of the human transcriptome annotation and the availability of fast and accurate software to estimate isoform concentrations have boosted the analysis of transcriptome profiling from RNA-seq. The statistical analysis of AS is a challenging problem that is not yet fully solved. We have included in EventPointer (EP), a Bioconductor package, a novel statistical method that can use the bootstrap of the pseudoaligners. We compared it with other state-of-the-art algorithms to analyze AS. Its performance is outstanding for shallow sequencing conditions. The statistical framework is very flexible since it is based on design and contrast matrices. EP now includes a convenient tool to find the primers to validate the discoveries using PCR. We also added a statistical module to study alterations in protein domains related to AS. Applying it to 9514 patients from TCGA and TARGET in 19 different tumor types resulted in two conclusions: (i) aberrant alternative splicing alters the relative presence of protein domains and (ii) the number of enriched domains is strongly correlated with the age of the patients.
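To illustrate what a framework "based on design and contrast matrices" means in general, the following is a minimal sketch in plain Python, not EventPointer's implementation or API. It fits an ordinary least-squares model to the inclusion levels of a single splicing event and applies a contrast vector to the fitted coefficients. The function names (`fit_ols`, `contrast_estimate`), the two-group layout and the numeric values are all hypothetical and chosen only for illustration.

```python
def fit_ols(design, y):
    """Least-squares coefficients for y ~ design, via the normal equations."""
    p = len(design[0])
    # Augmented system [X^T X | X^T y].
    aug = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           + [sum(row[i] * yi for row, yi in zip(design, y))]
           for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] for i in range(p)]


def contrast_estimate(coefs, contrast):
    """Linear combination of coefficients selected by the contrast vector."""
    return sum(c * b for c, b in zip(contrast, coefs))


# Hypothetical inclusion levels for one event: 3 control, then 3 treated samples.
psi = [0.20, 0.25, 0.22, 0.60, 0.55, 0.62]
# Design matrix: column 0 is the intercept, column 1 flags treated samples.
design = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
coefs = fit_ols(design, psi)
# Contrast picking out the treated-vs-control effect.
contrast = [0, 1]
delta_psi = contrast_estimate(coefs, contrast)
print(f"estimated change in inclusion: {delta_psi:.3f}")
```

The flexibility comes from the fact that changing the experimental question (for example, adding a batch column or comparing two treatments) only requires a different design matrix or contrast vector, not a different statistical procedure.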
ABSTRACT
Recent functional genomic screens, such as CRISPR-Cas9 or RNAi screening, have fostered a new wave of targeted treatments based on the concept of synthetic lethality. These approaches identified LEthal Dependencies (LEDs) by estimating the effect of genetic events on cell viability. The multiple-hypothesis testing problem that arises from the large number of gene knockouts limits the statistical power of these studies. Here, we show that predictions of LEDs from functional screens can be dramatically improved by incorporating the "HUb effect in Genetic Essentiality" (HUGE) of gene alterations. We analyze three recent genome-wide loss-of-function screens (Project Score, CERES score and DEMETER score), identifying LEDs with 75 times greater statistical power than state-of-the-art methods. Using acute myeloid leukemia, breast cancer, lung adenocarcinoma and colon adenocarcinoma as disease models, we validate that our predictions are enriched in a recent harmonized knowledge base of clinical interpretations of somatic genomic variants in cancer (AUROC > 0.87). Our approach is effective even in tumors with large genetic heterogeneity such as acute myeloid leukemia, where we identified LEDs not recalled by previous pipelines, including FLT3-mutant genotypes sensitive to FLT3 inhibitors. Interestingly, in vitro validations confirm lethal dependencies of either NRAS or PTPN11 depending on the NRAS mutational status. HUGE will hopefully help discover novel genetic dependencies amenable to precision-targeted therapies in cancer. All the graphs showing lethal dependencies for the 19 tumor types analyzed can be visualized in an interactive tool.
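The multiple-hypothesis problem mentioned in this abstract can be made concrete with a short sketch. The code below is a generic Benjamini-Hochberg false discovery rate procedure in plain Python; it is not the HUGE method, whose details the abstract does not give. The p-values and the numbers of tests are hypothetical, chosen to show how the same three strong associations survive correction among 20 tests but are lost among 18,000.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return booleans marking the hypotheses rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for three true gene-dependency associations.
signal = [0.0004, 0.001, 0.003]
# The same associations tested among 20 hypotheses versus 18,000 hypotheses,
# with uninformative p-values (0.5) for everything else.
focused = signal + [0.5] * 17
genome_wide = signal + [0.5] * 17997

n_focused = sum(benjamini_hochberg(focused))
n_genome_wide = sum(benjamini_hochberg(genome_wide))
print(n_focused, n_genome_wide)
```

Under these assumed values, the associations are detected only when the number of tested hypotheses is small, which is why reducing or prioritizing the hypothesis space can raise statistical power.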
ABSTRACT
The development of predictive biomarkers of response to targeted therapies is an unmet clinical need for many antitumoral agents. Recent genome-wide loss-of-function screens, such as RNA interference (RNAi) and CRISPR-Cas9 libraries, are an unprecedented resource to identify novel drug targets, reposition drugs and associate predictive biomarkers in the context of precision oncology. In this work, we have developed and validated a large-scale bioinformatics tool named DrugSniper, which exploits loss-of-function experiments to model the sensitivity of 6237 inhibitors and predict their corresponding biomarkers of sensitivity in 30 tumor types. Applying DrugSniper to small cell lung cancer (SCLC), we identified targets already extensively explored in SCLC, such as Aurora kinases or epigenetic agents. Interestingly, the analysis suggested a remarkable vulnerability to polo-like kinase 1 (PLK1) inhibition in CREBBP-mutant SCLC cells. We validated this association in vitro using four mutated and four wild-type SCLC cell lines and two PLK1 inhibitors (Volasertib and BI2536), confirming that the effect of PLK1 inhibitors depended on the mutational status of CREBBP. In addition, DrugSniper was validated in silico with several known clinically used treatments, including the sensitivity of FLT3- and BRAF-mutant cells to Tyrosine Kinase Inhibitors (TKIs) and Vemurafenib, respectively. These findings show the potential of genome-wide loss-of-function screens to identify new personalized therapeutic hypotheses in SCLC and potentially in other tumors, which is a valuable starting point for further drug development and drug repositioning projects.
ABSTRACT
Medulloblastoma is the most common malignant pediatric brain tumor. It originates from dysregulation of cerebellar development, due to an excessive proliferation of cerebellar granule neuron precursor cells (CGNPs). The underlying molecular mechanisms, except for the role of SHH and WNT pathways, remain largely unknown. ERBB4 is a tyrosine kinase receptor whose activity in cancer is tissue dependent. In this study, we characterized the role of ERBB4 during cerebellum development and medulloblastoma progression, paying particular attention to its role in CGNPs and medulloblastoma stem cells (MBSCs). Our results show that ERBB4 is expressed in the CGNPs during cerebellum development, where it plays a critical role in migration, apoptosis and differentiation. Similarly, it is enriched in the population of MBSCs, where it also controls those critical processes, as well as self-renewal and tumor initiation for medulloblastoma progression. These results translate to clinical samples, where high levels of ERBB4 correlate with poor outcome in Group 4 and in all medulloblastoma groups. Transcriptomic analysis identified critical processes and pathways altered in cells with knock-down of ERBB4. These results highlight the impact and underlying mechanisms of ERBB4 in critical processes during cerebellum development and medulloblastoma.
ABSTRACT
BACKGROUND: Splicing is a genetic process that has important implications in several diseases including cancer. Deciphering the complex rules of splicing regulation is crucial to understand and treat splicing-related diseases. Splicing factors and other RNA-binding proteins (RBPs) play a key role in the regulation of splicing. The specific binding sites of an RBP can be measured using CLIP experiments. However, to unveil which RBPs regulate a condition, it is necessary to have a priori hypotheses, as a single CLIP experiment targets a single protein. RESULTS: In this work, we present a novel methodology to predict context-specific splicing factors from transcriptomic data. For this, we systematically collect, integrate and analyze more than 900 CLIP experiments stored in four CLIP databases: POSTAR2, CLIPdb, DoRiNA and StarBase. The analysis of these experiments shows strong coherence between the binding sites of RBPs of similar families. Augmenting this information with expression changes, we are able to correctly predict the splicing factors that regulate splicing in two gold-standard experiments in which specific splicing factors are knocked down. CONCLUSIONS: The methodology presented in this study allows the prediction of active splicing factors in either cancer or any other condition using only transcript expression information. This approach opens a wide range of possible studies to understand the splicing regulation of different conditions. A tutorial with the source code and databases is available at https://gitlab.com/fcarazo.m/sfprediction .
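One common way to combine binding-site information with expression changes is an over-representation test, sketched below. This is an assumption made for illustration: the abstract does not state which statistic the authors use, and the tutorial linked above is the authoritative description. The function name `hypergeom_enrichment_pvalue` and all counts are hypothetical. The sketch asks whether splicing events bound by a given RBP are over-represented among events that change between conditions.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_events, n_bound, n_changed, n_overlap):
    """Upper-tail hypergeometric p-value.

    Probability of seeing at least n_overlap RBP-bound events among
    n_changed events drawn at random from n_events, of which n_bound
    carry a binding site for the RBP.
    """
    max_overlap = min(n_bound, n_changed)
    total = comb(n_events, n_changed)
    tail = sum(comb(n_bound, k) * comb(n_events - n_bound, n_changed - k)
               for k in range(n_overlap, max_overlap + 1))
    return tail / total


# Hypothetical counts: 1000 splicing events, 100 of them bound by the RBP,
# 50 events change between conditions.
p_enriched = hypergeom_enrichment_pvalue(1000, 100, 50, 20)  # 20 bound and changed
p_expected = hypergeom_enrichment_pvalue(1000, 100, 50, 5)   # 5 is the chance level
print(p_enriched, p_expected)
```

Running such a test once per RBP and ranking by p-value (with multiple-testing correction) would give a candidate list of regulators for a condition without requiring an a priori hypothesis about any single protein.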