ABSTRACT
Triple-negative breast cancer (TNBC) is a rare cancer characterized by high metastatic potential, poor prognosis, and limited treatment options. The current standard of care in nonmetastatic settings is neoadjuvant chemotherapy (NACT), but treatment efficacy varies substantially across patients. This heterogeneity is still poorly understood, partly because curated TNBC data are scarce. Here we investigate the use of machine learning (ML) on whole-slide images and clinical information to predict, at diagnosis, the histological response to NACT in women with early TNBC. To overcome the biases of small-scale studies while respecting data privacy, we conducted a multicentric TNBC study using federated learning, in which patient data remain secured behind hospitals' firewalls. We show that local ML models relying on whole-slide images can predict response to NACT, and that collaborative training of ML models further improves performance, reaching parity with the best current approaches, in which ML models are trained using time-consuming expert annotations. Our ML model is interpretable and is sensitive to specific histological patterns. This proof-of-concept study, in which federated learning is applied to real-world datasets, paves the way for future biomarker discovery using unprecedentedly large datasets.
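The collaborative training described above can be illustrated with a minimal sketch of federated averaging, in which each hospital trains locally and shares only model parameters, never patient data. The abstract does not say which aggregation rule the study used, so federated averaging is an assumption here, and the function name and the toy parameter vectors are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-client parameter vectors, weighting each by its sample count.

    Assumption: plain federated averaging; the study's actual aggregation
    strategy is not given in the abstract.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all parameter vectors must have the same length")
    # Weighted mean, computed coordinate by coordinate.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical hospitals: only these parameter vectors leave each site.
hospital_a = [1.0, 2.0]  # trained on 1 local sample (toy value)
hospital_b = [3.0, 4.0]  # trained on 3 local samples (toy value)
global_weights = federated_average([hospital_a, hospital_b], [1, 3])
```

Weighting by local sample count keeps a small center from pulling the shared model as strongly as a large one; other aggregation rules are possible.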
Subject(s)
Neoadjuvant Therapy; Triple Negative Breast Neoplasms; Humans; Female; Neoadjuvant Therapy/methods; Triple Negative Breast Neoplasms/drug therapy; Triple Negative Breast Neoplasms/pathology; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Treatment Outcome
ABSTRACT
The use of monoclonal antibodies (mAbs) constitutes one of the most important strategies to treat patients suffering from cancers such as hematological malignancies and solid tumors. These antibodies are prescribed by the physician and prepared by hospital pharmacists. An analytical control ensures the quality of the preparations. The aim of this study was to explore the development of a rapid analytical method for quality control. The method used four mAbs (Infliximab, Bevacizumab, Rituximab and Ramucirumab) at various concentrations and was based on recording Raman data and coupling them to traditional chemometric and machine learning approaches for data analysis. Compared with a conventional linear approach, prediction errors are reduced with a data-driven approach using statistical machine learning methods, in which preprocessing and predictive models are jointly optimized. An additional original aspect of the work involved submitting the problem to a collaborative data challenge platform called Rapid Analytics and Model Prototyping (RAMP). This made it possible to draw on solutions from about 300 data scientists working collaboratively. Using machine learning, prediction for the four mAb samples was considerably improved. The best predictive model showed a combined error of 2.4% versus 14.6% using the linear approach. The concentration and classification errors were 5.8% and 0.7%, respectively; only three spectra were misclassified out of the 429 spectra of the test set. This large improvement obtained with machine learning techniques was observed for all molecules but was maximal for Bevacizumab, with an 88.3% reduction in combined error (2.1% versus 17.9%).
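The reported figures can be checked with a short arithmetic sketch. The relative-reduction formula is standard. The weighting of the combined error is an assumption: the abstract does not state it, but weights of 2/3 on the classification error and 1/3 on the concentration error reproduce the reported 2.4%. Function names are hypothetical.

```python
def relative_reduction(baseline_pct: float, improved_pct: float) -> float:
    """Percentage by which the error drops when going from baseline to improved."""
    if baseline_pct <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_pct - improved_pct) / baseline_pct * 100.0


def combined_error(
    classification_pct: float,
    concentration_pct: float,
    w_classification: float = 2.0 / 3.0,  # assumed weight, not stated in the abstract
) -> float:
    """Weighted mean of the two error rates (the weights are an assumption)."""
    return (
        w_classification * classification_pct
        + (1.0 - w_classification) * concentration_pct
    )


# Bevacizumab: combined error of 17.9% (linear) versus 2.1% (machine learning).
bevacizumab_gain = round(relative_reduction(17.9, 2.1), 1)

# Best model: classification error 0.7%, concentration error 5.8%.
best_model_combined = round(combined_error(0.7, 5.8), 1)
```

Under these assumptions, `bevacizumab_gain` evaluates to 88.3 and `best_model_combined` to 2.4, matching the values quoted in the abstract.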
Subject(s)
Antibodies, Monoclonal/analysis; Machine Learning; Humans; Regression Analysis; Spectrum Analysis, Raman
ABSTRACT
A method is provided for designing and training noise-driven recurrent neural networks as models of stochastic processes. The method unifies and generalizes two known separate modeling approaches, Echo State Networks (ESN) and Linear Inverse Modeling (LIM), under the common principle of relative entropy minimization. The power of the new method is demonstrated on a stochastic approximation of the El Niño phenomenon studied in climate research.
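For orientation, one of the two approaches named above, Linear Inverse Modeling, can be sketched in its simplest scalar form: the propagator over a lag is estimated as the ratio of lagged to zero-lag covariance, and its logarithm divided by the lag gives the linear rate of the fitted stochastic model. This is a minimal sketch of classical scalar LIM only, not the unified method of the abstract; the function name, the assumption that the input is already a mean-removed anomaly series, and the toy series are all hypothetical.

```python
import math
from typing import Sequence


def scalar_lim_rate(anomalies: Sequence[float], lag: int = 1, dt: float = 1.0) -> float:
    """Estimate the linear rate L of dx/dt = L*x + noise from one anomaly series.

    Uses G(lag) = C(lag) / C(0) and L = ln(G) / (lag * dt).
    Assumption: `anomalies` is already mean-removed.
    """
    n = len(anomalies) - lag
    if lag < 1 or n < 1:
        raise ValueError("series must be longer than the lag, and lag >= 1")
    # Both sums run over the same window so the ratio is a least-squares slope.
    c_lag = sum(anomalies[t + lag] * anomalies[t] for t in range(n))
    c_zero = sum(anomalies[t] ** 2 for t in range(n))
    if c_zero == 0:
        raise ValueError("series has zero variance over the fitting window")
    propagator = c_lag / c_zero
    if propagator <= 0:
        raise ValueError("propagator must be positive for a real-valued rate")
    return math.log(propagator) / (lag * dt)


# Toy series that halves at every step, so the rate should be close to ln(0.5).
toy_series = [0.5 ** t for t in range(20)]
estimated_rate = scalar_lim_rate(toy_series)
```

In the multivariate case the covariances become matrices and the logarithm a matrix logarithm; the scalar version is used here only to keep the sketch dependency-free.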