ABSTRACT
PURPOSE: Deciding on the optimal timing of a treatment procedure in patients with hematologic neoplasms is critical, especially for cellular therapies (mostly including allogeneic hematopoietic stem-cell transplantation [HSCT]). In the absence of evidence from randomized trials, real-world observational data become valuable for studying the effect of treatment timing. In this study, a framework to estimate the expected outcome after an intervention in a time-to-event scenario is developed, with the aim of optimizing the timing in a personalized manner. METHODS: Retrospective real-world data are leveraged to emulate a target trial for treatment timing using multistate modeling and microsimulation. This case study focuses on myelodysplastic syndromes, serving as a prototype for rare cancers characterized by a heterogeneous clinical course and complex genomic background. A cohort of 7,118 patients treated according to conventional available treatments/evidence across Europe and the United States is analyzed. The primary clinical objective is to determine the ideal timing for HSCT, the only curative option for these patients. RESULTS: This analysis enabled us to identify the most appropriate time frames for HSCT on the basis of each patient's unique profile, defined by a combination of relevant patient characteristics. CONCLUSION: The developed methodology offers a structured framework to address a relevant clinical issue in the field of hematology. It makes several valuable contributions: (1) novel insights into how to develop decision models to identify the most favorable HSCT timing, (2) evidence to inform clinical decisions in a real-world context, and (3) the incorporation of complex information into decision making. This framework can be applied to provide medical insights for clinical issues that cannot be adequately addressed through randomized clinical trials.
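To make the idea of microsimulation over a multistate model concrete, here is a minimal sketch in Python. It is not the authors' model: the three states (waiting, post-HSCT, dead), the monthly transition probabilities, and the names `simulate_patient` and `mean_survival` are all hypothetical and chosen only for illustration. Each simulated patient moves month by month through the states, and the average simulated survival is compared across candidate transplant timings.

```python
import random

# Hypothetical monthly transition probabilities, invented for illustration only.
P_DEATH_PRE_HSCT = 0.020    # waiting state -> death
P_DEATH_EARLY_POST = 0.030  # first 12 months after HSCT -> death
P_DEATH_LATE_POST = 0.005   # more than 12 months after HSCT -> death


def simulate_patient(hsct_month, rng, horizon=120):
    """Simulate one patient month by month; return months survived (capped at horizon)."""
    for month in range(horizon):
        if month < hsct_month:
            p_death = P_DEATH_PRE_HSCT
        elif month < hsct_month + 12:
            p_death = P_DEATH_EARLY_POST
        else:
            p_death = P_DEATH_LATE_POST
        if rng.random() < p_death:
            return month
    return horizon


def mean_survival(hsct_month, n_patients=5000, seed=0, horizon=120):
    """Average simulated survival (months) if every patient is transplanted at hsct_month."""
    rng = random.Random(seed)
    total = sum(simulate_patient(hsct_month, rng, horizon) for _ in range(n_patients))
    return total / n_patients


if __name__ == "__main__":
    # Compare a few candidate transplant timings (months from diagnosis).
    for timing in (0, 6, 12, 24):
        print(timing, round(mean_survival(timing), 1))
```

In a personalized setting, the transition probabilities would depend on each patient's characteristics rather than being fixed constants as assumed here.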
Subjects
Hematologic Neoplasms , Hematopoietic Stem Cell Transplantation , Precision Medicine , Transplantation, Homologous , Humans , Hematopoietic Stem Cell Transplantation/methods , Hematologic Neoplasms/therapy , Transplantation, Homologous/methods , Male , Middle Aged , Female , Precision Medicine/methods , Adult , Aged , Retrospective Studies , Myelodysplastic Syndromes/therapy , Young Adult
ABSTRACT
PURPOSE: Rare cancers constitute over 20% of human neoplasms, often affecting patients with unmet medical needs. The development of effective classification and prognostication systems is crucial to improve the decision-making process and drive innovative treatment strategies. We have created and implemented MOSAIC, an artificial intelligence (AI)-based framework designed for multimodal analysis, classification, and personalized prognostic assessment in rare cancers. Clinical validation was performed on myelodysplastic syndrome (MDS), a rare hematologic cancer with clinical and genomic heterogeneities. METHODS: We analyzed 4,427 patients with MDS divided into training and validation cohorts. Deep learning methods were applied to integrate and impute clinical/genomic features. Clustering was performed by combining Uniform Manifold Approximation and Projection for Dimension Reduction + Hierarchical Density-Based Spatial Clustering of Applications with Noise (UMAP + HDBSCAN) methods, compared with the conventional Hierarchical Dirichlet Process (HDP). Linear and AI-based nonlinear approaches were compared for survival prediction. Explainable AI (Shapley Additive Explanations approach [SHAP]) and federated learning were used to improve the interpretation and the performance of the clinical models, integrating them into distributed infrastructure. RESULTS: UMAP + HDBSCAN clustering obtained a more granular patient stratification, achieving a higher average silhouette coefficient (0.16) than HDP (0.01) and higher balanced accuracy in cluster classification by Random Forest (92.7% ± 1.3% and 85.8% ± 0.8%, respectively). AI methods for survival prediction outperformed conventional statistical techniques and the reference prognostic tool for MDS. Nonlinear Gradient Boosting Survival stood out in both internal (Concordance-Index [C-Index], 0.77; SD, 0.01) and external validation (C-Index, 0.74; SD, 0.02). SHAP analysis revealed that similar features drove patients' subgroups and outcomes in both training and validation cohorts. Federated implementation improved the accuracy of developed models. CONCLUSION: MOSAIC provides an explainable and robust framework to optimize classification and prognostic assessment of rare cancers. AI-based approaches demonstrated superior accuracy in capturing genomic similarities and providing individual prognostic information compared with conventional statistical methods. Its federated implementation ensures broad clinical application, guaranteeing high performance and data protection.
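The C-Index reported above measures how well a model ranks patients by risk. As a minimal sketch, assuming Harrell's pairwise definition (the abstract does not state which variant was used), it can be computed as follows. The function name `concordance_index` and the toy data are illustrative, not taken from MOSAIC.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk.

    times       -- observed follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: three patients, the last one censored.
    print(concordance_index([5, 10, 15], [1, 1, 0], [3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values of 0.77 and 0.74 lie between the two.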
Subjects
Artificial Intelligence , Precision Medicine , Humans , Prognosis , Precision Medicine/methods , Female , Rare Diseases/classification , Rare Diseases/genetics , Rare Diseases/diagnosis , Male , Deep Learning , Neoplasms/classification , Neoplasms/genetics , Neoplasms/diagnosis , Myelodysplastic Syndromes/diagnosis , Myelodysplastic Syndromes/classification , Myelodysplastic Syndromes/genetics , Myelodysplastic Syndromes/therapy , Algorithms , Middle Aged , Aged , Cluster Analysis
ABSTRACT
The WHO and International Consensus Classification 2022 classifications of myelodysplastic syndromes enhance diagnostic precision and refine decision-making processes in these diseases. However, some discrepancies still exist and potentially cause inconsistency in their adoption in a clinical setting. We adopted a data-driven approach to harmonise these two classification systems. We investigated the importance of genomic features and their effect on the cluster assignment process to define harmonised entity labels. A panel of expert haematologists, haematopathologists, and data scientists who are members of the International Consortium for Myelodysplastic Syndromes was formed and a modified Delphi consensus process was adopted to harmonise morphologically defined categories without a distinct genomic profile. The panel held regular online meetings and participated in a two-round survey using an online voting tool. We identified nine clusters with distinct genomic features. The cluster of highest hierarchical importance was characterised by biallelic TP53 inactivation. Cluster assignment was irrespective of blast count. Individuals with monoallelic TP53 inactivation were assigned to other clusters. Hierarchically, the second most important group included myelodysplastic syndromes with del(5q). Isolated del(5q) and less than 5% of blast cells in the bone marrow were the most relevant label-defining features. The third most important cluster included myelodysplastic syndromes with mutated SF3B1. The absence of isolated del(5q), del(7q)/-7, abn3q26.2, complex karyotype, RUNX1 mutations, or biallelic TP53 was the basis for a harmonised label of this category. Morphologically defined myelodysplastic syndrome entities showed large genomic heterogeneity that was not efficiently captured by single-lineage versus multilineage dysplasia, marrow blasts, hypocellularity, or fibrosis. We investigated the biological continuum between myelodysplastic syndromes with more than 10% bone marrow blasts and acute myeloid leukaemia, and found only a partial overlap in genetic features. After the survey, myelodysplastic syndromes with low blasts (ie, less than 5%) and myelodysplastic syndromes with increased blasts (ie, 5% or more) were recognised as disease entities. Our data-driven approach can efficiently harmonise current classifications of myelodysplastic syndromes and provide a reference for patient management in a real-world setting.
Subjects
Myelodysplastic Syndromes , Myelodysplastic Syndromes/classification , Myelodysplastic Syndromes/diagnosis , Myelodysplastic Syndromes/genetics , Humans , Consensus
ABSTRACT
PURPOSE: Synthetic data are artificial data generated without including any real patient information by an algorithm trained to learn the characteristics of a real source data set, and they have become widely used to accelerate research in life sciences. We aimed to (1) apply generative artificial intelligence to build synthetic data in different hematologic neoplasms; (2) develop a synthetic validation framework to assess data fidelity and privacy preservability; and (3) test the capability of synthetic data to accelerate clinical/translational research in hematology. METHODS: A conditional generative adversarial network architecture was implemented to generate synthetic data. Use cases were myelodysplastic syndromes (MDS) and AML: 7,133 patients were included. A fully explainable validation framework was created to assess fidelity and privacy preservability of synthetic data. RESULTS: We generated MDS/AML synthetic cohorts (including information on clinical features, genomics, treatment, and outcomes) with high fidelity and privacy performances. This technology allowed the resolution of missing/incomplete information and data augmentation. We then assessed the potential value of synthetic data in accelerating research in hematology. Starting from 944 patients with MDS available since 2014, we generated a 300% augmented synthetic cohort and anticipated the development of the molecular classification and molecular scoring system obtained many years later from 2,043 and 2,957 real patients, respectively. Moreover, starting from 187 patients with MDS treated with luspatercept in a clinical trial, we generated a synthetic cohort that recapitulated all the clinical end points of the study. Finally, we developed a website to enable clinicians to generate high-quality synthetic data from an existing biobank of real patients. CONCLUSION: Synthetic data mimic real clinical-genomic features and outcomes, and anonymize patient information. The implementation of this technology makes it possible to increase the scientific use and value of real data, thus accelerating precision medicine in hematology and the conduct of clinical trials.
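The abstract does not say which metrics the validation framework uses. As an illustration of what a privacy check on synthetic data can look like, the following minimal sketch computes the distance from each synthetic record to its closest real record, a commonly used heuristic that is assumed here and not taken from the study. The function names and the toy numeric records are hypothetical.

```python
import math


def distance_to_closest_record(synthetic, real):
    """For each synthetic record, return the Euclidean distance to the nearest real record."""
    return [min(math.dist(s, r) for r in real) for s in synthetic]


def exact_copy_rate(synthetic, real, tol=1e-9):
    """Fraction of synthetic records that (almost) exactly reproduce a real record."""
    distances = distance_to_closest_record(synthetic, real)
    return sum(d <= tol for d in distances) / len(distances)


if __name__ == "__main__":
    # Toy numeric records (e.g. two scaled clinical features); values are made up.
    real = [(0.0, 0.0), (1.0, 1.0)]
    synthetic = [(0.0, 0.0), (3.0, 4.0)]
    print(distance_to_closest_record(synthetic, real))
    print(exact_copy_rate(synthetic, real))
```

A high exact-copy rate would suggest that the generator has memorized real patients. Real clinical data would also need feature scaling and handling of categorical variables, which this sketch omits.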