ABSTRACT
The detection and tracking of metastatic cancer over the lifetime of a patient remain a major challenge in clinical trials and real-world care. Advances in deep learning combined with massive datasets may enable the development of tools that can address this challenge. We present NYUMets-Brain, the world's largest longitudinal, real-world cancer dataset, consisting of the imaging, clinical follow-up, and medical management of 1,429 patients. Using this dataset, we developed Segmentation-Through-Time, a deep neural network that explicitly uses the longitudinal structure of the data, and obtained state-of-the-art results in the detection and segmentation of small (<10 mm³) metastases. We also demonstrate that the monthly rate of change of brain metastases over time is strongly predictive of overall survival (HR 1.27, 95% CI 1.18-1.38). We are releasing the dataset, codebase, and model weights so that other cancer researchers can build upon these results and use them as a public benchmark.
Subject(s)
Benchmarking , Brain Neoplasms , Deep Learning , Neural Networks, Computer , Humans , Brain Neoplasms/secondary , Brain Neoplasms/diagnostic imaging , Longitudinal Studies , Male , Female , Middle Aged , Aged
ABSTRACT
Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment [1-3]. Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing [4,5] to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7-94.9%, with an improvement of 5.36-14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning, and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.