Results 1 - 20 of 67
1.
Neuro Oncol ; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38769022

ABSTRACT

MR imaging is central to the assessment of tumor burden and changes over time in neuro-oncology. Several response assessment guidelines have been set forth by the Response Assessment in Pediatric Neuro-Oncology (RAPNO) working groups in different tumor histologies; however, the visual delineation of tumor components using MRIs is not always straightforward, and complexities not currently addressed by these criteria can introduce inter- and intra-observer variability in manual assessments. Differentiation of non-enhancing tumor from peritumoral edema, mild enhancement from absence of enhancement, and various cystic components can be challenging, particularly given a lack of sufficient and uniform imaging protocols in clinical practice. Automated tumor segmentation with artificial intelligence (AI) may be able to provide more objective delineations, but it relies on accurate and consistent training data created manually (ground truth). This paper reviews existing challenges and potential solutions to identifying and defining subregions of pediatric brain tumors (PBTs) that are not explicitly addressed by current guidelines. The goal is to assert the importance of defining and adopting criteria for addressing these challenges, as it will be critical to achieving standardized tumor measurements and reproducible response assessment in PBTs, ultimately leading to more precise outcome metrics and accurate comparisons among clinical studies.

3.
Radiol Artif Intell ; 6(3): e230333, 2024 May.
Article in English | MEDLINE | ID: mdl-38446044

ABSTRACT

Purpose To develop and externally test a scan-to-prediction deep learning pipeline for noninvasive, MRI-based BRAF mutational status classification for pediatric low-grade glioma. Materials and Methods This retrospective study included two pediatric low-grade glioma datasets with linked genomic and diagnostic T2-weighted MRI data of patients: Dana-Farber/Boston Children's Hospital (development dataset, n = 214 [113 (52.8%) male; 104 (48.6%) BRAF wild type, 60 (28.0%) BRAF fusion, and 50 (23.4%) BRAF V600E]) and the Children's Brain Tumor Network (external testing, n = 112 [55 (49.1%) male; 35 (31.2%) BRAF wild type, 60 (53.6%) BRAF fusion, and 17 (15.2%) BRAF V600E]). A deep learning pipeline was developed to classify BRAF mutational status (BRAF wild type vs BRAF fusion vs BRAF V600E) via a two-stage process: (a) three-dimensional tumor segmentation and extraction of axial tumor images and (b) section-wise, deep learning-based classification of mutational status. Knowledge-transfer and self-supervised approaches were investigated to prevent model overfitting, with a primary end point of the area under the receiver operating characteristic curve (AUC). To enhance model interpretability, a novel metric, center of mass distance, was developed to quantify the model attention around the tumor. Results A combination of transfer learning from a pretrained medical imaging-specific network and self-supervised label cross-training (TransferX) coupled with consensus logic yielded the highest classification performance with an AUC of 0.82 (95% CI: 0.72, 0.91), 0.87 (95% CI: 0.61, 0.97), and 0.85 (95% CI: 0.66, 0.95) for BRAF wild type, BRAF fusion, and BRAF V600E, respectively, on internal testing. On external testing, the pipeline yielded an AUC of 0.72 (95% CI: 0.64, 0.86), 0.78 (95% CI: 0.61, 0.89), and 0.72 (95% CI: 0.64, 0.88) for BRAF wild type, BRAF fusion, and BRAF V600E, respectively. 
Conclusion Transfer learning and self-supervised cross-training improved classification performance and generalizability for noninvasive pediatric low-grade glioma mutational status prediction in a limited data scenario. Keywords: Pediatrics, MRI, CNS, Brain/Brain Stem, Oncology, Feature Detection, Diagnosis, Supervised Learning, Transfer Learning, Convolutional Neural Network (CNN). Supplemental material is available for this article. © RSNA, 2024.
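
The abstract names a "center of mass distance" metric but does not define it. One plausible reading, offered here only as an assumption, is the Euclidean distance between the center of mass of a model attention map and the center of mass of the tumor mask on the same 2D section. The function names `center_of_mass` and `com_distance` are hypothetical and are not taken from the paper.

```python
import math

def center_of_mass(grid):
    """Return the (row, col) center of mass of a 2D grid of non-negative weights."""
    total = sum(sum(row) for row in grid)
    if total == 0:
        raise ValueError("grid has no mass")
    r = sum(i * v for i, row in enumerate(grid) for v in row) / total
    c = sum(j * v for row in grid for j, v in enumerate(row)) / total
    return r, c

def com_distance(attention, mask):
    """Assumed definition: distance between attention and tumor-mask centers of mass."""
    (r1, c1), (r2, c2) = center_of_mass(attention), center_of_mass(mask)
    return math.hypot(r1 - r2, c1 - c2)

# Toy 4x5 section: attention sits at (3, 4), tumor at (0, 0).
attention = [[0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0],
             [0, 0, 0, 0, 1]]
mask = [[1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]]
distance = com_distance(attention, mask)
```

Under this reading, a smaller distance would mean the attention is centered closer to the tumor; the published metric may normalize or aggregate differently.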


Subjects
Brain Neoplasms, Glioma, Humans, Child, Male, Female, Brain Neoplasms/diagnostic imaging, Retrospective Studies, Proto-Oncogene Proteins B-raf/genetics, Glioma/diagnosis, Machine Learning
4.
J Nucl Med ; 65(5): 803-809, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38514087

ABSTRACT

We aimed to investigate the effects of 18F-FDG PET voxel intensity normalization on radiomic features of oropharyngeal squamous cell carcinoma (OPSCC) and machine learning-generated radiomic biomarkers. Methods: We extracted 1,037 18F-FDG PET radiomic features quantifying the shape, intensity, and texture of 430 OPSCC primary tumors. The reproducibility of individual features across 3 intensity-normalized images (body-weight SUV, reference tissue activity ratio to lentiform nucleus of brain and cerebellum) and the raw PET data was assessed using an intraclass correlation coefficient (ICC). We investigated the effects of intensity normalization on the features' utility in predicting the human papillomavirus (HPV) status of OPSCCs in univariate logistic regression, receiver-operating-characteristic analysis, and extreme-gradient-boosting (XGBoost) machine-learning classifiers. Results: Of 1,037 features, a high (ICC ≥ 0.90), medium (0.90 > ICC ≥ 0.75), and low (ICC < 0.75) degree of reproducibility across normalization methods was attained in 356 (34.3%), 608 (58.6%), and 73 (7%) features, respectively. In univariate analysis, features from the PET normalized to the lentiform nucleus had the strongest association with HPV status, with 865 of 1,037 (83.4%) significant features after multiple testing corrections and a median area under the receiver-operating-characteristic curve (AUC) of 0.65 (interquartile range, 0.62-0.68). Similar tendencies were observed in XGBoost models, with the lentiform nucleus-normalized model achieving the numerically highest average AUC of 0.72 (SD, 0.07) in the cross validation within the training cohort. The model generalized well to the validation cohorts, attaining an AUC of 0.73 (95% CI, 0.60-0.85) in independent validation and 0.76 (95% CI, 0.58-0.95) in external validation. The AUCs of the XGBoost models were not significantly different. 
Conclusion: Only one third of the features demonstrated a high degree of reproducibility across intensity-normalization techniques, making uniform normalization a prerequisite for interindividual comparability of radiomic markers. The choice of normalization technique may affect the radiomic features' predictive value with respect to HPV. Our results show trends that normalization to the lentiform nucleus may improve model performance, although more evidence is needed to draw a firm conclusion.
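
The reproducibility bins quoted above can be checked with a few lines of Python. This is a minimal sketch that uses only the thresholds and counts stated in the abstract; the function name `icc_category` is hypothetical.

```python
def icc_category(icc: float) -> str:
    """Bin an ICC value using the thresholds quoted in the abstract."""
    if icc >= 0.90:
        return "high"
    if icc >= 0.75:
        return "medium"
    return "low"

# Feature counts per bin as reported in the abstract.
reported_counts = {"high": 356, "medium": 608, "low": 73}
total = sum(reported_counts.values())

# Share of all features in each bin, in percent, rounded to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in reported_counts.items()}
```

The three bins add up to the 1,037 extracted features, and the high-reproducibility share of about one third matches the wording of the conclusion.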


Subjects
Fluorodeoxyglucose F18, Machine Learning, Oropharyngeal Neoplasms, Humans, Oropharyngeal Neoplasms/diagnostic imaging, Male, Female, Middle Aged, Positron-Emission Tomography/methods, Computer-Assisted Image Processing/methods, Aged, Squamous Cell Carcinoma/diagnostic imaging, Tumor Biomarkers/metabolism, Reproducibility of Results, Radiomics
6.
JAMA Otolaryngol Head Neck Surg ; 150(2): 151-156, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38175664

ABSTRACT

Importance: The likelihood that an oral cavity lesion harbors occult invasive disease after biopsy demonstrating carcinoma in situ (CIS) is unknown. While de-escalated treatment strategies may be appealing in the setting of CIS, knowing whether occult invasive disease may be present and its association with survival outcomes would lead to more informed management decisions. Objective: To evaluate the rate of occult invasive disease and clinical outcomes in patients with oral cavity CIS. Design, Setting, and Participants: This was a retrospective population-based cohort study using the National Cancer Database and included adults with biopsy-proven oral cavity CIS as the first diagnosis of cancer between 2004 and 2020. Data were analyzed from October 10, 2022, to June 25, 2023. Exposures: Surgical resection vs no surgery. Main Outcomes and Measures: Analyses calculated the rate of occult invasive disease identified on resection of a biopsy-proven CIS lesion. Univariate and multivariate logistic regression with odds ratios and 95% CIs were used to identify significant demographic and clinical characteristics associated with risk of occult invasion (age, year of diagnosis, sex, race and ethnicity, oral cavity subsite, and comorbidity status). Kaplan-Meier curves for overall survival (OS) were calculated for both unresected and resected cohorts (stratified by presence of occult invasive disease). Results: A total of 1856 patients with oral cavity CIS were identified, with 122 who did not undergo surgery (median [range] age, 65 [26-90] years; 48 female individuals [39.3%] and 74 male individuals [60.7%]) and 1458 who underwent surgical resection and had documented pathology (median [range] age, 62 [21-90] years; 490 female individuals [33.6%] and 968 male individuals [66.4%]). Of the 1580 patients overall, 52 (3.3%) were Black; 39 (2.5%), Hispanic; 1365 (86.4%), White; and 124 (7.8%), other, not specified. 
Among those who proceeded with surgery with documented pathology, 408 patients (28.0%) were found to have occult invasive disease. Higher-risk features were present in 45 patients (11.0%) for final margin positivity, 16 patients (3.9%) for lymphovascular invasion, 13 patients (3.2%) for high-grade invasive disease, and 14 patients (3.4%) for nodal involvement. For those patients with occult disease, staging according to the American Joint Committee on Cancer's AJCC Cancer Staging Manual, eighth edition, was pT1 in 341 patients (83.6%), pT2 in 41 (10.0%), and pT3 or pT4 disease in 26 (6.4%). Factors associated with greater odds of occult invasive disease at resection were female sex, Black race, and alveolar ridge, vestibule, and retromolar subsite. With median 66-month follow-up, 5-year OS was 85.9% in patients who proceeded with surgical resection vs 59.7% in patients who did not undergo surgery (difference, 26.2%; 95% CI, 19.0%-33.4%). Conclusions and Relevance: This cohort study assessed the risk of concurrent occult invasion with biopsy-proven CIS of the oral cavity, demonstrating that 28.0% had invasive disease at resection. Reassuringly, even in the setting of occult invasion, high-risk disease features were rare, and 5-year OS was nearly 80% with resection. The findings support the practice of definitive resection if feasible following biopsy demonstrating oral cavity CIS.
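
The headline percentages in this abstract follow directly from the reported counts. The short Python check below is a sketch using only numbers quoted above; the variable names are illustrative.

```python
# Counts quoted in the abstract.
resected_with_pathology = 1458
occult_invasive = 408

# Rate of occult invasive disease among resected patients, in percent.
occult_rate = round(100 * occult_invasive / resected_with_pathology, 1)

# Pathologic T stage among patients with occult invasive disease.
stage_counts = {"pT1": 341, "pT2": 41, "pT3 or pT4": 26}
stage_shares = {s: round(100 * n / occult_invasive, 1) for s, n in stage_counts.items()}

# Absolute difference in 5-year overall survival, in percentage points.
os_resected, os_unresected = 85.9, 59.7
os_difference = round(os_resected - os_unresected, 1)
```

The stage counts sum to the 408 patients with occult invasion, and the survival difference reproduces the reported 26.2 percentage points.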


Subjects
Squamous Cell Carcinoma, Head and Neck Neoplasms, Mouth Neoplasms, Adult, Humans, Male, Female, Aged, Middle Aged, Squamous Cell Carcinoma of Head and Neck/pathology, Cohort Studies, Retrospective Studies, Neoplasm Staging, Squamous Cell Carcinoma/pathology, Mouth Neoplasms/pathology, Biopsy, Head and Neck Neoplasms/pathology
7.
NPJ Digit Med ; 7(1): 6, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38200151

ABSTRACT

Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71), and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The effect of adding LLM-generated synthetic data to training varied across models and architectures, but it improved the performance of smaller Flan-T5 models (delta F1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in the zero- and few-shot settings, except GPT4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs in improving real-world evidence on SDoH and assisting in identifying patients who could benefit from resource support.
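
Macro-F1, the headline metric in this abstract, is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. The sketch below shows the standard definition on toy single-label data; the labels and the function name `macro_f1` are illustrative and unrelated to the study's data or evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels only: one "housing" mention is misread as "employment".
y_true = ["housing", "housing", "employment", "employment", "transportation"]
y_pred = ["housing", "employment", "employment", "employment", "transportation"]
score = macro_f1(y_true, y_pred)
```

Here the per-class F1 values are 0.8 (employment), 2/3 (housing), and 1.0 (transportation), so the macro average is about 0.82.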

8.
Sci Rep ; 14(1): 2536, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38291051

ABSTRACT

Manual segmentation of tumors and organs-at-risk (OAR) in 3D imaging for radiation-therapy planning is time-consuming and subject to variation between different observers. Artificial intelligence (AI) can assist with segmentation, but challenges exist in ensuring high-quality segmentation, especially for small, variable structures, such as the esophagus. We investigated the effect of variation in segmentation quality and style of physicians for training deep-learning models for esophagus segmentation and proposed a new metric, edge roughness, for evaluating/quantifying slice-to-slice inconsistency. This study includes a real-world cohort of 394 patients who each received radiation therapy (mainly for lung cancer). Segmentation of the esophagus was performed by 8 physicians as part of routine clinical care. We evaluated manual segmentation by comparing the length and edge roughness of segmentations among physicians to analyze inconsistencies. We trained eight multiple- and individual-physician segmentation models in total, based on U-Net architectures and residual backbones. We used the volumetric Dice coefficient to measure the performance for each model. We proposed a metric, edge roughness, to quantify the shift of segmentation among adjacent slices by calculating the curvature of edges of the 2D sagittal- and coronal-view projections. The auto-segmentation model trained on multiple physicians (MD1-7) achieved the highest mean Dice of 73.7 ± 14.8%. The individual-physician model (MD7) with the highest edge roughness (mean ± SD: 0.106 ± 0.016) demonstrated significantly lower volumetric Dice for test cases compared with other individual models (MD7: 58.5 ± 15.8%, MD6: 67.1 ± 16.8%, p < 0.001). A multiple-physician model trained after removing the MD7 data resulted in fewer outliers (e.g., Dice ≤ 40%: 4 cases for MD1-6, 7 cases for MD1-7, Ntotal = 394). 
While we initially detected this pattern in a single clinician, we validated the edge roughness metric across the entire dataset. The model trained with the lowest-quantile edge roughness (MDER-Q1, Ntrain = 62) achieved significantly higher Dice (Ntest = 270) than the model trained with the highest-quantile ones (MDER-Q4, Ntrain = 62) (MDER-Q1: 67.8 ± 14.8%, MDER-Q4: 62.8 ± 15.7%, p < 0.001). This study demonstrates that there is significant variation in style and quality in manual segmentations in clinical care, and that training AI auto-segmentation algorithms from real-world, clinical datasets may result in unexpectedly under-performing algorithms with the inclusion of outliers. Importantly, this study provides a novel evaluation metric, edge roughness, to quantify physician variation in segmentation which will allow developers to filter clinical training data to optimize model performance.
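
The volumetric Dice coefficient used to score the models above is twice the overlap between two segmentations divided by the sum of their sizes. The following is a minimal sketch that represents each segmentation as a set of voxel coordinates; that representation and the function name `dice` are choices made for illustration. The edge roughness metric is not sketched because the abstract does not give its formula.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient between two segmentations given as sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # two empty segmentations are treated as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))

# Toy example: a 4-voxel manual contour and a 3-voxel automatic contour sharing 2 voxels.
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto = {(0, 0, 1), (0, 1, 1), (0, 2, 1)}
overlap_score = dice(manual, auto)
```

In this toy case the score is 2 × 2 / (4 + 3) = 4/7, roughly 57%, which is in the range reported for the lower-performing individual-physician model.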


Subjects
Deep Learning, Humans, Artificial Intelligence, Thorax, Algorithms, X-Ray Computed Tomography, Computer-Assisted Image Processing/methods
9.
Radiother Oncol ; 190: 110034, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38030080

ABSTRACT

BACKGROUND/PURPOSE: Central/ultra-central thoracic tumors are challenging to treat with stereotactic radiotherapy due to potential high-grade toxicity. Stereotactic MR-guided adaptive radiation therapy (SMART) may improve the therapeutic window through motion control with breath-hold gating and real-time MR-imaging as well as the option for daily online adaptive replanning to account for changes in target and/or organ-at-risk (OAR) location. MATERIALS/METHODS: 26 central (19 ultra-central) thoracic oligoprogressive/oligometastatic tumors treated with isotoxic (OAR constraints-driven) 5-fraction SMART (median 50 Gy, range 35-60) between 10/2019-10/2022 were reviewed. Central tumor was defined as tumor within or touching 2 cm around proximal tracheobronchial tree (PBT) or adjacent to mediastinal/pericardial pleura. Ultra-central was defined as tumor abutting the PBT, esophagus, or great vessel. Hard OAR constraints observed were ≤ 0.03 cc for PBT V40, great vessel V52.5, and esophagus V35. Local failure was defined as tumor progression/recurrence within the planning target volume. RESULTS: Tumor abutted the PBT in 31 %, esophagus in 31 %, great vessel in 65 %, and heart in 42 % of cases. 96 % of fractions were treated with reoptimized plan, necessary to meet OAR constraints (80 %) and/or target coverage (20 %). Median follow-up was 19 months (27 months among surviving patients). Local control (LC) was 96 % at 1-year and 90 % at 2-years (total 2/26 local failure). 23 % had G2 acute toxicities (esophagitis, dysphagia, anorexia, nausea) and one (4 %) had G3 acute radiation dermatitis. There were no G4-5 acute toxicities. There was no symptomatic pneumonitis and no G2+ late toxicities. CONCLUSION: Isotoxic 5-fraction SMART resulted in high rates of LC and minimal toxicity. This approach may widen the therapeutic window for high-risk oligoprogressive/oligometastatic thoracic tumors.


Subjects
Lung Neoplasms, Radiation Injuries, Radiosurgery, Thoracic Neoplasms, Humans, Computer-Assisted Radiotherapy Planning/methods, Local Neoplasm Recurrence, Radiosurgery/methods, Thoracic Neoplasms/radiotherapy, Magnetic Resonance Imaging/methods, Lung Neoplasms/diagnostic imaging, Lung Neoplasms/radiotherapy, Lung Neoplasms/pathology
10.
Nat Commun ; 14(1): 6863, 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-37945573

ABSTRACT

Lean muscle mass (LMM) is an important aspect of human health. Temporalis muscle thickness is a promising LMM marker but has had limited utility due to its unknown normal growth trajectory and reference ranges and lack of standardized measurement. Here, we develop an automated deep learning pipeline to accurately measure temporalis muscle thickness (iTMT) from routine brain magnetic resonance imaging (MRI). We apply iTMT to 23,876 MRIs of healthy subjects, ages 4 through 35, and generate sex-specific iTMT normal growth charts with percentiles. We find that iTMT was associated with specific physiologic traits, including caloric intake, physical activity, sex hormone levels, and presence of malignancy. We validate iTMT across multiple demographic groups and in children with brain tumors and demonstrate feasibility for individualized longitudinal monitoring. The iTMT pipeline provides unprecedented insights into temporalis muscle growth during human development and enables the use of LMM tracking to inform clinical decision-making.


Subjects
Growth Charts, Temporal Muscle, Male, Female, Humans, Child, Temporal Muscle/diagnostic imaging, Temporal Muscle/pathology
12.
medRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745558

ABSTRACT

Because humans age at different rates, a person's physical appearance may yield insights into their biological age and physiological health more reliably than their chronological age. In medicine, however, appearance is incorporated into medical judgments in a subjective and non-standardized fashion. In this study, we developed and validated FaceAge, a deep learning system to estimate biological age from easily obtainable and low-cost face photographs. FaceAge was trained on data from 58,851 healthy individuals, and clinical utility was evaluated on data from 6,196 patients with cancer diagnoses from two institutions in the United States and The Netherlands. To assess the prognostic relevance of FaceAge estimation, we performed Kaplan-Meier survival analysis. To test a relevant clinical application of FaceAge, we assessed the performance of FaceAge in end-of-life patients with metastatic cancer who received palliative treatment by incorporating FaceAge into clinical prediction models. We found that, on average, cancer patients look older than their chronological age, and looking older is correlated with worse overall survival. FaceAge demonstrated significant independent prognostic performance in a range of cancer types and stages. We found that FaceAge can improve physicians' survival predictions in incurable patients receiving palliative treatments, highlighting the clinical utility of the algorithm to support end-of-life decision-making. FaceAge was also significantly associated with molecular mechanisms of senescence through gene analysis, while age was not. These findings may extend to diseases beyond cancer, motivating the use of deep learning algorithms to translate a patient's visual appearance into objective, quantitative, and clinically useful measures.

13.
JTO Clin Res Rep ; 4(10): 100559, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37732171

ABSTRACT

Introduction: Thoracic radiotherapy (TRT) is increasingly used in patients receiving osimertinib for advanced NSCLC, and the risk of pneumonitis is not established. We investigated the risk of pneumonitis and potential risk factors in this population. Methods: We performed a multi-institutional retrospective analysis of patients under active treatment with osimertinib who received TRT between April 2016 and July 2022 at two institutions. Clinical characteristics, including whether osimertinib was held during TRT and pneumonitis incidence and grade (Common Terminology Criteria for Adverse Events version 5.0) were documented. Logistic regression analysis was performed to identify risk factors associated with grade 2 or higher (2+) pneumonitis. Results: The median follow-up was 10.2 months (range: 1.9-53.2). Of 102 patients, 14 (13.7%) developed grade 2+ pneumonitis, with a median time to pneumonitis of 3.2 months (range: 1.5-6.3). Pneumonitis risk was not significantly increased in patients who continued osimertinib during TRT compared with patients who held osimertinib during TRT (9.1% versus 15.0%, p = 0.729). Three patients (2.9%) had grade 3 pneumonitis, none had grade 4, and two patients had grade 5 events (2.0%, diagnosed 3.2 mo and 4.4 mo post-TRT). Mean lung dose was associated with the development of grade 2+ pneumonitis in multivariate analysis (OR = 1.19, p = 0.021). Conclusions: Although the overall rate of pneumonitis in patients receiving TRT and osimertinib was relatively low, there was a small risk of severe toxicity. The mean lung dose was associated with an increased risk of developing pneumonitis. These findings inform decision-making for patients and providers.

14.
medRxiv ; 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37609311

ABSTRACT

Purpose: To develop and externally validate a scan-to-prediction deep-learning pipeline for noninvasive, MRI-based BRAF mutational status classification for pLGG. Materials and Methods: We conducted a retrospective study of two pLGG datasets with linked genomic and diagnostic T2-weighted MRI of patients: BCH (development dataset, n=214 [60 (28%) BRAF fusion, 50 (23%) BRAF V600E, 104 (49%) wild-type]), and Children's Brain Tumor Network (CBTN) (external validation, n=112 [60 (53%) BRAF fusion, 17 (15%) BRAF V600E, 35 (32%) wild-type]). We developed a deep learning pipeline to classify BRAF mutational status (V600E vs. fusion vs. wild-type) via a two-stage process: 1) 3D tumor segmentation and extraction of axial tumor images, and 2) slice-wise, deep learning-based classification of mutational status. We investigated knowledge-transfer and self-supervised approaches to prevent model overfitting with a primary endpoint of the area under the receiver operating characteristic curve (AUC). To enhance model interpretability, we developed a novel metric, COMDist, that quantifies the accuracy of model attention around the tumor. Results: A combination of transfer learning from a pretrained medical imaging-specific network and self-supervised label cross-training (TransferX) coupled with consensus logic yielded the highest macro-average AUC (0.82 [95% CI: 0.70-0.90]) and accuracy (77%) on internal validation, with an AUC improvement of +17.7% and a COMDist improvement of +6.4% versus training from scratch. On external validation, the TransferX model yielded AUC (0.73 [95% CI 0.68-0.88]) and accuracy (75%). Conclusion: Transfer learning and self-supervised cross-training improved classification performance and generalizability for noninvasive pLGG mutational status prediction in a limited data scenario.

15.
JAMA Oncol ; 9(10): 1459-1462, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37615976

ABSTRACT

This survey study examines the performance of a large language model chatbot in providing cancer treatment recommendations that are concordant with National Comprehensive Cancer Network guidelines.


Subjects
Artificial Intelligence, Neoplasms, Humans, Neoplasms/therapy
16.
JAMA Netw Open ; 6(8): e2328280, 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37561460

ABSTRACT

Importance: Sarcopenia is an established prognostic factor in patients with head and neck squamous cell carcinoma (HNSCC); the quantification of sarcopenia assessed by imaging is typically achieved through the skeletal muscle index (SMI), which can be derived from cervical skeletal muscle segmentation and cross-sectional area. However, manual muscle segmentation is labor intensive, prone to interobserver variability, and impractical for large-scale clinical use. Objective: To develop and externally validate a fully automated image-based deep learning platform for cervical vertebral muscle segmentation and SMI calculation and evaluate associations with survival and treatment toxicity outcomes. Design, Setting, and Participants: For this prognostic study, a model development data set was curated from publicly available and deidentified data from patients with HNSCC treated at MD Anderson Cancer Center between January 1, 2003, and December 31, 2013. A total of 899 patients undergoing primary radiation for HNSCC with abdominal computed tomography scans and complete clinical information were selected. An external validation data set was retrospectively collected from patients undergoing primary radiation therapy between January 1, 1996, and December 31, 2013, at Brigham and Women's Hospital. The data analysis was performed between May 1, 2022, and March 31, 2023. Exposure: C3 vertebral skeletal muscle segmentation during radiation therapy for HNSCC. Main Outcomes and Measures: Overall survival and treatment toxicity outcomes of HNSCC. Results: The total patient cohort comprised 899 patients with HNSCC (median [range] age, 58 [24-90] years; 140 female [15.6%] and 755 male [84.0%]). Dice similarity coefficients for the validation set (n = 96) and internal test set (n = 48) were 0.90 (95% CI, 0.90-0.91) and 0.90 (95% CI, 0.89-0.91), respectively, with a mean 96.2% acceptable rate between 2 reviewers on external clinical testing (n = 377). 
Estimated cross-sectional area and SMI values were associated with manually annotated values (Pearson r = 0.99; P < .001) across data sets. On multivariable Cox proportional hazards regression, SMI-derived sarcopenia was associated with worse overall survival (hazard ratio, 2.05; 95% CI, 1.04-4.04; P = .04) and longer feeding tube duration (median [range], 162 [6-1477] vs 134 [15-1255] days; hazard ratio, 0.66; 95% CI, 0.48-0.89; P = .006) than no sarcopenia. Conclusions and Relevance: This prognostic study's findings show external validation of a fully automated deep learning pipeline to accurately measure sarcopenia in HNSCC and an association with important disease outcomes. The pipeline could enable the integration of sarcopenia assessment into clinical decision making for individuals with HNSCC.


Subjects
Deep Learning, Head and Neck Neoplasms, Sarcopenia, Humans, Male, Female, Middle Aged, Squamous Cell Carcinoma of Head and Neck/diagnostic imaging, Retrospective Studies, Sarcopenia/diagnostic imaging, Sarcopenia/complications, Head and Neck Neoplasms/complications, Head and Neck Neoplasms/diagnostic imaging
17.
Cancer Res Commun ; 3(6): 1140-1151, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37397861

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) are becoming critical in developing and deploying personalized medicine and targeted clinical trials. Recent advances in ML have enabled the integration of wider ranges of data including both medical records and imaging (radiomics). However, the development of prognostic models is complex as no modeling strategy is universally superior to others and validation of developed models requires large and diverse datasets to demonstrate that prognostic models developed (regardless of method) from one dataset are applicable to other datasets both internally and externally. Using a retrospective dataset of 2,552 patients from a single institution and a strict evaluation framework that included external validation on three external patient cohorts (873 patients), we crowdsourced the development of ML models to predict overall survival in head and neck cancer (HNC) using electronic medical records (EMR) and pretreatment radiological images. To assess the relative contributions of radiomics in predicting HNC prognosis, we compared 12 different models using imaging and/or EMR data. The model with the highest accuracy used multitask learning on clinical data and tumor volume, achieving high prognostic accuracy for 2-year and lifetime survival prediction, outperforming models relying on clinical data only, engineered radiomics, or complex deep neural network architecture. However, when we attempted to extend the best performing models from this large training dataset to other institutions, we observed significant reductions in the performance of the model in those datasets, highlighting the importance of detailed population-based reporting for AI/ML model utility and stronger validation frameworks. 
We have developed highly prognostic models for overall survival in HNC using EMRs and pretreatment radiological images based on a large, retrospective dataset of 2,552 patients from our institution. Diverse ML approaches were used by independent investigators. The model with the highest accuracy used multitask learning on clinical data and tumor volume. External validation of the top three performing models on three datasets (873 patients) with significant differences in the distributions of clinical and demographic variables demonstrated significant decreases in model performance. Significance: ML combined with simple prognostic factors outperformed multiple advanced CT radiomics and deep learning methods. ML models provided diverse solutions for prognosis of patients with HNC, but their prognostic value is affected by differences in patient populations and requires extensive validation.


Subjects
Deep Learning, Head and Neck Neoplasms, Humans, Prognosis, Retrospective Studies, Artificial Intelligence, Head and Neck Neoplasms/diagnostic imaging
18.
medRxiv ; 2023 Sep 18.
Article in English | MEDLINE | ID: mdl-37425854

ABSTRACT

Purpose: Artificial intelligence (AI)-automated tumor delineation for pediatric gliomas would enable real-time volumetric evaluation to support diagnosis, treatment response assessment, and clinical decision-making. Auto-segmentation algorithms for pediatric tumors are rare, due to limited data availability, and algorithms have yet to demonstrate clinical translation. Methods: We leveraged two datasets from a national brain tumor consortium (n=184) and a pediatric cancer center (n=100) to develop, externally validate, and clinically benchmark deep learning neural networks for pediatric low-grade glioma (pLGG) segmentation using a novel in-domain, stepwise transfer learning approach. The best model [via Dice similarity coefficient (DSC)] was externally validated and subjected to randomized, blinded evaluation by three expert clinicians wherein clinicians assessed clinical acceptability of expert- and AI-generated segmentations via 10-point Likert scales and Turing tests. Results: The best AI model utilized in-domain, stepwise transfer learning (median DSC: 0.877 [IQR 0.715-0.914]) versus the baseline model (median DSC: 0.812 [IQR 0.559-0.888]; p<0.05). On external testing (n=60), the AI model yielded accuracy comparable to inter-expert agreement (median DSC: 0.834 [IQR 0.726-0.901] vs. 0.861 [IQR 0.795-0.905], p=0.13). On clinical benchmarking (n=100 scans, 300 segmentations from 3 experts), the experts rated the AI model higher on average compared to other experts (median Likert rating: 9 [IQR 7-9] vs. 7 [IQR 7-9], p<0.05 for each). Additionally, the AI segmentations had significantly higher (p<0.05) overall acceptability compared to experts on average (80.2% vs. 65.4%). Experts correctly predicted the origins of AI segmentations in an average of 26.0% of cases. Conclusions: Stepwise transfer learning enabled expert-level, automated pediatric brain tumor auto-segmentation and volumetric measurement with a high level of clinical acceptability.
This approach may enable development and translation of AI imaging segmentation algorithms in limited data scenarios.
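
The Dice similarity coefficient (DSC) used above to rank segmentation models can be illustrated with a minimal sketch. It assumes each segmentation is a flattened binary mask of equal length; the function name `dice_coefficient`, the toy masks, and the convention that two empty masks score 1.0 are illustrative assumptions, not details taken from the study.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks.

    DSC = 2 * |A intersect B| / (|A| + |B|), where |.| counts foreground voxels.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / (size_a + size_b)


# Hypothetical toy masks (1 = tumor voxel, 0 = background).
expert_mask = [0, 1, 1, 1, 0, 0]
model_mask = [0, 0, 1, 1, 1, 0]
print(round(dice_coefficient(expert_mask, model_mask), 3))  # 0.667
```

In this toy case the masks share two foreground voxels and each contains three, giving 2 * 2 / (3 + 3). A median and IQR like those reported would come from applying such a function across all test scans.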

19.
Lancet Digit Health ; 5(6): e360-e369, 2023 06.
Article in English | MEDLINE | ID: mdl-37087370

ABSTRACT

BACKGROUND: Pretreatment identification of pathological extranodal extension (ENE) would guide therapy de-escalation strategies for human papillomavirus (HPV)-associated oropharyngeal carcinoma but is diagnostically challenging. ECOG-ACRIN Cancer Research Group E3311 was a multicentre trial wherein patients with HPV-associated oropharyngeal carcinoma were treated surgically and assigned to a pathological risk-based adjuvant strategy of observation, radiation, or concurrent chemoradiation. Despite protocol exclusion of patients with overt radiographic ENE, more than 30% had pathological ENE and required postoperative chemoradiation. We aimed to evaluate a CT-based deep learning algorithm for prediction of ENE in E3311, a diagnostically challenging cohort wherein algorithm use would be impactful in guiding decision-making. METHODS: For this retrospective evaluation of deep learning algorithm performance, we obtained pretreatment CTs and corresponding surgical pathology reports from the multicentre, randomised de-escalation trial E3311. All enrolled patients on E3311 required pretreatment and diagnostic head and neck imaging; patients with radiographically overt ENE were excluded per study protocol. The lymph node with largest short-axis diameter and up to two additional nodes were segmented on each scan and annotated for ENE per pathology reports. Deep learning algorithm performance for ENE prediction was compared with that of four board-certified head and neck radiologists. The primary endpoint was the area under the curve (AUC) of the receiver operating characteristic. FINDINGS: From 178 collected scans, 313 nodes were annotated: 71 (23%) with ENE in general, 39 (13%) with ENE larger than 1 mm. The deep learning algorithm AUC for ENE classification was 0·86 (95% CI 0·82-0·90), outperforming all readers (p<0·0001 for each). Among radiologists, there was high variability in specificity (43-86%) and sensitivity (45-96%) with poor inter-reader agreement (κ 0·32).
Matching the algorithm specificity to that of the reader with highest AUC (R2, false positive rate 22%) yielded improved sensitivity to 75% (+ 13%). Setting the algorithm false positive rate to 30% yielded 90% sensitivity. The algorithm showed improved performance compared with radiologists for ENE larger than 1 mm (p<0·0001) and in nodes with short-axis diameter 1 cm or larger. INTERPRETATION: The deep learning algorithm outperformed experts in predicting pathological ENE on a challenging cohort of patients with HPV-associated oropharyngeal carcinoma from a randomised clinical trial. Deep learning algorithms should be evaluated prospectively as a treatment selection tool. FUNDING: ECOG-ACRIN Cancer Research Group and the National Cancer Institute of the US National Institutes of Health.
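
The comparison above relies on two operations: computing an AUC and reading off the algorithm's sensitivity at a fixed false positive rate. A minimal sketch of both follows, using only the Python standard library. The function names, the toy labels and scores, and the rule "predict ENE when score >= threshold" are assumptions made for illustration; none of them come from the E3311 analysis.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at_fpr(
    labels: Sequence[int], scores: Sequence[float], max_fpr: float
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, fpr) for the most sensitive threshold
    whose false positive rate does not exceed max_fpr."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best = (float("inf"), 0.0, 0.0)  # threshold above every score: nothing flagged
    # Lowering the threshold can only raise sensitivity and FPR, so stop at
    # the first threshold that exceeds the FPR budget.
    for threshold in sorted(set(scores), reverse=True):
        fpr = sum(s >= threshold for s in neg) / len(neg)
        if fpr > max_fpr:
            break
        sensitivity = sum(s >= threshold for s in pos) / len(pos)
        best = (threshold, sensitivity, fpr)
    return best


# Hypothetical toy data: 1 = pathological ENE, 0 = no ENE.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
print(roc_auc(labels, scores))                   # 0.85
print(sensitivity_at_fpr(labels, scores, 0.25))  # (0.6, 0.75, 0.2)
```

Matching the algorithm to a reader's specificity corresponds to calling `sensitivity_at_fpr` with `max_fpr` set to that reader's false positive rate.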


Subjects
Carcinoma , Deep Learning , Oropharyngeal Neoplasms , Papillomavirus Infections , Humans , Human Papillomavirus Viruses , Retrospective Studies , Papillomavirus Infections/diagnostic imaging , Papillomavirus Infections/complications , Extranodal Extension , Oropharyngeal Neoplasms/diagnostic imaging , Oropharyngeal Neoplasms/pathology , Algorithms , Carcinoma/complications , Tomography, X-Ray Computed
20.
medRxiv ; 2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36865296

ABSTRACT

Background: Oropharyngeal cancer (OPC) is a widespread disease, with radiotherapy being a core treatment modality. Manual segmentation of the primary gross tumor volume (GTVp) is currently employed for OPC radiotherapy planning, but is subject to significant interobserver variability. Deep learning (DL) approaches have shown promise in automating GTVp segmentation, but comparative (auto)confidence metrics of these models' predictions have not been well explored. Quantifying instance-specific DL model uncertainty is crucial to improving clinician trust and facilitating broad clinical implementation. Therefore, in this study, probabilistic DL models for GTVp auto-segmentation were developed using large-scale PET/CT datasets, and various uncertainty auto-estimation methods were systematically investigated and benchmarked. Methods: We utilized the publicly available 2021 HECKTOR Challenge training dataset with 224 co-registered PET/CT scans of OPC patients with corresponding GTVp segmentations as a development set. A separate set of 67 co-registered PET/CT scans of OPC patients with corresponding GTVp segmentations was used for external validation. Two approximate Bayesian deep learning methods, the MC Dropout Ensemble and Deep Ensemble, both with five submodels, were evaluated for GTVp segmentation and uncertainty performance. The segmentation performance was evaluated using the volumetric Dice similarity coefficient (DSC), mean surface distance (MSD), and Hausdorff distance at 95% (95HD). The uncertainty was evaluated using four measures from the literature: coefficient of variation (CV), structure expected entropy, structure predictive entropy, and structure mutual information, and additionally with our novel Dice-risk measure.
The utility of uncertainty information was evaluated with the accuracy of uncertainty-based segmentation performance prediction using the Accuracy vs Uncertainty (AvU) metric, and by examining the linear correlation between uncertainty estimates and DSC. In addition, batch-based and instance-based referral processes were examined, where the patients with high uncertainty were rejected from the set. In the batch referral process, the area under the referral curve with DSC (R-DSC AUC) was used for evaluation, whereas in the instance referral process, the DSC at various uncertainty thresholds was examined. Results: Both models behaved similarly in terms of the segmentation performance and uncertainty estimation. Specifically, the MC Dropout Ensemble had 0.776 DSC, 1.703 mm MSD, and 5.385 mm 95HD. The Deep Ensemble had 0.767 DSC, 1.717 mm MSD, and 5.477 mm 95HD. The uncertainty measure with the highest DSC correlation was structure predictive entropy, with correlation coefficients of 0.699 and 0.692 for the MC Dropout Ensemble and the Deep Ensemble, respectively. The highest AvU value was 0.866 for both models. The best performing uncertainty measure for both models was the CV, which had R-DSC AUC of 0.783 and 0.782 for the MC Dropout Ensemble and Deep Ensemble, respectively. When patients were referred based on uncertainty thresholds from 0.85 validation DSC for all uncertainty measures, the DSC improved on average from the full dataset by 4.7% and 5.0% while referring 21.8% and 22% of patients for the MC Dropout Ensemble and Deep Ensemble, respectively. Conclusion: We found that many of the investigated methods provide overall similar but distinct utility in terms of predicting segmentation quality and referral performance. These findings are a critical first step towards more widespread implementation of uncertainty quantification in OPC GTVp segmentation.
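
The entropy-based uncertainty measures named above can be sketched for a binary segmentation ensemble as follows. This is an illustration under stated assumptions, not the study's implementation: the abstract does not define how the "structure" variants aggregate voxels, so this sketch simply averages over every voxel supplied, and it computes the coefficient of variation on the submodels' segmented volumes at an assumed 0.5 probability cutoff. The function names and toy probabilities are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Sequence


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli variable with foreground probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def ensemble_uncertainty(probs: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Uncertainty summaries for probs[m][v]: the foreground probability
    predicted by submodel m at voxel v."""
    n_voxels = len(probs[0])
    # Predictive entropy: entropy of the ensemble-averaged probability.
    mean_probs = [mean(model[v] for model in probs) for v in range(n_voxels)]
    predictive = mean(binary_entropy(p) for p in mean_probs)
    # Expected entropy: average of each submodel's own entropy.
    expected = mean(
        mean(binary_entropy(model[v]) for model in probs) for v in range(n_voxels)
    )
    # Coefficient of variation of segmented volume (assumed cutoff of 0.5).
    volumes = [sum(1 for p in model if p >= 0.5) for model in probs]
    mean_volume = mean(volumes)
    cv = pstdev(volumes) / mean_volume if mean_volume > 0 else 0.0
    return {
        "predictive_entropy": predictive,
        "expected_entropy": expected,
        # Mutual information isolates disagreement between submodels.
        "mutual_information": predictive - expected,
        "coefficient_of_variation": cv,
    }


# Hypothetical two-submodel ensembles over three voxels.
agreeing = [[0.9, 0.1, 0.8], [0.9, 0.1, 0.8]]
disagreeing = [[0.9, 0.1, 0.9], [0.1, 0.1, 0.9]]
print(ensemble_uncertainty(agreeing)["mutual_information"])
print(ensemble_uncertainty(disagreeing)["mutual_information"])
```

When the submodels agree, mutual information and the coefficient of variation are both near zero; when they disagree on a voxel, both rise. In a referral process like the one described, cases whose summary value exceeds a chosen threshold would be sent for manual review.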
