1.
medRxiv ; 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39314948

ABSTRACT

Purpose: This study aims to identify radiomic features extracted from contrast-enhanced CT scans that differentiate osteoradionecrosis (ORN) from normal mandibular bone in patients with head and neck cancer (HNC) treated with radiotherapy (RT). Materials and Methods: Contrast-enhanced CT (CECT) images were collected for 150 patients (80% train, 20% test) with confirmed ORN diagnosis at The University of Texas MD Anderson Cancer Center between 2008 and 2018. Using PyRadiomics, radiomic features were extracted from manually segmented ORN regions and the corresponding automated control regions, the latter defined as the contralateral healthy mandible region. A subset of pre-selected features was obtained by excluding highly correlated features (r > 0.95) and used to train a Random Forest (RF) classifier with Recursive Feature Elimination. For model explainability, SHapley Additive exPlanations (SHAP) analysis was performed on the 20 most important features identified by the trained RF classifier. Results: From a total of 1316 radiomic features extracted, 810 features were excluded due to high collinearity. From a set of 506 pre-selected radiomic features, the optimal subset resulting in the best discriminative accuracy of the RF classifier consisted of 67 features. The RF classifier was well calibrated (Log Loss 0.296, ECE 0.125) and achieved an accuracy of 88% and a ROC AUC of 0.96. The SHAP analysis revealed that higher values of Wavelet-LLH First-order Mean and Median were associated with ORN of the jaw (ORNJ). Conversely, higher Exponential GLDM Dependence Entropy and lower Square First-order Kurtosis were more characteristic of normal mandibular tissue. Conclusion: This study successfully developed a CECT-based radiomics model for differentiating ORNJ from healthy mandibular tissue in HNC patients after RT. Future work will focus on the detection of subclinical ORNJ regions to guide earlier interventions.
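
The collinearity pre-selection step described in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the greedy keep-first strategy, the function names `pearson` and `drop_collinear`, and the toy feature values are all assumptions made for illustration. Only the 0.95 threshold comes from the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def drop_collinear(features, threshold=0.95):
    """Greedy filter (an assumed strategy): keep a feature only if its absolute
    correlation with every previously kept feature is at or below the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy values, not data from the study.
toy = {
    "mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "median": [1.1, 2.0, 3.1, 3.9, 5.2],   # almost a copy of "mean"
    "entropy": [2.0, 0.5, 3.5, 1.0, 2.5],  # weakly related to "mean"
}
kept = drop_collinear(toy)
print(kept)
```

In this toy case "median" is dropped because it tracks "mean" too closely, while "entropy" survives. Which member of a correlated pair is retained depends on iteration order, which is one reason this is only a sketch.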

2.
Article in English | MEDLINE | ID: mdl-39255169

ABSTRACT

Digital twin models are of high interest to Head and Neck Cancer (HNC) oncologists, who have to navigate a series of complex treatment decisions that weigh the efficacy of tumor control against toxicity and mortality risks. Evaluating individual risk profiles necessitates a deeper understanding of the interplay between different factors such as patient health, spatial tumor location and spread, and risk of subsequent toxicities that cannot be adequately captured through simple heuristics. To support clinicians in better understanding tradeoffs when deciding on treatment courses, we developed DITTO, a digital-twin and visual computing system that allows clinicians to analyze detailed risk profiles for each patient and decide on a treatment plan. DITTO relies on a sequential Deep Reinforcement Learning digital twin (DT) to deliver personalized risk of both long-term and short-term disease outcome and toxicity risk for HNC patients. Based on a participatory collaborative design alongside oncologists, we also implement several visual explainability methods to promote clinical trust and encourage healthy skepticism when using our system. We evaluate the efficacy of DITTO through quantitative evaluation of performance and case studies with qualitative feedback. Finally, we discuss design lessons for developing clinical visual XAI applications for clinical end users.

3.
medRxiv ; 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39252894

ABSTRACT

Objective: The purpose of this study was to investigate the technical feasibility of integrating the quantitative maps available from SyntheticMR into the head and neck adaptive radiation oncology workflow. While SyntheticMR has been investigated for diagnostic applications, no studies have investigated its feasibility and potential for MR-Simulation or MR-Linac workflow. Demonstrating the feasibility of using this technique will facilitate rapid quantitative biomarker extraction which can be leveraged to guide adaptive radiation therapy decision making. Approach: Two phantoms, two healthy volunteers, and one patient were scanned using SyntheticMR on the MR-Simulation and MR-Linac devices with scan times between four and six minutes. Scans of phantoms and volunteers were conducted in a test/retest protocol. The correlation between measured and reference quantitative T1, T2, and PD values was determined across clinical ranges in the phantom. Distortion was also studied. Contours of head and neck organs-at-risk (OAR) were drawn and applied to extract T1, T2, and PD. These values were plotted against each other, clusters were computed, and their separability significance was determined to evaluate SyntheticMR for differentiating tumor and normal tissue. Main Results: Lin's Concordance Correlation Coefficient between the measured and phantom reference values was above 0.98 for both the MR-Sim and MR-Linac. No significant levels of distortion were measured. The mean bias between the measured and phantom reference values across repeated scans was below 4% for T1, 7% for T2, and 4% for PD for both the MR-Sim and MR-Linac. For T1 vs. T2 and T1 vs. PD, the GTV contour exhibited perfect purity against neighboring OARs, while purity was 0.7 for T2 vs. PD. All cluster significance levels between the GTV and the nearest OAR, the tongue, using the SigClust method were p < 0.001. Significance: The technical feasibility of SyntheticMR was confirmed.
Application of this technique to the head and neck adaptive radiation therapy workflow can enrich the current quantitative biomarker landscape.

4.
Radiother Oncol ; 201: 110542, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39299574

ABSTRACT

BACKGROUND/PURPOSE: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. METHODS: We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction was performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. RESULTS: We identified 56 articles published from 2015 to 2024. 10 domains of RT applications were represented; most studies evaluated auto-contouring (50 %), followed by image-synthesis (13 %), and multiple applications simultaneously (11 %). 12 disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32 %). Imaging data was used in 91 % of studies, while only 13 % incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60 %), with Monte Carlo dropout being the most commonly implemented UQ method (32 %) followed by ensembling (16 %). 55 % of studies did not share code or datasets. CONCLUSION: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. 
Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.

5.
Article in English | MEDLINE | ID: mdl-39147208

ABSTRACT

PURPOSE: Conventional normal tissue complication probability (NTCP) models for patients with head and neck cancer are typically based on single-value variables, which, for radiation-induced xerostomia, are baseline xerostomia and mean salivary gland doses. This study aimed to improve the prediction of late xerostomia by using 3-dimensional information from radiation dose distributions, computed tomography imaging, organ-at-risk segmentations, and clinical variables with deep learning (DL). METHODS AND MATERIALS: An international cohort of 1208 patients with head and neck cancer from 2 institutes was used to train and twice validate DL models (deep convolutional neural network, EfficientNet-v2, and ResNet) with 3-dimensional dose distribution, computed tomography scan, organ-at-risk segmentations, baseline xerostomia score, sex, and age as input. The NTCP endpoint was moderate-to-severe xerostomia 12 months postradiation therapy. The DL models' prediction performance was compared with a reference model: a recently published xerostomia NTCP model that used baseline xerostomia score and mean salivary gland doses as input. Attention maps were created to visualize the focus regions of the DL predictions. Transfer learning was conducted to improve the DL model performance on the external validation set. RESULTS: All DL-based NTCP models showed better performance (area under the receiver operating characteristic curve [AUC]test, 0.78-0.79) than the reference NTCP model (AUCtest, 0.74) in the independent test. Attention maps showed that the DL model focused on the major salivary glands, particularly the stem cell-rich region of the parotid glands. DL models obtained lower external validation performance (AUCexternal, 0.63) than the reference model (AUCexternal, 0.66). After transfer learning on a small external subset, the DL model (AUCtl, external, 0.66) performed better than the reference model (AUCtl, external, 0.64). 
CONCLUSION: DL-based NTCP models performed better than the reference model when validated in data from the same institute. Improved performance in the external data set was achieved with transfer learning, demonstrating the need for multicenter training data to realize generalizable DL-based NTCP models.

6.
Clin Cancer Res ; 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39133081

ABSTRACT

BACKGROUND: Survival analyses of novel agents with long-term responders often exhibit differential hazard rates over time. Such proportional hazards violations (PHVs) may reduce the power of the log-rank test and lead to misinterpretation of trial results. We aimed to characterize the incidence and study attributes associated with PHVs in phase 3 oncology trials and assess the utility of restricted mean survival time (RMST) and MaxCombo as additional analyses. METHODS: Clinicaltrials.gov and PubMed were searched to identify 2-arm, randomized, phase 3 superiority-design cancer trials with time-to-event primary endpoints and published results through 2020. Patient-level data were reconstructed from published Kaplan-Meier curves. PHVs were assessed using Schoenfeld residuals. RESULTS: Three hundred fifty-seven Kaplan-Meier comparisons across 341 trials were analyzed, encompassing 292,831 enrolled patients. PHVs were identified in 85/357 (23.8%; 95%CI 19.7%, 28.5%) comparisons. In multivariable analysis, non-OS endpoints (odds ratio [OR] 2.16 [95%CI 1.21, 3.87]; P=.009) were associated with higher odds of PHVs, and immunotherapy comparisons (OR 1.94 [95%CI 0.98, 3.86]; P=.058) were weakly suggestive of higher odds of PHVs. Few trials with PHVs (25/85, 29.4%) pre-specified a statistical plan to account for PHVs. Fourteen trials with PHVs exhibited discordant statistical signals with RMST or MaxCombo, of which ten (71%) reported negative results. CONCLUSION: PHVs are common across therapy types, and attempts to account for PHVs in statistical design are lacking despite the potential for results exhibiting non-proportional hazards to be misinterpreted.
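
Restricted mean survival time (RMST), one of the additional analyses named in this abstract, is the area under the Kaplan-Meier curve up to a chosen horizon. The sketch below shows that calculation only. It is not the authors' code; the function names, the horizon `tau`, and the toy follow-up data are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.
    events[i] is 1 for an observed event and 0 for a censored observation."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)

# Hypothetical follow-up times in months; every patient has an event.
toy_rmst = rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
print(toy_rmst)
```

Comparing RMST between two arms (difference or ratio) does not rely on the proportional hazards assumption, which is why it is often reported alongside the log-rank test when hazards cross or diverge late.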

7.
Int J Cancer ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138841

ABSTRACT

Disease progression in clinical trials is commonly defined by radiologic measures. However, clinical progression may be more meaningful to patients, may occur even when radiologic criteria for progression are not met, and often requires a change in therapy in clinical practice. The objective of this study was to determine the utilization of clinical progression criteria within progression-based trial endpoints among phase III trials testing systemic therapies for metastatic solid tumors. The primary manuscripts and protocols of phase III trials were reviewed for whether clinical events, such as refractory pain, tumor bleeding, or neurologic compromise, could constitute a progression event. Univariable logistic regression computed odds ratios (OR) and 95% CI for associations between trial-level covariates and clinical progression. A total of 216 trials enrolling 148,190 patients were included, with publication dates from 2006 through 2020. A major change in clinical status was included in the progression criteria of 13% of trials (n = 27), most commonly as a secondary endpoint (n = 22). Only 59% of trials (n = 16) reported distinct clinical progression outcomes that constituted the composite surrogate endpoint. Compared with other disease sites, genitourinary trials were more likely to include clinical progression definitions (16/33 [48%] vs. 11/183 [6%]; OR, 14.72; 95% CI, 5.99 to 37.84; p < .0001). While major tumor-related clinical events were seldom considered as disease progression events, increased attention to clinical progression may improve the meaningfulness and clinical applicability of surrogate endpoints for patients with metastatic solid tumors.
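
The odds ratio reported for genitourinary trials can be reproduced from the counts given in this abstract (16 of 33 versus 11 of 183). The short sketch below shows the arithmetic; the function name is an assumption, and the confidence interval is not recomputed because the abstract does not state which interval method was used.

```python
def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds ratio of group A versus group B from event counts and group sizes."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b

# Counts taken from the abstract: genitourinary trials vs. all other disease sites.
gu_or = odds_ratio(16, 33, 11, 183)
print(round(gu_or, 2))
```

The odds are 16/17 for genitourinary trials and 11/172 for the remaining trials, so the ratio is (16 × 172) / (17 × 11) = 2752 / 187.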

8.
Article in English | MEDLINE | ID: mdl-39097246

ABSTRACT

BACKGROUND/OBJECTIVES: Pain is a challenging multifaceted symptom reported by most cancer patients. This systematic review aims to explore applications of artificial intelligence/machine learning (AI/ML) in predicting pain-related outcomes and pain management in cancer. METHODS: A comprehensive search of Ovid MEDLINE, EMBASE and Web of Science databases was conducted using terms: "Cancer," "Pain," "Pain Management," "Analgesics," "Artificial Intelligence," "Machine Learning," and "Neural Networks" published up to September 7, 2023. AI/ML models, their validation and performance were summarized. Quality assessment was conducted using PROBAST risk-of-bias and adherence to TRIPOD guidelines. RESULTS: Forty-four studies from 2006 to 2023 were included. Nineteen studies used AI/ML for classifying pain after cancer therapy [median AUC 0.80 (range 0.76-0.94)]. Eighteen studies focused on cancer pain research [median AUC 0.86 (range 0.50-0.99)], and 7 focused on applying AI/ML for cancer pain management [median AUC 0.71 (range 0.47-0.89)]. The median AUC of models across all studies was 0.77. Random forest models demonstrated the highest performance (median AUC 0.81), lasso models had the highest median sensitivity (1), while Support Vector Machine had the highest median specificity (0.74). Overall adherence to TRIPOD guidelines was 70.7%. Overall, high risk-of-bias (77.3%), lack of external validation (14%) and clinical application (23%) were detected. Reporting of model calibration was also missing (5%). CONCLUSION: Implementation of AI/ML tools promises significant advances in the classification, risk stratification, and management decisions for cancer pain. Further research focusing on quality improvement, model calibration, and rigorous external clinical validation in real healthcare settings is imperative for ensuring its practical and reliable application in clinical practice.

9.
Adv Radiat Oncol ; 9(8): 101533, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38993196

ABSTRACT

Purpose: Our purpose was to develop a clinically intuitive and easily understandable scoring method using statistical metrics to visually determine the quality of a radiation treatment plan. Methods and Materials: Data from 111 patients with head and neck cancer were used to establish a percentile-based scoring system for treatment plan quality evaluation on both a plan-by-plan and objective-by-objective basis. The percentile scores for each clinical objective and the overall treatment plan score were then visualized using a daisy plot. To validate our scoring method, 6 physicians were recruited to assess 60 plans, each using a scoring table consisting of a 5-point Likert scale (with scores ≥3 considered passing). Spearman correlation analysis was conducted to assess the association between increasing treatment plan percentile rank and physician rating, with Likert scores of 1 and 2 representing clinically unacceptable plans, scores of 3 and 4 representing plans needing minor edits, and a score of 5 representing clinically acceptable plans. Receiver operating characteristic curve analysis was used to assess the scoring system's ability to quantify plan quality. Results: Of the 60 plans scored by the physicians, 8 were deemed clinically acceptable; these plans had an 89.0th ± 14.5 percentile value using our scoring system. The plans needing minor edits or deemed unacceptable had more variation, with scores falling in the 62.6th ± 25.1 percentile and 35.6th ± 25.7 percentile, respectively. The estimated Spearman correlation coefficient between the physician score and treatment plan percentile was 0.53 (P < .001), indicating a moderate but statistically significant correlation. Receiver operating characteristic curve analysis demonstrated discernment between acceptable and unacceptable plan quality, with an area under the curve of 0.76.
Conclusions: Our scoring system correlates with physician ratings while providing intuitive visual feedback for identifying good treatment plan quality, thereby indicating its utility in the quality assurance process.

10.
Head Neck ; 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39073252

ABSTRACT

BACKGROUND: Treatment for dural recurrence of olfactory neuroblastoma (ONB) is not standardized. We assess the outcomes of stereotactic body radiotherapy (SBRT) in this population. METHODS: ONB patients with dural recurrences treated between 2013 and 2022 on a prospective registry were included. Tumor control, survival, and patient-reported quality of life were analyzed. RESULTS: Fourteen patients with 32 dural lesions were evaluated. Time to dural recurrence was 58.3 months. Thirty lesions (94%) were treated with SBRT to a median dose of 27 Gy in three fractions. Two patients (3 of 32 lesions; 9%) developed in-field radiographic progression, and five patients (38%) experienced progression in non-contiguous dura. Two-year local control was 85% (95% CI: 51-96%). There were no >grade 3 acute toxicities and 1 case of late grade 3 brain radionecrosis. CONCLUSION: In this study of SBRT reirradiation for ONB dural recurrence, the largest to date, high local control rates with minimal toxicity were attainable.

11.
Radiother Oncol ; 197: 110345, 2024 08.
Article in English | MEDLINE | ID: mdl-38838989

ABSTRACT

BACKGROUND AND PURPOSE: Artificial Intelligence (AI) models in radiation therapy are being developed with increasing pace. Despite this, the radiation therapy community has not widely adopted these models in clinical practice. A cohesive guideline on how to develop, report and clinically validate AI algorithms might help bridge this gap. METHODS AND MATERIALS: A Delphi process with all co-authors was followed to determine which topics should be addressed in this comprehensive guideline. Separate sections of the guideline, including Statements, were written by subgroups of the authors and discussed with the whole group at several meetings. Statements were formulated and scored as highly recommended or recommended. RESULTS: The following topics were found most relevant: Decision making, image analysis, volume segmentation, treatment planning, patient specific quality assurance of treatment delivery, adaptive treatment, outcome prediction, training, validation and testing of AI model parameters, model availability for others to verify, model quality assurance/updates and upgrades, ethics. Key references were given together with an outlook on current hurdles and possibilities to overcome these. 19 Statements were formulated. CONCLUSION: A cohesive guideline has been written which addresses main topics regarding AI in radiation therapy. It will help to guide development, as well as transparent and consistent reporting and validation of new AI tools and facilitate adoption.


Subject(s)
Artificial Intelligence; Delphi Technique; Humans; Radiotherapy Planning, Computer-Assisted/standards; Radiotherapy Planning, Computer-Assisted/methods; Radiation Oncology/standards; Radiotherapy/standards; Radiotherapy/methods; Algorithms

12.
INFORMS J Comput ; 36(2): 434-455, 2024.
Article in English | MEDLINE | ID: mdl-38883557

ABSTRACT

Chemotherapy drug administration is a complex problem that often requires expensive clinical trials to evaluate potential regimens; one way to alleviate this burden and better inform future trials is to build reliable models for drug administration. This paper presents a mixed-integer program for combination chemotherapy (utilization of multiple drugs) optimization that incorporates various important operational constraints and, besides dose and concentration limits, controls treatment toxicity based on its effect on the count of white blood cells. To address the uncertainty of tumor heterogeneity, we also propose chance constraints that guarantee reaching an operable tumor size with a high probability in a neoadjuvant setting. We present analytical results pertinent to the accuracy of the model in representing biological processes of chemotherapy and establish its potential for clinical applications through a numerical study of breast cancer.

13.
Oncologist ; 29(7): 547-550, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38824414

ABSTRACT

Missing visual elements (MVE) in Kaplan-Meier (KM) curves can misrepresent data, preclude curve reconstruction, and hamper transparency. This study evaluated KM plots of phase III oncology trials. MVE were defined as an incomplete y-axis range or missing number at risk table in a KM curve. Surrogate endpoint KM curves were additionally evaluated for complete interpretability, defined by (1) reporting the number of censored patients and (2) correspondence of the disease assessment interval with the number at risk interval. Among 641 trials enrolling 518 235 patients, 116 trials (18%) had MVE in KM curves. Industry sponsorship, larger trials, and more recently published trials were correlated with lower odds of MVE. Only 3% of trials (15 of 574) published surrogate endpoint KM plots with complete interpretability. Improvements in the quality of KM curves of phase III oncology trials, particularly for surrogate endpoints, are needed for greater interpretability, reproducibility, and transparency in oncology research.


Subject(s)
Clinical Trials, Phase III as Topic; Kaplan-Meier Estimate; Humans; Clinical Trials, Phase III as Topic/standards; Neoplasms/therapy; Medical Oncology/standards; Medical Oncology/methods

14.
Commun Med (Lond) ; 4(1): 110, 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38851837

ABSTRACT

BACKGROUND: Radiotherapy is a core treatment modality for oropharyngeal cancer (OPC), where the primary gross tumor volume (GTVp) is manually segmented with high interobserver variability. This calls for reliable and trustworthy automated tools in clinician workflow. Therefore, accurate uncertainty quantification and its downstream utilization are critical. METHODS: Here we propose uncertainty-aware deep learning for OPC GTVp segmentation, and illustrate the utility of uncertainty in multiple applications. We examine two Bayesian deep learning (BDL) models and eight uncertainty measures, and utilize a large multi-institute dataset of 292 PET/CT scans to systematically analyze our approach. RESULTS: We show that our uncertainty-based approach accurately predicts the quality of the deep learning segmentation in 86.6% of cases, identifies low performance cases for semi-automated correction, and visualizes regions of the scans where the segmentations likely fail. CONCLUSIONS: Our BDL-based analysis provides a first step toward more widespread implementation of uncertainty quantification in OPC GTVp segmentation.


Radiotherapy is used as a treatment for people with oropharyngeal cancer. It is important to distinguish the areas where cancer is present so the radiotherapy treatment can be targeted at the cancer. Computational methods based on artificial intelligence can automate this task but need to be able to distinguish areas where it is unclear whether cancer is present. In this study we compare these computational methods that are able to highlight areas where it is unclear whether or not cancer is present. Our approach accurately predicts how well these areas are distinguished by the models. Our results could be applied to improve the computational methods used during radiotherapy treatment. This could enable more targeted treatment to be used in the future, which could result in better outcomes for people with oropharyngeal cancer.

16.
Med Phys ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38896829

ABSTRACT

BACKGROUND: Head and neck (HN) gross tumor volume (GTV) auto-segmentation is challenging due to the morphological complexity and low image contrast of targets. Multi-modality images, including computed tomography (CT) and positron emission tomography (PET), are routinely used in the clinic to assist radiation oncologists in accurate GTV delineation. However, the availability of PET imaging may not always be guaranteed. PURPOSE: To develop a deep learning segmentation framework for automated GTV delineation of HN cancers using a combination of PET/CT images, while addressing the challenge of missing PET data. METHODS: Two datasets were included for this study: Dataset I: 524 (training) and 359 (testing) oropharyngeal cancer patients from different institutions with their PET/CT pairs provided by the HECKTOR Challenge; Dataset II: 90 HN patients (testing) from a local institution with their planning CT, PET/CT pairs. To handle potentially missing PET images, a model training strategy named the "Blank Channel" method was implemented. To simulate the absence of a PET image, a blank array with the same dimensions as the CT image was generated to meet the dual-channel input requirement of the deep learning model. During the model training process, the model was randomly presented with either a real PET/CT pair or a blank/CT pair. This allowed the model to learn the relationship between the CT image and the corresponding GTV delineation based on available modalities. As a result, our model had the ability to handle flexible inputs during prediction, making it suitable for cases where PET images are missing. To evaluate the performance of our proposed model, we trained it using training patients from Dataset I and tested it with Dataset II. We compared our model (Model 1) with two other models which were trained for specific modality segmentations: Model 2 trained with only CT images, and Model 3 trained with real PET/CT pairs.
The performance of the models was evaluated using quantitative metrics, including Dice similarity coefficient (DSC), mean surface distance (MSD), and 95% Hausdorff Distance (HD95). In addition, we evaluated our Model 1 and Model 3 using the 359 test cases in Dataset I. RESULTS: Our proposed model (Model 1) achieved promising results for GTV auto-segmentation using PET/CT images, with the flexibility to handle missing PET images. Specifically, when assessed with only CT images in Dataset II, Model 1 achieved DSC of 0.56 ± 0.16, MSD of 3.4 ± 2.1 mm, and HD95 of 13.9 ± 7.6 mm. When the PET images were included, the performance of our model was improved to DSC of 0.62 ± 0.14, MSD of 2.8 ± 1.7 mm, and HD95 of 10.5 ± 6.5 mm. These results are comparable to those achieved by Model 2 and Model 3, illustrating Model 1's effectiveness in utilizing flexible input modalities. Further analysis using the test dataset from Dataset I showed that Model 1 achieved an average DSC of 0.77, surpassing the overall average DSC of 0.72 among all participants in the HECKTOR Challenge. CONCLUSIONS: We successfully refined a multi-modal segmentation tool for accurate GTV delineation for HN cancer. Our method addressed the issue of missing PET images by allowing flexible data input, thereby providing a practical solution for clinical settings where access to PET imaging may be limited.
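
The "Blank Channel" input construction described in this abstract can be sketched in a few lines. This is an illustration, not the authors' implementation: the function names, the choice of zero as the blank value, the 50% blanking probability, and the use of nested lists in place of image arrays are all assumptions.

```python
import random

def blank_like(image):
    """Blank (all-zero) image with the same dimensions as the input image.
    Using zero as the blank value is an assumption."""
    return [[0.0 for _ in row] for row in image]

def make_two_channel_input(ct, pet=None, p_blank=0.5, rng=random):
    """Build a [CT, PET] two-channel input.
    If PET is missing, or with probability p_blank during training,
    the PET channel is replaced by a blank image."""
    use_blank = pet is None or rng.random() < p_blank
    second_channel = blank_like(ct) if use_blank else pet
    return [ct, second_channel]

# Hypothetical 2x2 "images" for illustration only.
ct = [[10.0, 20.0], [30.0, 40.0]]
pet = [[1.5, 0.2], [0.0, 3.1]]

training_sample = make_two_channel_input(ct, pet, p_blank=0.5)   # real or blank PET, at random
ct_only_sample = make_two_channel_input(ct)                      # PET unavailable
paired_sample = make_two_channel_input(ct, pet, p_blank=0.0)     # inference with PET present
```

Because the network always receives two channels, the same trained model can be used at prediction time whether or not a PET image exists for the patient.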

17.
JCO Clin Cancer Inform ; 8: e2300174, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38870441

ABSTRACT

PURPOSE: The quality of radiotherapy auto-segmentation training data, primarily derived from clinician observers, is of utmost importance. However, the factors influencing the quality of clinician-derived segmentations are poorly understood; our study aims to quantify these factors. METHODS: Organ at risk (OAR) and tumor-related segmentations provided by radiation oncologists from the Contouring Collaborative for Consensus in Radiation Oncology data set were used. Segmentations were derived from five disease sites: breast, sarcoma, head and neck (H&N), gynecologic (GYN), and GI. Segmentation quality was determined on a structure-by-structure basis by comparing the observer segmentations with an expert-derived consensus, which served as a reference standard benchmark. The Dice similarity coefficient (DSC) was primarily used as a metric for the comparisons. DSC was stratified into binary groups on the basis of structure-specific expert-derived interobserver variability (IOV) cutoffs. Generalized linear mixed-effects models using Bayesian estimation were used to investigate the association between demographic variables and the binarized DSC for each disease site. Variables with a highest density interval excluding zero were considered to substantially affect the outcome measure. RESULTS: Five hundred seventy-four, 110, 452, 112, and 48 segmentations were used for the breast, sarcoma, H&N, GYN, and GI cases, respectively. The median percentage of segmentations that crossed the expert DSC IOV cutoff when stratified by structure type was 55% and 31% for OARs and tumors, respectively. Regression analysis revealed that the structure being tumor-related had a substantial negative impact on binarized DSC for the breast, sarcoma, H&N, and GI cases. There were no recurring relationships between segmentation quality and demographic variables across the cases, with most variables demonstrating large standard deviations. 
CONCLUSION: Our study highlights substantial uncertainty surrounding conventionally presumed factors influencing segmentation quality relative to benchmarks.
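
The Dice similarity coefficient used in this abstract, and its binarization against a cutoff, can be shown with a small sketch. Segmentations are represented here as sets of voxel indices. The function names, the toy voxel sets, the 0.7 cutoff, and the convention that two empty segmentations score 1.0 are assumptions for illustration; the study used structure-specific expert-derived cutoffs.

```python
def dice(seg_a, seg_b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for two voxel sets."""
    if not seg_a and not seg_b:
        return 1.0  # assumed convention when both segmentations are empty
    return 2 * len(seg_a & seg_b) / (len(seg_a) + len(seg_b))

def meets_cutoff(seg, reference, cutoff):
    """Binarize the DSC against a cutoff (a hypothetical value here)."""
    return dice(seg, reference) >= cutoff

# Hypothetical voxel indices for an observer and a consensus segmentation.
observer = {1, 2, 3, 4}
consensus = {3, 4, 5, 6}
score = dice(observer, consensus)
passed = meets_cutoff(observer, consensus, cutoff=0.7)
print(score, passed)
```

Here the two sets share 2 of their 4 voxels each, giving a DSC of 0.5, which falls below the illustrative cutoff.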


Subject(s)
Bayes Theorem; Benchmarking; Radiation Oncologists; Humans; Benchmarking/methods; Female; Radiotherapy Planning, Computer-Assisted/methods; Neoplasms/epidemiology; Neoplasms/radiotherapy; Organs at Risk; Male; Radiation Oncology/standards; Radiation Oncology/methods; Demography; Observer Variation

19.
medRxiv ; 2024 May 13.
Article in English | MEDLINE | ID: mdl-38798581

ABSTRACT

Background/purpose: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, clinicians show a notable lack of trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope the existing literature on UQ in RT, identify areas for improvement, and determine future directions. Methods: We followed the PRISMA-ScR scoping review reporting guidelines. We used the population (human cancer patients), concept (utilization of AI UQ), and context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. Results: We identified 56 articles published from 2015 to 2024. Ten domains of RT applications were represented; most studies evaluated auto-contouring (50%), followed by image synthesis (13%) and multiple applications simultaneously (11%). Twelve disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32%). Imaging data were used in 91% of studies, while only 13% incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60%), with Monte Carlo dropout being the most commonly implemented UQ method (32%), followed by ensembling (16%). Overall, 55% of studies did not share code or datasets. Conclusion: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, there was a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.

20.
J Clin Oncol ; 42(16): 1922-1933, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38691822

ABSTRACT

PURPOSE: Osteoradionecrosis of the jaw (ORN) can manifest in varying severity. The aim of this study is to identify ORN risk factors and develop a novel classification to depict the severity of ORN. METHODS: Consecutive patients with head and neck cancer (HNC) treated with curative-intent intensity-modulated radiation therapy (IMRT) (≥45 Gy) from 2011 to 2017 were included. Occurrence of ORN was identified from in-house prospective dental and clinical databases and charts. A multivariable logistic regression model was used to identify risk factors and stratify patients into high-risk and low-risk groups. A novel ORN classification system was developed to depict ORN severity by modifying existing systems and incorporating expert opinion. The performance of the novel system was compared with that of 15 existing systems in their ability to identify and predict serious ORN events (jaw fracture or need for jaw resection). RESULTS: ORN was identified in 219 of 2,732 (8%) consecutive patients with HNC. Factors associated with high risk of ORN were oral cavity or oropharyngeal primary tumors, an IMRT dose ≥60 Gy, current or former smoking, and/or stage III to IV periodontal condition. The ORN rate for high-risk versus low-risk patients was 12.7% versus 3.1% (P < .001), with an AUC of 0.71. Existing ORN systems overclassified serious ORN events and failed to recognize maxillary ORN. A novel ORN classification system, ClinRad, was proposed on the basis of vertical extent of bone necrosis and presence/absence of exposed bone/fistula. This system detected serious ORN events in 5.7% of patients and statistically outperformed existing systems. CONCLUSION: We identified risk factors for ORN and proposed a novel ORN classification system on the basis of vertical extent of bone necrosis and presence/absence of exposed bone/fistula. It outperformed existing systems in depicting the seriousness of ORN and may facilitate clinical care and clinical trials.


Subject(s)
Head and Neck Neoplasms , Osteoradionecrosis , Radiotherapy, Intensity-Modulated , Humans , Osteoradionecrosis/etiology , Osteoradionecrosis/classification , Male , Head and Neck Neoplasms/radiotherapy , Female , Middle Aged , Aged , Radiotherapy, Intensity-Modulated/adverse effects , Risk Factors , Risk Assessment , Severity of Illness Index