ABSTRACT
OBJECTIVES: Currently, the BRAF status of pediatric low-grade glioma (pLGG) patients is determined through a biopsy. We established a nomogram to predict BRAF status non-invasively using clinical and radiomic factors. Additionally, we assessed an advanced thresholding method to provide only high-confidence predictions for the molecular subtype. Finally, we tested whether radiomic features provide additional predictive information for this classification task, beyond that which is embedded in the location of the tumor. METHODS: Random forest (RF) models were trained on radiomic and clinical features both separately and together, to evaluate the utility of each feature set. Instead of using the traditional single threshold technique to convert the model outputs to class predictions, we implemented a double threshold mechanism that accounted for uncertainty. Additionally, a linear model was trained and depicted graphically as a nomogram. RESULTS: The combined RF (AUC: 0.925) outperformed the RFs trained on radiomic (AUC: 0.863) or clinical (AUC: 0.889) features alone. The linear model had a comparable AUC (0.916), despite its lower complexity. Traditional thresholding produced an accuracy of 84.5%, while the double threshold approach yielded 92.2% accuracy on the 80.7% of patients with the highest confidence predictions. CONCLUSION: Adding radiomic features to clinical features improved performance, underscoring their importance for the prediction of BRAF status. A linear model performed similarly to RF but with the added benefit that it can be visualized as a nomogram, improving the explainability of the model. The double threshold technique was able to identify uncertain predictions, enhancing the clinical utility of the model. CLINICAL RELEVANCE STATEMENT: Radiomic features and tumor location are both predictive of BRAF status in pLGG patients. We show that they contain complementary information and depict the optimal model as a nomogram, which can be used as a non-invasive alternative to biopsy. KEY POINTS:
• Radiomic features provide additional predictive information for the determination of the molecular subtype of pediatric low-grade glioma patients, beyond what is embedded in the location of the tumor, which has an established relationship with genetic status.
• An advanced thresholding method can help to distinguish cases where machine learning models have a high chance of being (in)correct, improving the utility of these models.
• A simple linear model performs similarly to a more powerful random forest model at classifying the molecular subtype of pediatric low-grade gliomas but has the added benefit that it can be converted into a nomogram, which may facilitate clinical implementation by improving the explainability of the model.
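The double threshold idea described in this abstract can be illustrated with a minimal sketch. The abstract does not report the thresholds or how they were chosen, so the cutoffs `lower=0.3` and `upper=0.7` and the function names are assumptions made purely for illustration.

```python
def double_threshold(probs, lower=0.3, upper=0.7):
    """Map predicted probabilities to 1, 0, or None (uncertain).

    The cutoffs are illustrative assumptions, not values from the study.
    """
    labels = []
    for p in probs:
        if p >= upper:
            labels.append(1)       # confident positive
        elif p <= lower:
            labels.append(0)       # confident negative
        else:
            labels.append(None)    # withheld: falls between the thresholds
    return labels


def confident_accuracy(probs, truths, lower=0.3, upper=0.7):
    """Return (accuracy on confident cases, fraction of cases kept)."""
    preds = double_threshold(probs, lower, upper)
    kept = [(p, t) for p, t in zip(preds, truths) if p is not None]
    coverage = len(kept) / len(truths)
    accuracy = sum(p == t for p, t in kept) / len(kept) if kept else float("nan")
    return accuracy, coverage
```

Reporting accuracy together with coverage mirrors the way the abstract pairs 92.2% accuracy with the 80.7% of patients who received a confident prediction.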
Subject(s)
Brain Neoplasms , Glioma , Humans , Child , Proto-Oncogene Proteins B-raf/genetics , Brain Neoplasms/pathology , Radiomics , Retrospective Studies , Glioma/pathology
ABSTRACT
INTRODUCTION: Machine learning (ML) shows promise for the automation of routine tasks related to the treatment of pediatric low-grade gliomas (pLGG) such as tumor grading, typing, and segmentation. Moreover, it has been shown that ML can identify crucial information from medical images that is otherwise currently unattainable. For example, ML appears to be capable of preoperatively identifying the underlying genetic status of pLGG. METHODS: In this chapter, we reviewed, to the best of our knowledge, all published works that have used ML techniques for the imaging-based evaluation of pLGGs. Additionally, we aimed to provide some context on what it will take to go from the exploratory studies we reviewed to clinically deployed models. RESULTS: Multiple studies have demonstrated that ML can accurately grade, type, and segment pLGGs and detect their genetic status. We compared the approaches used between the different studies and observed a high degree of variability throughout the methodologies. Standardization and cooperation between the numerous groups working on these approaches will be key to accelerating the clinical deployment of these models. CONCLUSION: The studies reviewed in this chapter detail the potential for ML techniques to transform the treatment of pLGG. However, there are still challenges that need to be overcome prior to clinical deployment.
Subject(s)
Brain Neoplasms , Glioma , Machine Learning , Magnetic Resonance Imaging , Humans , Glioma/diagnostic imaging , Glioma/genetics , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/genetics , Magnetic Resonance Imaging/methods , Child , Neoplasm Grading/methods
ABSTRACT
Purpose: To evaluate the clinical performance of a Protocol Recommendation System (PRS) for automatic protocolling of chest CT imaging requests. Materials and Methods: 322 387 consecutive historical imaging requests for chest CT between 2017 and 2022 were extracted from a radiology information system (RIS) database containing 16 associated patient information values. Records with missing fields and protocols with <100 occurrences were removed, leaving 18 protocols for training. After freetext pre-processing and applying CLEVER terminology word replacements, the features of a bag-of-words model were used to train a multinomial logistic regression classifier. Four readers protocolled 300 clinically executed protocols (CEP) based on all clinically available information. After their selection was made, the PRS and CEP were unblinded, and the readers were asked to score their agreement (1 = severe error, 2 = moderate error, 3 = disagreement but acceptable, 4 = agreement). The ground truth was established by the readers' majority selection; a judge helped break ties. For the PRS and CEP, the accuracy and clinical acceptability (scores 3 and 4) were calculated. The readers' protocolling reliability was measured using Fleiss' Kappa. Results: All four readers agreed on 203/300 protocols, three agreed on 82/300, and in 15 cases a judge was needed. PRS errors were found by the 4 readers in 1%, 2.7%, 1%, and 0.7% of the cases, respectively. The accuracy/clinical acceptability of the PRS and CEP were 84.3%/98.6% and 83.0%/99.3%, respectively. The Fleiss' Kappa for all readers and all protocols was 0.805. Conclusion: The PRS achieved similar accuracy to human performance and may help radiologists master the ever-increasing workload.
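The inter-reader reliability statistic named in this abstract, Fleiss' Kappa, can be computed from a table of rating counts. The following is a minimal sketch of the standard formula; the input layout (one row per case, one column per protocol, each cell holding the number of readers who chose that protocol) and the function name are assumptions for illustration, and no study data are reproduced.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who assigned case i to category j.
    Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement within each case.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects          # mean observed agreement
    p_e = sum(p * p for p in p_j)          # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)
```

With four readers, a row such as `[4, 0]` represents unanimous agreement and `[2, 2]` an even split; the function is undefined when every rating falls in a single category, because the chance agreement is then 1.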
ABSTRACT
Purpose: MRI-based radiomics models can predict genetic markers in pediatric low-grade glioma (pLGG). These models usually require tumour segmentation, which is tedious and time consuming if done manually. We propose a deep learning (DL) model to automate tumour segmentation and build an end-to-end radiomics-based pipeline for pLGG classification. Methods: The proposed architecture is a 2-step U-Net based DL network. The first U-Net is trained on downsampled images to locate the tumour. The second U-Net is trained using image patches centred around the located tumour to produce more refined segmentations. The segmented tumour is then fed into a radiomics-based model to predict the genetic marker of the tumour. Results: Our segmentation model achieved a correlation value of over 80% for all volume-related radiomic features and an average Dice score of .795 in test cases. Feeding the auto-segmentation results into a radiomics model resulted in a mean area under the ROC curve (AUC) of .843, with 95% confidence interval (CI) [.78-.906] and .730, with 95% CI [.671-.789] on the test set for 2-class (BRAF V600E mutation vs. BRAF fusion) and 3-class (BRAF V600E mutation, BRAF fusion, and Other) classification, respectively. This result was comparable to the AUC of .874, 95% CI [.829-.919] and .758, 95% CI [.724-.792] for the radiomics model trained and tested on the manual segmentations in 2-class and 3-class classification scenarios, respectively. Conclusion: The proposed end-to-end pipeline for pLGG segmentation and classification produced results comparable to manual segmentation when it was used for a radiomics-based genetic marker prediction model.
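The Dice score reported for the segmentation model measures overlap between a predicted mask and a reference mask. A minimal sketch of the standard definition follows; representing masks as flat sequences of 0/1 values and the function name `dice_score` are assumptions for illustration.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match by convention here.
    return 1.0 if total == 0 else 2 * intersection / total
```

For example, masks `[1, 1, 0, 0]` and `[1, 0, 1, 0]` share one foreground voxel out of two each, so the score is 2 × 1 / (2 + 2) = 0.5.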
Subject(s)
Glioma , Proto-Oncogene Proteins B-raf , Humans , Child , Genetic Markers , Glioma/pathology , Magnetic Resonance Imaging/methods , Area Under Curve
ABSTRACT
Purpose: Scoliosis is a complex spine deformity with direct functional and cosmetic impacts on the individual. The reference standard for assessing scoliosis severity is the Cobb angle, which is measured on radiographs by human specialists and is therefore subject to interobserver variability and measurement inaccuracy. These limitations may delay referral for management past the point at which progression of the deformity could still be controlled without surgery. We aimed to create a machine learning (ML) model for automatic calculation of Cobb angles on 3-foot standing spine radiographs of children and adolescents with clinical suspicion of scoliosis across 2 clinical scenarios (idiopathic scoliosis, group 1; congenital scoliosis, group 2). Methods: We retrospectively measured Cobb angles of 130 patients who had a 3-foot spine radiograph for scoliosis within a 10-year period for either idiopathic or congenital anomaly scoliosis. Cobb angles were measured both manually by radiologists and by an ML pipeline (a segmentation-based approach using an Augmented U-Net model with non-square kernels). Results: Our Augmented U-Net architecture achieved a Symmetric Mean Absolute Percentage Error (SMAPE) of 11.82% in a combined idiopathic and congenital scoliosis cohort. When stratified into idiopathic and congenital scoliosis, SMAPEs of 13.02% and 11.90% were achieved, respectively. Conclusion: The ML model used in this study shows promise for automated Cobb angle measurement in both idiopathic scoliosis and congenital scoliosis. Nevertheless, larger studies are needed to confirm these results prior to translation of this ML algorithm into clinical practice.
Subject(s)
Machine Learning , Scoliosis , Humans , Scoliosis/diagnostic imaging , Scoliosis/congenital , Adolescent , Retrospective Studies , Female , Male , Child , Spine/diagnostic imaging , Spine/abnormalities , Radiography/methods
ABSTRACT
Deep learning techniques using convolutional neural networks (CNNs) have been successfully developed for various medical image analysis tasks. However, the skills to understand and develop deep learning models are not usually taught during radiology training, which constitutes a barrier for radiologists looking to integrate machine learning (ML) into their research or clinical practice. In this work, we developed and evaluated an educational graphical user interface (GUI) to construct CNNs for teaching deep learning concepts to radiology trainees. The GUI was developed in Python using the PyQt and PyTorch frameworks. The functionality of the GUI was demonstrated through a binary classification task on a dataset of MR images of the brain. The usability of the GUI was assessed through 45-min user testing sessions with 5 neuroradiologists and neuroradiology fellows, using mean task completion times, the System Usability Scale (SUS), and a qualitative questionnaire as metrics. Task completion times were compared against those of an ML expert who performed the same tasks. After a 20-min introduction to CNNs and a walkthrough of the GUI, users were able to perform all assigned tasks successfully. There was no significant difference in task completion time compared to the ML expert. The educational GUI achieved a score of 82.5 on the SUS, suggesting that the system is highly usable. Users indicated that the GUI seems useful as an educational tool to teach ML topics to radiology trainees. An educational GUI allows interactive teaching in ML that can be incorporated into radiology training.
Subject(s)
Artificial Intelligence , Radiology , Humans , Neural Networks, Computer , Radiography , Radiology/methods , Machine Learning
ABSTRACT
Purpose: Scoliosis is a deformity of the spine, and as a measure of scoliosis severity, Cobb angle is fundamental to the diagnosis of deformities that require treatment. Conventional Cobb angle measurement and assessment is usually done manually, which is inherently time-consuming, and associated with high inter- and intra-observer variability. While there exist automatic scoliosis measurement methods, they suffer from insufficient accuracy. In this work, we propose a two-step segmentation-based deep learning architecture to automate Cobb angle measurement for scoliosis assessment using X-Ray images. Methods: The proposed architecture involves two steps. In the first step, we utilize a novel Augmented U-Net architecture to generate segmentations of vertebrae. The second step includes a non-learning-based pipeline to extract landmark coordinates from the segmented vertebrae and filter undesirable landmarks. Results: Our proposed Augmented U-Net architecture achieved a Symmetric Mean Absolute Percentage Error of 9.2%, with approximately 90% of estimations having less than 10 degrees difference compared with the AASCE-MICCAI challenge 2019 dataset ground truths. We further validated the model using an internal dataset and achieved almost the same level of performance. Conclusion: The proposed architecture is robust in providing automated spinal vertebrae segmentations and Cobb angle measurement, and is potentially generalizable to real-world clinical settings.
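The Symmetric Mean Absolute Percentage Error quoted in the results can be sketched as below. Several variants of SMAPE exist and the abstract does not state which one was used, so the pooled form shown here (sum of absolute differences divided by sum of magnitudes) is an assumption, as is the function name.

```python
def smape(predicted, reference):
    """Pooled SMAPE in percent: 100 * sum|p - r| / sum(|p| + |r|).

    Assumed variant; other definitions average per-sample ratios instead.
    """
    numerator = sum(abs(p - r) for p, r in zip(predicted, reference))
    denominator = sum(abs(p) + abs(r) for p, r in zip(predicted, reference))
    return 100.0 * numerator / denominator
```

With predicted angles of 20° and 40° against reference angles of 22° and 38°, the numerator is 4 and the denominator 120, giving roughly 3.3%.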
Subject(s)
Scoliosis , Humans , Adolescent , Scoliosis/diagnostic imaging , Spine , Observer Variation , Reproducibility of Results
ABSTRACT
Purpose: Biopsy-based assessment of H3 K27M status helps in predicting survival, but biopsy is usually limited to unusual presentations and clinical trials. We aimed to evaluate whether radiomics can serve as a prognostic marker to stratify diffuse intrinsic pontine glioma (DIPG) subsets. Methods: In this retrospective study, diagnostic brain MRIs of children with DIPG were analyzed. Radiomic features were extracted from tumor segmentations and data were split into training/testing sets (80:20). A conditional survival forest model was applied to predict progression-free survival (PFS) using training data. The trained model was validated on the test data, and concordances were calculated for PFS. Experiments were repeated 100 times using randomized versions of the respective percentage of the training/test data. Results: A total of 89 patients were identified (48 females, 53.9%). Median age at time of diagnosis was 6.64 years (range: 1-16.9 years) and median PFS was 8 months (range: 1-84 months). Molecular data were available for 26 patients (29.2%) (1 wild type, 3 K27M-H3.1, 22 K27M-H3.3). Radiomic features of FLAIR and nonenhanced T1-weighted sequences were predictive of PFS. The best FLAIR radiomics model yielded a concordance of .87 [95% CI: .86-.88] at 4 months PFS. The best T1-weighted radiomics model yielded a concordance of .82 [95% CI: .8-.84] at 4 months PFS. The best combined FLAIR + T1-weighted radiomics model yielded a concordance of .74 [95% CI: .71-.77] at 3 months PFS. The predominant predictive radiomic feature matrix was gray-level size-zone. Conclusion: MRI-based radiomics may predict progression-free survival in pediatric diffuse midline glioma/diffuse intrinsic pontine glioma.
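The concordance values reported here quantify how well predicted risk orders patients by time to progression. The abstract does not specify the estimator, and the time-specific concordances it quotes may have been computed differently; the sketch below is the basic Harrell form, offered only as an assumption-laden illustration with hypothetical argument names.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    times  : observed follow-up time per patient
    events : 1 if progression was observed, 0 if censored
    risks  : model risk score (higher means earlier progression expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i progressed before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5   # ties in risk count as half
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the .74 to .87 range in the abstract.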
Subject(s)
Brain Stem Neoplasms , Diffuse Intrinsic Pontine Glioma , Glioma , Female , Humans , Child , Progression-Free Survival , Retrospective Studies , Glioma/diagnostic imaging , Glioma/pathology , Magnetic Resonance Imaging , Brain Stem Neoplasms/diagnostic imaging
ABSTRACT
The integration of human and machine intelligence promises to profoundly change the practice of medicine. The rapidly increasing adoption of artificial intelligence (AI) solutions highlights its potential to streamline physician work and optimize clinical decision-making, also in the field of pediatric radiology. Large imaging databases are necessary for training, validating and testing these algorithms. To better promote data accessibility in multi-institutional AI-enabled radiologic research, these databases centralize the large volumes of data required to effect accurate models and outcome predictions. However, such undertakings must consider the sensitivity of patient information and therefore utilize requisite data governance measures to safeguard data privacy and security, to recognize and mitigate the effects of bias and to promote ethical use. In this article we define data stewardship and data governance, review their key considerations and applicability to radiologic research in the pediatric context, and consider the associated best practices along with the ramifications of poorly executed data governance. We summarize several adaptable data governance frameworks and describe strategies for their implementation in the form of distributed and centralized approaches to data management.
Subject(s)
Artificial Intelligence , Radiology , Algorithms , Child , Databases, Factual , Humans , Radiologists , Radiology/methods
ABSTRACT
BACKGROUND: The COVID-19 pandemic has affected the lives of people globally for over 2 years. Changes in lifestyles due to the pandemic may cause psychosocial stressors for individuals and could lead to mental health problems. To provide high-quality mental health support, health care organizations need to identify COVID-19-specific stressors and monitor the trends in the prevalence of those stressors. OBJECTIVE: This study aims to apply natural language processing (NLP) techniques to social media data to identify the psychosocial stressors during the COVID-19 pandemic and to analyze the trend in the prevalence of these stressors at different stages of the pandemic. METHODS: We obtained a data set of 9266 Reddit posts from the subreddit r/COVID19_support, from February 14, 2020, to July 19, 2021. We used the latent Dirichlet allocation (LDA) topic model to identify the topics that were mentioned on the subreddit and analyzed the trends in the prevalence of the topics. Lexicons were created for each of the topics and were used to identify the topics of each post. The prevalences of topics identified by the LDA and lexicon approaches were compared. RESULTS: The LDA model identified 6 topics from the data set: (1) "fear of coronavirus," (2) "problems related to social relationships," (3) "mental health symptoms," (4) "family problems," (5) "educational and occupational problems," and (6) "uncertainty on the development of pandemic." According to the results, there was a significant decline in the number of posts about the "fear of coronavirus" after vaccine distribution started. This suggests that the distribution of vaccines may have reduced the perceived risks of coronavirus. The prevalence of discussions on the uncertainty about the pandemic did not decline with the increase in the vaccinated population. In April 2021, when the Delta variant became prevalent in the United States, there was a significant increase in the number of posts about the uncertainty of pandemic development but no obvious effects on the topic of fear of the coronavirus. CONCLUSIONS: We created a dashboard to visualize the trend in the prevalence of topics about COVID-19-related stressors being discussed on a social media platform (Reddit). Our results provide insights into the prevalence of pandemic-related stressors during different stages of the COVID-19 pandemic. The NLP techniques leveraged in this study could also be applied to analyze event-specific stressors in the future.
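The lexicon approach mentioned in the methods, in which a post is assigned every topic whose lexicon it matches, can be sketched as follows. The study's actual lexicons are not given in the abstract; the word lists, topic subset, and function names below are hypothetical and exist only to make the example run.

```python
import re
from collections import Counter

# Hypothetical lexicons for two of the six topics; not the study's word lists.
LEXICONS = {
    "fear of coronavirus": {"infected", "symptoms", "scared"},
    "family problems": {"parents", "mom", "dad"},
}


def tag_post(text, lexicons=LEXICONS):
    """Return the sorted list of topics whose lexicon overlaps the post."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(topic for topic, words in lexicons.items() if tokens & words)


def topic_prevalence(posts, lexicons=LEXICONS):
    """Fraction of posts that mention each topic at least once."""
    counts = Counter(topic for post in posts for topic in tag_post(post, lexicons))
    return {topic: counts[topic] / len(posts) for topic in lexicons}
```

Computing `topic_prevalence` separately for posts grouped by month would yield the kind of trend the abstract describes, for instance around the start of vaccine distribution.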
Subject(s)
COVID-19 , Latent Class Analysis , Natural Language Processing , Pandemics , Social Media , Stress, Psychological , COVID-19/epidemiology , Humans , Mental Health/statistics & numerical data , Prevalence , SARS-CoV-2 , Stress, Psychological/epidemiology , United States/epidemiology
ABSTRACT
OBJECTIVE: To differentiate combined hepatocellular cholangiocarcinoma (cHCC-CC) from cholangiocarcinoma (CC) and hepatocellular carcinoma (HCC) using machine learning on MRI and CT radiomics features. METHODS: This retrospective study included 85 patients aged 32 to 86 years with 86 histopathology-proven liver cancers: 24 cHCC-CC, 24 CC, and 38 HCC who had MRI and CT between 2004 and 2018. Initial CT reports and morphological evaluation of MRI features were used to assess the performance of the radiologists' reads. Following tumor segmentation, 1419 radiomics features were extracted using the PyRadiomics library and reduced to 20 principal components by principal component analysis. A support vector machine classifier was used to evaluate MRI and CT radiomics features for the prediction of cHCC-CC vs. non-cHCC-CC and HCC vs. non-HCC. Histopathology was the reference standard for all tumors. RESULTS: Radiomics MRI features demonstrated the best performance for differentiation of cHCC-CC from non-cHCC-CC with the highest AUC of 0.77 (SD 0.19) while CT was of limited value. Contrast-enhanced MRI phases and pre-contrast and portal-phase CT showed excellent performance for the differentiation of HCC from non-HCC (AUC of 0.79 (SD 0.07) to 0.81 (SD 0.13) for MRI and AUC of 0.81 (SD 0.06) and 0.71 (SD 0.15) for CT phases, respectively). The rate of misdiagnosis of cHCC-CC as HCC or CC in the radiologists' reads was 69% for CT and 58% for MRI. CONCLUSIONS: Our results demonstrate promising predictive performance of MRI and CT radiomics features using machine learning analysis for differentiation of cHCC-CC from HCC and CC with potential implications for treatment decisions. KEY POINTS:
• Retrospective study demonstrated promising predictive performance of MRI radiomics features in the differentiation of cHCC-CC from HCC and CC and of CT radiomics features in the differentiation of HCC from cHCC-CC and CC.
• With future validation, radiomics analysis has the potential to inform current clinical practice for the pre-operative diagnosis of cHCC-CC and to enable optimal treatment decisions regarding liver resection and transplantation.
Subject(s)
Bile Duct Neoplasms , Carcinoma, Hepatocellular , Cholangiocarcinoma , Liver Neoplasms , Adult , Aged , Aged, 80 and over , Bile Duct Neoplasms/diagnostic imaging , Bile Ducts, Intrahepatic , Carcinoma, Hepatocellular/diagnostic imaging , Cholangiocarcinoma/diagnostic imaging , Humans , Liver Neoplasms/diagnostic imaging , Machine Learning , Middle Aged , Retrospective Studies
ABSTRACT
OBJECTIVES: Skeletal muscle mass is a prognostic factor in pancreatic ductal adenocarcinoma (PDAC). However, it remains unclear whether changes in body composition provide an incremental prognostic value to established risk factors, especially the Response Evaluation Criteria in Solid Tumors version 1.1 (RECISTv1.1). The aim of this study was to determine the prognostic value of CT-quantified body composition changes in patients with unresectable PDAC starting chemotherapy. METHODS: We retrospectively evaluated 105 patients with unresectable (locally advanced or metastatic) PDAC treated with FOLFIRINOX (n = 64) or gemcitabine-based (n = 41) first-line chemotherapy within a multicenter prospective trial. Changes (Δ) in skeletal muscle index (SMI), subcutaneous (SATI), and visceral adipose tissue index (VATI) between pre-chemotherapy and first follow-up CT were assessed. Cox regression models and covariate-adjusted survival curves were used to identify predictors of overall survival (OS). RESULTS: At multivariable analysis, adjusting for RECISTv1.1-response at first follow-up, ΔSMI was prognostic for OS with a hazard ratio (HR) of 1.2 (95% CI: 1.08-1.33, p = 0.001). No significant association with OS was observed for ΔSATI (HR: 1, 95% CI: 0.97-1.04, p = 0.88) and ΔVATI (HR: 1.01, 95% CI: 0.99-1.04, p = 0.33). At an optimal cutoff of 2.8 cm²/m² per 30 days, the median survival of patients with high versus low ΔSMI was 143 versus 233 days (p < 0.001). CONCLUSIONS: Patients with a lower rate of skeletal muscle loss at first follow-up demonstrated improved survival for unresectable PDAC, regardless of their RECISTv1.1-category. Assessing ΔSMI at the first follow-up CT may be useful for prognostication, in addition to routine radiological assessment. KEY POINTS:
• In patients with unresectable pancreatic ductal adenocarcinoma, change of skeletal muscle index (ΔSMI) in the early phase of chemotherapy is prognostic for overall survival, even after adjusting for Response Evaluation Criteria in Solid Tumors version 1.1 (RECISTv1.1) assessment at first follow-up.
• Changes in adipose tissue compartments at first follow-up demonstrated no significant association with overall survival.
• Integrating ΔSMI into routine radiological assessment may improve prognostic stratification and impact treatment decision-making at the first follow-up.
Subject(s)
Pancreatic Neoplasms , Sarcopenia , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Body Composition , Humans , Muscle, Skeletal/diagnostic imaging , Muscle, Skeletal/pathology , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/drug therapy , Pancreatic Neoplasms/pathology , Prognosis , Prospective Studies , Retrospective Studies , Sarcopenia/pathology , Tomography, X-Ray Computed
ABSTRACT
PURPOSE: Artificial intelligence (AI) is playing an ever-increasing role in Neuroradiology. METHODS: When designing AI-based research in neuroradiology and appreciating the literature, it is important to understand the fundamental principles of AI. Training, validation, and test datasets must be defined and set apart as priorities. External validation and testing datasets are preferable, when feasible. The specific type of learning process (supervised vs. unsupervised) and the machine learning model also require definition. Deep learning (DL) is an AI-based approach that is modelled on the structure of neurons of the brain; convolutional neural networks (CNN) are a commonly used example in neuroradiology. RESULTS: Radiomics is a frequently used approach in which a multitude of imaging features are extracted from a region of interest and subsequently reduced and selected to convey diagnostic or prognostic information. Deep radiomics uses CNNs to directly extract features and obviate the need for predefined features. CONCLUSION: Common limitations and pitfalls in AI-based research in neuroradiology are limited sample sizes ("small-n-large-p problem"), selection bias, as well as overfitting and underfitting.
Subject(s)
Artificial Intelligence , Deep Learning , Humans , Machine Learning , Neural Networks, Computer , Prognosis
ABSTRACT
Data augmentation refers to a group of techniques whose goal is to counter the limited amount of available data, improving model generalization and pushing the sample distribution toward the true distribution. While different augmentation strategies and their combinations have been investigated for various computer vision tasks in the context of deep learning, work specific to medical imaging is rare, and to the best of our knowledge there has been no dedicated work exploring the effects of various augmentation methods on the performance of deep learning models in prostate cancer detection. In this work, we statically applied the five most frequently used augmentation techniques (random rotation, horizontal flip, vertical flip, random crop, and translation) separately to a prostate diffusion-weighted magnetic resonance imaging training dataset of 217 patients and evaluated the effect of each method on the accuracy of prostate cancer detection. The augmentation algorithms were applied independently to each data channel, and a shallow as well as a deep convolutional neural network (CNN) was trained on the five augmented sets separately. We used the area under the receiver operating characteristic (ROC) curve (AUC) to evaluate the performance of the trained CNNs on a separate test set of 95 patients, using a validation set of 102 patients for fine-tuning. The shallow network outperformed the deep network, with the best 2D slice-based AUC of 0.85 obtained by the rotation method.
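Three of the five augmentations named in this abstract (horizontal flip, vertical flip, and translation) can be sketched on a small 2D image to show what applying a transform independently to each channel means. The nested-list image representation, the zero fill value, and all function names are assumptions for illustration; the study's implementation is not described in the abstract.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Reverse the order of the rows."""
    return [row[:] for row in image[::-1]]


def translate(image, dx, fill=0):
    """Shift columns right by dx (left if negative), padding with `fill`."""
    shifted = []
    for row in image:
        if dx >= 0:
            shifted.append([fill] * dx + row[: len(row) - dx])
        else:
            shifted.append(row[-dx:] + [fill] * (-dx))
    return shifted


def augment_channels(channels, transform):
    """Apply one transform independently to every channel of a sample."""
    return [transform(channel) for channel in channels]
```

For a two-channel sample, `augment_channels(sample, horizontal_flip)` returns a new sample in which each channel has been mirrored, leaving the original untouched so that it can also stay in the training set.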
Subject(s)
Neural Networks, Computer , Prostatic Neoplasms , Algorithms , Diffusion Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging , Male , Prostatic Neoplasms/diagnostic imaging
ABSTRACT
BACKGROUND: Radiomic features in pancreatic ductal adenocarcinoma (PDAC) often lack validation in independent test sets or are limited to early or late stage disease. Given the lethal nature of PDAC, it is possible that there are similarities in radiomic features of both early and advanced disease reflective of aggressive biology. PURPOSE: To assess the performance of prognostic radiomic features previously published in patients with resectable PDAC in a test set of patients with unresectable PDAC undergoing chemotherapy. METHODS: The pre-treatment CT scans of 108 patients enrolled in a prospective chemotherapy trial were used as a test cohort for 2 previously published prognostic radiomic features in resectable PDAC (Sum Entropy and Cluster Tendency with square-root filter [Sqrt]). We assessed the performance of these 2 radiomic features for the prediction of overall survival (OS) and time to progression (TTP) using Cox proportional-hazard models. RESULTS: Sqrt Cluster Tendency (for primary pancreatic tumor plus local nodes) was significantly associated with outcome, with a hazard ratio (HR) of 1.27 (confidence interval [CI]: 1.01-1.6, P-value = 0.039) for OS and an HR of 1.25 (CI: 1.00-1.55, P-value = 0.047) for TTP. Sum Entropy was not associated with outcomes. Sqrt Cluster Tendency remained significant in multivariate analysis. CONCLUSION: The CT radiomic feature Sqrt Cluster Tendency, previously demonstrated to be prognostic in resectable PDAC, remained a significant prognostic factor for OS and TTP in a test set of unresectable PDAC patients. This radiomic feature warrants further investigation to understand its biologic correlates and CT applicability in PDAC patients.
Subject(s)
Adenocarcinoma/diagnostic imaging , Adenocarcinoma/drug therapy , Carcinoma, Pancreatic Ductal/diagnostic imaging , Carcinoma, Pancreatic Ductal/drug therapy , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/drug therapy , Tomography, X-Ray Computed/methods , Aged , Cohort Studies , Female , Humans , Male , Middle Aged , Pancreas/diagnostic imaging , Prognosis , Reproducibility of Results , Retrospective Studies
ABSTRACT
PURPOSE: We sought to develop a triage strategy to reduce negative and indeterminate multiparametric magnetic resonance imaging scans in patients at risk for prostate cancer. MATERIALS AND METHODS: In this retrospective study we evaluated 865 patients with no prior prostate cancer diagnosis who underwent prostate multiparametric magnetic resonance imaging between 2009 and 2017. Age, prostate volume, prostate specific antigen and prostate specific antigen density were assessed as predictors of positive multiparametric magnetic resonance imaging, defined as PI-RADS™ (Prostate Imaging Reporting and Data System) version 2/Likert score 4 or greater. The cohort was split into a training cohort of 605 patients and a validation cohort of 260. The optimal threshold to rule out positive multiparametric magnetic resonance imaging was chosen to achieve a negative predictive value greater than 90%. RESULTS: All clinical variables were significant predictors of positive multiparametric magnetic resonance imaging (p < 0.05). Prostate specific antigen density outperformed other parameters in diagnostic accuracy and did not significantly differ compared to a multivariate model (AUC = 0.74 vs 0.75). At prostate specific antigen density greater than 0.078 ng/ml², sensitivity, specificity, positive and negative predictive values were 94%, 29%, 22% and 95%, respectively, resulting in 25% fewer scans (64 of 260). In the multivariate model, sensitivity, specificity, positive and negative predictive values were 85%, 32%, 22% and 91%, respectively, resulting in 29% fewer scans (75 of 260). Biopsies in men who would not have undergone multiparametric magnetic resonance imaging according to our proposed strategies revealed 2 clinically significant prostate cancers using prostate specific antigen density and 1 using the multivariate model.
CONCLUSIONS: In patients at risk for prostate cancer, applying a multivariate prediction model or a prostate specific antigen density cutoff of 0.078 ng/ml² resulted in 25% to 29% fewer multiparametric magnetic resonance imaging scans performed while missing only a minimal number of clinically significant prostate cancers. Further prospective validation is required.
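As a rough illustration of the triage rule described in this abstract, the Python sketch below computes prostate specific antigen density and derives the reported diagnostic metrics from a 2x2 table. The function names are hypothetical. The counts in the usage lines are not reported in the abstract: they are assumed values chosen only because they reproduce the rounded percentages for the 260-patient validation cohort.

```python
def psa_density(psa_ng_per_ml: float, prostate_volume_ml: float) -> float:
    """PSA density: serum PSA divided by prostate volume."""
    if prostate_volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_per_ml / prostate_volume_ml


def needs_mri(psad: float, cutoff: float = 0.078) -> bool:
    """Triage rule from the abstract: scan only when PSA density exceeds the cutoff."""
    return psad > cutoff


def triage_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic metrics for a rule-out test, plus the share of scans avoided.

    A 'negative' triage result (tn + fn) means the scan would be skipped.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "scans_avoided": (tn + fn) / total,
    }


# ASSUMED counts (not from the abstract), consistent with 64 of 260 scans avoided.
metrics = triage_metrics(tp=44, fp=152, fn=3, tn=61)
rounded = {name: round(100 * value) for name, value in metrics.items()}
```

With these assumed counts, the rounded values match the percentages stated for the density cutoff. Any other 2x2 table can be passed in the same way.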
Subject(s)
Kallikreins/blood , Multiparametric Magnetic Resonance Imaging/statistics & numerical data , Prostate-Specific Antigen/blood , Prostatic Neoplasms/blood , Prostatic Neoplasms/diagnostic imaging , Unnecessary Procedures/statistics & numerical data , Adult , Aged , Aged, 80 and over , Humans , Male , Middle Aged , Predictive Value of Tests , Retrospective Studies , Tumor Burden
ABSTRACT
OBJECTIVES: To benchmark the performance of a calibrated 3D convolutional neural network (CNN) applied to multiparametric MRI (mpMRI) for risk assessment of clinically significant prostate cancer (csPCa) using decision curve analysis (DCA). METHODS: We retrospectively analyzed 499 patients who had positive mpMRI (PI-RADSv2 ≥ 3) and MRI-targeted biopsy. The training cohort comprised 449 men, including a calibration set of 50 men. Biopsy decision strategies included using risk estimates from the CNN (original and calibrated), to perform biopsy in men with PI-RADSv2 ≥ 4 only, or additionally in men with PI-RADSv2 3 and PSA density (PSAd) ≥ 0.15 ng/ml/ml. Discrimination, calibration and clinical usefulness in the unseen test cohort (n = 50) were assessed using C-statistic, calibration plots and DCA, respectively. RESULTS: The calibrated CNN achieved moderate calibration (Hosmer-Lemeshow calibration test, p = 0.41) and good discrimination (C = 0.85). DCA revealed consistently higher net benefit and net reduction in biopsies for the calibrated CNN compared with the original CNN, PI-RADSv2 ≥ 4 and the combined strategy of PI-RADSv2 and PSAd. Original CNN predictions were severely miscalibrated (p < 0.0001) resulting in net harm compared with a 'biopsy all' patients strategy. At-risk thresholds ≥ 10% using the calibrated CNN and the combined strategy reduced the number of biopsies by an estimated 201 and 55 men, respectively, per 1000 men at risk, without missing csPCa, while original CNN and PI-RADSv2 ≥ 4 could not achieve a net reduction in biopsies. CONCLUSIONS: DCA revealed that our calibrated 3D-CNN resulted in fewer unnecessary biopsies compared with using PI-RADSv2 alone or in combination with PSAd. CNN calibration is important in achieving clinical utility. KEY POINTS: • A 3D deep learning model applied to multiparametric MRI may help to prevent unnecessary prostate biopsies in patients eligible for MRI-targeted biopsy.
• Owing to miscalibration, original risk estimates by the deep learning model require prior calibration to enable clinical utility. • Decision curve analysis confirmed a net benefit of using our calibrated deep learning model for biopsy decisions compared with alternative strategies, including PI-RADSv2 alone and in combination with prostate-specific antigen density.
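The quantities used in decision curve analysis can be sketched in a few lines of Python. The block below uses the standard definition of net benefit (true positives per patient minus false positives per patient, weighted by the odds of the risk threshold) and the usual conversion to a net reduction in interventions relative to a biopsy-all strategy. It is not the authors' code: the function names are hypothetical and the ten-patient data set is invented purely for illustration.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of biopsying every patient whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight of a false positive
    return tp / n - (fp / n) * odds


def net_benefit_biopsy_all(y_true, threshold):
    """Net benefit of the reference strategy in which every patient is biopsied."""
    n = len(y_true)
    prevalence = sum(y_true) / n
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds


def biopsies_avoided_per_1000(y_true, risk, threshold):
    """Net reduction in biopsies per 1000 patients versus biopsy-all."""
    odds = threshold / (1.0 - threshold)
    gain = net_benefit(y_true, risk, threshold) - net_benefit_biopsy_all(y_true, threshold)
    return 1000.0 * gain / odds


# HYPOTHETICAL data: 1 = clinically significant cancer on biopsy, 0 = none.
outcomes = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
risks = [0.90, 0.80, 0.30, 0.05, 0.05, 0.02, 0.01, 0.03, 0.04, 0.06]
avoided = biopsies_avoided_per_1000(outcomes, risks, threshold=0.10)
```

In this toy cohort, a 10% threshold spares seven of ten patients a biopsy without missing a cancer, which corresponds to a net reduction of about 700 per 1000. A miscalibrated model shifts the predicted risks relative to the threshold, which is why calibration changes the net benefit even when discrimination is unchanged.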
Subject(s)
Biopsy/methods , Deep Learning , Magnetic Resonance Imaging , Prostatic Neoplasms/diagnostic imaging , Risk Assessment/methods , Algorithms , Benchmarking , Calibration , Humans , Image Processing, Computer-Assisted , Machine Learning , Male , Normal Distribution , Observer Variation , Prostate-Specific Antigen/blood , Prostatic Neoplasms/pathology , Retrospective Studies
ABSTRACT
BACKGROUND: The Cox proportional hazard (CPH) model is commonly used in clinical research for survival analysis. In quantitative medical imaging (radiomics) studies, CPH plays an important role in feature reduction and modeling. However, the underlying linear assumption of the CPH model limits its prognostic performance. In this work, using transfer learning, a convolutional neural network (CNN) based survival model was built and tested on preoperative CT images of resectable Pancreatic Ductal Adenocarcinoma (PDAC) patients. RESULTS: The proposed CNN-based survival model outperformed the traditional CPH-based radiomics approach in terms of concordance index and index of prediction accuracy, providing a better fit for patients' survival patterns. CONCLUSIONS: The proposed CNN-based survival model outperforms the CPH-based radiomics pipeline in PDAC prognosis. This approach offers a better fit for survival patterns based on CT images and overcomes the limitations of conventional survival models.
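The concordance index used to compare the two survival models can be illustrated with a minimal Python sketch. It follows the common pairwise (Harrell-style) definition: among patient pairs whose ordering in time is known, it counts the fraction in which the patient with the shorter survival received the higher predicted risk. The function name and the example values are hypothetical, pairs with tied survival times are simply skipped here as a simplifying assumption, and nothing in this block comes from the study itself.

```python
def concordance_index(times, events, risks):
    """Pairwise concordance between predicted risk and observed survival.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : model output, higher meaning shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# HYPOTHETICAL follow-up data (months) and model risk scores.
c_perfect = concordance_index([5, 10, 15], [1, 1, 0], [0.9, 0.5, 0.1])
c_reversed = concordance_index([5, 10, 15], [1, 1, 0], [0.1, 0.5, 0.9])
```

A value of 1.0 indicates perfect ranking, 0.5 is what constant or random predictions give, and 0.0 indicates a fully reversed ranking.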
Subject(s)
Carcinoma, Pancreatic Ductal/diagnostic imaging , Carcinoma, Pancreatic Ductal/mortality , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/mortality , Humans , Neural Networks, Computer , Prognosis , Proportional Hazards Models , Survival Analysis , Tomography, X-Ray Computed
ABSTRACT
Prostate cancer is the most commonly diagnosed cancer in North American men; however, prognosis is relatively good given early diagnosis. This motivates the need for fast and reliable prostate cancer sensing. Diffusion weighted imaging (DWI) has gained traction in recent years as a fast non-invasive approach to cancer sensing. The most commonly used DWI sensing modality currently is apparent diffusion coefficient (ADC) imaging, with the recently introduced computed high-b value diffusion weighted imaging (CHB-DWI) showing considerable promise for cancer sensing. In this study, we investigate the efficacy of ADC and CHB-DWI sensing modalities when applied to zone-level prostate cancer sensing by introducing several radiomics driven zone-level prostate cancer sensing strategies geared around hand-engineered radiomic sequences from DWI sensing (which we term Zone-X sensing strategies). Furthermore, we also propose Zone-DR, a discovery radiomics approach based on zone-level deep radiomic sequencer discovery that discovers radiomic sequences directly for radiomics driven sensing. Experimental results using 12,466 pathology-verified zones obtained through the different DWI sensing modalities of 101 patients showed that: (i) the introduced Zone-X and Zone-DR radiomics driven sensing strategies significantly outperformed the traditional clinical heuristics driven strategy in terms of AUC, (ii) the introduced Zone-DR and Zone-SVM strategies achieved the highest sensitivity and specificity, respectively, for ADC amongst the tested radiomics driven strategies, (iii) the introduced Zone-DR and Zone-LR strategies achieved the highest sensitivities for CHB-DWI amongst the tested radiomics driven strategies, and (iv) the introduced Zone-DR, Zone-LR, and Zone-SVM strategies achieved the highest specificities for CHB-DWI amongst the tested radiomics driven strategies.
Furthermore, the results showed that the trade-off between sensitivity and specificity can be optimized for the particular clinical scenario in which radiomics driven DWI prostate cancer sensing strategies are to be employed, such as clinical screening versus surgical planning. Finally, we investigate the critical regions within sensing data that led to a given radiomic sequence generated by a Zone-DR sequencer, using an explainability method to gain a deeper understanding of the biomarkers important for zone-level cancer sensing.
Subject(s)
Diffusion Magnetic Resonance Imaging , Prostate/diagnostic imaging , Prostatic Neoplasms/diagnostic imaging , Algorithms , Area Under Curve , Decision Trees , Humans , Image Interpretation, Computer-Assisted/methods , Image Processing, Computer-Assisted , Male , Regression Analysis , Sensitivity and Specificity , Support Vector Machine
ABSTRACT
Purpose To determine the increase in clinically significant cancer detection in the prostate with increasing number of core samples obtained by using cognitive MRI-targeted transrectal US biopsy. Materials and Methods This retrospective cross-sectional study included 330 consecutive patients (mean age, 64.3 years; range, 42-84 years) who underwent multiparametric prostate MRI from March 2012 to July 2017 and had an index lesion that subsequently underwent cognitive MRI-targeted biopsy using transrectal US with at least five core samples (which were sequentially labeled) per lesion. The detection rate of clinically significant cancer was calculated on sequential biopsy cores, comparing the first core alone versus three cores versus five cores per target. Clinically significant cancer was defined as International Society of Urological Pathology Grade Group 2 or higher. Results Increasing the number of biopsy core samples from one to three per target and from three to five per target increased the detection rate of clinically significant cancer by 6.4% (21 of 330) and 2.4% (eight of 330), respectively. The target yield for clinically significant cancer was 26% (87 of 330), 33% (108 of 330), and 35% (116 of 330) for one, three, and five cores, respectively. Subgroup analysis showed no significant difference in upgrade rates as a function of multiparametric MRI lesion size (P = .53-.59) or location (P = .28-.89). Conclusion More clinically significant prostate cancers are detected when increasing the number of core biopsy samples per index lesion from one to three and from three to five (6.4% and 2.4%, respectively) when performing cognitive MRI-targeted transrectal US biopsy. © RSNA, 2019 See also the editorial by Oto in this issue.
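The arithmetic behind the reported yields can be checked with a short Python sketch. The counts (87, 108 and 116 of 330 targets) are taken from the abstract; the function name and the output layout are hypothetical.

```python
def detection_summary(detected_by_cores: dict, n_targets: int) -> list:
    """Cumulative detection yield and incremental gain for each core count."""
    rows = []
    previous = None
    for cores in sorted(detected_by_cores):
        detected = detected_by_cores[cores]
        # Gain is the extra share of targets detected relative to the previous step.
        gain = None if previous is None else (detected - previous) / n_targets
        rows.append({"cores": cores, "yield": detected / n_targets, "gain": gain})
        previous = detected
    return rows


# Counts reported in the abstract: cancers detected with 1, 3 and 5 cores per target.
rows = detection_summary({1: 87, 3: 108, 5: 116}, n_targets=330)
```

The yields round to 26%, 33% and 35%, and the gains round to 6.4% (21 of 330) and 2.4% (8 of 330), in agreement with the figures stated in the Results.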