ABSTRACT
BACKGROUND: Accurate mortality risk quantification is crucial for the management of hepatocellular carcinoma (HCC); however, most scoring systems are subjective. PURPOSE: To develop and independently validate a machine learning mortality risk quantification method for HCC patients using standard-of-care clinical data and liver radiomics on baseline magnetic resonance imaging (MRI). METHODS: This retrospective study included all patients with multiphasic contrast-enhanced MRI at the time of diagnosis treated at our institution. Patients were censored at their last date of follow-up, end-of-observation, or liver transplantation date. The data were randomly sampled into independent cohorts, with 85% for development and 15% for independent validation. An automated liver segmentation framework was adopted for radiomic feature extraction. A random survival forest combined clinical and radiomic variables to predict overall survival (OS), and performance was evaluated using Harrell's C-index. RESULTS: A total of 555 treatment-naïve HCC patients (mean age, 63.8 years ± 8.9 [standard deviation]; 118 females) with MRI at the time of diagnosis were included, of whom 287 (51.7%) died after a median time of 14.40 (interquartile range, 22.23) months; median follow-up was 32.47 (interquartile range, 61.5) months. The developed risk prediction framework required 1.11 min on average and yielded C-indices of 0.8503 and 0.8234 in the development and independent validation cohorts, respectively, outperforming conventional clinical staging systems. Predicted risk scores were significantly associated with OS (p < .00001 in both cohorts). CONCLUSIONS: Machine learning reliably, rapidly, and reproducibly predicts mortality risk in patients with hepatocellular carcinoma from data routinely acquired in clinical practice.
CLINICAL RELEVANCE STATEMENT: Precision mortality risk prediction using routinely available standard-of-care clinical data and automated MRI radiomic features could enable personalized follow-up strategies, guide management decisions, and improve clinical workflow efficiency in tumor boards. KEY POINTS: • Machine learning enables hepatocellular carcinoma mortality risk prediction using standard-of-care clinical data and automated radiomic features from multiphasic contrast-enhanced MRI. • Automated mortality risk prediction achieved state-of-the-art performances for mortality risk quantification and outperformed conventional clinical staging systems. • Patients were stratified into low, intermediate, and high-risk groups with significantly different survival times, generalizable to an independent evaluation cohort.
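The abstract above reports model performance as Harrell's C-index. As a rough illustration of what that metric measures, the following is a minimal standard-library sketch, not the authors' implementation. The function name `harrell_c_index` and the convention that a higher score means higher predicted mortality risk are assumptions made here, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up times (e.g., months)
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher means shorter expected survival
                   (assumed convention for this sketch)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed death.
        # Tied times are skipped here (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data, not taken from the study.
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risk = [0.9, 0.7, 0.4, 0.2]
    print(harrell_c_index(times, events, risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the reported values of 0.8503 and 0.8234 should be read.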
Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Machine Learning , Magnetic Resonance Imaging , Humans , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/mortality , Female , Male , Carcinoma, Hepatocellular/diagnostic imaging , Carcinoma, Hepatocellular/mortality , Middle Aged , Retrospective Studies , Prognosis , Magnetic Resonance Imaging/methods , Contrast Media , Aged , Risk Assessment/methods
ABSTRACT
OBJECTIVES: To develop and evaluate a deep convolutional neural network (DCNN) for automated liver segmentation, volumetry, and radiomic feature extraction on contrast-enhanced portal venous phase magnetic resonance imaging (MRI). MATERIALS AND METHODS: This retrospective study included hepatocellular carcinoma patients from an institutional database with portal venous MRI. After manual segmentation, the data was randomly split into independent training, validation, and internal testing sets. From a collaborating institution, de-identified scans were used for external testing. The public LiverHccSeg dataset was used for further external validation. A 3D DCNN was trained to automatically segment the liver. Segmentation accuracy was quantified by the Dice similarity coefficient (DSC) with respect to manual segmentation. A Mann-Whitney U test was used to compare the internal and external test sets. Agreement of volumetry and radiomic features was assessed using the intraclass correlation coefficient (ICC). RESULTS: In total, 470 patients met the inclusion criteria (63.9±8.2 years; 376 males) and 20 patients were used for external validation (41±12 years; 13 males). DSC segmentation accuracy of the DCNN was similarly high between the internal (0.97±0.01) and external (0.96±0.03) test sets (p=0.28) and demonstrated robust segmentation performance on public testing (0.93±0.03). Agreement of liver volumetry was satisfactory in the internal (ICC, 0.99), external (ICC, 0.97), and public (ICC, 0.85) test sets. Radiomic features demonstrated excellent agreement in the internal (mean ICC, 0.98±0.04), external (mean ICC, 0.94±0.10), and public (mean ICC, 0.91±0.09) datasets. CONCLUSION: Automated liver segmentation yields robust and generalizable segmentation performance on MRI data and can be used for volumetry and radiomic feature extraction. 
CLINICAL RELEVANCE STATEMENT: Liver volumetry, anatomic localization, and extraction of quantitative imaging biomarkers require accurate segmentation, but manual segmentation is time-consuming. A deep convolutional neural network demonstrates fast and accurate segmentation performance on T1-weighted portal venous MRI. KEY POINTS: • This deep convolutional neural network yields robust and generalizable liver segmentation performance on internal, external, and public testing data. • Automated liver volumetry demonstrated excellent agreement with manual volumetry. • Automated liver segmentations can be used for robust and reproducible radiomic feature extraction.
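Segmentation accuracy in the abstract above is quantified by the Dice similarity coefficient (DSC) against manual segmentation. A minimal sketch of that overlap measure is given below for orientation. It assumes the masks have been flattened to equal-length sequences of 0/1 voxel labels; the function name `dice_coefficient` and the choice to return 1.0 when both masks are empty are illustrative assumptions, not details from the study.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient between two binary masks.

    predicted, reference -- equal-length sequences of 0/1 voxel labels
    DSC = 2 * |P intersect R| / (|P| + |R|)
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    # Toy 1D "masks", not taken from the study.
    automated = [1, 1, 1, 0, 0, 0]
    manual = [1, 1, 0, 0, 0, 0]
    print(dice_coefficient(automated, manual))
```

The coefficient ranges from 0 (no overlap) to 1 (identical masks), so the reported values of 0.93 to 0.97 indicate near-complete overlap with the manual reference.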
Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Magnetic Resonance Imaging , Humans , Male , Magnetic Resonance Imaging/methods , Female , Middle Aged , Liver Neoplasms/diagnostic imaging , Retrospective Studies , Carcinoma, Hepatocellular/diagnostic imaging , Adult , Neural Networks, Computer , Liver/diagnostic imaging , Contrast Media , Aged , Radiomics
ABSTRACT
BACKGROUND. Posttreatment recurrence is an unpredictable complication after liver transplant for hepatocellular carcinoma (HCC) that is associated with poor survival. Biomarkers are needed to estimate recurrence risk before organ allocation. OBJECTIVE. This proof-of-concept study evaluated the use of machine learning (ML) to predict recurrence from pretreatment laboratory, clinical, and MRI data in patients with early-stage HCC initially eligible for liver transplant. METHODS. This retrospective study included 120 patients (88 men, 32 women; median age, 60.0 years) with early-stage HCC who were initially eligible for liver transplant and underwent treatment by transplant, resection, or thermal ablation between June 2005 and March 2018. Patients underwent pretreatment MRI and posttreatment imaging surveillance. Imaging features were extracted from postcontrast phases of pretreatment MRI examinations using a pretrained convolutional neural network. Pretreatment clinical characteristics (including laboratory data) and extracted imaging features were integrated to develop three ML models (clinical model, imaging model, combined model) for predicting recurrence within six time frames ranging from 1 through 6 years after treatment. Kaplan-Meier analysis with time to recurrence as the endpoint was used to assess the clinical relevance of model predictions. RESULTS. Tumor recurred in 44 of 120 (36.7%) patients during follow-up. The three models predicted recurrence with AUCs across the six time frames of 0.60-0.78 (clinical model), 0.71-0.85 (imaging model), and 0.62-0.86 (combined model). The mean AUC was higher for the imaging model than the clinical model (0.76 vs 0.68, respectively; p = .03), but the mean AUC was not significantly different between the clinical and combined models or between the imaging and combined models (p > .05).
Kaplan-Meier curves were significantly different between patients predicted to be at low risk and those predicted to be at high risk by all three models for the 2-, 3-, 4-, 5-, and 6-year time frames (p < .05). CONCLUSION. The findings suggest that ML-based models can predict recurrence before therapy allocation in patients with early-stage HCC initially eligible for liver transplant. Adding MRI data as model input improved predictive performance over clinical parameters alone. The combined model did not surpass the imaging model's performance. CLINICAL IMPACT. ML-based models applied to currently underutilized imaging features may help design more reliable criteria for organ allocation and liver transplant eligibility.
Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Male , Humans , Female , Middle Aged , Carcinoma, Hepatocellular/diagnostic imaging , Carcinoma, Hepatocellular/surgery , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/surgery , Retrospective Studies , Risk Factors , Magnetic Resonance Imaging/methods , Neoplasm Recurrence, Local/epidemiology
ABSTRACT
PURPOSE: To characterize the effects of commonly used transcatheter arterial chemoembolization (TACE) regimens on the immune response and immune checkpoint marker expression using a VX2 rabbit liver tumor model. MATERIALS AND METHODS: Twenty-four VX2 liver tumor-bearing New Zealand white rabbits were assigned to 7 groups (n = 3 per group) undergoing locoregional therapy as follows: (a) bicarbonate infusion without embolization, (b) conventional TACE (cTACE) using a water-in-oil emulsion containing doxorubicin mixed 1:2 with Lipiodol, drug-eluting embolic-TACE with either (c) idarubicin-eluting Oncozene microspheres (40 µm) or (d) doxorubicin-eluting Lumi beads (40-90 µm). For each therapy arm (b-d), a tandem set of 3 animals with additional bicarbonate infusion before TACE was added, to evaluate the effect of pH modification on the immune response. Three untreated rabbits served as controls. Tissue was harvested 24 hours after treatment, followed by digital immunohistochemistry quantification (counts/µm² ± SEM) of tumor-infiltrating cluster of differentiation 3+ T-lymphocytes, human leukocyte antigen DR type antigen-presenting cells (APCs), cytotoxic T-lymphocyte-associated protein-4 (CTLA-4), and programmed cell death protein-1 (PD-1)/PD-1 ligand (PD-L1) pathway axis expression. RESULTS: Lumi-bead TACE induced significantly more intratumoral T-cell and APC infiltration than cTACE and Oncozene-microsphere TACE. Additionally, tumors treated with Lumi-bead TACE expressed significantly higher intratumoral immune checkpoint markers compared with cTACE and Oncozene-microsphere TACE. Neoadjuvant bicarbonate demonstrated the most pronounced effect on cTACE and resulted in a significant increase in intratumoral cluster of differentiation 3+ T-cell infiltration compared with cTACE alone. CONCLUSIONS: This preclinical study revealed significant differences in evoked tumor immunogenicity depending on the choice of chemoembolic regimen for TACE.
Subject(s)
Carcinoma, Hepatocellular , Chemoembolization, Therapeutic , Liver Neoplasms , Animals , Antibiotics, Antineoplastic , Bicarbonates/therapeutic use , Carcinoma, Hepatocellular/therapy , Chemoembolization, Therapeutic/methods , Doxorubicin , Liver Neoplasms/therapy , Programmed Cell Death 1 Receptor , Rabbits
ABSTRACT
BACKGROUND AND PURPOSE: Radiomics provides a framework for automated extraction of high-dimensional feature sets from medical images. We aimed to determine radiomics signature correlates of admission clinical severity and medium-term outcome from intracerebral hemorrhage (ICH) lesions on baseline head computed tomography (CT). METHODS: We used the ATACH-2 (Antihypertensive Treatment of Acute Cerebral Hemorrhage II) trial dataset. Patients included in this analysis (n = 895) were randomly allocated to discovery (n = 448) and independent validation (n = 447) cohorts. We extracted 1130 radiomics features from hematoma lesions on baseline noncontrast head CT scans and generated radiomics signatures associated with admission Glasgow Coma Scale (GCS), admission National Institutes of Health Stroke Scale (NIHSS), and 3-month modified Rankin Scale (mRS) scores. Spearman's correlation between radiomics signatures and corresponding target variables was compared with hematoma volume. RESULTS: In the discovery cohort, radiomics signatures, compared to ICH volume, had a significantly stronger association with admission GCS (0.47 vs. 0.44, p = 0.008), admission NIHSS (0.69 vs. 0.57, p < 0.001), and 3-month mRS scores (0.44 vs. 0.32, p < 0.001). Similarly, in independent validation, radiomics signatures, compared to ICH volume, had a significantly stronger association with admission GCS (0.43 vs. 0.41, p = 0.02), NIHSS (0.64 vs. 0.56, p < 0.001), and 3-month mRS scores (0.43 vs. 0.33, p < 0.001). In multiple regression analysis adjusted for known predictors of ICH outcome, the radiomics signature was an independent predictor of 3-month mRS in both cohorts. CONCLUSIONS: Limited by the enrollment criteria of the ATACH-2 trial, we showed that radiomics features quantifying hematoma texture, density, and shape on baseline CT can provide imaging correlates for clinical presentation and 3-month outcome. 
These findings could trigger a paradigm shift where imaging biomarkers may improve current models for prognostication, risk stratification, and treatment triage of ICH patients.
Subject(s)
Cerebral Hemorrhage , Hematoma , Cerebral Hemorrhage/diagnostic imaging , Glasgow Coma Scale , Hematoma/diagnostic imaging , Humans , Prognosis , Tomography, X-Ray Computed
ABSTRACT
To develop a deep learning-based model capable of segmenting the left ventricular (LV) myocardium on native T1 maps from cardiac MRI in both long-axis and short-axis orientations. Models were trained on native myocardial T1 maps from 50 healthy volunteers and 75 patients using manual segmentation as the reference standard. Based on a U-Net architecture, we systematically optimized the model design using two different training metrics (Sørensen-Dice coefficient = DSC and Intersection-over-Union = IOU), two different activation functions (ReLU and LeakyReLU) and various numbers of training epochs. Training with the DSC metric and a ReLU activation function over 35 epochs achieved the highest overall performance (mean error in T1: 10.6 ± 17.9 ms, mean DSC: 0.88 ± 0.07). Limits of agreement between model results and ground truth were from -35.5 to +36.1 ms. This was superior to the agreement between two human raters (-34.7 to +59.1 ms). Segmentation was as accurate for long-axis views (mean error in T1: 6.77 ± 8.3 ms, mean DSC: 0.89 ± 0.03) as for short-axis images (mean error in T1: 11.6 ± 19.7 ms, mean DSC: 0.88 ± 0.08). Fully automated segmentation and quantitative analysis of native myocardial T1 maps is possible in both long-axis and short-axis orientations with very high accuracy.
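The abstract above reports limits of agreement between model-derived and reference T1 values without naming the method. Assuming these are Bland-Altman 95% limits (mean difference ± 1.96 standard deviations of the paired differences), which the source does not confirm, they could be computed along the following lines. The function name `limits_of_agreement` and the toy values are illustrative only.

```python
from statistics import mean, stdev


def limits_of_agreement(measured, reference, z=1.96):
    """Bland-Altman style limits of agreement for paired measurements.

    measured, reference -- paired values (e.g., T1 in ms per subject)
    Returns (lower_limit, bias, upper_limit), where bias is the mean
    difference and the limits are bias -/+ z * SD of the differences.
    """
    if len(measured) != len(reference):
        raise ValueError("inputs must be paired")
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)
    spread = z * stdev(diffs)  # sample standard deviation (n - 1)
    return bias - spread, bias, bias + spread


if __name__ == "__main__":
    # Toy T1 values in ms, not taken from the study.
    model_t1 = [1010, 990, 1005, 995]
    manual_t1 = [1000, 1000, 1000, 1000]
    lower, bias, upper = limits_of_agreement(model_t1, manual_t1)
    print(f"bias {bias:.1f} ms, limits {lower:.1f} to {upper:.1f} ms")
```

Narrower limits mean the two measurement methods differ less from subject to subject, which is the sense in which the model-versus-reference interval is compared with the interval between two human raters.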
Subject(s)
Deep Learning , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Male , Female , Adult , Middle Aged , Image Processing, Computer-Assisted/methods , Myocardium , Heart Ventricles/diagnostic imaging , Heart/diagnostic imaging
ABSTRACT
PURPOSE: Accurate liver segmentation is key for volumetry assessment to guide treatment decisions. Moreover, it is an important pre-processing step for cancer detection algorithms. Liver segmentation can be especially challenging in patients with cancer-related tissue changes and shape deformation. The aim of this study was to assess the ability of state-of-the-art deep learning 3D liver segmentation algorithms to generalize across all different Barcelona Clinic Liver Cancer (BCLC) liver cancer stages. METHODS: This retrospective study included patients from an institutional database who had arterial-phase T1-weighted magnetic resonance images with corresponding manual liver segmentations. The data were split 70/15/15% for training/validation/testing, with each subset proportionally balanced across BCLC stages. Two 3D convolutional neural networks were trained using identical U-net-derived architectures with equal-sized training datasets: one spanning all BCLC stages ("All-Stage-Net": AS-Net), and one limited to early and intermediate BCLC stages ("Early-Intermediate-Stage-Net": EIS-Net). Segmentation accuracy was evaluated by the Dice Similarity Coefficient (DSC) on a dataset spanning all BCLC stages and a Wilcoxon signed-rank test was used for pairwise comparisons. RESULTS: 219 subjects met the inclusion criteria (170 males, 49 females, 62.8±9.1 years) from all BCLC stages. Both networks were trained using 129 subjects: AS-Net training comprised 19, 74, 18, 8, and 10 BCLC 0, A, B, C, and D patients, respectively; EIS-Net training comprised 21, 86, and 22 BCLC 0, A, and B patients, respectively. DSCs (mean±SD) were 0.954±0.018 and 0.946±0.032 for AS-Net and EIS-Net (p<0.001), respectively. The AS-Net (0.956±0.014) significantly outperformed the EIS-Net (0.941±0.038) on advanced BCLC stages (p<0.001) and yielded similarly good segmentation performance on early and intermediate stages (AS-Net: 0.952±0.021; EIS-Net: 0.949±0.027; p = 0.107).
CONCLUSION: To ensure robust segmentation performance across cancer stages that is independent of liver shape deformation and tumor burden, it is critical to train deep learning models on heterogeneous imaging data spanning all BCLC stages.
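The methods of this abstract describe a 70/15/15% split that is proportional across BCLC stages. One way such a stage-stratified split might be produced is sketched below using only the Python standard library. This is an illustrative assumption rather than the authors' procedure: the function name `stratified_split`, the fixed seed, and the per-stage rounding rule (the test set receives whatever remains after rounding) are all choices made for this example.

```python
import random
from collections import defaultdict


def stratified_split(stages, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split subject indices into train/val/test within each stage.

    stages    -- one stage label per subject (e.g., "0", "A", "B", "C", "D")
    fractions -- target train/val/test proportions; the test set takes the
                 remainder after rounding train and val counts per stage
    Returns (train, val, test) lists of subject indices.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_stage = defaultdict(list)
    for index, stage in enumerate(stages):
        by_stage[stage].append(index)

    train, val, test = [], [], []
    for stage in sorted(by_stage):
        members = by_stage[stage]
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        train += members[:n_train]
        val += members[n_train:n_train + n_val]
        test += members[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Toy stage labels, not the study cohort.
    stages = ["0"] * 20 + ["A"] * 40 + ["B"] * 20
    train, val, test = stratified_split(stages)
    print(len(train), len(val), len(test))
```

Splitting within each stage, rather than over the pooled cohort, keeps rare stages represented in every subset, which matters when performance is later compared stage by stage.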