1.
Eur Radiol ; 34(4): 2772-2781, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37803212

ABSTRACT

OBJECTIVES: Currently, the BRAF status of pediatric low-grade glioma (pLGG) patients is determined through a biopsy. We established a nomogram to predict BRAF status non-invasively using clinical and radiomic factors. Additionally, we assessed an advanced thresholding method to provide only high-confidence predictions for the molecular subtype. Finally, we tested whether radiomic features provide additional predictive information for this classification task, beyond that which is embedded in the location of the tumor. METHODS: Random forest (RF) models were trained on radiomic and clinical features both separately and together, to evaluate the utility of each feature set. Instead of using the traditional single threshold technique to convert the model outputs to class predictions, we implemented a double threshold mechanism that accounted for uncertainty. Additionally, a linear model was trained and depicted graphically as a nomogram. RESULTS: The combined RF (AUC: 0.925) outperformed the RFs trained on radiomic (AUC: 0.863) or clinical (AUC: 0.889) features alone. The linear model had a comparable AUC (0.916), despite its lower complexity. Traditional thresholding produced an accuracy of 84.5%, while the double threshold approach yielded 92.2% accuracy on the 80.7% of patients with the highest confidence predictions. CONCLUSION: Models that included radiomic features outperformed, underscoring their importance for the prediction of BRAF status. A linear model performed similarly to RF but with the added benefit that it can be visualized as a nomogram, improving the explainability of the model. The double threshold technique was able to identify uncertain predictions, enhancing the clinical utility of the model. CLINICAL RELEVANCE STATEMENT: Radiomic features and tumor location are both predictive of BRAF status in pLGG patients. 
We show that they contain complementary information and depict the optimal model as a nomogram, which can be used as a non-invasive alternative to biopsy. KEY POINTS: • Radiomic features provide additional predictive information for the determination of the molecular subtype of pediatric low-grade glioma patients, beyond what is embedded in the location of the tumor, which has an established relationship with genetic status. • An advanced thresholding method can help to distinguish cases where machine learning models have a high chance of being (in)correct, improving the utility of these models. • A simple linear model performs similarly to a more powerful random forest model at classifying the molecular subtype of pediatric low-grade gliomas but has the added benefit that it can be converted into a nomogram, which may facilitate clinical implementation by improving the explainability of the model.
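The double threshold idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the cut-offs 0.3 and 0.7 are assumptions chosen for the example, and the abstract does not say how the published thresholds were selected.

```python
from typing import List, Optional, Tuple


def double_threshold(probs: List[float], low: float = 0.3,
                     high: float = 0.7) -> List[Optional[int]]:
    """Map predicted probabilities to 1, 0, or None (too uncertain to call).

    `low` and `high` are hypothetical cut-offs used only for illustration.
    """
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("require 0 <= low <= high <= 1")
    preds: List[Optional[int]] = []
    for p in probs:
        if p >= high:
            preds.append(1)       # confident positive
        elif p <= low:
            preds.append(0)       # confident negative
        else:
            preds.append(None)    # withheld: falls between the thresholds
    return preds


def coverage_and_accuracy(preds: List[Optional[int]],
                          labels: List[int]) -> Tuple[float, float]:
    """Return (fraction of cases with a prediction, accuracy on those cases)."""
    kept = [(p, y) for p, y in zip(preds, labels) if p is not None]
    coverage = len(kept) / len(labels)
    accuracy = sum(p == y for p, y in kept) / len(kept) if kept else float("nan")
    return coverage, accuracy
```

Widening the gap between the two cut-offs withholds more cases and usually raises accuracy on the remainder, which is the trade-off the abstract reports (92.2% accuracy on 80.7% of patients versus 84.5% on all patients).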


Subject(s)
Brain Neoplasms , Glioma , Humans , Child , Proto-Oncogene Proteins B-raf/genetics , Brain Neoplasms/pathology , Radiomics , Retrospective Studies , Glioma/pathology
2.
Can Assoc Radiol J ; : 8465371241231577, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38538619

ABSTRACT

Purpose: Scoliosis is a complex spine deformity with direct functional and cosmetic impacts on the individual. The reference standard for assessing scoliosis severity is the Cobb angle, which is measured on radiographs by human specialists and is therefore subject to interobserver variability and measurement inaccuracy. These limitations may delay referral for management past the point at which progression of the deformity could still be halted without surgery. We aimed to create a machine learning (ML) model for automatic calculation of Cobb angles on 3-foot standing spine radiographs of children and adolescents with clinical suspicion of scoliosis across 2 clinical scenarios (idiopathic scoliosis, group 1, and congenital scoliosis, group 2). Methods: We retrospectively measured Cobb angles of 130 patients who had a 3-foot spine radiograph for idiopathic or congenital scoliosis within a 10-year period. Cobb angles were measured both manually by radiologists and by an ML pipeline (a segmentation-based approach using an Augmented U-Net model with non-square kernels). Results: Our Augmented U-Net architecture achieved a Symmetric Mean Absolute Percentage Error (SMAPE) of 11.82% in the combined idiopathic and congenital scoliosis cohort. When stratified by idiopathic and congenital scoliosis, SMAPEs of 13.02% and 11.90% were achieved, respectively. Conclusion: The ML model used in this study shows promise for automated Cobb angle measurement in both idiopathic and congenital scoliosis. Nevertheless, larger studies are needed to confirm these results before this ML algorithm is translated into clinical practice.
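For readers unfamiliar with SMAPE, the sketch below computes one variant that is often used when several Cobb angles are reported per radiograph: the summed absolute error per patient divided by the summed predicted and reference angles, averaged over patients. The abstract does not state which SMAPE variant was used, so the formula and the function name here are assumptions.

```python
from typing import Sequence


def smape(pred: Sequence[Sequence[float]], true: Sequence[Sequence[float]]) -> float:
    """SMAPE in percent over patients, each with one or more angles in degrees.

    Assumed variant: mean over patients of sum|p - t| / sum(p + t), times 100.
    """
    if len(pred) != len(true) or not pred:
        raise ValueError("pred and true must be non-empty and the same length")
    total = 0.0
    for p_angles, t_angles in zip(pred, true):
        if len(p_angles) != len(t_angles):
            raise ValueError("each patient needs matching angle counts")
        num = sum(abs(p - t) for p, t in zip(p_angles, t_angles))
        den = sum(p + t for p, t in zip(p_angles, t_angles))
        total += num / den if den else 0.0  # all-zero angles count as no error
    return 100.0 * total / len(pred)
```

Because the error is normalized by the size of the angles, a 2-degree miss on a 10-degree curve weighs more than the same miss on a 40-degree curve.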

3.
Can Assoc Radiol J ; 74(4): 667-675, 2023 Nov.
Article in English | MEDLINE | ID: mdl-36949410

ABSTRACT

Purpose: Scoliosis is a deformity of the spine, and as a measure of scoliosis severity, Cobb angle is fundamental to the diagnosis of deformities that require treatment. Conventional Cobb angle measurement and assessment is usually done manually, which is inherently time-consuming, and associated with high inter- and intra-observer variability. While there exist automatic scoliosis measurement methods, they suffer from insufficient accuracy. In this work, we propose a two-step segmentation-based deep learning architecture to automate Cobb angle measurement for scoliosis assessment using X-Ray images. Methods: The proposed architecture involves two steps. In the first step, we utilize a novel Augmented U-Net architecture to generate segmentations of vertebrae. The second step includes a non-learning-based pipeline to extract landmark coordinates from the segmented vertebrae and filter undesirable landmarks. Results: Our proposed Augmented U-Net architecture achieved a Symmetric Mean Absolute Percentage Error of 9.2%, with approximately 90% of estimations having less than 10 degrees difference compared with the AASCE-MICCAI challenge 2019 dataset ground truths. We further validated the model using an internal dataset and achieved almost the same level of performance. Conclusion: The proposed architecture is robust in providing automated spinal vertebrae segmentations and Cobb angle measurement, and is potentially generalizable to real-world clinical settings.


Subject(s)
Scoliosis , Humans , Adolescent , Scoliosis/diagnostic imaging , Spine , Observer Variation , Reproducibility of Results
4.
Can Assoc Radiol J ; 74(1): 119-126, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35768942

ABSTRACT

Purpose: Biopsy-based assessment of H3 K27M status helps in predicting survival, but biopsy is usually limited to unusual presentations and clinical trials. We aimed to evaluate whether radiomics can serve as a prognostic marker to stratify diffuse intrinsic pontine glioma (DIPG) subsets. Methods: In this retrospective study, diagnostic brain MRIs of children with DIPG were analyzed. Radiomic features were extracted from tumor segmentations and data were split into training/testing sets (80:20). A conditional survival forest model was applied to predict progression-free survival (PFS) using training data. The trained model was validated on the test data, and concordances were calculated for PFS. Experiments were repeated 100 times using random training/test splits of the same proportions. Results: A total of 89 patients were identified (48 females, 53.9%). Median age at time of diagnosis was 6.64 years (range: 1-16.9 years) and median PFS was 8 months (range: 1-84 months). Molecular data were available for 26 patients (29.2%) (1 wild type, 3 K27M-H3.1, 22 K27M-H3.3). Radiomic features of FLAIR and nonenhanced T1-weighted sequences were predictive of PFS. The best FLAIR radiomics model yielded a concordance of .87 [95% CI: .86-.88] at 4 months PFS. The best T1-weighted radiomics model yielded a concordance of .82 [95% CI: .8-.84] at 4 months PFS. The best combined FLAIR + T1-weighted radiomics model yielded a concordance of .74 [95% CI: .71-.77] at 3 months PFS. The predominant predictive radiomic feature matrix was gray-level size-zone. Conclusion: MRI-based radiomics may predict progression-free survival in pediatric diffuse midline glioma/diffuse intrinsic pontine glioma.
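The concordance values quoted above measure how well predicted risk orders patients by time to progression. A minimal sketch of the standard Harrell's concordance index is given below, assuming right-censored data; the abstract reports concordance at specific time points, so the authors' exact variant may differ, and the function name is hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C: share of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) at a strictly earlier time than patient j's time.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported .74 to .87 range in context.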


Subject(s)
Brain Stem Neoplasms , Diffuse Intrinsic Pontine Glioma , Glioma , Female , Humans , Child , Progression-Free Survival , Retrospective Studies , Glioma/diagnostic imaging , Glioma/pathology , Magnetic Resonance Imaging , Brain Stem Neoplasms/diagnostic imaging
5.
Eur Radiol ; 31(1): 244-255, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32749585

ABSTRACT

OBJECTIVE: To differentiate combined hepatocellular cholangiocarcinoma (cHCC-CC) from cholangiocarcinoma (CC) and hepatocellular carcinoma (HCC) using machine learning on MRI and CT radiomics features. METHODS: This retrospective study included 85 patients aged 32 to 86 years with 86 histopathology-proven liver cancers: 24 cHCC-CC, 24 CC, and 38 HCC who had MRI and CT between 2004 and 2018. Initial CT reports and morphological evaluation of MRI features were used to assess the performance of the radiologists' reads. Following tumor segmentation, 1419 radiomics features were extracted using the PyRadiomics library and reduced to 20 principal components by principal component analysis. A support vector machine classifier was used to evaluate MRI and CT radiomics features for the prediction of cHCC-CC vs. non-cHCC-CC and HCC vs. non-HCC. Histopathology was the reference standard for all tumors. RESULTS: Radiomics MRI features demonstrated the best performance for differentiation of cHCC-CC from non-cHCC-CC with the highest AUC of 0.77 (SD 0.19) while CT was of limited value. Contrast-enhanced MRI phases and pre-contrast and portal-phase CT showed excellent performance for the differentiation of HCC from non-HCC (AUC of 0.79 (SD 0.07) to 0.81 (SD 0.13) for MRI and AUC of 0.81 (SD 0.06) and 0.71 (SD 0.15) for CT phases, respectively). The rate of misdiagnosis of cHCC-CC as HCC or CC on the radiologists' reads was 69% for CT and 58% for MRI. CONCLUSIONS: Our results demonstrate promising predictive performance of MRI and CT radiomics features using machine learning analysis for differentiation of cHCC-CC from HCC and CC with potential implications for treatment decisions. KEY POINTS: • Retrospective study demonstrated promising predictive performance of MRI radiomics features in the differentiation of cHCC-CC from HCC and CC and of CT radiomics features in the differentiation of HCC from cHCC-CC and CC. 
• With future validation, radiomics analysis has the potential to inform current clinical practice for the pre-operative diagnosis of cHCC-CC and to enable optimal treatment decisions regarding liver resection and transplantation.


Subject(s)
Bile Duct Neoplasms , Carcinoma, Hepatocellular , Cholangiocarcinoma , Liver Neoplasms , Adult , Aged , Aged, 80 and over , Bile Duct Neoplasms/diagnostic imaging , Bile Ducts, Intrahepatic , Carcinoma, Hepatocellular/diagnostic imaging , Cholangiocarcinoma/diagnostic imaging , Humans , Liver Neoplasms/diagnostic imaging , Machine Learning , Middle Aged , Retrospective Studies
6.
Neuroradiology ; 63(12): 1957-1967, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34537858

ABSTRACT

PURPOSE: Artificial intelligence (AI) is playing an ever-increasing role in Neuroradiology. METHODS: When designing AI-based research in neuroradiology and appreciating the literature, it is important to understand the fundamental principles of AI. Training, validation, and test datasets must be defined and set apart as priorities. External validation and testing datasets are preferable, when feasible. The specific type of learning process (supervised vs. unsupervised) and the machine learning model also require definition. Deep learning (DL) is an AI-based approach that is modelled on the structure of neurons of the brain; convolutional neural networks (CNN) are a commonly used example in neuroradiology. RESULTS: Radiomics is a frequently used approach in which a multitude of imaging features are extracted from a region of interest and subsequently reduced and selected to convey diagnostic or prognostic information. Deep radiomics uses CNNs to directly extract features and obviate the need for predefined features. CONCLUSION: Common limitations and pitfalls in AI-based research in neuroradiology are limited sample sizes ("small-n-large-p problem"), selection bias, as well as overfitting and underfitting.


Subject(s)
Artificial Intelligence , Deep Learning , Humans , Machine Learning , Neural Networks, Computer , Prognosis
7.
J Digit Imaging ; 34(4): 862-876, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34254200

ABSTRACT

Data augmentation refers to a group of techniques that counter a limited amount of available data in order to improve model generalization and push the sample distribution toward the true distribution. While different augmentation strategies and their combinations have been investigated for various computer vision tasks in the context of deep learning, dedicated work in the domain of medical imaging is rare and, to the best of our knowledge, there has been no dedicated work exploring the effects of various augmentation methods on the performance of deep learning models in prostate cancer detection. In this work, we statically applied the five most frequently used augmentation techniques (random rotation, horizontal flip, vertical flip, random crop, and translation) separately to a prostate diffusion-weighted magnetic resonance imaging training dataset of 217 patients and evaluated the effect of each method on the accuracy of prostate cancer detection. The augmentation algorithms were applied independently to each data channel, and a shallow as well as a deep convolutional neural network (CNN) was trained on each of the five augmented sets separately. We used the area under the receiver operating characteristic (ROC) curve (AUC) to evaluate the performance of the trained CNNs on a separate test set of 95 patients, using a validation set of 102 patients for fine-tuning. The shallow network outperformed the deep network, with the best 2D slice-based AUC of 0.85 obtained by the rotation method.
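The five augmentation types named in this abstract can be sketched with NumPy as below. This is an illustrative simplification rather than the study's pipeline: rotation is restricted to quarter turns (the paper's random rotation presumably used arbitrary angles), translation zero-fills the vacated pixels, and all function and parameter names are assumptions.

```python
import numpy as np


def translate(image: np.ndarray, dy: int, dx: int) -> np.ndarray:
    """Shift a 2D image by (dy, dx) pixels, filling vacated pixels with zeros."""
    h, w = image.shape
    if abs(dy) >= h or abs(dx) >= w:
        raise ValueError("shift must be smaller than the image size")
    out = np.zeros_like(image)
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out


def random_crop(image: np.ndarray, size: int, rng: np.random.Generator) -> np.ndarray:
    """Cut a random size-by-size patch out of a 2D image."""
    h, w = image.shape
    if size > min(h, w):
        raise ValueError("crop size exceeds image size")
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return image[top:top + size, left:left + size].copy()


def augmented_sets(image: np.ndarray, rng: np.random.Generator,
                   crop_size: int, max_shift: int) -> dict:
    """Return one augmented copy per technique, keyed by technique name."""
    k = int(rng.integers(1, 4))  # 1 to 3 quarter turns (simplified rotation)
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    return {
        "rotation": np.rot90(image, k),
        "horizontal_flip": np.fliplr(image),
        "vertical_flip": np.flipud(image),
        "random_crop": random_crop(image, crop_size, rng),
        "translation": translate(image, dy, dx),
    }
```

Keeping each technique in its own entry mirrors the study design, in which a separate network was trained on each augmented set so that the effect of every method could be compared in isolation.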


Subject(s)
Neural Networks, Computer , Prostatic Neoplasms , Algorithms , Diffusion Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging , Male , Prostatic Neoplasms/diagnostic imaging
8.
Eur Radiol ; 30(12): 6867-6876, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32591889

ABSTRACT

OBJECTIVES: To benchmark the performance of a calibrated 3D convolutional neural network (CNN) applied to multiparametric MRI (mpMRI) for risk assessment of clinically significant prostate cancer (csPCa) using decision curve analysis (DCA). METHODS: We retrospectively analyzed 499 patients who had positive mpMRI (PI-RADSv2 ≥ 3) and MRI-targeted biopsy. The training cohort comprised 449 men, including a calibration set of 50 men. Biopsy decision strategies included using risk estimates from the CNN (original and calibrated), to perform biopsy in men with PI-RADSv2 ≥ 4 only, or additionally in men with PI-RADSv2 3 and PSA density (PSAd) ≥ 0.15 ng/ml/ml. Discrimination, calibration and clinical usefulness in the unseen test cohort (n = 50) were assessed using C-statistic, calibration plots and DCA, respectively. RESULTS: The calibrated CNN achieved moderate calibration (Hosmer-Lemeshow calibration test, p = 0.41) and good discrimination (C = 0.85). DCA revealed consistently higher net benefit and net reduction in biopsies for the calibrated CNN compared with the original CNN, PI-RADSv2 ≥ 4 and the combined strategy of PI-RADSv2 and PSAd. Original CNN predictions were severely miscalibrated (p < 0.0001) resulting in net harm compared with a 'biopsy all' patients strategy. At-risk thresholds ≥ 10% using the calibrated CNN and the combined strategy reduced the number of biopsies by an estimated 201 and 55 men, respectively, per 1000 men at risk, without missing csPCa, while original CNN and PI-RADSv2 ≥ 4 could not achieve a net reduction in biopsies. CONCLUSIONS: DCA revealed that our calibrated 3D-CNN resulted in fewer unnecessary biopsies compared with using PI-RADSv2 alone or in combination with PSAd. CNN calibration is important in achieving clinical utility. KEY POINTS: • A 3D deep learning model applied to multiparametric MRI may help to prevent unnecessary prostate biopsies in patients eligible for MRI-targeted biopsy. 
• Owing to miscalibration, original risk estimates by the deep learning model require prior calibration to enable clinical utility. • Decision curve analysis confirmed a net benefit of using our calibrated deep learning model for biopsy decisions compared with alternative strategies, including PI-RADSv2 alone and in combination with prostate-specific antigen density.
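Decision curve analysis rests on the net benefit statistic, which weighs true positives against false positives at a chosen risk threshold. The sketch below uses the standard definition, net benefit = TP/n - (FP/n) x pt/(1 - pt); the function names and the example data in the usage note are illustrative assumptions, not values from the study.

```python
from typing import Sequence


def net_benefit(probs: Sequence[float], labels: Sequence[int],
                threshold: float) -> float:
    """Net benefit of biopsying everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit exchange rate
    return tp / n - odds * fp / n


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Net benefit of the 'biopsy all' reference strategy."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

For example, with risks [0.9, 0.8, 0.3, 0.05, 0.6] and labels [1, 1, 0, 0, 0], the model strategy at a 10% threshold scores higher than biopsying everyone. A model falls into the "net harm" regime mentioned in the abstract when its curve drops below the biopsy-all curve, and because the formula compares predicted risks directly with the threshold, miscalibrated risks can produce that outcome even when discrimination is good.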


Subject(s)
Biopsy/methods , Deep Learning , Magnetic Resonance Imaging , Prostatic Neoplasms/diagnostic imaging , Risk Assessment/methods , Algorithms , Benchmarking , Calibration , Humans , Image Processing, Computer-Assisted , Machine Learning , Male , Normal Distribution , Observer Variation , Prostate-Specific Antigen/blood , Prostatic Neoplasms/pathology , Retrospective Studies
9.
AJNR Am J Neuroradiol ; 45(6): 753-760, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38604736

ABSTRACT

BACKGROUND AND PURPOSE: Molecular biomarker identification increasingly influences the treatment planning of pediatric low-grade neuroepithelial tumors (PLGNTs). We aimed to develop and validate a radiomics-based ADC signature predictive of the molecular status of PLGNTs. MATERIALS AND METHODS: In this retrospective bi-institutional study, we searched the PACS for baseline brain MRIs from children with PLGNTs. Semiautomated tumor segmentation on ADC maps was performed using the semiautomated level tracing effect tool with 3D Slicer. Clinical variables, including age, sex, and tumor location, were collected from chart review. The molecular status of tumors was derived from biopsy. Multiclass random forests were used to predict the molecular status and fine-tuned using a grid search on the validation sets. Models were evaluated using independent and unseen test sets based on the combined data, and the area under the receiver operating characteristic curve (AUC) was calculated for the prediction of 3 classes: KIAA1549-BRAF fusion, BRAF V600E mutation, and non-BRAF cohorts. Experiments were repeated 100 times using different random data splits and model initializations to ensure reproducible results. RESULTS: Two hundred ninety-nine children from the first institution and 23 children from the second institution were included (53.6% male; mean age, 8.01 years; 51.8% supratentorial; 52.2% with KIAA1549-BRAF fusion). For the 3-class prediction using radiomics features only, the average test AUC was 0.74 (95% CI, 0.73-0.75), and using clinical features only, the average test AUC was 0.67 (95% CI, 0.66-0.68). The combination of both radiomics and clinical features improved the AUC to 0.77 (95% CI, 0.75-0.77). The per-class test AUC was highest for identifying KIAA1549-BRAF fusion tumors (AUC = 0.81 for the combined radiomics and clinical features versus 0.75 and 0.74 for BRAF V600E mutation and non-BRAF, respectively). 
CONCLUSIONS: ADC values of tumor segmentations have differentiative signals that can be used for training machine learning classifiers for molecular biomarker identification of PLGNTs. ADC-based pretherapeutic differentiation of the BRAF status of PLGNTs has the potential to avoid invasive tumor biopsy and enable earlier initiation of targeted therapy.
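Averaged AUCs with narrow confidence intervals, as reported above, typically arise from summarizing a metric over many repeated random splits. The sketch below shows one common way to do that, a normal-approximation interval on the mean; the abstract does not say how its intervals were computed, so this method and the function name are assumptions.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_ci95(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using mean +/- 1.96 standard errors.

    Assumes the per-split metrics are roughly independent and normal,
    which is only an approximation when splits reuse the same patients.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    return mean, mean - 1.96 * std_err, mean + 1.96 * std_err
```

An interval of this kind describes the stability of the average across splits, not the uncertainty of a prediction for a new patient, which is why it can be much narrower than the spread of individual test AUCs.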


Subject(s)
Brain Neoplasms , Diffusion Magnetic Resonance Imaging , Machine Learning , Neoplasms, Neuroepithelial , Humans , Child , Female , Male , Retrospective Studies , Neoplasms, Neuroepithelial/diagnostic imaging , Neoplasms, Neuroepithelial/genetics , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Child, Preschool , Adolescent , Diffusion Magnetic Resonance Imaging/methods , Proto-Oncogene Proteins B-raf/genetics , Infant , Neoplasm Grading , Biomarkers, Tumor/genetics
10.
Front Public Health ; 11: 968319, 2023.
Article in English | MEDLINE | ID: mdl-36908403

ABSTRACT

In this work, we examine magnetic resonance imaging (MRI) and ultrasound (US) appointments at the Diagnostic Imaging (DI) department of a pediatric hospital to discover possible relationships between selected patient features and no-show or long waiting room time endpoints. The chosen features include age, sex, income, distance from the hospital, percentage of non-English speakers in a postal code, percentage of single caregivers in a postal code, appointment time slot (morning, afternoon, evening), and day of the week (Monday to Sunday). We trained univariate Logistic Regression (LR) models using the training sets and identified predictive (significant) features that remained significant in the test sets. We also implemented multivariate Random Forest (RF) models to predict the endpoints. We achieved Area Under the Receiver Operating Characteristic Curve (AUC) of 0.82 and 0.73 for predicting no-show and long waiting room time endpoints, respectively. The univariate LR analysis on DI appointments uncovered the effect of the time of appointment during the day/week, and patients' demographics such as income and the number of caregivers on the no-shows and long waiting room time endpoints. For predicting no-show, we found age, time slot, and percentage of single caregiver to be the most critical contributors. Age, distance, and percentage of non-English speakers were the most important features for our long waiting room time prediction models. We found no sex discrimination among the scheduled pediatric DI appointments. Nonetheless, inequities based on patient features such as low income and language barrier did exist.


Subject(s)
Appointments and Schedules , Magnetic Resonance Imaging , Humans , Child , Magnetic Resonance Imaging/methods , Logistic Models , Hospitals , Machine Learning
11.
ACS Appl Nano Mater ; 6(17): 15385-15396, 2023 Sep 08.
Article in English | MEDLINE | ID: mdl-37706067

ABSTRACT

Characterizing complex biofluids using surface-enhanced Raman spectroscopy (SERS) coupled with machine learning (ML) has been proposed as a powerful tool for point-of-care detection of clinical disease. ML is well-suited to categorizing otherwise uninterpretable, patient-derived SERS spectra that contain a multitude of low concentration, disease-specific molecular biomarkers among a dense spectral background of biological molecules. However, ML can generate false, non-generalizable models when data sets used for model training are inadequate. It is thus critical to determine how different SERS experimental methodologies and workflow parameters can potentially impact ML disease classification of clinical samples. In this study, a label-free, broadband, Ag nanoparticle-based SERS platform was coupled with ML to assess simulated clinical samples for cardiovascular disease (CVD), containing randomized combinations of five key CVD biomarkers at clinically relevant concentrations in serum. Raman spectra obtained at 532, 633, and 785 nm from up to 300 unique samples were classified into physiological and pathological categories using two standard ML models. Label-free SERS and ML could correctly classify randomized CVD samples with high accuracies of up to 90.0% at 532 nm using as few as 200 training samples. Spectra obtained at 532 nm produced the highest accuracies with no significant increase achieved using multiwavelength SERS. Sample preparation and measurement methodologies (e.g., different SERS substrate lots, sample volumes, sample sizes, and known variations in randomization and experimental handling) were shown to strongly influence the ML classification and could artificially increase classification accuracies by as much as 27%. 
This detailed investigation into the proper application of ML techniques for CVD classification can lead to improved data set acquisition required for the SERS community, such that ML on labeled and robust SERS data sets can be practically applied for future point-of-care testing in patients.

12.
Front Radiol ; 2: 991683, 2022.
Article in English | MEDLINE | ID: mdl-37492678

ABSTRACT

As deep learning is widely used in the radiology field, the explainability of Artificial Intelligence (AI) models is becoming increasingly essential to gain clinicians' trust when using the models for diagnosis. In this research, three experiment sets were conducted with a U-Net architecture to improve the disease classification performance while enhancing the heatmaps corresponding to the model's focus through incorporating heatmap generators during training. All experiments used the dataset that contained chest radiographs, associated labels from one of the three conditions ["normal", "congestive heart failure (CHF)", and "pneumonia"], and numerical information regarding a radiologist's eye-gaze coordinates on the images. The paper that introduced this dataset developed a U-Net model, which was treated as the baseline model for this research, to show how the eye-gaze data can be used in multi-modal training for explainability improvement and disease classification. To compare the classification performances among this research's three experiment sets and the baseline model, the 95% confidence intervals (CI) of the area under the receiver operating characteristic curve (AUC) were measured. The best method achieved an AUC of 0.913 with a 95% CI of [0.860, 0.966]. "Pneumonia" and "CHF" classes, which the baseline model struggled the most to classify, had the greatest improvements, resulting in AUCs of 0.859 with a 95% CI of [0.732, 0.957] and 0.962 with a 95% CI of [0.933, 0.989], respectively. The decoder of the U-Net for the best-performing proposed method generated heatmaps that highlight the determining image parts in model classifications. These predicted heatmaps, which can be used for the explainability of the model, also improved to align well with the radiologist's eye-gaze data. 
Hence, this work showed that incorporating heatmap generators and eye-gaze information into training can simultaneously improve disease classification and provide explainable visuals that align well with how the radiologist viewed the chest radiographs when making diagnosis.

13.
Front Artif Intell ; 4: 635766, 2021.
Article in English | MEDLINE | ID: mdl-34079932

ABSTRACT

Brain tumors are among the leading causes of cancer-related death globally in children and adults. Precise classification of brain tumor grade (low-grade and high-grade glioma) at an early stage plays a key role in successful prognosis and treatment planning. With recent advances in deep learning, artificial intelligence-enabled brain tumor grading systems can assist radiologists in the interpretation of medical images within seconds. The performance of deep learning techniques is, however, highly dependent on the size of the annotated dataset. It is extremely challenging to label a large quantity of medical images, given the complexity and volume of medical data. In this work, we propose a novel transfer learning-based active learning framework to reduce the annotation cost while maintaining stability and robustness of the model performance for brain tumor classification. In this retrospective research, we employed a 2D slice-based approach to train and fine-tune our model on the magnetic resonance imaging (MRI) training dataset of 203 patients and a validation dataset of 66 patients, which was used as the baseline. With our proposed method, the model achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 82.89% on a separate test dataset of 66 patients, which was 2.92% higher than the baseline AUC while saving at least 40% of labeling cost. To further examine the robustness of our method, we created a balanced dataset, which underwent the same procedure. The model achieved an AUC of 82% compared with an AUC of 78.48% for the baseline, which supports the robustness and stability of our proposed transfer learning framework augmented with active learning while significantly reducing the size of the training data.

14.
Front Artif Intell ; 4: 582928, 2021.
Article in English | MEDLINE | ID: mdl-34917933

ABSTRACT

The receiver operating characteristic (ROC) curve is an informative tool in binary classification, and the area under the ROC curve (AUC) is a popular metric for reporting the performance of binary classifiers. In this paper, we first present a comprehensive review of the ROC curve and the AUC metric. Next, we propose a modified version of AUC that takes the confidence of the model into account and, at the same time, incorporates AUC into the binary cross-entropy (BCE) loss used for training a convolutional neural network for classification tasks. We demonstrate this on three datasets: MNIST, prostate MRI, and brain MRI. Furthermore, we have published GenuineAI, a new Python library, which provides the functions for conventional AUC and the proposed modified AUC along with metrics including sensitivity, specificity, recall, precision, and F1 for each point of the ROC curve.
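As background for this abstract, the conventional AUC can be computed directly from its probabilistic meaning: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below implements only that conventional metric in plain Python; it is not the GenuineAI API and does not attempt the modified, confidence-aware AUC the paper proposes.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Conventional AUC via pairwise comparison of positives and negatives.

    Equivalent to the Mann-Whitney U statistic divided by (n_pos * n_neg);
    tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Because only the ordering of scores matters here, two models with identical rankings but very different confidence levels receive the same conventional AUC, which is the limitation the modified metric in this paper sets out to address.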

15.
Transplantation ; 105(11): 2435-2444, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-33982917

ABSTRACT

BACKGROUND: Despite transarterial chemoembolization (TACE) for hepatocellular carcinoma (HCC), a significant number of patients will develop progression on the liver transplant (LT) waiting list or disease recurrence post-LT. We sought to evaluate the feasibility of a pre-TACE radiomics model, an imaging-based tool to predict these adverse outcomes. METHODS: We analyzed the pre-TACE computed tomography images of patients waiting for an LT. The primary endpoint was a combined event that included waitlist dropout for tumor progression or tumor recurrence post-LT. The radiomic features were extracted from the largest HCC volume in the arterial and portal venous phases. A third set of features was created by combining the features from these 2 contrast phases. We applied a least absolute shrinkage and selection operator feature selection method and a support vector machine classifier. Three prognostic models were built, one for each feature set. The models' performance was compared using the 5-fold cross-validated area under the receiver operating characteristic curve. RESULTS: Eighty-eight patients were included, of whom 33 experienced the combined event (37.5%). The median time to dropout was 5.6 mo (interquartile range: 3.6-9.3), and the median time to post-LT recurrence was 19.2 mo (interquartile range: 6.1-34.0). Twenty-four patients (27.3%) dropped out and 64 (72.7%) patients were transplanted. Of these, 14 (21.9%) had recurrence post-LT. The models yielded mean areas under the receiver operating characteristic curve of 0.70 (±0.07), 0.87 (±0.06), and 0.81 (±0.06) for the arterial, venous, and combined models, respectively. CONCLUSIONS: A pre-TACE radiomics model for HCC patients undergoing LT may be a useful tool for outcome prediction. Further external model validation with a larger sample size is required.


Subject(s)
Carcinoma, Hepatocellular , Chemoembolization, Therapeutic , Liver Neoplasms , Liver Transplantation , Biomarkers , Carcinoma, Hepatocellular/diagnostic imaging , Carcinoma, Hepatocellular/surgery , Chemoembolization, Therapeutic/adverse effects , Humans , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/surgery , Liver Transplantation/adverse effects , Liver Transplantation/methods , Neoplasm Recurrence, Local/etiology , Pilot Projects , Retrospective Studies