ABSTRACT
OBJECTIVES: Transvaginal ultrasound is typically the initial diagnostic approach in patients with postmenopausal bleeding for detecting endometrial atypical hyperplasia/cancer. Although transvaginal ultrasound demonstrates notable sensitivity, its specificity remains limited. The objective of this study was to enhance the diagnostic accuracy of transvaginal ultrasound through the integration of artificial intelligence. Using transvaginal ultrasound images, we aimed to develop an artificial intelligence-based automated segmentation model and an artificial intelligence-based classifier model. METHODS: Patients with postmenopausal bleeding undergoing transvaginal ultrasound and endometrial sampling at Mayo Clinic between 2016 and 2021 were retrospectively included. Manual segmentation of images was performed by four physicians (readers). Patients were classified into cohort A (atypical hyperplasia/cancer) and cohort B (benign) based on the pathology report of endometrial sampling. A fully automated segmentation model was developed, and its performance in correctly identifying the endometrium was compared with the physicians' segmentations using similarity metrics. To develop the classifier model, radiomic features were calculated from the manually segmented regions of interest. These features were used to train a wide range of machine learning-based classifiers. The top-performing classifier was evaluated using a threefold approach, and diagnostic accuracy was assessed through the F1 score and area under the receiver operating characteristic curve (AUC-ROC). RESULTS: A total of 302 patients were included. Agreement between automated and reader segmentations was 0.79 ± 0.21 by the Dice coefficient. For the classification task, 92 radiomic features related to pixel texture/shape/intensity differed significantly between cohorts A and B. The threefold evaluation of the top-performing classifier showed an AUC-ROC of 0.90 (range 0.88-0.92) on the validation set and 0.88 (range 0.86-0.91) on the hold-out test set. Sensitivity and specificity were 0.87 (range 0.77-0.94) and 0.86 (range 0.81-0.94), respectively. CONCLUSIONS: We trained an artificial intelligence-based algorithm to differentiate endometrial atypical hyperplasia/cancer from benign conditions on transvaginal ultrasound images in a population of patients with postmenopausal bleeding.
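To make the two-stage design concrete, here is a minimal sketch of the radiomics-plus-classifier step, using pyradiomics for feature extraction and scikit-learn for classification. The `case_paths` file pairs, the `labels` array, and the gradient-boosting classifier are illustrative assumptions; the abstract does not name the top-performing model.

```python
# Illustrative sketch only: radiomic features over a segmented endometrium,
# then a binary classifier scored by F1 and AUC-ROC. `case_paths` (pairs of
# image/mask files) and `labels` are assumed to exist; the classifier choice
# is a placeholder for the unspecified top performer.
import numpy as np
from radiomics import featureextractor          # pyradiomics
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()                   # texture, shape, and intensity families

def features_for_case(image_path, mask_path):
    """Extract one fixed-order radiomic feature vector from an image/ROI pair."""
    result = extractor.execute(image_path, mask_path)
    # Keep numeric feature values; skip the "diagnostics_*" metadata entries.
    return np.array([v for k, v in result.items()
                     if not k.startswith("diagnostics")], dtype=float)

X = np.stack([features_for_case(img, msk) for img, msk in case_paths])  # assumed
y = np.array(labels)  # assumed: 1 = atypical hyperplasia/cancer, 0 = benign

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
probs = clf.predict_proba(X_te)[:, 1]
print("AUC-ROC:", roc_auc_score(y_te, probs))
print("F1:", f1_score(y_te, probs >= 0.5))
```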
Subject(s)
Artificial Intelligence; Endometrial Hyperplasia; Endometrial Neoplasms; Ultrasonography; Humans; Female; Endometrial Neoplasms/diagnostic imaging; Endometrial Neoplasms/pathology; Endometrial Hyperplasia/diagnostic imaging; Endometrial Hyperplasia/pathology; Retrospective Studies; Ultrasonography/methods; Middle Aged; Aged; Sensitivity and Specificity
ABSTRACT
Automatic identification of abnormal brachial plexus (BP) on magnetic resonance imaging (MRI), to localize and identify neurologic injury in clinical practice, is still a novel topic in brachial plexopathy. This study developed and evaluated an artificial intelligence (AI) approach to differentiate abnormal BP across three commonly used MRI sequences, i.e., T1, fluid-sensitive, and post-gadolinium. A BP dataset was collected by radiological experts, and a semi-supervised AI method (based on nnU-Net) was used to segment the BP. A radiomics method was then used to extract 107 shape and texture features from the resulting regions of interest (ROIs). From various machine learning methods, we selected six widely recognized classifiers to train our BP models and assess their efficacy. To optimize these models, we introduced a dynamic feature selection approach aimed at discarding redundant and less informative features. Our experiments showed that, for identifying abnormal BP cases, shape features were more sensitive than texture features. Notably, both the logistic classifier and the bagging classifier outperformed the other methods. The model trained on fluid-sensitive sequences performed best, notably exceeding the results of both the T1 and post-gadolinium sequences; its classification accuracy and AUC (area under the receiver operating characteristic curve) on the fluid-sensitive sequence both exceeded 90%. This outcome provides robust experimental validation of the substantial potential and feasibility of integrating AI into clinical practice.
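As an aside on the selection-plus-classification step: the "dynamic feature selection" procedure is not specified in the abstract, so the sketch below substitutes recursive feature elimination with cross-validation (RFECV), a common way to discard redundant radiomic features, and then compares the two classifiers the abstract reports as top performers. The feature and label arrays (and their file names) are assumptions.

```python
# Hedged stand-in for the selection + classification step: RFECV replaces the
# paper's unspecified dynamic feature selection. Input arrays are assumed.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X = np.load("bp_radiomics.npy")   # assumed: (n_cases, 107) shape/texture features
y = np.load("bp_labels.npy")      # assumed: 1 = abnormal BP, 0 = normal

# Prune redundant/uninformative features via cross-validated recursive elimination.
selector = RFECV(LogisticRegression(max_iter=5000), step=1, cv=5, scoring="roc_auc")
X_sel = selector.fit_transform(X, y)
print(f"kept {selector.n_features_} of {X.shape[1]} features")

# Compare the two classifiers the abstract reports as top performers.
for name, clf in [("logistic", LogisticRegression(max_iter=5000)),
                  ("bagging", BaggingClassifier(n_estimators=100, random_state=0))]:
    auc = cross_val_score(clf, X_sel, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean cross-validated AUC = {auc:.3f}")
```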
Subject(s)
Artificial Intelligence; Brachial Plexus; Magnetic Resonance Imaging; Humans; Magnetic Resonance Imaging/methods; Brachial Plexus/diagnostic imaging; Brachial Plexus Neuropathies/diagnostic imaging; Machine Learning; Female; Male; Adult
ABSTRACT
PURPOSE: To determine whether a pancreas radiomics-based AI model can detect the CT imaging signature of type 2 diabetes (T2D). METHODS: A total of 107 radiomic features were extracted from the volumetrically segmented normal pancreas in 422 T2D patients and 456 age-matched controls. The dataset was randomly split into training (300 T2D, 300 control CTs) and test subsets (122 T2D, 156 control CTs). An XGBoost model, trained on 10 features selected through a top-K selection method and optimized through threefold cross-validation on the training subset, was evaluated on the test subset. RESULTS: The model correctly classified 73 (60%) T2D patients and 96 (62%) controls, yielding an F1 score, sensitivity, specificity, precision, and AUC of 0.57, 0.62, 0.61, 0.55, and 0.65, respectively. The model's performance was equivalent across gender, CT slice thicknesses, and CT vendors (p values > 0.05). There was no difference between correctly classified and misclassified patients in mean (range) T2D duration [4.5 (0-15.4) versus 4.8 (0-15.7) years, p = 0.8], antidiabetic treatment [insulin (22% versus 18%), oral antidiabetics (10% versus 18%), both (41% versus 39%) (p > 0.05)], or treatment duration [5.4 (0-15) versus 5 (0-13) years, p = 0.4]. CONCLUSION: A pancreas radiomics-based AI model can detect the imaging signature of T2D. Further refinement and validation are needed to evaluate its potential for opportunistic T2D detection on the millions of CTs performed annually.
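The training recipe in this abstract (top-K feature selection to 10 features, XGBoost, threefold cross-validation) maps directly onto a scikit-learn pipeline. A minimal sketch under assumptions: SelectKBest with an ANOVA F-score stands in for the unspecified top-K criterion, and the feature/label arrays are placeholders.

```python
# Sketch of the stated recipe: select 10 features (SelectKBest as one concrete
# top-K method), fit XGBoost, tune with threefold CV. Input arrays assumed.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

X = np.load("pancreas_radiomics.npy")  # assumed: (n_cases, 107) radiomic features
y = np.load("t2d_labels.npy")          # assumed: 1 = T2D, 0 = control

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),      # keep the 10 strongest features
    ("xgb", XGBClassifier(eval_metric="logloss")),
])
grid = GridSearchCV(
    pipe,
    {"xgb__max_depth": [2, 3, 4], "xgb__n_estimators": [100, 300]},
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),  # threefold CV
    scoring="roc_auc",
)
grid.fit(X, y)
print("best threefold CV AUC:", grid.best_score_)
```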
Subject(s)
Diabetes Mellitus, Type 2; Insulins; Abdomen; Diabetes Mellitus, Type 2/diagnostic imaging; Humans; Hypoglycemic Agents; Machine Learning; Retrospective Studies; Tomography, X-Ray Computed/methods
ABSTRACT
BACKGROUND & AIMS: Our purpose was to detect pancreatic ductal adenocarcinoma (PDAC) at the prediagnostic stage (3-36 months before clinical diagnosis) using radiomics-based machine-learning (ML) models, and to compare their performance against radiologists in a case-control study. METHODS: Volumetric pancreas segmentation was performed on prediagnostic computed tomography scans (CTs) (median interval between CT and PDAC diagnosis: 398 days) of 155 patients and an age-matched cohort of 265 subjects with normal pancreas. A total of 88 first-order and gray-level radiomic features were extracted, and 34 features were selected through a least absolute shrinkage and selection operator (LASSO)-based feature selection method. The dataset was randomly divided into training (292 CTs: 110 prediagnostic and 182 controls) and test subsets (128 CTs: 45 prediagnostic and 83 controls). Four ML classifiers, k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost), were evaluated. The specificity of the model with the highest accuracy was further validated on an independent internal dataset (n = 176) and the public National Institutes of Health dataset (n = 80). Two radiologists (R4 and R5) independently evaluated the pancreas on a 5-point diagnostic scale. RESULTS: The median (range) time between the prediagnostic CTs of the test subset and PDAC diagnosis was 386 (97-1092) days. SVM had the highest sensitivity (mean; 95% confidence interval) (95.5; 85.5-100.0), specificity (90.3; 84.3-91.5), F1 score (89.5; 82.3-91.7), area under the curve (AUC) (0.98; 0.94-0.98), and accuracy (92.2%; 86.7-93.7) for classifying CTs as prediagnostic versus normal. The other three ML models, KNN, RF, and XGBoost, had comparable AUCs (0.95, 0.95, and 0.96, respectively). The high specificity of SVM generalized to both the independent internal dataset (92.6%) and the National Institutes of Health dataset (96.2%). In contrast, interreader radiologist agreement was only fair (Cohen's kappa 0.3), and their mean AUC (0.66; 0.46-0.86) was lower than that of each of the 4 ML models (AUCs: 0.95-0.98) (P < .001). Radiologists also recorded false-positive indirect findings of PDAC in control subjects (n = 83) (7% R4, 18% R5). CONCLUSIONS: Radiomics-based ML models can detect PDAC from normal pancreas when it is beyond human interrogation capability, at a substantial lead time before clinical diagnosis. Prospective validation and integration of such models with complementary fluid-based biomarkers has the potential to enable PDAC detection at a stage when surgical cure is a possibility.
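The LASSO-based feature selection and four-classifier comparison can likewise be sketched with scikit-learn and xgboost. Everything below the imports is an assumption-laden illustration: the file names and arrays are placeholders, and scaling and selection are deliberately fit on the training split only, to avoid leaking test information.

```python
# Illustrative reconstruction: LASSO-driven selection of radiomic features,
# then four classifiers (KNN, SVM, RF, XGBoost) compared by held-out AUC.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

X = np.load("pdac_radiomics.npy")  # assumed: (n_cases, 88) feature matrix
y = np.load("pdac_labels.npy")     # assumed: 1 = prediagnostic PDAC, 0 = normal

# Split first so that scaling and feature selection are fit on training data only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_tr)                      # LASSO is scale-sensitive
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
selector = SelectFromModel(LassoCV(cv=5, random_state=0)).fit(X_tr, y_tr)
X_tr, X_te = selector.transform(X_tr), selector.transform(X_te)

models = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(probability=True),          # probability=True enables predict_proba
    "RF": RandomForestClassifier(random_state=0),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}
for name, model in models.items():
    auc = roc_auc_score(y_te, model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
    print(f"{name}: test AUC = {auc:.2f}")
```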
Subject(s)
Carcinoma, Pancreatic Ductal; Pancreatic Neoplasms; Humans; Case-Control Studies; Pancreatic Neoplasms/diagnostic imaging; Tomography, X-Ray Computed/methods; Carcinoma, Pancreatic Ductal/diagnostic imaging; Machine Learning; Retrospective Studies
ABSTRACT
PURPOSE: Total kidney volume (TKV) is the most important imaging biomarker for quantifying the severity of autosomal-dominant polycystic kidney disease (ADPKD). 3D ultrasound (US) can measure kidney volume more accurately than 2D US; however, manual segmentation is tedious and requires expert annotators. We investigated a deep learning-based approach for automated segmentation of TKV from 3D US in ADPKD patients. METHODS: We used axially acquired 3D US kidney images from 22 ADPKD patients, in whom each kidney was scanned three times, resulting in 132 scans that were manually segmented. We trained a convolutional neural network to segment the whole kidney and measure TKV. All patients were subsequently imaged with MRI for measurement comparison. RESULTS: Our method automatically segmented polycystic kidneys in 3D US images, obtaining an average Dice coefficient of 0.80 on the test dataset. Compared with human tracing, the kidney volume measurements had a linear regression coefficient of R² = 0.81 and a bias of -4.42%; between the AI method and the reference standard, R² was 0.93 with a bias of -4.12%. MRI- and US-measured kidney volumes had R² = 0.84 and a bias of 7.47%. CONCLUSION: This is the first study applying deep learning to 3D US in ADPKD. Our method shows promising performance for auto-segmentation of kidneys on 3D US to measure TKV, close to human tracing and MRI measurement. This imaging and analysis method may be useful in a number of settings, including pediatric imaging, clinical studies, and longitudinal tracking of disease progression.
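Two small helpers make the reported evaluations concrete: the Dice coefficient between an automated and a manual 3D mask, and volume agreement expressed as a linear regression R² and a mean percent bias. Only the metric definitions follow the abstract; the toy volumes in the usage lines are invented for illustration.

```python
# Minimal evaluation helpers: Dice overlap for 3D masks, and R²/percent-bias
# agreement between two sets of kidney volumes. Toy inputs are illustrative.
import numpy as np
from scipy.stats import linregress

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two boolean 3D masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def volume_agreement(v_ref, v_test):
    """Return (R², mean percent bias) of test volumes against reference volumes."""
    v_ref, v_test = np.asarray(v_ref, float), np.asarray(v_test, float)
    r = linregress(v_ref, v_test).rvalue
    bias = float(np.mean((v_test - v_ref) / v_ref)) * 100.0
    return r ** 2, bias

# Toy example (volumes in mL, invented for illustration):
v_ref = np.array([450.0, 820.0, 1210.0, 980.0])
v_ai = np.array([430.0, 800.0, 1150.0, 960.0])
print(volume_agreement(v_ref, v_ai))
```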
Subject(s)
Polycystic Kidney Diseases; Polycystic Kidney, Autosomal Dominant; Child; Humans; Imaging, Three-Dimensional; Kidney/diagnostic imaging; Magnetic Resonance Imaging/methods; Polycystic Kidney, Autosomal Dominant/diagnostic imaging
ABSTRACT
BACKGROUND: In kidney transplantation, a contrast-enhanced CT scan is obtained in the donor candidate to detect subclinical pathology in the kidney. Recent work from the Aging Kidney Anatomy study has characterized kidney, cortex, and medulla volumes using a manual image-processing tool. However, this technique is time-consuming and impractical for clinical care, so these measurements are not obtained during donor evaluations. This study proposes a fully automated segmentation approach for measuring kidney, cortex, and medulla volumes. METHODS: A total of 1930 contrast-enhanced CT exams with reference-standard manual segmentations from one institution were used to develop the algorithm. A convolutional neural network model was trained (n=1238) and validated (n=306), and then evaluated on a hold-out test set of reference-standard segmentations (n=386). After the initial evaluation, the algorithm was further tested on datasets from two external sites (n=1226). RESULTS: The automated model performed on par with manual segmentation, with errors similar to the interobserver variability of manual segmentation. Compared with the reference standard, the automated approach achieved Dice similarity coefficients of 0.94 (right cortex), 0.90 (right medulla), 0.94 (left cortex), and 0.90 (left medulla) in the test set. Similar performance was observed when the algorithm was applied to the two external datasets. CONCLUSIONS: A fully automated approach for measuring cortex and medulla volumes on CT images of the kidneys has been established. This method may prove useful for a wide range of clinical applications.
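Since this test-set evaluation reports one Dice score per structure, a per-label variant of the Dice computation is the natural evaluation loop. A brief sketch: the integer label convention below (1/2 = right cortex/medulla, 3/4 = left cortex/medulla) is a hypothetical choice, not the study's actual encoding.

```python
# Per-structure Dice between predicted and reference label maps. The label
# numbering is a hypothetical convention, not the study's actual encoding.
import numpy as np

LABELS = {1: "right cortex", 2: "right medulla", 3: "left cortex", 4: "left medulla"}

def per_label_dice(pred: np.ndarray, ref: np.ndarray) -> dict:
    """Dice coefficient per anatomical label, e.g. {'right cortex': 0.94, ...}."""
    scores = {}
    for label, name in LABELS.items():
        p, r = pred == label, ref == label
        denom = p.sum() + r.sum()
        scores[name] = 2.0 * np.logical_and(p, r).sum() / denom if denom else 1.0
    return scores
```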