Results 1 - 20 of 48
1.
Radiology ; 312(2): e232635, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39105640

ABSTRACT

Background Multiparametric MRI can help identify clinically significant prostate cancer (csPCa) (Gleason score ≥7) but is limited by reader experience and interobserver variability. In contrast, deep learning (DL) produces deterministic outputs. Purpose To develop a DL model to predict the presence of csPCa by using patient-level labels without information about tumor location and to compare its performance with that of radiologists. Materials and Methods Data from patients without known csPCa who underwent MRI from January 2017 to December 2019 at one of multiple sites of a single academic institution were retrospectively reviewed. A convolutional neural network was trained to predict csPCa from T2-weighted images, diffusion-weighted images, apparent diffusion coefficient maps, and T1-weighted contrast-enhanced images. The reference standard was pathologic diagnosis. Radiologist performance was evaluated as follows: Radiology reports were used for the internal test set, and four radiologists' PI-RADS ratings were used for the external (ProstateX) test set. The performance was compared using areas under the receiver operating characteristic curves (AUCs) and the DeLong test. Gradient-weighted class activation maps (Grad-CAMs) were used to show tumor localization. Results Among 5735 examinations in 5215 patients (mean age, 66 years ± 8 [SD]; all male), 1514 examinations (1454 patients) showed csPCa. In the internal test set (400 examinations), the AUC was 0.89 and 0.89 for the DL classifier and radiologists, respectively (P = .88). In the external test set (204 examinations), the AUC was 0.86 and 0.84 for the DL classifier and radiologists, respectively (P = .68). DL classifier plus radiologists had an AUC of 0.89 (P < .001). Grad-CAMs demonstrated activation over the csPCa lesion in 35 of 38 and 56 of 58 true-positive examinations in internal and external test sets, respectively. 
Conclusion The performance of a DL model was not different from that of radiologists in the detection of csPCa at MRI, and Grad-CAMs localized the tumor. © RSNA, 2024 Supplemental material is available for this article. See also the editorial by Johnson and Chandarana in this issue.
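The Grad-CAM localization described above reduces to a small computation once the network's activations and gradients are in hand. The sketch below is illustrative, not code from the paper: it assumes the feature maps of a chosen convolutional layer and the gradients of the target class score with respect to them have already been extracted (e.g., via framework hooks), and implements the standard Grad-CAM weighting in NumPy.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from one convolutional layer.

    activations: (K, H, W) feature maps from the chosen layer.
    gradients:   (K, H, W) gradients of the target class score
                 with respect to those feature maps.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    # Channel weights: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))  # (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()  # scale to [0, 1]
    return cam
```

Upsampled to the input resolution, such a heatmap is what the study overlays on the MRI to check activation over the csPCa lesion.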


Subject(s)
Deep Learning , Magnetic Resonance Imaging , Prostatic Neoplasms , Male , Humans , Prostatic Neoplasms/diagnostic imaging , Retrospective Studies , Aged , Middle Aged , Magnetic Resonance Imaging/methods , Image Interpretation, Computer-Assisted/methods , Multiparametric Magnetic Resonance Imaging/methods , Prostate/diagnostic imaging , Prostate/pathology
2.
Bioengineering (Basel) ; 11(7)2024 Jun 25.
Article in English | MEDLINE | ID: mdl-39061730

ABSTRACT

Thyroid Ultrasound (US) is the primary method to evaluate thyroid nodules. Deep learning (DL) has been playing a significant role in evaluating thyroid cancer. We propose a DL-based pipeline to detect thyroid nodules and classify them as benign or malignant using two views of US imaging. Transverse and longitudinal US images of thyroid nodules from 983 patients were collected retrospectively. Eighty-one cases were held out as a testing set, and the rest of the data were used in five-fold cross-validation (CV). Two You Only Look Once (YOLO) v5 models were trained to detect nodules and classify them. For each view, the five models developed during CV were ensembled using non-maximum suppression (NMS) to boost their collective generalizability. An extreme gradient boosting (XGBoost) model was trained on the outputs of the ensembled models for both views to yield a final prediction of malignancy for each nodule. The test set was evaluated by an expert radiologist using the American College of Radiology Thyroid Imaging Reporting and Data System (ACR-TIRADS). The ensemble models for each view achieved a mAP0.5 of 0.797 (transverse) and 0.716 (longitudinal). The whole pipeline reached an AUROC of 0.84 (CI 95%: 0.75-0.91) with sensitivity and specificity of 84% and 63%, respectively, while the ACR-TIRADS evaluation of the same set had a sensitivity of 76% and a specificity of 34% (p-value = 0.003). Our work demonstrates the potential of a deep learning pipeline to achieve strong diagnostic performance for thyroid nodule evaluation.
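The NMS step that merges the five cross-validation detectors can be sketched as below. This is a generic greedy NMS, not the authors' exact implementation; it assumes boxes in [x1, y1, x2, y2] format and a single pooled list of detections from all five models.

```python
import numpy as np

def iou(box: np.ndarray, boxes: np.ndarray) -> np.ndarray:
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.5) -> list:
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = np.argsort(scores)[::-1]  # highest-confidence first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept one too much.
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thr]
    return keep
```

Detections surviving NMS from both views would then feed the downstream XGBoost malignancy model.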

3.
Spine Deform ; 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39039392

ABSTRACT

PURPOSE: The purpose of this study is to develop and apply an algorithm that automatically classifies spine radiographs of pediatric scoliosis patients. METHODS: Anterior-posterior (AP) and lateral spine radiographs were extracted from the institutional picture archive for patients with scoliosis. Overall, there were 7777 AP images and 5621 lateral images. Radiographs were manually classified into ten categories: two preoperative and three postoperative categories each for AP and lateral images. The images were split into training, validation, and testing sets (70:15:15 proportional split). A deep learning classifier using the EfficientNet B6 architecture was trained on the spine training set. Hyperparameters and model architecture were tuned against the performance of the models on the validation set. RESULTS: The trained classifiers had an overall accuracy on the test set of 1.00 on 1166 AP images and 1.00 on 843 lateral images. Precision ranged from 0.98 to 1.00 on the AP images and from 0.91 to 1.00 on the lateral images. Lower performance was observed on classes with fewer than 100 images in the dataset. Final performance metrics, including accuracy, precision, recall, and F1 score (the harmonic mean of precision and recall), were calculated on the assigned test set. CONCLUSIONS: A deep learning convolutional neural network classifier was trained to a high degree of accuracy to distinguish between 10 categories of pre- and postoperative spine radiographs of patients with scoliosis. Observed performance was higher in more prevalent categories. These models represent an important step in developing an automatic system for data ingestion into large, labeled imaging registries.
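The per-class metrics reported above (precision, recall, and their harmonic mean, F1) can be computed directly from predicted and true labels. A minimal sketch, with illustrative category names rather than the study's actual labels:

```python
def classification_report(y_true: list, y_pred: list, labels: list) -> dict:
    """Per-class precision, recall, and F1 (harmonic mean of the two)."""
    report = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = (precision, recall, f1)
    return report
```

For a ten-class problem like this one, the same loop runs over all ten category labels, and classes with few examples (under 100 images here) tend to show noisier metrics.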

4.
Res Diagn Interv Imaging ; 9: 100044, 2024 Mar.
Article in English | MEDLINE | ID: mdl-39076582

ABSTRACT

Background: Dual-energy CT (DECT) is a non-invasive way to determine the presence of monosodium urate (MSU) crystals in the workup of gout. Color-coding distinguishes MSU from calcium following material decomposition and post-processing. Most software labels MSU as green and calcium as blue. Current image processing methods of segmenting green-encoded pixels have limitations. Additionally, identifying green foci is tedious, and automated detection would improve workflow. This study aimed to determine the optimal deep learning (DL) algorithm for segmenting green-encoded pixels of MSU crystals on DECTs. Methods: DECT images of positive and negative gout cases were retrospectively collected. The dataset was split into train (N = 28) and held-out test (N = 30) sets. To perform cross-validation, the train set was split into seven folds. The images were presented to two musculoskeletal radiologists, who independently identified green-encoded voxels. Two 3D U-Net-based DL models, SegResNet and SwinUNETR, were trained, and the Dice similarity coefficient (DSC), sensitivity, and specificity were reported as the segmentation metrics. Results: SegResNet showed superior performance, achieving a DSC of 0.9999 for the background pixels, 0.7868 for the green pixels, and an average DSC of 0.8934 across both classes. According to the post-processed results, SegResNet reached voxel-level sensitivity and specificity of 98.72% and 99.98%, respectively. Conclusion: In this study, we compared two DL-based segmentation approaches for detecting MSU deposits in a DECT dataset. SegResNet yielded superior performance metrics. The developed algorithm provides a potentially fast, consistent, highly sensitive and specific computer-aided diagnosis tool. Ultimately, such an algorithm could be used by radiologists to streamline DECT workflow and improve accuracy in the detection of gout.
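The Dice similarity coefficient used to score both models is a simple overlap measure between the predicted and reference masks. A minimal NumPy sketch (the epsilon term, an assumption here, guards the empty-mask case):

```python
import numpy as np

def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|), in [0, 1]."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2 * inter + eps) / (pred.sum() + target.sum() + eps))
```

The class imbalance visible in the results (DSC 0.9999 for background vs. 0.7868 for green voxels) is typical: for a rare foreground class, small absolute errors cost far more Dice than they do for the dominant background.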

8.
EBioMedicine ; 104: 105174, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38821021

ABSTRACT

BACKGROUND: Chest X-rays (CXR) are essential for diagnosing a variety of conditions, but when models are applied to new populations, generalizability issues limit their efficacy. Generative AI, particularly denoising diffusion probabilistic models (DDPMs), offers a promising approach to generating synthetic images, enhancing dataset diversity. This study investigates the impact of synthetic data supplementation on the performance and generalizability of medical imaging models. METHODS: The study employed DDPMs to create synthetic CXRs conditioned on demographic and pathological characteristics from the CheXpert dataset. These synthetic images were used to supplement training datasets for pathology classifiers, with the aim of improving their performance. The evaluation involved three datasets (CheXpert, MIMIC-CXR, and Emory Chest X-ray) and various experiments, including supplementing real data with synthetic data, training with purely synthetic data, and mixing synthetic data with external datasets. Performance was assessed using the area under the receiver operating characteristic curve (AUROC). FINDINGS: Adding synthetic data to real datasets resulted in a notable increase in AUROC values (up to 0.02 in internal and external test sets with 1000% supplementation, p-value <0.01 in all instances). When classifiers were trained exclusively on synthetic data, they achieved performance levels comparable to those trained on real data with 200%-300% data supplementation. The combination of real and synthetic data from different sources demonstrated enhanced model generalizability, increasing model AUROC from 0.76 to 0.80 on the internal test set (p-value <0.01). INTERPRETATION: Synthetic data supplementation significantly improves the performance and generalizability of pathology classifiers in medical imaging. FUNDING: Dr. Gichoya is a 2022 Robert Wood Johnson Foundation Harold Amos Medical Faculty Development Program awardee and declares support from the RSNA Health Disparities grant (#EIHD2204), Lacuna Fund (#67), Gordon and Betty Moore Foundation, NIH (NIBIB) MIDRC grant under contracts 75N92020C00008 and 75N92020C00021, and NHLBI Award Number R01HL167811.
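The AUROC comparisons above have a useful probabilistic reading: the AUROC equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. A brute-force sketch of that equivalence (the Mann-Whitney U formulation, with ties counted as half):

```python
def auroc(scores_pos: list, scores_neg: list) -> float:
    """AUROC as the probability that a positive outranks a negative
    (Mann-Whitney U statistic; ties count 1/2)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

Under this reading, the reported jump from 0.76 to 0.80 means the mixed-data model correctly ranks a positive above a negative in 80% of random pairs rather than 76%. Production code would use an O(n log n) rank-based version (e.g., scikit-learn's `roc_auc_score`) instead of this quadratic loop.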


Subject(s)
Diagnostic Imaging , ROC Curve , Humans , Diagnostic Imaging/methods , Algorithms , Radiography, Thoracic/methods , Image Processing, Computer-Assisted/methods , Databases, Factual , Area Under Curve , Models, Statistical
10.
J Imaging Inform Med ; 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38558368

ABSTRACT

In recent years, the role of Artificial Intelligence (AI) in medical imaging has become increasingly prominent, with the majority of FDA-approved AI applications in 2023 being in imaging and radiology. The surge in AI model development to tackle clinical challenges underscores the necessity of preparing high-quality medical imaging data. Proper data preparation is crucial, as it fosters the creation of standardized and reproducible AI models while minimizing biases. Data curation transforms raw data into a valuable, organized, and dependable resource and is fundamental to the success of machine learning and analytical projects. Considering the plethora of available tools for data curation at different stages, it is crucial to stay informed about the most relevant tools within specific research areas. In the current work, we propose a descriptive outline for the different steps of data curation and furnish compilations of tools for each stage, collected via a survey of members of the Society for Imaging Informatics in Medicine (SIIM). This collection has the potential to enhance the decision-making process for researchers as they select the most appropriate tool for their specific tasks.

11.
J Imaging Inform Med ; 37(4): 1664-1673, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38483694

ABSTRACT

The application of deep learning (DL) in medicine introduces transformative tools with the potential to enhance prognosis, diagnosis, and treatment planning. However, transparent documentation is essential for researchers to enhance reproducibility and refine techniques. Our study addresses the unique challenges presented by DL in medical imaging by developing a comprehensive checklist using the Delphi method to enhance reproducibility and reliability in this dynamic field. We compiled a preliminary checklist based on a comprehensive review of existing checklists and relevant literature. A panel of 11 experts in medical imaging and DL assessed these items using Likert scales, with two survey rounds to refine responses and gauge consensus. We also employed the content validity ratio, with a cutoff of 0.59, to determine item face and content validity. Round 1 included a 27-item questionnaire; 12 items demonstrated high consensus for face and content validity and were therefore left out of round 2. Round 2 involved refining the checklist, resulting in an additional 17 items. In the last round, 3 items were deemed non-essential or infeasible, while 2 newly suggested items received unanimous agreement for inclusion, resulting in a final 26-item DL model reporting checklist derived from the Delphi process. The 26-item checklist facilitates the reproducible reporting of DL tools and enables scientists to replicate studies' results.


Subject(s)
Checklist , Deep Learning , Delphi Technique , Diagnostic Imaging , Humans , Reproducibility of Results , Diagnostic Imaging/methods , Diagnostic Imaging/standards , Surveys and Questionnaires
14.
J Arthroplasty ; 39(4): 966-973.e17, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37770007

ABSTRACT

BACKGROUND: Revision total hip arthroplasty (THA) requires preoperatively identifying in situ implants, a time-consuming and sometimes unachievable task. Although deep learning (DL) tools have been developed to automate this process, existing approaches are limited: they classify few femoral and no acetabular components, operate only on anterior-posterior (AP) radiographs, and do not report prediction uncertainty or flag outlier data. METHODS: This study introduces the Total Hip Arthroplasty Automated Implant Detector (THA-AID), a DL tool trained on 241,419 radiographs that identifies common designs of 20 femoral and 8 acetabular components from AP, lateral, or oblique views, reports prediction uncertainty using conformal prediction, and flags outliers using a custom detection framework. We evaluated THA-AID using internal, external, and out-of-domain test sets and compared its performance with that of human experts. RESULTS: THA-AID achieved internal test set accuracies of 98.9% for both femoral and acetabular components, with no significant differences based on radiographic view. The femoral classifier also achieved 97.0% accuracy on the external test set. Adding conformal prediction increased true label prediction by 0.1% for acetabular and 0.7% to 0.9% for femoral components. More than 99% of out-of-domain and more than 89% of in-domain outlier data were correctly identified by THA-AID. CONCLUSIONS: THA-AID is an automated tool for implant identification from radiographs, with exceptional performance on internal and external test sets and no decrement in performance based on radiographic view. To our knowledge, this is the first study in orthopedics to include uncertainty quantification and outlier detection in a DL model.
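The conformal prediction step that gives THA-AID its uncertainty estimates can be sketched with the standard split-conformal recipe for classification. This is a generic illustration under assumed inputs, not the authors' implementation: it takes the model's softmax probability of the true class on a held-out calibration set, derives a nonconformity threshold, and then returns, for a new radiograph, the set of implant designs consistent with that threshold.

```python
import math

def conformal_threshold(cal_probs_true: list, alpha: float = 0.1) -> float:
    """Split-conformal threshold from calibration nonconformity scores
    (1 - model probability of the true class). alpha is the target
    miscoverage rate, e.g. 0.1 for ~90% coverage."""
    scores = sorted(1.0 - p for p in cal_probs_true)
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # conservative rank
    return scores[min(k, n) - 1]

def prediction_set(class_probs: dict, qhat: float) -> list:
    """All classes whose nonconformity (1 - probability) is within qhat.
    A singleton set means a confident prediction; a large set flags
    uncertainty for human review."""
    return [c for c, p in class_probs.items() if 1.0 - p <= qhat]
```

The practical payoff matches the abstract's finding: instead of forcing a single implant label, the model can emit a small set guaranteed (under exchangeability) to contain the true design at the chosen coverage rate.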


Subject(s)
Arthroplasty, Replacement, Hip , Deep Learning , Hip Prosthesis , Humans , Uncertainty , Acetabulum/surgery , Retrospective Studies
15.
J Arthroplasty ; 39(3): 727-733.e4, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37619804

ABSTRACT

BACKGROUND: This study introduces THA-Net, a deep learning inpainting algorithm for simulating postoperative total hip arthroplasty (THA) radiographs from a single preoperative pelvis radiograph input, able to generate predictions either unconditionally (the algorithm chooses implants) or conditionally (the surgeon chooses implants). METHODS: THA-Net is a deep learning algorithm that receives an input preoperative radiograph and replaces the target hip joint with THA implants to generate a synthetic yet realistic postoperative radiograph. We trained THA-Net on 356,305 pairs of radiographs from 14,357 patients in a single institution's total joint registry and evaluated the validity (quality of surgical execution) and realism (ability to differentiate real and synthetic radiographs) of its outputs against both human-based and software-based criteria. RESULTS: The surgical validity of synthetic postoperative radiographs was significantly higher than that of their real counterparts (mean difference: 0.8 to 1.1 points on a 10-point Likert scale, P < .001), yet experts could not distinguish real from synthetic radiographs in blinded review. Synthetic images also showed excellent validity and realism when analyzed with previously validated deep learning models. CONCLUSION: We developed a next-generation THA templating tool that can generate synthetic radiographs graded higher on surgical execution than the real radiographs in the training data. Further refinement of this tool may potentiate patient-specific surgical planning and enable technologies such as robotics, navigation, and augmented reality (an online demo of THA-Net is available at: https://demo.osail.ai/tha_net).


Subject(s)
Arthroplasty, Replacement, Hip , Deep Learning , Hip Prosthesis , Humans , Arthroplasty, Replacement, Hip/methods , Hip Joint/diagnostic imaging , Hip Joint/surgery , Radiography , Retrospective Studies
16.
Radiol Artif Intell ; 5(6): e230085, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38074777

ABSTRACT

Radiographic markers contain protected health information that must be removed before public release. This work presents a deep learning algorithm that localizes radiographic markers and selectively removes them to enable de-identified data sharing. The authors annotated 2000 hip and pelvic radiographs to train an object detection computer vision model. Data were split into training, validation, and test sets at the patient level. Extracted markers were then characterized using an image processing algorithm, and potentially useful markers (eg, "L" and "R") without identifying information were retained. The model achieved an area under the precision-recall curve of 0.96 on the internal test set. The de-identification accuracy was 100% (400 of 400), with a de-identification false-positive rate of 1% (eight of 632) and a retention accuracy of 93% (359 of 386) for laterality markers. The algorithm was further validated on an external dataset of chest radiographs, achieving a de-identification accuracy of 96% (221 of 231). After fine-tuning the model on 20 images from the external dataset to investigate the potential for improvement, a 99.6% (230 of 231, P = .04) de-identification accuracy and decreased false-positive rate of 5% (26 of 512) were achieved. These results demonstrate the effectiveness of a two-pass approach in image de-identification. Keywords: Conventional Radiography, Skeletal-Axial, Thorax, Experimental Investigations, Supervised Learning, Transfer Learning, Convolutional Neural Network (CNN) Supplemental material is available for this article. © RSNA, 2023 See also the commentary by Chang and Li in this issue.

17.
Orthop J Sports Med ; 11(12): 23259671231215820, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38107846

ABSTRACT

Background: An increased posterior tibial slope (PTS) corresponds with an increased risk of graft failure after anterior cruciate ligament (ACL) reconstruction (ACLR). Validated methods of manual PTS measurement are subject to potential interobserver variability and can be inefficient on large datasets. Purpose/Hypothesis: To develop a deep learning artificial intelligence technique for automated PTS measurement from standard lateral knee radiographs. It was hypothesized that this deep learning tool would be able to measure the PTS on a high volume of radiographs expeditiously and that these measurements would be similar to previously validated manual measurements. Study Design: Cohort study (diagnosis); Level of evidence, 2. Methods: A deep learning U-Net model was developed on a cohort of 300 postoperative short-leg lateral radiographs from patients who underwent ACLR to segment the tibial shaft, tibial joint surface, and tibial tuberosity. The model was trained using a random 80:20 train-validation split. Masks for training images were manually segmented, and the model was trained for 400 epochs. An image processing pipeline was then deployed to annotate and measure the PTS using the predicted segmentation masks. Finally, the performance of this combined pipeline was compared with human measurements performed by 2 study personnel using a previously validated manual technique for measuring the PTS on short-leg lateral radiographs, on an independent test set consisting of both pre- and postoperative images. Results: The U-Net semantic segmentation model achieved a mean Dice similarity coefficient of 0.885 on the validation cohort. The mean difference between the human-made and computer-vision measurements was 1.92° (σ = 2.81° [P = .24]). Extreme disagreements between the human and machine measurements, defined as differences of ≥5°, occurred <5% of the time.
The model was incorporated into a web-based digital application front-end for demonstration purposes, which can measure a single uploaded image in Portable Network Graphics format in a mean time of 5 seconds. Conclusion: We developed an efficient and reliable deep learning computer vision algorithm to automate the PTS measurement on short-leg lateral knee radiographs. This tool, which demonstrated good agreement with human annotations, represents an effective clinical adjunct for measuring the PTS as part of the preoperative assessment of patients with ACL injuries.
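The geometric step at the end of the pipeline, turning segmentation masks into a slope angle, comes down to comparing two fitted lines. A hedged sketch under assumed conventions (the PTS taken as the angle between the tibial plateau line and the perpendicular to the tibial shaft axis, with both lines already fitted to the masks as direction vectors; the exact definition in the validated manual technique may differ):

```python
import math

def posterior_tibial_slope(shaft_vec: tuple, plateau_vec: tuple) -> float:
    """Angle (degrees) between the tibial plateau line and the line
    perpendicular to the tibial shaft axis.

    shaft_vec, plateau_vec: (dx, dy) direction vectors, e.g. from
    least-squares line fits to the predicted segmentation masks.
    """
    def line_angle(v):
        return math.degrees(math.atan2(v[1], v[0]))

    # Angle between the two (undirected) lines, folded into [0, 90].
    between = abs(line_angle(shaft_vec) - line_angle(plateau_vec)) % 180.0
    between = min(between, 180.0 - between)
    # Slope = deviation of the plateau from perpendicular to the shaft.
    return abs(90.0 - between)
```

A perfectly vertical shaft with a horizontal plateau gives 0°; a plateau tilted 10° posteriorly gives a PTS of 10°.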

18.
Article in English | MEDLINE | ID: mdl-37849415

ABSTRACT

The digitization of medical records and the expansion of electronic health records have created an era of "Big Data," with an abundance of available information ranging from clinical notes to imaging studies. In the field of rheumatology, medical imaging is used to guide both diagnosis and treatment of a wide variety of rheumatic conditions. Although there is an abundance of data to analyze, traditional methods of image analysis are human resource intensive. Fortunately, the growth of artificial intelligence (AI) may be a solution to handling large datasets. In particular, computer vision is a field within AI that analyzes images and extracts information. Computer vision has impressive capabilities and can be applied to rheumatologic conditions, necessitating an understanding of how computer vision works. In this article, we provide an overview of AI in rheumatology and conclude with a five-step process to plan and conduct research in the field of computer vision. The five steps are (1) project definition, (2) data handling, (3) model development, (4) performance evaluation, and (5) deployment into clinical care.

19.
Comput Methods Programs Biomed ; 242: 107832, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37778140

ABSTRACT

BACKGROUND: Medical image analysis pipelines often involve segmentation, which requires a large amount of annotated training data that is time-consuming and costly to produce. To address this issue, we proposed leveraging generative models to achieve few-shot image segmentation. METHODS: We trained a denoising diffusion probabilistic model (DDPM) on 480,407 pelvis radiographs to generate 256 × 256 px synthetic images. The DDPM was conditioned on demographic and radiologic characteristics and was rigorously validated by domain experts and objective image quality metrics (Fréchet inception distance [FID] and inception score [IS]). For the next step, three landmarks (greater trochanter [GT], lesser trochanter [LT], and obturator foramen [OF]) were annotated on 45 real-patient radiographs: 25 for training and 20 for testing. To extract features, each image was passed through the pre-trained DDPM at three timesteps, and for each pass, features from specific blocks were extracted. The features were concatenated with the real image to form an image with 4225 channels. The feature set was broken into random patches, which were fed to a U-Net. The Dice similarity coefficient (DSC) was used to compare the performance with a vanilla U-Net trained on radiographs. RESULTS: Expert accuracy was 57.5% in determining real versus generated images, while the model reached an FID of 7.2 and an IS of 210. The segmentation U-Net trained on the 20 feature sets achieved a DSC of 0.90, 0.84, and 0.61 for OF, GT, and LT segmentation, respectively, at least 0.30 points higher than the naively trained model. CONCLUSION: We demonstrated the applicability of DDPMs as feature extractors, facilitating medical image segmentation with few annotated samples.


Subject(s)
Benchmarking , Bisacodyl , Humans , Diffusion , Femur , Image Processing, Computer-Assisted
20.
N Am Spine Soc J ; 15: 100236, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37599816

ABSTRACT

Background: Artificial intelligence is a revolutionary technology that promises to assist clinicians in improving patient care. In radiology, deep learning (DL) is widely used in clinical decision aids due to its ability to analyze complex patterns and images. It allows for rapid, enhanced data and imaging analysis, from diagnosis to outcome prediction. The purpose of this study was to evaluate the current literature and clinical utilization of DL in spine imaging. Methods: This study is a scoping review and utilized the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology to review the scientific literature from 2012 to 2021. A search of the PubMed, Web of Science, Embase, and IEEE Xplore databases, with syntax specific to DL and medical imaging in spine care applications, was conducted to collect all original publications on the subject. Specific data were extracted from the available literature, including algorithm application, algorithms tested, database type and size, algorithm training method, and outcome of interest. Results: A total of 365 studies (total sample of 232,394 patients) were included and grouped into 4 general applications: diagnostic tools, clinical decision support tools, automated clinical/instrumentation assessment, and clinical outcome prediction. Notable disparities exist in the selected algorithms and in training across multiple disparate databases. The most frequently used algorithms were U-Net and ResNet. A DL model was developed and validated in 92% of included studies, while a pre-existing DL model was investigated in 8%. Of all developed models, only 15% have been externally validated. Conclusions: Based on this scoping review, DL in spine imaging is used in a broad range of clinical applications, particularly for diagnosing spinal conditions. There is a wide variety of DL algorithms, database characteristics, and training methods.
Future studies should focus on external validation of existing models before bringing them into clinical use.
