Búsqueda | Portal Regional de la BVS

1.

Simplifying risk stratification for thyroid nodules on ultrasound: validation and performance of an artificial intelligence thyroid imaging reporting and data system.

Wildman-Tobriner, Benjamin; Yang, Jichen; Allen, Brian C; Ho, Lisa M; Miller, Chad M; Mazurowski, Maciej A.

Curr Probl Diagn Radiol ; 2024 Jul 09.

Artículo en Inglés | MEDLINE | ID: mdl-39033064

RESUMEN

PURPOSE: To validate the performance of a recently created risk stratification system (RSS) for thyroid nodules on ultrasound, the Artificial Intelligence Thyroid Imaging Reporting and Data System (AI TI-RADS). MATERIALS AND METHODS: 378 thyroid nodules from 320 patients were included in this retrospective evaluation. All nodules had ultrasound images and had undergone fine needle aspiration (FNA). 147 nodules were Bethesda V or VI (suspicious or diagnostic for malignancy), and 231 were Bethesda II (benign). Three radiologists assigned features according to the AI TI-RADS lexicon (same categories and features as the American College of Radiology TI-RADS) to each nodule based on ultrasound images. FNA recommendations using AI TI-RADS and ACR TI-RADS were then compared and sensitivity and specificity for each RSS were calculated. RESULTS: Across three readers, mean sensitivity of AI TI-RADS was lower than ACR TI-RADS (0.69 vs 0.72, p < 0.02), while mean specificity was higher (0.40 vs 0.37, p < 0.02). Overall total number of points assigned by all three readers decreased slightly when using AI TI-RADS (5,998 for AI TI-RADS vs 6,015 for ACR TI-RADS), including more values of 0 to several features. CONCLUSION: AI TI-RADS performed similarly to ACR TI-RADS while eliminating point assignments for many features, allowing for simplification of future TI-RADS versions.

2.

Automated selection of abdominal MRI series using a DICOM metadata classifier and selective use of a pixel-based classifier.

Miller, Chad M; Zhu, Zhe; Mazurowski, Maciej A; Bashir, Mustafa R; Wiggins, Walter F.

Abdom Radiol (NY) ; 2024 Jun 11.

Artículo en Inglés | MEDLINE | ID: mdl-38860997

RESUMEN

Accurate, automated MRI series identification is important for many applications, including display ("hanging") protocols, machine learning, and radiomics. The use of the series description or a pixel-based classifier each has limitations. We demonstrate a combined approach utilizing a DICOM metadata-based classifier and selective use of a pixel-based classifier to identify abdominal MRI series. The metadata classifier was assessed alone as Group metadata and combined with selective use of the pixel-based classifier for predictions with less than 70% certainty (Group combined). The overall accuracy (mean and 95% confidence intervals) for Groups metadata and combined on the test dataset were 0.870 CI (0.824,0.912) and 0.930 CI (0.893,0.963), respectively. With this combined metadata and pixel-based approach, we demonstrate accurate classification of 95% or greater for all pre-contrast MRI series and improved performance for some post-contrast series.

3.

A publicly available deep learning model and dataset for segmentation of breast, fibroglandular tissue, and vessels in breast MRI.

Lew, Christopher O; Harouni, Majid; Kirksey, Ella R; Kang, Elianne J; Dong, Haoyu; Gu, Hanxue; Grimm, Lars J; Walsh, Ruth; Lowell, Dorothy A; Mazurowski, Maciej A.

Sci Rep ; 14(1): 5383, 2024 03 05.

Artículo en Inglés | MEDLINE | ID: mdl-38443410

RESUMEN

Breast density, or the amount of fibroglandular tissue (FGT) relative to the overall breast volume, increases the risk of developing breast cancer. Although previous studies have utilized deep learning to assess breast density, the limited public availability of data and quantitative tools hinders the development of better assessment tools. Our objective was to (1) create and share a large dataset of pixel-wise annotations according to well-defined criteria, and (2) develop, evaluate, and share an automated segmentation method for breast, FGT, and blood vessels using convolutional neural networks. We used the Duke Breast Cancer MRI dataset to randomly select 100 MRI studies and manually annotated the breast, FGT, and blood vessels for each study. Model performance was evaluated using the Dice similarity coefficient (DSC). The model achieved DSC values of 0.92 for breast, 0.86 for FGT, and 0.65 for blood vessels on the test set. The correlation between our model's predicted breast density and the manually generated masks was 0.95. The correlation between the predicted breast density and qualitative radiologist assessment was 0.75. Our automated models can accurately segment breast, FGT, and blood vessels using pre-contrast breast MRI data. The data and the models were made publicly available.

Asunto(s)

Neoplasias de la Mama , Aprendizaje Profundo , Humanos , Femenino , Imagen por Resonancia Magnética , Radiografía , Densidad de la Mama , Neoplasias de la Mama/diagnóstico por imagen

4.

Computed Tomography Volumetrics for Size Matching in Lung Transplantation for Restrictive Disease.

Prabhu, Neel K; Wong, Megan K; Klapper, Jacob A; Haney, John C; Mazurowski, Maciej A; Mammarappallil, Joseph G; Hartwig, Matthew G.

Ann Thorac Surg ; 117(2): 413-421, 2024 Feb.

Artículo en Inglés | MEDLINE | ID: mdl-37031770

RESUMEN

BACKGROUND: There is no consensus on the optimal allograft sizing strategy for lung transplantation in restrictive lung disease. Current methods that are based on predicted total lung capacity (pTLC) ratios do not account for the diminutive recipient chest size. The study investigators hypothesized that a new sizing ratio incorporating preoperative recipient computed tomographic lung volumes (CTVol) would be associated with postoperative outcomes. METHODS: A retrospective single-institution study was conducted of adults undergoing primary bilateral lung transplantation between January 2016 and July 2020 for restrictive lung disease. CTVol was computed for recipients by using advanced segmentation software. Two sizing ratios were calculated: pTLC ratio (pTLCdonor/pTLCrecipient) and a new volumetric ratio (pTLCdonor/CTVolrecipient). Patients were divided into reference, oversized, and undersized groups on the basis of ratio quintiles, and multivariable models were used to assess the effect of the ratios on primary graft dysfunction and survival. RESULTS: CTVol was successfully acquired in 218 of 220 (99.1%) patients. In adjusted analysis, undersizing on the basis of the volumetric ratio was independently associated with decreased primary graft dysfunction grade 2 or 3 within 72 hours (odds ratio, 0.42; 95% CI, 0.20-0.87; P =.02). The pTLC ratio was not significantly associated with primary graft dysfunction. Oversizing on the basis of the volumetric ratio was independently associated with an increased risk of death (hazard ratio, 2.27; 95% CI, 1.04-4.99; P =.04], whereas the pTLC ratio did not have a significant survival association. CONCLUSIONS: Using computed tomography-acquired lung volumes for donor-recipient size matching in lung transplantation is feasible with advanced segmentation software. This method may be more predictive of outcome compared with current sizing methods, which use gender and height only.

Asunto(s)

Enfermedades Pulmonares , Trasplante de Pulmón , Disfunción Primaria del Injerto , Adulto , Humanos , Pulmón/cirugía , Estudios Retrospectivos , Disfunción Primaria del Injerto/etiología , Tamaño de los Órganos , Trasplante de Pulmón/métodos , Enfermedades Pulmonares/cirugía , Donantes de Tejidos , Tomografía Computarizada por Rayos X

5.

MRI-based Deep Learning Assessment of Amyloid, Tau, and Neurodegeneration Biomarker Status across the Alzheimer Disease Spectrum.

Lew, Christopher O; Zhou, Longfei; Mazurowski, Maciej A; Doraiswamy, P Murali; Petrella, Jeffrey R.

Radiology ; 309(1): e222441, 2023 10.

Artículo en Inglés | MEDLINE | ID: mdl-37815445

RESUMEN

Background PET can be used for amyloid-tau-neurodegeneration (ATN) classification in Alzheimer disease, but incurs considerable cost and exposure to ionizing radiation. MRI currently has limited use in characterizing ATN status. Deep learning techniques can detect complex patterns in MRI data and have potential for noninvasive characterization of ATN status. Purpose To use deep learning to predict PET-determined ATN biomarker status using MRI and readily available diagnostic data. Materials and Methods MRI and PET data were retrospectively collected from the Alzheimer's Disease Imaging Initiative. PET scans were paired with MRI scans acquired within 30 days, from August 2005 to September 2020. Pairs were randomly split into subsets as follows: 70% for training, 10% for validation, and 20% for final testing. A bimodal Gaussian mixture model was used to threshold PET scans into positive and negative labels. MRI data were fed into a convolutional neural network to generate imaging features. These features were combined in a logistic regression model with patient demographics, APOE gene status, cognitive scores, hippocampal volumes, and clinical diagnoses to classify each ATN biomarker component as positive or negative. Area under the receiver operating characteristic curve (AUC) analysis was used for model evaluation. Feature importance was derived from model coefficients and gradients. Results There were 2099 amyloid (mean patient age, 75 years ± 10 [SD]; 1110 male), 557 tau (mean patient age, 75 years ± 7; 280 male), and 2768 FDG PET (mean patient age, 75 years ± 7; 1645 male) and MRI pairs. Model AUCs for the test set were as follows: amyloid, 0.79 (95% CI: 0.74, 0.83); tau, 0.73 (95% CI: 0.58, 0.86); and neurodegeneration, 0.86 (95% CI: 0.83, 0.89). Within the networks, high gradients were present in key temporal, parietal, frontal, and occipital cortical regions. Model coefficients for cognitive scores, hippocampal volumes, and APOE status were highest. Conclusion A deep learning algorithm predicted each component of PET-determined ATN status with acceptable to excellent efficacy using MRI and other available diagnostic data. © RSNA, 2023 Supplemental material is available for this article.

Asunto(s)

Enfermedad de Alzheimer , Disfunción Cognitiva , Aprendizaje Profundo , Anciano , Humanos , Masculino , Enfermedad de Alzheimer/diagnóstico por imagen , Amiloide , Péptidos beta-Amiloides , Apolipoproteínas E , Biomarcadores , Imagen por Resonancia Magnética/métodos , Tomografía de Emisión de Positrones/métodos , Estudios Retrospectivos , Proteínas tau , Femenino

6.

Duke Liver Dataset: A Publicly Available Liver MRI Dataset with Liver Segmentation Masks and Series Labels.

Macdonald, Jacob A; Zhu, Zhe; Konkel, Brandon; Mazurowski, Maciej A; Wiggins, Walter F; Bashir, Mustafa R.

Radiol Artif Intell ; 5(5): e220275, 2023 Sep.

Artículo en Inglés | MEDLINE | ID: mdl-37795141

RESUMEN

The Duke Liver Dataset contains 2146 abdominal MRI series from 105 patients, including a majority with cirrhotic features, and 310 image series with corresponding manually segmented liver masks.

7.

SWSSL: Sliding Window-Based Self-Supervised Learning for Anomaly Detection in High-Resolution Images.

Dong, Haoyu; Zhang, Yifan; Gu, Hanxue; Konz, Nicholas; Zhang, Yixin; Mazurowski, Maciej A.

IEEE Trans Med Imaging ; 42(12): 3860-3870, 2023 Dec.

Artículo en Inglés | MEDLINE | ID: mdl-37695965

RESUMEN

Anomaly detection (AD) aims to determine if an instance has properties different from those seen in normal cases. The success of this technique depends on how well a neural network learns from normal instances. We observe that the learning difficulty scales exponentially with the input resolution, making it infeasible to apply AD to high-resolution images. Resizing them to a lower resolution is a compromising solution and does not align with clinical practice where the diagnosis could depend on image details. In this work, we propose to train the network and perform inference at the patch level, through the sliding window algorithm. This simple operation allows the network to receive high-resolution images but introduces additional training difficulties, including inconsistent image structure and higher variance. We address these concerns by setting the network's objective to learn augmentation-invariant features. We further study the augmentation function in the context of medical imaging. In particular, we observe that the resizing operation, a key augmentation in general computer vision literature, is detrimental to detection accuracy, and the inverting operation can be beneficial. We also propose a new module that encourages the network to learn from adjacent patches to boost detection performance. Extensive experiments are conducted on breast tomosynthesis and chest X-ray datasets and our method improves 8.03% and 5.66% AUC on image-level classification respectively over the current leading techniques. The experimental results demonstrate the effectiveness of our approach.

Asunto(s)

Algoritmos , Redes Neurales de la Computación , Aprendizaje Automático Supervisado

8.

Segment anything model for medical image analysis: An experimental study.

Mazurowski, Maciej A; Dong, Haoyu; Gu, Hanxue; Yang, Jichen; Konz, Nicholas; Zhang, Yixin.

Med Image Anal ; 89: 102918, 2023 10.

Artículo en Inglés | MEDLINE | ID: mdl-37595404

RESUMEN

Training segmentation models for medical images continues to be challenging due to the limited availability of data annotations. Segment Anything Model (SAM) is a foundation model trained on over 1 billion annotations, predominantly for natural images, that is intended to segment user-defined objects of interest in an interactive manner. While the model performance on natural images is impressive, medical image domains pose their own set of challenges. Here, we perform an extensive evaluation of SAM's ability to segment medical images on a collection of 19 medical imaging datasets from various modalities and anatomies. In our experiments, we generated point and box prompts for SAM using a standard method that simulates interactive segmentation. We report the following findings: (1) SAM's performance based on single prompts highly varies depending on the dataset and the task, from IoU=0.1135 for spine MRI to IoU=0.8650 for hip X-ray. (2) Segmentation performance appears to be better for well-circumscribed objects with prompts with less ambiguity such as the segmentation of organs in computed tomography and poorer in various other scenarios such as the segmentation of brain tumors. (3) SAM performs notably better with box prompts than with point prompts. (4) SAM outperforms similar methods RITM, SimpleClick, and FocalClick in almost all single-point prompt settings. (5) When multiple-point prompts are provided iteratively, SAM's performance generally improves only slightly while other methods' performance improves to the level that surpasses SAM's point-based performance. We also provide several illustrations for SAM's performance on all tested datasets, iterative segmentation, and SAM's behavior given prompt ambiguity. We conclude that SAM shows impressive zero-shot segmentation performance for certain medical imaging datasets, but moderate to poor performance for others. SAM has the potential to make a significant impact in automated medical image segmentation in medical imaging, but appropriate care needs to be applied when using it. Code for evaluation SAM is made publicly available at https://github.com/mazurowski-lab/segment-anything-medical-evaluation.

Asunto(s)

Neoplasias Encefálicas , Humanos , S-Adenosilmetionina , Tomografía Computarizada por Rayos X

9.

Improving Image Classification of Knee Radiographs: An Automated Image Labeling Approach.

Zhang, Jikai; Santos, Carlos; Park, Christine; Mazurowski, Maciej A; Colglazier, Roy.

J Digit Imaging ; 36(6): 2402-2410, 2023 12.

Artículo en Inglés | MEDLINE | ID: mdl-37620710

RESUMEN

Large numbers of radiographic images are available in musculoskeletal radiology practices which could be used for training of deep learning models for diagnosis of knee abnormalities. However, those images do not typically contain readily available labels due to limitations of human annotations. The purpose of our study was to develop an automated labeling approach that improves the image classification model to distinguish normal knee images from those with abnormalities or prior arthroplasty. The automated labeler was trained on a small set of labeled data to automatically label a much larger set of unlabeled data, further improving the image classification performance for knee radiographic diagnosis. We used BioBERT and EfficientNet as the feature extraction backbone of the labeler and imaging model, respectively. We developed our approach using 7382 patients and validated it on a separate set of 637 patients. The final image classification model, trained using both manually labeled and pseudo-labeled data, had the higher weighted average AUC (WA-AUC 0.903) value and higher AUC values among all classes (normal AUC 0.894; abnormal AUC 0.896, arthroplasty AUC 0.990) compared to the baseline model (WA-AUC = 0.857; normal AUC 0.842; abnormal AUC 0.848, arthroplasty AUC 0.987), trained using only manually labeled data. Statistical tests show that the improvement is significant on normal (p value < 0.002), abnormal (p value < 0.001), and WA-AUC (p value = 0.001). Our findings demonstrated that the proposed automated labeling approach significantly improves the performance of image classification for radiographic knee diagnosis, allowing for facilitating patient care and curation of large knee datasets.

Asunto(s)

Articulación de la Rodilla , Radiología , Humanos , Radiografía , Articulación de la Rodilla/diagnóstico por imagen , Artroplastia

10.

Feasibility of predicting a screening digital breast tomosynthesis recall using features extracted from the electronic medical record.

Zhang, Jikai; Mazurowski, Maciej A; Grimm, Lars J.

Eur J Radiol ; 166: 110979, 2023 Sep.

Artículo en Inglés | MEDLINE | ID: mdl-37473618

RESUMEN

PURPOSE: Tools to predict a screening mammogram recall at the time of scheduling could improve patient care. We extracted patient demographic and breast care history information within the electronic medical record (EMR) for women undergoing digital breast tomosynthesis (DBT) to identify which factors were associated with a screening recall recommendation. METHOD: In 2018, 21,543 women aged 40 years or greater who underwent screening DBT at our institution were identified. Demographic information and breast care factors were extracted automatically from the EMR. The primary outcome was a screening recall recommendation of BI-RADS 0. A multivariable logistic regression model was built and included age, race, ethnicity groups, family breast cancer history, personal breast cancer history, surgical breast cancer history, recall history, and days since last available screening mammogram. RESULTS: Multiple factors were associated with a recall on the multivariable model: history of breast cancer surgery (OR: 2.298, 95% CI: 1.854, 2.836); prior recall within the last five years (vs no prior, OR: 0.768, 95% CI: 0.687, 0.858); prior screening mammogram within 0-18 months (vs no prior, OR: 0.601, 95% CI: 0.520, 0.691), prior screening mammogram within 18-30 months (vs no prior, OR: 0.676, 95% CI: 0.520, 0.691); and age (normalized OR: 0.723, 95% CI: 0.690, 0.758). CONCLUSIONS: It is feasible to predict a DBT screening recall recommendation using patient demographics and breast care factors that can be extracted automatically from the EMR.

Asunto(s)

Neoplasias de la Mama , Registros Electrónicos de Salud , Femenino , Humanos , Estudios de Factibilidad , Mamografía , Neoplasias de la Mama/diagnóstico por imagen , Densidad de la Mama , Mama , Detección Precoz del Cáncer , Tamizaje Masivo , Estudios Retrospectivos

11.

Multistep Automated Data Labelling Procedure (MADLaP) for thyroid nodules on ultrasound: An artificial intelligence approach for automating image annotation.

Zhang, Jikai; Mazurowski, Maciej A; Allen, Brian C; Wildman-Tobriner, Benjamin.

Artif Intell Med ; 141: 102553, 2023 07.

Artículo en Inglés | MEDLINE | ID: mdl-37295897

RESUMEN

Machine learning (ML) for diagnosis of thyroid nodules on ultrasound is an active area of research. However, ML tools require large, well-labeled datasets, the curation of which is time-consuming and labor-intensive. The purpose of our study was to develop and test a deep-learning-based tool to facilitate and automate the data annotation process for thyroid nodules; we named our tool Multistep Automated Data Labelling Procedure (MADLaP). MADLaP was designed to take multiple inputs including pathology reports, ultrasound images, and radiology reports. Using multiple step-wise 'modules' including rule-based natural language processing, deep-learning-based imaging segmentation, and optical character recognition, MADLaP automatically identified images of a specific thyroid nodule and correctly assigned a pathology label. The model was developed using a training set of 378 patients across our health system and tested on a separate set of 93 patients. Ground truths for both sets were selected by an experienced radiologist. Performance metrics including yield (how many labeled images the model produced) and accuracy (percentage correct) were measured using the test set. MADLaP achieved a yield of 63 % and an accuracy of 83 %. The yield progressively increased as the input data moved through each module, while accuracy peaked part way through. Error analysis showed that inputs from certain examination sites had lower accuracy (40 %) than the other sites (90 %, 100 %). MADLaP successfully created curated datasets of labeled ultrasound images of thyroid nodules. While accurate, the relatively suboptimal yield of MADLaP exposed some challenges when trying to automatically label radiology images from heterogeneous sources. The complex task of image curation and annotation could be automated, allowing for enrichment of larger datasets for use in machine learning development.

Asunto(s)

Nódulo Tiroideo , Humanos , Nódulo Tiroideo/diagnóstico por imagen , Nódulo Tiroideo/patología , Inteligencia Artificial , Curaduría de Datos , Ultrasonografía/métodos , Redes Neurales de la Computación

12.

Unsupervised anomaly localization in high-resolution breast scans using deep pluralistic image completion.

Konz, Nicholas; Dong, Haoyu; Mazurowski, Maciej A.

Med Image Anal ; 87: 102836, 2023 07.

Artículo en Inglés | MEDLINE | ID: mdl-37201220

RESUMEN

Automated tumor detection in Digital Breast Tomosynthesis (DBT) is a difficult task due to natural tumor rarity, breast tissue variability, and high resolution. Given the scarcity of abnormal images and the abundance of normal images for this problem, an anomaly detection/localization approach could be well-suited. However, most anomaly localization research in machine learning focuses on non-medical datasets, and we find that these methods fall short when adapted to medical imaging datasets. The problem is alleviated when we solve the task from the image completion perspective, in which the presence of anomalies can be indicated by a discrepancy between the original appearance and its auto-completion conditioned on the surroundings. However, there are often many valid normal completions given the same surroundings, especially in the DBT dataset, making this evaluation criterion less precise. To address such an issue, we consider pluralistic image completion by exploring the distribution of possible completions instead of generating fixed predictions. This is achieved through our novel application of spatial dropout on the completion network during inference time only, which requires no additional training cost and is effective at generating diverse completions. We further propose minimum completion distance (MCD), a new metric for detecting anomalies, thanks to these stochastic completions. We provide theoretical as well as empirical support for the superiority over existing methods of using the proposed method for anomaly localization. On the DBT dataset, our model outperforms other state-of-the-art methods by at least 10% AUROC for pixel-level detection.

Asunto(s)

Neoplasias de la Mama , Mamografía , Humanos , Femenino , Mamografía/métodos , Aprendizaje Automático , Neoplasias de la Mama/diagnóstico por imagen

13.

Deep learning for classification of thyroid nodules on ultrasound: validation on an independent dataset.

Weng, Jingxi; Wildman-Tobriner, Benjamin; Buda, Mateusz; Yang, Jichen; Ho, Lisa M; Allen, Brian C; Ehieli, Wendy L; Miller, Chad M; Zhang, Jikai; Mazurowski, Maciej A.

Clin Imaging ; 99: 60-66, 2023 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-37116263

RESUMEN

OBJECTIVES: The purpose is to apply a previously validated deep learning algorithm to a new thyroid nodule ultrasound image dataset and compare its performances with radiologists. METHODS: Prior study presented an algorithm which is able to detect thyroid nodules and then make malignancy classifications with two ultrasound images. A multi-task deep convolutional neural network was trained from 1278 nodules and originally tested with 99 separate nodules. The results were comparable with that of radiologists. The algorithm was further tested with 378 nodules imaged with ultrasound machines from different manufacturers and product types than the training cases. Four experienced radiologists were requested to evaluate the nodules for comparison with deep learning. RESULTS: The Area Under Curve (AUC) of the deep learning algorithm and four radiologists were calculated with parametric, binormal estimation. For the deep learning algorithm, the AUC was 0.69 (95% CI: 0.64-0.75). The AUC of radiologists were 0.63 (95% CI: 0.59-0.67), 0.66 (95% CI:0.61-0.71), 0.65 (95% CI: 0.60-0.70), and 0.63 (95%CI: 0.58-0.67). CONCLUSION: In the new testing dataset, the deep learning algorithm achieved similar performances with all four radiologists. The relative performance difference between the algorithm and the radiologists is not significantly affected by the difference of ultrasound scanner.

Asunto(s)

Aprendizaje Profundo , Nódulo Tiroideo , Humanos , Nódulo Tiroideo/diagnóstico por imagen , Nódulo Tiroideo/patología , Estudios Retrospectivos , Ultrasonografía/métodos , Redes Neurales de la Computación

14.

A Competition, Benchmark, Code, and Data for Using Artificial Intelligence to Detect Lesions in Digital Breast Tomosynthesis.

Konz, Nicholas; Buda, Mateusz; Gu, Hanxue; Saha, Ashirbani; Yang, Jichen; Chledowski, Jakub; Park, Jungkyu; Witowski, Jan; Geras, Krzysztof J; Shoshan, Yoel; Gilboa-Solomon, Flora; Khapun, Daniel; Ratner, Vadim; Barkan, Ella; Ozery-Flato, Michal; Martí, Robert; Omigbodun, Akinyinka; Marasinou, Chrysostomos; Nakhaei, Noor; Hsu, William; Sahu, Pranjal; Hossain, Md Belayat; Lee, Juhun; Santos, Carlos; Przelaskowski, Artur; Kalpathy-Cramer, Jayashree; Bearce, Benjamin; Cha, Kenny; Farahani, Keyvan; Petrick, Nicholas; Hadjiiski, Lubomir; Drukker, Karen; Armato, Samuel G; Mazurowski, Maciej A.

JAMA Netw Open ; 6(2): e230524, 2023 02 01.

Artículo en Inglés | MEDLINE | ID: mdl-36821110

RESUMEN

Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide. Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods. Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT. A data set of 22â¯032 reconstructed DBT volumes was made available to research teams. Phase 1, in which teams were provided 700 scans from the training set, 120 from the validation set, and 180 from the test set, took place from December 2020 to January 2021, and phase 2, in which teams were given the full data set, took place from May to July 2021. Main Outcomes and Measures: The overall performance was evaluated by mean sensitivity for biopsied lesions using only DBT volumes with biopsied lesions; ties were broken by including all DBT volumes. Results: A total of 8 teams participated in the challenge. The team with the highest mean sensitivity for biopsied lesions was the NYU B-Team, with 0.957 (95% CI, 0.924-0.984), and the second-place team, ZeDuS, had a mean sensitivity of 0.926 (95% CI, 0.881-0.964). When the results were aggregated, the mean sensitivity for all submitted algorithms was 0.879; for only those who participated in phase 2, it was 0.926. Conclusions and Relevance: In this diagnostic study, an international competition produced algorithms with high sensitivity for using AI to detect lesions on DBT images. A standardized performance benchmark for the detection task using publicly available clinical imaging data was released, with detailed descriptions and analyses of submitted algorithms accompanied by a public release of their predictions and code for selected methods. These resources will serve as a foundation for future research on computer-assisted diagnosis methods for DBT, significantly lowering the barrier of entry for new researchers.

Asunto(s)

Inteligencia Artificial , Neoplasias de la Mama , Humanos , Femenino , Benchmarking , Mamografía/métodos , Algoritmos , Interpretación de Imagen Radiográfica Asistida por Computador/métodos , Neoplasias de la Mama/diagnóstico por imagen

15.

Thyroid Nodules on Ultrasound in Children and Young Adults: Comparison of Diagnostic Performance of Radiologists' Impressions, ACR TI-RADS, and a Deep Learning Algorithm.

Yang, Jichen; Page, Laura C; Wagner, Lars; Wildman-Tobriner, Benjamin; Bisset, Logan; Frush, Donald; Mazurowski, Maciej A.

AJR Am J Roentgenol ; 220(3): 408-417, 2023 03.

Artículo en Inglés | MEDLINE | ID: mdl-36259591

RESUMEN

BACKGROUND. In current clinical practice, thyroid nodules in children are generally evaluated on the basis of radiologists' overall impressions of ultrasound images. OBJECTIVE. The purpose of this article is to compare the diagnostic performance of radiologists' overall impression, the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS), and a deep learning algorithm in differentiating benign and malignant thyroid nodules on ultrasound in children and young adults. METHODS. This retrospective study included 139 patients (median age 17.5 years; 119 female patients, 20 male patients) evaluated from January 1, 2004, to September 18, 2020, who were 21 years old and younger with a thyroid nodule on ultrasound with definitive pathologic results from fine-needle aspiration and/or surgical excision to serve as the reference standard. A single nodule per patient was selected, and one transverse and one longitudinal image each of the nodules were extracted for further evaluation. Three radiologists independently characterized nodules on the basis of their overall impression (benign vs malignant) and ACR TI-RADS. A previously developed deep learning algorithm determined for each nodule a likelihood of malignancy, which was used to derive a risk level. Sensitivities and specificities for malignancy were calculated. Agreement was assessed using Cohen kappa coefficients. RESULTS. For radiologists' overall impression, sensitivity ranged from 32.1% to 75.0% (mean, 58.3%; 95% CI, 49.2-67.3%), and specificity ranged from 63.8% to 93.9% (mean, 79.9%; 95% CI, 73.8-85.7%). For ACR TI-RADS, sensitivity ranged from 82.1% to 87.5% (mean, 85.1%; 95% CI, 77.3-92.1%), and specificity ranged from 47.0% to 54.2% (mean, 50.6%; 95% CI, 41.4-59.8%). The deep learning algorithm had a sensitivity of 87.5% (95% CI, 78.3-95.5%) and specificity of 36.1% (95% CI, 25.6-46.8%). Interobserver agreement among pairwise combinations of readers, expressed as kappa, for overall impression was 0.227-0.472 and for ACR TI-RADS was 0.597-0.643. CONCLUSION. Both ACR TI-RADS and the deep learning algorithm had higher sensitivity albeit lower specificity compared with overall impressions. The deep learning algorithm had similar sensitivity but lower specificity than ACR TI-RADS. Interobserver agreement was higher for ACR TI-RADS than for overall impressions. CLINICAL IMPACT. ACR TI-RADS and the deep learning algorithm may serve as potential alternative strategies for guiding decisions to perform fine-needle aspiration of thyroid nodules in children.

Asunto(s)

Aprendizaje Profundo , Nódulo Tiroideo , Humanos , Masculino , Niño , Femenino , Adulto Joven , Adolescente , Adulto , Nódulo Tiroideo/patología , Estudios Retrospectivos , Ultrasonografía/métodos , Radiólogos

16.

Deep Learning for Breast MRI Style Transfer with Limited Training Data.

Cao, Shixing; Konz, Nicholas; Duncan, James; Mazurowski, Maciej A.

J Digit Imaging ; 36(2): 666-678, 2023 04.

Artículo en Inglés | MEDLINE | ID: mdl-36544066

RESUMEN

In this work we introduce a novel medical image style transfer method, StyleMapper, that can transfer medical scans to an unseen style with access to limited training data. This is made possible by training our model on unlimited possibilities of simulated random medical imaging styles on the training set, making our work more computationally efficient when compared with other style transfer methods. Moreover, our method enables arbitrary style transfer: transferring images to styles unseen in training. This is useful for medical imaging, where images are acquired using different protocols and different scanner models, resulting in a variety of styles that data may need to be transferred between. Our model disentangles image content from style and can modify an image's style by simply replacing the style encoding with one extracted from a single image of the target style, with no additional optimization required. This also allows the model to distinguish between different styles of images, including among those that were unseen in training. We propose a formal description of the proposed model. Experimental results on breast magnetic resonance images indicate the effectiveness of our method for style transfer. Our style transfer method allows for the alignment of medical images taken with different scanners into a single unified style dataset, allowing for the training of other downstream tasks on such a dataset for tasks such as classification, object detection and others.

Asunto(s)

Aprendizaje Profundo , Humanos , Imagen por Resonancia Magnética , Radiografía , Procesamiento de Imagen Asistido por Computador/métodos

17.

The Need for Targeted Labeling of Machine Learning-Based Software as a Medical Device.

Goldstein, Benjamin A; Mazurowski, Maciej A; Li, Cheng.

JAMA Netw Open ; 5(11): e2242351, 2022 11 01.

Artículo en Inglés | MEDLINE | ID: mdl-36409502

Asunto(s)

Aprendizaje Automático , Programas Informáticos , Humanos

18.

Artificial Intelligence (AI) Tools for Thyroid Nodules on Ultrasound, From the AJR Special Series on AI Applications.

Wildman-Tobriner, Benjamin; Taghi-Zadeh, Elmira; Mazurowski, Maciej A.

AJR Am J Roentgenol ; 219(4): 1-8, 2022 10.

Artículo en Inglés | MEDLINE | ID: mdl-35383487

RESUMEN

Artificial intelligence (AI) methods for evaluating thyroid nodules on ultrasound have been widely described in the literature, with reported performance of AI tools matching or in some instances surpassing radiologists' performance. As these data have accumulated, products for classification and risk stratification of thyroid nodules on ultrasound have become commercially available. This article reviews FDA-approved products currently on the market, with a focus on product features, reported performance, and considerations for implementation. The products perform risk stratification primarily using a Thyroid Imaging Reporting and Data System (TIRADS), though may provide additional prediction tools independent of TIRADS. Key issues in implementation include integration with radiologist interpretation, impact on workflow and efficiency, and performance monitoring. AI applications beyond nodule classification, including report construction and incidental findings follow-up, are also described. Anticipated future directions of research and development in AI tools for thyroid nodules are highlighted.

Asunto(s)

Neoplasias de la Tiroides , Nódulo Tiroideo , Inteligencia Artificial , Humanos , Nódulo Tiroideo/diagnóstico por imagen , Ultrasonografía/métodos

19.

Multi-label annotation of text reports from computed tomography of the chest, abdomen, and pelvis using deep learning.

D'Anniballe, Vincent M; Tushar, Fakrul Islam; Faryna, Khrystyna; Han, Songyue; Mazurowski, Maciej A; Rubin, Geoffrey D; Lo, Joseph Y.

BMC Med Inform Decis Mak ; 22(1): 102, 2022 04 15.

Artículo en Inglés | MEDLINE | ID: mdl-35428335

RESUMEN

BACKGROUND: There is progress to be made in building artificially intelligent systems to detect abnormalities that are not only accurate but can handle the true breadth of findings that radiologists encounter in body (chest, abdomen, and pelvis) computed tomography (CT). Currently, the major bottleneck for developing multi-disease classifiers is a lack of manually annotated data. The purpose of this work was to develop high throughput multi-label annotators for body CT reports that can be applied across a variety of abnormalities, organs, and disease states thereby mitigating the need for human annotation. METHODS: We used a dictionary approach to develop rule-based algorithms (RBA) for extraction of disease labels from radiology text reports. We targeted three organ systems (lungs/pleura, liver/gallbladder, kidneys/ureters) with four diseases per system based on their prevalence in our dataset. To expand the algorithms beyond pre-defined keywords, attention-guided recurrent neural networks (RNN) were trained using the RBA-extracted labels to classify reports as being positive for one or more diseases or normal for each organ system. Alternative effects on disease classification performance were evaluated using random initialization or pre-trained embedding as well as different sizes of training datasets. The RBA was tested on a subset of 2158 manually labeled reports and performance was reported as accuracy and F-score. The RNN was tested against a test set of 48,758 reports labeled by RBA and performance was reported as area under the receiver operating characteristic curve (AUC), with 95% CIs calculated using the DeLong method. RESULTS: Manual validation of the RBA confirmed 91-99% accuracy across the 15 different labels. Our models extracted disease labels from 261,229 radiology reports of 112,501 unique subjects. Pre-trained models outperformed random initialization across all diseases. As the training dataset size was reduced, performance was robust except for a few diseases with a relatively small number of cases. Pre-trained classification AUCs reached > 0.95 for all four disease outcomes and normality across all three organ systems. CONCLUSIONS: Our label-extracting pipeline was able to encompass a variety of cases and diseases in body CT reports by generalizing beyond strict rules with exceptional accuracy. The method described can be easily adapted to enable automated labeling of hospital-scale medical data sets for training image-based disease classifiers.

Asunto(s)

Aprendizaje Profundo , Abdomen , Humanos , Redes Neurales de la Computación , Pelvis/diagnóstico por imagen , Tomografía Computarizada por Rayos X

20.

Classification of Multiple Diseases on Body CT Scans Using Weakly Supervised Deep Learning.

Tushar, Fakrul Islam; D'Anniballe, Vincent M; Hou, Rui; Mazurowski, Maciej A; Fu, Wanyi; Samei, Ehsan; Rubin, Geoffrey D; Lo, Joseph Y.

Radiol Artif Intell ; 4(1): e210026, 2022 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-35146433

RESUMEN

PURPOSE: To design multidisease classifiers for body CT scans for three different organ systems using automatically extracted labels from radiology text reports. MATERIALS AND METHODS: This retrospective study included a total of 12 092 patients (mean age, 57 years ± 18 [standard deviation]; 6172 women) for model development and testing. Rule-based algorithms were used to extract 19 225 disease labels from 13 667 body CT scans performed between 2012 and 2017. Using a three-dimensional DenseVNet, three organ systems were segmented: lungs and pleura, liver and gallbladder, and kidneys and ureters. For each organ system, a three-dimensional convolutional neural network classified each as no apparent disease or for the presence of four common diseases, for a total of 15 different labels across all three models. Testing was performed on a subset of 2158 CT volumes relative to 2875 manually derived reference labels from 2133 patients (mean age, 58 years ± 18; 1079 women). Performance was reported as area under the receiver operating characteristic curve (AUC), with 95% CIs calculated using the DeLong method. RESULTS: Manual validation of the extracted labels confirmed 91%-99% accuracy across the 15 different labels. AUCs for lungs and pleura labels were as follows: atelectasis, 0.77 (95% CI: 0.74, 0.81); nodule, 0.65 (95% CI: 0.61, 0.69); emphysema, 0.89 (95% CI: 0.86, 0.92); effusion, 0.97 (95% CI: 0.96, 0.98); and no apparent disease, 0.89 (95% CI: 0.87, 0.91). AUCs for liver and gallbladder were as follows: hepatobiliary calcification, 0.62 (95% CI: 0.56, 0.67); lesion, 0.73 (95% CI: 0.69, 0.77); dilation, 0.87 (95% CI: 0.84, 0.90); fatty, 0.89 (95% CI: 0.86, 0.92); and no apparent disease, 0.82 (95% CI: 0.78, 0.85). AUCs for kidneys and ureters were as follows: stone, 0.83 (95% CI: 0.79, 0.87); atrophy, 0.92 (95% CI: 0.89, 0.94); lesion, 0.68 (95% CI: 0.64, 0.72); cyst, 0.70 (95% CI: 0.66, 0.73); and no apparent disease, 0.79 (95% CI: 0.75, 0.83). CONCLUSION: Weakly supervised deep learning models were able to classify diverse diseases in multiple organ systems from CT scans.Keywords: CT, Diagnosis/Classification/Application Domain, Semisupervised Learning, Whole-Body Imaging© RSNA, 2022.

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA