Results 1 - 16 of 16
1.
IEEE Trans Med Imaging ; 43(1): 542-557, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37713220

ABSTRACT

The early detection of glaucoma is essential in preventing visual impairment. Artificial intelligence (AI) can be used to analyze color fundus photographs (CFPs) in a cost-effective manner, making glaucoma screening more accessible. While AI models for glaucoma screening from CFPs have shown promising results in laboratory settings, their performance decreases significantly in real-world scenarios due to the presence of out-of-distribution and low-quality images. To address this issue, we propose the Artificial Intelligence for Robust Glaucoma Screening (AIROGS) challenge. This challenge includes a large dataset of around 113,000 images from about 60,000 patients and 500 different screening centers, and encourages the development of algorithms that are robust to ungradable and unexpected input data. In this paper, we evaluated solutions from 14 teams and found that the best teams performed similarly to a set of 20 expert ophthalmologists and optometrists. The highest-scoring team achieved an area under the receiver operating characteristic curve of 0.99 (95% CI: 0.98-0.99) for detecting ungradable images on-the-fly. Additionally, many of the algorithms showed robust performance when tested on three other publicly available datasets. These results demonstrate the feasibility of robust AI-enabled glaucoma screening.
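For readers who want to reproduce this kind of headline metric, the following is a minimal sketch (Python, not the challenge's evaluation code) of an AUROC with a bootstrapped 95% CI for an ungradability score; the labels and scores are placeholders.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=1000)           # placeholder binary labels
    y_score = y_true * 0.5 + rng.random(1000) * 0.7  # placeholder scores

    auc = roc_auc_score(y_true, y_score)
    boot = []
    for _ in range(2000):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:
            continue  # a bootstrap sample must contain both classes
        boot.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"AUROC = {auc:.2f} (95% CI: {lo:.2f}-{hi:.2f})")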


Subject(s)
Artificial Intelligence , Glaucoma , Humans , Glaucoma/diagnostic imaging , Fundus Oculi , Diagnostic Techniques, Ophthalmological , Algorithms
2.
Med Image Anal ; 92: 103059, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38104402

ABSTRACT

Artificial intelligence (AI) has a multitude of applications in cancer research and oncology. However, the training of AI systems is impeded by the limited availability of large datasets due to data protection requirements and other regulatory obstacles. Federated and swarm learning represent possible solutions to this problem by collaboratively training AI models while avoiding data transfer. However, in these decentralized methods, weight updates are still transferred to the aggregation server for merging the models. This leaves the possibility for a breach of data privacy, for example by model inversion or membership inference attacks by untrusted servers. Somewhat-homomorphically-encrypted federated learning (SHEFL) is a solution to this problem because only encrypted weights are transferred, and model updates are performed in the encrypted space. Here, we demonstrate the first successful implementation of SHEFL in a range of clinically relevant tasks in cancer image analysis on multicentric datasets in radiology and histopathology. We show that SHEFL enables the training of AI models which outperform locally trained models and perform on par with models which are centrally trained. In the future, SHEFL can enable multiple institutions to co-train AI models without forsaking data governance and without ever transmitting any decryptable data to untrusted servers.
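A minimal sketch of the encrypted-aggregation idea, using the TenSEAL CKKS library as an assumed implementation (the abstract does not name the authors' tooling): clients encrypt their weight updates, and the server averages ciphertexts without ever decrypting them.

    import tenseal as ts
    import numpy as np

    ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
    ctx.global_scale = 2 ** 40
    ctx.generate_galois_keys()

    # Two clients' flattened weight updates -- placeholders.
    w1 = np.random.randn(4096)
    w2 = np.random.randn(4096)
    enc1 = ts.ckks_vector(ctx, w1.tolist())
    enc2 = ts.ckks_vector(ctx, w2.tolist())

    # Server side: homomorphic averaging on ciphertexts only.
    enc_avg = (enc1 + enc2) * 0.5

    # Only the key-holding clients can decrypt the merged update.
    avg = np.array(enc_avg.decrypt())
    assert np.allclose(avg, (w1 + w2) / 2, atol=1e-3)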


Subject(s)
Neoplasms , Radiology , Humans , Artificial Intelligence , Learning , Neoplasms/diagnostic imaging , Image Processing, Computer-Assisted
3.
Radiology ; 309(1): e230806, 2023 10.
Article in English | MEDLINE | ID: mdl-37787671

ABSTRACT

Background Clinicians consider both imaging and nonimaging data when diagnosing diseases; however, current machine learning approaches primarily consider data from a single modality. Purpose To develop a neural network architecture capable of integrating multimodal patient data and compare its performance to models incorporating a single modality for diagnosing up to 25 pathologic conditions. Materials and Methods In this retrospective study, imaging and nonimaging patient data were extracted from the Medical Information Mart for Intensive Care (MIMIC) database and an internal database comprising chest radiographs and clinical parameters of inpatients in the intensive care unit (ICU) (January 2008 to December 2020). The MIMIC and internal data sets were each split into training (n = 33 893, n = 28 809), validation (n = 740, n = 7203), and test (n = 1909, n = 9004) sets. A novel transformer-based neural network architecture was trained to diagnose up to 25 conditions using nonimaging data alone, imaging data alone, or multimodal data. Diagnostic performance was assessed using area under the receiver operating characteristic curve (AUC) analysis. Results The MIMIC and internal data sets included 36 542 patients (mean age, 63 years ± 17 [SD]; 20 567 male patients) and 45 016 patients (mean age, 66 years ± 16; 27 577 male patients), respectively. The multimodal model showed improved diagnostic performance for all pathologic conditions. For the MIMIC data set, the mean AUC was 0.77 (95% CI: 0.77, 0.78) when both chest radiographs and clinical parameters were used, compared with 0.70 (95% CI: 0.69, 0.71; P < .001) for only chest radiographs and 0.72 (95% CI: 0.72, 0.73; P < .001) for only clinical parameters. These findings were confirmed on the internal data set. Conclusion A model trained on imaging and nonimaging data outperformed models trained on only one type of data for diagnosing multiple diseases in patients in an ICU setting. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Kitamura and Topol in this issue.
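A hedged sketch of what such a multimodal fusion architecture can look like in PyTorch; the layer sizes, token construction, and names are illustrative assumptions, not the architecture from the paper.

    import torch
    import torch.nn as nn

    class MultimodalTransformer(nn.Module):
        def __init__(self, img_dim=512, n_clinical=40, d_model=256, n_labels=25):
            super().__init__()
            self.img_proj = nn.Linear(img_dim, d_model)   # CNN feature -> token
            self.clin_proj = nn.Linear(1, d_model)        # one token per value
            self.cls = nn.Parameter(torch.zeros(1, 1, d_model))
            layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)
            self.head = nn.Linear(d_model, n_labels)

        def forward(self, img_feat, clinical):
            # img_feat: (B, img_dim); clinical: (B, n_clinical)
            img_tok = self.img_proj(img_feat).unsqueeze(1)     # (B, 1, d)
            clin_tok = self.clin_proj(clinical.unsqueeze(-1))  # (B, n, d)
            cls = self.cls.expand(img_feat.size(0), -1, -1)
            x = torch.cat([cls, img_tok, clin_tok], dim=1)
            return self.head(self.encoder(x)[:, 0])           # (B, n_labels) logits

    model = MultimodalTransformer()
    logits = model(torch.randn(2, 512), torch.randn(2, 40))   # shape (2, 25)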


Subject(s)
Deep Learning , Humans , Male , Middle Aged , Aged , Retrospective Studies , Radiography , Databases, Factual , Inpatients
4.
Sci Rep ; 13(1): 14207, 2023 08 30.
Article in English | MEDLINE | ID: mdl-37648728

ABSTRACT

Accurate and automatic segmentation of fibroglandular tissue in breast MRI screening is essential for the quantification of breast density and background parenchymal enhancement. In this retrospective study, we developed and evaluated a transformer-based neural network for breast segmentation (TraBS) in multi-institutional MRI data and compared its performance to the well-established convolutional neural network nnUNet. TraBS and nnUNet were trained and tested on 200 internal and 40 external breast MRI examinations using manual segmentations generated by experienced human readers. Segmentation performance was assessed in terms of the Dice score and the average symmetric surface distance. The Dice score for nnUNet was lower than for TraBS on the internal test set (0.909 ± 0.069 versus 0.916 ± 0.067, P < 0.001) and on the external test set (0.824 ± 0.144 versus 0.864 ± 0.081, P = 0.004). Moreover, the average symmetric surface distance was higher (i.e., worse) for nnUNet than for TraBS on the internal (0.657 ± 2.856 versus 0.548 ± 2.195, P = 0.001) and on the external test set (0.727 ± 0.620 versus 0.584 ± 0.413, P = 0.03). Our study demonstrates that transformer-based networks improve the quality of fibroglandular tissue segmentation in breast MRI compared to convolutional models such as nnUNet. These findings might help to enhance the accuracy of breast density and parenchymal enhancement quantification in breast MRI screening.
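The two reported metrics can be implemented as follows; this is a simple reference sketch for binary 3D masks (distances here are in voxel units, whereas the study presumably used millimeter spacing), not the paper's evaluation code.

    import numpy as np
    from scipy.ndimage import binary_erosion, distance_transform_edt

    def dice(a, b):
        inter = np.logical_and(a, b).sum()
        return 2.0 * inter / (a.sum() + b.sum())

    def assd(a, b):
        surf_a = a & ~binary_erosion(a)   # boundary voxels of each mask
        surf_b = b & ~binary_erosion(b)
        d_to_b = distance_transform_edt(~surf_b)  # distance to nearest B-surface voxel
        d_to_a = distance_transform_edt(~surf_a)
        return ((d_to_b[surf_a].sum() + d_to_a[surf_b].sum())
                / (surf_a.sum() + surf_b.sum()))

    pred = np.zeros((64, 64, 64), bool); pred[20:40, 20:40, 20:40] = True
    gt = np.zeros_like(pred);            gt[22:42, 20:40, 20:40] = True
    print(dice(pred, gt), assd(pred, gt))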


Subject(s)
Breast Density , Magnetic Resonance Imaging , Humans , Retrospective Studies , Radiography , Electric Power Supplies
5.
Sci Rep ; 13(1): 10666, 2023 07 01.
Article in English | MEDLINE | ID: mdl-37393383

ABSTRACT

When clinicians assess the prognosis of patients in intensive care, they take imaging and non-imaging data into account. In contrast, many traditional machine learning models rely on only one of these modalities, limiting their potential in medical applications. This work proposes and evaluates a transformer-based neural network as a novel AI architecture that integrates multimodal patient data, i.e., imaging data (chest radiographs) and non-imaging data (clinical data). We evaluate the performance of our model in a retrospective study with 6,125 patients in intensive care. We show that the combined model (area under the receiver operating characteristic curve [AUROC] of 0.863) is superior to the radiographs-only model (AUROC = 0.811, p < 0.001) and the clinical data-only model (AUROC = 0.785, p < 0.001) when tasked with predicting in-hospital survival per patient. Furthermore, we demonstrate that our proposed model is robust in cases where not all (clinical) data points are available.


Subject(s)
Critical Care , Diagnostic Imaging , Humans , Retrospective Studies , Area Under Curve , Electric Power Supplies
6.
Sci Rep ; 13(1): 12098, 2023 07 26.
Article in English | MEDLINE | ID: mdl-37495660

ABSTRACT

Although generative adversarial networks (GANs) can produce large datasets, their limited diversity and fidelity have recently been addressed by denoising diffusion probabilistic models (DDPMs), which have demonstrated superiority in natural image synthesis. In this study, we introduce Medfusion, a conditional latent DDPM designed for medical image generation, and evaluate its performance against GANs, which currently represent the state of the art. Medfusion was trained and compared with StyleGAN-3 using fundoscopy images from the AIROGS dataset, radiographs from the CheXpert dataset, and histopathology images from the CRCDX dataset. Based on previous studies, Progressively Growing GAN (ProGAN) and Conditional GAN (cGAN) were used as additional baselines on the CheXpert and CRCDX datasets, respectively. Medfusion exceeded GANs in terms of diversity (recall), achieving better scores of 0.40 compared to 0.19 in the AIROGS dataset, 0.41 compared to 0.02 (cGAN) and 0.24 (StyleGAN-3) in the CRCDX dataset, and 0.32 compared to 0.17 (ProGAN) and 0.08 (StyleGAN-3) in the CheXpert dataset. Furthermore, Medfusion exhibited equal or higher fidelity (precision) across all three datasets. Our study shows that Medfusion constitutes a promising alternative to GAN-based models for generating high-quality medical images, with improved diversity and fewer artifacts in the generated images.
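Recall (diversity) and precision (fidelity) for generative models are commonly estimated with k-NN feature manifolds; a sketch in that spirit follows, with random placeholder embeddings and an assumed k, not the paper's exact metric implementation.

    import numpy as np
    from scipy.spatial.distance import cdist

    def knn_radii(feats, k=3):
        d = cdist(feats, feats)
        np.fill_diagonal(d, np.inf)
        return np.sort(d, axis=1)[:, k - 1]  # distance to k-th neighbour

    def coverage(query, support, radii):
        # fraction of query points inside any support point's k-NN ball
        d = cdist(query, support)
        return float(np.mean((d <= radii[None, :]).any(axis=1)))

    real = np.random.randn(500, 64)   # placeholder feature embeddings
    fake = np.random.randn(500, 64)
    precision = coverage(fake, real, knn_radii(real))  # fidelity
    recall = coverage(real, fake, knn_radii(fake))     # diversity
    print(precision, recall)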


Subject(s)
Artifacts , Mental Recall , Diffusion , Models, Statistical , Ophthalmoscopy , Image Processing, Computer-Assisted
7.
Sci Rep ; 13(1): 7303, 2023 05 05.
Article in English | MEDLINE | ID: mdl-37147413

ABSTRACT

Recent advances in computer vision have shown promising results in image generation. Diffusion probabilistic models have generated realistic images from textual input, as demonstrated by DALL-E 2, Imagen, and Stable Diffusion. However, their use in medicine, where imaging data typically comprises three-dimensional volumes, has not been systematically evaluated. Synthetic images may play a crucial role in privacy-preserving artificial intelligence and can also be used to augment small datasets. We show that diffusion probabilistic models can synthesize high-quality medical data for magnetic resonance imaging (MRI) and computed tomography (CT). For quantitative evaluation, two radiologists rated the quality of the synthesized images regarding "realistic image appearance", "anatomical correctness", and "consistency between slices". Furthermore, we demonstrate that synthetic images can be used in self-supervised pre-training and improve the performance of breast segmentation models when data is scarce (Dice scores, 0.91 [without synthetic data], 0.95 [with synthetic data]).
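The training objective behind such models is compact; below is a conceptual sketch of the standard DDPM noise-prediction loss, with the 3D denoising U-Net stubbed out (the noise schedule and shapes are illustrative assumptions).

    import torch
    import torch.nn.functional as F

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)
    alphas_cum = torch.cumprod(1.0 - betas, dim=0)

    def ddpm_loss(denoiser, x0):
        t = torch.randint(0, T, (x0.size(0),))
        eps = torch.randn_like(x0)
        a = alphas_cum[t].view(-1, *([1] * (x0.dim() - 1)))
        x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps  # forward noising q(x_t | x_0)
        return F.mse_loss(denoiser(x_t, t), eps)    # predict the added noise

    denoiser = lambda x, t: torch.zeros_like(x)     # stand-in for a 3D U-Net
    loss = ddpm_loss(denoiser, torch.randn(2, 1, 32, 32, 32))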


Subject(s)
Artificial Intelligence , Imaging, Three-Dimensional , Magnetic Resonance Imaging , Tomography, X-Ray Computed , Models, Statistical , Image Processing, Computer-Assisted/methods
8.
Sci Rep ; 13(1): 6046, 2023 04 13.
Article in English | MEDLINE | ID: mdl-37055456

ABSTRACT

Due to the rapid advancements in recent years, medical image analysis is largely dominated by deep learning (DL). However, building powerful and robust DL models requires training with large multi-party datasets. While multiple stakeholders have provided publicly available datasets, the ways in which these data are labeled vary widely. For instance, an institution might provide a dataset of chest radiographs containing labels denoting the presence of pneumonia, while another institution might focus on determining the presence of metastases in the lung. Training a single AI model utilizing all these data is not feasible with conventional federated learning (FL). We therefore propose an extension to the widespread FL process, namely flexible federated learning (FFL), for collaborative training on such data. Using 695,000 chest radiographs from five institutions across the globe, each with differing labels, we demonstrate that, for such heterogeneously labeled datasets, FFL-based training leads to a significant performance increase compared with conventional FL training, in which only the uniformly annotated images are utilized. We believe that our proposed algorithm could accelerate the process of bringing collaborative training methods from the research and simulation phase to real-world applications in healthcare.
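As we read it, the core mechanism is to mask the multi-label loss for labels a client does not annotate while federating the shared weights as usual; the following is a hypothetical sketch of that idea, not the published FFL algorithm.

    import torch
    import torch.nn.functional as F

    def masked_bce(logits, targets, label_mask):
        # label_mask: (n_labels,) tensor, 1 where this client annotates the label
        loss = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        return (loss * label_mask).sum() / label_mask.sum().clamp(min=1)

    def fedavg(state_dicts, weights):
        # plain FedAvg over client state dicts; weights sum to 1
        return {k: sum(w * sd[k] for sd, w in zip(state_dicts, weights))
                for k in state_dicts[0]}

    # Example: a client that only labels 3 of 5 conditions.
    logits = torch.randn(8, 5)
    targets = torch.randint(0, 2, (8, 5)).float()
    mask = torch.tensor([1., 1., 1., 0., 0.])
    loss = masked_bce(logits, targets, mask)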


Subject(s)
Algorithms , Artificial Intelligence , Computer Simulation , Health Facilities , Thorax
9.
Radiology ; 307(3): e222211, 2023 05.
Article in English | MEDLINE | ID: mdl-36943080

ABSTRACT

Background Reducing the amount of contrast agent needed for contrast-enhanced breast MRI is desirable. Purpose To investigate whether generative adversarial networks (GANs) can recover contrast-enhanced breast MRI scans from unenhanced images and virtual low-contrast-enhanced images. Materials and Methods In this retrospective study of breast MRI performed from January 2010 to December 2019, simulated low-contrast images were produced by adding virtual noise to the existing contrast-enhanced images. GANs were then trained to recover the contrast-enhanced images from the simulated low-contrast images (approach A) or from the unenhanced T1- and T2-weighted images (approach B). Two experienced radiologists were tasked with distinguishing between real and synthesized contrast-enhanced images using both approaches. Image appearance and conspicuity of enhancing lesions on the real versus synthesized contrast-enhanced images were independently compared and rated on a five-point Likert scale. P values were calculated by using bootstrapping. Results A total of 9751 breast MRI examinations from 5086 patients (mean age, 56 years ± 10 [SD]) were included. Readers who were blinded to the nature of the images could not distinguish real from synthetic contrast-enhanced images (average accuracy of differentiation: approach A, 52 of 100; approach B, 61 of 100). The test set included images with and without enhancing lesions (29 enhancing masses and 21 nonmass enhancement; 50 total). When readers who were not blinded compared the appearance of the real versus synthetic contrast-enhanced images side by side, approach A image ratings were significantly higher than those of approach B (mean rating, 4.6 ± 0.1 vs 3.0 ± 0.2; P < .001), with the noninferiority margin met by synthetic images from approach A (P < .001) but not B (P > .99). Conclusion Generative adversarial networks may be useful to enable breast MRI with reduced contrast agent dose. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Bahl in this issue.
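For orientation, a pix2pix-style generator objective (adversarial term plus L1 reconstruction) is one plausible formulation of such image recovery; the sketch below is an assumption with stand-in networks, not the authors' exact loss.

    import torch
    import torch.nn.functional as F

    def generator_loss(G, D, x_low, y_full, l1_weight=100.0):
        y_hat = G(x_low)           # synthesized contrast-enhanced image
        logits = D(x_low, y_hat)   # conditional discriminator on (input, output)
        adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
        return adv + l1_weight * F.l1_loss(y_hat, y_full)

    G = torch.nn.Conv2d(1, 1, 3, padding=1)  # stand-in for a U-Net generator
    D = torch.nn.Conv2d(2, 1, 4, stride=2)   # stand-in patch discriminator
    loss = generator_loss(G, lambda x, y: D(torch.cat([x, y], 1)),
                          torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64))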


Subject(s)
Contrast Media , Magnetic Resonance Imaging , Humans , Middle Aged , Retrospective Studies , Magnetic Resonance Imaging/methods , Breast , Machine Learning
10.
Radiology ; 307(1): e220510, 2023 04.
Article in English | MEDLINE | ID: mdl-36472534

ABSTRACT

Background Supine chest radiography for bedridden patients in intensive care units (ICUs) is one of the most frequently ordered imaging studies worldwide. Purpose To evaluate the diagnostic performance of a neural network-based model that is trained on structured semiquantitative radiologic reports of bedside chest radiographs. Materials and Methods For this retrospective single-center study, bedside chest radiographs of children and adults in the ICU of a university hospital (January 2009 to December 2020) were reported using a structured and itemized template. Ninety-eight radiologists rated the radiographs semiquantitatively for the severity of disease patterns. These data were used to train a neural network to identify cardiomegaly, pulmonary congestion, pleural effusion, pulmonary opacities, and atelectasis. A held-out internal test set (100 radiographs from 100 patients) that was assessed independently by an expert panel of six radiologists provided the ground truth. Individual assessments by each of these six radiologists, by two nonradiologist physicians in the ICU, and by the neural network were compared with the ground truth. Separately, the nonradiologist physicians assessed the images without and with preliminary readings provided by the neural network. The weighted Cohen κ coefficient was used to measure agreement between the readers and the ground truth. Results A total of 193 566 radiographs in 45 016 patients (mean age, 66 years ± 16 [SD]; 61% men) were included and divided into training (n = 122 294; 64%), validation (n = 31 243; 16%), and test (n = 40 029; 20%) sets. The neural network agreed more closely with the majority vote of the expert panel (κ = 0.86) than any individual radiologist did (κ = 0.81 to 0.84). When the neural network provided preliminary readings, the reports of the nonradiologist physicians improved considerably (aided vs unaided, κ = 0.87 vs 0.79, respectively; P < .001). Conclusion A neural network trained with structured semiquantitative bedside chest radiography reports enabled nonradiologist physicians to provide improved interpretations, as measured against the consensus reading of expert radiologists. © RSNA, 2022 Supplemental material is available for this article. See also the editorial by Wielpütz in this issue.
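The agreement statistic is straightforward to compute; a sketch follows using scikit-learn with quadratic weights (an assumption, since the abstract does not state the weighting scheme) and placeholder ordinal severity grades.

    from sklearn.metrics import cohen_kappa_score

    panel = [0, 2, 4, 1, 3, 2, 0, 4]   # placeholder majority-vote grades (0-4)
    reader = [0, 2, 3, 1, 3, 2, 1, 4]  # placeholder individual reader grades
    kappa = cohen_kappa_score(panel, reader, weights="quadratic")
    print(f"weighted kappa = {kappa:.2f}")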


Subject(s)
Artificial Intelligence , Radiography, Thoracic , Male , Adult , Child , Humans , Aged , Female , Retrospective Studies , Radiography, Thoracic/methods , Lung , Radiography
12.
Med Image Anal ; 79: 102474, 2022 07.
Article in English | MEDLINE | ID: mdl-35588568

ABSTRACT

Artificial intelligence (AI) can extract visual information from histopathological slides and yield biological insight and clinical biomarkers. Whole slide images are cut into thousands of tiles and classification problems are often weakly-supervised: the ground truth is only known for the slide, not for every single tile. In classical weakly-supervised analysis pipelines, all tiles inherit the slide label while in multiple-instance learning (MIL), only bags of tiles inherit the label. However, it is still unclear how these widely used but markedly different approaches perform relative to each other. We implemented and systematically compared six methods in six clinically relevant end-to-end prediction tasks using data from N=2980 patients for training with rigorous external validation. We tested three classical weakly-supervised approaches with convolutional neural networks and vision transformers (ViT) and three MIL-based approaches with and without an additional attention module. Our results empirically demonstrate that histological tumor subtyping of renal cell carcinoma is an easy task in which all approaches achieve an area under the receiver operating characteristic curve (AUROC) of above 0.9. In contrast, we report significant performance differences for the clinically relevant tasks of mutation prediction in colorectal, gastric, and bladder cancer. In these tasks, classical weakly-supervised workflows outperformed MIL-based methods, which is surprising given their simplicity. This shows that new end-to-end image analysis pipelines in computational pathology should be compared to classical weakly-supervised methods. Also, these findings motivate the development of new methods which combine the elegant assumptions of MIL with the empirically observed higher performance of classical weakly-supervised approaches. We make all source code publicly available at https://github.com/KatherLab/HIA, allowing easy application of all methods to any similar task.
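The attention-based MIL variant referenced here typically pools tile embeddings with a learned attention weight per tile; a sketch in the style of Ilse et al. follows, with illustrative dimensions rather than the paper's configuration.

    import torch
    import torch.nn as nn

    class AttentionMIL(nn.Module):
        def __init__(self, in_dim=512, hid=128, n_classes=2):
            super().__init__()
            self.att = nn.Sequential(nn.Linear(in_dim, hid), nn.Tanh(),
                                     nn.Linear(hid, 1))
            self.head = nn.Linear(in_dim, n_classes)

        def forward(self, tiles):                      # tiles: (n_tiles, in_dim)
            a = torch.softmax(self.att(tiles), dim=0)  # attention weight per tile
            slide_emb = (a * tiles).sum(dim=0)         # weighted bag embedding
            return self.head(slide_emb)                # slide-level logits

    logits = AttentionMIL()(torch.randn(1000, 512))    # one bag of 1000 tiles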


Subject(s)
Deep Learning , Artificial Intelligence , Benchmarking , Humans , Neural Networks, Computer , Supervised Machine Learning
13.
Nat Med ; 28(6): 1232-1239, 2022 06.
Article in English | MEDLINE | ID: mdl-35469069

ABSTRACT

Artificial intelligence (AI) can predict the presence of molecular alterations directly from routine histopathology slides. However, training robust AI systems requires large datasets for which data collection faces practical, ethical and legal obstacles. These obstacles could be overcome with swarm learning (SL), in which partners jointly train AI models while avoiding data transfer and monopolistic data governance. Here, we demonstrate the successful use of SL in large, multicentric datasets of gigapixel histopathology images from over 5,000 patients. We show that AI models trained using SL can predict BRAF mutational status and microsatellite instability directly from hematoxylin and eosin (H&E)-stained pathology slides of colorectal cancer. We trained AI models on three patient cohorts from Northern Ireland, Germany and the United States, and validated the prediction performance in two independent datasets from the United Kingdom. Our data show that SL-trained AI models outperform most locally trained models, and perform on par with models that are trained on the merged datasets. In addition, we show that SL-based AI models are data efficient. In the future, SL can be used to train distributed AI models for any histopathology image analysis task, eliminating the need for data transfer.


Subject(s)
Artificial Intelligence , Neoplasms , Humans , Image Processing, Computer-Assisted , Neoplasms/genetics , Staining and Labeling , United Kingdom
14.
Diagnostics (Basel) ; 12(3)2022 Mar 11.
Article in English | MEDLINE | ID: mdl-35328240

ABSTRACT

For T2 mapping, the underlying mono-exponential signal decay is traditionally quantified by non-linear Least-Squares Estimation (LSE) curve fitting, which is prone to outliers and computationally expensive. This study aimed to validate a fully connected neural network (NN) to estimate T2 relaxation times and to assess its performance versus LSE fitting methods. To this end, the NN was trained and tested in silico on a synthetic dataset of 75 million signal decays. Its quantification error was comparatively evaluated against three LSE methods, i.e., the traditional method without modification, a variant with an offset, and a variant with noise correction. Following in situ acquisition of T2 maps in seven human cadaveric knee joint specimens at high and low signal-to-noise ratios, the NN and LSE methods were used to estimate the T2 relaxation times of the manually segmented patellofemoral cartilage. In silico modeling at low signal-to-noise ratio indicated significantly lower quantification error for the NN (by medians of 6-33%) than for the LSE methods (p < 0.001). These results were confirmed by the in situ measurements (medians of 10-35%). T2 quantification by the NN took only 4 s, which was faster than the LSE methods (28-43 s). In conclusion, NNs provide fast, accurate, and robust quantification of T2 relaxation times.
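The LSE baseline is a standard non-linear fit of the mono-exponential decay S(TE) = S0 * exp(-TE / T2); a sketch with SciPy follows, including the offset variant the study mentions (echo times and noise level are illustrative).

    import numpy as np
    from scipy.optimize import curve_fit

    def mono_exp(te, s0, t2):
        return s0 * np.exp(-te / t2)

    def mono_exp_offset(te, s0, t2, c):  # offset variant evaluated in the study
        return s0 * np.exp(-te / t2) + c

    te = np.arange(10, 110, 10, dtype=float)  # echo times in ms (illustrative)
    signal = mono_exp(te, 1000.0, 45.0) + np.random.normal(0, 10, te.size)

    (p_s0, p_t2), _ = curve_fit(mono_exp, te, signal, p0=(signal[0], 50.0))
    print(f"estimated T2 = {p_t2:.1f} ms")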

15.
Diagnostics (Basel) ; 12(2)2022 Jan 19.
Article in English | MEDLINE | ID: mdl-35204338

ABSTRACT

Machine learning results based on radiomic analysis are often not transferable. A potential reason for this is the variability of radiomic features due to varying human-made segmentations. Therefore, the aim of this study was to provide a comprehensive inter-reader reliability analysis of radiomic features in five clinical image datasets and to assess the association between inter-reader reliability and survival prediction. In this study, we analyzed 4598 tumor segmentations in both computed tomography and magnetic resonance imaging data. We used a neural network to generate 100 additional segmentation outlines for each tumor and performed a reliability analysis of radiomic features. To prove clinical utility, we predicted patient survival based on all features and on the most reliable features. Survival prediction models for both computed tomography and magnetic resonance imaging datasets demonstrated less statistical spread and superior survival prediction when based on the most reliable features. Mean concordance indices were Cmean = 0.58 [most reliable] vs. Cmean = 0.56 [all] (p < 0.001, CT) and Cmean = 0.58 vs. Cmean = 0.57 (p = 0.23, MRI). Thus, a preceding reliability analysis and selection of the most reliable radiomic features improve the underlying model's ability to predict patient survival across clinical imaging modalities and tumor entities.
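The concordance index used to score the survival models can be computed as follows; the lifelines utility is our tooling choice (not necessarily the authors'), and all values are placeholders.

    import numpy as np
    from lifelines.utils import concordance_index

    survival_months = np.array([12, 30, 7, 24, 48])   # placeholder follow-up
    event_observed = np.array([1, 0, 1, 1, 0])        # 1 = event observed
    risk_score = np.array([0.8, 0.2, 0.9, 0.5, 0.1])  # model output

    # Higher risk should pair with shorter survival, so pass -risk as the prediction.
    c = concordance_index(survival_months, -risk_score, event_observed)
    print(f"C-index = {c:.2f}")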

16.
IEEE Trans Med Imaging ; 40(12): 3543-3554, 2021 12.
Article in English | MEDLINE | ID: mdl-34138702

ABSTRACT

The emergence of deep learning has considerably advanced the state-of-the-art in cardiac magnetic resonance (CMR) segmentation. Many techniques have been proposed over the last few years, bringing the accuracy of automated segmentation close to human performance. However, these models have all too often been trained and validated using cardiac imaging samples from single clinical centres or homogeneous imaging protocols. This has prevented the development and validation of models that are generalizable across different clinical centres, imaging conditions or scanner vendors. To promote further research and scientific benchmarking in the field of generalizable deep learning for cardiac segmentation, this paper presents the results of the Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation (M&Ms) Challenge, which was recently organized as part of the MICCAI 2020 Conference. A total of 14 teams submitted different solutions to the problem, combining various baseline models, data augmentation strategies, and domain adaptation techniques. The obtained results indicate the importance of intensity-driven data augmentation, as well as the need for further research to improve generalizability towards unseen scanner vendors or new imaging protocols. Furthermore, we present a new resource of 375 heterogeneous CMR datasets acquired by using four different scanner vendors in six hospitals and three different countries (Spain, Canada and Germany), which we provide as open access to the community to enable future research in the field.
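A sketch of simple intensity-driven augmentations of the kind the challenge found important for cross-vendor generalization (random gamma, brightness, contrast); the parameter ranges are illustrative assumptions, not the winning teams' settings.

    import numpy as np

    def intensity_augment(img, rng):
        img = img.astype(np.float32)
        lo, hi = img.min(), img.max()
        x = (img - lo) / (hi - lo + 1e-8)            # normalize to [0, 1]
        x = x ** rng.uniform(0.7, 1.5)               # random gamma
        x = x * rng.uniform(0.9, 1.1)                # random brightness scale
        x = (x - 0.5) * rng.uniform(0.9, 1.1) + 0.5  # random contrast
        return np.clip(x, 0.0, 1.0) * (hi - lo) + lo

    rng = np.random.default_rng(42)
    aug = intensity_augment(np.random.rand(224, 224), rng)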


Subject(s)
Heart , Magnetic Resonance Imaging , Cardiac Imaging Techniques , Heart/diagnostic imaging , Humans