1.
Commun Med (Lond); 4(1): 133, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38971887

ABSTRACT

BACKGROUND: Advances in self-supervised learning (SSL) have enabled state-of-the-art automated medical image diagnosis from small, labeled datasets. This label efficiency is often desirable, given the difficulty of obtaining expert labels for medical image recognition tasks. However, most efforts toward SSL in medical imaging are not adapted to video-based modalities, such as echocardiography. METHODS: We developed a self-supervised contrastive learning approach, EchoCLR, for echocardiogram videos with the goal of learning strong representations for efficient fine-tuning on downstream cardiac disease diagnosis. EchoCLR pretraining involves (i) contrastive learning, where the model is trained to identify distinct videos of the same patient, and (ii) frame reordering, where the model is trained to predict the correct order of video frames after they have been randomly shuffled. RESULTS: When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improves classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS) over other transfer learning and SSL approaches across internal and external test sets. When fine-tuning on 10% of available training data (519 studies), an EchoCLR-pretrained model achieves 0.72 AUROC (95% CI: [0.69, 0.75]) on LVH classification, compared to 0.61 AUROC (95% CI: [0.57, 0.64]) with a standard transfer learning approach. Similarly, using 1% of available training data (53 studies), EchoCLR pretraining achieves 0.82 AUROC (95% CI: [0.79, 0.84]) on severe AS classification, compared to 0.61 AUROC (95% CI: [0.58, 0.65]) with transfer learning. CONCLUSIONS: EchoCLR is unique in its ability to learn representations of echocardiogram videos and demonstrates that SSL can enable label-efficient disease classification from small amounts of labeled data.
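
As a concrete illustration of the two pretext tasks described above, the following is a minimal PyTorch sketch, not the authors' released code: an InfoNCE-style contrastive loss over same-patient video pairs, plus a frame-reordering head treated as classification over a fixed set of permutations. The encoder, batch construction, and permutation count are assumptions.

```python
# Hypothetical sketch of EchoCLR-style pretraining objectives (not the released code).
import torch
import torch.nn.functional as F

def contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE over a batch: z_a[i] and z_b[i] embed two distinct videos of
    the same patient (positives); all other rows serve as negatives."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature         # (B, B) similarity matrix
    targets = torch.arange(z_a.size(0))          # diagonal = matching patient
    return F.cross_entropy(logits, targets)

def reorder_loss(order_logits, true_perm):
    """Frame-reordering pretext task: predict which of K permutations was
    applied to the shuffled frames (a K-way classification)."""
    return F.cross_entropy(order_logits, true_perm)

B, D, K = 8, 128, 24                             # batch, embed dim, permutations (assumed)
z_a, z_b = torch.randn(B, D), torch.randn(B, D)  # stand-ins for video-encoder outputs
total = contrastive_loss(z_a, z_b) + reorder_loss(torch.randn(B, K), torch.randint(0, K, (B,)))
print(total.item())
```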


Artificial intelligence (AI) has been used to develop software that can automatically diagnose diseases from medical images. However, these AI models require thousands or millions of examples to properly learn from, which can be very expensive, as diagnosis is often time-consuming and requires clinical expertise. Using a technique called self-supervised learning (SSL), we develop an AI method to effectively diagnose heart disease from as few as 50 instances. Our method, EchoCLR, is designed for echocardiography, a key imaging technique to monitor heart health, and outperforms other methods on disease diagnosis from small amounts of data. This method can advance AI for echocardiography and enable researchers with limited resources to create disease diagnosis models from small medical imaging datasets.

2.
Med Image Anal; 97: 103224, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38850624

ABSTRACT

Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
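
A common baseline for the label imbalance this challenge targets, offered here as a hedged sketch rather than any CXR-LT solution, is to reweight the multi-label binary cross-entropy by the "effective number of samples" per class, so rare "tail" findings contribute more per positive example:

```python
# Class-balanced multi-label BCE via effective number of samples (Cui et al., 2019).
# An illustrative baseline only; the label counts below are made up.
import numpy as np
import torch
import torch.nn.functional as F

def effective_number_weights(class_counts, beta=0.9999):
    counts = np.asarray(class_counts, dtype=np.float64)
    eff_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    w = 1.0 / eff_num
    return torch.tensor(w / w.sum() * len(w), dtype=torch.float32)

def balanced_bce(logits, targets, weights):
    # weights: (num_classes,) up-weight positives of rare classes
    return F.binary_cross_entropy_with_logits(logits, targets, pos_weight=weights)

counts = [53000, 12000, 800, 42, 7]              # head-to-tail label frequencies (assumed)
w = effective_number_weights(counts)
logits, y = torch.randn(4, 5), torch.randint(0, 2, (4, 5)).float()
print(balanced_bce(logits, y, w).item())
```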

3.
JAMA Cardiol; 9(6): 534-544, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38581644

ABSTRACT

Importance: Aortic stenosis (AS) is a major public health challenge with a growing therapeutic landscape, but current biomarkers do not inform personalized screening and follow-up. A video-based artificial intelligence (AI) biomarker (Digital AS Severity index [DASSi]) can detect severe AS using single-view long-axis echocardiography without Doppler characterization. Objective: To deploy DASSi to patients with no AS or with mild or moderate AS at baseline to identify AS development and progression. Design, Setting, and Participants: This is a cohort study that examined 2 cohorts of patients without severe AS undergoing echocardiography in the Yale New Haven Health System (YNHHS; 2015-2021) and Cedars-Sinai Medical Center (CSMC; 2018-2019). A novel computational pipeline for the cross-modal translation of DASSi into cardiac magnetic resonance (CMR) imaging was further developed in the UK Biobank. Analyses were performed between August 2023 and February 2024. Exposure: DASSi (range, 0-1) derived from AI applied to echocardiography and CMR videos. Main Outcomes and Measures: Annualized change in peak aortic valve velocity (AV-Vmax) and late (>6 months) aortic valve replacement (AVR). Results: A total of 12 599 participants were included in the echocardiographic study (YNHHS: n = 8798; median [IQR] age, 71 [60-80] years; 4250 [48.3%] women; median [IQR] follow-up, 4.1 [2.4-5.4] years; and CSMC: n = 3801; median [IQR] age, 67 [54-78] years; 1685 [44.3%] women; median [IQR] follow-up, 3.4 [2.8-3.9] years). Higher baseline DASSi was associated with faster progression in AV-Vmax (per 0.1 DASSi increment: YNHHS, 0.033 m/s per year [95% CI, 0.028-0.038] among 5483 participants; CSMC, 0.082 m/s per year [95% CI, 0.053-0.111] among 1292 participants), with values of 0.2 or greater associated with a 4- to 5-fold higher AVR risk than values less than 0.2 (YNHHS: 715 events; adjusted hazard ratio [HR], 4.97 [95% CI, 2.71-5.82]; CSMC: 56 events; adjusted HR, 4.04 [95% CI, 0.92-17.70]), independent of age, sex, race, ethnicity, ejection fraction, and AV-Vmax. This was reproduced across 45 474 participants (median [IQR] age, 65 [59-71] years; 23 559 [51.8%] women; median [IQR] follow-up, 2.5 [1.6-3.9] years) undergoing CMR imaging in the UK Biobank (for participants with DASSi ≥0.2 vs those with DASSi <0.2, adjusted HR, 11.38 [95% CI, 2.56-50.57]). Saliency maps and phenome-wide association studies supported associations with cardiac structure and function and traditional cardiovascular risk factors. Conclusions and Relevance: In this cohort study of patients without severe AS undergoing echocardiography or CMR imaging, a new AI-based video biomarker was independently associated with AS development and progression, enabling opportunistic risk stratification across cardiovascular imaging modalities as well as potential application on handheld devices.
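
The survival analysis reported above can be approximated with off-the-shelf tooling; below is a minimal sketch on synthetic data (all column names and values are assumptions) of fitting a Cox model for late AVR with the DASSi ≥0.2 threshold as the exposure, adjusted for baseline covariates. It requires lifelines (`pip install lifelines`).

```python
# Synthetic-data sketch of a Cox analysis for late AVR risk by DASSi threshold.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "dassi_ge_02": rng.integers(0, 2, n),        # exposure: DASSi >= 0.2 at baseline
    "age": rng.normal(70, 10, n),
    "female": rng.integers(0, 2, n),
    "ejection_fraction": rng.normal(60, 8, n),
    "av_vmax": rng.normal(2.0, 0.5, n),
    "follow_up_years": rng.exponential(4.0, n),
    "avr_event": rng.integers(0, 2, n),
})
cph = CoxPHFitter()
cph.fit(df, duration_col="follow_up_years", event_col="avr_event")
print(cph.hazard_ratios_["dassi_ge_02"])         # adjusted HR for the threshold
```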


Subject(s)
Aortic Valve Stenosis; Artificial Intelligence; Disease Progression; Echocardiography; Severity of Illness Index; Humans; Aortic Valve Stenosis/diagnostic imaging; Aortic Valve Stenosis/surgery; Aortic Valve Stenosis/physiopathology; Female; Male; Aged; Echocardiography/methods; Middle Aged; Biomarkers; Aged, 80 and over; Cohort Studies; Video Recording; Multimodal Imaging/methods; Magnetic Resonance Imaging/methods
4.
medRxiv; 2024 Jun 29.
Article in English | MEDLINE | ID: mdl-38559021

ABSTRACT

BACKGROUND: Point-of-care ultrasonography (POCUS) enables cardiac imaging at the bedside and in communities but is limited by abbreviated protocols and variation in quality. We developed and tested artificial intelligence (AI) models to automate the detection of under-diagnosed cardiomyopathies from cardiac POCUS. METHODS: In a development set of 290,245 transthoracic echocardiographic videos across the Yale-New Haven Health System (YNHHS), we used augmentation approaches and a customized loss function weighted for view quality to derive a POCUS-adapted, multi-label, video-based convolutional neural network (CNN) that discriminates HCM (hypertrophic cardiomyopathy) and ATTR-CM (transthyretin amyloid cardiomyopathy) from controls without known disease. We evaluated the final model across independent internal and external retrospective cohorts of individuals who underwent cardiac POCUS across YNHHS and Mount Sinai Health System (MSHS) emergency departments (EDs) (2011-2024) to prioritize key views and validate the diagnostic and prognostic performance of single-view screening protocols. FINDINGS: We identified 33,127 patients (median age 61 [IQR: 45-75] years, n=17,276 [52.2%] female) at YNHHS and 5,624 (57 [IQR: 39-71] years, n=1,953 [34.7%] female) at MSHS with 78,054 and 13,796 eligible cardiac POCUS videos, respectively. An AI-enabled single-view screening approach successfully discriminated HCM (AUROC of 0.90 [YNHHS] & 0.89 [MSHS]) and ATTR-CM (AUROC of 0.92 [YNHHS] & 0.99 [MSHS]). In YNHHS, 40 (58.0%) HCM and 23 (47.9%) ATTR-CM cases had a positive screen at a median of 2.1 [IQR: 0.9-4.5] and 1.9 [IQR: 1.0-3.4] years before clinical diagnosis. Moreover, among 24,448 participants without known cardiomyopathy followed for a median of 2.2 [IQR: 1.1-5.8] years, AI-POCUS probabilities in the highest (vs lowest) quintile for HCM and ATTR-CM conferred a 15% (adj.HR 1.15 [95%CI: 1.02-1.29]) and 39% (adj.HR 1.39 [95%CI: 1.22-1.59]) higher age- and sex-adjusted mortality risk, respectively. INTERPRETATION: We developed and validated an AI framework that enables scalable, opportunistic screening of treatable cardiomyopathies wherever POCUS is used.
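
The "customized loss function weighted for view quality" suggests down-weighting low-quality clips during training. A minimal sketch of one way to do that follows; the exact weighting scheme is an assumption, not the paper's formulation.

```python
# View-quality-weighted multi-label loss: lower-quality POCUS clips contribute
# proportionally less to the gradient. The weighting scheme is an assumption.
import torch
import torch.nn.functional as F

def quality_weighted_bce(logits, targets, view_quality):
    """logits/targets: (B, num_labels); view_quality: (B,) scores in [0, 1]."""
    per_clip = F.binary_cross_entropy_with_logits(
        logits, targets, reduction="none"
    ).mean(dim=1)                                # (B,) loss per clip
    return (view_quality * per_clip).sum() / view_quality.sum()

logits = torch.randn(4, 2)                       # e.g. [HCM, ATTR-CM] label heads
targets = torch.randint(0, 2, (4, 2)).float()
quality = torch.tensor([1.0, 0.9, 0.4, 0.7])     # per-clip view-quality scores
print(quality_weighted_bce(logits, targets, quality).item())
```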

5.
Sci Rep; 14(1): 8372, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38600311

ABSTRACT

Rib fractures are highly predictive of non-accidental trauma in children under 3 years old. Rib fracture detection in pediatric radiographs is challenging because fractures can be obliquely oriented to the imaging detector, obfuscated by other structures, incomplete, and non-displaced. Prior studies have shown up to two-thirds of rib fractures may be missed during initial interpretation. In this paper, we implemented methods for improving the sensitivity (i.e., recall) performance for detecting and localizing rib fractures in pediatric chest radiographs to help augment the performance of radiology interpretation. These methods adapted two convolutional neural network (CNN) architectures, RetinaNet and YOLOv5, and our previously proposed decision scheme, "avalanche decision", that dynamically reduces the acceptance threshold for proposed regions in each image. Additionally, we present contributions of using multiple image pre-processing and model ensembling techniques. Using a custom dataset of 1109 pediatric chest radiographs manually labeled by seven pediatric radiologists, we performed 10-fold cross-validation and reported detection performance using several metrics, including the F2 score, which summarizes precision and recall for high-sensitivity tasks. Our best performing model used three ensembled YOLOv5 models with varied input processing and an avalanche decision scheme, achieving an F2 score of 0.725 ± 0.012. Expert inter-reader performance yielded an F2 score of 0.732. Results demonstrate that our combination of sensitivity-driving methods provides object detector performance approaching the capabilities of expert human readers, suggesting that these methods may provide a viable approach to identifying all rib fractures.
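
The "avalanche decision" scheme can be sketched as follows: each accepted detection lowers the acceptance threshold for the remaining candidates, reflecting the tendency of rib fractures to co-occur. The initial threshold and decay factor below are illustrative assumptions, not the paper's tuned values.

```python
# Sketch of an avalanche decision rule over detector confidences for one image.
def avalanche_decision(scores, t0=0.5, decay=0.85):
    """scores: candidate-box confidences; returns indices (into the sorted
    list) of accepted candidates."""
    accepted, threshold = [], t0
    for i, s in enumerate(sorted(scores, reverse=True)):
        if s >= threshold:
            accepted.append(i)
            threshold *= decay                   # each acceptance relaxes the bar
        else:
            break                                # sorted, so nothing later passes
    return accepted

print(avalanche_decision([0.9, 0.6, 0.45, 0.40, 0.2]))  # -> [0, 1, 2, 3]
```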


Subject(s)
Radiology; Rib Fractures; Humans; Child; Child, Preschool; Rib Fractures/diagnostic imaging; Rib Fractures/etiology; Radiography; Neural Networks, Computer; Radiologists; Retrospective Studies; Sensitivity and Specificity
6.
J Am Med Inform Assoc; 31(4): 855-865, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38269618

ABSTRACT

OBJECTIVE: Artificial intelligence (AI) detects heart disease from images of electrocardiograms (ECGs). However, traditional supervised learning is limited by the need for large amounts of labeled data. We report the development of Biometric Contrastive Learning (BCL), a self-supervised pretraining approach for label-efficient deep learning on ECG images. MATERIALS AND METHODS: Using pairs of ECGs from 78 288 individuals from Yale (2000-2015), we trained a convolutional neural network to identify temporally separated pairs of ECGs from the same patient that varied in layout. We fine-tuned BCL-pretrained models to detect atrial fibrillation (AF), gender, and LVEF < 40%, using ECGs from 2015 to 2021. We externally tested the models in cohorts from Germany and the United States. We compared BCL with ImageNet initialization and general-purpose self-supervised contrastive learning for images (simCLR). RESULTS: With 100% of labeled training data, BCL performed similarly to the other approaches for detecting AF/Gender/LVEF < 40%, with an AUROC of 0.98/0.90/0.90 in the held-out test sets, but it consistently outperformed the other methods at smaller proportions of labeled data, reaching equivalent performance with only 50% of the data. With 0.1% of the data, BCL achieved AUROC of 0.88/0.79/0.75, compared with 0.51/0.52/0.60 (ImageNet) and 0.61/0.53/0.49 (simCLR). In external validation, BCL outperformed other methods even at 100% labeled training data, with an AUROC of 0.88/0.88 for Gender and LVEF < 40% compared with 0.83/0.83 (ImageNet) and 0.84/0.83 (simCLR). DISCUSSION AND CONCLUSION: A pretraining strategy that leverages biometric signatures of different ECGs from the same patient enhances the efficiency of developing AI models for ECG images. This represents a major advance in detecting disorders from ECG images with limited labeled data.
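
The heart of BCL is how positive pairs are constructed. A hedged sketch of that step (record fields and the minimum time gap are assumptions): positives are temporally separated ECG images from the same patient, and other patients in the batch act as negatives under a standard contrastive loss.

```python
# Sketch of BCL-style positive-pair construction from an ECG image catalog.
import itertools
from datetime import date

def make_bcl_pairs(records, min_gap_days=30):
    """records: dicts with 'patient_id', 'date', 'image'. Yields (anchor,
    positive) image pairs drawn from the same patient, separated in time."""
    by_patient = {}
    for r in records:
        by_patient.setdefault(r["patient_id"], []).append(r)
    for recs in by_patient.values():
        recs.sort(key=lambda r: r["date"])
        for a, b in itertools.combinations(recs, 2):
            if (b["date"] - a["date"]).days >= min_gap_days:
                yield a["image"], b["image"]

recs = [
    {"patient_id": 1, "date": date(2001, 1, 1), "image": "ecg_a.png"},
    {"patient_id": 1, "date": date(2003, 6, 1), "image": "ecg_b.png"},
    {"patient_id": 2, "date": date(2002, 2, 2), "image": "ecg_c.png"},
]
print(list(make_bcl_pairs(recs)))                # [('ecg_a.png', 'ecg_b.png')]
```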


Subject(s)
Atrial Fibrillation; Deep Learning; Humans; Artificial Intelligence; Electrocardiography; Biometry
7.
medRxiv; 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-37808685

ABSTRACT

Importance: Aortic stenosis (AS) is a major public health challenge with a growing therapeutic landscape, but current biomarkers do not inform personalized screening and follow-up. Objective: A video-based artificial intelligence (AI) biomarker (Digital AS Severity index [DASSi]) can detect severe AS using single-view long-axis echocardiography without Doppler. Here, we deploy DASSi to patients with no or mild/moderate AS at baseline to identify AS development and progression. Design, Setting, and Participants: We defined two cohorts of patients without severe AS undergoing echocardiography in the Yale-New Haven Health System (YNHHS) (2015-2021, 4.1[IQR:2.4-5.4] follow-up years) and Cedars-Sinai Medical Center (CSMC) (2018-2019, 3.4[IQR:2.8-3.9] follow-up years). We further developed a novel computational pipeline for the cross-modality translation of DASSi into cardiac magnetic resonance (CMR) imaging in the UK Biobank (2.5[IQR:1.6-3.9] follow-up years). Analyses were performed between August 2023 and February 2024. Exposure: DASSi (range: 0-1) derived from AI applied to echocardiography and CMR videos. Main Outcomes and Measures: Annualized change in peak aortic valve velocity (AV-Vmax) and late (>6 months) aortic valve replacement (AVR). Results: A total of 12,599 participants were included in the echocardiographic study (YNHHS: n=8,798, median age of 71 [IQR (interquartile range):60-80] years, 4250 [48.3%] women, and CSMC: n=3,801, 67 [IQR:54-78] years, 1685 [44.3%] women). Higher baseline DASSi was associated with faster progression in AV-Vmax (per 0.1 DASSi increments: YNHHS: +0.033 m/s/year [95%CI:0.028-0.038], n=5,483, and CSMC: +0.082 m/s/year [0.053-0.111], n=1,292), with levels ≥ vs <0.2 linked to a 4-to-5-fold higher AVR risk (715 events in YNHHS; adj.HR 4.97 [95%CI: 2.71-5.82], 56 events in CSMC: 4.04 [0.92-17.7]), independent of age, sex, ethnicity/race, ejection fraction and AV-Vmax. This was reproduced across 45,474 participants (median age 65 [IQR:59-71] years, 23,559 [51.8%] women) undergoing CMR in the UK Biobank (adj.HR 11.4 [95%CI:2.56-50.60] for DASSi ≥ vs <0.2). Saliency maps and phenome-wide association studies supported links with traditional cardiovascular risk factors and diastolic dysfunction. Conclusions and Relevance: In this cohort study of patients without severe AS undergoing echocardiography or CMR imaging, a new AI-based video biomarker is independently associated with AS development and progression, enabling opportunistic risk stratification across cardiovascular imaging modalities as well as potential application on handheld devices.

8.
ArXiv; 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-37986726

ABSTRACT

Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.

9.
ArXiv; 2023 Aug 17.
Article in English | MEDLINE | ID: mdl-37791108

ABSTRACT

Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance. However, the nuanced ways in which pruning impacts model behavior are not well understood, particularly for long-tailed, multi-label datasets commonly found in clinical settings. This knowledge gap could have dangerous implications when deploying a pruned model for diagnosis, where unexpected model behavior could impact patient well-being. To fill this gap, we perform the first analysis of pruning's effect on neural networks trained to diagnose thorax diseases from chest X-rays (CXRs). On two large CXR datasets, we examine which diseases are most affected by pruning and characterize class "forgettability" based on disease frequency and co-occurrence behavior. Further, we identify individual CXRs where uncompressed and heavily pruned models disagree, known as pruning-identified exemplars (PIEs), and conduct a human reader study to evaluate their unifying qualities. We find that radiologists perceive PIEs as having more label noise, lower image quality, and higher diagnosis difficulty. This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification. All code, model weights, and data access instructions can be found at https://github.com/VITA-Group/PruneCXR.
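
Operationally, PIEs can be found by running both models over the same data and flagging inputs where their thresholded multi-label predictions diverge. A minimal sketch (the threshold and model/loader objects are assumptions):

```python
# Flag pruning-identified exemplars (PIEs): inputs where the dense and heavily
# pruned models disagree on at least one predicted label.
import torch

@torch.no_grad()
def find_pies(dense_model, pruned_model, loader, threshold=0.5):
    pie_indices, offset = [], 0
    for x, _ in loader:                          # loader yields (images, labels)
        d = torch.sigmoid(dense_model(x)) >= threshold
        p = torch.sigmoid(pruned_model(x)) >= threshold
        disagree = (d != p).any(dim=1)           # any label flipped?
        pie_indices += (offset + torch.nonzero(disagree).flatten()).tolist()
        offset += x.size(0)
    return pie_indices
```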

10.
Med Image Comput Comput Assist Interv; 14224: 663-673, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37829549

ABSTRACT

Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance. However, the nuanced ways in which pruning impacts model behavior are not well understood, particularly for long-tailed, multi-label datasets commonly found in clinical settings. This knowledge gap could have dangerous implications when deploying a pruned model for diagnosis, where unexpected model behavior could impact patient well-being. To fill this gap, we perform the first analysis of pruning's effect on neural networks trained to diagnose thorax diseases from chest X-rays (CXRs). On two large CXR datasets, we examine which diseases are most affected by pruning and characterize class "forgettability" based on disease frequency and co-occurrence behavior. Further, we identify individual CXRs where uncompressed and heavily pruned models disagree, known as pruning-identified exemplars (PIEs), and conduct a human reader study to evaluate their unifying qualities. We find that radiologists perceive PIEs as having more label noise, lower image quality, and higher diagnosis difficulty. This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification. All code, model weights, and data access instructions can be found at https://github.com/VITA-Group/PruneCXR.

11.
Nat Commun; 14(1): 6261, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37803009

ABSTRACT

Deep learning has become a popular tool for computer-aided diagnosis using medical images, sometimes matching or exceeding the performance of clinicians. However, these models can also reflect and amplify human bias, potentially resulting in inaccurate or missed diagnoses. Despite this concern, the problem of improving the fairness of deep learning models for medical image classification has yet to be fully studied. To address this issue, we propose an algorithm that leverages the marginal pairwise equal opportunity to reduce bias in medical image classification. Our evaluations across four tasks using four independent large-scale cohorts demonstrate that our proposed algorithm not only improves fairness in individual and intersectional subgroups but also maintains overall performance. Specifically, the relative change in pairwise fairness difference between our proposed model and the baseline model was reduced by over 35%, while the relative change in AUC value was typically within 1%. By reducing the bias generated by deep learning models, our proposed approach can potentially alleviate concerns about the fairness and reliability of image-based computer-aided diagnosis.
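
While the paper's exact algorithm is not reproduced here, the quantity it targets can be sketched: a marginal pairwise equal-opportunity gap, i.e., the mean absolute difference in true-positive rate between every pair of subgroups, which a training penalty would then drive down.

```python
# Illustrative pairwise equal-opportunity gap (not the authors' exact criterion).
import itertools
import numpy as np

def pairwise_equal_opportunity_gap(y_true, y_pred, groups):
    """y_true, y_pred: binary arrays; groups: subgroup label per sample."""
    tprs = {}
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 1)
        if mask.sum() > 0:
            tprs[g] = y_pred[mask].mean()        # TPR within subgroup g
    gaps = [abs(tprs[a] - tprs[b]) for a, b in itertools.combinations(tprs, 2)]
    return float(np.mean(gaps)) if gaps else 0.0

y = np.array([1, 1, 1, 1, 0, 0])
yhat = np.array([1, 0, 1, 1, 0, 1])
g = np.array(["A", "A", "B", "B", "A", "B"])
print(pairwise_equal_opportunity_gap(y, yhat, g))  # |0.5 - 1.0| = 0.5
```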


Subject(s)
Algorithms; Diagnosis, Computer-Assisted; Humans; Reproducibility of Results; Diagnosis, Computer-Assisted/methods; Computers
12.
medRxiv; 2023 Sep 14.
Article in English | MEDLINE | ID: mdl-37745527

ABSTRACT

Objective: Artificial intelligence (AI) detects heart disease from images of electrocardiograms (ECGs); however, traditional supervised learning is limited by the need for large amounts of labeled data. We report the development of Biometric Contrastive Learning (BCL), a self-supervised pretraining approach for label-efficient deep learning on ECG images. Materials and Methods: Using pairs of ECGs from 78,288 individuals from Yale (2000-2015), we trained a convolutional neural network to identify temporally separated pairs of ECGs from the same patient that varied in layout. We fine-tuned BCL-pretrained models to detect atrial fibrillation (AF), gender, and LVEF<40%, using ECGs from 2015-2021. We externally tested the models in cohorts from Germany and the US. We compared BCL with random initialization and general-purpose self-supervised contrastive learning for images (simCLR). Results: With 100% of labeled training data, BCL performed similarly to the other approaches for detecting AF/Gender/LVEF<40%, with AUROC of 0.98/0.90/0.90 in the held-out test sets, but it consistently outperformed the other methods at smaller proportions of labeled data, reaching equivalent performance with only 50% of the data. With 0.1% of the data, BCL achieved AUROC of 0.88/0.79/0.75, compared with 0.51/0.52/0.60 (random) and 0.61/0.53/0.49 (simCLR). In external validation, BCL outperformed other methods even at 100% labeled training data, with AUROC of 0.88/0.88 for Gender and LVEF<40% compared with 0.83/0.83 (random) and 0.84/0.83 (simCLR). Discussion and Conclusion: A pretraining strategy that leverages biometric signatures of different ECGs from the same patient enhances the efficiency of developing AI models for ECG images. This represents a major advance in detecting disorders from ECG images with limited labeled data.

13.
Eur Heart J; 44(43): 4592-4604, 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-37611002

ABSTRACT

BACKGROUND AND AIMS: Early diagnosis of aortic stenosis (AS) is critical to prevent morbidity and mortality but requires skilled examination with Doppler imaging. This study reports the development and validation of a novel deep learning model that relies on two-dimensional (2D) parasternal long axis videos from transthoracic echocardiography without Doppler imaging to identify severe AS, suitable for point-of-care ultrasonography. METHODS AND RESULTS: In a training set of 5257 studies (17 570 videos) from 2016 to 2020 [Yale-New Haven Hospital (YNHH), Connecticut], an ensemble of three-dimensional convolutional neural networks was developed to detect severe AS, leveraging self-supervised contrastive pretraining for label-efficient model development. This deep learning model was validated in a temporally distinct set of 2040 consecutive studies from 2021 from YNHH as well as two geographically distinct cohorts of 4226 and 3072 studies, from California and other hospitals in New England, respectively. The deep learning model achieved an area under the receiver operating characteristic curve (AUROC) of 0.978 (95% CI: 0.966, 0.988) for detecting severe AS in the temporally distinct test set, maintaining its diagnostic performance in geographically distinct cohorts [0.952 AUROC (95% CI: 0.941, 0.963) in California and 0.942 AUROC (95% CI: 0.909, 0.966) in New England]. The model was interpretable with saliency maps identifying the aortic valve, mitral annulus, and left atrium as the predictive regions. Among non-severe AS cases, predicted probabilities were associated with worse quantitative metrics of AS suggesting an association with various stages of AS severity. CONCLUSION: This study developed and externally validated an automated approach for severe AS detection using single-view 2D echocardiography, with potential utility for point-of-care screening.
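
The inference step of such an ensemble is simple to sketch: average the severe-AS probability across several 3D CNNs applied to the same parasternal long-axis clip. The backbone below (torchvision's r3d_18) is a stand-in assumption, not necessarily the architecture used in the study.

```python
# Sketch of ensemble inference for severe AS from a single-view echo clip.
import torch
from torchvision.models.video import r3d_18

def make_model():
    m = r3d_18(weights=None)                     # stand-in 3D CNN backbone
    m.fc = torch.nn.Linear(m.fc.in_features, 1)  # binary head: severe AS
    return m.eval()

models = [make_model() for _ in range(3)]        # three-member ensemble

@torch.no_grad()
def ensemble_probability(clip):
    """clip: (1, 3, T, H, W) video tensor; returns mean predicted probability."""
    return torch.stack([torch.sigmoid(m(clip)) for m in models]).mean().item()

print(ensemble_probability(torch.randn(1, 3, 16, 112, 112)))
```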


Subject(s)
Aortic Valve Stenosis; Deep Learning; Humans; Echocardiography; Aortic Valve Stenosis/diagnostic imaging; Aortic Valve Stenosis/complications; Aortic Valve/diagnostic imaging; Ultrasonography
14.
IEEE Trans Med Imaging; 42(3): 750-761, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36288235

ABSTRACT

Before the recent success of deep learning methods for automated medical image analysis, practitioners used handcrafted radiomic features to quantitatively describe local patches of medical images. However, extracting discriminative radiomic features relies on accurate pathology localization, which is difficult to acquire in real-world settings. Despite advances in disease classification and localization from chest X-rays, many approaches fail to incorporate clinically informed, domain-specific radiomic features. For these reasons, we propose a Radiomics-Guided Transformer (RGT) that fuses global image information with local radiomics-guided auxiliary information to provide accurate cardiopulmonary pathology localization and classification without any bounding box annotations. RGT consists of an image Transformer branch, a radiomics Transformer branch, and fusion layers that aggregate image and radiomics information. Using the learned self-attention of its image branch, RGT extracts a bounding box for which to compute radiomic features, which are further processed by the radiomics branch; learned image and radiomic features are then fused and mutually interact via cross-attention layers. Thus, RGT utilizes a novel end-to-end feedback loop that can bootstrap accurate pathology localization only using image-level disease labels. Experiments on the NIH ChestXRay dataset demonstrate that RGT outperforms prior works in weakly supervised disease localization (by an average margin of 3.6% over various intersection-over-union thresholds) and classification (by 1.1% in average area under the receiver operating characteristic curve). We publicly release our codes and pre-trained models at https://github.com/VITA-Group/chext.
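
The fusion layers at the core of RGT can be sketched as symmetric cross-attention between the two token streams; the dimensions and residual wiring below are assumptions, not RGT's published configuration.

```python
# Minimal cross-attention fusion between image tokens and radiomics tokens.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.img_to_rad = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.rad_to_img = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img_tokens, rad_tokens):
        # each branch queries the other modality's tokens, with residuals
        img_fused, _ = self.img_to_rad(img_tokens, rad_tokens, rad_tokens)
        rad_fused, _ = self.rad_to_img(rad_tokens, img_tokens, img_tokens)
        return img_tokens + img_fused, rad_tokens + rad_fused

fusion = CrossAttentionFusion()
img = torch.randn(2, 196, 256)                   # image-branch tokens
rad = torch.randn(2, 16, 256)                    # radiomics-branch tokens
out_img, out_rad = fusion(img, rad)
print(out_img.shape, out_rad.shape)
```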


Subject(s)
X-Rays; Radiography; ROC Curve
15.
Article in English | MEDLINE | ID: mdl-36318048

ABSTRACT

Imaging exams, such as chest radiography, will yield a small set of common findings and a much larger set of uncommon findings. While a trained radiologist can learn the visual presentation of rare conditions by studying a few representative examples, teaching a machine to learn from such a "long-tailed" distribution is much more difficult, as standard methods would be easily biased toward the most frequent classes. In this paper, we present a comprehensive benchmark study of the long-tailed learning problem in the specific domain of thorax diseases on chest X-rays. We focus on learning from naturally distributed chest X-ray data, optimizing classification accuracy over not only the common "head" classes, but also the rare yet critical "tail" classes. To accomplish this, we introduce a challenging new long-tailed chest X-ray benchmark to facilitate research on developing long-tailed learning methods for medical image classification. The benchmark consists of two chest X-ray datasets for 19- and 20-way thorax disease classification, containing classes with as many as 53,000 and as few as 7 labeled training images. We evaluate both standard and state-of-the-art long-tailed learning methods on this new benchmark, analyzing which aspects of these methods are most beneficial for long-tailed medical image classification and summarizing insights for future algorithm design. The datasets, trained models, and code are available at https://github.com/VITA-Group/LongTailCXR.
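
The evaluation style such a benchmark encourages can be sketched as reporting mean per-class accuracy separately over frequent "head" classes and rare "tail" classes; the cutoff below is an assumed illustration, not the benchmark's definition.

```python
# Group per-class accuracy into head vs tail by training-set frequency.
import numpy as np

def grouped_class_accuracy(per_class_acc, train_counts, tail_cutoff=100):
    acc, counts = np.asarray(per_class_acc), np.asarray(train_counts)
    head, tail = acc[counts > tail_cutoff], acc[counts <= tail_cutoff]
    return {"head": head.mean() if head.size else None,
            "tail": tail.mean() if tail.size else None,
            "all": acc.mean()}

acc = [0.92, 0.88, 0.61, 0.33, 0.12]             # per-class accuracies (made up)
counts = [53000, 8000, 450, 40, 7]               # training images per class
print(grouped_class_accuracy(acc, counts))
```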
