Results 1 - 4 of 4

1.
Med Image Comput Comput Assist Interv; 14220: 651-662, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38751905

ABSTRACT

Deep learning nowadays offers expert-level and sometimes even super-expert-level performance, but achieving such performance demands massive annotated data for training (e.g., Google's proprietary CXR Foundation Model (CXR-FM) was trained on 821,544 labeled and mostly private chest X-rays (CXRs)). Numerous datasets are publicly available in medical imaging, but they are individually small and heterogeneous in their expert labels. We envision a powerful and robust foundation model that can be trained by aggregating numerous small public datasets. To realize this vision, we have developed Ark, a framework that accrues and reuses knowledge from heterogeneous expert annotations in various datasets. As a proof of concept, we have trained two Ark models on 335,484 and 704,363 CXRs, respectively, by merging several datasets including ChestX-ray14, CheXpert, MIMIC-II, and VinDr-CXR. We evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear probing, and gender-bias analysis, and demonstrated Ark's superior and robust performance over state-of-the-art (SOTA) fully/self-supervised baselines and Google's proprietary CXR-FM. This enhanced performance is attributed to our simple yet powerful observation that aggregating numerous public datasets diversifies patient populations and accrues knowledge from diverse experts, yielding unprecedented performance while saving annotation cost. With all code and pretrained models released at GitHub.com/JLiangLab/Ark, we hope that Ark exerts an important impact on open science: accruing and reusing knowledge from expert annotations in public datasets can potentially surpass the performance of proprietary models trained on unusually large data, inspiring many more researchers worldwide to share code and datasets to build open foundation models, accelerate open science, and democratize deep learning for medical imaging.
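
The core aggregation idea described above (a shared backbone with one task head per source dataset, so heterogeneous expert label sets never conflict) can be sketched roughly as follows. This is a minimal illustration only, not the released Ark code; the DenseNet backbone, the label counts, and the training loop are assumptions.

import torch
import torch.nn as nn
import torchvision.models as models

class MultiHeadClassifier(nn.Module):
    """Shared backbone with one classification head per source dataset."""
    def __init__(self, num_classes_per_dataset):
        super().__init__()
        backbone = models.densenet121(weights=None)   # backbone choice is an assumption
        feat_dim = backbone.classifier.in_features
        backbone.classifier = nn.Identity()           # keep only the feature extractor
        self.backbone = backbone
        # One head per dataset, so heterogeneous label sets never conflict.
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, n) for n in num_classes_per_dataset]
        )

    def forward(self, x, dataset_idx):
        return self.heads[dataset_idx](self.backbone(x))

# Hypothetical label counts for three merged CXR datasets.
model = MultiHeadClassifier(num_classes_per_dataset=[14, 13, 6])
criterion = nn.BCEWithLogitsLoss()                    # multi-label chest X-ray tasks
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(images, labels, dataset_idx):
    """Route each batch to the head of the dataset it came from."""
    optimizer.zero_grad()
    loss = criterion(model(images, dataset_idx), labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()

Cycling such batches across all merged datasets is what lets the shared backbone accrue knowledge from every set of expert annotations at once.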

2.
Domain Adapt Represent Transf (2022); 13542: 77-87, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36507898

ABSTRACT

Vision transformer-based self-supervised learning (SSL) approaches have recently shown substantial success in learning visual representations from unannotated photographic images. However, their acceptance in medical imaging is still lukewarm, due to the significant discrepancy between medical and photographic images. Consequently, we propose POPAR (patch order prediction and appearance recovery), a novel vision transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and unique properties of medical imaging, aiming to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. We transfer POPAR pretrained models to diverse downstream tasks. The experimental results suggest that (1) POPAR outperforms state-of-the-art (SoTA) self-supervised models with a vision transformer backbone; (2) POPAR achieves significantly better performance than all three SoTA contrastive learning methods; and (3) POPAR also outperforms fully supervised pretrained models across architectures. In addition, our ablation study suggests that to achieve better performance on medical imaging tasks, both fine-grained and global contextual features are preferred. All code and models are available at GitHub.com/JLiangLab/POPAR.
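
A minimal sketch of the two pretext tasks described above (patch order prediction plus appearance recovery), not the authors' implementation; the encoder size, the Gaussian-noise stand-in for appearance distortion, and the equal loss weighting are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchOrderAndRecovery(nn.Module):
    """Shuffle (and lightly distort) patches, then predict original order and recover appearance."""
    def __init__(self, num_patches=196, patch_dim=16 * 16, embed_dim=384, depth=6):
        super().__init__()
        self.embed = nn.Linear(patch_dim, embed_dim)
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=6, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.order_head = nn.Linear(embed_dim, num_patches)   # which position did this token come from?
        self.recover_head = nn.Linear(embed_dim, patch_dim)   # reconstruct the clean patch pixels

    def forward(self, patches):
        # patches: (B, N, patch_dim), flattened grayscale patches
        B, N, D = patches.shape
        perm = torch.stack([torch.randperm(N, device=patches.device) for _ in range(B)])
        shuffled = torch.gather(patches, 1, perm.unsqueeze(-1).expand(-1, -1, D))
        distorted = shuffled + 0.1 * torch.randn_like(shuffled)   # crude stand-in for appearance distortion
        tokens = self.encoder(self.embed(distorted))
        order_loss = F.cross_entropy(self.order_head(tokens).reshape(B * N, N),
                                     perm.reshape(B * N))
        recover_loss = F.mse_loss(self.recover_head(tokens), shuffled)   # recover clean appearance
        return order_loss + recover_loss

# Usage: loss = PatchOrderAndRecovery()(torch.rand(8, 196, 256)); loss.backward()

The order loss drives global contextual learning (where does this patch belong?), while the recovery loss drives fine-grained learning (what should this patch look like?), matching the two objectives the abstract describes.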

3.
Domain Adapt Represent Transf (2022); 13542: 12-22, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36383492

ABSTRACT

Visual transformers have recently gained popularity in the computer vision community as they began to outrank convolutional neural networks (CNNs) in one representative visual benchmark after another. However, the competition between visual transformers and CNNs in medical imaging is rarely studied, leaving many important questions unanswered. As the first step, we benchmark how well existing transformer variants that use various (supervised and self-supervised) pre-training methods perform against CNNs on a variety of medical classification tasks. Furthermore, given the data-hungry nature of transformers and the annotation-deficiency challenge of medical imaging, we present a practical approach for bridging the domain gap between photographic and medical images by utilizing unlabeled large-scale in-domain data. Our extensive empirical evaluations reveal the following insights in medical imaging: (1) good initialization is more crucial for transformer-based models than for CNNs, (2) self-supervised learning based on masked image modeling captures more generalizable representations than supervised models, and (3) assembling a larger-scale domain-specific dataset can better bridge the domain gap between photographic and medical images via self-supervised continuous pre-training. We hope this benchmark study can direct future research on applying transformers to medical imaging analysis. All codes and pre-trained models are available on our GitHub page https://github.com/JLiangLab/BenchmarkTransformers.
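
As a rough illustration of insight (1) above, the same vision transformer can be instantiated with and without pre-trained weights and then fine-tuned identically on a medical classification task; the timm model name and class count below are assumptions, not taken from the paper.

import timm
import torch.nn as nn

def build_classifier(pretrained: bool, num_classes: int = 14) -> nn.Module:
    # Insight (1): initialization matters far more for transformers than for CNNs,
    # so the same architecture is fine-tuned from random vs. pre-trained weights.
    return timm.create_model("vit_base_patch16_224", pretrained=pretrained, num_classes=num_classes)

random_init = build_classifier(pretrained=False)     # typically much weaker after fine-tuning
pretrained_init = build_classifier(pretrained=True)  # ImageNet or self-supervised initialization
# Both models are then fine-tuned with an identical schedule on the same labeled medical dataset.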

4.
Sensors (Basel); 20(12), 2020 Jun 19.
Article in English | MEDLINE | ID: mdl-32575567

ABSTRACT

Like natural images, remote sensing scene images, whose quality reflects the imaging performance of the remote sensor, also suffer from degradation caused by the imaging system. However, current methods for measuring imaging performance in engineering applications require particular image patterns and lack generality. Therefore, a more universal approach is needed to assess the imaging performance of remote sensors without constraints on land cover. Because existing general-purpose blind image quality assessment (BIQA) methods cannot obtain satisfactory results on remote sensing scene images, in this work we propose BM-IQE, a BIQA model with improved performance on natural images as well as remote sensing scene images. We employ a novel block-matching strategy called Structural Similarity Block-Matching (SSIM-BM) to match and group similar image patches. In this way, the potential local information shared among different patches can be expressed, enhancing the validity of natural scene statistics (NSS) feature modeling. At the same time, we introduce several features to better characterize and express remote sensing images. The NSS features are extracted from each group, and the feature vectors are then fitted to a multivariate Gaussian (MVG) model. This MVG model is compared against a reference MVG model learned from a corpus of high-quality natural images to produce a basic quality estimate for each patch (the centroid of each group). A refined quality estimate for each patch is then obtained by a weighted average of the basic quality estimates of its similar patches. The overall quality score of the test image is finally computed through average pooling of the patch estimates. Extensive experiments demonstrate that the proposed BM-IQE method not only outperforms other BIQA methods on remote sensing scene image datasets but also achieves competitive performance on general-purpose natural image datasets compared to existing state-of-the-art FR/NR-IQA methods.
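
A hedged sketch of the scoring pipeline described above (SSIM-based grouping, NSS feature extraction, MVG fitting, and comparison against a pristine reference MVG). The toy feature set, the similarity threshold, and the plain averaging used in place of the paper's weighted refinement are simplifications and assumptions, not the published BM-IQE implementation.

import numpy as np
from skimage.metrics import structural_similarity as ssim

def nss_features(patch):
    """Toy stand-in for the paper's NSS features: first four moments of the patch."""
    p = patch.astype(np.float64).ravel()
    c = p - p.mean()
    return np.array([p.mean(), p.var(), (c ** 3).mean(), (c ** 4).mean()])

def mvg_fit(feature_vectors):
    """Fit a multivariate Gaussian (mean, covariance) to a set of feature vectors."""
    feats = np.stack(feature_vectors)
    mu = feats.mean(axis=0)
    if len(feats) < 2:                                  # degenerate group: only the centroid matched
        return mu, np.eye(feats.shape[1]) * 1e-6
    return mu, np.cov(feats, rowvar=False)

def mvg_distance(mu1, cov1, mu2, cov2):
    """NIQE-style distance between two multivariate Gaussians."""
    diff = (mu1 - mu2).reshape(-1, 1)
    pooled = (cov1 + cov2) / 2.0
    d2 = (diff.T @ np.linalg.pinv(pooled) @ diff).item()
    return float(np.sqrt(d2))

def bm_iqe_score(patches, mu_ref, cov_ref, sim_threshold=0.6):
    """patches: list of 2-D uint8 arrays (>= 7x7); (mu_ref, cov_ref): MVG of pristine-image features."""
    # 1. Block matching: group the patches similar (by SSIM) to each centroid patch.
    #    (O(N^2) SSIM comparisons; the paper's SSIM-BM strategy is more structured.)
    groups = [[p for p in patches if ssim(centroid, p, data_range=255) >= sim_threshold]
              for centroid in patches]
    # 2. Fit an MVG to each group's NSS features and measure its distance to the reference MVG.
    basic = [mvg_distance(*mvg_fit([nss_features(p) for p in g]), mu_ref, cov_ref) for g in groups]
    # 3. Simplified pooling: plain averaging instead of the paper's weighted refinement step.
    return float(np.mean(basic))

A larger distance to the reference MVG indicates a stronger departure from natural-scene statistics and therefore lower predicted quality, which is the same scoring logic the abstract outlines.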
