Results 1 - 8 of 8
1.
Vis Comput Ind Biomed Art ; 7(1): 20, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39101954

ABSTRACT

Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision-language correlations from image-text pairs, such as BLIP-2 and GPT-4, have been intensively investigated. However, despite these developments, the application of LLMs and VLMs to image quality assessment (IQA), particularly in medical imaging, remains unexplored. Such an application would be valuable for objective performance evaluation and could supplement, or even replace, radiologists' opinions. To this end, this study introduces IQAGPT, an innovative computed tomography (CT) IQA system that integrates an image-quality captioning VLM with ChatGPT to generate quality scores and textual reports. First, a CT-IQA dataset comprising 1,000 CT slices with diverse quality levels is professionally annotated and compiled for training and evaluation. To better leverage the capabilities of LLMs, the annotated quality scores are converted into semantically rich text descriptions using a prompt template. Second, the image-quality captioning VLM is fine-tuned on the CT-IQA dataset to generate quality descriptions; the captioning model fuses image and text features through cross-modal attention. Third, based on the quality descriptions, users ask ChatGPT to rate image-quality scores or produce radiological quality reports. The results demonstrate the feasibility of assessing image quality with LLMs. The proposed IQAGPT outperformed GPT-4 and CLIP-IQA, as well as multitask classification and regression models that rely solely on images.
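The abstract does not give the exact prompt template used to convert annotated scores into text. The following minimal Python sketch illustrates the idea under assumptions: the five-level scale, wording, and template are hypothetical, not taken from the paper.

# Minimal sketch of the score-to-text conversion step described above.
# The scale, wording, and template are hypothetical, not the paper's.

QUALITY_LEVELS = {
    0: "severely degraded, with heavy noise and artifacts obscuring anatomy",
    1: "poor, with prominent noise that limits diagnostic confidence",
    2: "acceptable, with moderate noise but clearly visible structures",
    3: "good, with mild noise and well-preserved fine details",
    4: "excellent, nearly noise-free with sharp anatomical detail",
}

def score_to_description(score: int, slice_id: str) -> str:
    """Turn an annotated quality score into a rich caption for VLM fine-tuning."""
    level = QUALITY_LEVELS[score]
    return (f"CT slice {slice_id}: the image quality is {level}. "
            f"Overall quality score: {score} out of 4.")

print(score_to_description(3, "case012_slice045"))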

2.
IEEE Trans Med Imaging ; PP, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38976467

ABSTRACT

Medical image segmentation has been significantly advanced by the rapid development of deep learning (DL) techniques. Existing DL-based segmentation models are typically discriminative; i.e., they learn a mapping from the input image to segmentation masks. However, these discriminative methods neglect the underlying data distribution and intrinsic class characteristics, and thus suffer from an unstable feature space. In this work, we propose to complement discriminative segmentation methods with knowledge of the underlying data distribution from generative models. To that end, we propose a novel hybrid diffusion framework for medical image segmentation, termed HiDiff, which synergizes the strengths of existing discriminative segmentation models and new generative diffusion models. HiDiff comprises two key components: a discriminative segmentor and a diffusion refiner. First, we utilize any conventional trained segmentation model as the discriminative segmentor, which provides a segmentation mask prior for the diffusion refiner. Second, we propose a novel binary Bernoulli diffusion model (BBDM) as the diffusion refiner, which effectively, efficiently, and interactively refines the segmentation mask by modeling the underlying data distribution. Third, we train the segmentor and BBDM in an alternate-collaborative manner so that they mutually boost each other. Extensive experimental results on abdominal organ, brain tumor, polyp, and retinal vessel segmentation datasets, covering four widely used modalities, demonstrate the superior performance of HiDiff over existing medical segmentation algorithms, including state-of-the-art transformer- and diffusion-based ones. In addition, HiDiff excels at segmenting small objects and generalizing to new datasets. Source code is available at https://github.com/takimailto/HiDiff.
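As a rough illustration of the segmentor-plus-refiner structure only, the Python sketch below starts from a segmentor's mask prior and iteratively re-samples a binary mask from a refiner's predicted probabilities. The toy networks and the update rule are assumptions and do not reproduce the paper's BBDM formulation or training.

import torch
import torch.nn as nn

# Rough sketch of the segmentor + diffusion-refiner structure described above.
# Shapes and the update rule are illustrative assumptions, not the exact BBDM.

class TinySegmentor(nn.Module):          # stands in for any trained segmentor
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(1, 1, 3, padding=1)
    def forward(self, img):
        return torch.sigmoid(self.net(img))   # mask prior in [0, 1]

class TinyRefiner(nn.Module):            # predicts refined mask probabilities
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(2, 1, 3, padding=1)  # image + current mask
    def forward(self, img, mask):
        return torch.sigmoid(self.net(torch.cat([img, mask], dim=1)))

def refine(img, segmentor, refiner, steps=4):
    """Start from the segmentor's mask prior, then iteratively re-sample a
    binary mask from the refiner's predicted probabilities."""
    prob = segmentor(img)
    mask = torch.bernoulli(prob)
    for _ in range(steps):
        prob = refiner(img, mask)
        mask = torch.bernoulli(prob)
    return prob

img = torch.randn(1, 1, 64, 64)
refined = refine(img, TinySegmentor(), TinyRefiner())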

3.
IEEE Trans Med Imaging ; 43(5): 1880-1894, 2024 May.
Article in English | MEDLINE | ID: mdl-38194396

ABSTRACT

This paper studies 3D low-dose computed tomography (CT) imaging. Although various deep learning methods have been developed in this context, they typically focus on 2D images and handle low-dose denoising and super-resolution deblurring separately. To date, little work has been done on simultaneous in-plane denoising and through-plane deblurring, which is important for obtaining high-quality 3D CT images with lower radiation dose and faster imaging speed. A straightforward approach to this task is to directly train an end-to-end 3D network, but this demands much more training data and incurs expensive computational costs. Here, we propose to link in-plane and through-plane transformers for simultaneous in-plane denoising and through-plane deblurring, termed LIT-Former, which efficiently synergizes the two sub-tasks for 3D CT imaging and enjoys the advantages of both convolutional and transformer networks. LIT-Former has two novel designs: efficient multi-head self-attention modules (eMSM) and efficient convolutional feed-forward networks (eCFN). First, eMSM integrates in-plane 2D self-attention and through-plane 1D self-attention to efficiently capture the global interactions of 3D self-attention, the core unit of transformer networks. Second, eCFN integrates 2D convolution and 1D convolution to extract the local information of 3D convolution in the same fashion. As a result, LIT-Former synergizes the two sub-tasks, significantly reducing computational complexity compared with 3D counterparts and enabling rapid convergence. Extensive experimental results on simulated and clinical datasets demonstrate superior performance over state-of-the-art models. The source code is available at https://github.com/hao1635/LIT-Former.


Subject(s)
Algorithms , Imaging, Three-Dimensional , Tomography, X-Ray Computed , Tomography, X-Ray Computed/methods , Humans , Imaging, Three-Dimensional/methods , Deep Learning , Phantoms, Imaging
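The eMSM described in this record factorizes 3D self-attention into in-plane 2D attention and through-plane 1D attention. The PyTorch sketch below illustrates that factorization only; the tensor layout, module sizes, and simple additive fusion are assumptions rather than the paper's implementation.

import torch
import torch.nn as nn

# Sketch of factorized attention: in-plane 2D attention per slice plus
# through-plane 1D attention per pixel column, in place of full 3D attention.

class FactorizedAttention3D(nn.Module):
    def __init__(self, dim=32, heads=4):
        super().__init__()
        self.inplane = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.throughplane = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                      # x: (B, D, H, W, C)
        B, D, H, W, C = x.shape
        xy = x.reshape(B * D, H * W, C)        # attend within each slice
        xy, _ = self.inplane(xy, xy, xy)
        z = x.permute(0, 2, 3, 1, 4).reshape(B * H * W, D, C)  # attend across slices
        z, _ = self.throughplane(z, z, z)
        xy = xy.reshape(B, D, H, W, C)
        z = z.reshape(B, H, W, D, C).permute(0, 3, 1, 2, 4)
        return xy + z                          # fuse the two attention paths

x = torch.randn(1, 8, 16, 16, 32)
out = FactorizedAttention3D()(x)               # (1, 8, 16, 16, 32)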
4.
IEEE Trans Med Imaging ; 43(5): 1866-1879, 2024 May.
Article in English | MEDLINE | ID: mdl-38194399

ABSTRACT

Metal implants and other high-density objects in patients introduce severe streaking artifacts in CT images, compromising image quality and diagnostic performance. Although various methods for CT metal artifact reduction (MAR) have been developed over the past decades, including the latest dual-domain deep networks, residual metal artifacts remain clinically challenging in many cases. Here we extend the state-of-the-art dual-domain deep network approach into a quad-domain counterpart, so that all the features in the sinogram, the image, and their corresponding Fourier domains are synergized to eliminate metal artifacts optimally without compromising structural subtleties. Our proposed quad-domain network for MAR, referred to as Quad-Net, incurs little additional computational cost since the Fourier transform is highly efficient, and works across the four receptive fields to learn both global and local features as well as their relations. Specifically, we first design a Sinogram-Fourier Restoration Network (SFR-Net) in the sinogram domain and its Fourier space to faithfully inpaint metal-corrupted traces. We then couple SFR-Net with an Image-Fourier Refinement Network (IFR-Net), which takes both an image and its Fourier spectrum to improve the CT image reconstructed from the SFR-Net output using cross-domain contextual information. Quad-Net is trained on clinical datasets to minimize a composite loss function and does not require precise metal masks, which is of great importance in clinical practice. Our experimental results demonstrate the superiority of Quad-Net over state-of-the-art MAR methods quantitatively, visually, and statistically. The Quad-Net code is publicly available at https://github.com/longzilicart/Quad-Net.


Subject(s)
Artifacts , Metals , Tomography, X-Ray Computed , Humans , Tomography, X-Ray Computed/methods , Metals/chemistry , Fourier Analysis , Algorithms , Deep Learning , Prostheses and Implants , Image Processing, Computer-Assisted/methods , Phantoms, Imaging
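The core idea in this record is to pair a spatial-domain branch with a Fourier-domain branch in both the sinogram and image domains. The PyTorch sketch below shows the general pattern of such a dual spatial/spectral block; the layer sizes and the additive fusion are assumptions, not Quad-Net's architecture.

import torch
import torch.nn as nn

# Sketch of a spatial branch paired with a Fourier-domain branch on the
# same feature map; shapes and fusion are illustrative assumptions.

class SpatialFourierBlock(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.spatial = nn.Conv2d(ch, ch, 3, padding=1)
        self.spectral = nn.Conv2d(2 * ch, 2 * ch, 1)    # mixes real/imag parts

    def forward(self, x):                               # x: (B, C, H, W)
        s = self.spatial(x)                             # local, spatial features
        f = torch.fft.rfft2(x, norm="ortho")            # global, spectral features
        f = torch.cat([f.real, f.imag], dim=1)
        f = self.spectral(f)
        real, imag = f.chunk(2, dim=1)
        f = torch.fft.irfft2(torch.complex(real, imag), s=x.shape[-2:], norm="ortho")
        return s + f                                    # fuse both receptive fields

x = torch.randn(1, 16, 64, 64)
y = SpatialFourierBlock()(x)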
5.
iScience ; 27(1): 108608, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38174317

ABSTRACT

Magnetic resonance imaging (MRI) is a widely used imaging modality in clinics for disease diagnosis, staging, and follow-up. Deep learning has been extensively used to accelerate k-space data acquisition, enhance MR image reconstruction, and automate tissue segmentation. However, these three tasks are usually treated independently and each is optimized for evaluation by radiologists, ignoring the strong dependencies among them; this can be suboptimal for downstream intelligent processing. Here, we present a novel paradigm, full-stack learning (FSL), which solves the three tasks simultaneously by considering the overall imaging process and leverages the strong dependence among them to further improve each task, significantly boosting the efficiency and efficacy of practical MRI workflows. Experimental results obtained on multiple open MR datasets validate the superiority of FSL over existing state-of-the-art methods on each task. FSL has great potential to optimize the practical workflow of MRI for medical diagnosis and radiotherapy.
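The essential move in full-stack learning is to optimize the cascaded stages jointly rather than one at a time. The PyTorch sketch below shows such joint training with a composite loss; the toy networks, data, and loss weights are placeholders under assumptions, not the paper's architecture.

import torch
import torch.nn as nn

# Minimal sketch of jointly training cascaded MRI stages with one composite
# loss; the toy networks and the 0.5 weight are illustrative placeholders.

recon_net = nn.Conv2d(1, 1, 3, padding=1)    # stands in for the reconstruction stage
seg_net = nn.Conv2d(1, 1, 3, padding=1)      # stands in for the segmentation stage
opt = torch.optim.Adam(list(recon_net.parameters()) + list(seg_net.parameters()), lr=1e-3)

undersampled = torch.randn(4, 1, 32, 32)     # toy batch
target_image = torch.randn(4, 1, 32, 32)
target_mask = torch.randint(0, 2, (4, 1, 32, 32)).float()

recon = recon_net(undersampled)
seg = seg_net(recon)                          # segmentation consumes the reconstruction
loss = nn.functional.mse_loss(recon, target_image) \
     + 0.5 * nn.functional.binary_cross_entropy_with_logits(seg, target_mask)
opt.zero_grad()
loss.backward()
opt.step()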

6.
Article in English | MEDLINE | ID: mdl-38261490

ABSTRACT

Patients with mild cognitive impairment (MCI) are at high risk of progression to Alzheimer's disease (AD). Existing works on identifying progressive MCI (pMCI) typically require MCI subtype labels, pMCI vs. stable MCI (sMCI), determined by whether an MCI patient progresses to AD over a long follow-up. However, prospectively acquiring MCI subtype data is time-consuming and resource-intensive, and the resultant small datasets can lead to severe overfitting and difficulty in extracting discriminative information. Inspired by the observation that various longitudinal biomarkers and cognitive measurements follow an ordinal pathway of AD progression, we propose a novel Hybrid-granularity Ordinal PrototypE learning (HOPE) method that characterizes this ordinal progression for MCI progression prediction. First, HOPE learns an ordinal metric space that enables progression prediction by prototype comparison. Second, HOPE leverages a novel hybrid-granularity ordinal loss to learn the ordinal nature of AD by effectively integrating instance-to-instance ordinality, instance-to-class compactness, and class-to-class separation. Third, to make the prototype learning more stable, HOPE employs an exponential moving average strategy to dynamically learn the global prototypes of normal controls (NC) and AD. Experimental results on the internal ADNI and external NACC datasets demonstrate the superiority of HOPE over existing state-of-the-art methods, as well as its interpretability. Source code is available at https://github.com/thibault-wch/HOPE-for-mild-cognitive-impairment.
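To make the prototype mechanics concrete, the Python sketch below shows an exponential-moving-average update of two class prototypes and a nearest-prototype prediction. The momentum value, embedding size, and two-prototype setup are illustrative assumptions; the paper's hybrid-granularity ordinal loss is not reproduced here.

import torch

# Sketch of EMA prototype updates and nearest-prototype prediction;
# momentum and dimensions are illustrative assumptions.

dim, momentum = 64, 0.99
prototypes = torch.zeros(2, dim)              # row 0: NC prototype, row 1: AD prototype

def update_prototypes(features, labels):
    """EMA update of class prototypes from a batch of embeddings."""
    for c in (0, 1):
        if (labels == c).any():
            batch_mean = features[labels == c].mean(dim=0)
            prototypes[c] = momentum * prototypes[c] + (1 - momentum) * batch_mean

def predict(feature):
    """Predict by distance to the NC and AD prototypes; an MCI embedding
    closer to the AD prototype suggests progression (pMCI)."""
    dists = torch.cdist(feature.unsqueeze(0), prototypes).squeeze(0)
    return int(dists.argmin())

update_prototypes(torch.randn(8, dim), torch.randint(0, 2, (8,)))
print(predict(torch.randn(dim)))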

7.
Med Image Anal ; 91: 103032, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37995628

ABSTRACT

Alzheimer's disease (AD) is one of the most common neurodegenerative disorders, presenting irreversible progression of cognitive impairment. Identifying AD as early as possible is critical for intervention with potential preventive measures. Among the neuroimaging modalities used to diagnose AD, functional positron emission tomography (PET) has higher sensitivity than structural magnetic resonance imaging (MRI), but it is also costlier and often unavailable in many hospitals. Leveraging massive unpaired, unlabeled PET data to improve AD diagnosis from MRI therefore becomes important. To address this challenge, this paper proposes a novel joint learning framework of unsupervised cross-modal synthesis and AD diagnosis that mines underlying shared modality information, improving AD diagnosis from MRI while synthesizing more discriminative PET images. We mine the shared modality information in two ways: diversifying modality information through the cross-modal synthesis network, and locating critical diagnosis-related patterns through the AD diagnosis network. First, to diversify the modality information, we propose a novel unsupervised cross-modal synthesis network that implements the inter-conversion between 3D PET and MRI in a single model modulated by an AdaIN module. Second, to locate shared critical diagnosis-related patterns, we propose an interpretable diagnosis network based on fully 2D convolutions, which takes either the 3D synthesized PET or the original MRI as input. Extensive experimental results on the ADNI dataset show that our framework synthesizes more realistic images, outperforms state-of-the-art AD diagnosis methods, and generalizes better to the external AIBL and NACC datasets.


Subject(s)
Alzheimer Disease , Cognitive Dysfunction , Humans , Alzheimer Disease/pathology , Neuroimaging/methods , Positron-Emission Tomography/methods , Magnetic Resonance Imaging/methods , Learning , Cognitive Dysfunction/diagnostic imaging
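The AdaIN module mentioned in this record is standard adaptive instance normalization: content features are re-normalized to target-style statistics. The Python sketch below shows that formula on 2D toy tensors; the paper operates on 3D volumes and learns the modality statistics, which the sketch does not model.

import torch

# Sketch of adaptive instance normalization (AdaIN): re-normalize content
# features to target-modality statistics. Toy 2D shapes; the paper works in 3D.

def adain(content, style_mean, style_std, eps=1e-5):
    mean = content.mean(dim=(2, 3), keepdim=True)
    std = content.std(dim=(2, 3), keepdim=True) + eps
    return style_std * (content - mean) / std + style_mean

content = torch.randn(1, 8, 32, 32)            # features of the source modality
style_mean = torch.randn(1, 8, 1, 1)           # target-modality statistics (assumed given)
style_std = torch.rand(1, 8, 1, 1) + 0.5
out = adain(content, style_mean, style_std)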
8.
IEEE Signal Process Mag ; 40(2): 89-100, 2023 Mar.
Article in English | MEDLINE | ID: mdl-38404742

ABSTRACT

Since 2016, deep learning (DL) has advanced tomographic imaging with remarkable successes, especially in low-dose computed tomography (LDCT) imaging. Despite being driven by big data, LDCT denoising and pure end-to-end reconstruction networks often suffer from their black-box nature and issues such as instability, which remains a major barrier to applying deep learning in low-dose CT applications. An emerging trend is to integrate imaging physics and models into deep networks, enabling a hybridization of physics/model-based and data-driven elements. In this paper, we systematically review physics/model-based data-driven methods for LDCT, summarize the loss functions and training strategies, evaluate the performance of different methods, and discuss relevant issues and future directions.
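One widely used way to hybridize physics and data in this family of methods is to unroll an iterative reconstruction, alternating a data-fidelity gradient step on the forward model with a learned correction. The Python sketch below illustrates that pattern with a toy linear operator; the operator, step size, and network are placeholders, not any specific method from the review.

import torch
import torch.nn as nn

# Sketch of an unrolled, physics-informed reconstruction: a data-fidelity
# gradient step on a toy forward model alternates with a learned correction.

n = 64
A = torch.randn(n, n) / n ** 0.5                 # toy linear forward model
denoiser = nn.Sequential(nn.Linear(n, n), nn.ReLU(), nn.Linear(n, n))

def unrolled_recon(y, iters=5, step=0.1):
    x = torch.zeros(n)
    for _ in range(iters):
        grad = A.T @ (A @ x - y)                 # physics: data-fidelity gradient
        x = x - step * grad                      # model-based update
        x = x - denoiser(x)                      # data-driven residual correction
    return x

y = A @ torch.randn(n)                           # simulated measurement
x_hat = unrolled_recon(y)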
