Pesquisa | BVS - MINISTÉRIO DA SAÚDE

1.

LIT-Former: Linking In-Plane and Through-Plane Transformers for Simultaneous CT Image Denoising and Deblurring.

Chen, Zhihao; Niu, Chuang; Gao, Qi; Wang, Ge; Shan, Hongming.

IEEE Trans Med Imaging ; 43(5): 1880-1894, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38194396

RESUMO

This paper studies 3D low-dose computed tomography (CT) imaging. Although various deep learning methods were developed in this context, typically they focus on 2D images and perform denoising due to low-dose and deblurring for super-resolution separately. Up to date, little work was done for simultaneous in-plane denoising and through-plane deblurring, which is important to obtain high-quality 3D CT images with lower radiation and faster imaging speed. For this task, a straightforward method is to directly train an end-to-end 3D network. However, it demands much more training data and expensive computational costs. Here, we propose to link in-plane and through-plane transformers for simultaneous in-plane denoising and through-plane deblurring, termed as LIT-Former, which can efficiently synergize in-plane and through-plane sub-tasks for 3D CT imaging and enjoy the advantages of both convolution and transformer networks. LIT-Former has two novel designs: efficient multi-head self-attention modules (eMSM) and efficient convolutional feed-forward networks (eCFN). First, eMSM integrates in-plane 2D self-attention and through-plane 1D self-attention to efficiently capture global interactions of 3D self-attention, the core unit of transformer networks. Second, eCFN integrates 2D convolution and 1D convolution to extract local information of 3D convolution in the same fashion. As a result, the proposed LIT-Former synergizes these two sub-tasks, significantly reducing the computational complexity as compared to 3D counterparts and enabling rapid convergence. Extensive experimental results on simulated and clinical datasets demonstrate superior performance over state-of-the-art models. The source code is made available at https://github.com/hao1635/LIT-Former.

Assuntos

Algoritmos , Imageamento Tridimensional , Tomografia Computadorizada por Raios X , Tomografia Computadorizada por Raios X/métodos , Humanos , Imageamento Tridimensional/métodos , Aprendizado Profundo , Imagens de Fantasmas

2.

Quad-Net: Quad-Domain Network for CT Metal Artifact Reduction.

Li, Zilong; Gao, Qi; Wu, Yaping; Niu, Chuang; Zhang, Junping; Wang, Meiyun; Wang, Ge; Shan, Hongming.

IEEE Trans Med Imaging ; 43(5): 1866-1879, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38194399

RESUMO

Metal implants and other high-density objects in patients introduce severe streaking artifacts in CT images, compromising image quality and diagnostic performance. Although various methods were developed for CT metal artifact reduction over the past decades, including the latest dual-domain deep networks, remaining metal artifacts are still clinically challenging in many cases. Here we extend the state-of-the-art dual-domain deep network approach into a quad-domain counterpart so that all the features in the sinogram, image, and their corresponding Fourier domains are synergized to eliminate metal artifacts optimally without compromising structural subtleties. Our proposed quad-domain network for MAR, referred to as Quad-Net, takes little additional computational cost since the Fourier transform is highly efficient, and works across the four receptive fields to learn both global and local features as well as their relations. Specifically, we first design a Sinogram-Fourier Restoration Network (SFR-Net) in the sinogram domain and its Fourier space to faithfully inpaint metal-corrupted traces. Then, we couple SFR-Net with an Image-Fourier Refinement Network (IFR-Net) which takes both an image and its Fourier spectrum to improve a CT image reconstructed from the SFR-Net output using cross-domain contextual information. Quad-Net is trained on clinical datasets to minimize a composite loss function. Quad-Net does not require precise metal masks, which is of great importance in clinical practice. Our experimental results demonstrate the superiority of Quad-Net over the state-of-the-art MAR methods quantitatively, visually, and statistically. The Quad-Net code is publicly available at https://github.com/longzilicart/Quad-Net.

Assuntos

Artefatos , Metais , Tomografia Computadorizada por Raios X , Humanos , Tomografia Computadorizada por Raios X/métodos , Metais/química , Análise de Fourier , Algoritmos , Aprendizado Profundo , Próteses e Implantes , Processamento de Imagem Assistida por Computador/métodos , Imagens de Fantasmas

3.

Promoting fast MR imaging pipeline by full-stack AI.

Wang, Zhiwen; Li, Bowen; Yu, Hui; Zhang, Zhongzhou; Ran, Maosong; Xia, Wenjun; Yang, Ziyuan; Lu, Jingfeng; Chen, Hu; Zhou, Jiliu; Shan, Hongming; Zhang, Yi.

iScience ; 27(1): 108608, 2024 Jan 19.

Artigo em Inglês | MEDLINE | ID: mdl-38174317

RESUMO

Magnetic resonance imaging (MRI) is a widely used imaging modality in clinics for medical disease diagnosis, staging, and follow-up. Deep learning has been extensively used to accelerate k-space data acquisition, enhance MR image reconstruction, and automate tissue segmentation. However, these three tasks are usually treated as independent tasks and optimized for evaluation by radiologists, thus ignoring the strong dependencies among them; this may be suboptimal for downstream intelligent processing. Here, we present a novel paradigm, full-stack learning (FSL), which can simultaneously solve these three tasks by considering the overall imaging process and leverage the strong dependence among them to further improve each task, significantly boosting the efficiency and efficacy of practical MRI workflows. Experimental results obtained on multiple open MR datasets validate the superiority of FSL over existing state-of-the-art methods on each task. FSL has great potential to optimize the practical workflow of MRI for medical diagnosis and radiotherapy.

4.

HOPE: Hybrid-Granularity Ordinal Prototype Learning for Progression Prediction of Mild Cognitive Impairment.

Wang, Chenhui; Lei, Yiming; Chen, Tao; Zhang, Junping; Li, Yuxin; Shan, Hongming.

IEEE J Biomed Health Inform ; PP2024 Jan 23.

Artigo em Inglês | MEDLINE | ID: mdl-38261490

RESUMO

Mild cognitive impairment (MCI) is often at high risk of progression to Alzheimer's disease (AD). Existing works to identify the progressive MCI (pMCI) typically require MCI subtype labels, pMCI vs. stable MCI (sMCI), determined by whether or not an MCI patient will progress to AD after a long follow-up. However, prospectively acquiring MCI subtype data is time-consuming and resource-intensive; the resultant small datasets could lead to severe overfitting and difficulty in extracting discriminative information. Inspired by that various longitudinal biomarkers and cognitive measurements present an ordinal pathway on AD progression, we propose a novel Hybrid-granularity Ordinal PrototypE learning (HOPE) method to characterize AD ordinal progression for MCI progression prediction. First, HOPE learns an ordinal metric space that enables progression prediction by prototype comparison. Second, HOPE leverages a novel hybrid-granularity ordinal loss to learn the ordinal nature of AD via effectively integrating instance-to-instance ordinality, instance-to-class compactness, and class-to-class separation. Third, to make the prototype learning more stable, HOPE employs an exponential moving average strategy to learn the global prototypes of NC and AD dynamically. Experimental results on the internal ADNI and the external NACC datasets demonstrate the superiority of the proposed HOPE over existing state-of-the-art methods as well as its interpretability. Source code is made available at https://github.com/thibault-wch/HOPE-for-mild-cognitive-impairment.

5.

Weakly supervised learning-based 3D bladder reconstruction from 2D ultrasound images for bladder volume measurement.

Peng, Zhao; Shan, Hongming; Yang, Xiaoyu; Li, Shuzhou; Tang, Du; Cao, Ying; Shao, Qigang; Huo, Wanli; Yang, Zhen.

Med Phys ; 51(2): 1277-1288, 2024 Feb.

Artigo em Inglês | MEDLINE | ID: mdl-37486288

RESUMO

BACKGROUND: Accurate measurement of bladder volume is necessary to maintain the consistency of the patient's anatomy in radiation therapy for pelvic tumors. As the diversity of the bladder shape, traditional methods for bladder volume measurement from 2D ultrasound have been found to produce inaccurate results. PURPOSE: To improve the accuracy of bladder volume measurement from 2D ultrasound images for patients with pelvic tumors. METHODS: The bladder ultrasound images from 130 patients with pelvic cancer were collected retrospectively. All data were split into a training set (80 patients), a validation set (20 patients), and a test set (30 patients). A total of 12 transabdominal ultrasound images for one patient were captured by automatically rotating the ultrasonic probe with an angle step of 15°. An incomplete 3D ultrasound volume was synthesized by arranging these 2D ultrasound images in 3D space according to the acquisition angles. With this as input, a weakly supervised learning-based 3D bladder reconstruction neural network model was built to predict the complete 3D bladder. The key point is that we designed a novel loss function, including the supervised loss of bladder segmentation in the ultrasound images at known angles and the compactness loss of the 3D bladder. Bladder volume was calculated by counting the number of voxels belonging to the 3D bladder. The dice similarity coefficient (DSC) was used to evaluate the accuracy of bladder segmentation, and the relative standard deviation (RSD) was used to evaluate the calculation accuracy of bladder volume with that of computed tomography (CT) images as the gold standard. RESULTS: The results showed that the mean DSC was up to 0.94 and the mean absolute RSD can be reduced to 6.3% when using 12 ultrasound images of one patient. Further, the mean DSC also was up to 0.90 and the mean absolute RSD can be reduced to 9.0% even if only two ultrasound images were used (i.e., the angle step is 90°). Compared with the commercial algorithm in bladder scanners, which has a mean absolute RSD of 13.6%, our proposed method showed a considerably huge improvement. CONCLUSIONS: The proposed weakly supervised learning-based 3D bladder reconstruction method can greatly improve the accuracy of bladder volume measurement. It has great potential to be used in bladder volume measurement devices in the future.

Assuntos

Neoplasias Pélvicas , Bexiga Urinária , Humanos , Bexiga Urinária/diagnóstico por imagem , Processamento de Imagem Assistida por Computador/métodos , Estudos Retrospectivos , Aprendizado de Máquina Supervisionado

6.

CoreDiff: Contextual Error-Modulated Generalized Diffusion Model for Low-Dose CT Denoising and Generalization.

Gao, Qi; Li, Zilong; Zhang, Junping; Zhang, Yi; Shan, Hongming.

IEEE Trans Med Imaging ; 43(2): 745-759, 2024 Feb.

Artigo em Inglês | MEDLINE | ID: mdl-37773896

RESUMO

Low-dose computed tomography (CT) images suffer from noise and artifacts due to photon starvation and electronic noise. Recently, some works have attempted to use diffusion models to address the over-smoothness and training instability encountered by previous deep-learning-based denoising models. However, diffusion models suffer from long inference time due to a large number of sampling steps involved. Very recently, cold diffusion model generalizes classical diffusion models and has greater flexibility. Inspired by cold diffusion, this paper presents a novel COntextual eRror-modulated gEneralized Diffusion model for low-dose CT (LDCT) denoising, termed CoreDiff. First, CoreDiff utilizes LDCT images to displace the random Gaussian noise and employs a novel mean-preserving degradation operator to mimic the physical process of CT degradation, significantly reducing sampling steps thanks to the informative LDCT images as the starting point of the sampling process. Second, to alleviate the error accumulation problem caused by the imperfect restoration operator in the sampling process, we propose a novel ContextuaL Error-modulAted Restoration Network (CLEAR-Net), which can leverage contextual information to constrain the sampling process from structural distortion and modulate time step embedding features for better alignment with the input at the next time step. Third, to rapidly generalize the trained model to a new, unseen dose level with as few resources as possible, we devise a one-shot learning framework to make CoreDiff generalize faster and better using only one single LDCT image (un)paired with normal-dose CT (NDCT). Extensive experimental results on four datasets demonstrate that our CoreDiff outperforms competing methods in denoising and generalization performance, with clinically acceptable inference time. Source code is made available at https://github.com/qgao21/CoreDiff.

Assuntos

Software , Tomografia Computadorizada por Raios X , Razão Sinal-Ruído , Tomografia Computadorizada por Raios X/métodos , Artefatos , Difusão , Processamento de Imagem Assistida por Computador/métodos , Algoritmos

7.

Joint learning framework of cross-modal synthesis and diagnosis for Alzheimer's disease by mining underlying shared modality information.

Wang, Chenhui; Piao, Sirong; Huang, Zhizhong; Gao, Qi; Zhang, Junping; Li, Yuxin; Shan, Hongming.

Med Image Anal ; 91: 103032, 2024 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-37995628

RESUMO

Alzheimer's disease (AD) is one of the most common neurodegenerative disorders presenting irreversible progression of cognitive impairment. How to identify AD as early as possible is critical for intervention with potential preventive measures. Among various neuroimaging modalities used to diagnose AD, functional positron emission tomography (PET) has higher sensitivity than structural magnetic resonance imaging (MRI), but it is also costlier and often not available in many hospitals. How to leverage massive unpaired unlabeled PET to improve the diagnosis performance of AD from MRI becomes rather important. To address this challenge, this paper proposes a novel joint learning framework of unsupervised cross-modal synthesis and AD diagnosis by mining underlying shared modality information, improving the AD diagnosis from MRI while synthesizing more discriminative PET images. We mine underlying shared modality information in two aspects: diversifying modality information through the cross-modal synthesis network and locating critical diagnosis-related patterns through the AD diagnosis network. First, to diversify the modality information, we propose a novel unsupervised cross-modal synthesis network, which implements the inter-conversion between 3D PET and MRI in a single model modulated by the AdaIN module. Second, to locate shared critical diagnosis-related patterns, we propose an interpretable diagnosis network based on fully 2D convolutions, which takes either 3D synthesized PET or original MRI as input. Extensive experimental results on the ADNI dataset show that our framework can synthesize more realistic images, outperform the state-of-the-art AD diagnosis methods, and have better generalization on external AIBL and NACC datasets.

Assuntos

Doença de Alzheimer , Disfunção Cognitiva , Humanos , Doença de Alzheimer/patologia , Neuroimagem/métodos , Tomografia por Emissão de Pósitrons/métodos , Imageamento por Ressonância Magnética/métodos , Aprendizagem , Disfunção Cognitiva/diagnóstico por imagem

8.

Deep Rank-Consistent Pyramid Model for Enhanced Crowd Counting.

Gao, Jiaqi; Huang, Zhizhong; Lei, Yiming; Shan, Hongming; Wang, James Z; Wang, Fei-Yue; Zhang, Junping.

IEEE Trans Neural Netw Learn Syst ; PP2023 Dec 13.

Artigo em Inglês | MEDLINE | ID: mdl-38090870

RESUMO

Most conventional crowd counting methods utilize a fully-supervised learning framework to establish a mapping between scene images and crowd density maps. They usually rely on a large quantity of costly and time-intensive pixel-level annotations for training supervision. One way to mitigate the intensive labeling effort and improve counting accuracy is to leverage large amounts of unlabeled images. This is attributed to the inherent self-structural information and rank consistency within a single image, offering additional qualitative relation supervision during training. Contrary to earlier methods that utilized the rank relations at the original image level, we explore such rank-consistency relation within the latent feature spaces. This approach enables the incorporation of numerous pyramid partial orders, strengthening the model representation capability. A notable advantage is that it can also increase the utilization ratio of unlabeled samples. Specifically, we propose a Deep Rank-consistEnt pyrAmid Model (), which makes full use of rank consistency across coarse-to-fine pyramid features in latent spaces for enhanced crowd counting with massive unlabeled images. In addition, we have collected a new unlabeled crowd counting dataset, FUDAN-UCC, comprising 4000 images for training purposes. Extensive experiments on four benchmark datasets, namely UCF-QNRF, ShanghaiTech PartA and PartB, and UCF-CC-50, show the effectiveness of our method compared with previous semi-supervised methods. The codes are available at https://github.com/bridgeqiqi/DREAM.

9.

Impact of loss functions on the performance of a deep neural network designed to restore low-dose digital mammography.

Shan, Hongming; Vimieiro, Rodrigo B; Borges, Lucas R; Vieira, Marcelo A C; Wang, Ge.

Artif Intell Med ; 142: 102555, 2023 08.

Artigo em Inglês | MEDLINE | ID: mdl-37316093

RESUMO

Digital mammography is currently the most common imaging tool for breast cancer screening. Although the benefits of using digital mammography for cancer screening outweigh the risks associated with the x-ray exposure, the radiation dose must be kept as low as possible while maintaining the diagnostic utility of the generated images, thus minimizing patient risks. Many studies investigated the feasibility of dose reduction by restoring low-dose images using deep neural networks. In these cases, choosing the appropriate training database and loss function is crucial and impacts the quality of the results. In this work, we used a standard residual network (ResNet) to restore low-dose digital mammography images and evaluated the performance of several loss functions. For training purposes, we extracted 256,000 image patches from a dataset of 400 images of retrospective clinical mammography exams, where dose reduction factors of 75% and 50% were simulated to generate low and standard-dose pairs. We validated the network in a real scenario by using a physical anthropomorphic breast phantom to acquire real low-dose and standard full-dose images in a commercially available mammography system, which were then processed through our trained model. We benchmarked our results against an analytical restoration model for low-dose digital mammography. Objective assessment was performed through the signal-to-noise ratio (SNR) and the mean normalized squared error (MNSE), decomposed into residual noise and bias. Statistical tests revealed that the use of the perceptual loss (PL4) resulted in statistically significant differences when compared to all other loss functions. Additionally, images restored using the PL4 achieved the closest residual noise to the standard dose. On the other hand, perceptual loss PL3, structural similarity index (SSIM) and one of the adversarial losses achieved the lowest bias for both dose reduction factors. The source code of our deep neural network is available at https://github.com/WANG-AXIS/LdDMDenoising.

Assuntos

Mama , Mamografia , Humanos , Estudos Retrospectivos , Bases de Dados Factuais , Redes Neurais de Computação

10.

SAN-Net: Learning generalization to unseen sites for stroke lesion segmentation with self-adaptive normalization.

Yu, Weiyi; Huang, Zhizhong; Zhang, Junping; Shan, Hongming.

Comput Biol Med ; 156: 106717, 2023 04.

Artigo em Inglês | MEDLINE | ID: mdl-36878125

RESUMO

There are considerable interests in automatic stroke lesion segmentation on magnetic resonance (MR) images in the medical imaging field, as stroke is an important cerebrovascular disease. Although deep learning-based models have been proposed for this task, generalizing these models to unseen sites is difficult due to not only the large inter-site discrepancy among different scanners, imaging protocols, and populations, but also the variations in stroke lesion shape, size, and location. To tackle this issue, we introduce a self-adaptive normalization network, termed SAN-Net, to achieve adaptive generalization on unseen sites for stroke lesion segmentation. Motivated by traditional z-score normalization and dynamic network, we devise a masked adaptive instance normalization (MAIN) to minimize inter-site discrepancies, which standardizes input MR images from different sites into a site-unrelated style by dynamically learning affine parameters from the input; i.e., MAIN can affinely transform the intensity values. Then, we leverage a gradient reversal layer to force the U-net encoder to learn site-invariant representation with a site classifier, which further improves the model generalization in conjunction with MAIN. Finally, inspired by the "pseudosymmetry" of the human brain, we introduce a simple yet effective data augmentation technique, termed symmetry-inspired data augmentation (SIDA), that can be embedded within SAN-Net to double the sample size while halving memory consumption. Experimental results on the benchmark Anatomical Tracings of Lesions After Stroke (ATLAS) v1.2 dataset, which includes MR images from 9 different sites, demonstrate that under the "leave-one-site-out" setting, the proposed SAN-Net outperforms recently published methods in terms of quantitative metrics and qualitative comparisons.

Assuntos

Redes Neurais de Computação , Acidente Vascular Cerebral , Humanos , Imageamento por Ressonância Magnética/métodos , Encéfalo , Processamento de Imagem Assistida por Computador/métodos

11.

Learning Representation for Clustering Via Prototype Scattering and Positive Sampling.

Huang, Zhizhong; Chen, Jie; Zhang, Junping; Shan, Hongming.

IEEE Trans Pattern Anal Mach Intell ; 45(6): 7509-7524, 2023 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-36269906

RESUMO

Existing deep clustering methods rely on either contrastive or non-contrastive representation learning for downstream clustering task. Contrastive-based methods thanks to negative pairs learn uniform representations for clustering, in which negative pairs, however, may inevitably lead to the class collision issue and consequently compromise the clustering performance. Non-contrastive-based methods, on the other hand, avoid class collision issue, but the resulting non-uniform representations may cause the collapse of clustering. To enjoy the strengths of both worlds, this paper presents a novel end-to-end deep clustering method with prototype scattering and positive sampling, termed ProPos. Specifically, we first maximize the distance between prototypical representations, named prototype scattering loss, which improves the uniformity of representations. Second, we align one augmented view of instance with the sampled neighbors of another view-assumed to be truly positive pair in the embedding space-to improve the within-cluster compactness, termed positive sampling alignment. The strengths of ProPos are avoidable class collision issue, uniform representations, well-separated clusters, and within-cluster compactness. By optimizing ProPos in an end-to-end expectation-maximization framework, extensive experimental results demonstrate that ProPos achieves competing performance on moderate-scale clustering benchmark datasets and establishes new state-of-the-art performance on large-scale datasets. Source code is available at https://github.com/Hzzone/ProPos.

12.

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework and a New Benchmark.

Huang, Zhizhong; Zhang, Junping; Shan, Hongming.

IEEE Trans Pattern Anal Mach Intell ; 45(6): 7917-7932, 2023 06.

Artigo em Inglês | MEDLINE | ID: mdl-36306297

RESUMO

To minimize the impact of age variation on face recognition, age-invariant face recognition (AIFR) extracts identity-related discriminative features by minimizing the correlation between identity- and age-related features while face age synthesis (FAS) eliminates age variation by converting the faces in different age groups to the same group. However, AIFR lacks visual results for model interpretation and FAS compromises downstream recognition due to artifacts. Therefore, we propose a unified, multi-task framework to jointly handle these two tasks, termed MTLFace, which can learn the age-invariant identity-related representation for face recognition while achieving pleasing face synthesis for model interpretation. Specifically, we propose an attention-based feature decomposition to decompose the mixed face features into two uncorrelated components-identity- and age-related features-in a spatially constrained way. Unlike the conventional one-hot encoding that achieves group-level FAS, we propose a novel identity conditional module to achieve identity-level FAS, which can improve the age smoothness of synthesized faces through a weight-sharing strategy. Benefiting from the proposed multi-task framework, we then leverage those high-quality synthesized faces from FAS to further boost AIFR via a novel selective fine-tuning strategy. Furthermore, to advance both AIFR and FAS, we collect and release a large cross-age face dataset with age and gender annotations, and a new benchmark specifically designed for tracing long-missing children. Extensive experimental results on five benchmark cross-age datasets demonstrate that MTLFace yields superior performance than state-of-the-art methods for both AIFR and FAS. We further validate MTLFace on two popular general face recognition datasets, obtaining competitive performance on face recognition in the wild. The source code and datasets are available at http://hzzone.github.io/MTLFace.

Assuntos

Reconhecimento Facial , Criança , Humanos , Algoritmos , Benchmarking , Face , Processamento de Imagem Assistida por Computador/métodos

13.

M₃NAS: Multi-Scale and Multi-Level Memory-Efficient Neural Architecture Search for Low-Dose CT Denoising.

Lu, Zexin; Xia, Wenjun; Huang, Yongqiang; Hou, Mingzheng; Chen, Hu; Zhou, Jiliu; Shan, Hongming; Zhang, Yi.

IEEE Trans Med Imaging ; 42(3): 850-863, 2023 03.

Artigo em Inglês | MEDLINE | ID: mdl-36327187

RESUMO

Lowering the radiation dose in computed tomography (CT) can greatly reduce the potential risk to public health. However, the reconstructed images from dose-reduced CT or low-dose CT (LDCT) suffer from severe noise which compromises the subsequent diagnosis and analysis. Recently, convolutional neural networks have achieved promising results in removing noise from LDCT images. The network architectures that are used are either handcrafted or built on top of conventional networks such as ResNet and U-Net. Recent advances in neural network architecture search (NAS) have shown that the network architecture has a dramatic effect on the model performance. This indicates that current network architectures for LDCT may be suboptimal. Therefore, in this paper, we make the first attempt to apply NAS to LDCT and propose a multi-scale and multi-level memory-efficient NAS for LDCT denoising, termed M3NAS. On the one hand, the proposed M3NAS fuses features extracted by different scale cells to capture multi-scale image structural details. On the other hand, the proposed M3NAS can search a hybrid cell- and network-level structure for better performance. In addition, M3NAS can effectively reduce the number of model parameters and increase the speed of inference. Extensive experimental results on two different datasets demonstrate that the proposed M3NAS can achieve better performance and fewer parameters than several state-of-the-art methods. In addition, we also validate the effectiveness of the multi-scale and multi-level architecture for LDCT denoising, and present further analysis for different configurations of super-net.

Assuntos

Redes Neurais de Computação , Tomografia Computadorizada por Raios X , Razão Sinal-Ruído , Tomografia Computadorizada por Raios X/métodos

14.

Physics-/Model-Based and Data-Driven Methods for Low-Dose Computed Tomography: A survey.

Xia, Wenjun; Shan, Hongming; Wang, Ge; Zhang, Yi.

IEEE Signal Process Mag ; 40(2): 89-100, 2023 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-38404742

RESUMO

Since 2016, deep learning (DL) has advanced tomographic imaging with remarkable successes, especially in low-dose computed tomography (LDCT) imaging. Despite being driven by big data, the LDCT denoising and pure end-to-end reconstruction networks often suffer from the black box nature and major issues such as instabilities, which is a major barrier to apply deep learning methods in low-dose CT applications. An emerging trend is to integrate imaging physics and model into deep networks, enabling a hybridization of physics/model-based and data-driven elements. In this paper, we systematically review the physics/model-based data-driven methods for LDCT, summarize the loss functions and training strategies, evaluate the performance of different methods, and discuss relevant issues and future directions.

15.

SPICE: Semantic Pseudo-Labeling for Image Clustering.

Niu, Chuang; Shan, Hongming; Wang, Ge.

IEEE Trans Image Process ; 31: 7264-7278, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-36378790

RESUMO

The similarity among samples and the discrepancy among clusters are two crucial aspects of image clustering. However, current deep clustering methods suffer from inaccurate estimation of either feature similarity or semantic discrepancy. In this paper, we present a Semantic Pseudo-labeling-based Image ClustEring (SPICE) framework, which divides the clustering network into a feature model for measuring the instance-level similarity and a clustering head for identifying the cluster-level discrepancy. We design two semantics-aware pseudo-labeling algorithms, prototype pseudo-labeling and reliable pseudo-labeling, which enable accurate and reliable self-supervision over clustering. Without using any ground-truth label, we optimize the clustering network in three stages: 1) train the feature model through contrastive learning to measure the instance similarity; 2) train the clustering head with the prototype pseudo-labeling algorithm to identify cluster semantics; and 3) jointly train the feature model and clustering head with the reliable pseudo-labeling algorithm to improve the clustering performance. Extensive experimental results demonstrate that SPICE achieves significant improvements (~10%) over existing methods and establishes the new state-of-the-art clustering results on six balanced benchmark datasets in terms of three popular metrics. Importantly, SPICE significantly reduces the gap between unsupervised and fully-supervised classification; e.g. there is only 2% (91.8% vs 93.8%) accuracy difference on CIFAR-10. Our code is made publicly available at https://github.com/niuchuangnn/SPICE.

16.

OpenKBP-Opt: an international and reproducible evaluation of 76 knowledge-based planning pipelines.

Babier, Aaron; Mahmood, Rafid; Zhang, Binghao; Alves, Victor G L; Barragán-Montero, Ana Maria; Beaudry, Joel; Cardenas, Carlos E; Chang, Yankui; Chen, Zijie; Chun, Jaehee; Diaz, Kelly; David Eraso, Harold; Faustmann, Erik; Gaj, Sibaji; Gay, Skylar; Gronberg, Mary; Guo, Bingqi; He, Junjun; Heilemann, Gerd; Hira, Sanchit; Huang, Yuliang; Ji, Fuxin; Jiang, Dashan; Carlo Jimenez Giraldo, Jean; Lee, Hoyeon; Lian, Jun; Liu, Shuolin; Liu, Keng-Chi; Marrugo, José; Miki, Kentaro; Nakamura, Kunio; Netherton, Tucker; Nguyen, Dan; Nourzadeh, Hamidreza; Osman, Alexander F I; Peng, Zhao; Darío Quinto Muñoz, José; Ramsl, Christian; Joo Rhee, Dong; David Rodriguez, Juan; Shan, Hongming; Siebers, Jeffrey V; Soomro, Mumtaz H; Sun, Kay; Usuga Hoyos, Andrés; Valderrama, Carlos; Verbeek, Rob; Wang, Enpei; Willems, Siri; Wu, Qi.

Phys Med Biol ; 67(18)2022 09 12.

Artigo em Inglês | MEDLINE | ID: mdl-36093921

RESUMO

Objective.To establish an open framework for developing plan optimization models for knowledge-based planning (KBP).Approach.Our framework includes radiotherapy treatment data (i.e. reference plans) for 100 patients with head-and-neck cancer who were treated with intensity-modulated radiotherapy. That data also includes high-quality dose predictions from 19 KBP models that were developed by different research groups using out-of-sample data during the OpenKBP Grand Challenge. The dose predictions were input to four fluence-based dose mimicking models to form 76 unique KBP pipelines that generated 7600 plans (76 pipelines × 100 patients). The predictions and KBP-generated plans were compared to the reference plans via: the dose score, which is the average mean absolute voxel-by-voxel difference in dose; the deviation in dose-volume histogram (DVH) points; and the frequency of clinical planning criteria satisfaction. We also performed a theoretical investigation to justify our dose mimicking models.Main results.The range in rank order correlation of the dose score between predictions and their KBP pipelines was 0.50-0.62, which indicates that the quality of the predictions was generally positively correlated with the quality of the plans. Additionally, compared to the input predictions, the KBP-generated plans performed significantly better (P< 0.05; one-sided Wilcoxon test) on 18 of 23 DVH points. Similarly, each optimization model generated plans that satisfied a higher percentage of criteria than the reference plans, which satisfied 3.5% more criteria than the set of all dose predictions. Lastly, our theoretical investigation demonstrated that the dose mimicking models generated plans that are also optimal for an inverse planning model.Significance.This was the largest international effort to date for evaluating the combination of KBP prediction and optimization models. We found that the best performing models significantly outperformed the reference dose and dose predictions. In the interest of reproducibility, our data and code is freely available.

Assuntos

Planejamento da Radioterapia Assistida por Computador , Radioterapia de Intensidade Modulada , Humanos , Bases de Conhecimento , Dosagem Radioterapêutica , Planejamento da Radioterapia Assistida por Computador/métodos , Radioterapia de Intensidade Modulada/métodos , Reprodutibilidade dos Testes

17.

Low-dimensional Manifold Constrained Disentanglement Network for Metal Artifact Reduction.

Niu, Chuang; Cong, Wenxiang; Fan, Feng-Lei; Shan, Hongming; Li, Mengzhou; Liang, Jimin; Wang, Ge.

IEEE Trans Radiat Plasma Med Sci ; 6(6): 656-666, 2022 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-35865007

RESUMO

Deep neural network based methods have achieved promising results for CT metal artifact reduction (MAR), most of which use many synthesized paired images for supervised learning. As synthesized metal artifacts in CT images may not accurately reflect the clinical counterparts, an artifact disentanglement network (ADN) was proposed with unpaired clinical images directly, producing promising results on clinical datasets. However, as the discriminator can only judge if large regions semantically look artifact-free or artifact-affected, it is difficult for ADN to recover small structural details of artifact-affected CT images based on adversarial losses only without sufficient constraints. To overcome the illposedness of this problem, here we propose a low-dimensional manifold (LDM) constrained disentanglement network (DN), leveraging the image characteristics that the patch manifold of CT images is generally low-dimensional. Specifically, we design an LDM-DN learning algorithm to empower the disentanglement network through optimizing the synergistic loss functions used in ADN while constraining the recovered images to be on a low-dimensional patch manifold. Moreover, learning from both paired and unpaired data, an efficient hybrid optimization scheme is proposed to further improve the MAR performance on clinical datasets. Extensive experiments demonstrate that the proposed LDM-DN approach can consistently improve the MAR performance in paired and/or unpaired learning settings, outperforming competing methods on synthesized and clinical datasets.

18.

Stabilizing deep tomographic reconstruction: Part B. Convergence analysis and adversarial attacks.

Wu, Weiwen; Hu, Dianlin; Cong, Wenxiang; Shan, Hongming; Wang, Shaoyu; Niu, Chuang; Yan, Pingkun; Yu, Hengyong; Vardhanabhuti, Varut; Wang, Ge.

Patterns (N Y) ; 3(5): 100475, 2022 May 13.

Artigo em Inglês | MEDLINE | ID: mdl-35607615

RESUMO

Due to lack of the kernel awareness, some popular deep image reconstruction networks are unstable. To address this problem, here we introduce the bounded relative error norm (BREN) property, which is a special case of the Lipschitz continuity. Then, we perform a convergence study consisting of two parts: (1) a heuristic analysis on the convergence of the analytic compressed iterative deep (ACID) scheme (with the simplification that the CS module achieves a perfect sparsification), and (2) a mathematically denser analysis (with the two approximations: [1] AT is viewed as an inverse A- 1 in the perspective of an iterative reconstruction procedure and [2] a pseudo-inverse is used for a total variation operator H). Also, we present adversarial attack algorithms to perturb the selected reconstruction networks respectively and, more importantly, to attack the ACID workflow as a whole. Finally, we show the numerical convergence of the ACID iteration in terms of the Lipschitz constant and the local stability against noise.

19.

Stabilizing deep tomographic reconstruction: Part A. Hybrid framework and experimental results.

Wu, Weiwen; Hu, Dianlin; Cong, Wenxiang; Shan, Hongming; Wang, Shaoyu; Niu, Chuang; Yan, Pingkun; Yu, Hengyong; Vardhanabhuti, Varut; Wang, Ge.

Patterns (N Y) ; 3(5): 100474, 2022 May 13.

Artigo em Inglês | MEDLINE | ID: mdl-35607623

RESUMO

A recent PNAS paper reveals that several popular deep reconstruction networks are unstable. Specifically, three kinds of instabilities were reported: (1) strong image artefacts from tiny perturbations, (2) small features missed in a deeply reconstructed image, and (3) decreased imaging performance with increased input data. Here, we propose an analytic compressed iterative deep (ACID) framework to address this challenge. ACID synergizes a deep network trained on big data, kernel awareness from compressed sensing (CS)-inspired processing, and iterative refinement to minimize the data residual relative to real measurement. Our study demonstrates that the ACID reconstruction is accurate, is stable, and sheds light on the converging mechanism of the ACID iteration under a bounded relative error norm assumption. ACID not only stabilizes an unstable deep reconstruction network but also is resilient against adversarial attacks to the whole ACID workflow, being superior to classic sparsity-regularized reconstruction and eliminating the three kinds of instabilities.

20.

Convolutional Ordinal Regression Forest for Image Ordinal Estimation.

Zhu, Haiping; Shan, Hongming; Zhang, Yuheng; Che, Lingfu; Xu, Xiaoyang; Zhang, Junping; Shi, Jianbo; Wang, Fei-Yue.

IEEE Trans Neural Netw Learn Syst ; 33(8): 4084-4095, 2022 08.

Artigo em Inglês | MEDLINE | ID: mdl-33600323

RESUMO

Image ordinal estimation is to predict the ordinal label of a given image, which can be categorized as an ordinal regression (OR) problem. Recent methods formulate an OR problem as a series of binary classification problems. Such methods cannot ensure that the global ordinal relationship is preserved since the relationships among different binary classifiers are neglected. We propose a novel OR approach, termed convolutional OR forest (CORF), for image ordinal estimation, which can integrate OR and differentiable decision trees with a convolutional neural network for obtaining precise and stable global ordinal relationships. The advantages of the proposed CORF are twofold. First, instead of learning a series of binary classifiers independently, the proposed method aims at learning an ordinal distribution for OR by optimizing those binary classifiers simultaneously. Second, the differentiable decision trees in the proposed CORF can be trained together with the ordinal distribution in an end-to-end manner. The effectiveness of the proposed CORF is verified on two image ordinal estimation tasks, i.e., facial age estimation and image esthetic assessment, showing significant improvements and better stability over the state-of-the-art OR methods.

Assuntos

Processamento de Imagem Assistida por Computador , Redes Neurais de Computação , Processamento de Imagem Assistida por Computador/métodos , Análise de Regressão

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA