Results 1 - 20 of 24
1.
IEEE Trans Med Imaging ; PP, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38771692

ABSTRACT

Left ventricle (LV) endocardium segmentation in echocardiography video has received much attention as an important step in quantifying LV ejection fraction. Most existing methods are dedicated to exploiting temporal information on top of 2D convolutional networks. In addition to single-appearance semantic learning, some studies have attempted to introduce motion cues through an optical flow estimation (OFE) task to enhance temporal consistency modeling. However, OFE in these methods is tightly coupled to LV endocardium segmentation, resulting in noisy inter-frame flow predictions, and post-optimization based on these flows accumulates errors. To address these drawbacks, we propose dynamic-guided spatiotemporal attention (DSA) for semi-supervised echocardiography video segmentation. We first fine-tune the off-the-shelf OFE network RAFT on echocardiography data to provide dynamic information. Taking inter-frame flows as additional input, we use a dual-encoder structure to extract motion and appearance features separately. Based on the connection between dynamic continuity and semantic consistency, we propose a bilateral feature calibration module to enhance both features. For temporal consistency modeling, DSA aggregates neighboring-frame context using deformable attention realized by offset grid attention. Dynamic information is introduced into DSA through a bilateral offset estimation module to combine effectively with appearance semantics and predict attention offsets, thereby guiding semantic-based spatiotemporal attention. We evaluated our method on two popular echocardiography datasets, CAMUS and EchoNet-Dynamic, and achieved state-of-the-art performance.
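As an illustration of the offset-guided deformable attention idea, here is a minimal PyTorch sketch; the module name, the sigmoid gating, and the choice of predicting offsets from concatenated appearance and motion features are illustrative assumptions, not the paper's exact DSA design.

```python
# Minimal sketch: sample neighboring-frame features at predicted offsets,
# then gate and add them as temporal context. Offsets are in normalized
# grid units for simplicity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableTemporalAttention(nn.Module):  # hypothetical name
    def __init__(self, channels):
        super().__init__()
        self.offset_head = nn.Conv2d(2 * channels, 2, 3, padding=1)
        self.weight_head = nn.Conv2d(2 * channels, 1, 3, padding=1)

    def forward(self, curr_feat, neigh_feat, motion_feat):
        b, c, h, w = curr_feat.shape
        fused = torch.cat([curr_feat, motion_feat], dim=1)
        offsets = self.offset_head(fused)                    # (b, 2, h, w)
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).to(curr_feat)   # (h, w, 2)
        grid = base.unsqueeze(0) + offsets.permute(0, 2, 3, 1)
        # Sample the neighboring frame's features at the shifted locations.
        sampled = F.grid_sample(neigh_feat, grid, align_corners=True)
        attn = torch.sigmoid(self.weight_head(fused))        # per-pixel gate
        return curr_feat + attn * sampled
```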

2.
Article in English | MEDLINE | ID: mdl-38502621

ABSTRACT

Cartoon animation is a popular form of visual entertainment worldwide; however, many classic animations were produced in a 4:3 aspect ratio that is incompatible with modern widescreen displays. Existing remedies such as cropping lead to information loss, while retargeting causes distortion. Animation companies still rely on manual labor to renovate classic cartoon animations, which yields higher-quality videos but is tedious and labor-intensive. Conventional extrapolation or inpainting methods tailored for natural videos struggle with cartoon animations because of the lack of textures in anime, which hampers motion estimation of the objects. In this paper, we propose a novel framework designed to automatically outpaint 4:3 anime to 16:9 via region-guided motion inference. Our core concept is to identify motion correspondences between frames within a sequence in order to reconstruct the missing pixels. Initially, we estimate optical flow guided by region information to address the challenges posed by exaggerated movements and solid-color regions in cartoon animations. Subsequently, frames are stitched to produce a pre-filled guide frame, offering structural clues for the extension of the optical flow maps. Finally, a voting and fusion scheme uses learned fusion weights to blend the aligned neighboring reference frames into the final outpainted frame. Extensive experiments confirm the superiority of our approach over existing methods.
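To illustrate the warp-then-vote idea, here is a minimal PyTorch sketch of backward warping by optical flow followed by softmax-weighted blending; the paper learns the fusion weights, whereas the scores here are simply supplied by the caller.

```python
# Minimal sketch: align reference frames to the target via flow, then blend.
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (b,c,h,w) by optical `flow` (b,2,h,w) in pixels."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack([xs, ys], dim=0).to(frame).unsqueeze(0) + flow
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * grid[:, 0] / (w - 1) - 1.0
    gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(frame, torch.stack([gx, gy], dim=-1),
                         align_corners=True)

def fuse(references, flows, scores):
    """Blend aligned references with per-pixel voting weights (n lists)."""
    aligned = torch.stack([warp(f, fl) for f, fl in zip(references, flows)])
    weights = torch.softmax(torch.stack(scores), dim=0)   # (n, b, 1, h, w)
    return (weights * aligned).sum(dim=0)                 # outpainted frame
```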

3.
IEEE Trans Cybern ; 54(6): 3652-3665, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38236677

ABSTRACT

Alzheimer's disease (AD) is characterized by alterations of the brain's structural and functional connectivity during its progressive degenerative processes. Existing auxiliary diagnostic methods can accomplish the classification task, but few of them can accurately evaluate the changing characteristics of brain connectivity. In this work, a prior-guided adversarial learning with hypergraph (PALH) model is proposed to predict abnormal brain connections using triple-modality medical images. Concretely, a prior distribution derived from anatomical knowledge is estimated to guide multimodal representation learning using an adversarial strategy. A pairwise collaborative discriminator structure is further utilized to narrow the difference between representation distributions. Moreover, a hypergraph perceptual network is developed to effectively fuse the learned representations while establishing high-order relations within and between the multimodal images. Experimental results demonstrate that the proposed model outperforms related methods in analyzing and predicting AD progression. More importantly, the identified abnormal connections are partly consistent with previous neuroscience discoveries. The proposed model can evaluate the characteristics of abnormal brain connections at different stages of AD, which is helpful for cognitive disease study and early treatment.


Subject(s)
Alzheimer Disease , Brain , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/physiopathology , Humans , Brain/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Algorithms , Machine Learning , Neural Networks, Computer , Aged
4.
IEEE Trans Med Imaging ; 43(2): 649-661, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37703140

ABSTRACT

Existing federated learning works mainly focus on the fully supervised training setting. In realistic scenarios, however, most clinical sites can provide only unannotated data due to a lack of resources or expertise. In this work, we are concerned with the practical yet challenging problem of federated semi-supervised segmentation (FSSS), where labeled data are available at only a few clients and the other clients can provide only unlabeled data. We make an early attempt to tackle this problem and propose a novel FSSS method with prototype-based pseudo-labeling and contrastive learning. First, we transmit a labeled-aggregated model, obtained based on prototype similarity, to each unlabeled client to work together with the global model for debiased pseudo-label generation via a consistency- and entropy-aware selection strategy. Second, we transfer image-level prototypes from the labeled datasets to the unlabeled clients and conduct prototypical contrastive learning on the unlabeled models to enhance their discriminative power. Finally, we perform dynamic model aggregation with a consistency-aware aggregation strategy that dynamically adjusts the aggregation weight of each local model. We evaluate our method on COVID-19 X-ray infected region segmentation, COVID-19 CT infected region segmentation, and colorectal polyp segmentation, and the experimental results consistently demonstrate the effectiveness of the proposed method. Code is available at https://github.com/zhangbaiming/FedSemiSeg.
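A minimal PyTorch sketch of the consistency- and entropy-aware selection idea follows; the threshold value and the exact way the global and labeled-aggregated predictions are combined are illustrative assumptions.

```python
# Minimal sketch: keep only pixels where the global and labeled-aggregated
# models agree (consistency) and where the averaged prediction has low
# entropy (confidence).
import torch

def select_pseudo_labels(logits_global, logits_labeled_agg,
                         entropy_thresh=0.5):
    p_g = torch.softmax(logits_global, dim=1)        # (b, k, h, w)
    p_a = torch.softmax(logits_labeled_agg, dim=1)
    pred_g, pred_a = p_g.argmax(1), p_a.argmax(1)
    consistent = pred_g == pred_a                    # models agree here
    p_mean = 0.5 * (p_g + p_a)
    entropy = -(p_mean * torch.log(p_mean.clamp_min(1e-8))).sum(1)
    confident = entropy < entropy_thresh             # low-entropy pixels
    mask = consistent & confident                    # reliable-pixel mask
    return pred_g, mask                              # pseudo-labels + mask
```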


Subject(s)
COVID-19 , Humans , Entropy , Supervised Machine Learning , Image Processing, Computer-Assisted
5.
IEEE Trans Cybern ; 54(5): 3299-3312, 2024 May.
Article in English | MEDLINE | ID: mdl-37471181

ABSTRACT

Automatic kidney and tumor segmentation from CT volumes is a critical prerequisite for diagnosis and surgical treatment (such as partial nephrectomy). However, it remains a particularly challenging task, as kidneys and tumors often exhibit large scale variations, irregular shapes, and blurred boundaries. We propose a novel 3-D network, called 3DSN-Net, to comprehensively tackle these problems. Compared with existing solutions, it has two compelling characteristics. First, with a new scale-aware feature extraction (SAFE) module, the proposed 3DSN-Net is capable of adaptively selecting appropriate receptive fields according to the sizes of targets instead of indiscriminately enlarging them, which is particularly essential for improving the segmentation accuracy of tumors with large scale variation. Second, we propose a novel yet efficient nonlocal context guidance (NCG) mechanism to capture global dependencies, tackling the irregular shapes and blurred boundaries of kidneys and tumors. Instead of directly harnessing a 3-D NCG mechanism, which would greatly increase the number of parameters and make the network difficult to train under limited training data, we develop a 2.5D NCG mechanism based on projections of feature cubes, which achieves a tradeoff between segmentation accuracy and network complexity. We extensively evaluate the proposed 3DSN-Net on the well-known KiTS dataset with many challenging kidney and tumor cases. Experimental results demonstrate that our solution, equipped with the scale-aware and NCG mechanisms, consistently outperforms state-of-the-art 3-D networks, particularly for tumor segmentation.
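To make the 2.5D idea concrete, here is a heavily simplified PyTorch sketch in which a 2D non-local block runs on a depth projection of the 3D feature cube; the mean-pooling projection and the residual broadcast are illustrative assumptions about how such a mechanism could be wired, not the paper's exact design.

```python
# Minimal sketch: attention is computed on a 2D projection instead of the
# full 3D volume, keeping the parameter count that of a 2D non-local block.
import torch
import torch.nn as nn

class NonLocal2D(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.q = nn.Conv2d(c, c // 2, 1)
        self.k = nn.Conv2d(c, c // 2, 1)
        self.v = nn.Conv2d(c, c, 1)

    def forward(self, x):                            # x: (b, c, h, w)
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)     # (b, hw, c/2)
        k = self.k(x).flatten(2)                     # (b, c/2, hw)
        v = self.v(x).flatten(2).transpose(1, 2)     # (b, hw, c)
        attn = torch.softmax(q @ k / (c // 2) ** 0.5, dim=-1)
        return x + (attn @ v).transpose(1, 2).reshape(b, c, h, w)

def ncg_25d(feat3d, block):                          # feat3d: (b, c, d, h, w)
    proj = feat3d.mean(dim=2)                        # project along depth
    ctx = block(proj) - proj                         # residual global context
    return feat3d + ctx.unsqueeze(2)                 # broadcast over depth
```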


Subject(s)
Kidney , Neoplasms , Humans , Kidney/diagnostic imaging , Tomography, X-Ray Computed , Image Processing, Computer-Assisted
6.
Article in English | MEDLINE | ID: mdl-37815971

ABSTRACT

Integrating brain structural and functional connectivity features is of great significance both for exploring brain science and for the clinical analysis of cognitive impairment. However, effectively fusing structural and functional features to explore the complex brain network remains a challenge. In this paper, a novel brain structure-function fusing-representation learning (BSFL) model is proposed to effectively learn fused representations from diffusion tensor imaging (DTI) and resting-state functional magnetic resonance imaging (fMRI) for mild cognitive impairment (MCI) analysis. Specifically, a decomposition-fusion framework is developed to first decompose the feature space into the union of a uniform space and a unique space for each modality, and then adaptively fuse the decomposed features to learn MCI-related representations. Moreover, a knowledge-aware transformer module is designed to automatically capture local and global connectivity features throughout the brain. A uniform-unique contrastive loss is further devised to make the decomposition more effective and enhance the complementarity of structural and functional features. Extensive experiments demonstrate that the proposed model achieves better performance than other competitive methods in predicting and analyzing MCI. More importantly, the proposed model could be a potential tool for reconstructing unified brain networks and predicting abnormal connections during the degenerative processes of MCI.
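As an illustration, here is a minimal PyTorch sketch of one plausible form of a uniform-unique contrastive objective: the uniform (shared) embeddings of the two modalities are pulled together while each modality's unique embedding is pushed away from its uniform counterpart. The loss form and temperature are assumptions, not the paper's exact formulation.

```python
# Minimal sketch: InfoNCE-style loss over (shared-shared) positives and
# (shared-unique) negatives, per modality.
import torch
import torch.nn.functional as F

def uniform_unique_loss(uni_dti, uni_fmri, unq_dti, unq_fmri, tau=0.1):
    z = [F.normalize(t, dim=1) for t in (uni_dti, uni_fmri, unq_dti, unq_fmri)]
    uni_d, uni_f, unq_d, unq_f = z
    pos = (uni_d * uni_f).sum(1) / tau                # shared spaces agree
    neg_d = (uni_d * unq_d).sum(1) / tau              # shared vs. unique (DTI)
    neg_f = (uni_f * unq_f).sum(1) / tau              # shared vs. unique (fMRI)
    logits = torch.stack([pos, neg_d, neg_f], dim=1)  # (b, 3)
    labels = torch.zeros(len(pos), dtype=torch.long, device=pos.device)
    return F.cross_entropy(logits, labels)            # positive is index 0
```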


Subject(s)
Cognitive Dysfunction , Diffusion Tensor Imaging , Humans , Brain Mapping/methods , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Cognitive Dysfunction/diagnostic imaging
7.
IEEE Trans Med Imaging ; 42(12): 3794-3804, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37610902

ABSTRACT

Deep learning models have achieved remarkable success in multi-type nuclei segmentation. These models are mostly trained once with full annotations of all nucleus types available, and lack the ability to continually learn new classes due to catastrophic forgetting. In this paper, we study the practical and important class-incremental continual learning problem, where the model is incrementally updated to new classes without access to previous data. We propose a novel continual nuclei segmentation method that avoids forgetting knowledge of old classes and facilitates the learning of new classes by achieving feature-level knowledge distillation with prototype-wise relation distillation and contrastive learning. Concretely, prototype-wise relation distillation imposes constraints on the inter-class relation similarity, encouraging the encoder to extract similar class distributions for old classes in the feature space. Prototype-wise contrastive learning with a hard sampling strategy enhances the intra-class compactness and inter-class separability of features, improving performance on both old and new classes. Experiments on two multi-type nuclei segmentation benchmarks, i.e., MoNuSAC and CoNSeP, demonstrate the effectiveness of our method, with superior performance over many competitive methods. Code is available at https://github.com/zzw-szu/CoNuSeg.
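A minimal PyTorch sketch of prototype-wise relation distillation follows: the inter-class similarity structure among old-class prototypes from the frozen previous encoder is transferred to the current encoder via a KL term. The softmax/KL formulation is an illustrative choice, not necessarily the paper's exact loss.

```python
# Minimal sketch: keep the old-class relation matrix stable across
# incremental steps.
import torch
import torch.nn.functional as F

def class_prototypes(features, labels, num_classes):
    """Mean feature per class; features: (n, c), labels: (n,).
    Assumes every class appears at least once in the batch."""
    protos = torch.stack(
        [features[labels == k].mean(0) for k in range(num_classes)])
    return F.normalize(protos, dim=1)

def relation_distill_loss(old_protos, new_protos, tau=1.0):
    # Row-wise similarity distributions over classes; old model is teacher.
    rel_old = torch.softmax(old_protos @ old_protos.t() / tau, dim=1)
    rel_new = torch.log_softmax(new_protos @ new_protos.t() / tau, dim=1)
    return F.kl_div(rel_new, rel_old, reduction="batchmean")
```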


Subject(s)
Deep Learning
8.
Article in English | MEDLINE | ID: mdl-37432812

ABSTRACT

Image fusion technology aims to obtain a comprehensive image containing a specific target or detailed information by fusing data of different modalities. However, many deep learning-based algorithms account for edge texture information only through loss functions rather than through specifically constructed network modules; the influence of middle-layer features is ignored, which leads to the loss of detailed information between layers. In this article, we propose a multidiscriminator hierarchical wavelet generative adversarial network (MHW-GAN) for multimodal image fusion. First, we construct a hierarchical wavelet fusion (HWF) module as the generator of MHW-GAN to fuse feature information at different levels and scales, which avoids information loss in the middle layers of different modalities. Second, we design an edge perception module (EPM) to integrate edge information from different modalities and avoid the loss of edge information. Third, we leverage the adversarial learning relationship between the generator and three discriminators to constrain the generation of fusion images. The generator aims to generate a fusion image that fools the three discriminators, while the three discriminators aim to distinguish the fusion image and the edge fusion image from the two source images and the joint edge image, respectively. Via adversarial learning, the final fusion image contains both intensity and structure information. Experiments on four types of public and self-collected multimodal image datasets show that the proposed algorithm is superior to previous algorithms in terms of both subjective and objective evaluation.
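To illustrate the wavelet-domain view of fusion, here is a minimal sketch using PyWavelets with a classic hand-crafted rule (average the approximation bands, keep the stronger detail coefficients); MHW-GAN instead learns this fusion hierarchically across levels with adversarial training.

```python
# Minimal sketch: single-level Haar wavelet fusion of two aligned
# single-channel images of equal shape.
import numpy as np
import pywt

def haar_fuse(img_a, img_b):
    la, (ha, va, da) = pywt.dwt2(img_a.astype(np.float64), "haar")
    lb, (hb, vb, db) = pywt.dwt2(img_b.astype(np.float64), "haar")
    low = 0.5 * (la + lb)                                # average lows
    pick = lambda x, y: np.where(np.abs(x) >= np.abs(y), x, y)
    highs = (pick(ha, hb), pick(va, vb), pick(da, db))   # stronger detail wins
    return pywt.idwt2((low, highs), "haar")              # fused image
```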

9.
Article in English | MEDLINE | ID: mdl-37267131

ABSTRACT

Manga screening is a critical process in manga production that still requires intensive labor and cost. Existing manga screening methods either generate only simple dotted screentones or rely on color information and manual hints during screentone selection. Due to the large domain gap between line drawings and screened manga, and the difficulty of generating high-quality, properly selected and shaded screentones, even state-of-the-art deep learning methods cannot convert line drawings to screened manga well. Besides, ambiguity exists in the screening process, since different artists may screen the same line drawing differently. In this paper, we propose to introduce the shaded line drawing as an intermediate counterpart of the screened manga, so that the manga screening task can be decomposed into two sub-tasks: generating shading from a line drawing, and replacing shading with proper screentones. A reference image is adopted to resolve the ambiguity issue and provides options and controls over the generated screened manga. We propose a reference-based shading generation network and a reference-based screentone generation module to achieve the two sub-tasks individually. We conduct extensive visual and quantitative experiments to verify the effectiveness of our system. Results and statistics show that our method outperforms existing methods on the manga screening task.

10.
Article in English | MEDLINE | ID: mdl-37021997

ABSTRACT

Shading plays an important role in cartoon drawings, presenting 3D lighting and depth information in a 2D image to improve visual information and pleasantness. However, it also introduces apparent challenges in analyzing and processing cartoon drawings for different computer graphics and vision applications, such as segmentation, depth estimation, and relighting. Extensive research has been devoted to removing or separating shading information to facilitate these applications. Unfortunately, existing research has focused only on natural images, which are inherently different from cartoons, since the shading in natural images is physically correct and can be modeled based on physical priors. Shading in cartoons, by contrast, is manually created by artists and may be imprecise, abstract, and stylized, which makes it extremely difficult to model. Instead of modeling a shading prior, in this paper we propose a learning-based solution that separates the shading from the original colors using a two-branch system consisting of two subnetworks. To the best of our knowledge, our method is the first attempt at separating shading information from cartoon drawings. Our method significantly outperforms methods tailored for natural images. Extensive evaluations have been performed, with convincing results in all cases.

11.
IEEE Trans Med Imaging ; 42(6): 1619-1631, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37018315

ABSTRACT

We present a novel deep network (namely BUSSeg) equipped with both within- and cross-image long-range dependency modeling for automated lesion segmentation from breast ultrasound images, a daunting task due to (1) the large variation of breast lesions, (2) the ambiguous lesion boundaries, and (3) the speckle noise and artifacts in ultrasound images. Our work is motivated by the fact that most existing methods only model within-image dependencies while neglecting cross-image dependencies, which are essential for this task under limited training data and noise. We first propose a novel cross-image dependency module (CDM) with a cross-image contextual modeling scheme and a cross-image dependency loss (CDL) to capture more consistent feature expressions and alleviate noise interference. Compared with existing cross-image methods, the proposed CDM has two merits. First, we utilize more complete spatial features instead of the commonly used discrete pixel vectors to capture the semantic dependencies between images, mitigating the negative effects of speckle noise and making the acquired features more representative. Second, the proposed CDM includes both intra- and inter-class contextual modeling rather than extracting only homogeneous contextual dependencies. Furthermore, we develop a parallel bi-encoder architecture (PBA) to tame a Transformer and a convolutional neural network, enhancing BUSSeg's capability of capturing within-image long-range dependencies and hence offering richer features to the CDM. We conducted extensive experiments on two representative public breast ultrasound datasets, and the results demonstrate that the proposed BUSSeg consistently outperforms state-of-the-art approaches on most metrics.


Subject(s)
Artifacts , Ultrasonography, Mammary , Female , Humans , Image Processing, Computer-Assisted , Neural Networks, Computer , Semantics
12.
IEEE Trans Med Imaging ; 42(6): 1668-1680, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37018336

ABSTRACT

Detecting cells in blood smear images is of great significance for the automatic diagnosis of blood diseases. However, this task is rather challenging, mainly because the densely packed cells often overlap, making some occluded boundary parts invisible. In this paper, we propose a generic and effective detection framework that exploits non-overlapping regions (NOR) to provide discriminative and confident information that compensates for the intensity deficiency. In particular, we propose feature masking (FM) to exploit the NOR mask generated from the original annotation information, which guides the network to extract NOR features as supplementary information. Furthermore, we exploit NOR features to directly predict NOR bounding boxes (NOR BBoxes). NOR BBoxes are combined with the original BBoxes to generate one-to-one corresponding BBox-pairs, which are used to further improve detection performance. Different from standard non-maximum suppression (NMS), our proposed non-overlapping-region NMS (NOR-NMS) uses the NOR BBoxes in the BBox-pairs to calculate intersection over union (IoU) for suppressing redundant BBoxes, and consequently retains the corresponding original BBoxes, circumventing the dilemma of NMS. We conducted extensive experiments on two publicly available datasets, with positive results demonstrating the effectiveness of the proposed method against existing methods.
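A minimal NumPy sketch of the NOR-NMS idea follows: suppression IoU is computed on the NOR boxes of each BBox-pair, while the surviving detections are the corresponding original boxes. The greedy loop is standard NMS; only the IoU source changes.

```python
# Minimal sketch: NMS driven by non-overlapping-region boxes.
import numpy as np

def iou(box, boxes):
    """box: (4,), boxes: (n, 4) as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def nor_nms(orig_boxes, nor_boxes, scores, thresh=0.5):
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    while order.size:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Suppress via NOR boxes: occluded cells overlap far less there,
        # so true neighbors survive while duplicates are removed.
        order = rest[iou(nor_boxes[i], nor_boxes[rest]) <= thresh]
    return orig_boxes[np.array(keep)]         # keep the ORIGINAL boxes
```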

13.
IEEE Trans Cybern ; 53(4): 2610-2621, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35417366

ABSTRACT

Automatic polyp segmentation from colonoscopy videos is a prerequisite for the development of a computer-assisted colon cancer examination and diagnosis system. However, it remains a very challenging task owing to the large variation of polyps, the low contrast between polyps and background, and the blurred boundaries of polyps. More importantly, real-time performance is a necessity for this task, as the segmented results should be presented to the doctor immediately during the colonoscopy intervention for prompt decision and action. It is difficult to develop a model with powerful representation capability that yields satisfactory segmentation results while maintaining real-time performance. In this article, we present a novel lightweight context-aware network, PolypSeg+, which attempts to capture distinguishable features of polyps without increasing network complexity or sacrificing time performance. To achieve this, a set of novel lightweight techniques is developed and integrated into the proposed PolypSeg+, including an adaptive scale context (ASC) module equipped with a lightweight attention mechanism to tackle the large scale variation of polyps, an efficient global context (EGC) module to promote the fusion of low-level and high-level features by excluding background noise and preserving boundary details, and a lightweight feature pyramid fusion (FPF) module to further refine the features extracted from the ASC and EGC. We extensively evaluate the proposed PolypSeg+ on two well-known publicly available datasets for polyp segmentation: 1) Kvasir-SEG and 2) CVC-Endoscenestill. The experimental results demonstrate that our PolypSeg+ consistently outperforms other state-of-the-art networks, achieving better segmentation accuracy in much less running time. The code is available at https://github.com/szu-zzb/polypsegplus.


Subject(s)
Colonic Neoplasms , Image Interpretation, Computer-Assisted , Humans , Colonic Neoplasms/diagnostic imaging , Colonoscopy
14.
Med Image Anal ; 78: 102397, 2022 May.
Article in English | MEDLINE | ID: mdl-35259635

ABSTRACT

We present a novel model for left ventricle endocardium segmentation from echocardiography video, which is of great significance in clinical practice and yet a challenging task due to (1) the severe speckle noise in echocardiography videos, (2) the irregular motion of pathological hearts, and (3) the limited training data caused by the high annotation cost. The proposed model has three compelling characteristics. First, we propose a novel adaptive spatiotemporal semantic calibration method to align the feature maps of consecutive frames, where the spatiotemporal correspondences are derived from feature maps instead of pixels, thereby mitigating the adverse effects of speckle noise in the calibration. Second, we further learn the importance of each feature map of neighbouring frames to the current frame from the temporal perspective, so as to harness the temporal information distinctively rather than uniformly and thus tackle the irregular and anisotropic motions. Third, we integrate these techniques into the mean teacher semi-supervised architecture to leverage a large amount of unlabeled data to improve segmentation accuracy. We extensively evaluate the proposed method on two public echocardiography video datasets (EchoNet-Dynamic and CAMUS), where it achieves average Dice coefficients of 92.87% and 93.79% on left ventricular endocardium segmentation, respectively. Comparisons with state-of-the-art methods also demonstrate the effectiveness of the proposed method, which achieves better segmentation performance with a faster speed.
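For reference, here is a minimal PyTorch sketch of the mean-teacher machinery this model builds on: the teacher is an exponential moving average (EMA) of the student, and a consistency loss couples their predictions on unlabeled frames. The MSE form of the consistency loss is an illustrative choice.

```python
# Minimal sketch of the mean-teacher update and consistency objective.
import torch

@torch.no_grad()
def update_teacher(teacher, student, alpha=0.999):
    # Teacher weights track an exponential moving average of the student's.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1.0 - alpha)

def consistency_loss(student_logits, teacher_logits):
    # Penalize disagreement between the two predictions on unlabeled frames.
    return ((student_logits.softmax(1) - teacher_logits.softmax(1)) ** 2).mean()
```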


Subject(s)
Echocardiography , Semantics , Calibration , Heart/diagnostic imaging , Heart Ventricles/diagnostic imaging , Humans
15.
Med Image Anal ; 76: 102327, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34923250

ABSTRACT

Skin lesion segmentation from dermoscopic images is essential for improving the quantitative analysis of melanoma. However, it is still a challenging task due to the large scale variations and irregular shapes of skin lesions. In addition, the blurred boundaries between lesions and the surrounding tissues may further increase the probability of incorrect segmentation. Due to the inherent limitations of convolutional neural networks (CNNs) in capturing global context information, traditional CNN-based methods usually cannot achieve satisfactory segmentation performance. In this paper, we propose a novel feature adaptive transformer network based on the classical encoder-decoder architecture, named FAT-Net, which integrates an extra transformer branch to effectively capture long-range dependencies and global context information. Furthermore, we employ a memory-efficient decoder and a feature adaptation module to enhance the fusion between adjacent-level features by activating effective channels and restraining irrelevant background noise. We have performed extensive experiments to verify the effectiveness of our proposed method on four public skin lesion segmentation datasets: ISIC 2016, ISIC 2017, ISIC 2018, and PH2. Ablation studies demonstrate the effectiveness of our feature adaptive transformers and memory-efficient strategies. Comparisons with state-of-the-art methods also verify the superiority of our proposed FAT-Net in terms of both accuracy and inference speed. The code is available at https://github.com/SZUcsh/FAT-Net.


Subject(s)
Image Processing, Computer-Assisted , Skin Diseases , Humans , Neural Networks, Computer
16.
Med Image Anal ; 70: 102025, 2021 May.
Article in English | MEDLINE | ID: mdl-33721692

ABSTRACT

Accurately segmenting retinal vessels from retinal images is essential for the detection and diagnosis of many eye diseases. However, it remains a challenging task due to (1) the large scale variations of retinal vessels and (2) their complicated anatomical context, including complex vasculature and morphology, the low contrast between some vessels and the background, and the existence of exudates and hemorrhages. It is difficult for a model to capture representative and distinguishing features for retinal vessels under such large scale and semantic variations. Limited training data make this task even harder. To comprehensively tackle these challenges, we propose a novel scale- and context-sensitive network (SCS-Net) for retinal vessel segmentation. We first propose a scale-aware feature aggregation (SFA) module, aiming at dynamically adjusting the receptive fields to effectively extract multi-scale features. Then, an adaptive feature fusion (AFF) module is designed to guide efficient fusion between adjacent hierarchical features to capture more semantic information. Finally, a multi-level semantic supervision (MSS) module is employed to learn more distinctive semantic representations for refining the vessel maps. We conduct extensive experiments on six mainstream retinal image databases (DRIVE, CHASEDB1, STARE, IOSTAR, HRF, and LES-AV). The experimental results demonstrate the effectiveness of the proposed SCS-Net, which achieves better segmentation performance than other state-of-the-art approaches, especially for challenging cases with large scale variations and complex context environments.


Subject(s)
Image Processing, Computer-Assisted , Retinal Vessels , Databases, Factual , Humans , Retinal Vessels/diagnostic imaging
17.
IEEE Trans Cybern ; 51(9): 4464-4475, 2021 Sep.
Article in English | MEDLINE | ID: mdl-31794419

ABSTRACT

Symmetry detection is a method to extract the ideal mid-sagittal plane (MSP) from brain magnetic resonance (MR) images, which can significantly improve the diagnostic accuracy of brain diseases. In this article, we propose an automatic symmetry detection method for 2-D slices of brain MR images based on a 2-channel convolutional neural network (CNN). Different from existing detection methods that mainly rely on local image features (gradient, edge, etc.) to determine the MSP, we use a CNN-based model that requires no local feature detection or feature matching. By training on a wide variety of benchmarks in brain images, the 2-channel CNN learns to evaluate the similarity between pairs of brain patches, which are randomly extracted from the whole brain slice based on Poisson sampling. Finally, a scoring and ranking scheme is used to identify the optimal symmetry axis for each input brain MR slice. Our method was evaluated on 2166 synthetically generated brain images and 3064 in vivo MR images, including both healthy and pathological cases. The experimental results show that our method achieves excellent performance for symmetry detection. Comparisons with state-of-the-art methods also demonstrate the effectiveness and advantages of our approach, which achieves higher accuracy than previous competitors.
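A minimal PyTorch sketch of a 2-channel similarity CNN follows; the layer sizes and the 32x32 patch size are illustrative assumptions. Each candidate axis would then be scored by averaging the network's outputs over its sampled patch pairs and ranked.

```python
# Minimal sketch: two grayscale patches (one mirrored across a candidate
# axis) are stacked as channels and scored for similarity.
import torch
import torch.nn as nn

class TwoChannelNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.score = nn.Linear(64 * 8 * 8, 1)    # for 32x32 input patches

    def forward(self, patch_pair):               # (b, 2, 32, 32)
        x = self.features(patch_pair).flatten(1)
        return self.score(x)                     # higher = more similar
```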


Subject(s)
Magnetic Resonance Imaging , Neural Networks, Computer , Brain/diagnostic imaging , Neuroimaging
18.
Med Image Anal ; 68: 101891, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33260108

ABSTRACT

Left ventricular (LV) segmentation is essential for the early diagnosis of cardiovascular diseases, which have been reported as the leading cause of death worldwide. However, automated LV segmentation from cardiac magnetic resonance images (CMRI) using traditional convolutional neural networks (CNNs) is still a challenging task due to the limited labeled CMRI data and the low tolerance to irregular scales, shapes, and deformations of the LV. In this paper, we propose an automated LV segmentation method based on adversarial learning that integrates a multi-stage pose estimation network (MSPN) and a co-discrimination network. Different from existing CNNs, we use an MSPN with multi-scale dilated convolution (MDC) modules to enlarge the receptive fields for deep feature extraction. To fully utilize both labeled and unlabeled CMRI data, we propose a novel generative adversarial network (GAN) framework for LV segmentation by combining the MSPN with co-discrimination networks. Specifically, the labeled CMRI data are first used to initialize our segmentation network (MSPN) and co-discrimination network. Our GAN training then alternates between two kinds of epochs fed with labeled and unlabeled CMRI data respectively, unlike traditional CNNs that rely only on the limited labeled samples to train the segmentation networks. As both ground truth and unlabeled samples are involved in guiding training, our method not only converges faster but also achieves better performance in LV segmentation. Our method is evaluated on the MICCAI 2009 and 2017 challenge databases. Experimental results show that our method achieves promising performance in LV segmentation and outperforms state-of-the-art methods in terms of segmentation accuracy.


Subject(s)
Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Heart , Heart Ventricles/diagnostic imaging , Neural Networks, Computer
19.
IEEE Trans Med Imaging ; 40(1): 357-370, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32986547

ABSTRACT

We present a convolutional neural network (CNN) equipped with a novel and efficient adaptive dual attention module (ADAM) for automated skin lesion segmentation from dermoscopic images, an essential yet challenging step in the development of a computer-assisted skin disease diagnosis system. The proposed ADAM has three compelling characteristics. First, we integrate two global context modeling mechanisms into ADAM, one capturing the boundary continuity of skin lesions via global average pooling and the other dealing with shape irregularity via pixel-wise correlation. Thanks to the proposed ADAM, our network is thus capable of extracting more comprehensive and discriminative features for recognizing the boundaries of skin lesions. Second, the proposed ADAM supports multi-scale resolution fusion and hence can capture multi-scale features to further improve segmentation accuracy. Third, as we harness a spatial information weighting method in the proposed network, our method can reduce much of the redundancy of traditional CNNs. The proposed network is implemented on a dual encoder architecture, which enlarges the receptive field without greatly increasing the network parameters. In addition, we assign different dilation rates to different ADAMs so that they can adaptively capture distinguishing features according to the size of a lesion. We extensively evaluate the proposed method on both the ISBI2017 and ISIC2018 datasets, and the experimental results demonstrate that, without network ensemble schemes, our method achieves better segmentation performance than state-of-the-art deep learning models, particularly those equipped with attention mechanisms.
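To illustrate the two global-context mechanisms named above, here is a minimal PyTorch sketch of squeeze-style channel attention via global average pooling and of pixel-wise correlation (spatial self-attention); the actual ADAM additionally performs multi-scale resolution fusion and spatial weighting, which are omitted here.

```python
# Minimal sketch of the two context mechanisms, as standalone modules.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c, r=8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                nn.Linear(c // r, c), nn.Sigmoid())

    def forward(self, x):                        # (b, c, h, w)
        w = self.fc(x.mean(dim=(2, 3)))          # global average pooling
        return x * w[:, :, None, None]           # reweight channels

class PixelCorrelation(nn.Module):
    def forward(self, x):                        # (b, c, h, w)
        b, c, h, w = x.shape
        f = x.flatten(2)                         # (b, c, hw)
        attn = torch.softmax(f.transpose(1, 2) @ f / c ** 0.5, dim=-1)
        return x + (f @ attn.transpose(1, 2)).reshape(b, c, h, w)
```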


Subject(s)
Image Processing, Computer-Assisted , Skin Diseases , Diagnosis, Computer-Assisted , Humans , Neural Networks, Computer
20.
Sensors (Basel) ; 17(1), 2017 Jan 18.
Article in English | MEDLINE | ID: mdl-28106764

ABSTRACT

Single-image blind deblurring for imaging sensors in the Internet of Things (IoT) is a challenging ill-conditioned inverse problem that requires regularization techniques to stabilize the image restoration process. The goal is to recover the underlying blur kernel and the latent sharp image from only one blurred image. Under many degraded imaging conditions, the blur kernel can be considered not only spatially sparse but also piecewise smooth, supported on a continuous curve. By taking advantage of these hybrid sparse properties of the blur kernel, a hybrid regularization method is proposed in this paper to robustly and accurately estimate the blur kernel. The effectiveness of the proposed blur kernel estimation method is enhanced by incorporating both the L1-norm of the kernel intensity and the squared L2-norm of the intensity derivative. Once an accurate estimate of the blur kernel is obtained, the original blind deblurring reduces to direct deconvolution of the blurred image. To guarantee robust non-blind deconvolution, a variational image restoration model is presented based on an L1-norm data-fidelity term and a second-order total generalized variation (TGV) regularizer. All non-smooth optimization problems related to blur kernel estimation and non-blind deconvolution are effectively handled using alternating direction method of multipliers (ADMM)-based numerical methods. Comprehensive experiments on both synthetic and realistic datasets compare the proposed method with several state-of-the-art methods. The experimental comparisons illustrate the satisfactory imaging performance of the proposed method in terms of both quantitative and qualitative evaluations.
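As a concrete ingredient, here is a minimal NumPy sketch of the closed-form proximal step that ADMM uses for an L1 prior, together with the usual kernel feasibility projection; the full method also handles the squared-L2 derivative term, the data-fidelity term, and the TGV-regularized deconvolution, which are beyond this sketch.

```python
# Minimal sketch: the L1 subproblem in ADMM is solved exactly by
# soft-thresholding (the proximal operator of the L1 norm).
import numpy as np

def soft_threshold(v, tau):
    """prox of tau*||.||_1: argmin_k tau*||k||_1 + 0.5*||k - v||^2."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def project_kernel(k):
    """Project onto the feasible set: nonnegative entries summing to one."""
    k = np.maximum(k, 0.0)
    s = k.sum()
    return k / s if s > 0 else k

# Example z-update inside an ADMM loop (lam: L1 weight, rho: penalty):
#   z = soft_threshold(k + u, lam / rho)
#   k = project_kernel(k)  # after the quadratic k-update
```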
