Results 1 - 20 of 30
1.
Article in English | MEDLINE | ID: mdl-37204953

ABSTRACT

Existing deep learning-based interactive image segmentation methods have significantly reduced the user's interaction burden through simple click interactions. However, they still require excessive numbers of clicks to continuously correct the segmentation before reaching a satisfactory result. This article explores how to obtain accurate segmentation of targets of interest while minimizing the user interaction cost. To achieve this goal, we propose a one-click-based interactive segmentation approach. For this particularly challenging problem in the interactive segmentation task, we build a top-down framework that divides the original problem into one-click-based coarse localization followed by fine segmentation. A two-stage interactive object localization network is first designed, which aims to completely enclose the target of interest under the supervision of object integrity (OI). Click centrality (CC) is also utilized to overcome the overlap problem between objects. This coarse localization helps to reduce the search space and increases the focus of the click at a higher resolution. A principled multilayer segmentation network is then designed with a progressive layer-by-layer structure, which aims to accurately perceive the target with extremely limited prior guidance. A diffusion module is also designed to enhance the information flow between layers. In addition, the proposed model can be naturally extended to the multiobject segmentation task. Our method achieves state-of-the-art performance under one-click interaction on several benchmarks.

2.
J Autism Dev Disord ; 53(6): 2475-2489, 2023 Jun.
Article in English | MEDLINE | ID: mdl-35389185

ABSTRACT

Previous studies have demonstrated abnormal brain overgrowth in children with autism spectrum disorder (ASD), but the development of specific brain regions, such as the amygdala and hippocampal subfields in infants, is incompletely documented. To address this issue, we performed the first MRI study of amygdala and hippocampal subfields in infants from 6 to 24 months of age using a longitudinal dataset. A novel deep learning approach, Dilated-Dense U-Net, was proposed to address the challenge of low tissue contrast and small structural size of these subfields. We performed a volume-based analysis on the segmentation results. Our results show that infants who were later diagnosed with ASD had larger left and right volumes of amygdala and hippocampal subfields than typically developing controls.
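The volume-based analysis described above reduces to counting labeled voxels and scaling by the voxel volume. A minimal sketch (the function name and the convention that label 0 is background are our assumptions, not from the paper):

```python
import numpy as np

def label_volumes(seg, voxel_size_mm3=1.0):
    """Volume per label in a segmentation map: voxel count times voxel
    volume. This per-structure volume is the quantity compared between
    the ASD and typically developing groups."""
    labels, counts = np.unique(seg, return_counts=True)
    # Skip label 0, assumed to be background.
    return {int(l): float(c) * voxel_size_mm3
            for l, c in zip(labels, counts) if l != 0}
```

With an isotropic voxel size, passing `voxel_size_mm3=prod(spacing)` converts counts directly to cubic millimeters.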


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Child , Humans , Infant , Autism Spectrum Disorder/diagnostic imaging , Hippocampus/diagnostic imaging , Brain , Amygdala/diagnostic imaging , Magnetic Resonance Imaging/methods
3.
Article in English | MEDLINE | ID: mdl-36256720

ABSTRACT

With rapid advances in digital imaging and communication technologies, image set classification has recently attracted significant attention and has been widely used in many real-world scenarios. As an effective technology, class-specific representation theory-based methods have demonstrated superior performance. However, these methods either use only one gallery set to measure the gallery-to-probe-set distance or ignore the inner connection between different metrics, leaving the learned distance metric lacking robustness and sensitive to the size of the image sets. In this article, we propose a novel joint metric learning-based class-specific representation framework (JMLC), which can jointly learn the related and unrelated metrics. By iteratively modeling the probe set and the related or unrelated gallery sets as affine hulls, we reconstruct each hull sparsely or collaboratively over another image set. With the obtained representation coefficients, the combined metric between the query set and the gallery set can then be calculated. In addition, we derive the kernel extension of JMLC and propose two new strategies for constituting unrelated sets. Specifically, kernelized JMLC (KJMLC) embeds the gallery sets and probe sets into a high-dimensional Hilbert space, in which the data become approximately linearly separable. Extensive experiments on seven benchmark databases show the superiority of the proposed methods over state-of-the-art image set classifiers.

4.
IEEE Trans Image Process ; 31: 6471-6486, 2022.
Article in English | MEDLINE | ID: mdl-36223352

ABSTRACT

In the field of image set classification, most existing works focus on exploiting effective latent discriminative features. However, efficiently handling this problem remains a research gap. In this paper, benefiting from the superiority of hashing in terms of computational complexity and memory costs, we present a novel Discrete Metric Learning (DML) approach based on the Riemannian manifold for fast image set classification. The proposed DML jointly learns a metric in the induced space and a compact Hamming space, where efficient classification is carried out. Specifically, each image set is modeled as a point on a Riemannian manifold, after which DML minimizes the Hamming distance between similar Riemannian pairs and maximizes the Hamming distance between dissimilar ones by introducing a discriminative Mahalanobis-like matrix. To overcome the shortcoming of DML that it relies on vectorizing the Riemannian representations, we further develop Bilinear Discrete Metric Learning (BDML), which directly manipulates the original Riemannian representations and exploits the natural matrix structure of high-dimensional data. Unlike conventional Riemannian metric learning methods, which require complicated Riemannian optimizations (e.g., Riemannian conjugate gradient), both DML and BDML can be efficiently optimized by computing the geodesic mean between the similarity matrix and the inverse of the dissimilarity matrix. Extensive experiments on different visual recognition tasks (face recognition, object recognition, and action recognition) demonstrate that the proposed methods achieve competitive performance in terms of accuracy and efficiency.
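The closed-form update mentioned in the abstract rests on the geodesic (affine-invariant) mean of two symmetric positive-definite (SPD) matrices. A minimal numpy sketch of just that operation, with helper names of our own choosing (the full DML/BDML pipeline is not reproduced here):

```python
import numpy as np

def _spd_sqrt(M):
    """Matrix square root of a symmetric positive-definite matrix,
    computed via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def spd_geodesic_mean(A, B):
    """Geodesic mean of two SPD matrices under the affine-invariant
    metric: A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    A_half = _spd_sqrt(A)
    A_neg_half = np.linalg.inv(A_half)
    return A_half @ _spd_sqrt(A_neg_half @ B @ A_neg_half) @ A_half
```

A useful sanity check: the geodesic mean of an SPD matrix and its inverse is the identity.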

5.
IEEE Trans Neural Netw Learn Syst ; 32(5): 1839-1851, 2021 May.
Article in English | MEDLINE | ID: mdl-32406846

ABSTRACT

Multiple kernel learning (MKL) is generally recognized to perform better than single kernel learning (SKL) in handling nonlinear clustering problems, largely because MKL avoids selecting and tuning a predefined kernel. By integrating the self-expression learning framework, graph-based MKL subspace clustering has recently attracted considerable attention. However, previous MKL methods largely ignore the graph structure of the data in kernel space, which is a key concept for constructing the affinity graph used in spectral clustering. To address this problem, this article proposes a novel MKL method, structure-preserving multiple kernel clustering (SPMKC). Specifically, SPMKC proposes a new kernel affine weight strategy to learn an optimal consensus kernel from a predefined kernel pool, automatically assigning a suitable weight to each base kernel. Furthermore, SPMKC proposes a kernel group self-expressiveness term and a kernel adaptive local structure learning term to preserve the global and local structure of the input data in kernel space, respectively, rather than in the original space. In addition, an efficient algorithm is proposed to solve the resulting unified objective function; it iteratively updates the consensus kernel and the affinity graph so that each collaboratively promotes the other toward the optimum. Experiments on both image and text clustering demonstrate that SPMKC outperforms state-of-the-art MKL clustering methods in terms of clustering performance and computational cost.

6.
IEEE Trans Cybern ; 51(6): 3273-3284, 2021 Jun.
Article in English | MEDLINE | ID: mdl-32584777

ABSTRACT

Multiple kernel graph-based clustering (MKGC) has attracted significant attention in recent years, primarily due to the superiority of multiple kernel learning (MKL) and the outstanding performance of graph-based clustering. However, many existing MKGC methods design a fat model that burdens both computational cost and clustering performance, since they cumbersomely learn an affinity graph and an extra consensus kernel. To tackle this challenging problem, this article proposes a new MKGC method that learns a consensus affinity graph directly. Using self-expressiveness graph learning and an adaptive local structure learning term, the local manifold structure of the data in kernel space is first preserved while learning multiple candidate affinity graphs from a kernel pool. These candidate affinity graphs are then synthesized into a consensus affinity graph via a thin autoweighted fusion model, in which a self-tuned Laplacian rank constraint and a top-k neighbors sparse strategy are introduced to improve the quality of the consensus affinity graph for accurate clustering. The experimental results on ten benchmark datasets and two synthetic datasets show that the proposed method consistently and significantly outperforms state-of-the-art methods.
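The autoweighted fusion step can be illustrated with the standard self-tuned weighting trick, w_i ∝ 1 / (2‖S − S_i‖_F), alternating between the consensus graph and the weights. This sketch is our own simplification: it omits the Laplacian rank constraint and the top-k sparsification that the paper adds.

```python
import numpy as np

def autoweighted_fusion(graphs, n_iter=30):
    """Fuse candidate affinity graphs into a consensus graph S.
    Alternates between (1) self-tuned weights w_i = 1/(2||S - S_i||_F),
    which favor graphs close to the current consensus, and (2) S as the
    weighted average of the candidates."""
    S = np.mean(graphs, axis=0)  # start from the plain average
    for _ in range(n_iter):
        w = np.array([1.0 / (2.0 * np.linalg.norm(S - G) + 1e-12)
                      for G in graphs])
        w = w / w.sum()
        S = sum(wi * G for wi, G in zip(w, graphs))
    return S, w
```

Candidates that agree with the majority accumulate weight, so an outlier graph is progressively downweighted.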

7.
IEEE Trans Image Process ; 29(1): 2094-2107, 2020.
Article in English | MEDLINE | ID: mdl-31502975

ABSTRACT

To defy the curse of dimensionality, inputs are typically projected from the original high-dimensional space into a low-dimensional target space for feature extraction. However, due to the existence of noise and outliers, feature extraction from corrupted data remains a challenging problem. Recently, a robust method called low rank embedding (LRE) was proposed. Despite the success of LRE in experimental studies, it has several disadvantages: 1) the learned projection cannot quantitatively interpret the importance of features; 2) LRE does not perform data reconstruction, so the features may not hold the main energy of the original "clean" data; 3) LRE explicitly transforms error into the target space; and 4) LRE is an unsupervised method, suitable only for unsupervised scenarios. To address these problems, in this paper we propose a novel method to exploit the latent discriminative features. In particular, we first utilize an orthogonal matrix to hold the main energy of the original data. Next, we introduce an l2,1-norm term to encourage the features to be more compact, discriminative, and interpretable. Then, we enforce a column-wise l2,1-norm constraint on an error component to resist noise. Finally, we integrate a classification loss term into the objective function to fit supervised scenarios. Our method performs better than several state-of-the-art methods in terms of effectiveness and robustness, as demonstrated on six publicly available datasets.
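The l2,1 norm used above is simply the sum of the row-wise l2 norms of a matrix; minimizing it drives whole rows (i.e., whole features) toward zero, which is what makes the learned projection interpretable. A one-line sketch (helper name ours):

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm of W: the sum of the l2 norms of its rows.
    Rows that contribute little can be driven to exactly zero,
    yielding row-sparse (feature-selecting) solutions."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()
```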

8.
Proc IEEE Int Symp Biomed Imaging ; 2019: 1052-1056, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31681457

ABSTRACT

Currently, autism spectrum disorder (ASD) is mainly diagnosed by observing core behavioral symptoms. Because the disorder is often not detected until around 3 years of age, the window of opportunity for effective intervention may already have passed. It is therefore of great importance to identify imaging-based biomarkers for early diagnosis of ASD. Previous findings indicate that an abnormal pattern of amygdala and hippocampal development in autism persists through childhood and adolescence. However, due to the low tissue contrast and small structural size of the amygdala and hippocampal subfields, our knowledge of their early growth in autism remains very limited. In this paper, for the first time, we propose a volume-based analysis of the amygdala and hippocampal subfields of infant subjects at risk of ASD at around 24 months of age. Specifically, to address the challenge of low tissue contrast, we propose a novel deep-learning approach, the dilated-dense U-Net, to automatically segment the amygdala and hippocampal subfields. Experimental results on the National Database for Autism Research (NDAR) show the advantages of our proposed method in terms of segmentation accuracy. Our volume-based analysis shows overgrowth of the amygdala and the CA1-3 subfields of the hippocampus, which may be linked to the emergence of autism spectrum disorder.

9.
Graph Learn Med Imaging (2019) ; 11849: 164-171, 2019 Oct.
Article in English | MEDLINE | ID: mdl-32104792

ABSTRACT

Currently, there are still no early biomarkers to detect infants at risk of autism spectrum disorder (ASD), which is mainly diagnosed based on behavioral observations at three or four years of age. Since intervention efforts may miss a critical developmental window after 2 years of age, it is clinically significant to identify imaging-based biomarkers at an early stage, before the behavioral diagnostic signs of ASD typically arise. Previous studies on older children and young adults with ASD demonstrate altered developmental trajectories of the amygdala and hippocampus. However, our knowledge of their developmental trajectories in the early postnatal stages remains very limited. In this paper, for the first time, we propose a volume-based analysis of the amygdala and hippocampal subfields of infant subjects at risk of ASD at 6, 12, and 24 months of age. To address the challenge of low tissue contrast and the small structural size of the infant amygdala and hippocampal subfields, we propose a novel deep-learning approach, the dilated-dense U-Net, to segment the amygdala and hippocampal subfields in a longitudinal dataset, the National Database for Autism Research (NDAR). A volume-based analysis is then performed on the segmentation results. Our study shows that the overgrowth of the amygdala and cornu ammonis sectors (CA) 1-3 may start from 6 months of age, which may be related to the emergence of autism spectrum disorder.

10.
IEEE Trans Image Process ; 28(1): 330-342, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30183628

ABSTRACT

This paper presents an interactive image segmentation approach in which we formulate segmentation as a probabilistic estimation problem based on the prior user intention. Instead of directly measuring the relationship between pixels and labels, we first estimate the distances between pixel pairs and label pairs using a probabilistic framework. Then, binary probabilities with label pairs are naturally converted to unary probabilities with labels. The higher order relationship helps improve the robustness to user inputs. To improve segmentation accuracy, a likelihood learning framework is proposed to fuse the region and the boundary information of the image by imposing a smoothing constraint on the unary potentials. Furthermore, we establish an equivalence relationship between likelihood learning and likelihood diffusion and propose an iterative diffusion-based optimization strategy to maintain computational efficiency. Experiments on the Berkeley segmentation data set and Microsoft GrabCut database demonstrate that the proposed method can obtain better performance than the state-of-the-art methods.
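As a generic illustration of diffusion-based probability propagation (not the paper's exact update; the function name and the α-anchored scheme are our assumptions), label probabilities can be iteratively smoothed over a pixel affinity graph while staying anchored to the initial user-derived estimates:

```python
import numpy as np

def diffuse_probabilities(P, W, alpha=0.9, n_iter=100):
    """Iterative probability diffusion on an affinity graph:
    P <- alpha * D^{-1} W P + (1 - alpha) * P0,
    where W is the (nonnegative) affinity matrix, D its row-sum degree,
    and P0 the initial per-node label probabilities."""
    P0 = P.copy()
    Dinv = 1.0 / W.sum(axis=1, keepdims=True)  # assumes no isolated nodes
    for _ in range(n_iter):
        P = alpha * Dinv * (W @ P) + (1.0 - alpha) * P0
    return P / P.sum(axis=1, keepdims=True)
```

Each node blends its neighbors' current probabilities with its own initial evidence, so confident seeds propagate along strong edges.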

11.
Mach Learn Med Imaging ; 11046: 303-309, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30450494

ABSTRACT

Currently, there are still no early biomarkers to detect infants at risk of autism spectrum disorder (ASD), which is mainly diagnosed based on behavioral observations at three or four years of age. Since intervention efforts may miss a critical developmental window after 2 years of age, it is important to identify imaging-based biomarkers for early diagnosis of ASD. Although some methods using magnetic resonance imaging (MRI) for brain disease prediction have been proposed in the last decade, few of them were developed for predicting ASD at an early age. Inspired by deep multi-instance learning, in this paper we propose a patch-level data-expanding strategy for multi-channel convolutional neural networks to automatically identify infants at risk of ASD at an early age. Experiments conducted on the National Database for Autism Research (NDAR) show that our proposed method can significantly improve the performance of early ASD diagnosis.

12.
IEEE Trans Neural Netw Learn Syst ; 29(9): 4324-4338, 2018 09.
Article in English | MEDLINE | ID: mdl-29990175

ABSTRACT

Embedding methods have shown promising performance in multilabel prediction, as they are able to discover label dependence. However, most methods ignore the correlations between the input and output, so their learned embeddings are not well aligned, which degrades prediction performance. This paper presents a formulation for multilabel learning from the perspective of cross-view learning that explores the correlations between the input and the output. The proposed method, called Co-Embedding (CoE), jointly learns a semantic common subspace and view-specific mappings within one framework. The semantic similarity structure among the embeddings is further preserved, ensuring that close embeddings share similar labels. Additionally, CoE conducts multilabel prediction through a cross-view k-nearest-neighbor (kNN) search among the learned embeddings, which significantly reduces computational costs compared with conventional decoding schemes. A hashing-based model, Co-Hashing (CoH), is further proposed. CoH is based on CoE and imposes a binary constraint on the continuous latent embeddings. CoH aims to generate compact binary representations that improve prediction efficiency by exploiting efficient kNN search for multiple labels in the Hamming space. Extensive experiments on various real-world datasets demonstrate the superiority of the proposed methods over the state of the art in terms of both prediction accuracy and efficiency.
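The Hamming-space kNN retrieval that makes hashing-based prediction efficient is cheap because the distance is an XOR followed by a popcount. A minimal sketch over 0/1 code matrices (names ours, not from the paper):

```python
import numpy as np

def hamming_knn(query, database, k=3):
    """Indices of the k database codes closest to `query` in Hamming
    distance. `query` is a 0/1 vector; `database` is a 0/1 matrix with
    one code per row."""
    # XOR marks differing bits; the row sum is the Hamming distance.
    dists = np.bitwise_xor(database, query).sum(axis=1)
    # Stable sort keeps database order among ties.
    return np.argsort(dists, kind="stable")[:k]
```

In production systems the codes are packed into machine words so the popcount runs on 64 bits at a time, but the logic is the same.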

13.
Appl Opt ; 57(12): 3268-3280, 2018 Apr 20.
Article in English | MEDLINE | ID: mdl-29714314

ABSTRACT

State-of-the-art no-reference image quality assessment methods usually learn to evaluate image quality by regression from the human subjective scores of a training set. Their dependence on the regression algorithm and human subjective scores may limit the practical application of such methods. In this paper, we propose a completely blind image quality assessment method that is highly unsupervised and training free. We first use a specific image primitive to analyze the image gray-scale fluctuation and use this result as one of the image quality assessment features. The box-counting method is then used to evaluate the image fractal dimension, and the result is used as the other feature. Finally, the two features are combined together, and a formula is introduced to calculate a comprehensive image quality feature, which is used to measure the image quality. Experimental results on four open databases show that the newly proposed method correlates well with the human subjective judgments of diversely distorted images.
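The box-counting estimate of fractal dimension counts the boxes occupied by foreground pixels at several scales and fits a line in log-log space. A minimal sketch (our own helper, not the paper's implementation):

```python
import numpy as np

def box_counting_dimension(binary_img, sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension of a binary image by box counting:
    count s-by-s boxes containing any foreground pixel, then take the
    slope of log(count) versus log(1/s)."""
    img = np.asarray(binary_img, dtype=bool)
    counts = []
    for s in sizes:
        # Trim so the image tiles evenly into s x s boxes.
        h, w = (img.shape[0] // s) * s, (img.shape[1] // s) * s
        boxes = img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope
```

A filled region yields a dimension near 2 and a thin line near 1, which is a quick way to validate the estimator.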

14.
PLoS One ; 12(1): e0168449, 2017.
Article in English | MEDLINE | ID: mdl-28045950

ABSTRACT

Accurate image segmentation is an important issue in image processing, where Gaussian mixture models play an important part and have been proven effective. However, most Gaussian mixture model (GMM)-based methods suffer from one or more limitations, such as limited robustness to noise, over-smooth segmentations, and a lack of flexibility to fit the data. To address these issues, in this paper we propose a rough-set-bounded asymmetric Gaussian mixture model with a spatial constraint for image segmentation. First, based on our previous work, in which each cluster is characterized by three automatically determined rough-fuzzy regions, we partition the target image into three rough regions with two adaptively computed thresholds. Second, a new bounded indicator function is proposed to determine the bounded support regions of the observed data. The bounded indicator and the posterior probability that a pixel belongs to each sub-region are estimated with respect to the rough region in which the pixel lies. Third, to further reduce over-smoothing in the segmentations, two novel prior factors are proposed that incorporate spatial information among neighboring pixels; they are constructed from the prior and posterior within-cluster and between-cluster probabilities and take the spatial direction into account. We compare our algorithm with state-of-the-art segmentation approaches on both synthetic and real images to demonstrate its superior performance.
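The baseline the paper extends, a plain GMM fitted to pixel intensities by EM, can be sketched as follows for two components in 1-D. The function name, initialization, and iteration count are our assumptions; the rough-set bounds, asymmetry, and spatial priors of the paper are deliberately omitted.

```python
import numpy as np

def gmm_segment(pixels, n_iter=50):
    """Two-component 1-D Gaussian mixture fitted by EM; returns a
    binary component label per pixel (the plain-GMM baseline)."""
    x = np.asarray(pixels, dtype=float).ravel()
    mu = np.array([x.min(), x.max()])          # spread the initial means
    var = np.full(2, x.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel.
        dens = pi / np.sqrt(2 * np.pi * var) * np.exp(
            -(x[:, None] - mu) ** 2 / (2 * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means, and variances.
        nk = resp.sum(axis=0)
        pi = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return resp.argmax(axis=1)
```

On well-separated intensity clusters this converges in a handful of iterations; the paper's contributions address exactly the cases (noise, overlap, spatial coherence) where this baseline fails.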


Subject(s)
Brain/diagnostic imaging , Brain/pathology , Diagnostic Imaging/methods , Image Processing, Computer-Assisted , Algorithms , Automation , Bayes Theorem , Cluster Analysis , Computer Simulation , Humans , Models, Statistical , Normal Distribution , Probability , Programming Languages
15.
Med Image Anal ; 36: 162-171, 2017 02.
Article in English | MEDLINE | ID: mdl-27914302

ABSTRACT

Accurate segmentation of anatomical structures in medical images is important in recent imaging-based studies. In past years, multi-atlas patch-based label fusion methods have achieved great success in medical image segmentation. In these methods, the appearance of each input image patch is first represented by an atlas patch dictionary (in the image domain), and the latent label of the input image patch is then predicted by applying the estimated representation coefficients to the corresponding anatomical labels of the atlas patches in the atlas label dictionary (in the label domain). However, due to the generally large gap between the patch appearance in the image domain and the patch structure in the label domain, the (patch) representation coefficients estimated in the image domain may not be optimal for the final label fusion, thus reducing the labeling accuracy. To address this issue, we propose a novel label fusion framework that seeks suitable label fusion weights by progressively constructing a dynamic dictionary in a layer-by-layer manner, where the intermediate dictionaries act as a sequence of guides that steer the transition of the (patch) representation coefficients from the image domain to the label domain. Our proposed multi-layer label fusion framework is flexible enough to be applied to existing labeling methods to improve their label fusion performance, i.e., by extending their single-layer static dictionary to a multi-layer dynamic dictionary. The experimental results show that our proposed progressive label fusion method achieves more accurate hippocampal segmentation results on the ADNI dataset than the counterpart methods using only a single-layer static dictionary.
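The single-layer baseline that the multi-layer framework extends can be sketched as similarity-weighted voting over atlas patches. The function name and the Gaussian weighting kernel are our assumptions; the paper's contribution is refining these weights layer by layer rather than using them directly.

```python
import numpy as np

def patch_label_fusion(target_patch, atlas_patches, atlas_labels, h=0.1):
    """Single-layer patch-based label fusion: weight each atlas patch by
    exp(-||target - atlas||^2 / h), then take the weighted label vote."""
    d2 = ((atlas_patches - target_patch) ** 2).sum(axis=1)
    w = np.exp(-d2 / h)
    w = w / w.sum()
    # Weighted vote over the candidate labels.
    labels = np.unique(atlas_labels)
    scores = [w[atlas_labels == l].sum() for l in labels]
    return labels[int(np.argmax(scores))]
```

The bandwidth `h` controls how sharply dissimilar atlas patches are downweighted.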


Subject(s)
Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Algorithms , Alzheimer Disease/diagnostic imaging , Hippocampus/diagnostic imaging , Humans , Neuroimaging/methods
16.
IEEE Trans Cybern ; 47(12): 4275-4288, 2017 Dec.
Article in English | MEDLINE | ID: mdl-27655043

ABSTRACT

Due to the significant reduction in computational cost and storage, hashing techniques have gained increasing interest for facilitating large-scale cross-view retrieval tasks. Most cross-view hashing methods are developed by assuming that data from different views are well paired, e.g., text-image pairs. In real-world applications, however, this fully-paired multiview setting may be impractical. The more practical yet challenging semi-paired cross-view retrieval problem, where pairwise correspondences are only partially provided, has been less studied. In this paper, we propose an unsupervised hashing method for semi-paired cross-view retrieval, dubbed semi-paired discrete hashing (SPDH). Specifically, SPDH explores the underlying structure of the constructed common latent subspace, where both paired and unpaired samples are well aligned. To effectively preserve the similarities of semi-paired data in the latent subspace, we construct a cross-view similarity graph with the help of anchor data pairs. SPDH jointly learns the latent features and hash codes with a factorization-based coding scheme. For the formulated objective function, we devise an efficient alternating optimization algorithm in which the key binary code learning problem is solved in a bit-by-bit manner, with each bit generated in closed form. The proposed method is extensively evaluated on four benchmark datasets under both fully-paired and semi-paired settings, and the results demonstrate the superiority of SPDH over several other state-of-the-art methods in terms of both accuracy and scalability.

17.
Springerplus ; 5(1): 1714, 2016.
Article in English | MEDLINE | ID: mdl-27777850

ABSTRACT

We propose a blind image quality assessment that is highly unsupervised and training free. The new method is based on the hypothesis that the effect caused by distortion can be expressed by certain latent characteristics. Combined with probabilistic latent semantic analysis, the latent characteristics can be discovered by applying a topic model over a visual word dictionary. Four distortion-affected features are extracted to form the visual words in the dictionary: (1) the block-based local histogram; (2) the block-based local mean value; (3) the mean value of contrast within a block; (4) the variance of contrast within a block. Based on the dictionary, the latent topics in the images can be discovered. The discrepancy between the frequency of the topics in an unfamiliar image and a large number of pristine images is applied to measure the image quality. Experimental results for four open databases show that the newly proposed method correlates well with human subjective judgments of diversely distorted images.
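The four block-wise features can be computed with simple numpy reshaping. In this sketch, per-pixel absolute deviation from the block mean stands in for "contrast" (an assumption on our part, since the abstract does not pin down the contrast operator), and intensities are assumed to lie in [0, 1]:

```python
import numpy as np

def block_features(img, block=8, bins=8):
    """Per-block features: normalized local histogram (`bins` values),
    local mean, and the mean and variance of a per-pixel contrast proxy
    (absolute deviation from the block mean). Returns one row per block."""
    img = np.asarray(img, dtype=float)
    h = (img.shape[0] // block) * block
    w = (img.shape[1] // block) * block
    blocks = img[:h, :w].reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(-1, block * block)
    feats = []
    for b in blocks:
        hist, _ = np.histogram(b, bins=bins, range=(0.0, 1.0))
        contrast = np.abs(b - b.mean())  # simple contrast proxy
        feats.append(np.concatenate(
            [hist / b.size, [b.mean(), contrast.mean(), contrast.var()]]))
    return np.array(feats)
```

These per-block vectors would then be quantized into the visual-word dictionary over which the topic model is trained.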

18.
PLoS One ; 10(1): e0116315, 2015.
Article in English | MEDLINE | ID: mdl-25617769

ABSTRACT

Visual target tracking is a primary task in many computer vision applications and has been widely studied in recent years. Among tracking methods, the mean shift algorithm has attracted extraordinary interest and has been well developed over the past decade due to its excellent performance. However, it is still challenging for color histogram-based algorithms to handle complex target tracking, so algorithms based on other distinctive features are strongly needed. In this paper, we propose a novel target tracking algorithm based on mean shift theory, in which a new type of image feature is introduced and utilized to find the corresponding region between neighboring frames. The target histogram is created by clustering the features obtained by the extraction strategy. The mean shift process is then adopted to calculate the target location iteratively. Experimental results demonstrate that the proposed algorithm can handle challenging tracking situations such as partial occlusion, illumination change, scale variation, object rotation, and complex background clutter, while outperforming several state-of-the-art methods.
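The core mean shift iteration, moving a window to the weighted centroid of a per-pixel target-likelihood map until it stops shifting, can be sketched as follows. This is a generic version over a precomputed likelihood map (e.g., a histogram back-projection), not the paper's feature-clustering variant:

```python
import numpy as np

def mean_shift(weight_map, start, win=10, n_iter=20, tol=0.5):
    """Iterate a (2*win+1)-sized window toward the weighted centroid of
    `weight_map` (per-pixel target likelihoods), starting at `start`
    given as (row, col). Returns the converged (row, col)."""
    y, x = float(start[0]), float(start[1])
    for _ in range(n_iter):
        y0 = int(max(y - win, 0)); y1 = int(min(y + win + 1, weight_map.shape[0]))
        x0 = int(max(x - win, 0)); x1 = int(min(x + win + 1, weight_map.shape[1]))
        patch = weight_map[y0:y1, x0:x1]
        total = patch.sum()
        if total == 0:
            break  # no target evidence inside the window
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ny = (patch * ys).sum() / total
        nx = (patch * xs).sum() / total
        shift = np.hypot(ny - y, nx - x)
        y, x = ny, nx
        if shift < tol:
            break
    return y, x
```

Each step is a gradient-ascent move on the likelihood surface, which is why mean shift converges in very few iterations when the target blob overlaps the initial window.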


Subject(s)
Artificial Intelligence , Algorithms , Color , Models, Theoretical , Pattern Recognition, Automated
19.
Med Image Comput Comput Assist Interv ; 9351: 190-197, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26942233

ABSTRACT

Accurate segmentation of anatomical structures in medical images is very important in neuroscience studies. Recently, multi-atlas patch-based label fusion methods have achieved many successes; they generally represent each target patch over an atlas patch dictionary in the image domain and then predict the latent label by directly applying the estimated representation coefficients in the label domain. However, due to the large gap between these two domains, the representation coefficients estimated in the image domain may not be optimal for label fusion. To overcome this dilemma, we propose a novel label fusion framework that makes the weighting coefficients optimal for label fusion by progressively constructing a dynamic dictionary in a layer-by-layer manner, where a sequence of intermediate patch dictionaries gradually encodes the transition from the patch representation coefficients in the image domain to the optimal weights for label fusion. Our proposed framework is general enough to augment the label fusion performance of current state-of-the-art methods. In our experiments, we apply the proposed method to hippocampus segmentation on the ADNI dataset and achieve more accurate labeling results than the counterpart methods with a single-layer dictionary.

20.
IEEE Trans Image Process ; 24(1): 236-48, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25494504

ABSTRACT

Unlike photometric images, depth images resolve the distance ambiguity of the scene, but properties such as weak texture, high noise, and low resolution may limit the representation ability of well-developed descriptors that were elaborately designed for photometric images. In this paper, a novel depth descriptor, the geodesic invariant feature (GIF), is presented for representing the parts of articulated objects in depth images. GIF is a multilevel feature representation framework designed around the nature of depth images. At the low level, a geodesic gradient is introduced to obtain invariance to articulated motion, such as scale and rotation variation. At the mid level, superpixel clustering is applied to reduce depth image redundancy, resulting in faster processing and better robustness to noise. At the high level, a deep network is used to exploit the nonlinearity of the data, which further improves classification accuracy. The proposed descriptor encodes the local structures in the depth data effectively and efficiently. Comparisons with state-of-the-art methods reveal the superiority of the proposed method.
