Results 1 - 20 of 33
1.
Article in English | MEDLINE | ID: mdl-38319760

ABSTRACT

Unsupervised graph-structure learning (GSL), which aims to learn an effective graph structure for arbitrary downstream tasks from the data itself without any label guidance, has recently received increasing attention in various real applications. Although several existing unsupervised GSL methods have achieved superior performance on different graph analytical tasks, how to utilize the popular graph masked autoencoder to sufficiently acquire effective supervision information from the data itself, and thereby improve the effectiveness of the learned graph structure, has not been effectively explored so far. To tackle this issue, we present a multilevel contrastive graph masked autoencoder (MCGMAE) for unsupervised GSL. Specifically, we first introduce a graph masked autoencoder with a dual feature-masking strategy to reconstruct the same input graph-structured data under two scenarios: the original structure generated by the data itself and the learned graph structure. Then, an inter- and intra-class contrastive loss is introduced to maximize the mutual information at the feature and graph-structure reconstruction levels simultaneously. More importantly, this inter- and intra-class contrastive loss is also applied to the graph encoder module to further strengthen their agreement at the feature-encoder level. In comparison to existing unsupervised GSL methods, our proposed MCGMAE can effectively improve the training robustness of unsupervised GSL via different levels of supervision information from the data itself. Extensive experiments on three graph analytical tasks and eight datasets validate the effectiveness of the proposed MCGMAE.
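The two ingredients described above — feature masking and a contrastive objective over two views — can be sketched as follows. This is a minimal illustration, assuming a random dimension-wise mask and an InfoNCE-style loss; it is not the paper's exact dual-masking or inter-/intra-class formulation.

```python
import numpy as np

def mask_features(x, mask_ratio=0.5, seed=0):
    """Zero out a random subset of feature dimensions (one masking 'view')."""
    rng = np.random.default_rng(seed)
    mask = rng.random(x.shape[1]) < mask_ratio
    x_masked = x.copy()
    x_masked[:, mask] = 0.0
    return x_masked, mask

def contrastive_loss(z1, z2, tau=0.5):
    """InfoNCE-style loss: matching rows of z1/z2 are positives,
    every other row is a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = (z1 @ z2.T) / tau
    sim = sim - sim.max(axis=1, keepdims=True)       # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Minimizing this loss pulls corresponding embeddings of the two reconstructions together while pushing apart those of different nodes, which is the agreement-maximizing effect the abstract describes.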

2.
Article in English | MEDLINE | ID: mdl-37289610

ABSTRACT

Sparse additive machines (SAMs) have shown competitive performance on variable selection and classification in high-dimensional data due to their representation flexibility and interpretability. However, existing methods often employ unbounded or nonsmooth functions as surrogates of the 0-1 classification loss, which may lead to degraded performance on data with outliers. To alleviate this problem, we propose a robust classification method, named SAM with the correntropy-induced loss (CSAM), which integrates the correntropy-induced loss (C-loss), a data-dependent hypothesis space, and the weighted ℓq,1-norm regularizer (q ≥ 1) into additive machines. In theory, the generalization error bound is estimated via a novel error decomposition and concentration-estimation techniques, which shows that a convergence rate of O(n^(-1/4)) can be achieved under proper parameter conditions. In addition, a theoretical guarantee on variable-selection consistency is analyzed. Experimental evaluations on both synthetic and real-world datasets consistently validate the effectiveness and robustness of the proposed approach.
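The C-loss at the heart of CSAM is a bounded, smooth surrogate of the 0-1 loss. A minimal sketch, assuming the standard correntropy-induced form with scale σ and normalizing constant β (the additive-machine and regularizer parts are omitted):

```python
import math

def c_loss(margin, sigma=1.0):
    """Correntropy-induced loss (C-loss): a bounded, smooth surrogate of the
    0-1 classification loss. `margin` is y * f(x); the loss is 0 at margin 1
    and saturates at beta as the margin goes to minus infinity."""
    beta = 1.0 / (1.0 - math.exp(-1.0 / (2.0 * sigma ** 2)))
    return beta * (1.0 - math.exp(-(1.0 - margin) ** 2 / (2.0 * sigma ** 2)))
```

Because the loss saturates for large negative margins, a single outlier contributes at most β to the empirical risk, which is the source of the robustness claimed above.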

3.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 12844-12861, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37015683

ABSTRACT

Zero-shot learning (ZSL) tackles the novel class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Semantic knowledge is typically represented by attribute descriptions shared between different classes, which act as strong priors for localizing object attributes that represent discriminative region features, enabling significant and sufficient visual-semantic interaction for advancing ZSL. Existing attention-based models learn inferior region features from a single image by solely using unidirectional attention, which ignores the transferable and discriminative attribute localization of visual features that represents the key semantic knowledge for effective knowledge transfer in ZSL. In this paper, we propose a cross attribute-guided Transformer network, termed TransZero++, to refine visual features and learn accurate attribute localization for key semantic knowledge representations in ZSL. Specifically, TransZero++ employs an attribute → visual Transformer sub-net (AVT) and a visual → attribute Transformer sub-net (VAT) to learn attribute-based visual features and visual-based attribute features, respectively. By further introducing feature-level and prediction-level semantic collaborative losses, the two attribute-guided transformers teach each other to learn semantic-augmented visual embeddings for key semantic knowledge representations via semantic collaborative learning. Finally, the semantic-augmented visual embeddings learned by AVT and VAT are fused to conduct desirable visual-semantic interaction, cooperating with class semantic vectors for ZSL classification. Extensive experiments show that TransZero++ achieves new state-of-the-art results on three golden ZSL benchmarks and on the large-scale ImageNet dataset. The project website is available at: https://shiming-chen.github.io/TransZero-pp/TransZero-pp.html.

4.
Med Phys ; 50(7): 4325-4339, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36708251

ABSTRACT

BACKGROUND: In brain tumor magnetic resonance image (MRI) segmentation, although 3D convolutional neural networks (CNNs) have achieved state-of-the-art results, the class and hard-voxel imbalances in 3D images have not been well addressed. Voxel-independent losses depend on the setting of class weights for the class imbalance issue, and it is hard to weight each class equally. Region-related losses cannot correctly focus on hard voxels dynamically and are not robust to misclassification of small structures. Meanwhile, repeatedly training on the additional hard samples augmented by existing methods brings more class imbalance, overfitting, and incorrect knowledge to the model. PURPOSE: A novel region-related loss with balanced dynamic weighting that alleviates the sensitivity to small structures is necessary. In addition, we need to increase the diversity of hard samples in training to improve the performance of the model. METHODS: The proposed Region-related Focal Loss (RFL) reshapes the standard Dice Loss (DL) by up-weighting the loss assigned to hard-classified voxels. Compared to DL, RFL adaptively modulates its gradient around an invariant focalized point: voxels with lower confidence than this point receive a larger gradient, and higher-confidence voxels receive a smaller one. Meanwhile, RFL's parameters can be adjusted to set where and how strongly the network focuses. In addition, an Intra-classly Transformed Augmentation network (ITA-NET) is proposed to increase the diversity of hard samples, in which a 3D registration network and an intra-class transfer layer transform shape and intensity, respectively. A selective hard-sample mining (SHSM) strategy is used to train the ITA-NET to avoid excessive class imbalance. Source code (in TensorFlow) is available at: https://github.com/lb-whu/RFL_ITA. RESULTS: Experiments are carried out on a public dataset: the Brain Tumor Segmentation Challenge 2020 (BraTS2020). Experiments with the BraTS2020 online validation set show that the proposed methods achieve average Dice scores of 0.905, 0.821, and 0.806 for whole tumor (WT), tumor core (TC), and enhancing tumor (ET), respectively. Compared with DL (baseline), the proposed RFL significantly improves the Dice scores by an average of 1%, and for the small region ET it can even increase them by 3%. The proposed method combined with ITA-NET improves the Dice scores of ET and TC by 5% and 3%, respectively. CONCLUSIONS: The proposed RFL converges with an invariant focalized point during training of the segmentation network, thus effectively alleviating the hard-voxel imbalance in brain tumor MRI segmentation. The negative region term of RFL effectively reduces the sensitivity of the segmentation model to misclassification of small structures. The proposed ITA-NET increases the diversity of hard samples by transforming their shape and transferring their intra-class intensity, thereby effectively improving the robustness of the segmentation network to hard samples.
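A generic focal/Dice hybrid conveys the idea of up-weighting hard cases within a region-based loss. This sketch simply raises the soft-Dice complement to a power γ < 1, so gradients grow as the overlap worsens; it is an illustration only, not the paper's exact RFL formulation with its invariant focalized point.

```python
import numpy as np

def focal_dice_loss(prob, target, gamma=0.5, eps=1e-7):
    """Soft Dice over the region, with the (1 - Dice) term raised to
    gamma < 1 so that poorly segmented (hard) cases are emphasized.
    prob: predicted foreground probabilities; target: binary labels."""
    inter = np.sum(prob * target)
    dice = (2.0 * inter + eps) / (np.sum(prob) + np.sum(target) + eps)
    return (1.0 - dice) ** gamma
```

A confident, accurate prediction drives the loss toward zero, while the γ exponent keeps the penalty (and its gradient) comparatively large for hard regions such as small structures.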


Subject(s)
Brain Neoplasms , Image Processing, Computer-Assisted , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Imaging, Three-Dimensional/methods
5.
IEEE Trans Cybern ; 53(7): 4630-4641, 2023 Jul.
Article in English | MEDLINE | ID: mdl-34919528

ABSTRACT

The divide-and-conquer strategy is a very effective way of dealing with big data. Noisy samples in big data usually have a great impact on algorithmic performance. In this article, we introduce Markov sampling and different weights for distributed learning with the classical support vector machine (cSVM). We first estimate the generalization error of the weighted distributed cSVM algorithm with uniformly ergodic Markov chain (u.e.M.c.) samples and obtain its optimal convergence rate. As applications, we obtain the generalization bounds of weighted distributed cSVM with strong mixing observations and with independent and identically distributed (i.i.d.) samples, respectively. We also propose a novel weighted distributed cSVM based on Markov sampling (DM-cSVM). Numerical studies on benchmark datasets show that the DM-cSVM algorithm not only performs better but also requires less total sampling and training time than other distributed algorithms.
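The idea of drawing training samples as a Markov chain, rather than i.i.d., can be sketched with a Metropolis-style acceptance rule. This is a simplified illustration under assumed choices (uniform proposals, acceptance proportional to a loss ratio that favors more informative samples); it is not the paper's exact u.e.M.c. sampling scheme.

```python
import random

def markov_sample(data, loss_fn, n, seed=0):
    """Draw n samples forming a Markov chain over `data`: a uniformly
    proposed candidate replaces the current sample with probability
    min(1, loss(candidate) / loss(current)), so higher-loss (more
    informative) samples are visited more often."""
    rng = random.Random(seed)
    current = rng.choice(data)
    out = [current]
    while len(out) < n:
        cand = rng.choice(data)
        ratio = loss_fn(cand) / max(loss_fn(current), 1e-12)
        if ratio >= 1.0 or rng.random() < ratio:
            current = cand          # accept the candidate
        out.append(current)
    return out
```

Consecutive draws are dependent by construction, which is exactly the setting the u.e.M.c. generalization analysis above addresses.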


Subject(s)
Algorithms , Support Vector Machine , Learning , Markov Chains
6.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7541-7554, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35120009

ABSTRACT

Recent weakly supervised semantic segmentation methods generate pseudolabels to recover the position information lost in weak labels for training the segmentation network. Unfortunately, those pseudolabels often contain mislabeled regions and inaccurate boundaries due to the incomplete recovery of position information, which in turn degrades the resulting semantic segmentation to a certain degree. In this article, we decompose the position information into two components, high-level semantic information and low-level physical information, and develop a componentwise approach to recover each component independently. Specifically, we propose a simple yet effective pseudolabel updating mechanism that iteratively corrects mislabeled regions inside objects to precisely refine high-level semantic information. To reconstruct low-level physical information, we utilize a customized superpixel-based random walk mechanism to trim the boundaries. Finally, we design a novel network architecture, a dual-feedback network (DFN), to integrate the two mechanisms into a unified model. Experiments on benchmark datasets show that DFN outperforms existing state-of-the-art methods in terms of mean intersection-over-union (mIoU).
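Random-walk label propagation over a region graph, the family of techniques the boundary-trimming step draws on, can be sketched as diffusion with clamped seeds. The adjacency, seeding, and step count here are illustrative assumptions, not the paper's customized superpixel construction.

```python
import numpy as np

def random_walk_labels(adj, seed_labels, steps=50):
    """Propagate soft labels over a (superpixel) graph: labels diffuse along
    the row-normalized adjacency, and confidently seeded nodes are re-clamped
    after every step. Returns the hard label per node."""
    P = adj / adj.sum(axis=1, keepdims=True)   # row-stochastic transitions
    L = seed_labels.astype(float).copy()
    seeded = seed_labels.any(axis=1)
    for _ in range(steps):
        L = P @ L
        L[seeded] = seed_labels[seeded]        # clamp the seeds
    return L.argmax(axis=1)
```

Unseeded nodes settle to a weighted blend of the seed labels reachable from them, so boundaries get assigned to whichever object region dominates their neighborhood.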

7.
Article in English | MEDLINE | ID: mdl-36107889

ABSTRACT

Despite the great success of existing work in fine-grained visual categorization (FGVC), several challenges remain unsolved, e.g., poor interpretability and vaguely attributed contributions. To circumvent these drawbacks, motivated by the hypersphere embedding method, we propose a discriminative suprasphere embedding (DSE) framework, which can provide intuitive geometric interpretation and effectively extract discriminative features. Specifically, DSE consists of three modules. The first module is a suprasphere embedding (SE) block, which learns discriminative information by emphasizing weight and phase. The second module is a phase activation map (PAM), used to analyze the contribution of local descriptors to the suprasphere feature representation; it uniformly highlights the object region and exhibits remarkable object localization capability. The last module is a class contribution map (CCM), which quantitatively analyzes the network's classification decision and provides insight into the domain knowledge about classified objects. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of our proposed method in comparison with state-of-the-art methods.

8.
Article in English | MEDLINE | ID: mdl-35507624

ABSTRACT

Zero-shot learning (ZSL) tackles the unseen class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Typically, to guarantee desirable knowledge transfer, a direct embedding is adopted for associating the visual and semantic domains in ZSL. However, most existing ZSL methods focus on learning the embedding from implicit global features or image regions to the semantic space. Thus, they fail to: 1) exploit the appearance relationship priors between various local regions in a single image, which correspond to the semantic information, and 2) learn cooperative global and local features jointly for discriminative feature representations. In this article, we propose the novel graph navigated dual attention network (GNDAN) for ZSL to address these drawbacks. GNDAN employs a region-guided attention network (RAN) and a region-guided graph attention network (RGAT) to jointly learn a discriminative local embedding and incorporate global context for exploiting explicit global embeddings under the guidance of a graph. Specifically, RAN uses soft spatial attention to discover discriminative regions for generating local embeddings. Meanwhile, RGAT employs attribute-based attention to obtain attribute-based region features, where each attribute focuses on the most relevant image regions. Motivated by the graph neural network (GNN), which is beneficial for structural relationship representations, RGAT further leverages a graph attention network to exploit the relationships between the attribute-based region features for explicit global embedding representations. Based on the self-calibration mechanism, the learned joint visual embedding is matched with the semantic embedding to form the final prediction. Extensive experiments on three benchmark datasets demonstrate that the proposed GNDAN achieves superior performance compared to state-of-the-art methods. Our code and trained models are available at https://github.com/shiming-chen/GNDAN.
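The graph-attention building block RGAT relies on can be sketched as a single-head layer in the style of the standard GAT: edge logits are computed from concatenated projected features, masked by the adjacency, softmax-normalized, and used to aggregate neighbors. The shapes and parameterization here are illustrative, not GNDAN's exact architecture.

```python
import numpy as np

def graph_attention_layer(H, A, W, a):
    """Single-head graph attention: H (n x d) node features, A (n x n)
    adjacency, W (d x d') projection, a (2*d') attention vector.
    Returns attention-aggregated node features (n x d')."""
    Z = H @ W                                       # projected features
    n = Z.shape[0]
    logits = np.array([[a @ np.concatenate([Z[i], Z[j]]) for j in range(n)]
                       for i in range(n)])
    logits = np.where(A > 0, logits, -1e9)          # keep only real edges
    logits = logits - logits.max(axis=1, keepdims=True)
    att = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return att @ Z
```

Each output row is a learned, adjacency-constrained weighted average of its neighbors' features, which is how relationships between attribute-based region features can be exploited.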

9.
Med Phys ; 48(11): 6962-6975, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34494276

ABSTRACT

PURPOSE: In neonatal brain magnetic resonance image (MRI) segmentation, a model trained on the training set (source domain) often performs poorly in clinical practice (target domain). As the labels of target-domain images are unavailable, this cross-domain segmentation needs unsupervised domain adaptation (UDA) to make the model adapt to the target domain. However, the shape and intensity distribution of neonatal brain MRI images across the domains are largely different from those of adults. Current UDA methods aim to make synthesized images similar to the target domain as a whole, but it is impossible to synthesize images with intraclass similarity because of the regional misalignment caused by the cross-domain difference. This results in generating intraclassly incorrect intensity information from target-domain images. To address this issue, we propose IAS-NET, a framework that joins an intraclassly adaptive generative adversarial network (IA-NET) with segmentation to bridge the gap between the two domains for intraclass alignment. METHODS: Our proposed IAS-NET is an elegant learning framework that transfers the appearance of images across the domains from both image and feature perspectives. It consists of the proposed IA-NET and a segmentation network (S-NET). The proposed IA-NET is a GAN-based adaptive network that contains one generator (including two encoders and one shared decoder) and four discriminators for cross-domain transfer. The two encoders extract original-image, mean, and variance features from the source and target domains. The proposed local adaptive instance normalization algorithm performs intraclass feature alignment to the target domain at the feature-map level. S-NET is a U-Net-structured network that provides a semantic constraint via a segmentation loss for the training of IA-NET. Meanwhile, it offers pseudo-label images for calculating intraclass features of the target domain. Source code (in TensorFlow) is available at https://github.com/lb-whu/RAS-NET/. RESULTS: Extensive experiments are carried out on two different datasets (NeoBrainS12 and dHCP). There are great differences in the shape, size, and intensity distribution of magnetic resonance (MR) images between the two databases. Compared to the baseline, we improve the average Dice score of all tissues on NeoBrainS12 by 6% through adaptive training with unlabeled dHCP images. We also conduct experiments on dHCP and improve the average Dice score by 4%. Quantitative analysis of the mean and variance of the synthesized images shows that images synthesized by the proposed method are closer to the target domain, both across the full brain and within each class, than those of the compared methods. CONCLUSIONS: The proposed IAS-NET effectively improves the performance of S-NET through its intraclass feature alignment in the target domain. Compared to current UDA methods, the images synthesized by IAS-NET are more intraclassly similar to the target domain for neonatal brain MR images. It therefore achieves state-of-the-art results among the compared UDA models for the segmentation task.
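The adaptive instance normalization that the method builds on can be sketched in its standard global form: normalize the content features per channel, then re-scale with the style features' per-channel statistics. The paper's *local* variant applies this per class region using pseudo-labels; the whole-channel version below is only the underlying mechanism.

```python
import numpy as np

def adaptive_instance_norm(content, style, eps=1e-5):
    """Standard AdaIN sketch: content features (C, H, W) are normalized to
    zero mean / unit variance per channel, then shifted and scaled to the
    style features' per-channel mean and standard deviation."""
    c_mu = content.mean(axis=(1, 2), keepdims=True)
    c_sd = content.std(axis=(1, 2), keepdims=True)
    s_mu = style.mean(axis=(1, 2), keepdims=True)
    s_sd = style.std(axis=(1, 2), keepdims=True)
    return s_sd * (content - c_mu) / (c_sd + eps) + s_mu
```

After the transform, the content layout is preserved while the first- and second-order intensity statistics match the target (style) domain — the alignment the abstract describes, here without the per-class restriction.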


Subject(s)
Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Algorithms , Brain/diagnostic imaging , Magnetic Resonance Spectroscopy
10.
IEEE Trans Pattern Anal Mach Intell ; 43(1): 139-156, 2021 01.
Article in English | MEDLINE | ID: mdl-31331881

ABSTRACT

With the expansion of data, increasingly imbalanced data have emerged. When the imbalance ratio (IR) of data is high, most existing imbalanced learning methods decline seriously in classification performance. In this paper, we systematically investigate the highly imbalanced data classification problem and propose an uncorrelated cost-sensitive multiset learning (UCML) approach for it. Specifically, UCML first constructs multiple balanced subsets through random partition and then employs multiset feature learning (MFL) to learn discriminant features from the constructed multiset. To enhance the usability of each subset and deal with the non-linearity issue existing in each subset, we further propose a deep metric based UCML (DM-UCML) approach. DM-UCML introduces the generative adversarial network technique into the multiset constructing process, such that each subset can share a similar distribution with the original dataset. To cope with the non-linearity issue, DM-UCML integrates deep metric learning with MFL, such that more favorable performance can be achieved. In addition, DM-UCML designs a new discriminant term to enhance the discriminability of learned metrics. Experiments on eight traditional highly class-imbalanced datasets and two large-scale datasets indicate that the proposed approaches outperform state-of-the-art highly imbalanced learning methods and are more robust to high IR.
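The balanced-subset construction step can be sketched directly: randomly partition the majority class into minority-sized chunks and pair each chunk with the full minority class. This is a plain random partition, without the GAN-based refinement DM-UCML adds.

```python
import random

def balanced_subsets(majority, minority, seed=0):
    """Randomly partition the majority class into chunks of roughly the
    minority-class size; each chunk plus the full minority class forms
    one balanced subset of the multiset."""
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    k = max(len(minority), 1)
    return [(shuffled[i:i + k], minority) for i in range(0, len(shuffled), k)]
```

With an imbalance ratio of r, this yields about r balanced subsets, each covering a disjoint slice of the majority class, so no majority sample is discarded the way simple undersampling would.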

11.
IEEE Trans Neural Netw Learn Syst ; 32(3): 1204-1216, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32287021

ABSTRACT

Low-rank Multiview Subspace Learning (LMvSL) has shown great potential in cross-view classification in recent years. Despite their empirical success, existing LMvSL-based methods are incapable of handling view discrepancy and discriminancy well simultaneously, which thus leads to performance degradation when there is a large discrepancy among multiview data. To circumvent this drawback, motivated by block-diagonal representation learning, we propose structured low-rank matrix recovery (SLMR), a unique method that effectively removes view discrepancy and improves discriminancy through the recovery of a structured low-rank matrix. Furthermore, recent low-rank modeling provides a satisfactory solution only for data contaminated by noise that follows a predefined distribution, such as Gaussian or Laplacian. However, such models are not practical, since complicated noise in practice may violate those assumptions and the distribution is generally unknown in advance. To alleviate this limitation, modal regression is elegantly incorporated into the framework of SLMR (termed MR-SLMR). Different from previous LMvSL-based methods, our MR-SLMR can handle any zero-mode noise variable, which covers a wide range of noise such as Gaussian noise, random noise, and outliers. The alternating direction method of multipliers (ADMM) framework and half-quadratic theory are used to efficiently optimize MR-SLMR. Experimental results on four public databases demonstrate the superiority of MR-SLMR and its robustness to complicated noise.

12.
Urology ; 142: 183-189, 2020 08.
Article in English | MEDLINE | ID: mdl-32445770

ABSTRACT

OBJECTIVE: To reliably and quickly diagnose children with posterior urethral valves (PUV), we developed a multi-instance deep learning method to automate image analysis. METHODS: We built a robust pattern classifier to distinguish 86 children with PUV from 71 children with mild unilateral hydronephrosis based on ultrasound images (3504 in sagittal view and 2558 in transverse view) obtained during routine clinical care. RESULTS: The multi-instance deep learning classifier performed better than classifiers built on either single sagittal images or single transverse images. In particular, the deep learning classifiers built on single images in the sagittal view and single images in the transverse view obtained area under the receiver operating characteristic curve (AUC) values of 0.796 ± 0.064 and 0.815 ± 0.071, respectively. AUC values of the multi-instance deep learning classifiers built on images in the sagittal and transverse views with a mean pooling operation were 0.949 ± 0.035 and 0.954 ± 0.033, respectively. The multi-instance deep learning classifier built on images in both the sagittal and transverse views with a mean pooling operation obtained an AUC of 0.961 ± 0.026, with a classification rate of 0.925 ± 0.060, specificity of 0.986 ± 0.032, and sensitivity of 0.873 ± 0.120. Discriminative regions of the kidney located using classification activation mapping demonstrated that the deep learning techniques could identify meaningful anatomical features from ultrasound images. CONCLUSION: The multi-instance deep learning method provides an automatic and accurate means to extract informative features from ultrasound images and discriminate infants with PUV from male children with unilateral hydronephrosis.


Subject(s)
Deep Learning , Hydronephrosis/diagnosis , Image Interpretation, Computer-Assisted/methods , Urogenital Abnormalities/diagnosis , Vesico-Ureteral Reflux/diagnosis , Case-Control Studies , Diagnosis, Differential , Feasibility Studies , Female , Humans , Infant , Infant, Newborn , Kidney/abnormalities , Kidney/diagnostic imaging , Male , ROC Curve , Reproducibility of Results , Ultrasonography/methods , Urethra/abnormalities , Urethra/diagnostic imaging
13.
Proc IEEE Int Symp Biomed Imaging ; 2020: 1347-1350, 2020 Apr.
Article in English | MEDLINE | ID: mdl-33850604

ABSTRACT

Ultrasound images are widely used for diagnosis of congenital abnormalities of the kidney and urinary tract (CAKUT). Since a typical clinical ultrasound image captures 2D information of a specific view plane of the kidney, and images of the same kidney on different planes have varied appearances, it is challenging to develop a computer-aided diagnosis tool that is robust to ultrasound images in different views. To overcome this problem, we developed a multi-instance deep learning method to build a robust pattern classifier that distinguishes children with CAKUT from controls based on their clinical ultrasound images in sagittal and transverse views obtained during routine clinical care, aiming to automatically diagnose CAKUT in children from ultrasound imaging data. The classifier was built on imaging features derived using transfer learning from a pre-trained deep learning model, with a mean pooling operator for fusing instance-level classification results. Experimental results have demonstrated that the multi-instance deep learning classifier performed better than classifiers built on either individual sagittal slices or individual transverse slices.

14.
IEEE Trans Cybern ; 50(8): 3640-3653, 2020 Aug.
Article in English | MEDLINE | ID: mdl-30794195

ABSTRACT

We present a novel cross-view classification algorithm where the gallery and probe data come from different views. A popular approach to this problem is multiview subspace learning (MvSL), which aims to learn a latent subspace shared by multiview data. Despite promising results obtained on some applications, the performance of existing methods deteriorates dramatically when the multiview data are sampled from nonlinear manifolds or suffer from heavy outliers. To circumvent this drawback, motivated by the divide-and-conquer strategy, we propose multiview hybrid embedding (MvHE), a unique method that divides the problem of cross-view classification into three subproblems and builds one model for each. Specifically, the first model is designed to remove view discrepancy, whereas the second and third models attempt to discover the intrinsic nonlinear structure and to increase the discriminability of intra-view and inter-view samples, respectively. A kernel extension is conducted to further boost the representation power of MvHE. Extensive experiments are conducted on four benchmark datasets. Our methods demonstrate overwhelming advantages over state-of-the-art MvSL-based cross-view classification approaches in terms of classification accuracy and robustness.

15.
Med Image Anal ; 60: 101602, 2020 02.
Article in English | MEDLINE | ID: mdl-31760193

ABSTRACT

It remains challenging to automatically segment kidneys in clinical ultrasound (US) images due to the kidneys' varied shapes and image intensity distributions, although semi-automatic methods have achieved promising performance. In this study, we propose subsequent boundary distance regression and pixel classification networks to segment the kidneys automatically. Particularly, we first use deep neural networks pre-trained for classification of natural images to extract high-level image features from US images. These features are used as input to learn kidney boundary distance maps using a boundary distance regression network, and the predicted boundary distance maps are classified as kidney pixels or non-kidney pixels using a pixelwise classification network in an end-to-end learning fashion. We also adopt a data-augmentation method based on kidney shape registration to generate enriched training data from a small number of US images with manually segmented kidney labels. Experimental results have demonstrated that our method could automatically segment the kidney with promising performance, significantly better than deep learning-based pixel classification networks.
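The regression target here is a boundary distance map: for each pixel, the distance to the nearest kidney-boundary pixel. A brute-force sketch for a binary mask (in practice one would use a fast distance transform such as SciPy's `distance_transform_edt`):

```python
import numpy as np

def boundary_distance_map(mask):
    """For each pixel, the Euclidean distance to the nearest boundary pixel
    of the binary mask. A foreground pixel is 'boundary' if any 4-neighbor
    is background or lies outside the image. Brute force, for clarity."""
    h, w = mask.shape
    boundary = []
    for i in range(h):
        for j in range(w):
            if mask[i, j]:
                nb = [mask[x, y] for x, y in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                      if 0 <= x < h and 0 <= y < w]
                if len(nb) < 4 or not all(nb):
                    boundary.append((i, j))
    dist = np.zeros((h, w))
    if not boundary:
        return dist
    for i in range(h):
        for j in range(w):
            dist[i, j] = min(((i - bi) ** 2 + (j - bj) ** 2) ** 0.5
                             for bi, bj in boundary)
    return dist
```

Regressing this smooth map, rather than a hard binary label, exploits the observation that kidney boundaries look relatively consistent across images.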


Subject(s)
Kidney Diseases/diagnostic imaging , Neural Networks, Computer , Pattern Recognition, Automated , Ultrasonography/methods , Aged, 80 and over , Datasets as Topic , Deep Learning , Female , Humans , Male
16.
Proc IEEE Int Symp Biomed Imaging ; 2019: 1741-1744, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31803348

ABSTRACT

It remains challenging to automatically segment kidneys in clinical ultrasound images due to the kidneys' varied shapes and image intensity distributions, although semi-automatic methods have achieved promising performance. In this study, we developed a novel boundary distance regression deep neural network to segment the kidneys, informed by the fact that the kidney boundaries are relatively consistent across images in terms of their appearance. Particularly, we first use deep neural networks pre-trained for classification of natural images to extract high-level image features from ultrasound images, then these feature maps are used as input to learn kidney boundary distance maps using a boundary distance regression network, and finally the predicted boundary distance maps are classified as kidney pixels or non-kidney pixels using a pixel classification network in an end-to-end learning fashion. Experimental results have demonstrated that our method could effectively improve the performance of automatic kidney segmentation, significantly better than deep learning based pixel classification networks.

17.
Article in English | MEDLINE | ID: mdl-31893285

ABSTRACT

Ultrasound imaging (US) is commonly used in nephrology for diagnostic studies of the kidneys and lower urinary tract. However, it remains challenging to automate disease diagnosis based on clinical 2D US images, since they provide partial anatomic information about the kidney and 2D images of the same kidney may have heterogeneous appearance. To overcome this challenge, we develop a novel multi-instance deep learning method that builds a robust classifier by treating the multiple 2D US images of each individual subject as multiple instances of one bag. Particularly, we adopt convolutional neural networks (CNNs) to learn instance-level features from 2D US kidney images and graph convolutional networks (GCNs) to further optimize the instance-level features by exploring potential correlations among instances of the same bag. We also adopt gated attention-based MIL pooling to learn bag-level features using fully connected neural networks (FCNs). Finally, we integrate both instance-level and bag-level supervision to further improve the bag-level classification accuracy. Ablation studies and comparison results have demonstrated that our method could accurately diagnose kidney diseases using ultrasound imaging, with better performance than alternative state-of-the-art multi-instance deep learning methods.
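Gated attention-based MIL pooling can be sketched in the standard form of Ilse et al.: each instance is scored by a gated tanh/sigmoid attention, the scores are softmax-normalized across the bag, and the bag feature is the attention-weighted sum. The shapes and parameters below are illustrative; the paper combines this with CNN/GCN instance features.

```python
import numpy as np

def gated_attention_pool(H, V, U, w):
    """Gated attention MIL pooling: H (n_instances x d) instance features,
    V, U (d x k) gate projections, w (k,) scoring vector. Returns the
    attention-weighted bag feature and the attention weights."""
    gate = np.tanh(H @ V) * (1.0 / (1.0 + np.exp(-(H @ U))))   # gated scores
    scores = gate @ w
    scores = scores - scores.max()                             # stability
    a = np.exp(scores) / np.exp(scores).sum()                  # attention
    return a @ H, a
```

Unlike mean or max pooling, the learned weights let the bag representation focus on the diagnostically informative images while down-weighting uninformative views.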

18.
Article in English | MEDLINE | ID: mdl-30072322

ABSTRACT

Video-based person re-identification (re-id) is an important application in practice. Since large variations exist between different pedestrian videos, as well as within each video, it is challenging to conduct re-identification between pedestrian videos. In this paper, we propose a simultaneous intra-video and inter-video distance learning (SI2DL) approach for video-based person re-id. Specifically, SI2DL simultaneously learns an intra-video distance metric and an inter-video distance metric from the training videos. The intra-video distance metric is used to make each video more compact, and the inter-video one is used to ensure that the distance between truly matching videos is smaller than that between wrongly matching videos. Considering that the goal of distance learning is to make truly matching video pairs from different persons well separated from each other, we also propose a pair-separation-based SI2DL (P-SI2DL). P-SI2DL aims to learn a pair of distance metrics under which any two truly matching video pairs can be well separated. Experiments on four public pedestrian image sequence datasets show that our approaches achieve state-of-the-art performance.

19.
Magn Reson Med ; 80(5): 2173-2187, 2018 11.
Article in English | MEDLINE | ID: mdl-29672917

ABSTRACT

PURPOSE: The low signal-to-noise ratio and limited scan time of diffusion magnetic resonance imaging (dMRI) in current clinical settings impede obtaining images with high spatial and angular resolution (HSAR) for reliable fiber reconstruction with fine anatomical details. To overcome this problem, we propose a joint space-angle regularization approach to reconstruct HSAR diffusion signals from a single 4D low-resolution (LR) dMRI, which is down-sampled in both 3D space and q-space. METHODS: Different from existing works, which combine multiple 4D LR diffusion images acquired using specific acquisition protocols, the proposed method reconstructs HSAR dMRI from only a single 4D dMRI by exploring and integrating two key priors: nonlocal self-similarity in the spatial domain as a prior to increase spatial resolution, and ridgelet approximations in the diffusion domain as another prior to increase angular resolution. To more effectively capture nonlocal self-similarity in the spatial domain, a novel 3D block-based nonlocal means filter is imposed as the 3D image-space regularization term, which is accurate in measuring similarity and fast for 3D reconstruction. To reduce computational complexity, we use the L2-norm instead of a sparsity constraint on the representation coefficients. RESULTS: Experimental results demonstrate that the proposed method can obtain HSAR dMRI efficiently, with approximately 2% per-voxel root-mean-square error between the actual and reconstructed HSAR dMRI. CONCLUSION: The proposed approach can effectively increase the spatial and angular resolution of dMRI independently of the acquisition protocol, thus overcoming the inherent resolution limitation of imaging systems.
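The nonlocal self-similarity prior can be illustrated with a 1-D nonlocal means filter: each sample is replaced by a weighted average of all samples, with weights decaying in the distance between the patches around them. The paper uses 3D blocks; the 1-D version below, with assumed patch size and bandwidth h, only shows the mechanism.

```python
import numpy as np

def nonlocal_means_1d(signal, patch=1, h=0.5):
    """1-D nonlocal means: weights between samples i and j decay with the
    squared distance between the patches centered at i and j, so similar
    (possibly distant) structures reinforce each other."""
    n = len(signal)
    pad = np.pad(signal, patch, mode='edge')
    patches = np.array([pad[i:i + 2 * patch + 1] for i in range(n)])
    out = np.empty(n)
    for i in range(n):
        d2 = ((patches - patches[i]) ** 2).mean(axis=1)
        w = np.exp(-d2 / (h * h))
        out[i] = (w * signal).sum() / w.sum()
    return out
```

Because averaging is driven by patch similarity rather than spatial proximity, repeated structures anywhere in the signal (or volume) contribute to denoising each occurrence, which is what makes self-similarity usable as a reconstruction prior.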


Subject(s)
Diffusion Magnetic Resonance Imaging/methods , Imaging, Three-Dimensional/methods , Algorithms , Brain/diagnostic imaging , Databases, Factual , Humans , Signal-To-Noise Ratio
20.
IEEE Trans Neural Netw Learn Syst ; 29(4): 1328-1341, 2018 04.
Article in English | MEDLINE | ID: mdl-28749357

ABSTRACT

Support vector machine (SVM) is one of the most widely used learning algorithms for classification problems. Although SVM performs well in practical applications, its algorithmic complexity is high when the number of training samples is large. In this paper, we introduce an SVM classification (SVMC) algorithm based on -times Markov sampling and present numerical studies on the learning performance of SVMC with -times Markov sampling on benchmark datasets. The experimental results show that the SVMC algorithm with -times Markov sampling not only has smaller misclassification rates and less sampling and training time, but also yields a sparser classifier, compared with classical SVMC and the previously known SVMC algorithm based on Markov sampling. We also discuss the performance of SVMC with -times Markov sampling in the cases of unbalanced training samples and large-scale training samples.
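The classical SVM the paper builds on can be sketched as a toy linear classifier trained by sub-gradient descent on the regularized hinge loss (a Pegasos-style update). This is a stand-in for illustration; the paper's contribution is the Markov sampling scheme around the SVM, not this solver, and the learning rate, regularization, and epoch count below are arbitrary choices.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimal linear SVM: sub-gradient descent on the L2-regularized hinge
    loss, max(0, 1 - y * (w.x + b)) + lam/2 * ||w||^2, one sample at a time."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:           # margin violated: hinge active
                w = w - lr * (lam * w - yi * xi)
                b = b + lr * yi
            else:                                # only the regularizer acts
                w = w - lr * lam * w
    return w, b
```

The per-sample update loop is also where a Markov sampling scheme would plug in: instead of sweeping the dataset in order, samples would be drawn as a Markov chain.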
