Results 1 - 20 of 21
1.
Med Image Anal ; 91: 103023, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37956551

ABSTRACT

Self-supervised learning (SSL) has achieved remarkable progress in medical image segmentation. The application of an SSL algorithm often follows a two-stage training process: using unlabeled data to perform label-free representation learning and fine-tuning the pre-trained model on the downstream tasks. One issue of this paradigm is that the SSL step is unaware of the downstream task, which may lead to sub-optimal feature representation for a target task. In this paper, we propose a hybrid pre-training paradigm that is driven by both self-supervised and supervised objectives. To achieve this, a supervised reference task is involved in self-supervised learning, aiming to improve the representation quality. Specifically, we employ the off-the-shelf medical image segmentation task as reference, and encourage learning a representation that (1) incurs low prediction loss on both SSL and reference tasks and (2) leads to a similar gradient when updating the feature extractor from either task. In this way, the reference task pilots SSL in the direction beneficial for the downstream segmentation. To this end, we propose a simple but effective gradient matching method to optimize the model towards a consistent direction, thus improving the compatibility of both SSL and supervised reference tasks. We call this hybrid pre-training paradigm reference-guided self-supervised learning (ReFs), and perform it on a large-scale unlabeled dataset and an additional reference dataset. The experimental results demonstrate its effectiveness on seven downstream medical image segmentation benchmarks.
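
The gradient-matching idea in the abstract above can be illustrated with a small sketch. It measures how well the gradients from the SSL task and the reference task agree, and penalizes disagreement. The cosine-based penalty, the function names, and the weight `lam` are assumptions made for illustration only; the abstract does not state the exact form of the matching objective.

```python
import math


def cosine_similarity(g_a, g_b):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no direction to compare against
    return dot / (norm_a * norm_b)


def hybrid_objective(loss_ssl, loss_ref, grad_ssl, grad_ref, lam=1.0):
    """Hypothetical objective: both task losses plus a gradient-mismatch penalty.

    The penalty is 0 when the two gradients point the same way and 2 when
    they point in opposite directions.
    """
    mismatch = 1.0 - cosine_similarity(grad_ssl, grad_ref)
    return loss_ssl + loss_ref + lam * mismatch
```

In a real training loop the two gradient vectors would come from backpropagating each loss through the shared feature extractor; here they are plain lists so the sketch stays self-contained.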


Subjects
Algorithms; Benchmarking; Humans; Supervised Machine Learning; Image Processing, Computer-Assisted
2.
IEEE Trans Image Process ; 32: 4800-4811, 2023.
Article in English | MEDLINE | ID: mdl-37610890

ABSTRACT

Cross-resolution person re-identification (CRReID) is a challenging and practical problem that involves matching low-resolution (LR) query identity images against high-resolution (HR) gallery images. Query images often suffer from resolution degradation due to the different capturing conditions from real-world cameras. State-of-the-art solutions for CRReID either learn a resolution-invariant representation or adopt a super-resolution (SR) module to recover the missing information from the LR query. In this paper, we propose an alternative SR-free paradigm to directly compare HR and LR images via a dynamic metric that is adaptive to the resolution of a query image. We realize this idea by learning resolution-adaptive representations for cross-resolution comparison. We propose two resolution-adaptive mechanisms to achieve this. The first mechanism encodes the resolution specifics into different subvectors in the penultimate layer of the deep neural network, creating a varying-length representation. To better extract resolution-dependent information, we further propose to learn resolution-adaptive masks for intermediate residual feature blocks. A novel progressive learning strategy is proposed to train those masks properly. These two mechanisms are combined to boost the performance of CRReID. Experimental results show that the proposed method outperforms existing approaches and achieves state-of-the-art performance on multiple CRReID benchmarks.

3.
IEEE Trans Neural Netw Learn Syst ; 34(11): 8589-8601, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35263259

ABSTRACT

Person image generation conditioned on natural language allows us to personalize image editing in a user-friendly manner. This fashion, however, involves different granularities of semantic relevance between texts and visual content. Given a sentence describing an unknown person, we propose a novel pose-guided multi-granularity attention architecture to synthesize the person image in an end-to-end manner. To determine what content to draw at a global outline, the sentence-level description and pose feature maps are incorporated into a U-Net architecture to generate a coarse person image. To further enhance the fine-grained details, we propose to draw the human body parts with highly correlated textual nouns and determine the spatial positions with respect to target pose points. Our model is premised on a conditional generative adversarial network (GAN) that translates language description into a realistic person image. The proposed model is coupled with two-stream discriminators: 1) text-relevant local discriminators to improve the fine-grained appearance by identifying the region-text correspondences at the finer manipulation and 2) a global full-body discriminator to regulate the generation via a pose-weighting feature selection. Extensive experiments conducted on benchmarks validate the superiority of our method for person image generation.

4.
J Pharm Biomed Anal ; 223: 115139, 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36379100

ABSTRACT

Endogenous steroids, including sex hormones and bile acids, are a group of essential compounds with various biological functions. In this study, we developed an LC-MS method that simultaneously measures 14 sex hormones and metabolites (SH) and 32 bile acids (BA) in rat plasma. Multiple innovative approaches were applied to increase the sensitivity and specificity, including optimization of the mobile phases, gradients, and dynamic multiple reaction monitoring (DMRM) transitions. The method was validated and applied to plasma samples from pregnant rats before and 0.5 h after an oral glucose tolerance test (OGTT) at gestational days 0.5 and 18.5. Results showed that the method was applicable, and 9 SH and 30 BA were measurable in the samples. In summary, this method is applicable to studies of SH and BA in rat plasma, and may also be used with other matrices and species.


Subjects
Bile Acids and Salts; Tandem Mass Spectrometry; Rats; Animals; Tandem Mass Spectrometry/methods; Chromatography, Liquid/methods; Plasma; Gonadal Steroid Hormones
5.
IEEE Trans Neural Netw Learn Syst ; 33(6): 2454-2465, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34170831

ABSTRACT

Weakly supervised anomaly detection aims at learning an anomaly detector from a limited amount of labeled data and abundant unlabeled data. Recent works build deep neural networks for anomaly detection by discriminatively mapping the normal samples and abnormal samples to different regions in the feature space or fitting different distributions. However, due to the limited number of annotated anomaly samples, directly training networks with the discriminative loss may not be sufficient. To overcome this issue, this article proposes a novel strategy to transform the input data into a more meaningful representation that could be used for anomaly detection. Specifically, we leverage an autoencoder to encode the input data and utilize three factors (hidden representation, reconstruction residual vector, and reconstruction error) as the new representation for the input data. This representation amounts to encoding a test sample with its projection on the training data manifold, its direction to its projection, and its distance to its projection. In addition to this encoding, we also propose a novel network architecture to seamlessly incorporate those three factors. From our extensive experiments, the benefits of the proposed strategy are clearly demonstrated by its superior performance over the competitive methods. Code is available at: https://github.com/yj-zhou/Feature_Encoding_with_AutoEncoders_for_Weakly-supervised_Anomaly_Detection.
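
The three-factor representation described in the abstract above can be sketched with a toy linear autoencoder. This is only an illustration: the orthonormal-basis encoder and decoder stand in for a trained network, and normalizing the residual to unit length follows the abstract's "direction to its projection" wording, which is an interpretation rather than a confirmed detail. All function names are hypothetical and are not taken from the linked repository.

```python
import math


def encode(x, basis):
    """Toy linear encoder: project x onto each (orthonormal) basis vector."""
    return [sum(xi * bi for xi, bi in zip(x, b)) for b in basis]


def decode(h, basis):
    """Toy linear decoder: rebuild the input from its hidden code."""
    dim = len(basis[0])
    return [sum(hk * b[i] for hk, b in zip(h, basis)) for i in range(dim)]


def three_factor_representation(x, basis):
    """Return (hidden code, unit residual direction, reconstruction error)."""
    h = encode(x, basis)
    x_hat = decode(h, basis)
    residual = [xi - ri for xi, ri in zip(x, x_hat)]
    error = math.sqrt(sum(r * r for r in residual))
    # Assumption: the residual is normalized so that it carries only direction,
    # while its magnitude is kept separately as the reconstruction error.
    direction = [r / error for r in residual] if error > 0 else residual
    return h, direction, error


# A 3-D sample reconstructed from a 1-D hidden code.
h, direction, error = three_factor_representation([3.0, 4.0, 0.0], [[1.0, 0.0, 0.0]])
features = h + direction + [error]  # concatenated input for a downstream detector
```

Concatenating the three factors is one simple way to feed them to a classifier; the abstract mentions a dedicated architecture for combining them, which this sketch does not attempt to reproduce.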

6.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6140-6152, 2022 10.
Article in English | MEDLINE | ID: mdl-34125669

ABSTRACT

This paper tackles the problem of training a deep convolutional neural network of both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging due to the non-differentiability of the quantizer, which may result in substantial accuracy loss. To address this, we propose three practical approaches, including (i) progressive quantization; (ii) stochastic precision; and (iii) joint knowledge distillation to improve the network training. First, for progressive quantization, we propose two schemes to progressively find good local minima. Specifically, we propose to first optimize a network with quantized weights and subsequently quantize activations. This is in contrast to the traditional methods which optimize them simultaneously. Furthermore, we propose a second progressive quantization scheme which gradually decreases the bitwidth from high-precision to low-precision during training. Second, to alleviate the excessive training burden due to the multi-round training stages, we further propose a one-stage stochastic precision strategy to randomly sample and quantize sub-networks while keeping other parts in full-precision. Finally, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one. By doing so, the full-precision model provides hints to guide the low-precision model training and significantly improves the performance of the low-precision network. Extensive experiments on various datasets (e.g., CIFAR-100, ImageNet) show the effectiveness of the proposed methods.
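
To make the quantization setting in the abstract above concrete, here is a minimal sketch of a uniform quantizer and a bitwidth schedule that steps from high to low precision. The `[0, 1]` input range, the rounding scheme, and the one-bit-per-stage schedule are assumptions chosen for illustration; the abstract does not specify which quantizer or schedule the authors use.

```python
def quantize_uniform(x, bits):
    """Uniformly quantize a value in [0, 1] onto a grid with 2**bits levels.

    The rounding step is piecewise constant, so its derivative is zero almost
    everywhere. That is the non-differentiability that makes low-precision
    training hard.
    """
    steps = (1 << bits) - 1          # number of intervals between grid points
    clipped = min(max(x, 0.0), 1.0)  # values outside [0, 1] saturate
    return round(clipped * steps) / steps


def progressive_bitwidths(start_bits, end_bits):
    """Hypothetical schedule: lower the bitwidth by one bit per training stage."""
    return list(range(start_bits, end_bits - 1, -1))
```

A progressive scheme in this spirit would train at each bitwidth in `progressive_bitwidths(8, 2)` in turn, initializing each stage from the previous one.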


Subjects
Algorithms; Neural Networks, Computer
7.
IEEE Trans Image Process ; 30: 6594-6608, 2021.
Article in English | MEDLINE | ID: mdl-34270425

ABSTRACT

Semantic segmentation is a crucial image understanding task, where each pixel of an image is categorized into a corresponding label. Since the pixel-wise labeling for ground-truth is tedious and labor intensive, in practical applications, many works exploit synthetic images to train the model for real-world image semantic segmentation, i.e., Synthetic-to-Real Semantic Segmentation (SRSS). However, Deep Convolutional Neural Networks (CNNs) trained on the source synthetic data may not generalize well to the target real-world data. To address this problem, there has been rapidly growing interest in Domain Adaptation techniques to mitigate the domain mismatch between the synthetic and real-world images. Domain Generalization is another way to handle SRSS. In contrast to Domain Adaptation, Domain Generalization seeks to address SRSS without accessing any data of the target domain during training. In this work, we propose two simple yet effective texture randomization mechanisms, Global Texture Randomization (GTR) and Local Texture Randomization (LTR), for Domain Generalization based SRSS. GTR is proposed to randomize the texture of source images into diverse unreal texture styles. It aims to alleviate the reliance of the network on texture while promoting the learning of the domain-invariant cues. In addition, we find that the texture difference does not always occur across the entire image and may only appear in some local areas. Therefore, we further propose an LTR mechanism to generate diverse local regions for partially stylizing the source images. Finally, we implement a regularization of Consistency between GTR and LTR (CGL) aiming to harmonize the two proposed mechanisms during training. Extensive experiments on five publicly available datasets (i.e., GTA5, SYNTHIA, Cityscapes, BDDS and Mapillary) with various SRSS settings (i.e., GTA5/SYNTHIA to Cityscapes/BDDS/Mapillary) demonstrate that the proposed method is superior to the state-of-the-art methods for domain generalization based SRSS.

8.
Small ; 16(38): e2003543, 2020 09.
Article in English | MEDLINE | ID: mdl-32812355

ABSTRACT

The progress of antitumor immunotherapy is usually limited by tumor-associated macrophages (TAMs), which account for the highest proportion of immunosuppressive cells in the tumor microenvironment, and the immunosuppressive effect of TAMs can be reversed by modulating their M2-like phenotype. Herein, a biomimetic polymer magnetic nanocarrier is developed that selectively targets and polarizes TAMs to potentiate immunotherapy of breast cancer. This nanocarrier, PLGA-ION-R837@M (PIR@M), is obtained, first, by fabricating magnetic polymer nanoparticles (NPs) encapsulating Fe3O4 NPs and the Toll-like receptor 7 (TLR7) agonist imiquimod (R837) and, second, by coating lipopolysaccharide (LPS)-treated macrophage membranes on the surface of the NPs for targeting TAMs. The intracellular uptake of PIR@M can greatly polarize TAMs from the M2 to the antitumor M1 phenotype through the synergy of Fe3O4 NPs and R837. The mechanism of the polarization is studied in depth by analyzing the mRNA expression of the signaling pathways. In contrast to previous reports, the polarization is ascribed to the fact that Fe3O4 NPs mainly activate the IRF5 signaling pathway via iron ions instead of the reactive oxygen species-induced NF-κB signaling pathway. The anticancer effect can be effectively enhanced through potentiating immunotherapy by the polarization of the TAMs with the combination of Fe3O4 NPs and R837.


Subjects
Polymers; Tumor-Associated Macrophages; Biomimetics; Humans; Immunotherapy; Interferon Regulatory Factors; Magnetic Phenomena
9.
IEEE Trans Pattern Anal Mach Intell ; 42(7): 1654-1669, 2020 07.
Article in English | MEDLINE | ID: mdl-30835211

ABSTRACT

Landmark/pose estimation in single monocular images has received much effort in computer vision due to its important applications. It remains a challenging task when input images come with severe occlusions caused by, e.g., adverse camera views. Under such circumstances, biologically implausible pose predictions may be produced. In contrast, human vision is able to predict poses by exploiting geometric constraints of landmark point inter-connectivity. To address the problem, by incorporating priors about the structure of pose components, we propose a novel structure-aware fully convolutional network to implicitly take such priors into account during training of the deep network. Explicit learning of such constraints is typically challenging. Instead, inspired by how humans identify implausible poses, we design discriminators to distinguish the real poses from the fake ones (such as biologically implausible ones). If the pose generator G generates results that the discriminator fails to distinguish from real ones, the network successfully learns the priors. Training of the network follows the strategy of conditional Generative Adversarial Networks (GANs). The effectiveness of the proposed network is evaluated on three pose-related tasks: 2D human pose estimation, 2D facial landmark estimation and 3D human pose estimation. The proposed approach significantly outperforms several state-of-the-art methods and almost always generates plausible pose predictions, demonstrating the usefulness of implicit learning of structures using GANs.

10.
J Control Release ; 313: 42-53, 2019 11 10.
Article in English | MEDLINE | ID: mdl-31629039

ABSTRACT

Tumor-associated macrophage (TAM)-related immunotherapy is a greatly promising strategy that involves altering the immunosuppressive tumor microenvironment with the immunomodulator imiquimod (R837) for enhanced cancer therapy. However, the function of R837 is seriously limited due to poor water solubility and a lack of targeting ability. Here, we developed two types of targeting polymer micelles to separately deliver R837 and the anticancer drug doxorubicin (DOX) to TAMs and tumor cells via intratumoral injection and intravenous injection, respectively, for enhanced cancer chemo-immunotherapy against breast cancer. After these micelles accumulated in the tumor tissues, the immunostimulating micelles released R837, which bound to the TLR-7 receptor on the lysosomal membrane within the TAM, stimulating the maturation of the TAM, thereby causing an antitumor immune response and relieving the immunosuppressive effect in the tumor microenvironment. Simultaneously, the chemotherapeutic micelles released DOX in the cytoplasm of the tumor cells, directly inducing cell death. As a result, a synergistic combination of chemotherapy and immunotherapy was achieved through these nanomedicines, which separately activated the antitumor immune response and inhibited tumor cell growth. Therefore, this strategy is a new avenue for the development of targeting nanomedicines for combination chemo-immunotherapy against malignant cancer.


Subjects
Antineoplastic Agents/chemistry; Breast Neoplasms/drug therapy; Doxorubicin/chemistry; Drug Carriers/chemistry; Imiquimod/chemistry; Immunologic Factors/chemistry; Macrophages/drug effects; Polymers/chemistry; Animals; Antineoplastic Agents/pharmacology; Apoptosis/drug effects; Boronic Acids/chemistry; Cell Line, Tumor; Cell Proliferation/drug effects; Delayed-Action Preparations/chemistry; Doxorubicin/pharmacology; Drug Therapy, Combination; Female; Humans; Imiquimod/pharmacology; Immunologic Factors/pharmacology; Immunotherapy; Mice; Mice, Inbred BALB C; Micelles; Neoplasms, Experimental/drug therapy; Polyesters/chemistry; Polyethylene Glycols/chemistry; Toll-Like Receptor 7/metabolism; Tumor Microenvironment/drug effects
11.
IEEE Trans Image Process ; 28(12): 6116-6125, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31265400

ABSTRACT

Humans are capable of learning a new fine-grained concept with very little supervision, e.g., a few exemplary images for a species of bird, yet our best deep learning systems need hundreds or thousands of labeled examples. In this paper, we try to reduce this gap by studying the fine-grained image recognition problem in a challenging few-shot learning setting, termed few-shot fine-grained recognition (FSFG). The task of FSFG requires the learning systems to build classifiers for the novel fine-grained categories from few examples (only one or fewer than five). To solve this problem, we propose an end-to-end trainable deep network, which is inspired by the state-of-the-art fine-grained recognition model and is tailored for the FSFG task. Specifically, our network consists of a bilinear feature learning module and a classifier mapping module: while the former encodes the discriminative information of an exemplar image into a feature vector, the latter maps the intermediate feature into the decision boundary of the novel category. The key novelty of our model is a "piecewise mappings" function in the classifier mapping module, which generates the decision boundary via learning a set of more attainable sub-classifiers in a more parameter-economic way. We learn the exemplar-to-classifier mapping based on an auxiliary dataset in a meta-learning fashion, which is expected to be able to generalize to novel categories. By conducting comprehensive experiments on three fine-grained datasets, we demonstrate that the proposed method achieves superior performance over the competing baselines.

12.
ACS Appl Mater Interfaces ; 10(21): 17672-17684, 2018 May 30.
Article in English | MEDLINE | ID: mdl-29737828

ABSTRACT

Clinical chemotherapy confronts a challenge resulting from cancer-related multidrug resistance (MDR), which can directly lead to treatment failure. To address it, an innovative approach is proposed to construct a light-activated reactive oxygen species (ROS)-responsive nanoplatform based on a protoporphyrin (PpIX)-conjugated and dual chemotherapeutics-loaded polymer micelle. This system combines chemotherapy and photodynamic therapy (PDT) to defeat the MDR of tumors. Such an intelligent nanocarrier can prolong the circulation time in blood because of the negative polysaccharide component of chondroitin sulfate, and is subsequently internalized selectively by MCF-7/ADR cells [doxorubicin (DOX)-resistant]. When exposed to 635 nm red light, this nanoplatform generates sufficient ROS through the photoconversion of PpIX, further triggering the dissociation of the micelles to release the dual cargoes. Afterward, the released apatinib, serving as a reversal inhibitor of MDR, can restore the chemosensitivity to DOX by competitively inhibiting the P-glycoprotein drug pump in drug-resistant tumor cells, and the excessive ROS has a strong capacity to exert its PDT effect on the mitochondria or the nuclei, ultimately causing cell apoptosis. As expected, this intelligent nanosystem successfully reverses tumor MDR via the synergism between apatinib-enhanced DOX sensitivity and ROS-mediated PDT performance.


Subjects
Reactive Oxygen Species/chemistry; Cell Line, Tumor; Doxorubicin; Drug Resistance, Neoplasm; Humans; Photochemotherapy; Pyridines
13.
IEEE Trans Pattern Anal Mach Intell ; 39(12): 2335-2348, 2017 12.
Article in English | MEDLINE | ID: mdl-28092518

ABSTRACT

Derived from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC. To alleviate this limitation, in this work, we break the convention which assumes that a local feature is drawn from one of a few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as a linear combination of multiple key components, and the combination weight is a latent random variable. In doing so we greatly enhance the representative power of the generative model underlying FVC. To implement our idea, we design two particular generative models following this compositional approach. In our first model, the mean vector is sampled from the subspace spanned by a set of bases and the combination weight is drawn from a Laplace distribution. In our second model, we further assume that a local feature is composed of a discriminative part and a residual part. As a result, a local feature is generated by the linear combination of discriminative part bases and residual part bases. The decomposition of the discriminative and residual parts is achieved via the guidance of a pre-trained supervised coding method. By calculating the gradient vector of the proposed models, we derive two new Fisher vector coding strategies. The first is termed Sparse Coding-based Fisher Vector Coding (SCFVC) and can be used as the substitute of traditional GMM based FVC. The second is termed Hybrid Sparse Coding-based Fisher vector coding (HSCFVC) since it combines the merits of both pre-trained supervised coding methods and FVC. Using pre-trained Convolutional Neural Network (CNN) activations as local features, we experimentally demonstrate that the proposed methods are superior to traditional GMM based FVC and achieve state-of-the-art performance in various image classification tasks.

14.
IEEE Trans Neural Netw Learn Syst ; 28(2): 308-320, 2017 02.
Article in English | MEDLINE | ID: mdl-26742149

ABSTRACT

Appropriately merging visual words is an effective dimension reduction method for the bag-of-visual-words model in image classification. The approach of hierarchically merging visual words has been extensively employed, because it gives a fully determined merging hierarchy. Existing supervised hierarchical merging methods take different approaches and realize the merging process with various formulations. In this paper, we propose a unified hierarchical merging approach built upon the graph-embedding framework. Our approach is able to merge visual words for any scenario, where a preferred structure and an undesired structure are defined, and, therefore, can effectively attend to all kinds of requirements for the word-merging process. In terms of computational efficiency, we show that our algorithm can seamlessly integrate a fast search strategy developed in our previous work and, thus, maintain the state-of-the-art merging speed. To the best of our knowledge, the proposed approach is the first one that addresses hierarchical visual word merging in such a flexible and unified manner. As demonstrated, it can maintain excellent image classification performance even after a significant dimension reduction, and outperform all the existing comparable visual word-merging methods. In a broad sense, our work provides an open platform for applying, evaluating, and developing new criteria for hierarchical word-merging tasks.

15.
IEEE Trans Pattern Anal Mach Intell ; 39(11): 2305-2313, 2017 11.
Article in English | MEDLINE | ID: mdl-27959804

ABSTRACT

Recent studies have shown that a Deep Convolutional Neural Network (DCNN) trained on a large image dataset can be used as a universal image descriptor and that doing so leads to impressive performance for a variety of image recognition tasks. Most of these studies adopt activations from a single DCNN layer, usually a fully-connected layer, as the image representation. In this paper, we propose a novel way to extract image representations from two consecutive convolutional layers: one layer is used for local feature extraction and the other serves as guidance to pool the extracted features. By taking different viewpoints of convolutional layers, we further develop two schemes to realize this idea. The first directly uses convolutional layers from a DCNN. The second applies the pre-trained CNN on densely sampled image regions and treats the fully-connected activations of each image region as a convolutional layer's feature activations. We then train another convolutional layer on top of that as the pooling-guidance convolutional layer. By applying our method to three popular visual classification tasks, we find that our first scheme tends to perform better on applications which demand strong discrimination on lower-level visual patterns while the second excels in cases that require discrimination on category-level patterns. Overall, the proposed method achieves superior performance over existing approaches for extracting image representations from a DCNN. In addition, we apply cross-layer pooling to the problem of image retrieval and propose schemes to reduce the computational cost. Experimental results suggest that the proposed method achieves promising results for the image retrieval task.
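
One way to picture the cross-layer pooling idea in the abstract above: each channel of the guidance layer supplies a spatial weight map, and the local features from the other layer are summed under that map, one pooled block per channel. The weighted-sum form below is an assumed reading of "serves as guidance to pool the extracted features"; the abstract does not give the formula, and the function and argument names are hypothetical.

```python
def cross_layer_pool(local_features, guidance_maps):
    """Pool local features using another layer's activations as weights.

    local_features: one feature vector per spatial position.
    guidance_maps:  one activation vector per spatial position (same order),
                    with one entry per guidance channel.
    Returns the concatenation of one weighted sum per guidance channel.
    """
    n_channels = len(guidance_maps[0])
    dim = len(local_features[0])
    pooled = []
    for k in range(n_channels):
        block = [0.0] * dim
        for x, a in zip(local_features, guidance_maps):
            for d in range(dim):
                block[d] += a[k] * x[d]  # position weighted by channel k
        pooled.extend(block)
    return pooled


# Two spatial positions, 2-D local features, two guidance channels.
representation = cross_layer_pool([[1.0, 2.0], [3.0, 4.0]],
                                  [[1.0, 0.0], [0.0, 2.0]])
```

The output length is the number of guidance channels times the local feature dimension, which suggests why the abstract mentions schemes for reducing computational cost in the retrieval setting.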

16.
IEEE Trans Pattern Anal Mach Intell ; 38(2): 224-37, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26761730

ABSTRACT

Compact and discriminative visual codebooks are preferred in many visual recognition tasks. In the literature, a number of works have taken the approach of hierarchically merging visual words of an initial large-sized codebook, but implemented this approach with different merging criteria. In this work, we propose a single probabilistic framework to unify these merging criteria, by identifying two key factors: the function used to model the class-conditional distribution and the method used to estimate the distribution parameters. More importantly, by adopting new distribution functions and/or parameter estimation methods, our framework can readily produce a spectrum of novel merging criteria. Three of them are specifically discussed in this paper. For the first criterion, we adopt the multinomial distribution with the Bayesian method. For the second criterion, we integrate the Gaussian distribution with maximum likelihood parameter estimation. For the third criterion, which shows the best merging performance, we propose a max-margin-based parameter estimation method and apply it with the multinomial distribution. An extensive experimental study is conducted to systematically analyze the performance of the above three criteria and compare them with existing ones. As demonstrated, the best criterion within our framework achieves the overall best merging performance among the compared merging criteria developed in the literature.

17.
IEEE Trans Pattern Anal Mach Intell ; 38(11): 2269-2283, 2016 11.
Article in English | MEDLINE | ID: mdl-26731636

ABSTRACT

Due to their causal semantics, Bayesian networks (BNs) have been widely employed to discover the underlying data relationships in exploratory studies, such as brain research. Despite its success in modeling the probability distribution of variables, a BN is naturally a generative model, which is not necessarily discriminative. As a result, subtle but critical network changes that are of investigative value across populations may be overlooked. In this paper, we propose to improve the discriminative power of BN models for continuous variables from two different perspectives. This brings two general discriminative learning frameworks for Gaussian Bayesian networks (GBN). In the first framework, we employ the Fisher kernel to bridge the generative models of GBN and the discriminative classifiers of SVMs, and convert the GBN parameter learning to Fisher kernel learning via minimizing a generalization error bound of SVMs. In the second framework, we employ the max-margin criterion and build it directly upon GBN models to explicitly optimize the classification performance of the GBNs. The advantages and disadvantages of the two frameworks are discussed and experimentally compared. Both of them demonstrate strong power in learning discriminative parameters of GBNs for neuroimaging based brain network analysis, as well as maintaining reasonable representation capacity. The contributions of this paper also include a new Directed Acyclic Graph (DAG) constraint with theoretical guarantee to ensure the graph validity of GBN.

18.
IEEE Trans Image Process ; 24(7): 2110-23, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25823035

ABSTRACT

Graph matching has been widely used in both the image processing and computer vision domains due to its powerful performance for structural pattern representation. However, it poses three challenges to image sparse feature matching: 1) the combinatorial nature limits the size of the possible matches; 2) it is sensitive to outliers because its objective function prefers more matches; and 3) it works poorly when handling many-to-many object correspondences, due to its assumption of one single cluster of true matches. In this paper, we address these challenges with a unified framework called density maximization (DM), which maximizes the values of a proposed graph density estimator both locally and globally. DM leads to the integration of feature matching, outlier elimination, and cluster detection. Experimental evaluation demonstrates that it significantly boosts the true matches and enables graph matching to handle both outliers and many-to-many object correspondences. We also extend it to dense correspondence estimation and obtain a large improvement over the state-of-the-art methods. We further demonstrate the usefulness of our methods using three applications: 1) instance-level image retrieval; 2) mask transfer; and 3) image enhancement.

19.
Article in English | MEDLINE | ID: mdl-25320815

ABSTRACT

Recently, neuroimaging data have been increasingly used to study the causal relationship among brain regions for the understanding and diagnosis of brain diseases. Recent work on sparse Gaussian Bayesian network (SGBN) has shown it as an efficient tool to learn large scale directional brain networks from neuroimaging data. In this paper, we propose a learning approach to constructing SGBNs that are both representative and discriminative for groups in comparison. A max-margin criterion built directly upon the SGBN models is proposed to effectively optimize the classification performance of the SGBNs. The proposed method shows significant improvements over the state-of-the-art works in the discriminative power of SGBNs.


Subjects
Alzheimer Disease/pathology; Artificial Intelligence; Connectome/methods; Image Interpretation, Computer-Assisted/methods; Nerve Net/pathology; Neuroimaging/methods; Pattern Recognition, Automated/methods; Alzheimer Disease/diagnostic imaging; Bayes Theorem; Discriminant Analysis; Humans; Nerve Net/diagnostic imaging; Radionuclide Imaging; Reproducibility of Results; Sensitivity and Specificity
20.
IEEE Trans Pattern Anal Mach Intell ; 36(3): 417-35, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24457501

ABSTRACT

In image recognition with the bag-of-features model, a small-sized visual codebook is usually preferred to obtain a low-dimensional histogram representation and high computational efficiency. Such a visual codebook has to be discriminative enough to achieve excellent recognition performance. To create a compact and discriminative codebook, in this paper we propose to merge the visual words in a large-sized initial codebook by maximally preserving class separability. We first show that this results in a difficult optimization problem. To deal with this situation, we devise a suboptimal but very efficient hierarchical word-merging algorithm, which optimally merges two words at each level of the hierarchy. By exploiting the characteristics of the class separability measure and designing a novel indexing structure, the proposed algorithm can hierarchically merge 10,000 visual words down to two words in merely 90 seconds. Also, to show the properties of the proposed algorithm and reveal its advantages, we conduct detailed theoretical analysis to compare it with another hierarchical word-merging algorithm that maximally preserves mutual information, obtaining interesting findings. Experimental studies are conducted to verify the effectiveness of the proposed algorithm on multiple benchmark data sets. As shown, it can efficiently produce more compact and discriminative codebooks than the state-of-the-art hierarchical word-merging algorithms, especially when the size of the codebook is significantly reduced.
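
The hierarchical merging procedure in the abstract above can be sketched as a greedy loop that, at each level, merges the pair of visual words whose merger best preserves a class-separability score. This is a brute-force illustration only: the score used here (squared distance between two class-mean histograms) is a stand-in, and the sketch omits the indexing structure that the abstract credits for the algorithm's speed. All names are hypothetical.

```python
from itertools import combinations


def merge_bins(hist, i, j):
    """Merge word j into word i by summing their counts (requires i < j)."""
    merged = list(hist)
    merged[i] += merged[j]
    del merged[j]
    return merged


def separability(class_a, class_b):
    """Stand-in score: squared distance between two class-mean histograms."""
    return sum((a - b) ** 2 for a, b in zip(class_a, class_b))


def hierarchical_merge(class_a, class_b, target_size):
    """Greedily merge two words per level until target_size words remain."""
    a, b = list(class_a), list(class_b)
    target_size = max(target_size, 1)  # a codebook needs at least one word
    while len(a) > target_size:
        # Try every pair and keep the merger with the highest score.
        i, j = max(
            combinations(range(len(a)), 2),
            key=lambda p: separability(merge_bins(a, *p), merge_bins(b, *p)),
        )
        a, b = merge_bins(a, i, j), merge_bins(b, i, j)
    return a, b


# Words 0 and 2 both favor class A, so merging them loses the least separation.
merged_a, merged_b = hierarchical_merge([4.0, 0.0, 2.0], [0.0, 4.0, 0.0], 2)
```

Each level here costs a scan over all pairs, which is far slower than the timing reported in the abstract; the point is only to show the merge-two-words-per-level structure.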
