1.
IEEE Trans Image Process ; 33: 3009-3020, 2024.
Article in English | MEDLINE | ID: mdl-38625760

ABSTRACT

Deep cross-modal hashing retrieval has recently made significant progress. However, existing methods generally learn hash functions with pairwise or triplet supervisions, which involves learning the relevant information by splicing partial similarity between data pairs; notably, this approach only captures the data similarity locally and incompletely, resulting in sub-optimal retrieval performance. In this paper, we propose a novel Multi-Relational Deep Hashing (MRDH) approach, which can fully bridge the modality gap by comprehensively modeling the similarity relationship between data in different modalities. In more detail, to investigate the inter-modal relationships, we constrain the consistency of cross-modal pairwise similarities to maintain the semantic similarity across modalities. Moreover, to further capture complete similarity information, we design a new similarity metric, which we term cross-modal global similarity, by encouraging hash codes of similar data pairs from different modalities to approach a common center and hash codes for dissimilar pairs to converge to different centers. Adopting this approach enables our model to generate more discriminative hash codes. Extensive experiments on three benchmark datasets demonstrate the superiority of our method on cross-modal hashing retrieval.
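
As a rough illustration of the center-based idea described above, the following PyTorch sketch pulls the continuous hash codes of similar image/text pairs toward a shared learnable center and pushes them away from the centers of other classes. All names (`global_similarity_loss`, the margin value, the toy dimensions) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def global_similarity_loss(img_codes, txt_codes, labels, centers, margin=2.0):
    """Sketch of a cross-modal global similarity objective: hash codes of
    semantically similar samples from both modalities are pulled toward a
    shared class center, while every code is pushed at least `margin` away
    from the centers of other classes."""
    codes = torch.cat([img_codes, txt_codes], dim=0)    # (2B, K)
    lbls = torch.cat([labels, labels], dim=0)           # (2B,)
    dists = torch.cdist(codes, centers)                 # (2B, C)
    pull = dists[torch.arange(len(lbls)), lbls].mean()  # to own center
    mask = F.one_hot(lbls, centers.size(0)).bool()
    push = F.relu(margin - dists[~mask]).mean()         # from other centers
    return pull + push

# Toy usage: 8-bit codes, 3 classes, batch of 4 image/text pairs.
B, K, C = 4, 8, 3
img = torch.tanh(torch.randn(B, K, requires_grad=True))
txt = torch.tanh(torch.randn(B, K, requires_grad=True))
centers = torch.tanh(torch.randn(C, K, requires_grad=True))
loss = global_similarity_loss(img, txt, torch.tensor([0, 1, 2, 0]), centers)
loss.backward()
```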

2.
Front Neurosci ; 17: 1209906, 2023.
Article in English | MEDLINE | ID: mdl-37539384

ABSTRACT

Objectives: Our objective was to use deep learning models to identify underlying brain regions associated with depression symptom phenotypes in late-life depression (LLD). Participants: Patients diagnosed with LLD (N = 116) and enrolled in a prospective treatment study. Design: Cross-sectional. Measurements: Structural magnetic resonance imaging (sMRI) was used to predict five depression symptom phenotypes previously derived from factor analysis of the Hamilton and MADRS depression scales: (1) Anhedonia, (2) Suicidality, (3) Appetite, (4) Sleep Disturbance, and (5) Anxiety. Our deep learning model was deployed to predict each factor score by learning deep feature representations from 3D sMRI patches in 34 a priori regions-of-interest (ROIs). ROI-level prediction accuracy was used to identify the brain regions most discriminative for predicting the factor scores representing each of the five symptom phenotypes. Results: Factor-level analyses yielded significant predictive models for the Anxiety and Suicidality factors. ROI-level results suggest that the regions most discriminative for predicting all five symptom factors were located in the anterior cingulate and orbitofrontal cortex. Conclusions: We validated the effectiveness of deep learning approaches on sMRI for predicting depression symptom phenotypes in LLD. We were able to identify deeply embedded local morphological differences associated with symptom phenotypes in the brains of those with LLD, which is promising for symptom-targeted treatment of LLD. Future research with machine learning models integrating multimodal imaging and clinical data could provide additional discriminative information.

3.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14055-14068, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37540612

ABSTRACT

In label-noise learning, estimating the transition matrix is a hot topic, as the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from clean labels to noisy labels (i.e., the clean-label transition matrix (CLTM)) has been widely exploited for class-dependent label noise (wherein all samples in a clean class share the same label transition matrix). However, the CLTM cannot handle the more common instance-dependent label noise well (wherein the clean-to-noisy label transition matrix needs to be estimated at the instance level by considering the input quality). Motivated by the fact that classifiers mostly output Bayes optimal labels for prediction, in this paper we propose to directly model the transition from Bayes optimal labels to noisy labels (i.e., the Bayes-Label Transition Matrix (BLTM)) and learn a classifier to predict Bayes optimal labels. Note that given only noisy data, it is ill-posed to estimate either the CLTM or the BLTM. Favorably, however, Bayes optimal labels have no uncertainty compared with clean labels, i.e., the class posteriors of Bayes optimal labels are one-hot vectors while those of clean labels are not. This yields two advantages for estimating the BLTM: (a) a set of examples with theoretically guaranteed Bayes optimal labels can be collected from the noisy data; (b) the feasible solution space is much smaller. Exploiting these advantages, this work proposes a parametric model for estimating the instance-dependent label-noise transition matrix with a deep neural network, leading to better generalization and superior classification performance.
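
The instance-dependent modeling described above can be sketched as a network that outputs one row-stochastic C x C matrix per input. The sketch below is a hedged illustration: `TransitionNet`, `bltm_loss`, and the assumption that Bayes optimal labels are already available for some examples are ours, not the paper's exact design.

```python
import torch
import torch.nn as nn

class TransitionNet(nn.Module):
    """Sketch: maps an instance x to a C x C transition matrix T(x), where
    T(x)[i, j] ~= P(noisy label = j | Bayes optimal label = i, x)."""
    def __init__(self, in_dim, num_classes):
        super().__init__()
        self.num_classes = num_classes
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, num_classes * num_classes),
        )
    def forward(self, x):
        logits = self.net(x).view(-1, self.num_classes, self.num_classes)
        return torch.softmax(logits, dim=2)  # each row sums to 1

# Training signal: on examples with (assumed) known Bayes optimal labels,
# the row T(x)[bayes_label] should match the observed noisy labels.
def bltm_loss(T, bayes_labels, noisy_labels):
    rows = T[torch.arange(len(bayes_labels)), bayes_labels]  # (B, C)
    return nn.functional.nll_loss(torch.log(rows + 1e-8), noisy_labels)

net = TransitionNet(in_dim=16, num_classes=5)
x = torch.randn(32, 16)
T = net(x)
loss = bltm_loss(T, torch.randint(0, 5, (32,)), torch.randint(0, 5, (32,)))
loss.backward()
```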

4.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 11458-11471, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37318979

ABSTRACT

Weakly supervised temporal action localization (WSTAL), which aims to locate the time interval of actions in an untrimmed video with only video-level action labels, has attracted increasing research interest in the past few years. However, a model trained with such labels tends to focus on the segments that contribute most to the video-level classification, leading to inaccurate and incomplete localization results. In this paper, we tackle the problem from the novel perspective of relation modeling and propose a method dubbed Bilateral Relation Distillation (BRD). The core of our method is learning representations by jointly modeling relations at the category and sequence levels. Specifically, category-wise latent segment representations are first obtained with different embedding networks, one for each category. We then distill knowledge from a pre-trained language model to capture the category-level relations, which is achieved by performing correlation alignment and category-aware contrast in an intra- and inter-video manner. To model the relations among segments at the sequence level, we elaborate a gradient-based feature augmentation method and encourage the learned latent representation of the augmented feature to be consistent with that of the original one. Extensive experiments illustrate that our approach achieves state-of-the-art results on the THUMOS14 and ActivityNet1.3 datasets.

5.
IEEE Trans Med Imaging ; 42(2): 336-345, 2023 02.
Article in English | MEDLINE | ID: mdl-35657829

ABSTRACT

Orthognathic surgery corrects jaw deformities to improve aesthetics and functions. Due to the complexity of the craniomaxillofacial (CMF) anatomy, orthognathic surgery requires precise surgical planning, which involves predicting postoperative changes in facial appearance. To this end, most conventional methods involve simulation with biomechanical modeling methods, which are labor intensive and computationally expensive. Here we introduce a learning-based framework to speed up the simulation of postoperative facial appearances. Specifically, we introduce a facial shape change prediction network (FSC-Net) to learn the nonlinear mapping from bony shape changes to facial shape changes. FSC-Net is a point transform network weakly-supervised by paired preoperative and postoperative data without point-wise correspondence. In FSC-Net, a distance-guided shape loss places more emphasis on the jaw region. A local point constraint loss restricts point displacements to preserve the topology and smoothness of the surface mesh after point transformation. Evaluation results indicate that FSC-Net achieves 15× speedup with accuracy comparable to a state-of-the-art (SOTA) finite-element modeling (FEM) method.


Subjects
Deep Learning , Orthognathic Surgery , Orthognathic Surgical Procedures , Orthognathic Surgical Procedures/methods , Computer Simulation , Face/diagnostic imaging , Face/surgery
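
A plausible form of the distance-guided shape loss described in this abstract is a correspondence-free Chamfer distance whose per-point weights increase near the jaw; the PyTorch sketch below illustrates that idea under our own assumptions (the single `jaw_center` point and exponential weighting are simplifications, not FSC-Net's actual loss).

```python
import torch

def distance_guided_chamfer(pred, target, jaw_center, sigma=20.0):
    """Sketch of a distance-guided shape loss without point-wise
    correspondence: a symmetric Chamfer distance in which each point's
    contribution is up-weighted the closer it lies to the jaw region
    (approximated here by a single jaw_center point)."""
    d = torch.cdist(pred, target)          # (N, M) pairwise distances
    pred_to_tgt = d.min(dim=1).values      # (N,) nearest target per pred point
    tgt_to_pred = d.min(dim=0).values      # (M,) nearest pred per target point
    w_pred = torch.exp(-torch.norm(pred - jaw_center, dim=1) / sigma)
    w_tgt = torch.exp(-torch.norm(target - jaw_center, dim=1) / sigma)
    return (w_pred * pred_to_tgt).mean() + (w_tgt * tgt_to_pred).mean()

pred = torch.randn(1024, 3, requires_grad=True)  # predicted facial points
target = torch.randn(2048, 3)                    # simulated postoperative points
loss = distance_guided_chamfer(pred, target, jaw_center=torch.zeros(3))
loss.backward()
```
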
6.
Mach Learn Med Imaging ; 14349: 396-406, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38390519

ABSTRACT

Neuroimage retrieval plays a crucial role in providing physicians with access to previous similar cases, which is essential for case-based reasoning and evidence-based medicine. Due to their low computation and storage costs, hashing-based search techniques have been widely adopted for establishing image retrieval systems. However, these methods often suffer from nonnegligible quantization loss, which can degrade the overall search performance. To address this issue, this paper presents a compact coding solution, namely Deep Bayesian Quantization (DBQ), which focuses on deep compact quantization that can estimate continuous neuroimage representations and achieve superior performance over existing hashing solutions. Specifically, DBQ seamlessly combines deep representation learning and compact representation quantization within a novel Bayesian learning framework, where a proxy embedding-based likelihood function is developed to alleviate the sampling issue of traditional similarity supervision. Additionally, a Gaussian prior is employed to reduce the quantization loss. By utilizing pre-computed lookup tables, the proposed DBQ enables efficient and effective similarity search. Extensive experiments conducted on 2,008 structural MRI scans from three benchmark neuroimage datasets demonstrate that our method outperforms previous state-of-the-art approaches.
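
The pre-computed lookup tables mentioned above are the standard mechanics of quantization-based search: per-query distance tables are computed once against each codebook, and database distances reduce to table lookups. A minimal NumPy sketch of these mechanics follows (the random codebooks and codes are placeholders; DBQ's Bayesian quantizer itself is not reproduced).

```python
import numpy as np

# Generic lookup-table search over quantized codes.
rng = np.random.default_rng(0)
M, Ks, sub = 4, 256, 8                # 4 codebooks, 256 centroids, 8-dim sub-vectors
codebooks = rng.normal(size=(M, Ks, sub))
db_codes = rng.integers(0, Ks, size=(10_000, M), dtype=np.int64)  # quantized database

def search(query, topk=5):
    # One distance table per codebook: squared distance from each query
    # sub-vector to every centroid. Database distances are then lookups.
    q = query.reshape(M, sub)
    tables = ((codebooks - q[:, None, :]) ** 2).sum(axis=2)        # (M, Ks)
    dists = tables[np.arange(M)[:, None], db_codes.T].sum(axis=0)  # (N,)
    return np.argsort(dists)[:topk]

print(search(rng.normal(size=M * sub)))  # indices of the 5 nearest items
```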

7.
Article in English | MEDLINE | ID: mdl-34966912

ABSTRACT

Facial appearance changes with the movements of bony segments in orthognathic surgery of patients with craniomaxillofacial (CMF) deformities. Conventional biomechanical methods for simulating such changes, such as finite element modeling (FEM), are labor intensive and computationally expensive, preventing them from being used in clinical settings. To overcome these limitations, we propose a deep learning framework to predict post-operative facial changes. Specifically, FC-Net, a facial appearance change simulation network, is developed to predict the point displacement vectors associated with a facial point cloud. FC-Net learns the point displacements of a pre-operative facial point cloud from the bony movement vectors between pre-operative and simulated post-operative bony models. FC-Net is a weakly-supervised point displacement network trained using paired data without strict point-to-point correspondence. To preserve the topology of the facial model during point transform, we employ a local-point-transform loss to constrain the local movements of points. Experimental results on real patient data reveal that the proposed framework can predict post-operative facial appearance changes remarkably faster than a state-of-the-art FEM method, with comparable prediction accuracy.

8.
Med Image Anal ; 71: 102076, 2021 07.
Article in English | MEDLINE | ID: mdl-33930828

ABSTRACT

Structural magnetic resonance imaging (MRI) has shown great clinical and practical value in computer-aided brain disorder identification. Multi-site MRI data increase sample size and statistical power, but are susceptible to inter-site heterogeneity caused by different scanners, scanning protocols, and subject cohorts. Multi-site MRI harmonization (MMH) helps alleviate inter-site differences for subsequent analysis. Some MMH methods operate at the imaging or feature extraction level; these are concise but lack robustness and flexibility to some extent. Even though several machine/deep learning-based methods have been proposed for MMH, some of them require a portion of labeled data in the to-be-analyzed target domain or ignore the potential contributions of different brain regions to the identification of brain disorders. In this work, we propose an attention-guided deep domain adaptation (AD2A) framework for MMH and apply it to automated brain disorder identification with multi-site MRIs. The proposed framework does not need any category label information for the target data and can automatically identify discriminative regions in whole-brain MR images. Specifically, the proposed AD2A is composed of three key modules: (1) an MRI feature encoding module to extract representations of input MRIs, (2) an attention discovery module to automatically locate discriminative dementia-related regions in each whole-brain MRI scan, and (3) a domain transfer module trained with adversarial learning for knowledge transfer between the source and target domains. Experiments have been performed on 2,572 subjects from four benchmark datasets with T1-weighted structural MRIs, with results demonstrating the effectiveness of the proposed method in both brain disorder identification and disease progression prediction.


Subjects
Brain Diseases , Magnetic Resonance Imaging , Attention , Brain/diagnostic imaging , Humans , Machine Learning
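
Adversarial domain transfer modules of the kind described in this abstract are often implemented with a gradient reversal layer, which trains the encoder to fool a domain classifier. The sketch below shows that generic mechanism with our own toy dimensions; it is not AD2A's exact module.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the
    backward pass. A common way to train a feature encoder adversarially
    against a domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

encoder = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
domain_clf = nn.Linear(16, 2)  # source vs. target

x = torch.randn(8, 32)
feats = encoder(x)
domain_logits = domain_clf(GradReverse.apply(feats, 1.0))
loss = nn.functional.cross_entropy(domain_logits, torch.randint(0, 2, (8,)))
loss.backward()  # encoder gradients are reversed: features become domain-invariant
```
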
9.
IEEE Trans Med Imaging ; 40(4): 1279-1289, 2021 04.
Article in English | MEDLINE | ID: mdl-33444133

ABSTRACT

Brain connectivity alterations associated with mental disorders have been widely reported in both functional MRI (fMRI) and diffusion MRI (dMRI). However, extracting useful information from the vast amount of data afforded by brain networks remains a great challenge. By capturing network topology, graph convolutional networks (GCNs) have been shown to be superior for learning network representations tailored to identifying specific brain disorders. Existing graph construction techniques generally rely on a specific brain parcellation to define regions-of-interest (ROIs) for constructing networks, often limiting the analysis to a single spatial scale. In addition, most methods focus on the pairwise relationships between ROIs and ignore high-order associations between subjects. In this letter, we propose a mutual multi-scale triplet graph convolutional network (MMTGCN) to analyze functional and structural connectivity for brain disorder diagnosis. We first employ several templates with different scales of ROI parcellation to construct coarse-to-fine brain connectivity networks for each subject. Then, a triplet GCN (TGCN) module is developed to learn functional/structural representations of brain connectivity networks at each scale, with the triplet relationships among subjects explicitly incorporated into the learning process. Finally, we propose a template mutual learning strategy to train the TGCNs of different scales collaboratively for disease classification. Experimental results on 1,160 subjects from three datasets with fMRI or dMRI data demonstrate that our MMTGCN outperforms several state-of-the-art methods in identifying three types of brain disorders.


Subjects
Brain Diseases , Magnetic Resonance Imaging , Brain/diagnostic imaging , Humans
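
For readers unfamiliar with GCNs over brain connectivity graphs, the sketch below shows one symmetric-normalized graph-convolution layer producing a graph-level embedding, on which a triplet loss among subjects could then be applied. The layer form and toy parcellation size are assumptions, not the TGCN's exact architecture.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Sketch of one graph-convolution layer over a brain connectivity
    graph: H' = ReLU(D^-1/2 A_hat D^-1/2 H W), with self-loops added."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
    def forward(self, A, H):
        A_hat = A + torch.eye(A.size(0))             # add self-loops
        d = A_hat.sum(dim=1)                         # node degrees
        A_norm = A_hat / torch.sqrt(d[:, None] * d[None, :])
        return torch.relu(self.lin(A_norm @ H))

# Toy brain graph: 90 ROIs (one parcellation scale), 16-dim node features.
A = (torch.rand(90, 90) > 0.9).float(); A = ((A + A.T) > 0).float()
H = torch.randn(90, 16)
emb = GCNLayer(16, 32)(A, H).mean(dim=0)  # graph-level embedding, shape (32,)
# Triplet relations among subjects: apply nn.TripletMarginLoss to such embeddings.
```
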
10.
IEEE Trans Med Imaging ; 40(2): 503-513, 2021 02.
Article in English | MEDLINE | ID: mdl-33048672

ABSTRACT

Multi-modal neuroimage retrieval has greatly improved the efficiency and accuracy of decision making in clinical practice by providing physicians with previous cases (with visually similar neuroimages) and the corresponding treatment records. However, existing image retrieval methods usually fail when applied directly to multi-modal neuroimage databases, since neuroimages generally have smaller inter-class variation and larger inter-modal discrepancy than natural images. To this end, we propose a deep Bayesian hash learning framework, called CenterHash, which can map multi-modal data into a shared Hamming space and learn discriminative hash codes from imbalanced multi-modal neuroimages. The key idea for tackling the small inter-class variation and large inter-modal discrepancy is to learn a common center representation for similar neuroimages from different modalities and encourage hash codes to be explicitly close to their corresponding center representations. Specifically, we measure the similarity between hash codes and their corresponding center representations and treat it as a center prior in the proposed Bayesian learning framework. A weighted contrastive likelihood loss function is also developed to facilitate hash learning from imbalanced neuroimage pairs. Comprehensive empirical evidence shows that our method can generate effective hash codes and yields state-of-the-art performance in cross-modal retrieval on three multi-modal neuroimage datasets.


Subjects
Bayes Theorem , Factual Databases
11.
Mach Learn Med Imaging ; 12436: 1-10, 2020 Oct.
Article in English | MEDLINE | ID: mdl-36383497

ABSTRACT

Extensive studies focus on analyzing human brain functional connectivity from a network perspective, in which each network contains complex graph structures. Based on resting-state functional MRI (rs-fMRI) data, graph convolutional networks (GCNs) enable comprehensive mapping of brain functional connectivity (FC) patterns to depict brain activities. However, existing studies usually characterize static properties of the FC patterns, ignoring time-varying dynamic information. In addition, previous GCN methods generally use fixed group-level (e.g., patients or controls) representations of FC networks and thus cannot capture subject-level FC specificity. To this end, we propose a Temporal-Adaptive GCN (TAGCN) framework that can not only take advantage of both spatial and temporal information using resting-state FC patterns and time-series but also explicitly characterize subject-level specificity of FC patterns. Specifically, we first segment each ROI-based time-series into multiple overlapping windows, then employ an adaptive GCN to mine topological information. We further model the temporal patterns for each ROI along time to learn periodic brain status changes. Experimental results on 533 major depressive disorder (MDD) and healthy control (HC) subjects demonstrate that the proposed TAGCN outperforms several state-of-the-art methods in MDD vs. HC classification, and can also be used to capture dynamic FC alterations and learn valid graph representations.
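
The windowing step described above can be sketched in a few lines of NumPy: segment the ROI time-series into overlapping windows and compute one correlation-based FC matrix per window. The window length and stride below are illustrative, not the paper's settings.

```python
import numpy as np

def dynamic_fc(ts, win_len=30, stride=10):
    """Segment ROI time-series into overlapping windows and compute one
    functional-connectivity (correlation) matrix per window.
    ts: (T, R) array -- T time points, R ROIs."""
    T, R = ts.shape
    fcs = []
    for start in range(0, T - win_len + 1, stride):
        window = ts[start:start + win_len]             # (win_len, R)
        fcs.append(np.corrcoef(window, rowvar=False))  # (R, R)
    return np.stack(fcs)                               # (n_windows, R, R)

rs_fmri = np.random.randn(230, 116)  # e.g., 230 volumes, 116 AAL ROIs
print(dynamic_fc(rs_fmri).shape)     # (21, 116, 116)
```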

12.
Article in English | MEDLINE | ID: mdl-34746936

ABSTRACT

Neuroimaging has been widely used in computer-aided clinical diagnosis and treatment, and the rapid growth of neuroimage repositories introduces great challenges for efficient neuroimage search. Existing image search methods often use a triplet loss to capture high-order relationships between samples. However, we find that the traditional triplet loss struggles to push positive and negative sample pairs apart so that their Hamming distance discrepancies exceed a small fixed value. This may reduce the discriminative ability of the learned hash codes and degrade search performance. To address this issue, we propose a deep disentangled momentum hashing (DDMH) framework for neuroimage search. Specifically, we first investigate the original triplet loss and find that this loss function is determined by the inner products of hash code pairs. Accordingly, we disentangle hash code norms from hash code directions and analyze the role of each part. By decoupling the loss function from the hash code norm, we propose a unique disentangled triplet loss, which can effectively push positive and negative sample pairs apart by the desired Hamming distance discrepancies for hash codes of different lengths. We further develop a momentum triplet strategy to address the problem of insufficient triplet samples caused by the small batch sizes required for 3D neuroimages. With the proposed disentangled triplet loss and the momentum triplet strategy, we design an end-to-end trainable deep hashing framework for neuroimage search. Comprehensive empirical evidence on three neuroimage datasets shows that DDMH outperforms several state-of-the-art methods in neuroimage search.
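
The observation that the triplet loss is determined by inner products follows from the identity d_H(u, v) = (K - <u, v>)/2 for codes in {-1, +1}^K. The sketch below verifies this identity and shows one way a norm-decoupled, direction-only triplet loss could look; the exact DDMH loss is not reproduced.

```python
import torch
import torch.nn.functional as F

K = 64
u = torch.sign(torch.randn(K))
v = torch.sign(torch.randn(K))
# For {-1, +1}^K codes: Hamming distance is a function of the inner product.
assert (u != v).sum() == (K - u @ v) / 2

def disentangled_triplet(a, p, n, margin=8.0):
    """Sketch: decouple code norm from direction by normalizing, so the
    loss pushes pairs apart by a desired Hamming-like gap regardless of
    code length (illustrative form, not the paper's exact loss)."""
    a, p, n = F.normalize(a, dim=1), F.normalize(p, dim=1), F.normalize(n, dim=1)
    K = a.size(1)
    # inner products rescaled to "Hamming-like" distances in [0, K]
    d_ap = (K - K * (a * p).sum(dim=1)) / 2
    d_an = (K - K * (a * n).sum(dim=1)) / 2
    return F.relu(d_ap - d_an + margin).mean()

a, p, n = (torch.randn(16, K, requires_grad=True) for _ in range(3))
disentangled_triplet(a, p, n).backward()
```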

13.
IEEE Trans Cybern ; 50(4): 1473-1484, 2020 Apr.
Article in English | MEDLINE | ID: mdl-30561358

ABSTRACT

Due to its strong representation learning ability and its facilitation of joint learning of representations and hash codes, deep learning-to-hash has achieved promising results and is becoming increasingly popular for large-scale approximate nearest neighbor search. However, recent studies highlight the vulnerability of deep image classifiers to adversarial examples, which also raises profound security concerns for deep retrieval systems. Accordingly, in order to study the robustness of modern deep hashing models to adversarial perturbations, we propose hash adversary generation (HAG), a novel method of crafting adversarial examples for Hamming space search. The main goal of HAG is to generate imperceptibly perturbed examples as queries whose nearest neighbors from a targeted hashing model are semantically irrelevant to the original queries. Extensive experiments show that HAG can successfully craft adversarial examples with small perturbations that mislead targeted hashing models. The transferability of these perturbations under a variety of settings is also verified. Moreover, by combining heterogeneous perturbations, we further provide a simple yet effective method of constructing adversarial examples for black-box attacks.
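
Gradient-based crafting of such adversarial queries is typically PGD-style: iteratively perturb the query so its code disagrees with the original code, projected into a small L-infinity ball. The sketch below illustrates that generic recipe with a toy hashing model; the objective and hyperparameters are assumptions, not HAG's exact formulation.

```python
import torch
import torch.nn as nn

def craft_adversarial_query(model, x, eps=8/255, steps=10, alpha=2/255):
    """PGD-style sketch: perturb a query so its (continuous) hash code
    moves away from the original binary code, while staying within an
    L-inf ball of radius eps (illustrative objective only)."""
    with torch.no_grad():
        target_code = torch.sign(model(x))  # original binary code
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # push the code away from the original -> minimize agreement
        loss = (torch.tanh(model(x_adv)) * target_code).sum()
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv - alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into eps-ball
    return x_adv.clamp(0, 1)

hash_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 48))  # toy hasher
query = torch.rand(1, 3, 32, 32)
adv = craft_adversarial_query(hash_model, query)
print((adv - query).abs().max())  # perturbation bounded by eps
```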

14.
IEEE Trans Neural Netw Learn Syst ; 31(6): 2189-2201, 2020 Jun.
Article in English | MEDLINE | ID: mdl-31514156

ABSTRACT

Hashing has been widely used for large-scale approximate nearest neighbor search due to its storage and search efficiency. Recent supervised hashing research has shown that deep learning-based methods can significantly outperform nondeep methods. Most existing supervised deep hashing methods exploit supervisory signals to generate similar and dissimilar image pairs for training. However, natural images can have large intraclass and small interclass variations, which may degrade the accuracy of hash codes. To address this problem, we propose a novel two-stream ConvNet architecture, which learns hash codes with class-specific representation centers. Our basic idea is that if we can learn a unified binary representation for each class as a center and encourage hash codes of images to be close to the corresponding centers, the intraclass variation will be greatly reduced. Accordingly, we design a neural network that leverages label information and outputs a unified binary representation for each class. Moreover, we also design an image network to learn hash codes from images and force these hash codes to be close to the corresponding class-specific centers. These two neural networks are then seamlessly incorporated to create a unified, end-to-end trainable framework. Extensive experiments on three popular benchmarks corroborate that our proposed method outperforms current state-of-the-art methods.

15.
IEEE Trans Image Process ; 28(8): 4032-4044, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30872226

ABSTRACT

Hashing plays a pivotal role in nearest-neighbor search for large-scale image retrieval. Recently, deep learning-based hashing methods have achieved promising performance. However, most of these deep methods involve discriminative models, which require large-scale labeled training datasets, hindering their real-world application. In this paper, we propose a novel strategy to exploit the semantic similarity of the training data and design an efficient generative adversarial framework to learn binary hash codes in an unsupervised manner. Specifically, our model consists of three different neural networks: an encoder network to learn hash codes from images, a generative network to generate images from hash codes, and a discriminative network to distinguish between pairs of hash codes and images. By adversarially training these networks, we successfully learn mutually coherent encoder and generative networks and can output efficient hash codes from the encoder network. We also develop a strategy that utilizes both feature and neighbor similarities to construct a semantic similarity matrix, and then use this matrix to guide the hash code learning process. Integrating the supervision of this semantic similarity matrix into the adversarial learning framework efficiently preserves the semantic information of the training data in Hamming space. Experimental results on three widely used benchmarks show that our method not only significantly outperforms several state-of-the-art unsupervised hashing methods, but also achieves performance comparable to popular supervised hashing methods.
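
One way such a feature-plus-neighbor similarity matrix could be constructed is sketched below: cosine similarity of pre-trained features combined with the fraction of shared k-nearest neighbors, thresholded into a binary matrix. The thresholds and combination rule are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def semantic_similarity_matrix(feats, k=10, thresh=0.5):
    """Sketch: combine direct feature similarity with neighborhood
    agreement to decide which training pairs count as 'semantically
    similar' (illustrative rules; the paper's may differ)."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    cos = f @ f.T                                    # feature similarity
    # neighbor similarity: fraction of shared k-nearest neighbors
    nn_idx = np.argsort(-cos, axis=1)[:, 1:k + 1]    # skip self at position 0
    n = len(feats)
    nbr = np.zeros((n, n), dtype=bool)
    nbr[np.arange(n)[:, None], nn_idx] = True
    shared = (nbr[:, None, :] & nbr[None, :, :]).sum(axis=2) / k
    # a pair is similar if either signal is strong enough
    return ((cos > thresh) | (shared > thresh)).astype(np.float32)

feats = np.random.randn(100, 64)  # e.g., pre-trained CNN features
S = semantic_similarity_matrix(feats)
print(S.shape, S.mean())          # (100, 100), fraction of similar pairs
```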

16.
IEEE Trans Neural Netw Learn Syst ; 29(11): 5292-5303, 2018 11.
Article in English | MEDLINE | ID: mdl-29994640

ABSTRACT

With the explosive growth of data volume and the ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact code learning solution for efficient cross-modal similarity search. Many recent studies have shown that quantization-based approaches generally perform better than hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, one of the early attempts to leverage deep neural networks for quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across different modalities and two private subspaces for the individual modalities; representations in the shared and private subspaces are learned simultaneously by embedding them in a reproducing kernel Hilbert space, where the mean embeddings of different modality distributions can be explicitly compared. In addition, in the shared subspace, a quantizer is learned to produce semantics-preserving compact codes with the help of label alignment. Thanks to this novel network architecture, in cooperation with supervised quantization training, SPDQ can preserve intramodal and intermodal similarities as much as possible and greatly reduce quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
