Results 1 - 11 of 11
1.
RSC Adv ; 13(16): 11002-11009, 2023 Apr 03.
Article in English | MEDLINE | ID: mdl-37033420

ABSTRACT

BaTiO3 nanoparticles were prepared by the hydrothermal method, and the effect of 1-(propyl-3-methoxysilyl)-3-methylimidazole chloride on the size of the BaTiO3 particles was investigated. The obtained BaTiO3 was characterized by XRD, SEM, TEM, and Raman spectroscopy, and the dielectric properties of BaTiO3 ceramic sheets were measured. The results indicate that the spherical BaTiO3-N prepared without an ionic liquid was in the tetragonal phase, with an average particle size of 129 nm. When an ionic liquid was added, the BaTiO3-IL particle size decreased and the degree of agglomeration increased. In addition, with an increasing quantity of ionic liquid, the tetragonal-phase content of BaTiO3-IL gradually decreased until it transformed completely into the cubic phase. The dielectric constant of the BaTiO3-N ceramics was the highest, and the dielectric constant decreased with decreasing BaTiO3 particle size. Moreover, two types of BaTiO3 nanoparticles (bowl- and sea-urchin-shaped) were prepared by changing the hydrothermal conditions and additives. The average particle size of the former was 92 nm, its tetragonal-phase content was ca. 90%, and its dielectric constant was large, whereas the sea-urchin-shaped BaTiO3 consisted of small particles in the cubic phase and had a small dielectric constant.

2.
Article in English | MEDLINE | ID: mdl-37015388

ABSTRACT

Image-text matching is a challenging task due to the modality gap. Many recent methods focus on modeling entity relationships to learn a common embedding space of image and text. However, these methods suffer from distracting entity relationships, such as irrelevant visual regions in an image and noisy textual words in a text. In this paper, we propose an adaptive latent graph representation learning method that reduces these distractions for image-text matching. Specifically, we use an improved graph variational autoencoder to separate the distracting factors from the latent relationship factors, and jointly learn latent textual graph representations, latent visual graph representations, and a visual-textual graph embedding space. We also introduce an adaptive cross-attention mechanism that attends over the latent graph representations across images and texts, further narrowing the modality gap to boost matching performance. Extensive experiments on two public datasets, Flickr30K and COCO, show the effectiveness of our method.
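
Below is a minimal, illustrative PyTorch sketch of a cross-attention step between visual and textual node features, of the kind the abstract describes; the shapes, scaling, and mean pooling are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch: cross-attention between latent visual and textual graph
# node features. Shapes and pooling are illustrative assumptions.
import torch
import torch.nn.functional as F

def cross_attention(visual, textual):
    """visual: (R, d) region-node features; textual: (W, d) word-node features."""
    d = visual.size(-1)
    sim = visual @ textual.t() / d ** 0.5        # (R, W) similarity matrix
    t2v = F.softmax(sim, dim=1) @ textual        # each region attends over words
    v2t = F.softmax(sim.t(), dim=1) @ visual     # each word attends over regions
    return t2v, v2t

def match_score(visual, textual):
    """Cosine similarity between mean-pooled attended features."""
    t2v, v2t = cross_attention(visual, textual)
    return F.cosine_similarity(t2v.mean(0, keepdim=True),
                               v2t.mean(0, keepdim=True)).item()
```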

3.
IEEE Trans Neural Netw Learn Syst ; 33(2): 539-553, 2022 Feb.
Article in English | MEDLINE | ID: mdl-33064659

ABSTRACT

Partial domain adaptation aims to transfer knowledge from a label-rich source domain to a label-scarce target domain (i.e., the target categories are a subset of the source ones), which relaxes the common assumption in traditional domain adaptation that the label space is fully shared across domains. In this more general and practical scenario, a major challenge is how to select source instances from the shared categories to ensure positive transfer to the target domain. To address this problem, we propose a domain adversarial reinforcement learning (DARL) framework that progressively selects source instances and learns transferable features between domains by reducing the domain shift. Specifically, we employ deep Q-learning to learn policies for an agent that makes selection decisions by approximating the action-value function. Moreover, domain adversarial learning is introduced to learn a common feature subspace for the selected source instances and the target instances, and to compute the agent's reward based on the relevance of the selected source instances to the target domain. Extensive experiments on several benchmark data sets clearly demonstrate the superior performance of our proposed DARL over existing state-of-the-art methods for partial domain adaptation.
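
As a rough sketch of how deep Q-learning could drive instance selection as described here, the snippet below scores each source instance with a small Q-network and keeps those whose "select" action value wins under an epsilon-greedy rule; the architecture and exploration scheme are assumptions, not DARL's exact design, and the reward (not shown) would come from the adversarial domain-relevance signal the abstract mentions.

```python
# Hedged sketch of Q-learning-driven source-instance selection.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, feat_dim, n_actions=2):   # actions: 0 = discard, 1 = select
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_actions))
    def forward(self, x):
        return self.net(x)

def select_source_instances(qnet, source_feats, epsilon=0.1):
    """Epsilon-greedy selection of source instances judged relevant
    to the target domain by their action values."""
    with torch.no_grad():
        q = qnet(source_feats)                    # (N, 2) action values
    actions = q.argmax(dim=1)
    explore = torch.rand(len(actions)) < epsilon  # random exploration mask
    actions[explore] = torch.randint(0, 2, (int(explore.sum()),))
    return source_feats[actions == 1]             # the selected subset
```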

4.
IEEE Trans Image Process ; 30: 3970-3984, 2021.
Article in English | MEDLINE | ID: mdl-33769933

ABSTRACT

Cross-domain object detection in images has attracted increasing attention in the past few years; it aims to adapt a detection model learned from existing labeled images (source domain) to newly collected unlabeled ones (target domain). Existing methods usually address the problem through direct feature alignment between the source and target domains at the image level, the instance level (i.e., region proposals), or both. However, we have observed that directly aligning the features of all object instances from the two domains often results in negative transfer, owing to the existence of (1) outlier target instances that contain confusing objects not belonging to any source-domain category and are thus hard for detectors to capture, and (2) low-relevance source instances whose statistics differ considerably from those of target instances even though their objects belong to the same category. With this in mind, we propose a reinforcement learning based method, coined sequential instance refinement, in which two agents learn to progressively refine both source and target instances by taking sequential actions that remove outlier target instances and low-relevance source instances step by step. Extensive experiments on several benchmark datasets demonstrate the superior performance of our method over existing state-of-the-art baselines for cross-domain object detection.
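
The sequential refinement idea can be sketched as an agent that removes the least relevant instance at each step; the relevance scorer and the fixed step count below are illustrative placeholders rather than the paper's learned policy.

```python
# Hedged sketch: sequential removal of low-relevance/outlier instances.
import torch

def refine(instances, relevance_fn, steps):
    """instances: (N, d) features; relevance_fn maps (M, d) -> (M,) relevance
    scores; the least relevant instance is removed at each step."""
    kept = instances
    for _ in range(steps):                        # assumes steps < len(instances)
        scores = relevance_fn(kept)               # (M,) relevance estimates
        worst = int(scores.argmin())              # lowest-relevance instance
        kept = torch.cat([kept[:worst], kept[worst + 1:]])
    return kept
```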

5.
IEEE Trans Cybern ; 51(5): 2676-2687, 2021 May.
Article in English | MEDLINE | ID: mdl-31251207

ABSTRACT

In domain adaptation, automatically discovering multiple latent source domains has proven effective at capturing the intrinsic structure underlying the source data. Unlike previous works, which mainly rely on shallow models for domain discovery, we propose a novel unified framework based on deep neural networks that jointly addresses latent domain prediction from source data and deep representation learning from both source and target data. Within this framework, an iterative algorithm alternates between 1) using a new probabilistic hierarchical clustering method to separate the source domain into latent clusters and 2) training deep neural networks with the domain membership as supervision to learn deep representations. The key idea behind this joint learning framework is that good representations help improve the prediction accuracy of latent domains and, in turn, domain prediction results provide useful supervisory information for feature learning. During training of the deep model, a domain prediction loss, a domain confusion loss, and a task-specific classification loss are integrated so that the learned features distinguish between different latent source domains, transfer between the source and target domains, and remain semantically meaningful across classes. Trained in an end-to-end fashion, our framework outperforms the state-of-the-art methods for latent domain discovery, as validated by extensive experiments on both object classification and human action recognition tasks.
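
A minimal sketch of how the three losses might be combined, assuming simple cross-entropy terms and a uniform-distribution confusion loss; the weights and exact loss forms are assumptions, not the paper's.

```python
# Hedged sketch: integrating the three training losses named in the abstract.
import torch.nn.functional as F

def total_loss(class_logits, labels, domain_logits, latent_domains,
               target_domain_logits, lam_pred=1.0, lam_conf=0.1):
    task = F.cross_entropy(class_logits, labels)               # classification
    dom_pred = F.cross_entropy(domain_logits, latent_domains)  # latent-domain prediction
    # Domain confusion: push target-domain predictions toward uniform,
    # i.e. cross-entropy against the uniform distribution (up to a constant).
    confusion = -F.log_softmax(target_domain_logits, dim=1).mean()
    return task + lam_pred * dom_pred + lam_conf * confusion
```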

6.
IEEE Trans Image Process ; 30: 1180-1192, 2021.
Article in English | MEDLINE | ID: mdl-33306468

ABSTRACT

In recent years, large-scale datasets of paired images and sentences have enabled remarkable success in automatically generating descriptions for images, namely image captioning. However, it is labour-intensive and time-consuming to collect a sufficient number of paired images and sentences in each domain. It may therefore be beneficial to transfer an image captioning model trained in an existing domain with pairs of images and sentences (i.e., the source domain) to a new domain with only unpaired data (i.e., the target domain). In this paper, we propose a cross-modal retrieval aided approach to cross-domain image captioning that leverages a cross-modal retrieval model to generate pseudo pairs of images and sentences in the target domain, facilitating the adaptation of the captioning model. To learn the correlation between images and sentences in the target domain, we propose an iterative cross-modal retrieval process in which a retrieval model is first pre-trained on the source domain data and then applied to the target domain data to acquire an initial set of pseudo image-sentence pairs. These pseudo pairs are further refined by iteratively fine-tuning the retrieval model on them and updating them with the refined retrieval model. To make the linguistic patterns learned in the source domain adapt well to the target domain, we propose an adaptive image captioning model with a self-attention mechanism, fine-tuned on the refined pseudo image-sentence pairs. Experimental results on several settings, where MSCOCO is used as the source domain and five different datasets (Flickr30k, TGIF, CUB-200, Oxford-102 and Conceptual) are used as the target domains, demonstrate that our method achieves mostly better or comparable performance against the state-of-the-art methods. We also extend our method to cross-domain video captioning, where MSR-VTT is used as the source domain and two other datasets (MSVD and Charades Captions) are used as the target domains, to further demonstrate its effectiveness.
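
The iterative retrieval process reads as a simple alternating loop; the sketch below captures its control flow with placeholder callables standing in for the components described in the abstract, not a real API.

```python
# Hedged sketch of the iterative pseudo-pair refinement loop.
def iterative_retrieval(pretrain, retrieve, finetune, n_rounds=3):
    """pretrain(): trains the retrieval model on paired source-domain data.
    retrieve(): returns pseudo image-sentence pairs for the target domain.
    finetune(pairs): refines the retrieval model on the pseudo pairs."""
    pretrain()                       # pre-train on paired source data
    pseudo_pairs = retrieve()        # initial pseudo pairs on target data
    for _ in range(n_rounds):
        finetune(pseudo_pairs)       # refine retrieval model on pseudo pairs
        pseudo_pairs = retrieve()    # update pseudo pairs with refined model
    return pseudo_pairs              # used to adapt the captioning model
```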

7.
Article in English | MEDLINE | ID: mdl-32310773

ABSTRACT

Many existing methods formulate the action prediction task as recognizing the early parts of actions in trimmed videos. In this paper, we focus on predicting actions in ongoing untrimmed videos, where actions might not happen at the very beginning. Prediction in such untrimmed videos is extremely challenging because the early parts may contain ambiguous or even no information about the actions. To address this problem, we propose a prediction confidence that assesses the decision quality of a prediction model. Guided by this confidence, the model continuously refines its prediction results as more video frames are observed. Specifically, we build a Self Prediction Refining Network (SPR-Net) that incrementally learns the confidence for action prediction. SPR-Net consists of three modules: a temporal hybrid network, an incremental confidence learner, and a self-refining Gumbel softmax sampler. The temporal hybrid network generates the action category distributions by integrating static scene and dynamic motion information. The incremental confidence learner calculates the confidence incrementally, judging the extent to which the temporal hybrid network should believe its prediction result. The self-refining Gumbel softmax sampler models the mutual relationship between the prediction confidence and the category distribution, enabling them to be jointly learned in an end-to-end fashion. We also present a sparse self-attention mechanism that encodes local spatio-temporal features into the frame-level motion representation to further improve prediction performance. Extensive experiments on five datasets (i.e., UT-Interaction, BIT-Interaction, UCF101, THUMOS14, and ActivityNet) validate the effectiveness of the proposed method.
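
One way to read the confidence-gated refinement is as a convex blend of a Gumbel softmax sample of the current frame's logits with the previous category distribution; this interpretation and the gating form are assumptions, not SPR-Net's exact mechanism. PyTorch's built-in `gumbel_softmax` makes the sampling differentiable.

```python
# Hedged sketch: confidence-gated refinement with a Gumbel softmax sample.
import torch
import torch.nn.functional as F

def refine_prediction(logits_t, prev_dist, confidence, tau=1.0):
    """logits_t: (C,) current-frame logits; prev_dist: (C,) previous category
    distribution; confidence in [0, 1] gates trust in the new evidence."""
    current = F.gumbel_softmax(logits_t, tau=tau, hard=False)  # soft, differentiable sample
    return confidence * current + (1.0 - confidence) * prev_dist
```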

8.
IEEE Trans Image Process ; 28(11): 5308-5321, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31144637

ABSTRACT

Training deep video recognition models usually requires sufficient labeled videos to achieve good performance without over-fitting. However, it is quite labor-intensive and time-consuming to collect and annotate a large number of videos, and training deep neural networks on large-scale video datasets demands huge computational resources, which further holds back many researchers and practitioners. In contrast, annotated images are much easier to collect and train on. However, naively applying images to help recognize videos may result in noticeable performance degeneration due to the well-known domain shift and feature heterogeneity. This paper proposes a novel symmetric adversarial learning approach for heterogeneous image-to-video adaptation, which augments deep image and video features by learning domain-invariant representations of source images and target videos. Focusing primarily on an unsupervised scenario, where labeled source images are accompanied by unlabeled target videos in the training phase, we present a data-driven approach to learn augmented features of images and videos with superior transferability and distinguishability. Starting by learning a common feature space (called the image-frame feature space) between images and video frames, we then build new symmetric generative adversarial networks (Sym-GANs), in which one GAN maps image-frame features to video features and the other maps video features to image-frame features. Using the Sym-GANs, each source image feature is augmented with a generated video-specific representation to capture motion dynamics, while each target video feature is augmented with a generated image-specific representation to capture static appearance information. Finally, the augmented features from the source domain are fed into a network with fully connected layers for classification. Thanks to end-to-end training of the Sym-GANs and the classification network, our approach achieves better results than other state-of-the-art methods, as clearly validated by experiments on two video datasets, i.e., the UCF101 and HMDB51 datasets.
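
The symmetric mapping and feature augmentation can be sketched with two small generator networks and concatenation; the dimensions, architectures, and plain concatenation below are illustrative assumptions, not the Sym-GANs' exact design (the adversarial discriminators are omitted).

```python
# Hedged sketch: symmetric generators translating between image-frame and
# video feature spaces, with augmentation by concatenation.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                 nn.Linear(256, d_out))
    def forward(self, x):
        return self.net(x)

g_i2v = Generator(512, 512)   # image-frame features -> video-like features
g_v2i = Generator(512, 512)   # video features -> image-frame-like features

img_feat = torch.randn(8, 512)                            # source image features
vid_feat = torch.randn(8, 512)                            # target video features
aug_img = torch.cat([img_feat, g_i2v(img_feat)], dim=1)   # add motion-like part
aug_vid = torch.cat([vid_feat, g_v2i(vid_feat)], dim=1)   # add appearance part
```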

9.
Parkinsonism Relat Disord ; 64: 211-219, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31003906

ABSTRACT

BACKGROUND: Primary familial brain calcification (PFBC) is a rare calcifying disorder of the brain with extensive clinical and genetic heterogeneity. Its prevalence is underestimated due to clinical selection bias (compared with symptomatic PFBC patients, asymptomatic ones are less likely to undergo genetic testing). METHODS: A total of 273 PFBC probands were enrolled in a multicenter retrospective cohort study through two different approaches. In Group I (nonsystematic approach), 37 probands diagnosed at our clinic were enrolled. In Group II (systematic approach), 236 probands were enrolled by searching the medical imaging databases of 50 other hospitals using specific keywords. Genetic testing of the four genes known to cause autosomal dominant PFBC was performed in all probands using cDNA. All identified variants were further confirmed using genomic DNA and classified according to the ACMG-AMP recommendations. RESULTS: Thirty-two variants, including 22 novel ones, were detected in 37 probands. Among these probands, 83.8% (31/37) were asymptomatic. Two probands with homozygous pathogenic SLC20A2 variants presented more severe brain calcification and symptoms. Based on the variant detection rate of probands in Group II, we extrapolated an overall minimal prevalence of PFBC of 6.6 per 1,000, much higher than previously reported (2.1 per 1,000). CONCLUSIONS: We identified a higher proportion of genetically confirmed PFBC probands who were asymptomatic. These patients would be overlooked due to clinical selection bias, leading to underestimation of the disease prevalence. Considering that PFBC patients with biallelic variants had more severe phenotypes, particular attention should be paid to this condition in genetic counseling.


Subjects
Brain Diseases , Calcinosis , Sodium-Phosphate Cotransporter Proteins, Type III/genetics , Brain Diseases/diagnosis , Brain Diseases/epidemiology , Brain Diseases/genetics , Brain Diseases/physiopathology , Calcinosis/diagnosis , Calcinosis/epidemiology , Calcinosis/genetics , Calcinosis/physiopathology , China/epidemiology , Female , Humans , Male , Middle Aged , Mutation , Pedigree , Phenotype , Prevalence , Retrospective Studies , Sequence Analysis, DNA , Severity of Illness Index
10.
IEEE Trans Cybern ; 46(11): 2596-2608, 2016 Nov.
Article in English | MEDLINE | ID: mdl-26485728

ABSTRACT

In this paper, we develop a novel transfer latent support vector machine for joint recognition and localization of actions using Web images and weakly annotated training videos. The model takes as input training videos annotated only with action labels, alleviating the laborious and time-consuming manual annotation of action locations. Since the ground truth of action locations in videos is not available, the locations are modeled as latent variables in our method and are inferred during both the training and testing phases. To improve localization accuracy with some prior information about action locations, we collect a number of Web images annotated with both action labels and action locations, and learn a discriminative model by enforcing local similarities between videos and Web images. A structural transformation based on randomized clustering forests maps the Web images to videos, handling the heterogeneous features of the two. Experiments on two public action datasets demonstrate the effectiveness of the proposed model for both action localization and action recognition.
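
Latent-variable inference in a latent SVM of this kind amounts to picking the candidate action location that maximizes the scoring function; the sketch below shows that step, with a linear scorer assumed purely for illustration.

```python
# Hedged sketch: inferring the latent action location as the best-scoring
# candidate spatio-temporal window under a linear model.
import torch

def infer_location(w, candidate_feats):
    """w: (d,) model weights; candidate_feats: (K, d), one feature vector per
    candidate window. Returns the best window's index and score."""
    scores = candidate_feats @ w            # linear score for each candidate
    best = int(scores.argmax())
    return best, float(scores[best])
```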

11.
IEEE Trans Image Process ; 24(11): 4096-4108, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26080383

ABSTRACT

In cross-view action recognition, what is seen in one view differs from what is recognized in another, since the data distribution and even the feature space can change from one view to another. In this paper, we address the problem of transferring action models learned in one view (the source view) to a different view (the target view), where action instances from the two views are represented by heterogeneous features. A novel learning method, called heterogeneous transfer discriminant-analysis of canonical correlations (HTDCC), is proposed to discover a discriminative common feature space that links the source and target views and transfers knowledge between them. Two projection matrices are learned to map data from the source view and the target view, respectively, into a common feature space by simultaneously minimizing the canonical correlations of interclass training data, maximizing the canonical correlations of intraclass training data, and reducing the data distribution mismatch between the source and target views in the common feature space. In our method, the source and target views neither share any common features nor have any corresponding action instances. Moreover, HTDCC can handle settings where only a few, or even no, labeled samples are available in the target view, and it extends easily to multiple source views. We additionally propose a weighted learning framework for multi-source-view adaptation that effectively leverages action knowledge learned from multiple source views for the recognition task in the target view. Under this framework, different source views are assigned different weights according to their relevance to the target view, each weight representing how much the corresponding source view contributes to the target view. Extensive experiments on the IXMAS data set demonstrate the effectiveness of HTDCC in learning the common feature space for heterogeneous cross-view action recognition. In addition, the weighted learning framework achieves promising results in automatically adapting knowledge transferred from multiple source views to the target view.
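
At inference time, the two learned projections place heterogeneous features in one space, and multiple source views are combined by relevance weights; the sketch below assumes the projections and weights are already learned, with shapes chosen for illustration.

```python
# Hedged sketch: projecting heterogeneous view features into a common space
# and weighting class scores from multiple source views.
import torch

def to_common_space(P_s, P_t, x_s, x_t):
    """P_s: (d, d_s), P_t: (d, d_t) learned projections; x_s: (d_s,), x_t: (d_t,)
    features with different dimensionalities in the two views."""
    return P_s @ x_s, P_t @ x_t             # both land in the d-dim common space

def combine_sources(scores_per_view, weights):
    """scores_per_view: (V, C) class scores from V source views;
    weights: (V,) relevance of each view to the target, summing to 1."""
    return weights @ scores_per_view        # (C,) weighted class scores
```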
