Results 1 - 20 of 59
1.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 4747-4762, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38261478

ABSTRACT

Despite significant results achieved by Contrastive Language-Image Pretraining (CLIP) in zero-shot image recognition, limited effort has been made exploring its potential for zero-shot video recognition. This paper presents Open-VCLIP++, a simple yet effective framework that adapts CLIP to a strong zero-shot video classifier, capable of identifying novel actions and events during testing. Open-VCLIP++ minimally modifies CLIP to capture spatial-temporal relationships in videos, thereby creating a specialized video classifier while striving for generalization. We formally demonstrate that training Open-VCLIP++ is tantamount to continual learning with zero historical data. To address this problem, we introduce Interpolated Weight Optimization, a technique that leverages the advantages of weight interpolation during both training and testing. Furthermore, we build upon large language models to produce fine-grained video descriptions. These detailed descriptions are further aligned with video features, facilitating a better transfer of CLIP to the video domain. Our approach is evaluated on three widely used action recognition datasets, following a variety of zero-shot evaluation protocols. The results demonstrate that our method surpasses existing state-of-the-art techniques by significant margins. Specifically, we achieve zero-shot accuracy scores of 88.1%, 58.7%, and 81.2% on UCF, HMDB, and Kinetics-600 datasets respectively, outpacing the best-performing alternative methods by 8.5%, 8.2%, and 12.3%. We also evaluate our approach on the MSR-VTT video-text retrieval dataset, where it delivers competitive video-to-text and text-to-video retrieval performance, while utilizing substantially less fine-tuning data compared to other methods.
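The abstract above names weight interpolation but does not give its formula. As a minimal sketch, assuming plain linear interpolation between a pretrained and a fine-tuned set of weights (the function name, the `lam` coefficient, and the dictionary-of-lists layout are illustrative assumptions, not the paper's Interpolated Weight Optimization procedure), the idea can be written as:

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def interpolate_weights(pretrained: Weights, finetuned: Weights, lam: float) -> Weights:
    """Return (1 - lam) * pretrained + lam * finetuned for every parameter.

    Hypothetical helper: lam = 0 keeps the original model unchanged,
    lam = 1 keeps only the fine-tuned (video-specialized) weights.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if pretrained.keys() != finetuned.keys():
        raise ValueError("both weight sets must share the same parameter names")
    return {
        name: [(1.0 - lam) * p + lam * f
               for p, f in zip(pretrained[name], finetuned[name])]
        for name in pretrained
    }


# Toy example with a single two-element parameter (values are made up).
merged = interpolate_weights({"w": [0.0, 2.0]}, {"w": [1.0, 4.0]}, lam=0.5)
print(merged)
```

Intermediate values of `lam` trade specialization on the new data against retention of the original model's general behaviour, which is the tension the abstract describes.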

2.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3772-3783, 2024 May.
Article in English | MEDLINE | ID: mdl-38153825

ABSTRACT

The cross-model transferability of adversarial examples makes black-box attacks practical. However, reliable transferability typically requires access to inputs of the same modality as the black-box models. Unfortunately, the collection of datasets may be difficult in security-critical scenarios. Hence, cross-modal attacks that fool models with different input modalities would seriously threaten real-world DNN applications. The above considerations motivate us to investigate the cross-modal transferability of adversarial examples. In particular, we aim to generate video adversarial examples from white-box image models to attack video CNN and ViT models. We introduce the Image To Video (I2V) attack based on the observation that image and video models share similar low-level features. For each video frame, I2V optimizes perturbations by reducing the similarity of intermediate features between benign and adversarial frames on image models. Then I2V combines the adversarial frames to generate video adversarial examples. I2V can be easily extended to simultaneously perturb multi-layer features extracted from an ensemble of image models. To efficiently integrate various features, we introduce an adaptive approach that re-weights the contribution of each layer based on its cosine similarity values from the previous attack step. Experimental results demonstrate the effectiveness of the proposed method.
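The objective described above, reducing the similarity between benign and adversarial intermediate features, can be sketched with a plain cosine similarity. This is a hedged illustration only: the function names and the fixed `layer_weights` argument are assumptions, and the abstract does not specify how the adaptive re-weighting is actually computed.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def feature_similarity_loss(
    benign_layers: Sequence[Sequence[float]],
    adversarial_layers: Sequence[Sequence[float]],
    layer_weights: Sequence[float],
) -> float:
    """Weighted sum of per-layer similarities; an attacker would minimize this.

    layer_weights is a hypothetical stand-in for the adaptive weighting
    mentioned in the abstract.
    """
    return sum(
        w * cosine_similarity(f, g)
        for w, f, g in zip(layer_weights, benign_layers, adversarial_layers)
    )


# Two toy layers: the first is unchanged, the second is rotated by 90 degrees.
loss = feature_similarity_loss(
    [[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]
)
print(loss)
```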

3.
IEEE Trans Image Process ; 32: 6346-6358, 2023.
Article in English | MEDLINE | ID: mdl-37966925

ABSTRACT

The transferability of adversarial examples across different convolutional neural networks (CNNs) makes it feasible to perform black-box attacks, resulting in security threats for CNNs. However, fewer efforts have been made to investigate transferable attacks for vision transformers (ViTs), which achieve superior performance on various computer vision tasks. Unlike CNNs, ViTs establish relationships between patches extracted from inputs by the self-attention module. Thus, adversarial examples crafted on CNNs may transfer poorly to ViTs. To assess the security of ViTs comprehensively, we investigate the transferability across different ViTs in both untargeted and targeted scenarios. More specifically, we propose a Pay No Attention (PNA) attack, which ignores attention gradients during backpropagation to improve the linearity of backpropagation. Additionally, we introduce a PatchOut/CubeOut attack for image/video ViTs. They optimize perturbations within a randomly selected subset of patches/cubes during each iteration, preventing over-fitting to the white-box surrogate ViT model. Furthermore, we maximize the L2 norm of perturbations, ensuring that the generated adversarial examples deviate significantly from the benign ones. These strategies are designed to be compatible with one another. Combining them can enhance transferability by jointly considering patch-based inputs and the self-attention of ViTs. Moreover, the proposed combined attack integrates with existing transferable attacks, providing an additional boost to transferability. We conduct experiments on ImageNet and Kinetics-400 for image and video ViTs, respectively. Experimental results demonstrate the effectiveness of the proposed method.
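The PatchOut idea of updating only a random subset of patches per iteration can be illustrated as follows. This is a simplified sketch under stated assumptions: each patch is reduced to a single scalar, and `patchout_mask` and `masked_update` are hypothetical names rather than the authors' code.

```python
import random
from typing import List


def patchout_mask(num_patches: int, num_keep: int, rng: random.Random) -> List[int]:
    """Return a 0/1 mask with exactly num_keep randomly chosen patches set to 1."""
    if not 0 <= num_keep <= num_patches:
        raise ValueError("num_keep must lie between 0 and num_patches")
    kept = set(rng.sample(range(num_patches), num_keep))
    return [1 if i in kept else 0 for i in range(num_patches)]


def masked_update(perturbation: List[float], step: List[float], mask: List[int]) -> List[float]:
    """Apply the update step only to the patches selected by the mask."""
    return [p + m * s for p, s, m in zip(perturbation, step, mask)]


rng = random.Random(0)          # fixed seed so the sketch is reproducible
perturbation = [0.0] * 16       # one scalar per patch, e.g. a 4 x 4 patch grid
for _ in range(3):              # a fresh random subset is drawn each iteration
    mask = patchout_mask(16, 4, rng)
    perturbation = masked_update(perturbation, [0.1] * 16, mask)
print(perturbation)
```

Drawing a new mask every iteration means no single patch pattern dominates the optimization, which is the over-fitting safeguard the abstract refers to.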

4.
World J Surg Oncol ; 21(1): 203, 2023 Jul 11.
Article in English | MEDLINE | ID: mdl-37430268

ABSTRACT

PURPOSE: Thymoma is the most common primary tumor in the anterior mediastinum. The prognostic factors of patients with thymoma still need to be clarified. In this study, we aimed to investigate the prognostic factors of patients with thymoma who received radical resection and establish the nomogram to predict the prognosis of these patients. MATERIALS AND METHODS: Patients who underwent radical resection for thymoma with complete follow-up data between 2005 and 2021 were enrolled. Their clinicopathological characteristics and treatment methods were retrospectively analyzed. Progression-free survival (PFS) and overall survival (OS) were estimated using the Kaplan-Meier method and compared by the log-rank test. Univariate and multivariate Cox proportional hazards regression analyses were performed to identify the independent prognostic factors. According to the results of the univariate analysis in the Cox regression model, the predictive nomograms were created. RESULTS: A total of 137 patients with thymoma were enrolled. With a median follow-up of 52 months, the 5-year and 10-year PFS rates were 79.5% and 68.1%, respectively. The 5-year and 10-year OS rates were 88.4% and 73.1%, respectively. Smoking status (P = 0.022) and tumor size (P = 0.039) were identified as independent prognostic factors for PFS. Multivariate analysis showed that a high level of neutrophils (P = 0.040) was independently associated with OS. The nomogram showed that the World Health Organization (WHO) histological classification contributed more to the risk of recurrence than other factors. Neutrophil count was the most important predictor of OS in patients with thymoma. CONCLUSION: Smoking status and tumor size are risk factors for PFS in patients with thymoma. A high level of neutrophils is an independent prognostic factor for OS. The nomograms developed in this study accurately predict PFS and OS rates at 5 and 10 years in patients with thymoma based on individual characteristics.


Subjects
Thymoma , Thymus Neoplasms , Humans , Thymoma/surgery , Prognosis , Retrospective Studies , Thymus Neoplasms/surgery , World Health Organization
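The Kaplan-Meier method used in the thymoma study above estimates survival as a product over event times of (1 - deaths / number at risk). The sketch below is a minimal pure-Python version; the follow-up times and event flags are invented for illustration and are not data from the study.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one event.

    events[i] is 1 if the event (e.g. progression or death) was observed at
    times[i] and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; the second patient is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of surviving patients.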
5.
IEEE Trans Image Process ; 31: 7078-7090, 2022.
Article in English | MEDLINE | ID: mdl-36346859

ABSTRACT

Vanilla Few-shot Learning (FSL) learns to build a classifier for a new concept from one or very few target examples, with the general assumption that source and target classes are sampled from the same domain. Recently, the task of Cross-Domain Few-Shot Learning (CD-FSL) aims at tackling FSL where there is a huge domain shift between the source and target datasets. Extensive efforts on CD-FSL have been made via either directly extending the meta-learning paradigm of vanilla FSL methods, or employing massive unlabeled target data to help learn models. In this paper, we notice that in the CD-FSL task, the few labeled target images have never been explicitly leveraged to inform the model in the training stage. However, such a labeled target example set is very important for bridging the huge domain gap. Critically, this paper advocates a more practical training scenario for CD-FSL. Our key insight is to utilize a few labeled target data to guide the learning of the CD-FSL model. Technically, we propose a novel Generalized Meta-learning based Feature-Disentangled Mixup network, namely GMeta-FDMixup. We make three key contributions in utilizing GMeta-FDMixup to address CD-FSL. Firstly, we present two mixup modules, mixup-P and mixup-M, that help utilize the unbalanced and disjoint source and target datasets. These two novel modules enable diverse image generation for training the model on the source domain. Secondly, to narrow the domain gap explicitly, we contribute a novel feature disentanglement module that learns to decouple the domain-irrelevant and domain-specific features. By stripping the domain-specific features, we alleviate the negative effects caused by the domain inductive bias. Finally, we repurpose a new contrastive learning module, dubbed ConL. ConL prevents the model from only capturing category-related features via introducing contrastive loss. Thus, the generalization ability on novel categories is improved. Extensive experimental results on two benchmarks show the superiority of our setting and the effectiveness of our method. Code and models will be released.

7.
IEEE Trans Pattern Anal Mach Intell ; 44(4): 1699-1711, 2022 04.
Article in English | MEDLINE | ID: mdl-33026981

ABSTRACT

We introduce AdaFrame, a conditional computation framework that adaptively selects relevant frames on a per-input basis for fast video recognition. AdaFrame, which contains a Long Short-Term Memory augmented with a global memory to provide context information, operates as an agent to interact with video sequences aiming to search over time which frames to use. Trained with policy search methods, at each time step, AdaFrame computes a prediction, decides where to observe next, and estimates a utility, i.e., expected future rewards, of viewing more frames in the future. Exploring predicted utilities at testing time, AdaFrame is able to achieve adaptive lookahead inference so as to minimize the overall computational cost without incurring a degradation in accuracy. We conduct extensive experiments on two large-scale video benchmarks, FCVID and ActivityNet. With a vanilla ResNet-101 model, AdaFrame achieves similar performance of using all frames while only requiring, on average, 8.21 and 8.65 frames on FCVID and ActivityNet, respectively. We also demonstrate AdaFrame is compatible with modern 2D and 3D networks for video recognition. Furthermore, we show, among other things, learned frame usage can reflect the difficulty of making prediction decisions both at instance-level within the same class and at class-level among different categories.


Subjects
Algorithms
8.
Nutr Neurosci ; 25(5): 1001-1010, 2022 May.
Article in English | MEDLINE | ID: mdl-33078688

ABSTRACT

OBJECTIVE: To investigate the effect of maternal zinc deficiency on learning and memory in offspring and the changes in DNA methylation patterns. METHODS: Pregnant rats were divided into zinc adequate (ZA), zinc deficient (ZD), and paired fed (PF) groups. Serum zinc contents and AKP activity in mother rats and offspring at P21 (end of lactation) and P60 (weaned, adult) were detected. Cognitive ability of offspring at P21 and P60 was determined by Morris water maze. The expression of proteins including DNMT3a, DNMT1, GADD45β, MeCP2 and BDNF in the offspring hippocampus was detected by Western blot. The methylation status of the BDNF promoter region in the hippocampus of offspring rats was detected by MS-qPCR. RESULTS: Compared with the ZA and PF groups, pups in the ZD group had lower zinc levels and AKP activity in the serum, spent more time finding the platform and spent less time going through the platform area. Protein expression of DNMT1 and GADD45β was downregulated in the ZD group at P0 and P21 but not P60 compared with the ZA and PF groups; these results were consistent with a reduction in BDNF protein at P0 (neonate) and P21. However, when pups of rats in the ZD group were supplemented with zinc ion from P21 to P60, MeCP2 and GADD45β expression was significantly downregulated compared with the ZA and PF groups. CONCLUSION: Post-weaning zinc supplementation may improve cognitive impairment induced by early life zinc deficiency, whereas it may not completely reverse the abnormal expression of particular genes that are involved in DNA methylation, binding to methylated DNA and neurogenesis.


Subjects
DNA Methylation , Malnutrition , Animals , Antigens, Differentiation/genetics , Brain-Derived Neurotrophic Factor/genetics , Brain-Derived Neurotrophic Factor/metabolism , Female , Hippocampus/metabolism , Learning , Malnutrition/metabolism , Pregnancy , Rats , Zinc
9.
Nat Nanotechnol ; 16(8): 874-881, 2021 08.
Article in English | MEDLINE | ID: mdl-34083773

ABSTRACT

Flash memory has become a ubiquitous solid-state memory device widely used in portable digital devices, computers and enterprise applications. The development of the information age has demanded improvements in memory speed and retention performance. Here we demonstrate an ultrafast non-volatile flash memory based on MoS2/hBN/multilayer graphene van der Waals heterostructures, which achieves an ultrafast writing/erasing speed of 20 ns through two-triangle-barrier modified Fowler-Nordheim tunnelling. Using detailed theoretical analysis and experimental verification, we postulate that a suitable barrier height, gate coupling ratio and clean interface are the main reasons for the breakthrough writing/erasing speed of our flash memory devices. Because of its non-volatility this ultrafast flash memory could provide the foundation for the next generation of high-speed non-volatile memory.

10.
Nano Lett ; 21(4): 1758-1764, 2021 Feb 24.
Article in English | MEDLINE | ID: mdl-33565310

ABSTRACT

As transistor feature sizes continue to scale down, the scaling of the supply voltage has stagnated because of the subthreshold swing (SS) limit. A transistor with a new mechanism is needed to break through the thermionic limit of SS while maintaining a large drive current. Here, by adopting the recently proposed Dirac-source field-effect transistor (DSFET) technology, we experimentally demonstrate a MoS2/graphene (1.8 nm/0.3 nm) DSFET for the first time, and a steep SS of 37.9 mV/dec at room temperature with nearly hysteresis-free operation is observed. In addition, by introducing a gate-all-around (GAA) structure, the MoS2/graphene DSFET exhibits a steeper SS of 33.5 mV/dec and a 40% increased normalized drive current of up to 52.7 µA·µm/µm (VDS = 1 V) with a current on/off ratio of 10^8, which shows potential for low-power and high-performance electronics applications.
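For context, the thermionic limit referred to above is SS = ln(10) · kT/q, which is roughly 60 mV/dec near room temperature. The short calculation below compares the two reported swings against that limit; taking room temperature as 300 K is an assumption, since the abstract does not state the exact value.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # exact SI value
ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value


def thermionic_ss_limit_mv_per_dec(temperature_k: float = 300.0) -> float:
    """Thermionic subthreshold-swing limit ln(10) * k * T / q, in mV per decade."""
    return math.log(10) * BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C * 1e3


limit = thermionic_ss_limit_mv_per_dec(300.0)  # 300 K is an assumed room temperature
for label, ss in [("planar DSFET", 37.9), ("GAA DSFET", 33.5)]:
    print(f"{label}: {ss} mV/dec is {ss / limit:.0%} of the {limit:.1f} mV/dec limit")
```

Both reported values fall below the limit, which is what "steep" means in this setting: less gate voltage is needed per decade of current change than a conventional thermionic transistor allows.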

11.
Aging (Albany NY) ; 13(3): 4115-4137, 2021 01 20.
Article in English | MEDLINE | ID: mdl-33494069

ABSTRACT

In vitro and in vivo models of Parkinson's disease were established to investigate the effects of the lncRNA XIST/miR-199a-3p/Sp1/LRRK2 axis. The binding between XIST and miR-199a-3p as well as miR-199a-3p and Sp1 were examined by luciferase reporter assay and confirmed by RNA immunoprecipitation analysis. Following the Parkinson's disease animal behavioural assessment by suspension and swim tests, the brain tissue injuries were evaluated by hematoxylin and eosin, TdT-mediated dUTP-biotin nick end labelling, and tyrosine hydroxylase stainings. The results indicated that miR-199a-3p expression was downregulated, whereas that of XIST, Sp1 and LRRK2 were upregulated in Parkinson's disease. Moreover, miR-199a-3p overexpression or XIST knockdown inhibited the cell apoptosis induced by MPP+ treatment and promoted cell proliferation. The neurodegenerative defects were significantly recovered by treating the cells with shXIST or shSp1, whereas miR-199a-3p inhibition or Sp1 and LRRK2 overexpression abrogated these beneficial effects. Furthermore, the results of our in vivo experiments confirmed the neuroprotective effects of shXIST and miR-199a-3p against MPTP-induced brain injuries, and the Parkinson's disease behavioural symptoms were effectively alleviated upon shXIST or miR-199a-3p treatment. In summary, the results of the present study showed that lncRNA XIST sponges miR-199a-3p to modulate Sp1 expression and further accelerates Parkinson's disease progression by targeting LRRK2.


Subjects
Apoptosis/genetics , Carrier Proteins/genetics , Leucine-Rich Repeat Serine-Threonine Protein Kinase-2/genetics , MicroRNAs/genetics , Nerve Tissue Proteins/genetics , Neurons/metabolism , Parkinson Disease/genetics , RNA, Long Noncoding/genetics , 1-Methyl-4-phenylpyridinium/toxicity , Animals , Apoptosis/drug effects , Carrier Proteins/metabolism , Cell Line, Tumor , Disease Progression , Gene Knockdown Techniques , Herbicides/toxicity , Humans , Intracellular Signaling Peptides and Proteins/genetics , Intracellular Signaling Peptides and Proteins/metabolism , Leucine-Rich Repeat Serine-Threonine Protein Kinase-2/metabolism , Mice , MicroRNAs/metabolism , Nerve Tissue Proteins/metabolism , Neurons/drug effects , PC12 Cells , Parkinson Disease/metabolism , Parkinson Disease/physiopathology , Parkinsonian Disorders/genetics , Parkinsonian Disorders/metabolism , Parkinsonian Disorders/physiopathology , RNA, Long Noncoding/metabolism , Rats
12.
IEEE Trans Pattern Anal Mach Intell ; 43(10): 3600-3613, 2021 10.
Article in English | MEDLINE | ID: mdl-32248097

ABSTRACT

In this paper, we propose an end-to-end deep learning architecture that generates 3D triangular meshes from single color images. Restricted by the nature of prevalent deep learning techniques, the majority of previous works represent 3D shapes in volumes or point clouds. However, it is non-trivial to convert these representations to compact and ready-to-use mesh models. Unlike the existing methods, our network represents 3D shapes in meshes, which are essentially graphs and well suited for graph-based convolutional neural networks. Leveraging perceptual features extracted from an input image, our network produces the correct geometry by progressively deforming an ellipsoid. To make the whole deformation procedure stable, we adopt a coarse-to-fine strategy, and define various mesh/surface related losses to capture properties of various aspects, which benefits producing the visually appealing and physically accurate 3D geometry. In addition, our model by nature can be adapted to objects in specific domains, e.g., human faces, and be easily extended to learn per-vertex properties, e.g., color. Extensive experiments show that our method not only qualitatively produces the mesh model with better details, but also achieves the higher 3D shape estimation accuracy compared against the state-of-the-arts.

13.
IEEE Trans Image Process ; 30: 1514-1526, 2021.
Article in English | MEDLINE | ID: mdl-33360994

ABSTRACT

Food recognition has attracted considerable research attention because of its importance for health-related applications. The existing approaches mostly focus on the categorization of food according to dish names, while ignoring the underlying ingredient composition. In reality, two dishes with the same name do not necessarily share the exact list of ingredients. Therefore, dishes under the same food category are not necessarily equal in nutrition content. Nevertheless, due to the limited datasets available with ingredient labels, the problem of ingredient recognition is often overlooked. Furthermore, as the number of ingredients is expected to be much smaller than the number of food categories, ingredient recognition is more tractable in real-world scenarios. This paper provides an analysis of three compelling issues in ingredient recognition. These issues involve recognition at either image level or region level, pooling at either single or multiple image scales, and learning in either a single-task or multi-task manner. The analysis is conducted on a large food dataset, Vireo Food-251, contributed by this paper. The dataset is composed of 169,673 images with 251 popular Chinese food categories and 406 ingredients. The dataset includes adequate challenges in scale and complexity to reveal the limits of the current approaches to ingredient recognition.


Subjects
Deep Learning , Food Ingredients/classification , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , China , Cooking , Humans
14.
Medicine (Baltimore) ; 99(38): e22238, 2020 Sep 18.
Article in English | MEDLINE | ID: mdl-32957367

ABSTRACT

BACKGROUND: Systematic evaluation of the effectiveness and safety of combined procarbazine, lomustine, and vincristine for treating recurrent high-grade glioma. METHODS: Electronic databases including PubMed, MEDLINE, EMBASE, Cochrane Library Central Register of Controlled Trials, WanFang, and China National Knowledge Infrastructure (CNKI) were used to search for studies related to the utilization of combined procarbazine, lomustine, and vincristine as a therapeutic method for recurrent high-grade glioma. Literature screening, extraction of data, and evaluation of high standard studies were conducted by 2 independent researchers. The robustness and strength of the effectiveness and safety of combined procarbazine, lomustine, and vincristine as a therapeutic methodology for recurrent high-grade glioma was assessed based on the odds ratio (OR), mean differences (MDs), and 95% confidence interval (CI). RevMan 5.3 software was used for carrying out the statistical analysis. RESULTS: These results obtained in this study will be published in a peer-reviewed journal. CONCLUSION: Evidently, the conclusion of this study will provide an assessment on whether combined procarbazine, lomustine, and vincristine provides an effective and safe form of treatment for recurrent high-grade glioma. SYSTEMATIC REVIEW REGISTRATION NUMBER: INPLASY202080078.


Subjects
Antineoplastic Combined Chemotherapy Protocols/adverse effects , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Brain Neoplasms/drug therapy , Glioma/drug therapy , Meta-Analysis as Topic , Neoplasm Recurrence, Local/drug therapy , Systematic Reviews as Topic , Adolescent , Adult , Brain Neoplasms/pathology , Glioma/pathology , Humans , Lomustine/adverse effects , Lomustine/therapeutic use , Neoplasm Grading , Neoplasm Recurrence, Local/pathology , Procarbazine/adverse effects , Procarbazine/therapeutic use , Vincristine/adverse effects , Vincristine/therapeutic use , Young Adult
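The protocol above plans to summarize effects with odds ratios and 95% confidence intervals. For a single 2 x 2 table these can be computed with the standard log-odds (Woolf) method, sketched below. The counts are invented for illustration, since the protocol reports no results, and this is not the pooling procedure that RevMan applies across studies.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and confidence interval for one 2 x 2 table.

    a = events in the treatment arm, b = non-events in the treatment arm,
    c = events in the control arm,   d = non-events in the control arm.
    z = 1.96 corresponds to a 95% interval under the normal approximation.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple formula")
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts: 10/30 responders with treatment versus 5/45 with control.
estimate, low, high = odds_ratio_with_ci(10, 20, 5, 40)
print(f"OR = {estimate:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

An interval that excludes 1 indicates a difference between arms at the chosen confidence level for that single table.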
15.
Article in English | MEDLINE | ID: mdl-32946393

ABSTRACT

Generating realistic images with the guidance of reference images and human poses is challenging. Despite the success of previous works on synthesizing person images in iconic views, no efforts have been made towards the task of pose-guided image synthesis in non-iconic views. In particular, we find that previous models cannot handle such a complex task, where the person images are captured in non-iconic views by commercially available digital cameras. To this end, we propose a new framework, the Multi-branch Refinement Network (MR-Net), which utilizes several visual cues, including target person poses, the foreground person body and parsed scene images. Furthermore, a novel Region of Interest (RoI) perceptual loss is proposed to optimize the MR-Net. Extensive experiments on two non-iconic datasets, Penn Action and BBC-Pose, as well as an iconic dataset, Market-1501, show the efficacy of the proposed model, which can tackle the problem of pose-guided person image generation from non-iconic views. The data, models, and code are downloadable from https://github.com/loadder/MR-Net.

16.
Article in English | MEDLINE | ID: mdl-32857695

ABSTRACT

The process of learning good representations for machine learning tasks can be very computationally expensive. Typically, we use the same backbones learned on the training set to infer the labels of testing data. Interestingly, this learning and inference paradigm is quite different from the typical inference scheme of human biological visual systems. Essentially, neuroscience studies have shown that the right hemisphere of the human brain predominantly performs fast processing of low-frequency spatial signals, while the left hemisphere focuses more on analyzing high-frequency information in a slower way. The low-pass analysis helps facilitate the high-pass analysis in the form of feedback. Inspired by this biological vision mechanism, this paper explores the possibility of learning a layer-skippable inference network. Specifically, we propose a layer-skippable network that dynamically carries out coarse-to-fine object categorization. Such a network has two branches to jointly deal with both coarse and fine-grained classification tasks. The layer-skipping mechanism is proposed to learn a gating network by generating dynamic inference graphs, and reducing the computational cost by detouring the inference path from some layers. This adaptive path inference strategy endows the network with better flexibility and larger capacity and yields high-performance deep networks with dynamic structures. To efficiently train the gating network, a novel ranking-based loss function is presented. Furthermore, the learned representations are enhanced by the proposed top-down feedback facilitation and feature-wise affine transformation, individually. The former employs features of a coarse branch to help the fine-grained object recognition task, while the latter encodes the selected path to enhance the final feature representations. Extensive experiments are conducted on several widely used coarse-to-fine object categorization benchmarks, and promising results are achieved by our proposed model. Quite surprisingly, our layer-skipping mechanism improves the network robustness to adversarial attacks. The code and models are released at https://github.com/avalonstrel/DSN.

17.
Nat Nanotechnol ; 15(7): 545-557, 2020 07.
Article in English | MEDLINE | ID: mdl-32647168

ABSTRACT

Rapid digital technology advancement has resulted in a tremendous increase in computing tasks imposing stringent energy efficiency and area efficiency requirements on next-generation computing. To meet the growing data-driven demand, in-memory computing and transistor-based computing have emerged as potent technologies for the implementation of matrix and logic computing. However, to fulfil the future computing requirements new materials are urgently needed to complement the existing Si complementary metal-oxide-semiconductor technology and new technologies must be developed to enable further diversification of electronics and their applications. The abundance and rich variety of electronic properties of two-dimensional materials have endowed them with the potential to enhance computing energy efficiency while enabling continued device downscaling to a feature size below 5 nm. In this Review, from the perspective of matrix and logic computing, we discuss the opportunities, progress and challenges of integrating two-dimensional materials with in-memory computing and transistor-based computing technologies.

18.
Article in English | MEDLINE | ID: mdl-32406834

ABSTRACT

During the past decade, both multi-label learning and zero-shot learning have attracted huge research attention, and significant progress has been made. Multi-label learning algorithms aim to predict multiple labels given one instance, while most existing zero-shot learning approaches aim to predict a single testing label for each unseen class by transferring knowledge from auxiliary seen classes to target unseen classes. However, relatively less effort has been made on predicting multiple labels in the zero-shot setting, which is nevertheless a quite challenging task. In this work, we investigate and formalize a flexible framework consisting of two components, i.e., visual-semantic embedding and zero-shot multi-label prediction. First, we present a deep regression model to project the visual features into the semantic space, which explicitly exploits the correlations in the intermediate semantic layer of word vectors and makes label prediction possible. Then, we formulate the label prediction problem as a pairwise one and employ Ranking SVM to seek the unique multi-label correlations in the embedding space. Furthermore, we provide a transductive multi-label zero-shot prediction approach that exploits the testing data manifold structure. We demonstrate the effectiveness of the proposed approach on three popular multi-label datasets with state-of-the-art performance obtained on both conventional and generalized ZSL settings.
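Casting label prediction as a pairwise problem, as the abstract above describes, usually means penalizing every relevant label that fails to outscore an irrelevant one by a margin. The sketch below shows that generic pairwise hinge objective; the label names, scores, and function name are hypothetical, and it is not the paper's full Ranking SVM formulation.

```python
from typing import Dict, Iterable


def pairwise_ranking_hinge(
    scores: Dict[str, float], relevant: Iterable[str], margin: float = 1.0
) -> float:
    """Sum of hinge penalties over all (relevant, irrelevant) label pairs.

    A pair contributes max(0, margin - (score_relevant - score_irrelevant)),
    so it costs nothing once the relevant label leads by at least the margin.
    """
    relevant_set = set(relevant)
    irrelevant = [label for label in scores if label not in relevant_set]
    loss = 0.0
    for pos in relevant_set:
        for neg in irrelevant:
            loss += max(0.0, margin - (scores[pos] - scores[neg]))
    return loss


# Made-up scores for one image whose only true label is "dog".
print(pairwise_ranking_hinge({"dog": 2.0, "cat": 0.5, "grass": 1.5}, ["dog"]))
```

Because only score differences matter, the same objective applies to labels never seen in training as long as they can be scored in the shared embedding space.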

19.
IEEE Trans Pattern Anal Mach Intell ; 42(2): 371-385, 2020 02.
Article in English | MEDLINE | ID: mdl-31329547

ABSTRACT

Person re-identification (re-id) aims to match people across non-overlapping camera views in a public space. This is a challenging problem because the people captured in surveillance videos often wear similar clothing. Consequently, the differences in their appearance are typically subtle and only detectable at particular locations and scales. In this paper, we propose a deep re-id network (MuDeep) that is composed of two novel types of layers - a multi-scale deep learning layer, and a leader-based attention learning layer. Specifically, the former learns deep discriminative feature representations at different scales, while the latter utilizes the information from multiple scales to lead and determine the optimal weightings for each scale. The importance of different spatial locations for extracting discriminative features is learned explicitly via our leader-based attention learning layer. Extensive experiments are carried out to demonstrate that the proposed MuDeep outperforms the state-of-the-art on a number of benchmarks and has a better generalization ability under a domain generalization setting.


Subjects
Biometric Identification/methods , Deep Learning , Image Processing, Computer-Assisted/methods , Databases, Factual , Humans , Pattern Recognition, Automated , Video Recording
20.
Nutr Neurosci ; 23(1): 75-84, 2020 Jan.
Article in English | MEDLINE | ID: mdl-29781405

ABSTRACT

Objective: To examine protein changes in the hippocampus of APP/PS1 transgenic mice after blueberry extract (BB) intervention. Methods: Eight APP/PS1 transgenic mice were randomly assigned to an Alzheimer's disease (AD)+BB group (n=4) and an AD+control group (n=4). After a 16-week treatment, 2-DE and MALDI-TOF-MS were used to compare the proteomic profiles of the hippocampus in the two groups, and Western blot was used to confirm the important differentially expressed proteins. Results: Twelve proteins were differentially expressed between the two groups. Nine of them were identified. Cytochrome b-c1 complex subunit 6, beta-actin, dynamin 1, and heat shock cognate 71 were up-regulated in the AD+BB group, while α-enolase, stress-induced-phosphoprotein 1, malate dehydrogenase (MDH), MDH 1, and T-complex protein 1 subunit beta were down-regulated. Importantly, some of the identified proteins (e.g. dynamin 1) are known to be involved in cognitive impairment. Western blot analysis of hippocampal dynamin 1 expression confirmed the proteomic findings. Conclusions: The consumption of BB modulates the expression of proteins that are linked to the improvement of cognitive dysfunction in the hippocampus of APP/PS1 transgenic mice.


Subjects
Alzheimer Disease/metabolism , Blueberry Plants , Hippocampus/drug effects , Hippocampus/metabolism , Plant Extracts/administration & dosage , Animals , Disease Models, Animal , Mice, Transgenic , Proteomics