ABSTRACT
Human neuroimaging studies have demonstrated that exercise influences cortical structural plasticity, as indexed by gray or white matter volume. It remains unclear, however, whether exercise induces cortical changes at the finer-grained level of myelination structure. To answer this question, we scanned 28 elite golf players and control participants using a novel neuroimaging technique, quantitative magnetic resonance imaging (qMRI). The data showed myeloarchitectonic plasticity in the left temporal pole of the golf players: the microstructure of this brain region showed greater proliferation in the golf players than in the control participants. In addition, this myeloarchitectonic plasticity was positively related to golfing proficiency. Our study demonstrates that myeloarchitectonic plasticity can be induced by exercise and thus sheds light on the potential benefits of exercise for brain health and cognitive enhancement.
Subjects
Golf, White Matter, Brain/diagnostic imaging, Humans, Magnetic Resonance Imaging, Neuroimaging, White Matter/diagnostic imaging
ABSTRACT
Reverse thermally induced phase separation (RTIPS) was used to obtain a separation membrane with a better internal structure for a higher water flux and a surface that could easily form a hydration layer. In contrast to traditional modification methods, this work focused on how the internal structure obtained by changing the membrane-making method provided easier adhesion conditions for the dopamine/TiO2 hybrid nanoparticles (DA/TiO2 HNPs) obtained by biomimetic mineralization. The variation in adhesion with the water bath temperature and the amount of titanium added was explored through turbidity point measurements, SEM images, water contact angle, thermogravimetric tests, EDX, AFM, XPS, FTIR and other tests. The SEM images showed that the membrane obtained through the RTIPS method had a porous surface and a spongy internal structure, and that additional polymers were adsorbed. EDX demonstrated that biomimetic mineralization prevented the production of agglomerated titanium dioxide. XPS and FTIR spectra confirmed the introduction and immobilization of HNP aggregates. Moreover, a decrease in the surface roughness and water contact angle further suggested an improvement in the hydrophilicity of the modified membrane. The introduction of HNPs at a higher water bath temperature helped increase the water flux by up to ten times, while the oil-water separation efficiency still exceeded 99.50%. Lastly, a cycle test of the modified membrane under the optimal conditions confirmed that these membrane-forming conditions provided a better environment for the formation of the hydrophilic layer, which was conducive to the recycling of the separation membrane. In summary, more firmly fixed and more hydrophilic particles could be obtained through the RTIPS method based on biomimetic mineralization, which prevents the accumulation of titanium dioxide and thus helps improve the permeability and anti-fouling performance of the membrane.
Subjects
Bionics, Artificial Membranes, Polymers/chemistry, Sulfones
ABSTRACT
With more than a billion people lacking accessible drinking water, there is a critical need to convert nonpotable sources such as seawater to water suitable for human use. However, energy requirements of desalination plants account for half their operating costs, so alternative, lower energy approaches are equally critical. Membrane distillation (MD) has shown potential due to its low operating temperature and pressure requirements, but the requirement of heating the input water makes it energy intensive. Here, we demonstrate nanophotonics-enabled solar membrane distillation (NESMD), where highly localized photothermal heating induced by solar illumination alone drives the distillation process, entirely eliminating the requirement of heating the input water. Unlike MD, NESMD can be scaled to larger systems and shows increased efficiencies with decreased input flow velocities. Along with its increased efficiency at higher ambient temperatures, these properties all point to NESMD as a promising solution for household- or community-scale desalination.
Subjects
Distillation/instrumentation, Distillation/methods, Artificial Membranes, Solar Energy, Water Purification/instrumentation, Water Purification/methods
ABSTRACT
BACKGROUND: Alzheimer disease (AD) is the most common type of dementia, with cognitive decline as one of the core symptoms in older adults. Numerous studies have suggested the value of psychosocial interventions to improve cognition in this population, but which one should be preferred is still a matter of controversy. Consequently, we aim to compare and rank different psychosocial interventions in the management of mild to moderate AD with cognitive symptoms. METHODS: We did a network meta-analysis to identify both direct and indirect evidence in relevant studies. We searched MEDLINE, EMBASE, and PsycINFO through the OVID database, and CENTRAL through the Cochrane Library, for randomized controlled clinical trials investigating psychosocial interventions for cognitive symptoms in patients with Alzheimer disease, published up to August 31, 2017. We included trials of home-based exercise (HE), group exercise (GE), walking program (WP), reminiscence therapy (RT), art therapy (AT), or the combination of psychosocial interventions and acetylcholinesterase inhibitors (ChEIs). We extracted the relevant information from these trials with a predefined data extraction sheet and assessed the risk of bias with the Cochrane risk of bias tool. The outcomes investigated were Mini-Mental State Examination (MMSE) score and compliance. We did a pair-wise meta-analysis using the fixed-effects model and then did a random-effects network meta-analysis within a Bayesian framework. RESULTS: We deemed 10 trials eligible, including 682 patients and 11 treatments. The quality of the included studies was rated as low in most comparisons with the Cochrane tools. Treatment effects from the network meta-analysis, assessed by MMSE, showed that WP was better than control (SMD 4.89, 95% CI -0.07 to 10.00), while cognitive training combined with acetylcholinesterase inhibitors (CT + ChEIs) was significantly better than the other treatments when compared with ChEIs treatment alone. In terms of compliance, the pair-wise meta-analysis indicated that WP and HE are better than GE and AT, while CT + ChEIs and CST + ChEIs are better than other combined interventions. CONCLUSION: Our study confirmed the effectiveness of psychosocial interventions for improving cognition or slowing down the progression of cognitive impairment in AD patients and recommended several interventions for clinical practice.
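The fixed-effects pair-wise step mentioned in the methods can be illustrated with a minimal sketch of inverse-variance pooling. The function name, the example inputs, and the use of a normal-approximation 95% interval are assumptions made for illustration; the abstract does not state which software or estimator the authors used.

```python
import math


def fixed_effect_pool(effects, std_errors, z=1.96):
    """Inverse-variance fixed-effect pooling of per-trial effect sizes.

    effects     -- effect estimate from each trial (e.g. a standardized mean difference)
    std_errors  -- standard error of each estimate
    z           -- normal quantile for the interval (1.96 gives roughly 95%)
    Returns (pooled effect, pooled standard error, (lower, upper)).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need at least one trial and one standard error per effect")
    # Each trial is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical numbers, not taken from the study.
pooled, pooled_se, interval = fixed_effect_pool([1.0, 3.0], [1.0, 1.0])
print(pooled, pooled_se, interval)
```

A confidence interval that contains zero, as in the WP versus control comparison reported above, means the pooled effect is not statistically distinguishable from no effect at that level.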
Subjects
Alzheimer Disease/psychology, Alzheimer Disease/therapy, Network Meta-Analysis, Randomized Controlled Trials as Topic/methods, Social Behavior, Aged, Alzheimer Disease/diagnosis, Bayes Theorem, Cholinesterase Inhibitors/pharmacology, Cholinesterase Inhibitors/therapeutic use, Cognition/drug effects, Cognition/physiology, Cognition Disorders/diagnosis, Cognition Disorders/psychology, Cognition Disorders/therapy, Cognitive Dysfunction/diagnosis, Cognitive Dysfunction/psychology, Cognitive Dysfunction/therapy, Disease Progression, Humans
ABSTRACT
Previous deep learning-based event denoising methods mostly suffer from poor interpretability and difficulty in real-time processing due to their complex architecture designs. In this paper, we propose window-based event denoising, which deals with a stack of events simultaneously, whereas existing element-based denoising focuses on one event at a time. In addition, we give a theoretical analysis based on probability distributions in both the temporal and spatial domains to improve interpretability. In the temporal domain, we use timestamp deviations between the events being processed and the central event to judge temporal correlation and filter out temporally irrelevant events. In the spatial domain, we use maximum a posteriori (MAP) estimation to discriminate real-world events from noise and use learned convolutional sparse coding to optimize the objective function. Based on the theoretical analysis, we build a Temporal Window (TW) module and a Soft Spatial Feature Embedding (SSFE) module to process temporal and spatial information separately, and construct a novel multi-scale window-based event denoising network, named WedNet. The high denoising accuracy and fast running speed of WedNet enable real-time denoising in complex scenes. Extensive experimental results verify the effectiveness and robustness of WedNet. Our algorithm removes event noise effectively and efficiently and improves the performance of downstream tasks.
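The temporal-domain idea above, judging each event by its timestamp deviation from the central event of a window, can be sketched as a simple hard-threshold filter. This is only an illustration under stated assumptions: the event tuple layout, the function name, and the fixed `max_deviation` threshold are hypothetical, and the TW module in WedNet is a learned component rather than this rule.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp, polarity).
Event = Tuple[int, int, float, int]


def temporal_window_filter(window: List[Event], max_deviation: float) -> List[Event]:
    """Keep events whose timestamp is close to that of the window's central event."""
    if not window:
        return []
    # The central event is taken to be the middle element of the stacked window.
    central_timestamp = window[len(window) // 2][2]
    return [
        event
        for event in window
        if abs(event[2] - central_timestamp) <= max_deviation
    ]


# Hypothetical window: the last event is far from the central timestamp.
window = [(0, 0, 0.0, 1), (1, 1, 9.0, 1), (2, 2, 10.0, -1), (3, 3, 11.0, 1), (4, 4, 50.0, 1)]
kept = temporal_window_filter(window, max_deviation=2.0)
print(kept)
```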
ABSTRACT
The development of neuromorphic optoelectronic systems opens up the possibility of the next generation of artificial vision. In this work, novel broadband (365 to 940 nm), multilevel-storage optoelectronic synaptic thin-film transistor (TFT) arrays are reported, using semiconducting single-walled carbon nanotubes (sc-SWCNTs) sorted by the photosensitive conjugated polymer poly[(9,9-dioctylfluorenyl-2,7-diyl)-co-(bithiophene)] (F8T2) as channel materials. The broadband synaptic responses arise from absorption by both the photosensitive F8T2 and the sorted sc-SWCNTs, and the excellent optoelectronic synaptic behaviors, with 200 linearly increasing conductance states and a long retention time (> 10^3 s), are attributed to superior charge trapping at the AlOx dielectric layer grown by atomic layer deposition. Furthermore, the synaptic TFTs achieve Ion/Ioff ratios up to 10^6 and optoelectronic synaptic plasticity with low power consumption (59 aJ per single pulse), and can simulate not only basic biological synaptic functions but also optical write and electrical erase, multilevel storage, and image recognition. Further, a novel spiking neural network algorithm based on the hardware characteristics is designed for recognition on the Caltech 101 dataset, and multiple image features are successfully extracted from the multi-frequency curves of the optoelectronic synaptic devices, with a recognition accuracy of 97.92%.
ABSTRACT
Type 2 diabetes mellitus (T2DM) patients often suffer from depressive symptoms, which seriously affect cooperation in treatment and nursing. The amygdala plays a significant role in depression. This study aims to explore the microstructural alterations of the amygdala in T2DM and to investigate the relationship between these alterations and depressive symptoms. Fifty T2DM patients and 50 healthy controls were included. First, the volumes of subcortical regions and amygdala subregions were calculated with FreeSurfer. Analysis of covariance (ANCOVA) was conducted between the two groups, with age, sex, and estimated total intracranial volume as covariates, to explore differences in the volume of subcortical regions and amygdala subregions. Furthermore, structural covariance analysis was performed within the amygdala subregions. Moreover, we investigated the correlation between depressive symptoms and the volume of subcortical regions and amygdala subregions in T2DM. We observed a reduction in the volume of the bilateral cortico-amygdaloid transition area, left basal nucleus, bilateral accessory basal nucleus, and left anterior amygdaloid area of the amygdala, as well as the left thalamus and left hippocampus, in T2DM. T2DM patients showed decreased structural covariance connectivity between the left paralaminar nucleus and the right central nucleus. Moreover, there was a negative correlation between self-rating depression scale scores and the volume of the bilateral cortico-amygdaloid transition area in T2DM. This study reveals extensive structural alterations in the amygdala subregions of T2DM patients. The reduction in the volume of the bilateral cortico-amygdaloid transition area may be a promising imaging marker for early recognition of depressive symptoms in T2DM.
Subjects
Amygdala, Depression, Type 2 Diabetes Mellitus, Magnetic Resonance Imaging, Humans, Type 2 Diabetes Mellitus/pathology, Amygdala/pathology, Amygdala/diagnostic imaging, Male, Female, Middle Aged, Depression/diagnostic imaging, Depression/pathology, Adult, Aged, Hippocampus/pathology, Hippocampus/diagnostic imaging, Thalamus/diagnostic imaging, Thalamus/pathology
ABSTRACT
Both network pruning and neural architecture search (NAS) can be interpreted as techniques to automate the design and optimization of artificial neural networks. In this paper, we challenge the conventional wisdom of training before pruning by proposing a joint search-and-training approach to learn a compact network directly from scratch. Using pruning as a search strategy, we advocate three new insights for network engineering: 1) to formulate adaptive search as a cold start strategy to find a compact subnetwork on the coarse scale; 2) to automatically learn the threshold for network pruning; and 3) to offer flexibility to choose between efficiency and robustness. More specifically, we propose an adaptive search algorithm in the cold start by exploiting the randomness and flexibility of filter pruning. The weights associated with the network filters are updated by ThreshNet, a flexible coarse-to-fine pruning method inspired by reinforcement learning. In addition, we introduce a robust pruning strategy that leverages knowledge distillation through a teacher-student network. Extensive experiments on ResNet and VGGNet show that our proposed method achieves a better balance between efficiency and accuracy and has notable advantages over current state-of-the-art pruning methods on several popular datasets, including CIFAR10, CIFAR100, and ImageNet. The code associated with this paper is available at: https://see.xidian.edu.cn/faculty/wsdong/Projects/AST-NP.htm.
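Threshold-based filter pruning, the operation that the learned threshold above controls, can be illustrated with a minimal sketch. Several things here are assumptions for illustration only: filters are scored by the L1 norm of their weights, the threshold is passed in by hand rather than learned, and the function names are hypothetical. None of this is the ThreshNet procedure itself.

```python
from typing import List, Sequence


def l1_norm(filter_weights: Sequence[float]) -> float:
    """Importance score of one filter: sum of absolute weight values (assumed criterion)."""
    return sum(abs(w) for w in filter_weights)


def prune_filters(filters: List[Sequence[float]], threshold: float) -> List[int]:
    """Return the indices of filters whose score reaches the pruning threshold."""
    return [
        index
        for index, weights in enumerate(filters)
        if l1_norm(weights) >= threshold
    ]


# Hypothetical layer with three flattened filters; the second is near zero.
layer_filters = [[0.5, -0.5], [0.01, 0.02], [1.0, 1.0]]
kept_indices = prune_filters(layer_filters, threshold=0.1)
print(kept_indices)
```

In a learned-threshold setting, `threshold` would be a trainable quantity updated alongside the network weights instead of a constant.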
Subjects
Algorithms, Learning, Humans, Computer Neural Networks
ABSTRACT
Image reconstruction from partial observations has attracted increasing attention. Conventional image reconstruction methods with hand-crafted priors often fail to recover fine image details due to the poor representation capability of the hand-crafted priors. Deep learning methods, which attack this problem by directly learning mapping functions between the observations and the target images, can achieve much better results. However, most powerful deep networks lack transparency and are nontrivial to design heuristically. This paper proposes a novel image reconstruction method based on the maximum a posteriori (MAP) estimation framework using a learned Gaussian scale mixture (GSM) prior. Unlike existing unfolding methods that only estimate the image means (i.e., the denoising prior) but neglect the variances, we propose characterizing images by GSM models with means and variances learned through a deep network. Furthermore, to learn the long-range dependencies of images, we develop an enhanced variant based on the Swin Transformer for learning GSM models. All parameters of the MAP estimator and the deep network are jointly optimized through end-to-end training. Extensive simulation and real-data experimental results on spectral compressive imaging and image super-resolution demonstrate that the proposed method outperforms existing state-of-the-art methods.
ABSTRACT
This paper reports the background and results of the Surface Defect Detection Competition with Bio-inspired Vision Sensor, and summarizes the champion solutions, current challenges, and future directions.
ABSTRACT
Magnetic resonance imaging (MRI) and positron emission tomography (PET) are increasingly used to forecast progression trajectories of cognitive decline caused by preclinical and prodromal Alzheimer's disease (AD). Many existing studies have explored the potential of these two distinct modalities with diverse machine and deep learning approaches. But successfully fusing MRI and PET can be complex due to their unique characteristics and missing modalities. To this end, we develop a hybrid multimodality fusion (HMF) framework with cross-domain knowledge transfer for joint MRI and PET representation learning, feature fusion, and cognitive decline progression forecasting. Our HMF consists of three modules: 1) a module to impute missing PET images, 2) a module to extract multimodality features from MRI and PET images, and 3) a module to fuse the extracted multimodality features. To address the issue of small sample sizes, we employ a cross-domain knowledge transfer strategy from the ADNI dataset, which includes 795 subjects, to independent small-scale AD-related cohorts, in order to leverage the rich knowledge present within the ADNI. The proposed HMF is extensively evaluated in three AD-related studies with 272 subjects across multiple disease stages, such as subjective cognitive decline and mild cognitive impairment. Experimental results demonstrate the superiority of our method over several state-of-the-art approaches in forecasting progression trajectories of AD-related cognitive decline.
ABSTRACT
Type 2 diabetes mellitus (T2DM) is closely linked to cognitive decline and alterations in brain structure and function. Resting-state functional magnetic resonance imaging (rs-fMRI) is used to diagnose neurodegenerative diseases, such as cognitive impairment (CI), Alzheimer's disease (AD), and vascular dementia (VaD). However, whether the functional connectivity (FC) of patients with T2DM and mild cognitive impairment (T2DM-MCI) is conducive to early diagnosis remains unclear. To answer this question, we analyzed the rs-fMRI data of 37 patients with T2DM-MCI, 93 patients with T2DM but no cognitive impairment (T2DM-NCI), and 69 normal controls (NC). Using the XGBoost model, we achieved an accuracy of 87.91% in T2DM-MCI versus T2DM-NCI classification and 80% in T2DM-NCI versus NC classification. The thalamus, angular gyrus, caudate nucleus, and paracentral lobule contributed most to the classification outcome. Our findings provide valuable knowledge for classifying and predicting T2DM-related CI, can help with early clinical diagnosis of T2DM-MCI, and provide a basis for future studies.
Subjects
Cognitive Dysfunction, Type 2 Diabetes Mellitus, Humans, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/diagnostic imaging, Type 2 Diabetes Mellitus/pathology, Magnetic Resonance Imaging/methods, Brain/pathology, Brain Mapping, Cognitive Dysfunction/diagnostic imaging, Cognitive Dysfunction/pathology
ABSTRACT
Brain structural MRI has been widely used with learning-based methods to assess the future progression of cognitive impairment (CI). Previous studies generally suffer from a limited amount of labeled training data, even though a huge number of MRIs exist in large-scale public databases. Even without task-specific label information, the brain anatomical structures provided by these MRIs can intuitively be used to boost learning performance. Unfortunately, existing research seldom takes advantage of such brain anatomy priors. To this end, this paper proposes a brain anatomy-guided representation (BAR) learning framework for assessing the clinical progression of cognitive impairment with T1-weighted MRIs. The BAR consists of a pretext model and a downstream model, with a shared brain anatomy-guided encoder for MRI feature extraction. The pretext model also contains a decoder for brain tissue segmentation, while the downstream model relies on a predictor for classification. We first train the pretext model through a brain tissue segmentation task on 9,544 auxiliary T1-weighted MRIs, yielding a generalizable encoder. The downstream model with the learned encoder is further fine-tuned on target MRIs for prediction tasks. We validate the proposed BAR on two CI-related studies with a total of 391 subjects with T1-weighted MRIs. Experimental results suggest that the BAR outperforms several state-of-the-art (SOTA) methods. The source code and pre-trained models are available at https://github.com/goodaycoder/BAR.
ABSTRACT
The yeast Saccharomyces cerevisiae is able to tolerate lignocellulose-derived inhibitors such as furfural. The tolerance of a yeast strain has been measured by the length of the lag phase for cell growth in response to the furfural inhibitor challenge. The aim of this work was to obtain a furfural-tolerant yeast strain through overexpression of RDS1 using in vivo homologous recombination. Here, we report that the RDS1-overexpressing strain recovered more rapidly than its parental strain and displayed a lag phase of about 12 h. RDS1 encodes a novel aldehyde reductase that catalyzes the reduction of furfural with NAD(P)H as the cofactor. It displayed the highest specific activity (24.8 U/mg) for furfural reduction when NADH was used as the cofactor. Fluorescence microscopy revealed improved resistance to the accumulation of reactive oxygen species and to the damaging effects of the inhibitor, in contrast to the parental strain. Comparative transcriptomics revealed key genes potentially associated with stress responses to the furfural inhibitor, including genes with specific and multiple functions involved in defensive reduction-oxidation reactions and the cell wall response. A significant change in expression level (log2 fold change > 1) was observed for the RDS1 gene in the recombinant strain, which demonstrated that the introduced RDS1 overexpression raised its expression level. Such signature expression patterns differentiated the tolerance phenotype of the RDS1 strain from the innate stress response of its parental strain. Overexpression of the RDS1 gene, which involves diversified functional categories, accounts for the stress tolerance that allows the yeast S. cerevisiae to survive and adapt to furfural during the lag phase.
Subjects
Furaldehyde, Saccharomyces cerevisiae Proteins, Saccharomyces cerevisiae, Furaldehyde/pharmacology, NAD/metabolism, Phenotype, Saccharomyces cerevisiae/drug effects, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism, Transcriptome
ABSTRACT
An increasing number of recent brain imaging studies are dedicated to understanding the neural mechanisms of cognitive impairment in individuals with type 2 diabetes mellitus (T2DM). In contrast to efforts to date, which are limited to static functional connectivity, here we investigate abnormal connectivity in T2DM individuals by characterizing the time-varying properties of brain functional networks. Using group independent component analysis (GICA), sliding-window analysis, and k-means clustering, we extracted thirty-one intrinsic connectivity networks (ICNs) and estimated four recurring brain states. We observed significant group differences in fraction time (FT) and mean dwell time (MDT), and a significant negative correlation between Montreal Cognitive Assessment (MoCA) scores and FT/MDT. We found that in the T2DM group the inter- and intra-network connectivity decreases and increases, respectively, for the default mode network (DMN) and task-positive network (TPN). We also found alterations in the precuneus network (PCUN) and enhanced connectivity between the salience network (SN) and the TPN. Our study provides evidence of alterations of large-scale resting networks in T2DM individuals and sheds light on the fundamental mechanisms of neurocognitive deficits in T2DM.
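The two state metrics named above can be computed directly from the sequence of state labels that clustering assigns to the sliding windows. The sketch below assumes the usual definitions (fraction time as the proportion of windows spent in a state, mean dwell time as the average length of consecutive runs in a state, both in units of windows); the function names and the example sequence are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence


def fraction_time(states: Sequence[int], n_states: int) -> List[float]:
    """Proportion of windows assigned to each state."""
    if not states:
        return [0.0] * n_states
    total = len(states)
    return [sum(1 for s in states if s == k) / total for k in range(n_states)]


def mean_dwell_time(states: Sequence[int], n_states: int) -> List[float]:
    """Average length, in windows, of consecutive runs spent in each state."""
    run_lengths = {k: [] for k in range(n_states)}
    # groupby yields one group per run of identical consecutive labels.
    for state, run in groupby(states):
        run_lengths[state].append(sum(1 for _ in run))
    return [
        sum(run_lengths[k]) / len(run_lengths[k]) if run_lengths[k] else 0.0
        for k in range(n_states)
    ]


# Hypothetical label sequence for one subject with four possible states.
labels = [0, 0, 1, 1, 1, 0, 2, 2]
print(fraction_time(labels, 4))
print(mean_dwell_time(labels, 4))
```

States that a subject never visits receive a value of 0.0 here, which is one of several conventions in use; some analyses exclude such subjects from the dwell-time comparison instead.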
ABSTRACT
Lignocellulosic biomass is still considered a feasible source for bioethanol production. Saccharomyces cerevisiae can adapt to detoxify lignocellulose-derived inhibitors, including furfural. Strain tolerance has been measured by the length of the lag phase for cell proliferation following the furfural inhibitor challenge. The purpose of this work was to obtain a furfural-tolerant yeast strain through overexpression of YPR015C using the in vivo homologous recombination method. Physiological observation of the overexpressing yeast strain showed that it was more resistant to furfural than its parental strain. Fluorescence microscopy revealed improved reductase activity and accumulation of reactive oxygen species caused by the harmful effects of the furfural inhibitor, in contrast to the parental strain. Comparative transcriptomic analysis revealed 79 genes potentially involved in amino acid biosynthesis, oxidative stress, cell wall response, heat shock proteins, and mitochondria-associated proteins in the YPR015C-overexpressing strain that were associated with stress responses to furfural at the late stage of lag phase growth. Both up- and down-regulated genes involved in diversified functional categories accounted for the tolerance that allows the yeast to survive and adapt to furfural stress in a time-course study during lag phase growth. This study broadens our understanding of the physiological and molecular mechanisms implicated in the tolerance of the YPR015C-overexpressing strain under furfural stress. Construction illustration of the recombinant plasmid. a) pUG6-TEF1p-YPR015C, b) integration diagram of the recombinant plasmid pUG6-TEF1p-YPR into the chromosomal DNA of Saccharomyces cerevisiae.
Subjects
Furaldehyde, Saccharomyces cerevisiae, Saccharomyces cerevisiae/genetics, Furaldehyde/pharmacology, Biomass, Cell Wall, Gene Expression Profiling
ABSTRACT
PURPOSE: The white matter (WM) of the brain of type 2 diabetes mellitus (T2DM) patients is susceptible to neurodegenerative processes, but the specific types and positions of microstructural lesions along the fiber tracts remain unclear. METHODS: In this study 61 T2DM patients and 61 healthy controls were recruited and underwent diffusion spectrum imaging (DSI). The results were reconstructed with diffusion tensor imaging (DTI) and neurite orientation dispersion and density imaging (NODDI). WM microstructural abnormalities were identified using tract-based spatial statistics (TBSS). Pointwise WM tract differences were detected through automatic fiber quantification (AFQ). The relationships between WM tract abnormalities and clinical characteristics were explored with partial correlation analysis. RESULTS: TBSS revealed widespread WM lesions in T2DM patients with decreased fractional anisotropy and axial diffusivity and an increased orientation dispersion index (ODI). The AFQ results showed microstructural abnormalities in T2DM patients in specific portions of the right superior longitudinal fasciculus (SLF), right arcuate fasciculus (ARC), left anterior thalamic radiation (ATR), and forceps major (FMA). In the right ARC of T2DM patients, an aberrant ODI was positively correlated with fasting insulin and insulin resistance, and an abnormal intracellular volume fraction was negatively correlated with fasting blood glucose. Additionally, negative associations were found between blood pressure and microstructural abnormalities in the right ARC, left ATR, and FMA in T2DM patients. CONCLUSION: Using AFQ, together with DTI and NODDI, various kinds of microstructural alterations in the right SLF, right ARC, left ATR, and FMA can be accurately identified and may be associated with insulin and glucose status and blood pressure in T2DM patients.
Subjects
Type 2 Diabetes Mellitus, Insulins, White Matter, Humans, White Matter/diagnostic imaging, White Matter/pathology, Diffusion Tensor Imaging/methods, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/diagnostic imaging, Type 2 Diabetes Mellitus/pathology, Brain/diagnostic imaging, Brain/pathology, Anisotropy
ABSTRACT
In contrast to the success of neural architecture search (NAS) in high-level vision tasks, it remains challenging to find computationally efficient and memory-efficient solutions to low-level vision problems such as image restoration through NAS. One of the fundamental barriers to differentiable NAS-based image restoration is the optimization gap between the super-network and the sub-architectures, which causes instability during the search process. In this paper, we present a novel approach to fill this gap in image denoising by connecting model-guided design (MoD) with NAS (MoD-NAS). Specifically, we propose to construct a new search space under a model-guided framework and develop more stable and efficient differentiable search strategies. MoD-NAS employs a highly reusable width search strategy and a densely connected search block to automatically select the operations of each layer, as well as the network width and depth, via gradient descent. During the search process, the proposed MoD-NAS remains stable because of the smoother search space designed under the model-guided framework. Experimental results on several popular datasets show that our MoD-NAS method achieves comparable or even better PSNR performance than current state-of-the-art methods, with fewer parameters, fewer FLOPs, and less testing time.
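PSNR, the metric used for the comparison above, is defined as 10 * log10(MAX^2 / MSE), where MAX is the peak pixel value and MSE is the mean squared error between the reference and restored images. The sketch below is a minimal illustration on flattened pixel lists; the function name and the default peak of 255 (8-bit images) are assumptions, and it is not the evaluation code used in the paper.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized, flattened images."""
    if not reference or len(reference) != len(restored):
        raise ValueError("images must be non-empty and have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel images that differ by one gray level everywhere (MSE = 1).
clean = [10.0, 20.0, 30.0, 40.0]
denoised = [11.0, 21.0, 31.0, 41.0]
print(psnr(clean, denoised))
```

Higher values indicate a restored image closer to the reference, so a method that matches the PSNR of a baseline with fewer parameters is the more efficient of the two.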
ABSTRACT
The video quality assessment (VQA) task is an ongoing small-sample learning problem due to the costly effort required for manual annotation. Since existing VQA datasets are of limited scale, prior research has tried to leverage models pre-trained on ImageNet to mitigate this shortage. Nonetheless, these well-trained models, which target the image classification task, can be sub-optimal when applied to VQA data from a significantly different domain. In this paper, we make the first attempt to perform self-supervised pre-training for the VQA task based on contrastive learning, aiming to exploit plentiful unlabeled video data to learn feature representations in a simple yet effective way. Specifically, we implement this idea by first generating distorted video samples with diverse distortion characteristics and visual contents based on the proposed distortion augmentation strategy. Afterwards, we conduct contrastive learning to capture quality-aware information by maximizing the agreement between feature representations of future frames and their corresponding predictions in the embedding space. In addition, we introduce a distortion prediction task as an additional learning objective to push the model towards discriminating the different distortion categories of the input video. Solving these prediction tasks jointly with contrastive learning not only provides stronger surrogate supervision signals, but also learns the knowledge shared among the prediction tasks. Extensive experiments demonstrate that our approach sets a new state of the art in self-supervised learning for the VQA task. Our results also underscore that the learned pre-trained model can significantly benefit existing learning-based VQA models. Source code is available at https://github.com/cpf0079/CSPT.
Subjects
Algorithms, Software
ABSTRACT
Typical image aesthetics assessment (IAA) is modeled for the generic aesthetics perceived by an "average" user. However, such generic aesthetics models neglect the fact that aesthetic judgments vary significantly from user to user depending on individual preferences. Therefore, it is essential to tackle the issue of personalized IAA (PIAA). Since PIAA is a typical small sample learning (SSL) problem, existing PIAA models are usually built by fine-tuning well-established generic IAA (GIAA) models, which are regarded as prior knowledge. Nevertheless, this kind of prior knowledge based on "average aesthetics" fails to capture the aesthetic diversity of different people. In order to learn the prior knowledge shared when different people judge aesthetics, that is, to learn how people judge image aesthetics, we propose a PIAA method based on meta-learning with bilevel gradient optimization (BLG-PIAA), which is trained directly on individual aesthetic data and generalizes quickly to unknown users. The proposed approach consists of two phases: 1) meta-training and 2) meta-testing. In meta-training, the aesthetics assessment of each user is regarded as a task, and the training set of each task is divided into two sets: 1) a support set and 2) a query set. Unlike traditional methods that train a GIAA model based on average aesthetics, we train an aesthetic meta-learner model by bilevel gradient updating from the support set to the query set using many users' PIAA tasks. In meta-testing, the aesthetic meta-learner model is fine-tuned using a small amount of aesthetic data from a target user to obtain the PIAA model. The experimental results show that the proposed method outperforms state-of-the-art PIAA metrics, and that the learned prior model of BLG-PIAA can be quickly adapted to unseen PIAA tasks.
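The support-set/query-set update described above follows the general pattern of gradient-based meta-learning, which can be sketched on a toy problem. The example below is a MAML-style illustration under strong assumptions: each "user" is a one-parameter regression task y = slope * x, the model is a single scalar weight, and all names, learning rates, and data are hypothetical. It is not the BLG-PIAA model.

```python
from typing import List, Sequence, Tuple

Task = Tuple[List[float], List[float], List[float], List[float]]  # support x, y; query x, y


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def loss(w: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean squared error of the scalar model y_hat = w * x."""
    return mean([(w * x - y) ** 2 for x, y in zip(xs, ys)])


def grad(w: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Derivative of the mean squared error with respect to w."""
    return 2.0 * mean([x * (w * x - y) for x, y in zip(xs, ys)])


def adapt(w: float, xs: Sequence[float], ys: Sequence[float], inner_lr: float) -> float:
    """Inner level: one gradient step on a task's support set."""
    return w - inner_lr * grad(w, xs, ys)


def meta_train(tasks: List[Task], inner_lr: float = 0.1, outer_lr: float = 0.5,
               steps: int = 100, w: float = 0.0) -> float:
    """Outer level: update the shared weight using query-set loss after adaptation."""
    for _ in range(steps):
        meta_grad = 0.0
        for support_x, support_y, query_x, query_y in tasks:
            w_task = adapt(w, support_x, support_y, inner_lr)
            # Chain rule through the inner step: d(w_task)/dw = 1 - inner_lr * Hessian.
            hessian = 2.0 * mean([x * x for x in support_x])
            meta_grad += grad(w_task, query_x, query_y) * (1.0 - inner_lr * hessian)
        w -= outer_lr * meta_grad / len(tasks)
    return w


def make_task(slope: float) -> Task:
    """Hypothetical user task with fixed support and query inputs."""
    support_x, query_x = [1.0, 2.0], [0.5, 1.5]
    return (support_x, [slope * x for x in support_x],
            query_x, [slope * x for x in query_x])


# Meta-training on three users, then meta-testing on an unseen user.
w_meta = meta_train([make_task(s) for s in (1.0, 2.0, 3.0)])
sx, sy, qx, qy = make_task(2.5)
w_user = adapt(w_meta, sx, sy, inner_lr=0.1)
print(w_meta, loss(w_meta, qx, qy), loss(w_user, qx, qy))
```

The meta-learned weight serves as an initialization from which a single support-set step already lowers the query loss for the unseen user, which mirrors the role of fine-tuning on a small amount of target-user data in meta-testing.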