1.
Brief Bioinform ; 25(5), 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39129362

ABSTRACT

Influenza viruses rapidly evolve to evade previously acquired human immunity. Maintaining vaccine efficacy necessitates continuous monitoring of antigenic differences among strains. Traditional serological methods for assessing these differences are labor-intensive and time-consuming, highlighting the need for efficient computational approaches. This paper proposes MetaFluAD, a meta-learning-based method designed to predict quantitative antigenic distances among strains. This method models antigenic relationships between strains, represented by their hemagglutinin (HA) sequences, as a weighted attributed network. Employing a graph neural network (GNN)-based encoder combined with a robust meta-learning framework, MetaFluAD learns comprehensive strain representations within a unified space encompassing both antigenic and genetic features. Furthermore, the meta-learning framework enables knowledge transfer across different influenza subtypes, allowing MetaFluAD to achieve remarkable performance with limited data. MetaFluAD demonstrates excellent performance and overall robustness across various influenza subtypes, including A/H3N2, A/H1N1, A/H5N1, B/Victoria, and B/Yamagata. MetaFluAD synthesizes the strengths of GNN-based encoding and meta-learning to offer a promising approach for accurate antigenic distance prediction. Additionally, MetaFluAD can effectively identify dominant antigenic clusters within seasonal influenza viruses, aiding in the development of effective vaccines and efficient monitoring of viral evolution.


Subjects
Antigens, Viral; Humans; Antigens, Viral/genetics; Antigens, Viral/immunology; Neural Networks, Computer; Influenza, Human/immunology; Influenza, Human/virology; Influenza, Human/prevention & control; Computational Biology/methods; Orthomyxoviridae/immunology; Orthomyxoviridae/genetics; Hemagglutinin Glycoproteins, Influenza Virus/genetics; Hemagglutinin Glycoproteins, Influenza Virus/immunology; Machine Learning
2.
Brief Bioinform ; 25(5), 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39133096

ABSTRACT

Molecular property prediction (MPP) plays a crucial role in the drug discovery process, providing valuable insights for molecule evaluation and screening. Although deep learning has achieved numerous advances in this area, its success often depends on the availability of substantial labeled data. Few-shot MPP is a more challenging scenario, which aims to identify unseen properties with only a few available molecules. In this paper, we propose an attribute-guided prototype network (APN) to address the challenge. APN first introduces a molecular attribute extractor, which can not only extract three different types of fingerprint attributes (single fingerprint attributes, dual fingerprint attributes, triplet fingerprint attributes) by considering seven circular-based, five path-based, and two substructure-based fingerprints, but also automatically extract deep attributes from self-supervised learning methods. Furthermore, APN designs the Attribute-Guided Dual-channel Attention module to learn the relationship between the molecular graphs and attributes and refine the local and global representation of the molecules. Compared with existing works, APN leverages high-level human-defined attributes and helps the model to explicitly generalize knowledge in molecular graphs. Experiments on benchmark datasets show that APN can achieve state-of-the-art performance in most cases and demonstrate that the attributes are effective for improving few-shot MPP performance. In addition, the strong generalization ability of APN is verified by conducting experiments on data from different domains.


Subjects
Deep Learning; Drug Discovery; Drug Discovery/methods; Humans; Algorithms; Neural Networks, Computer
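
The prototype idea that few-shot methods such as APN build on can be illustrated with a minimal sketch. This is not the APN architecture: the attribute extractor and the attention module are omitted, embeddings are assumed to be plain lists of floats, and the function names and the toy labels are hypothetical.

```python
import math

def class_prototypes(support):
    """Average the support embeddings of each class into one prototype vector."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos

def classify(query, protos):
    """Assign the query embedding to the class with the nearest prototype (Euclidean)."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))

# Toy 2-shot support set with two hypothetical property labels.
support = {
    "active": [[1.0, 1.0], [3.0, 1.0]],
    "inactive": [[-1.0, -2.0], [-3.0, -2.0]],
}
protos = class_prototypes(support)
prediction = classify([1.5, 0.5], protos)
```

In a real few-shot pipeline the embeddings would come from a trained encoder; only the averaging and nearest-prototype steps are shown here.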
3.
Neural Netw ; 179: 106574, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39096754

ABSTRACT

Graph neural networks (GNN) are widely used in recommendation systems, but traditional centralized methods raise privacy concerns. To address this, we introduce a federated framework for privacy-preserving GNN-based recommendations. This framework allows distributed training of GNN models using local user data. Each client trains a GNN using its own user-item graph and uploads gradients to a central server for aggregation. To overcome limited data, we propose expanding local graphs using Software Guard Extension (SGX) and Local Differential Privacy (LDP). SGX computes node intersections for subgraph exchange and expansion, while local differential privacy ensures privacy. Additionally, we introduce a personalized approach with Prototype Networks (PN) and Model-Agnostic Meta-Learning (MAML) to handle data heterogeneity. This enhances the encoding abilities of the federated meta-learner, enabling precise fine-tuning and quick adaptation to diverse client graph data. We leverage SGX and local differential privacy for secure parameter sharing and defense against malicious servers. Comprehensive experiments across six datasets demonstrate our method's superiority over centralized GNN-based recommendations, while preserving user privacy.

4.
Network ; : 1-24, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38994690

ABSTRACT

Plant diseases pose a significant threat to agricultural productivity worldwide. Convolutional neural networks (CNNs) have achieved state-of-the-art performances on several plant disease detection tasks. However, the manual development of CNN models using an exhaustive approach is a resource-intensive task. Neural Architecture Search (NAS) has emerged as an innovative paradigm that seeks to automate model generation procedures without human intervention. However, the application of NAS in plant disease detection has received limited attention. In this work, we propose a two-stage meta-learning-based neural architecture search system (MLNAS) to automate the generation of CNN models for unseen plant disease detection tasks. The first stage recommends the most suitable benchmark models for unseen plant disease detection tasks based on the prior evaluations of benchmark models on existing plant disease datasets. In the second stage, the proposed NAS operators are employed to optimize the recommended model for the target task. The experimental results showed that the MLNAS system's model outperformed state-of-the-art models on the fruit disease dataset, achieving an accuracy of 99.61%. Furthermore, the MLNAS-generated model outperformed the Progressive NAS model on the 8-class plant disease dataset, achieving an accuracy of 99.8%. Hence, the proposed MLNAS system facilitates faster model development with reduced computational costs.

5.
Biotechnol J ; 19(7): e2400080, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38997212

ABSTRACT

Modern machine learning has the potential to fundamentally change the way bioprocesses are developed. In particular, horizontal knowledge transfer methods, which seek to exploit data from historical processes to facilitate process development for a new product, provide an opportunity to rethink current workflows. In this work, we first assess the potential of two knowledge transfer approaches, meta learning and one-hot encoding, in combination with Gaussian process (GP) models. We compare their performance with GPs trained only on data of the new process, that is, local models. Using simulated mammalian cell culture data, we observe that both knowledge transfer approaches exhibit test set errors that are approximately halved compared to those of the local models when two, four, or eight experiments of the new product are used for training. Subsequently, we address the question whether experiments for a new product could be designed more effectively by exploiting existing knowledge. In particular, we suggest to specifically design a few runs for the novel product to calibrate knowledge transfer models, a task that we coin calibration design. We propose a customized objective function to identify a set of calibration design runs, which exploits differences in the process evolution of historical products. In two simulated case studies, we observed that training with calibration designs yields similar test set errors compared to common design of experiments approaches. However, the former requires approximately four times fewer experiments. Overall, the results suggest that process development could be significantly streamlined when systematically carrying knowledge from one product to the next.


Subjects
Machine Learning; Calibration; Computer Simulation; Animals
6.
Sensors (Basel) ; 24(14), 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-39065871

ABSTRACT

Multivariate time series modeling has been essential in sensor-based data mining tasks. However, capturing complex dynamics caused by intra-variable (temporal) and inter-variable (spatial) relationships while simultaneously taking into account evolving data distributions is a non-trivial task, which faces accumulated computational overhead and multiple temporal patterns or distribution modes. Most existing methods focus on the former direction without adaptive task-specific learning ability. To this end, we developed a holistic spatial-temporal meta-learning probabilistic inference framework, entitled ST-MeLaPI, for the efficient and versatile learning of complex dynamics. Specifically, first, a multivariate relationship recognition module is utilized to learn task-specific inter-variable dependencies. Then, a multiview meta-learning and probabilistic inference strategy was designed to learn shared parameters while enabling the fast and flexible learning of task-specific parameters for different batches. At the core are spatial dependency-oriented and temporal pattern-oriented meta-learning approximate probabilistic inference modules, which can quickly adapt to changing environments via stochastic neurons at each timestamp. Finally, a gated aggregation scheme is leveraged to realize appropriate information selection for the generative style prediction. We benchmarked our approach against state-of-the-art methods with real-world data. The experimental results demonstrate the superiority of our approach over the baselines.

7.
J Am Stat Assoc ; 119(546): 1274-1285, 2024.
Article in English | MEDLINE | ID: mdl-38948492

ABSTRACT

Transfer learning provides a powerful tool for incorporating data from related studies into a target study of interest. In epidemiology and medical studies, the classification of a target disease could borrow information across other related diseases and populations. In this work, we consider transfer learning for high-dimensional generalized linear models (GLMs). A novel algorithm, TransHDGLM, that integrates data from the target study and the source studies is proposed. Minimax rate of convergence for estimation is established and the proposed estimator is shown to be rate-optimal. Statistical inference for the target regression coefficients is also studied. Asymptotic normality for a debiased estimator is established, which can be used for constructing coordinate-wise confidence intervals of the regression coefficients. Numerical studies show significant improvement in estimation and inference accuracy over GLMs that only use the target data. The proposed methods are applied to a real data study concerning the classification of colorectal cancer using gut microbiomes, and are shown to enhance the classification accuracy in comparison to methods that only use the target data.

8.
Interdiscip Sci ; 16(2): 469-488, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38951382

ABSTRACT

Image classification, a fundamental task in computer vision, faces challenges concerning limited data handling, interpretability, improved feature representation, efficiency across diverse image types, and processing noisy data. Conventional architectural approaches have made insufficient progress in addressing these challenges, necessitating architectures capable of fine-grained classification, enhanced accuracy, and superior generalization. Among these, the vision transformer emerges as a noteworthy computer vision architecture. However, its reliance on substantial data for training poses a drawback due to its complexity and high data requirements. To surmount these challenges, this paper proposes an innovative approach, MetaV, integrating meta-learning into a vision transformer for medical image classification. N-way K-shot learning is employed to train the model, drawing inspiration from human learning mechanisms utilizing past knowledge. Additionally, deformational convolution and patch merging techniques are incorporated into the vision transformer model to mitigate complexity and overfitting while enhancing feature representation. Augmentation methods such as perturbation and Grid Mask are introduced to address the scarcity and noise in medical images, particularly for rare diseases. The proposed model is evaluated using diverse datasets including BreakHis, ISIC 2019, SIPaKMeD, and STARE. The achieved performance accuracies of 89.89%, 87.33%, 94.55%, and 80.22% for BreakHis, ISIC 2019, SIPaKMeD, and STARE, respectively, present evidence validating the superior performance of the proposed model in comparison to conventional models, setting a new benchmark for meta-vision image classification models.


Subjects
Image Processing, Computer-Assisted; Humans; Image Processing, Computer-Assisted/methods; Algorithms; Machine Learning; Diagnostic Imaging; Deep Learning
9.
Neural Netw ; 179: 106561, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39084171

ABSTRACT

Person re-identification (ReID) has made good progress in stationary domains. The ReID model must be retrained to adapt to new scenarios (domains) as they emerge unexpectedly, which leads to catastrophic forgetting. Continual learning trains the model in the order of domain emergence to alleviate catastrophic forgetting. However, generalization ability of the model is still limited due to the distribution difference between training and testing domains. To address the above problem, we propose the generalized continual person re-Identification (GCReID) model to continuously train an anti-forgetting and generalizable model. We endeavor to increase the diversity of samples by prior to simulate unseen domains. Meta-train and meta-test are adopted to enhance generalization of the model. Universal knowledge extracted from all seen domains and the simulated domains is stored in a set of feature embeddings. The knowledge is continually updated and applied to guide meta-train and meta-test via a graph attention network. Extensive experiments on 12 benchmark datasets and comparisons with 6 representative models demonstrate the effectiveness of the proposed model GCReID in enhancing generalization performance on unseen domains and alleviating catastrophic forgetting of seen domains. The code will be available at https://github.com/DFLAG-NEU/GCReID if our work is accepted.

10.
Eur J Neurosci ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38923238

ABSTRACT

In uncertain environments in which resources fluctuate continuously, animals must permanently decide whether to stabilise learning and exploit what they currently believe to be their best option, or instead explore potential alternatives and learn fast from new observations. While such a trade-off has been extensively studied in pretrained animals facing non-stationary decision-making tasks, it is yet unknown how they progressively tune it while learning the task structure during pretraining. Here, we compared the ability of different computational models to account for long-term changes in the behaviour of 24 rats while they learned to choose a rewarded lever in a three-armed bandit task across 24 days of pretraining. We found that the day-by-day evolution of rat performance and win-shift tendency revealed a progressive stabilisation of the way they regulated reinforcement learning parameters. We successfully captured these behavioural adaptations using a meta-learning model in which either the learning rate or the inverse temperature was controlled by the average reward rate.
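
The kind of model described in this abstract can be sketched as a delta-rule learner whose softmax inverse temperature tracks a running average of reward. The abstract does not give the functional form, so the linear mapping, all parameter values (`alpha`, `tau`, `beta_min`, `gain`), and the function names below are assumptions for illustration only.

```python
import math

def softmax(q_values, beta):
    """Choice probabilities over levers for a given inverse temperature beta."""
    top = max(q_values)
    exps = [math.exp(beta * (q - top)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

def update(q_values, avg_reward, action, reward, alpha=0.2, tau=0.1):
    """One trial: delta-rule update of the chosen lever and of the reward-rate estimate."""
    q_values = list(q_values)
    q_values[action] += alpha * (reward - q_values[action])
    avg_reward += tau * (reward - avg_reward)
    return q_values, avg_reward

def inverse_temperature(avg_reward, beta_min=1.0, gain=8.0):
    """Assumed linear mapping: a higher reward rate gives more exploitation."""
    return beta_min + gain * avg_reward

# One rewarded choice of the middle lever in a three-armed bandit.
q, r_bar = update([0.0, 0.0, 0.0], 0.0, action=1, reward=1.0)
choice_probs = softmax(q, inverse_temperature(r_bar))
```

Under this assumed mapping, a sustained high reward rate sharpens the choice distribution around the best-valued lever, while a falling reward rate flattens it and promotes exploration.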

11.
Sensors (Basel) ; 24(12), 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38931667

ABSTRACT

Nowadays, the focus on few-shot object detection (FSOD) is fueled by limited remote sensing data availability. In view of various challenges posed by remote sensing images (RSIs) and FSOD, we propose a meta-learning-based Balanced Few-Shot Object Detector (B-FSDet), built upon YOLOv9 (GELAN-C version). Firstly, addressing the problem of incompletely annotated objects that potentially breaks the balance of the few-shot principle, we propose a straightforward yet efficient data clearing strategy, which ensures balanced input of each category. Additionally, considering the significant variance fluctuations in output feature vectors from the support set that lead to reduced effectiveness in accurately representing object information for each class, we propose a stationary feature extraction module and corresponding stationary and fast prediction method, forming a stationary meta-learning mode. In the end, in consideration of the issue of minimal inter-class differences in RSIs, we propose inter-class discrimination support loss based on the stationary meta-learning mode to ensure the information provided for each class from the support set is balanced and easier to distinguish. Our proposed detector's performance is evaluated on the DIOR and NWPU VHR-10.v2 datasets, and comparative analysis with state-of-the-art detectors reveals promising performance.

12.
Sensors (Basel) ; 24(11), 2024 May 27.
Article in English | MEDLINE | ID: mdl-38894247

ABSTRACT

Few-shot object detection is a challenging task aimed at recognizing novel classes and localizing with limited labeled data. Although substantial achievements have been obtained, existing methods mostly struggle with forgetting and lack stability across various few-shot training samples. In this paper, we reveal two gaps affecting meta-knowledge transfer, leading to unstable performance and forgetting in meta-learning-based frameworks. To this end, we propose sample normalization, a simple yet effective method that enhances performance stability and decreases forgetting. Additionally, we apply Z-score normalization to mitigate the hubness problem in high-dimensional feature space. Experimental results on the PASCAL VOC data set demonstrate that our approach outperforms existing methods in both accuracy and stability, achieving up to +4.4 mAP@0.5 and +5.3 mAR in a single run, with +4.8 mAP@0.5 and +5.1 mAR over 10 random experiments on average. Furthermore, our method alleviates the drop in performance of base classes. The code will be released to facilitate future research.
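
Z-score normalization itself is a standard operation and can be sketched as follows. The abstract does not state along which axis the features are standardized, so this sketch assumes each feature vector is standardized across its own dimensions; the function name and the `eps` guard are illustrative choices.

```python
import statistics

def z_score(vector, eps=1e-8):
    """Standardize a feature vector to zero mean and (approximately) unit variance."""
    mean = statistics.fmean(vector)
    std = statistics.pstdev(vector)
    # eps avoids division by zero for constant vectors.
    return [(x - mean) / (std + eps) for x in vector]

normalized = z_score([1.0, 2.0, 3.0])
```

After this step every feature vector lives on a comparable scale, which is the property the abstract relies on to reduce hubness in high-dimensional feature space.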

13.
Sensors (Basel) ; 24(11), 2024 May 30.
Article in English | MEDLINE | ID: mdl-38894310

ABSTRACT

This paper investigates the application of ensemble learning techniques, specifically meta-learning, in intrusion detection systems (IDS) for the Internet of Medical Things (IoMT). It underscores the existing challenges posed by the heterogeneous and dynamic nature of IoMT environments, which necessitate adaptive, robust security solutions. By harnessing meta-learning alongside various ensemble strategies such as stacking and bagging, the paper aims to refine IDS mechanisms to effectively counter evolving cyber threats. The study proposes a performance-driven weighted meta-learning technique for dynamic assignment of voting weights to classifiers based on accuracy, loss, and confidence levels. This approach significantly enhances the intrusion detection capabilities for the IoMT by dynamically optimizing ensemble IDS models. Extensive experiments demonstrate the proposed model's superior performance in terms of accuracy, detection rate, F1 score, and false positive rate compared to existing models, particularly when analyzing various sizes of input features. The findings highlight the potential of integrating meta-learning in ensemble-based IDS to enhance the security and integrity of IoMT networks, suggesting avenues for future research to further advance IDS performance in protecting sensitive medical data and IoT infrastructures.
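
Performance-driven weighted voting can be sketched in a few lines. The abstract names accuracy, loss, and confidence as the inputs to the weights but does not give the formula, so the scoring rule `accuracy * confidence / (1 + loss)` and the function names below are assumptions, not the paper's method.

```python
def voting_weights(stats):
    """stats: list of (accuracy, loss, confidence) per classifier; returns normalized weights."""
    raw = [acc * conf / (1.0 + loss) for acc, loss, conf in stats]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_vote(probabilities, weights):
    """Combine per-classifier class probabilities and return the winning class index."""
    n_classes = len(probabilities[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=combined.__getitem__)

# One strong classifier and two weak ones that disagree with it.
stats = [(0.95, 0.1, 0.9), (0.6, 0.9, 0.5), (0.6, 0.9, 0.5)]
probs = [[0.2, 0.8], [0.7, 0.3], [0.7, 0.3]]
weights = voting_weights(stats)
decision = weighted_vote(probs, weights)
```

With equal weights the two weak classifiers would outvote the strong one on this toy input; recomputing the weights as validation statistics change is what makes the assignment dynamic.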

14.
Diagnostics (Basel) ; 14(12), 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38928629

ABSTRACT

Deep learning has attained state-of-the-art results in general image segmentation problems; however, it requires a substantial number of annotated images to achieve the desired outcomes. In the medical field, the availability of annotated images is often limited. To address this challenge, few-shot learning techniques have been successfully adapted to rapidly generalize to new tasks with only a few samples, leveraging prior knowledge. In this paper, we employ a gradient-based method known as Model-Agnostic Meta-Learning (MAML) for medical image segmentation. MAML is a meta-learning algorithm that quickly adapts to new tasks by updating a model's parameters based on a limited set of training samples. Additionally, we use an enhanced 3D U-Net as the foundational network for our models. The enhanced 3D U-Net is a convolutional neural network specifically designed for medical image segmentation. We evaluate our approach on the TotalSegmentator dataset, considering a few annotated images for four tasks: liver, spleen, right kidney, and left kidney. The results demonstrate that our approach facilitates rapid adaptation to new tasks using only a few annotated images. In 10-shot settings, our approach achieved mean Dice coefficients of 93.70%, 85.98%, 81.20%, and 89.58% for liver, spleen, right kidney, and left kidney segmentation, respectively. In five-shot settings, the approach attained mean Dice coefficients of 90.27%, 83.89%, 77.53%, and 87.01% for liver, spleen, right kidney, and left kidney segmentation, respectively. Finally, we assess the effectiveness of our proposed approach on a dataset collected from a local hospital. Employing five-shot settings, we achieve mean Dice coefficients of 90.62%, 79.86%, 79.87%, and 78.21% for liver, spleen, right kidney, and left kidney segmentation, respectively.
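
For readers unfamiliar with the metric reported above, the Dice coefficient of a predicted mask P and a reference mask T is 2|P ∩ T| / (|P| + |T|). The sketch below assumes flattened binary masks given as lists of 0/1 values; the function name and the convention of returning 1.0 for two empty masks are illustrative choices, not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total

score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

Multiplying the result by 100 gives the percentage form used in the abstract.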

15.
Neural Netw ; 178: 106429, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38901090

ABSTRACT

Although recent studies on blind single image super-resolution (SISR) have achieved significant success, most of them typically require supervised training on synthetic low resolution (LR)-high resolution (HR) paired images. This leads to the need for re-training for different degradations and restricted applications in real-world scenarios with unfavorable inputs. In this paper, we propose an unsupervised blind SISR method with input underlying different degradations, named different degradations blind super-resolution (DDSR). It formulates a Gaussian modeling on blur degradation and employs a meta-learning framework for solving different image degradations. Specifically, a neural network-based kernel generator is optimized by learning from random kernel samples, referred to as random kernel learning. This operation provides effective initialization for blur degradation optimization. At the same time, a meta-learning framework is proposed to resolve multiple degradation modelings on the basis of alternative optimization between blur degradation and image restoration, respectively. Differing from the pre-trained deep-learning methods, the proposed DDSR is implemented in a plug-and-play manner, and is capable of restoring an HR image from unfavorable LR input with degradations such as partial coverage, noise addition, and darkening. Extensive simulations illustrate the superior performance of the proposed DDSR approach compared to state-of-the-art methods on public datasets with comparable memory load and time consumption, yet exhibiting better application flexibility and convenience, and significantly better generalization ability towards multiple degradations. Our code is available at https://github.com/XYLGroup/DDSR.


Subjects
Neural Networks, Computer; Image Processing, Computer-Assisted/methods; Humans; Deep Learning; Algorithms; Computer Simulation; Machine Learning
16.
Sci Rep ; 14(1): 10125, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698202

ABSTRACT

Fairness has become a critical value online, and the latest studies consider it in many problems. In recommender systems, fairness is important since the visibility of items is controlled by systems. Previous fairness-aware recommender systems assume that sufficient relationship data between users and items are available. However, it is common that new users and items are frequently introduced, and they have no relationship data yet. In this paper, we study recommendation methods to enhance fairness in a cold-start state. Fairness is more significant when the preference of a user or the popularity of an item is unknown. We propose a meta-learning-based cold-start recommendation framework called FaRM to alleviate the unfairness of recommendations. The proposed framework consists of three steps. We first propose a fairness-aware meta-path generation method to eliminate bias in sensitive attributes. In addition, we construct fairness-aware user representations through the meta-path aggregation approach. Then, we propose a novel fairness objective function and introduce a joint learning method to minimize the trade-off between relevancy and fairness. In extensive experiments with various cold-start scenarios, it is shown that FaRM is significantly superior in fairness performance while preserving relevance accuracy over previous work.

17.
BMC Med Inform Decis Mak ; 24(1): 137, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38802809

ABSTRACT

BACKGROUND: Modeling causality through graphs, referred to as causal graph learning, offers an appropriate description of the dynamics of causality. The majority of current machine learning models in clinical decision support systems only predict associations between variables, whereas causal graph learning models causality dynamics through graphs. However, building personalized causal graphs for each individual is challenging due to the limited amount of data available for each patient. METHOD: In this study, we present a new algorithmic framework using meta-learning for learning personalized causal graphs in biomedicine. Our framework extracts common patterns from multiple patient graphs and applies this information to develop individualized graphs. In multi-task causal graph learning, the proposed optimized initial guess of shared commonality enables the rapid adoption of knowledge to new tasks for efficient causal graph learning. RESULTS: Experiments on one real-world biomedical causal graph learning benchmark dataset and four synthetic benchmarks show that our algorithm outperformed the baseline methods. Our algorithm can better understand the underlying patterns in the data, leading to more accurate predictions of the causal graph. Specifically, we reduce the structural Hamming distance by 50-75%, indicating an improvement in graph prediction accuracy. Additionally, the false discovery rate is decreased by 20-30%, demonstrating that our algorithm made fewer incorrect predictions compared to the baseline algorithms. CONCLUSION: To the best of our knowledge, this is the first study to demonstrate the effectiveness of meta-learning in personalized causal graph learning and causal inference modeling for biomedicine. In addition, the proposed algorithm can also be generalized to transnational research areas where integrated analysis is necessary for various distributions of datasets, including different clinical institutions.


Subjects
Algorithms; Machine Learning; Humans; Causality
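
The two graph metrics quoted in this abstract can be sketched for directed graphs stored as adjacency matrices, where `adj[i][j] == 1` means an edge from node i to node j. Definitions of structural Hamming distance vary between papers; this sketch assumes the common variant in which a missing, extra, or reversed edge each counts as one error, and the function names are hypothetical.

```python
def structural_hamming_distance(true_adj, pred_adj):
    """Count node pairs whose edge status (absent, i->j, j->i) differs between graphs."""
    n = len(true_adj)
    distance = 0
    for i in range(n):
        for j in range(i + 1, n):
            true_pair = (true_adj[i][j], true_adj[j][i])
            pred_pair = (pred_adj[i][j], pred_adj[j][i])
            if true_pair != pred_pair:
                distance += 1
    return distance

def false_discovery_rate(true_adj, pred_adj):
    """Fraction of predicted directed edges that are absent from the true graph."""
    n = len(pred_adj)
    predicted = [(i, j) for i in range(n) for j in range(n) if pred_adj[i][j]]
    if not predicted:
        return 0.0
    false_edges = sum(1 for i, j in predicted if not true_adj[i][j])
    return false_edges / len(predicted)

# True graph: 0 -> 1 -> 2. Prediction reverses 0 -> 1 and adds 0 -> 2.
true_graph = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
pred_graph = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]
shd = structural_hamming_distance(true_graph, pred_graph)
fdr = false_discovery_rate(true_graph, pred_graph)
```

Lower values are better for both metrics, which is why the abstract reports its gains as reductions.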
18.
Sci Rep ; 14(1): 11963, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796529

ABSTRACT

Due to the challenge of collecting a substantial amount of production-quality data in real-world industrial settings, the implementation of production quality prediction models based on deep learning is not effective. To achieve the goal of predicting production quality with limited data and address the issue of model degradation in the training process of deep learning networks, we propose Meta-Learning based on Residual Network (MLRN) models for production quality prediction with limited data. Firstly, the MLRN model is trained on a variety of learning tasks to acquire knowledge for predicting production quality. Furthermore, to obtain more features with limited data and avoid the issues of gradient disappearing or exploding in deep network training, the enhanced residual network with the effective channel attention (ECA) mechanism is chosen as the basic network structure of MLRN. Additionally, a multi-batch and multi-task data input approach is implemented to prevent overfitting. Finally, the availability of the MLRN model is demonstrated by comparing it with other models using both numerical and graphical datasets.

19.
Entropy (Basel) ; 26(5), 2024 May 16.
Article in English | MEDLINE | ID: mdl-38785677

ABSTRACT

Ensuring the safe and stable operation of high-speed trains necessitates real-time monitoring and diagnostics of their suspension systems. While machine learning technology is widely employed for industrial equipment fault diagnosis, its effective application relies on the availability of a large dataset with annotated fault data for model training. However, in practice, the availability of informational data samples is often insufficient, with most of them being unlabeled. The challenge arises when traditional machine learning methods encounter a scarcity of training data, leading to overfitting due to limited information. To address this issue, this paper proposes a novel few-shot learning method for high-speed train fault diagnosis, incorporating sensor-perturbation injection and meta-confidence learning to improve detection accuracy. Experimental results demonstrate the superior performance of the proposed method, which introduces perturbations, compared to existing methods. The impact of perturbation effects and class numbers on fault detection is analyzed, confirming the effectiveness of our learning strategy.

20.
Front Neurorobot ; 18: 1391247, 2024.
Article in English | MEDLINE | ID: mdl-38736985

ABSTRACT

Introduction: The meta-learning methods have been widely used to solve the problem of few-shot learning. Generally, meta-learners are trained on a variety of tasks and then generalized to novel tasks. Methods: However, existing meta-learning methods do not consider the relationship between meta-tasks and novel tasks during the meta-training period, so that initial models of the meta-learner provide less useful meta-knowledge for the novel tasks. This leads to a weak generalization ability on novel tasks. Meanwhile, different initial models contain different meta-knowledge, which leads to certain differences in the learning effect of novel tasks during the meta-testing period. Therefore, this article puts forward a meta-optimization method based on situational meta-task construction and cooperation of multiple initial models. First, during the meta-training period, a method of constructing situational meta-task is proposed, and the selected candidate task sets provide more effective meta-knowledge for novel tasks. Then, during the meta-testing period, an ensemble model method based on meta-optimization is proposed to minimize the loss of inter-model cooperation in prediction, so that multiple models cooperation can realize the learning of novel tasks. Results: The above-mentioned methods are applied to popular few-shot character datasets and image recognition datasets. Furthermore, the experiment results indicate that the proposed method achieves good effects in few-shot classification tasks. Discussion: In future work, we will extend our methods to provide more generalized and useful meta-knowledge to the model during the meta-training period when the novel few-shot tasks are completely invisible.
