Results 1 - 20 of 59
1.
Article in English | MEDLINE | ID: mdl-38536695

ABSTRACT

Few-shot image classification (FSIC) is beneficial for a variety of real-world scenarios, aiming to construct a recognition system from limited training data. In this article, we extend the original FSIC task by incorporating defense against malicious adversarial examples. This is an arduous challenge because numerous deep learning-based approaches remain susceptible to adversarial examples even when trained with ample data. Previous studies on this problem have predominantly concentrated on the meta-learning framework, which involves sampling numerous few-shot tasks during the training stage. In contrast, we propose a straightforward but effective baseline that learns robust and discriminative representations without tedious meta-task sampling and generalizes to unforeseen adversarial FSIC tasks. Specifically, we introduce an adversarial-aware (AA) mechanism that exploits feature-level distinctions between the legitimate and adversarial domains to provide supplementary supervision. Moreover, we design a novel adversarial reweighting training strategy to ameliorate the imbalance among adversarial examples. To further enhance adversarial robustness without compromising discriminative features, we propose a cyclic feature purifier during the postprocessing projection, which reduces the interference of unforeseen adversarial examples. Furthermore, our method obtains robust feature embeddings that maintain superior transferability even when facing cross-domain adversarial examples. Extensive experiments and systematic analyses demonstrate that, on three standard benchmarks, our method surpasses existing adversarially robust FSIC algorithms by a substantial margin in both robustness and natural performance.
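
A minimal sketch of the reweighting idea in PyTorch, under the assumption that harder adversarial examples receive larger weights (the function name, the softmax-based weighting, and all tensors are illustrative; the paper's exact strategy may differ):

```python
import torch
import torch.nn.functional as F

def reweighted_adversarial_loss(adv_logits, labels):
    # Per-example loss on adversarial inputs; higher-loss (harder)
    # examples get larger normalized weights, one plausible reading
    # of a reweighting strategy that reduces imbalance.
    per_example = F.cross_entropy(adv_logits, labels, reduction="none")
    weights = torch.softmax(per_example.detach(), dim=0) * per_example.numel()
    return (weights * per_example).mean()
```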

2.
Article in English | MEDLINE | ID: mdl-37402198

ABSTRACT

The pandemic of coronavirus disease 2019 (COVID-19) has led to a global public health crisis, causing millions of deaths and billions of infections and greatly increasing the pressure on medical resources. With the continuous emergence of viral mutations, automated tools for COVID-19 diagnosis are highly desired to assist clinical diagnosis and reduce the tedious workload of image interpretation. However, medical images at a single site are usually limited in quantity or weakly labeled, while integrating data scattered across different institutions to build effective models is not allowed due to data policy restrictions. In this article, we propose a novel privacy-preserving cross-site framework for COVID-19 diagnosis with multimodal data, seeking to effectively leverage heterogeneous data from multiple parties while preserving patients' privacy. Specifically, a Siamese branched network is introduced as the backbone to capture inherent relationships across heterogeneous samples. The redesigned network can handle semisupervised inputs in multiple modalities and conduct task-specific training, improving model performance in various scenarios. The framework achieves significant improvement compared with state-of-the-art methods, as we demonstrate through extensive simulations on real-world datasets.

3.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13328-13343, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37379198

ABSTRACT

Multi-party learning provides an effective approach for training a machine learning model, e.g., a deep neural network (DNN), over decentralized data by leveraging multiple decentralized computing devices, subject to legal and practical constraints. Different parties, so-called local participants, usually provide heterogeneous data in a decentralized mode, leading to non-IID data distributions across participants that pose a notorious challenge for multi-party learning. To address this challenge, we propose a novel heterogeneous differentiable sampling (HDS) framework. Inspired by the dropout strategy in DNNs, a data-driven network sampling strategy is devised in the HDS framework, with differentiable sampling rates that allow each local participant to extract, from a common global model, the optimal local model that best adapts to its own data properties; the size of the local model can thus be significantly reduced to enable more efficient inference. Meanwhile, co-adaptation of the global model via learning such local models achieves better performance under non-IID data distributions and speeds up the convergence of the global model. Experiments demonstrate the superiority of the proposed method over several popular multi-party learning techniques in multi-party settings with non-IID data distributions.
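
A minimal sketch of a dropout-inspired gate with a differentiable (learnable) keep rate, using the concrete/Gumbel-sigmoid relaxation; the class name and the relaxation choice are assumptions for illustration, not the paper's exact mechanism:

```python
import torch
import torch.nn as nn

class DifferentiableChannelGate(nn.Module):
    """Dropout-style channel gate whose sampling rate is a learnable
    parameter, relaxed via the Gumbel-sigmoid trick so it can be
    optimized per participant by gradient descent."""
    def __init__(self, n_channels, temperature=0.5):
        super().__init__()
        self.logit_keep = nn.Parameter(torch.zeros(n_channels))
        self.temperature = temperature

    def forward(self, x):                      # x: (batch, n_channels, ...)
        u = torch.rand_like(self.logit_keep).clamp(1e-6, 1 - 1e-6)
        noise = torch.log(u) - torch.log(1 - u)
        gate = torch.sigmoid((self.logit_keep + noise) / self.temperature)
        return x * gate.view(1, -1, *([1] * (x.dim() - 2)))
```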

4.
IEEE Trans Pattern Anal Mach Intell ; 45(6): 7142-7156, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37145953

ABSTRACT

Transfer regression is a practical and challenging problem with important applications in various domains, such as engineering design and localization. Capturing the relatedness of different domains is key to adaptive knowledge transfer. In this paper, we investigate an effective way of explicitly modeling domain relatedness through a transfer kernel: a transfer-specified kernel that considers domain information in the covariance calculation. Specifically, we first give the formal definition of the transfer kernel and introduce three basic general forms that cover existing related work well. To cope with the limitations of the basic forms in handling complex real-world data, we further propose two advanced forms. Corresponding instantiations of the two forms are developed, namely Trkαβ and Trkω, based on multiple kernel learning and neural networks, respectively. For each instantiation, we present a condition under which positive semi-definiteness is guaranteed, together with a semantic interpretation of the learned domain relatedness. Moreover, the condition can be easily used in learning TrGPαβ and TrGPω, the Gaussian process models equipped with the transfer kernels Trkαβ and Trkω, respectively. Extensive empirical studies show the effectiveness of TrGPαβ and TrGPω in domain relatedness modeling and transfer adaptiveness.
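
A minimal sketch of the core transfer-kernel idea, a base kernel scaled by a domain-relatedness coefficient; this corresponds to the simplest basic form, not the advanced Trkαβ/Trkω forms, and all names are illustrative:

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def transfer_kernel(X, Y, dom_X, dom_Y, lam, gamma=1.0):
    """Base kernel scaled by a relatedness coefficient lam for
    cross-domain pairs (1.0 within a domain). With two domains,
    constraining |lam| <= 1 is one simple way to keep the resulting
    matrix positive semi-definite."""
    K = rbf(X, Y, gamma)
    cross = dom_X[:, None] != dom_Y[None, :]
    return K * np.where(cross, lam, 1.0)
```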

5.
Sci Rep ; 13(1): 7842, 2023 May 15.
Article in English | MEDLINE | ID: mdl-37188695

ABSTRACT

In multi-objective optimization, it becomes prohibitively difficult to cover the Pareto front (PF) as the number of points needed scales exponentially with the dimensionality of the objective space. The challenge is exacerbated in expensive optimization domains where evaluation data are at a premium. To overcome insufficient representations of PFs, Pareto estimation (PE) invokes inverse machine learning to map preferred but unexplored regions along the front to the Pareto set in decision space. However, the accuracy of the inverse model depends on the training data, which are inherently scarce given high-dimensional or expensive objectives. To alleviate this small-data challenge, this paper marks a first study of multi-source inverse transfer learning for PE. A method to maximally utilize experiential source tasks to augment PE in the target optimization task is proposed. Information transfer between heterogeneous source-target pairs is uniquely enabled in the inverse setting through the unification provided by common objective spaces. Our approach is tested experimentally on benchmark functions as well as on high-fidelity, multidisciplinary simulation data of composite materials manufacturing processes, revealing significant gains in the predictive accuracy and PF approximation capacity of Pareto set learning. With such accurate inverse models made feasible, a future of on-demand human-machine interaction facilitating multi-objective decisions is envisioned.
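
A minimal sketch of Pareto estimation as inverse regression, with synthetic placeholder data (the model choice, array sizes, and preferred point are illustrative assumptions; the paper's multi-source transfer across tasks is not shown):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Learn a map from objective-space points near the front back to
# decision variables, then query it at a preferred but unexplored
# trade-off. Data here is random and purely illustrative.
F_train = np.random.rand(50, 2)           # objective vectors (scarce)
X_train = np.random.rand(50, 10)          # corresponding decision vectors
inverse_model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
inverse_model.fit(F_train, X_train)

f_preferred = np.array([[0.3, 0.6]])      # unexplored region of the front
x_candidate = inverse_model.predict(f_preferred)
```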

6.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8206-8226, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015510

ABSTRACT

Recently, one critical issue has loomed large in the field of recommender systems: the lack of effective benchmarks for rigorous evaluation, which leads to unreproducible evaluation and unfair comparison. We therefore conduct studies from the perspectives of practical theory and experiments, aiming at benchmarking recommendation for rigorous evaluation. Regarding the theoretical study, a series of hyper-factors affecting recommendation performance throughout the whole evaluation chain is systematically summarized and analyzed via an exhaustive review of 141 papers published at eight top-tier conferences between 2017 and 2020. We then classify them into model-independent and model-dependent hyper-factors, and different modes of rigorous evaluation are defined and discussed in depth accordingly. For the experimental study, we release the DaisyRec 2.0 library, which integrates these hyper-factors to perform rigorous evaluation, whereby a holistic empirical study is conducted to unveil the impact of different hyper-factors on recommendation performance. Supported by the theoretical and experimental studies, we finally create benchmarks for rigorous evaluation by proposing standardized procedures and providing the performance of ten state-of-the-art methods across six evaluation metrics on six datasets as a reference for later studies. Overall, our work sheds light on the issues in recommendation evaluation, provides potential solutions for rigorous evaluation, and lays the foundation for further investigation.

7.
Chaos ; 33(2): 023126, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36859223

ABSTRACT

Granger causality is a commonly used method for uncovering information flow and dependencies in time series. Here, we introduce JGC (Jacobian Granger causality), a neural network-based approach to Granger causality that uses the Jacobian as a measure of variable importance, and propose a variable selection procedure for inferring Granger causal variables with this measure, using criteria of significance and consistency. The resulting approach performs consistently well compared to other approaches in identifying Granger causal variables, the associated time lags, and interaction signs. In addition, we discuss the need for contemporaneous variables in Granger causal modeling, as well as how these neural network-based approaches reduce the impact of nonseparability in dynamical systems, a problem where predictive information on a target variable is not unique to its causes but is also contained in the history of the target variable itself.
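
A minimal sketch of the Jacobian-as-importance step, assuming a single-output predictor of the target series from lagged inputs (function names and shapes are illustrative; the paper's significance and consistency criteria are not shown):

```python
import torch

def jacobian_importance(model, X_lagged):
    # X_lagged: (n_samples, n_lags, n_vars); model predicts the target
    # variable one step ahead, one scalar per sample.
    X = X_lagged.clone().requires_grad_(True)
    pred = model(X).sum()                 # one backward pass covers all samples
    grad, = torch.autograd.grad(pred, X)  # d(prediction)/d(lagged inputs)
    # Average gradient magnitude per (lag, variable): large scores mark
    # candidate Granger-causal links; signs of the raw gradients would
    # indicate interaction direction.
    return grad.abs().mean(dim=0)
```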

8.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3862-3876, 2023 Mar.
Article in English | MEDLINE | ID: mdl-35727778

ABSTRACT

Multi-source transfer regression is a practical and challenging problem in which capturing the diverse relatedness of different domains is key to adaptive knowledge transfer. In this article, we propose an effective way of explicitly modeling the domain relatedness of each domain pair through transfer kernel learning. Specifically, we first discuss the advantages and disadvantages of existing transfer kernels in handling the multi-source transfer regression problem. To cope with their limitations, we further propose a novel multi-source transfer kernel kms. The proposed kms assigns a learnable parametric coefficient to model the relatedness of each inter-domain pair, while fixing the relatedness of each intra-domain pair to 1. Moreover, to capture the heterogeneous data characteristics of multiple domains, kms exploits different standard kernels for different domain pairs. We further provide a theorem that not only guarantees the positive semi-definiteness of kms but also conveys a semantic interpretation of the learned domain relatedness. The theorem can also be easily used in learning the corresponding transfer Gaussian process model with kms. Extensive empirical studies show the effectiveness of our proposed method in domain relatedness modeling and transfer performance.

9.
IEEE Trans Cybern ; 53(1): 483-496, 2023 Jan.
Article in English | MEDLINE | ID: mdl-34818203

ABSTRACT

In dealing with expensive multiobjective optimization problems, some algorithms convert the problem into a number of single-objective subproblems for optimization. At each iteration, these algorithms conduct surrogate-assisted optimization on one or multiple subproblems. However, these subproblems may be unnecessary or already resolved, and operating on them can cause severe inefficiencies, especially in the case of expensive optimization. To overcome this shortcoming, we propose an adaptive subproblem selection (ASS) strategy to identify the most promising subproblems for further modeling. To better leverage the cross information between subproblems, we use a collaborative multioutput Gaussian process (CoMOGP) surrogate to model them jointly. Moreover, commonly used acquisition functions (also known as infill criteria) are investigated in this article. Our analysis reveals that these acquisition functions may cause severe imbalances between exploitation and exploration in multiobjective optimization scenarios. Consequently, we develop a new acquisition function, namely the adaptive lower confidence bound (ALCB), to cope with this. Experimental results on three different sets of benchmark problems indicate that our proposed algorithm is competitive. Beyond that, we also quantitatively validate the effectiveness of the ASS strategy, the CoMOGP model, and the ALCB acquisition function.
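
A minimal sketch of an adaptive lower confidence bound for minimization, with an illustrative exponential-decay schedule for the exploration weight (the paper's ALCB adapts the bound differently; names and defaults are assumptions):

```python
import numpy as np

def adaptive_lcb(mu, sigma, t, beta0=2.0, decay=0.05):
    """Lower confidence bound whose exploration weight shrinks as the
    iteration count t grows, shifting the balance from exploration
    toward exploitation over the run."""
    beta = beta0 * np.exp(-decay * t)
    return mu - beta * sigma   # smaller values = more promising candidates
```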

10.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9040-9053, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35298385

ABSTRACT

Neural architecture search (NAS) has attracted much attention in recent years. It automates the construction of neural networks for different tasks, which has traditionally been done manually. In the literature, evolutionary optimization (EO) has been proposed for NAS due to its strong global search capability. However, despite the success enjoyed by EO, existing EO algorithms for NAS are often very computationally expensive, which makes them impractical in reality. Keeping this in mind, in this article we propose an efficient memetic algorithm (MA) for automated convolutional neural network (CNN) architecture search. In contrast to existing EO algorithms for CNN architecture design, a new cell-based architecture search space and new global and local search operators are proposed for CNN architecture search. To further improve the efficiency of our proposed algorithm, we develop a one-epoch-based performance estimation strategy, requiring no pretrained models, to evaluate each discovered architecture on the training datasets. To investigate the performance of the proposed method, comprehensive empirical studies are conducted against 34 state-of-the-art peer algorithms, including manual designs, reinforcement learning (RL) algorithms, gradient-based algorithms, and evolutionary algorithms (EAs), on the widely used CIFAR10 and CIFAR100 datasets. The results confirm the efficacy of the proposed approach for automated CNN architecture design.

11.
IEEE Trans Cybern ; 53(5): 2955-2968, 2023 May.
Article in English | MEDLINE | ID: mdl-35044926

ABSTRACT

The performance of machine learning algorithms heavily relies on the availability of a large amount of training data. In reality, however, data usually reside in distributed parties such as different institutions and may not be directly gathered and integrated due to various data policy constraints. As a result, some parties may suffer from insufficient data for training machine learning models. In this article, we propose a multiparty dual learning (MPDL) framework to alleviate the problem of limited, poor-quality data in an isolated party. Since the knowledge-sharing processes for multiple parties always emerge in dual forms, we show that dual learning is naturally suitable for handling the challenge of missing data, and it explicitly exploits the probabilistic correlation and structural relationship between dual tasks to regularize the training process. We introduce a feature-oriented differential privacy with mathematical proof, in order to avoid possible privacy leakage of raw features in the dual inference process. The approach requires minimal modifications to the existing multiparty learning structure, and each party can build flexible and powerful models separately, with accuracy no less than that of nondistributed self-learning approaches. The MPDL framework achieves significant improvement compared with state-of-the-art multiparty learning methods, as we demonstrate through simulations on real-world datasets.
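
A minimal sketch of feature-level privatization: clip feature vectors to bound sensitivity, then add Laplace noise before they leave a party (the paper's feature-oriented differential privacy differs in detail; the function name and parameters are illustrative assumptions):

```python
import numpy as np

def privatize_features(feats, epsilon=1.0, sensitivity=1.0):
    """Clip each feature vector's norm to bound sensitivity, then add
    Laplace noise calibrated to (sensitivity / epsilon) so that raw
    features are never shared across parties directly."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    clipped = feats * np.minimum(1.0, sensitivity / np.maximum(norms, 1e-12))
    noise = np.random.laplace(scale=sensitivity / epsilon, size=feats.shape)
    return clipped + noise
```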

12.
IEEE Trans Neural Netw Learn Syst ; 34(9): 6146-6157, 2023 Sep.
Article in English | MEDLINE | ID: mdl-34936559

ABSTRACT

Deep reinforcement learning (DRL) policies have been shown to be deceived by perturbations (e.g., random noise or intentional adversarial attacks) on state observations that appear at test time but are unknown during training. To increase the robustness of DRL policies, previous approaches assume that explicit adversarial information can be added into the training process to achieve generalization on such perturbed observations. However, such approaches not only make robustness improvement more expensive but may also leave a model prone to other kinds of attacks in the wild. In contrast, we propose an adversary-agnostic robust DRL paradigm that does not require learning from predefined adversaries. To this end, we first theoretically show that robustness can indeed be achieved independently of the adversaries in a policy distillation (PD) setting. Motivated by this finding, we propose a new PD loss with two terms: 1) a prescription gap maximization (PGM) loss that simultaneously maximizes the likelihood of the action selected by the teacher policy and the entropy over the remaining actions and 2) a corresponding Jacobian regularization (JR) loss that minimizes the magnitude of gradients with respect to the input state. Our theoretical analysis substantiates that the distillation loss is guaranteed to increase the prescription gap and hence improves adversarial robustness. Furthermore, experiments on five Atari games firmly verify the superiority of our approach over state-of-the-art baselines.
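
A minimal sketch of the two-term distillation loss in PyTorch, assuming `student` maps states to action logits (the full-entropy stand-in for entropy over the remaining actions, the weight `lam_jr`, and all names are assumptions, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def pd_loss(student, teacher_actions, states, lam_jr=0.01):
    states = states.clone().requires_grad_(True)
    logits = student(states)
    log_p = F.log_softmax(logits, dim=-1)
    # (1) Prescription-gap term: likelihood of the teacher's action plus
    # policy entropy (full entropy used here as a simple stand-in for
    # entropy over the non-teacher actions).
    nll = F.nll_loss(log_p, teacher_actions)
    entropy = -(log_p.exp() * log_p).sum(dim=-1).mean()
    pgm = nll - entropy
    # (2) Jacobian regularization: penalize input-gradient magnitude.
    grads, = torch.autograd.grad(logits.sum(), states, create_graph=True)
    jr = grads.pow(2).sum(dim=tuple(range(1, grads.dim()))).mean()
    return pgm + lam_jr * jr
```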

13.
IEEE Trans Cybern ; 53(3): 1776-1789, 2023 Mar.
Article in English | MEDLINE | ID: mdl-34936562

ABSTRACT

Collision-avoidance control for UAV swarms has recently drawn great attention due to its significant implications in many industrial and commercial applications. However, traditional collision-avoidance models for UAV swarms tend to focus on avoidance at the individual UAV level, with no explicit strategy designed for avoidance among multiple UAV groups. When these models are directly applied to multigroup UAV scenarios, deadlocks may arise: a group of UAVs may be temporarily blocked by other groups in a narrow space and unable to progress toward its goal. To this end, this article proposes a modeling and optimization approach to multigroup UAV collision avoidance. Specifically, group-level collision detection and adaptation mechanisms are introduced, which efficiently detect potential collisions among different UAV groups and restructure a group into subgroups for better collision and deadlock avoidance. A two-level control model is then designed for realizing collision avoidance among UAV groups and among UAVs within each group. Finally, an evolutionary multitask optimization method is introduced to effectively calibrate the parameters at the different levels of our control model, and an adaptive fitness evaluation strategy is proposed to reduce the computation overhead of simulation-based optimization. The simulation results show that our model achieves superior performance in deadlock resolution, motion stability, and distance maintenance in multigroup UAV scenarios compared to state-of-the-art collision-avoidance models. The model optimization results also show that our optimization method can largely reduce the execution time of the computationally intensive optimization process that involves UAV swarm simulation.

14.
IEEE Trans Cybern ; 53(10): 6160-6172, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35446777

ABSTRACT

In today's digital world, we are faced with an explosion of data and models produced and manipulated by numerous large-scale cloud-based applications. Under such settings, existing transfer evolutionary optimization (TrEO) frameworks grapple with simultaneously satisfying two important quality attributes, namely: 1) scalability against a growing number of source tasks and 2) online learning agility against sparsity of sources relevant to the target task of interest. Satisfying these attributes facilitates practical deployment of transfer optimization in scenarios with big task instances while curbing the threat of negative transfer. While applications of existing algorithms are limited to tens of source tasks, in this article we enable a scale-up of more than two orders of magnitude in the number of tasks; that is, we efficiently handle scenarios with over 1000 source task instances. We devise a novel TrEO framework comprising two co-evolving species for joint evolution in the space of source knowledge and in the search space of solutions to the target problem. In particular, co-evolution enables the learned knowledge to be orchestrated on the fly, expediting convergence in the target optimization task. We have conducted an extensive series of experiments across a set of practically motivated discrete and continuous optimization examples comprising a large number of source task instances, of which only a small fraction are related to the target. The experimental results show that the proposed framework not only scales efficiently with a growing number of source tasks but is also effective in capturing relevant knowledge against sparsity of related sources, fulfilling the two salient features of scalability and online learning agility.

15.
IEEE Trans Cybern ; 53(10): 6222-6235, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35476555

ABSTRACT

Graph classification aims to predict the label associated with a graph and is an important graph analytic task with widespread applications. Recently, graph neural networks (GNNs) have achieved state-of-the-art results on purely supervised graph classification by virtue of the powerful representational ability of neural networks. However, almost all of them ignore the fact that graph classification usually lacks reasonably sufficient labeled data in practical scenarios, due to the inherent labeling difficulty caused by the high complexity of graph data. Existing semisupervised GNNs typically focus on node classification and cannot deal with graph classification. To tackle this challenging but practically useful scenario, we propose a novel and general semisupervised GNN framework for graph classification that takes full advantage of a small number of labeled graphs and abundant unlabeled graph data. In our framework, we train two GNNs as complementary views for collaboratively learning high-quality classifiers using both labeled and unlabeled graphs. To further exploit each view, we continually select high-confidence pseudo-labeled graphs from that view to enlarge the labeled dataset and enhance predictions on graphs. Furthermore, the proposed framework is investigated in two specific implementation regimes, with few labeled graphs and with extremely few labeled graphs, respectively. Extensive experimental results demonstrate the effectiveness of our proposed semisupervised GNN framework for graph classification on several benchmark datasets.
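
A minimal sketch of the confidence-based pseudo-labeling step one view would perform (the threshold and names are illustrative assumptions; the graph encoders themselves are not shown):

```python
import torch

def select_confident_pseudolabels(logits_unlabeled, threshold=0.95):
    """Promote unlabeled graphs whose predicted class probability
    exceeds a threshold into the labeled pool, paired with their
    predicted labels. Returns (indices, pseudo_labels)."""
    probs = torch.softmax(logits_unlabeled, dim=-1)
    conf, pseudo = probs.max(dim=-1)
    keep = conf >= threshold
    return keep.nonzero(as_tuple=True)[0], pseudo[keep]
```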

16.
IEEE Trans Cybern ; 53(7): 4347-4360, 2023 Jul.
Article in English | MEDLINE | ID: mdl-35560088

ABSTRACT

Many real-world problems, such as airfoil design, involve optimizing a black-box, expensive objective function over a complex-structured input space (e.g., a discrete space or a non-Euclidean space). In this article, a two-stage procedure labeled generative model-based optimization (GMO), which maps the complex-structured input space into a latent space of dozens of variables, shows promise in solving such problems. However, the latent dimension of GMO is hard to determine, which may force a trade-off between desirable solution accuracy and convergence rate. To address this issue, we propose a multiform GMO approach, namely generative multiform optimization (GMFoO), which conducts optimization over multiple latent spaces simultaneously so that they complement each other. More specifically, we devise a generative model that promotes a positive correlation between latent spaces to facilitate effective knowledge transfer in GMFoO. Furthermore, using Bayesian optimization (BO) as the optimizer, we propose two strategies to continuously exchange information between these latent spaces. Experimental results on airfoil and corbel design problems, as well as an area-maximization problem, demonstrate that our proposed GMFoO converges to better designs within a limited computational budget.

17.
IEEE Trans Cybern ; PP, 2022 Nov 09.
Article in English | MEDLINE | ID: mdl-36350864

ABSTRACT

In an era of pervasive digitalization, the growing volume and variety of data streams pose a new challenge to the efficient running of data-driven optimization algorithms. Targeting scalable multiobjective evolution under large-instance data, this article proposes the general idea of using subsampled small-data tasks as helpful minions (i.e., auxiliary source tasks) to quickly optimize for large datasets, via an evolutionary multitasking framework. Within this framework, a novel computational resource allocation strategy is designed to enable effective utilization of the minions while guarding against harmful negative transfer. To this end, an intertask empirical correlation measure is defined and approximated via Bayes' rule, which is then used to allocate resources online in proportion to the inferred degree of source-target correlation. In the experiments, the performance of the proposed algorithm is verified on: 1) sample average approximations of benchmark multiobjective optimization problems under uncertainty and 2) practical multiobjective hyperparameter tuning of deep neural network models. The results show that the proposed algorithm can obtain up to about 73% speedup relative to existing approaches, demonstrating its ability to efficiently tackle real-world multiobjective optimization involving evaluations on large datasets.
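
A minimal sketch of correlation-proportional resource allocation, taking the inferred source-target correlations as given rather than deriving them via Bayes' rule (the function name, the floor, and the rounding are illustrative assumptions):

```python
import numpy as np

def allocate_resources(correlations, total_evals, floor=1):
    """Give each auxiliary small-data task a share of the evaluation
    budget proportional to its inferred source-target correlation,
    with a small floor so no task is starved entirely."""
    c = np.clip(np.asarray(correlations, dtype=float), 0.0, None)
    share = c / c.sum() if c.sum() > 0 else np.full_like(c, 1.0 / len(c))
    return np.maximum(floor, np.round(share * total_evals).astype(int))
```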

18.
Article in English | MEDLINE | ID: mdl-35853062

ABSTRACT

Recent years have witnessed the great success of group buying (GB) in social e-commerce, opening up a new way of online shopping. In this business model, a user can launch a GB as an initiator to share a product of interest with social friends. The GB is clinched once enough friends join in as participants to co-purchase the shared product. As such, a successful GB depends not only on whether the initiator can find a product of interest but also on whether friends are willing to join in as participants. Most existing recommenders are inadequate in such a complex scenario, as they merely seek to help users find their preferred products and cannot identify potential participants for a GB. To this end, we propose a novel joint product-participant recommendation (J2PRec) framework, which recommends both candidate products and participants to maximize the success rate of a GB. Specifically, J2PRec first designs a relational graph embedding module, which effectively encodes the various relations in GB for learning enhanced user and product embeddings. It then jointly learns the product and participant recommendation tasks under a probabilistic framework to maximize the GB likelihood, i.e., to boost the success rate of a GB. Extensive experiments on three real-world datasets demonstrate the superiority of J2PRec for GB recommendation.

19.
Article in English | MEDLINE | ID: mdl-35749327

ABSTRACT

Current one-stage methods for visual grounding encode the language query as one holistic sentence embedding before fusing it with visual features for target localization. Such a formulation provides insufficient ability to model the query at the word level and is therefore prone to neglecting words that may not be the most important for the sentence but are critical for the referred object. In this article, we propose Word2Pix: a one-stage visual grounding network based on the encoder-decoder transformer architecture that learns textual-to-visual feature correspondence via word-to-pixel attention. Each word from the query sentence is given an equal opportunity to attend to visual pixels through multiple stacks of transformer decoder layers. In this way, the decoder can learn to model the language query and fuse language with visual features for target prediction simultaneously. We conduct experiments on the RefCOCO, RefCOCO+, and RefCOCOg datasets, and the proposed Word2Pix outperforms existing one-stage methods by a notable margin. The results also show that Word2Pix surpasses two-stage visual grounding models while keeping the merits of the one-stage paradigm, namely end-to-end training and fast inference. Code is available at https://github.com/azurerain7/Word2Pix.
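
A minimal sketch of word-to-pixel cross-attention with a standard PyTorch layer (the embedding dimension, head count, and grid size are illustrative assumptions, not Word2Pix's configuration):

```python
import torch
import torch.nn as nn

# Each word embedding queries the flattened grid of visual features,
# as in a standard transformer decoder cross-attention sublayer.
d_model = 256
attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
words = torch.randn(2, 12, d_model)         # (batch, n_words, d_model)
pixels = torch.randn(2, 20 * 20, d_model)   # flattened 20x20 feature map
attended, weights = attn(query=words, key=pixels, value=pixels)
print(attended.shape, weights.shape)        # per-word visual context
```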

20.
IEEE Trans Cybern ; 52(9): 9820-9833, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35687641

ABSTRACT

Bayesian optimization (BO) is well known to be sample-efficient for solving black-box problems. However, BO algorithms may get stuck in suboptimal solutions even with plenty of samples. Intrinsically, this suboptimality can be attributed to the poor surrogate accuracy of the trained Gaussian process (GP), particularly in the regions where the optimal solutions lie. Hence, we propose to build multiple GP models, instead of a single GP surrogate, to complement each other and thus resolve the suboptimality of BO. Nevertheless, according to the bias-variance tradeoff, individual prediction errors can increase as model diversity increases, which may lead to even worse overall surrogate accuracy. On the other hand, based on the theory of Rademacher complexity, it has been proven that exploiting the agreement of models on unlabeled information can reduce the complexity of the hypothesis space, thereby achieving the required surrogate accuracy with fewer samples. The value of model agreement has been extensively demonstrated in co-training-style algorithms, which boost model accuracy with a small portion of samples. Inspired by the above, we propose a novel BO algorithm labeled co-learning BO (CLBO), which exploits both model diversity and agreement on unlabeled information to improve the overall surrogate accuracy with limited samples, thereby achieving more efficient global optimization. Tests on five numerical toy problems and three engineering benchmarks demonstrate the effectiveness of the proposed CLBO.
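
A minimal sketch of the multiple-surrogate idea with scikit-learn GPs: diverse models predict candidate points, and their spread signals the (dis)agreement a co-learning scheme could exploit (the kernel choices and names are illustrative; CLBO's actual agreement mechanism is not shown):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

def ensemble_predict(X_train, y_train, X_cand):
    """Fit several GPs with different kernels and return the ensemble
    mean (consensus) and standard deviation (disagreement) at the
    candidate points."""
    models = [GaussianProcessRegressor(kernel=k, normalize_y=True)
              for k in (RBF(), Matern(nu=1.5), Matern(nu=2.5))]
    preds = np.stack([m.fit(X_train, y_train).predict(X_cand)
                      for m in models])
    return preds.mean(axis=0), preds.std(axis=0)
```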
