1.
Article in English | MEDLINE | ID: mdl-38963736

ABSTRACT

Cluster analysis plays an indispensable role in machine learning and data mining. Learning a good data representation is crucial for clustering algorithms. Recently, deep clustering (DC), which can learn clustering-friendly representations using deep neural networks (DNNs), has been broadly applied in a wide range of clustering tasks. Existing surveys for DC mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering. To address this issue, in this article, we provide a comprehensive survey for DC in views of data sources. With different data sources, we systematically distinguish the clustering methods in terms of methodology, prior knowledge, and architecture. Concretely, DC methods are introduced according to four categories, i.e., traditional single-view DC, semi-supervised DC, deep multiview clustering (MVC), and deep transfer clustering. Finally, we discuss the open challenges and potential future opportunities in different fields of DC.

2.
Neural Netw ; 176: 106341, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38692189

ABSTRACT

The great learning ability of deep learning facilitates us to comprehend the real physical world, making learning to simulate complicated particle systems a promising endeavour both in academia and industry. However, the complex laws of the physical world pose significant challenges to the learning based simulations, such as the varying spatial dependencies between interacting particles and varying temporal dependencies between particle system states in different time stamps, which dominate particles' interacting behavior and the physical systems' evolution patterns. Existing learning based methods fail to fully account for the complexities, making them unable to yield satisfactory simulations. To better comprehend the complex physical laws, we propose a novel model - Graph Networks with Spatial-Temporal neural Ordinary Differential Equations (GNSTODE) - that characterizes the varying spatial and temporal dependencies in particle systems using a united end-to-end framework. Through training with real-world particle-particle interaction observations, GNSTODE can simulate any possible particle systems with high precisions. We empirically evaluate GNSTODE's simulation performance on two real-world particle systems, Gravity and Coulomb, with varying levels of spatial and temporal dependencies. The results show that GNSTODE yields better simulations than state-of-the-art methods, showing that GNSTODE can serve as an effective tool for particle simulation in real-world applications. Our code is made available at https://github.com/Guangsi-Shi/AI-for-physics-GNSTODE.


Subject(s)
Computer Simulation , Neural Networks, Computer , Gravitation , Physics , Deep Learning , Algorithms
3.
Article in English | MEDLINE | ID: mdl-38743549

ABSTRACT

Adversarial training (AT) is widely considered the most promising strategy to defend against adversarial attacks and has drawn increasing interest from researchers. However, the existing AT methods still suffer from two challenges. First, they are unable to handle unrestricted adversarial examples (UAEs), which are built from scratch, as opposed to restricted adversarial examples (RAEs), which are created by adding perturbations bounded by an ℓp norm to observed examples. Second, the existing AT methods often achieve adversarial robustness at the expense of standard generalizability (i.e., the accuracy on natural examples) because they make a tradeoff between them. To overcome these challenges, we propose a unique viewpoint that understands UAEs as imperceptibly perturbed unobserved examples. Also, we find that the tradeoff results from the separation of the distributions of adversarial examples and natural examples. Based on these ideas, we propose a novel AT approach called Provable Unrestricted Adversarial Training (PUAT), which can provide a target classifier with comprehensive adversarial robustness against both UAE and RAE, and simultaneously improve its standard generalizability. Particularly, PUAT utilizes partially labeled data to achieve effective UAE generation by accurately capturing the natural data distribution through a novel augmented triple-GAN. At the same time, PUAT extends the traditional AT by introducing the supervised loss of the target classifier into the adversarial loss and achieves the alignment between the UAE distribution, the natural data distribution, and the distribution learned by the classifier, with the collaboration of the augmented triple-GAN. Finally, the solid theoretical analysis and extensive experiments conducted on widely-used benchmarks demonstrate the superiority of PUAT.

4.
Article in English | MEDLINE | ID: mdl-38648122

ABSTRACT

While existing fairness interventions show promise in mitigating biased predictions, most studies concentrate on single-attribute protections. Although a few methods consider multiple attributes, they either require additional constraints or prediction heads, incurring high computational overhead or jeopardizing the stability of the training process. More critically, they consider per-attribute protection approaches, raising concerns about fairness gerrymandering where certain attribute combinations remain unfair. This work aims to construct a neutral domain containing fused information across all subgroups and attributes. It delivers fair predictions as the fused input contains neutralized information for all considered attributes. Specifically, we adopt mixup operations to generate samples with fused information. However, our experiments reveal that directly adopting the operations leads to degraded prediction results. The excessive mixup operations result in unrecognizable training data. To this end, we design three distinct mixup schemes that balance information fusion across attributes while retaining distinct visual features critical for training valid models. Extensive experiments with multiple datasets and up to eight sensitive attributes demonstrate that the proposed MultiFair method can deliver fairness protections for multiple attributes while maintaining valid prediction results.
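
The mixup operation that this abstract builds on can be illustrated with a minimal sketch. This is the generic mixup formulation (a convex combination of two samples and their labels), not one of the three MultiFair schemes, which the abstract does not specify. The function name `mixup` and the toy inputs are hypothetical.

```python
def mixup(x_a, x_b, y_a, y_b, lam):
    """Generic mixup: blend two samples and their label vectors with weight lam.

    Hypothetical helper for illustration; lam = 1 returns sample a, lam = 0 returns sample b.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


# Two toy "images" flattened to pixel lists, with one-hot labels (made-up data).
x_mix, y_mix = mixup([0.0, 1.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam=0.25)
```

With `lam` close to 0.5 both inputs contribute equally, which is one way to see why heavy mixing can make training samples hard to recognize, as the abstract reports.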

5.
Article in English | MEDLINE | ID: mdl-38687672

ABSTRACT

Multiple instance learning (MIL) trains models from bags of instances, where each bag contains multiple instances, and only bag-level labels are available for supervision. The application of graph neural networks (GNNs) in capturing intrabag topology effectively improves MIL. Existing GNNs usually require filtering low-confidence edges among instances and adapting graph neural architectures to new bag structures. However, such asynchronous adjustments to structure and architecture are tedious and ignore their correlations. To tackle these issues, we propose a reinforced GNN framework for MIL (RGMIL), pioneering the exploitation of multiagent deep reinforcement learning (MADRL) in MIL tasks. MADRL enables the flexible definition or extension of factors that influence bag graphs or GNNs and provides synchronous control over them. Moreover, MADRL explores structure-to-architecture correlations while automating adjustments. Experimental results on multiple MIL datasets demonstrate that RGMIL achieves the best performance with excellent explainability. The code and data are available at https://github.com/RingBDStack/RGMIL.

6.
Article in English | MEDLINE | ID: mdl-38408012

ABSTRACT

Community detection has become a prominent task in complex network analysis. However, most of the existing methods for community detection only focus on the lower order structure at the level of individual nodes and edges and ignore the higher order connectivity patterns that characterize the fundamental building blocks within the network. In recent years, researchers have shown interest in motifs and their role in network analysis. However, most of the existing higher order approaches are based on shallow methods, failing to capture the intricate nonlinear relationships between nodes. In order to better fuse higher order and lower order structural information, a novel deep learning framework called motif-based contrastive learning for community detection (MotifCC) is proposed. First, a higher order network is constructed based on motifs. Subnetworks are then obtained by removing isolated nodes, addressing the fragmentation issue in the higher order network. Next, the concept of contrastive learning is applied to effectively fuse various kinds of information from nodes, edges, and higher order and lower order structures. This aims to maximize the similarity of corresponding node information, while distinguishing different nodes and different communities. Finally, based on the community structure of subnetworks, the community labels of all nodes are obtained by using the idea of label propagation. Extensive experiments on real-world datasets validate the effectiveness of MotifCC.

7.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 15275-15291, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37751343

ABSTRACT

Few-shot learning aims to fast adapt a deep model from a few examples. While pre-training and meta-training can create deep models powerful for few-shot generalization, we find that pre-training and meta-training focus respectively on cross-domain transferability and cross-task transferability, which restricts their data efficiency in the entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is a tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability respectively. Omni-Net further coordinates the parallel flows by routing their representations via the joint-flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which introduces a self-distillation strategy separately on the pre-training and meta-training objectives for boosting knowledge transfer throughout different training stages. Omni-Training is a general framework to accommodate many existing algorithms. Evaluations justify that our single framework consistently and clearly outperforms the individual state-of-the-art methods on both cross-task and cross-domain settings in a variety of classification, regression and reinforcement learning problems.

8.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37401373

ABSTRACT

Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug due to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDI prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply advanced AI and deep learning, developers and users face various challenges such as the availability and encoding of data resources, and the design of computational methods. This review summarizes chemical structure based, network based, natural language processing based and hybrid methods, providing an updated and accessible guide to the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDI prediction.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Humans , Drug Interactions , Natural Language Processing , Drug Discovery
9.
Article in English | MEDLINE | ID: mdl-37216231

ABSTRACT

Social network alignment, aiming at linking identical identities across different social platforms, is a fundamental task in social graph mining. Most existing approaches are supervised models and require a large number of manually labeled data, which are infeasible in practice considering the yawning gap between social platforms. Recently, isomorphism across social networks is incorporated as complementary to link identities from the distribution level, which contributes to alleviating the dependency on sample-level annotations. Adversarial learning is adopted to learn a shared projection function by minimizing the distance between two social distributions. However, the hypothesis of isomorphism might not always hold true as social user behaviors are generally unpredictable, and thus a shared projection function is insufficient to handle the sophisticated cross-platform correlations. In addition, adversarial learning suffers from training instability and uncertainty, which may hinder model performance. In this article, we propose a novel meta-learning-based social network alignment model Meta-SNA to effectively capture the isomorphism and the unique characteristics of each identity. Our motivation lies in learning a shared meta-model to preserve the global cross-platform knowledge and an adaptor to learn a specific projection function for each identity. Sinkhorn distance is further introduced as the distribution closeness measurement to tackle the limitations of adversarial learning, which owns an explicitly optimal solution and can be efficiently computed by the matrix scaling algorithm. Empirically, we evaluate the proposed model over multiple datasets, and the experimental results demonstrate the superiority of Meta-SNA.
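
The matrix scaling algorithm mentioned for the Sinkhorn distance can be sketched as follows. This is a textbook Sinkhorn iteration on two small histograms, not the Meta-SNA implementation; the function name `sinkhorn_distance`, the regularization value, and the iteration count are assumptions chosen for illustration.

```python
import math


def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport cost between histograms a and b.

    a, b : lists of non-negative weights, each summing to 1
    cost : cost[i][j] is the cost of moving mass from bin i of a to bin j of b
    reg  : regularization strength (assumed value; very small values can underflow)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Matrix scaling: alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan diag(u) K diag(v) and the cost it incurs.
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    dist = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return dist, plan


# Two identical toy distributions: almost no mass needs to move.
dist, plan = sinkhorn_distance([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
```

Each iteration involves only matrix-vector products, which is why the abstract can describe the distance as efficiently computable.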

10.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8063-8080, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37018637

ABSTRACT

While graph representation learning methods have shown success in various graph mining tasks, what knowledge is exploited for predictions is less discussed. This paper proposes a novel Adaptive Subgraph Neural Network named AdaSNN to find critical structures in graph data, i.e., subgraphs that are dominant to the prediction results. To detect critical subgraphs of arbitrary size and shape in the absence of explicit subgraph-level annotations, AdaSNN designs a Reinforced Subgraph Detection Module to search subgraphs adaptively without heuristic assumptions or predefined rules. To encourage the subgraph to be predictive at the global scale, we design a Bi-Level Mutual Information Enhancement Mechanism including both global-aware and label-aware mutual information maximization to further enhance the subgraph representations in the perspective of information theory. By mining critical subgraphs that reflect the intrinsic property of a graph, AdaSNN can provide sufficient interpretability to the learned results. Comprehensive experimental results on seven typical graph datasets demonstrate that AdaSNN has a significant and consistent performance improvement and provides insightful results.

11.
IEEE/ACM Trans Comput Biol Bioinform ; 20(4): 2577-2586, 2023.
Article in English | MEDLINE | ID: mdl-37018664

ABSTRACT

Biomedical Named Entity Recognition (BioNER) aims at identifying biomedical entities such as genes, proteins, diseases, and chemical compounds in the given textual data. However, due to the issues of ethics, privacy, and high specialization of biomedical data, BioNER suffers from the more severe problem of lacking in quality labeled data than the general domain especially for the token-level. Facing the extremely limited labeled biomedical data, this work studies the problem of gazetteer-based BioNER, which aims at building a BioNER system from scratch. It needs to identify the entities in the given sentences when we have zero token-level annotations for training. Previous works usually use sequential labeling models to solve the NER or BioNER task and obtain weakly labeled data from gazetteers when we don't have full annotations. However, these labeled data are quite noisy since we need the labels for each token and the entity coverage of the gazetteers is limited. Here we propose to formulate the BioNER task as a Textual Entailment problem and solve the task via Textual Entailment with Dynamic Contrastive learning (TEDC). TEDC not only alleviates the noisy labeling issue, but also transfers the knowledge from pre-trained textual entailment models. Additionally, the dynamic contrastive learning framework contrasts the entities and non-entities in the same sentence and improves the model's discrimination ability. Experiments on two real-world biomedical datasets show that TEDC can achieve state-of-the-art performance for gazetteer-based BioNER.


Subject(s)
Deep Learning , Proteins
12.
IEEE Trans Cybern ; 53(5): 3060-3074, 2023 May.
Article in English | MEDLINE | ID: mdl-34767522

ABSTRACT

Community detection in multiview networks has drawn an increasing amount of attention in recent years. Many approaches have been developed from different perspectives. Despite the success, the problem of community detection in adversarial multiview networks remains largely unsolved. An adversarial multiview network is a multiview network that suffers an adversarial attack on community detection in which the attackers may deliberately remove some critical edges so as to hide the underlying community structure, leading to the performance degeneration of the existing approaches. To address this problem, we propose a novel approach, called higher order connection enhanced multiview modularity (HCEMM). The main idea lies in enhancing the intracommunity connection of each view by means of utilizing the higher order connection structure. The first step is to discover the view-specific higher order Microcommunities (VHM-communities) from the higher order connection structure. Then, for each view of the original multiview network, additional edges are added to make the nodes in each of its VHM-communities fully connected like a clique, by which the intracommunity connection of the multiview network can be enhanced. Therefore, the proposed approach is able to discover the underlying community structure in a multiview network while recovering the missing edges. Extensive experiments conducted on 16 real-world datasets confirm the effectiveness of the proposed approach.
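
The edge-enhancement step described above (making the nodes of each VHM-community fully connected within a view) might look like the sketch below. How the VHM-communities themselves are discovered is not given in the abstract, so they are passed in as an input here; `densify_view` and the toy graph are hypothetical.

```python
from itertools import combinations


def densify_view(edges, micro_communities):
    """Return the view's undirected edge set with every micro-community turned into a clique.

    edges             : iterable of (u, v) pairs for one view
    micro_communities : iterable of node sets, assumed to be already discovered
    """
    enhanced = {frozenset(e) for e in edges}
    for community in micro_communities:
        # Add every missing pair inside the community.
        for u, v in combinations(sorted(community), 2):
            enhanced.add(frozenset((u, v)))
    return enhanced


# Toy view in which an attacker has removed the edges 1-3 and 2-3.
enhanced = densify_view([(1, 2), (3, 4)], [{1, 2, 3}])
```

Storing edges as `frozenset` pairs keeps the graph undirected, so re-adding an edge that already exists has no effect.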

13.
IEEE Trans Neural Netw Learn Syst ; 34(9): 5557-5569, 2023 Sep.
Article in English | MEDLINE | ID: mdl-34878980

ABSTRACT

As deep learning models mature, one of the most prescient questions we face is: what is the ideal tradeoff between accuracy, fairness, and privacy (AFP)? Unfortunately, both the privacy and the fairness of a model come at the cost of its accuracy. Hence, an efficient and effective means of fine-tuning the balance between this trinity of needs is critical. Motivated by some curious observations in privacy-accuracy tradeoffs with differentially private stochastic gradient descent (DP-SGD), where fair models sometimes result, we conjecture that fairness might be better managed as an indirect byproduct of this process. Hence, we conduct a series of analyses, both theoretical and empirical, on the impacts of implementing DP-SGD in deep neural network models through gradient clipping and noise addition. The results show that, in deep learning, the number of training epochs is central to striking a balance between AFP because DP-SGD makes the training less stable, providing the possibility of model updates at a low discrimination level without much loss in accuracy. Based on this observation, we designed two different early stopping criteria to help analysts choose the optimal epoch at which to stop training a model so as to achieve their ideal tradeoff. Extensive experiments show that our methods can achieve an ideal balance between AFP.
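
The two DP-SGD ingredients named in this abstract, gradient clipping and noise addition, can be sketched in a single update step. This is the standard DP-SGD recipe in plain Python under assumed hyperparameters, not the authors' code or their early stopping criteria; `dp_sgd_step` and its parameter names are hypothetical.

```python
import math
import random


def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update: clip each per-example gradient, sum, add Gaussian noise, average."""
    dim = len(weights)
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale the gradient down so that its L2 norm is at most clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for k in range(dim):
            summed[k] += g[k] * scale
    batch = len(per_example_grads)
    # Gaussian noise with standard deviation noise_multiplier * clip_norm per coordinate.
    noisy_mean = [
        (summed[k] + rng.gauss(0.0, noise_multiplier * clip_norm)) / batch
        for k in range(dim)
    ]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]


# Toy step with made-up gradients; the first gradient (norm 5) gets clipped to norm 1.
rng = random.Random(0)
new_weights = dp_sgd_step([0.0, 0.0], [[3.0, 4.0], [0.3, 0.4]], lr=0.1,
                          clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Because the injected noise differs at every step, the model's accuracy and fairness fluctuate from epoch to epoch, which is the instability the abstract's stopping criteria are designed to exploit.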

14.
IEEE Trans Neural Netw Learn Syst ; 34(2): 973-986, 2023 Feb.
Article in English | MEDLINE | ID: mdl-34432638

ABSTRACT

Most existing multiview clustering methods are based on the original feature space. However, the feature redundancy and noise in the original feature space limit their clustering performance. Aiming at addressing this problem, some multiview clustering methods learn the latent data representation linearly, while performance may decline if the relation between the latent data representation and the original data is nonlinear. The other methods which nonlinearly learn the latent data representation usually conduct the latent representation learning and clustering separately, resulting in that the latent data representation might be not well adapted to clustering. Furthermore, none of them model the intercluster relation and intracluster correlation of data points, which limits the quality of the learned latent data representation and therefore influences the clustering performance. To solve these problems, this article proposes a novel multiview clustering method via proximity learning in latent representation space, named multiview latent proximity learning (MLPL). For one thing, MLPL learns the latent data representation in a nonlinear manner which takes the intercluster relation and intracluster correlation into consideration simultaneously. For another, through conducting the latent representation learning and consensus proximity learning simultaneously, MLPL learns a consensus proximity matrix with k connected components to output the clustering result directly. Extensive experiments are conducted on seven real-world datasets to demonstrate the effectiveness and superiority of the MLPL method compared with the state-of-the-art multiview clustering methods.

15.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7934-7945, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35157599

ABSTRACT

In multiagent learning, one of the main ways to improve learning performance is to ask for advice from another agent. Contemporary advising methods share a common limitation that a teacher agent can only advise a student agent if the teacher has experience with an identical state. However, in highly complex learning scenarios, such as autonomous driving, it is rare for two agents to experience exactly the same state, which makes the advice less of a learning aid and more of a one-time instruction. In these scenarios, with contemporary methods, agents do not really help each other learn, and the main outcome of their back-and-forth requests for advice is an exorbitant communication overhead. In human interactions, teachers are often asked for advice on what to do in situations that students are personally unfamiliar with. In these cases, we generally draw from similar experiences to formulate advice. This inspired us to provide agents with the same ability when asked for advice on an unfamiliar state. Hence, we propose a model-based self-advising method that allows agents to train a model based on states similar to the state in question to inform its response. As a result, the advice given can be used not only to resolve the current dilemma but also to handle many other similar situations that the student may come across in the future via self-advising. Compared with contemporary methods, our method brings a significant improvement in learning performance with much lower communication overhead.

16.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 980-998, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35077355

ABSTRACT

Detecting hot social events (e.g., political scandal, momentous meetings, natural hazards, etc.) from social messages is crucial as it highlights significant happenings to help people understand the real world. On account of the streaming nature of social messages, incremental social event detection models in acquiring, preserving, and updating messages over time have attracted great attention. However, the challenge is that the existing event detection methods towards streaming social messages are generally confronted with ambiguous event features, dispersive text contents, and multiple languages, and hence result in low accuracy and generalization ability. In this paper, we present a novel reinForced, incremental and cross-lingual social Event detection architecture, namely FinEvent, from streaming social messages. Concretely, we first model social messages into heterogeneous graphs integrating both rich meta-semantics and diverse meta-relations, and convert them to weighted multi-relational message graphs. Second, we propose a new reinforced weighted multi-relational graph neural network framework by using a Multi-agent Reinforcement Learning algorithm to select optimal aggregation thresholds across different relations/edges to learn social message embeddings. To solve the long-tail problem in social event detection, a balanced sampling strategy guided Contrastive Learning mechanism is designed for incremental social message representation learning. Third, a new Deep Reinforcement Learning guided density-based spatial clustering model is designed to select the optimal minimum number of samples required to form a cluster and optimal minimum distance between two clusters in social event detection tasks. Finally, we implement incremental social message representation learning based on knowledge preservation on the graph neural network and achieve the transferring cross-lingual social event detection. We conduct extensive experiments to evaluate the FinEvent on Twitter streams, demonstrating a significant and consistent improvement in model quality with 14%-118%, 8%-170%, and 2%-21% increases in performance on offline, online, and cross-lingual social event detection tasks.

17.
IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 1746-1760, 2023.
Article in English | MEDLINE | ID: mdl-36251903

ABSTRACT

The "curse of dimensionality" brings new challenges to the feature selection (FS) problem, especially in the bioinformatics field. In this paper, we propose a hybrid Two-Stage Teaching-Learning-Based Optimization (TS-TLBO) algorithm to improve the performance of bioinformatics data classification. In the selection reduction stage, potentially informative features, as well as noisy features, are selected to effectively reduce the search space. In the following comparative self-learning stage, the teacher and the worst student with self-learning evolve together based on the duality of the FS problems to enhance the exploitation capabilities. In addition, an opposition-based learning strategy is utilized to generate initial solutions to rapidly improve the quality of the solutions. We further develop a self-adaptive mutation mechanism to improve the search performance by dynamically adjusting the mutation rate according to the teacher's convergence ability. Moreover, we integrate a differential evolutionary method with TLBO to boost the exploration ability of our algorithm. We conduct comparative experiments on 31 public datasets with different data dimensions, including 7 bioinformatics datasets, and evaluate our TS-TLBO algorithm against 11 related methods. The experimental results show that the TS-TLBO algorithm obtains a good feature subset with better classification performance, indicating that it generalizes across FS problems.
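
Opposition-based learning for generating initial solutions, as mentioned in this abstract, is commonly done by pairing each random candidate with its mirror image inside the search bounds and keeping the fitter candidates. The sketch below shows that common form under assumptions: a continuous encoding, a minimized fitness, and a hypothetical function name `opposition_based_init`. The abstract does not describe the encoding or fitness that TS-TLBO actually uses.

```python
import random


def opposition_based_init(pop_size, lower, upper, fitness, rng):
    """Draw pop_size random solutions, add their opposites, keep the pop_size fittest.

    lower, upper : per-dimension bounds of the search space
    fitness      : function to minimize (assumed; stands in for the real FS objective)
    """
    candidates = []
    for _ in range(pop_size):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        # The opposite point mirrors x around the centre of the bounds.
        opposite = [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]
        candidates.extend([x, opposite])
    candidates.sort(key=fitness)
    return candidates[:pop_size]


# Toy run: minimize the sum of coordinates in the unit square.
population = opposition_based_init(5, [0.0, 0.0], [1.0, 1.0], fitness=sum, rng=random.Random(0))
```

For a real feature selection run, the continuous vector would still need to be mapped to a feature subset (for example by thresholding), which is left out here.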


Subject(s)
Algorithms , Computational Biology , Machine Learning
18.
IEEE Trans Pattern Anal Mach Intell ; 45(2): 2208-2225, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35380958

ABSTRACT

The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical context, where the visual dynamics are believed to have modular structures that can be learned with compositional subsystems. This paper models these structures by presenting PredRNN, a new recurrent network, in which a pair of memory cells are explicitly decoupled, operate in nearly independent transition manners, and finally form unified representations of the complex environment. Concretely, besides the original memory cell of LSTM, this network is featured by a zigzag memory flow that propagates in both bottom-up and top-down directions across all layers, enabling the learned visual dynamics at different levels of RNNs to communicate. It also leverages a memory decoupling loss to keep the memory cells from learning redundant features. We further propose a new curriculum learning strategy to force PredRNN to learn long-term dynamics from context frames, which can be generalized to most sequence-to-sequence models. We provide detailed ablation studies to verify the effectiveness of each component. Our approach is shown to obtain highly competitive results on five datasets for both action-free and action-conditioned predictive learning scenarios.

19.
IEEE Trans Neural Netw Learn Syst ; 34(12): 9671-9684, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35324448

ABSTRACT

Session-based recommendation tries to make use of anonymous session data to deliver high-quality recommendations under the condition that user profiles and the complete historical behavioral data of a target user are unavailable. Previous works consider each session individually and try to capture user interests within a session. Despite their encouraging results, these models can only perceive intra-session items and cannot draw upon the massive historical relational information. To solve this problem, we propose a novel method named global graph guided session-based recommendation (G3SR). G3SR decomposes the session-based recommendation workflow into two steps. First, a global graph is built upon all session data, from which the global item representations are learned in an unsupervised manner. Then, these representations are refined on session graphs under the graph networks, and a readout function is used to generate session representations for each session. Extensive experiments on two real-world benchmark datasets show remarkable and consistent improvements of the G3SR method over the state-of-the-art methods, especially for cold items.

20.
IEEE Trans Neural Netw Learn Syst ; 34(6): 2767-2780, 2023 Jun.
Article in English | MEDLINE | ID: mdl-34550893

ABSTRACT

Trust prediction provides valuable support for decision making, information dissemination, and product promotion in online social networks. As a complex concept in the social network community, trust relationships among people can be established virtually based on: 1) their interaction behaviors, e.g., the ratings and comments that they provided; 2) the contextual information associated with their interactions, e.g., location and culture; and 3) the relative temporal features of interactions and the time periods when the trust relationships hold. Most of the existing works only focus on some aspects of trust, and there is not a comprehensive study of user trust development that considers and incorporates 1)-3) in trust prediction. In this article, we propose a context-aware deep trust prediction model C-DeepTrust to fill this gap. First, we conduct user feature modeling to obtain the user's static and dynamic preference features in each context. Static user preference features are obtained from all the ratings and reviews that a user provided, while dynamic user preference features are obtained from the items rated/reviewed by the user in time series. The obtained context-aware user features are then combined and fed into the multilayer projection structure to further mine the context-aware latent features. Finally, the context-aware trust relationships between users are calculated by their context-aware feature vector cosine similarities according to the social homophily theory, which shows a pervasive property of social networks that trust relationships are more likely to be developed among similar people. Extensive experiments conducted on two real-world datasets show the superior performance of our approach compared with the representative baseline methods.
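
The final step described above, scoring trust between two users by the cosine similarity of their context-aware feature vectors, reduces to a short computation. The sketch below assumes the feature vectors are already available as plain lists; the vectors shown are made-up values, and `cosine_similarity` is a hypothetical helper rather than part of C-DeepTrust.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up latent features for two users in the same context.
trust_score = cosine_similarity([0.2, 0.7, 0.1], [0.25, 0.6, 0.05])
```

Users with similar preference features get a score near 1, which matches the homophily assumption stated in the abstract.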
