Results 1 - 10 of 10
1.
Neural Netw ; 180: 106635, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39173205

ABSTRACT

Graph neural networks (GNNs) have become a popular approach for semi-supervised graph representation learning. GNN research has generally focused on improving methodological details, whereas less attention has been paid to the importance of labeling the data. However, for semi-supervised learning, the quality of the training data is vital. In this paper, we first introduce and elaborate on the problem of training data selection for GNNs. More specifically, focusing on node classification, we aim to select representative nodes from a graph to train GNNs for the best performance. To solve this problem, we draw inspiration from the popular lottery ticket hypothesis, typically applied to sparse architectures, and propose the following subset hypothesis for graph data: "When selecting a fixed-size training set from the dense training dataset, there exists a core subset that represents the properties of the dataset, and GNNs trained on this core subset achieve a better graph representation." Equipped with this subset hypothesis, we present an efficient algorithm to identify the core data in the graph for GNNs. Extensive experiments demonstrate that the selected data (as a training set) obtain performance improvements across various datasets and GNN architectures.
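The selection criterion itself is not spelled out in the abstract; as a minimal sketch of picking a fixed-size core training subset, the following uses degree centrality as a hypothetical representativeness proxy (the paper's actual algorithm will differ):

```python
import heapq

def select_core_subset(edges, num_nodes, k):
    """Pick a fixed-size training subset using degree centrality as a
    stand-in representativeness score (hypothetical; the paper's actual
    selection algorithm is not given in the abstract)."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Keep the k most central nodes as the "core" training set.
    return heapq.nlargest(k, range(num_nodes), key=lambda n: degree[n])

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
core = select_core_subset(edges, num_nodes=5, k=2)
```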

2.
Neural Netw ; 174: 106265, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38552351

ABSTRACT

Graph Transformers (GTs) have achieved impressive results on various graph-related tasks. However, the huge computational cost of GTs hinders their deployment and application, especially in resource-constrained environments. Therefore, in this paper, we explore the feasibility of sparsifying GTs, a significant yet under-explored topic. We first discuss the redundancy of GTs based on the characteristics of existing GT models, and then propose a comprehensive Graph Transformer SParsification (GTSP) framework that helps to reduce the computational complexity of GTs from four dimensions: the input graph data, attention heads, model layers, and model weights. Specifically, GTSP designs differentiable masks for each individual compressible component, enabling effective end-to-end pruning. We examine our GTSP through extensive experiments on prominent GTs, including GraphTrans, Graphormer, and GraphGPS. The experimental results demonstrate that GTSP effectively reduces computational costs, with only marginal decreases in accuracy or, in some instances, even improvements. For example, GTSP results in a 30% reduction in Floating Point Operations while contributing to a 1.8% increase in Area Under the Curve accuracy on the OGBG-HIV dataset. Furthermore, we provide several insights on the characteristics of attention heads and the behavior of attention mechanisms, all of which have immense potential to inspire future research endeavors in this domain. Our code is available at https://github.com/LiuChuang0059/GTSP.
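A rough sketch of the mask-based pruning idea described above, assuming a sigmoid gate per compressible component and a fixed threshold; GTSP's actual masks are differentiable and trained end-to-end with the task loss:

```python
import math

def apply_masks(weights, mask_logits, threshold=0.5):
    """Gate each compressible component with a sigmoid mask; components
    whose gate falls below the threshold are zeroed out. Sketch only:
    in GTSP the mask logits are learned, not hand-set."""
    gates = [1.0 / (1.0 + math.exp(-z)) for z in mask_logits]
    pruned = [w * g if g >= threshold else 0.0
              for w, g in zip(weights, gates)]
    # Fraction of components removed by the masks.
    sparsity = sum(1 for p in pruned if p == 0.0) / len(pruned)
    return pruned, sparsity

w = [0.8, -1.2, 0.3, 0.5]
logits = [2.0, -3.0, 0.1, -1.0]  # hypothetical learned mask logits
pruned, sparsity = apply_masks(w, logits)
```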

3.
Neural Netw ; 170: 548-563, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38052151

ABSTRACT

Siamese tracking has witnessed tremendous progress in the visual tracking paradigm. However, its default box estimation pipeline still faces a crucial inconsistency issue: the bounding box chosen by the classification score does not always overlap best with the ground truth, which harms performance. To this end, we explore a novel, simple tracking paradigm based on intersection over union (IoU) prediction. To bypass this inconsistency issue, we propose a concise target state predictor termed IoUformer, which, instead of the default box estimation pipeline, directly predicts the IoU values tied to the tracking performance metrics. In detail, it extends the long-range dependency modeling ability of the transformer to jointly capture target-aware interactions between the target template and the search region, as well as search sub-region interactions, thus neatly unifying global semantic interaction and target state prediction. Thanks to this joint strength, IoUformer predicts reliable IoU values that are near-linear with the ground truth, which paves a safe way for our new IoU-based siamese tracking paradigm. Since it is non-trivial to realize this paradigm with satisfactory efficacy and portability, we offer the respective network components and two alternative localization schemes. Experimental results show that our IoUformer-based tracker achieves promising results with less training data. Owing to its portability, it can also serve as a refinement module that consistently boosts existing advanced trackers.


Subject(s)
Benchmarking, Semantics
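The IoU value that IoUformer regresses is the standard box-overlap measure, which can be computed directly for axis-aligned boxes:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # 1 / 7
```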
4.
Neural Netw ; 168: 539-548, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37837743

ABSTRACT

As a graph data mining task, graph classification has high academic value and wide practical application. Among the existing approaches, graph neural network-based methods are mainstream. Most graph neural networks (GNNs) follow the message passing paradigm and can be called Message Passing Neural Networks (MPNNs), achieving good results in structural data-related tasks. However, it has also been reported that these methods suffer from over-squashing and limited expressive power. In recent years, many works have proposed different solutions to these problems separately, but none has yet considered these shortcomings in a comprehensive way. After considering these aspects comprehensively, we identify two specific defects: information loss caused by local information aggregation, and an inability to capture higher-order structures. To solve these issues, we propose a plug-and-play framework based on Commute Time Distance (CTD), in which information is propagated in commute time distance neighborhoods. By considering both local and global graph connections, the commute time distance between two nodes is evaluated with reference to the path length and the number of paths in the whole graph. Moreover, the proposed framework CTD-MPNNs (Commute Time Distance-based Message Passing Neural Networks) can capture higher-order structural information by utilizing commute paths to enhance the expressive power of GNNs. Thus, our proposed framework can propagate and aggregate messages from defined important neighbors and model more powerful GNNs. We conduct extensive experiments using various real-world graph classification benchmarks. The experimental performance demonstrates the effectiveness of our framework. Codes are released on https://github.com/Haldate-Yu/CTD-MPNNs.


Subject(s)
Benchmarking, Data Mining, Neural Networks (Computer)
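The commute time distance underlying CTD-MPNNs can be computed from the pseudoinverse of the graph Laplacian via the standard identity CTD(u, v) = vol(G) * (L+_uu + L+_vv - 2 L+_uv):

```python
import numpy as np

def commute_time_distance(adj):
    """Pairwise commute time distances of an undirected graph, from the
    Moore-Penrose pseudoinverse of the graph Laplacian."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj          # graph Laplacian L = D - A
    lap_pinv = np.linalg.pinv(lap)    # L+ (Laplacian is singular)
    vol = deg.sum()                   # vol(G) = 2|E|
    d = np.diag(lap_pinv)
    return vol * (d[:, None] + d[None, :] - 2 * lap_pinv)

# Path graph 0-1-2: CTD equals vol(G) times the effective resistance.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
ctd = commute_time_distance(adj)
```

For the path graph, vol(G) = 4 and the effective resistance between nodes 0 and 1 is 1, so their commute time distance is 4 (and 8 between the endpoints).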
5.
Neural Netw ; 167: 559-571, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37696073

ABSTRACT

Graph Neural Networks (GNNs) have been successfully applied to graph-level tasks in various fields such as biology, social networks, computer vision, and natural language processing. For graph-level representation learning with GNNs, graph pooling plays an essential role. Among many pooling techniques, node drop pooling has garnered significant attention and is considered a leading approach. However, existing node drop pooling methods, which typically retain the top-k nodes based on their significance scores, often overlook the diversity inherent in node features and graph structures. This limitation leads to suboptimal graph-level representations. To overcome this, we introduce a groundbreaking plug-and-play score scheme, termed MID. MID comprises a Multidimensional score space and two key operations: flIpscore and Dropscore. The multidimensional score space depicts the significance of nodes by multiple criteria; the flipscore process promotes the preservation of distinct node features; the dropscore compels the model to take into account a range of graph structures rather than focusing on local structures. To evaluate the effectiveness of our proposed MID, we have conducted extensive experiments by integrating it with a broad range of recent node drop pooling methods, such as TopKPool, SAGPool, GSAPool, and ASAP. In particular, MID has proven to bring a significant average improvement of approximately 2.8% over the four aforementioned methods when tested on 17 real-world graph classification datasets. Code is available at https://github.com/whuchuang/mid.


Asunto(s)
Aprendizaje , Procesamiento de Lenguaje Natural , Redes Neurales de la Computación , Red Social
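A hypothetical sketch of the staged scoring idea in a top-k node-drop step: aggregate a multidimensional node score, randomly drop some candidates, then keep the top-k nodes. MID's real flipscore and dropscore operations differ; this only mirrors the stages:

```python
import random

def mid_topk(scores, k, drop_ratio=0.3, seed=0):
    """Sketch of a MID-style pooling step (hypothetical): aggregate a
    multidimensional node score, discard a random fraction of
    candidates, then keep the k highest-scoring survivors."""
    rng = random.Random(seed)
    # Aggregate each node's multidimensional score by absolute magnitude.
    agg = [sum(abs(s) for s in dims) for dims in scores]
    candidates = list(range(len(scores)))
    rng.shuffle(candidates)
    # Dropscore-like step: discard a fraction of candidates at random.
    pool = candidates[: max(k, int(len(candidates) * (1 - drop_ratio)))]
    pool.sort(key=lambda n: agg[n], reverse=True)
    return sorted(pool[:k])

scores = [[1.0, -2.0], [0.5, 0.1], [3.0, 0.0], [-1.0, -1.0]]
kept = mid_topk(scores, k=2, drop_ratio=0.0)  # drop disabled for determinism
```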
6.
Article in English | MEDLINE | ID: mdl-37440376

ABSTRACT

Contrastive learning (CL) is a prominent technique for self-supervised representation learning, which aims to contrast semantically similar (i.e., positive) and dissimilar (i.e., negative) pairs of examples under different augmented views. Recently, CL has provided unprecedented potential for learning expressive graph representations without external supervision. In graph CL, the negative nodes are typically uniformly sampled from augmented views to formulate the contrastive objective. However, this uniform negative sampling strategy limits the expressive power of contrastive models. To be specific, not all the negative nodes can provide sufficiently meaningful knowledge for effective contrastive representation learning. In addition, the negative nodes that are semantically similar to the anchor are undesirably repelled from it, leading to degraded model performance. To address these limitations, in this article, we devise an adaptive sampling strategy termed "AdaS." The proposed AdaS framework can be trained to adaptively encode the importance of different negative nodes, so as to encourage learning from the most informative graph nodes. Meanwhile, an auxiliary polarization regularizer is proposed to suppress the adverse impacts of the false negatives and enhance the discrimination ability of AdaS. The experimental results on a variety of real-world datasets firmly verify the effectiveness of our AdaS in improving the performance of graph CL.
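The shift from uniform to importance-weighted negative sampling can be sketched as follows, assuming the per-node importance weights come from a trained scorer such as AdaS:

```python
import random

def adaptive_negative_sampling(importance, num_samples, seed=0):
    """Sample negative nodes in proportion to learned importance
    weights instead of uniformly. `importance` is assumed to be the
    output of a trained scorer (hypothetical values here)."""
    rng = random.Random(seed)
    nodes = list(range(len(importance)))
    return rng.choices(nodes, weights=importance, k=num_samples)

# With all the mass on node 2, every draw must return node 2.
negatives = adaptive_negative_sampling([0.0, 0.0, 1.0], num_samples=4)
```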

7.
Article in English | MEDLINE | ID: mdl-37368807

ABSTRACT

Graph neural networks (GNNs) tend to suffer from high computation costs due to the exponentially increasing scale of graph data and a large number of model parameters, which restricts their utility in practical applications. To this end, some recent works focus on sparsifying GNNs (including graph structures and model parameters) with the lottery ticket hypothesis (LTH) to reduce inference costs while maintaining performance levels. However, the LTH-based methods suffer from two major drawbacks: 1) they require exhaustive and iterative training of dense models, resulting in an extremely large training computation cost, and 2) they only trim graph structures and model parameters but ignore the node feature dimension, where vast redundancy exists. To overcome the above limitations, we propose a comprehensive graph gradual pruning framework termed CGP. This is achieved by designing a during-training graph pruning paradigm to dynamically prune GNNs within one training process. Unlike LTH-based methods, the proposed CGP approach requires no retraining, which significantly reduces the computation costs. Furthermore, we design a cosparsifying strategy to comprehensively trim all the three core elements of GNNs: graph structures, node features, and model parameters. Next, to refine the pruning operation, we introduce a regrowth process into our CGP framework, to reestablish the pruned but important connections. The proposed CGP is evaluated over a node classification task across six GNN architectures, including shallow models graph convolutional network (GCN) and graph attention network (GAT), shallow-but-deep-propagation models simple graph convolution (SGC) and approximate personalized propagation of neural predictions (APPNP), and deep models GCN via initial residual and identity mapping (GCNII) and residual GCN (ResGCN), on a total of 14 real-world graph datasets, including large-scale graph datasets from the challenging Open Graph Benchmark (OGB). 
Experiments reveal that the proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of the existing methods.
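A minimal sketch of one prune-then-regrow step in the spirit of CGP, using magnitude pruning and gradient-magnitude regrowth (common criteria; the abstract does not specify CGP's exact rules):

```python
def prune_and_regrow(weights, grads, prune_frac=0.5, regrow_frac=0.25):
    """One gradual-pruning step: zero the smallest-magnitude weights,
    then regrow the pruned positions with the largest gradient
    magnitude. Criteria are illustrative stand-ins for CGP's."""
    n = len(weights)
    order = sorted(range(n), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[: int(n * prune_frac)])
    # Reestablish pruned positions whose gradients are largest.
    regrow = sorted(pruned_idx, key=lambda i: abs(grads[i]), reverse=True)
    regrown_idx = set(regrow[: int(n * regrow_frac)])
    mask = [0.0 if (i in pruned_idx and i not in regrown_idx) else 1.0
            for i in range(n)]
    return [w * m for w, m in zip(weights, mask)], mask

new_w, mask = prune_and_regrow([0.9, -0.05, 0.4, 0.01],
                               [0.1, 2.0, 0.3, 0.0])
```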

8.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 11270-11282, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37027256

ABSTRACT

Point cloud registration is a fundamental problem in 3D computer vision. Previous learning-based methods for LiDAR point cloud registration can be categorized into two schemes: dense-to-dense matching methods and sparse-to-sparse matching methods. However, for large-scale outdoor LiDAR point clouds, solving dense point correspondences is time-consuming, whereas sparse keypoint matching easily suffers from keypoint detection error. In this paper, we propose SDMNet, a novel Sparse-to-Dense Matching Network for large-scale outdoor LiDAR point cloud registration. Specifically, SDMNet performs registration in two sequential stages: sparse matching stage and local-dense matching stage. In the sparse matching stage, we sample a set of sparse points from the source point cloud and then match them to the dense target point cloud using a spatial consistency enhanced soft matching network and a robust outlier rejection module. Furthermore, a novel neighborhood matching module is developed to incorporate local neighborhood consensus, significantly improving performance. The local-dense matching stage is followed for fine-grained performance, where dense correspondences are efficiently obtained by performing point matching in local spatial neighborhoods of high-confidence sparse correspondences. Extensive experiments on three large-scale outdoor LiDAR point cloud datasets demonstrate that the proposed SDMNet achieves state-of-the-art performance with high efficiency.
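The sparse matching stage can be caricatured as nearest-neighbor assignment from the sampled source points into the dense target cloud; SDMNet replaces this brute-force step with a learned soft matching network and outlier rejection:

```python
def sparse_to_dense_match(sparse_pts, dense_pts):
    """Match each sampled source point to its nearest neighbor in the
    dense target cloud (brute-force baseline, not SDMNet's learned
    soft matching)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [min(range(len(dense_pts)), key=lambda j: sq_dist(p, dense_pts[j]))
            for p in sparse_pts]

sparse = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
dense = [(0.1, 0.0, 0.0), (4.9, 5.0, 5.0), (10.0, 10.0, 10.0)]
matches = sparse_to_dense_match(sparse, dense)
```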

9.
IEEE Trans Image Process ; 31: 6635-6648, 2022.
Article in English | MEDLINE | ID: mdl-36256710

ABSTRACT

Image dehazing aims to remove haze from images to improve their quality. However, most image dehazing methods depend heavily on strict prior knowledge and a paired training strategy, which hinders generalization and performance on unseen scenes. In this paper, to address this problem, we propose Bidirectional Normalizing Flow (BiN-Flow), which exploits no prior knowledge and constructs a neural network through weakly-paired training with better generalization for image dehazing. Specifically, BiN-Flow designs 1) Feature Frequency Decoupling (FFD) for mining various texture details through multi-scale residual blocks and 2) Bidirectional Propagation Flow (BPF) for exploiting the one-to-many relationships between hazy and haze-free images using a sequence of invertible flows. In addition, BiN-Flow constructs a reference mechanism (RM) that uses a small number of paired hazy and haze-free images and a large number of haze-free reference images for weakly-paired training. Essentially, the mutual relationships between hazy and haze-free images can be effectively learned to further improve generalization and performance for image dehazing. We conduct extensive experiments on five commonly-used datasets to validate BiN-Flow. The experimental results, in which BiN-Flow outperforms all state-of-the-art competitors, demonstrate its capability and generalization. Besides, BiN-Flow can produce diverse dehazed images for the same input by considering restoration diversity.
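The invertible building block behind a propagation flow like BPF is the affine coupling layer, whose inverse recovers the input exactly. In a real flow the scale and shift would be predicted by a network conditioned on the untouched half; here they are fixed constants for illustration:

```python
import math

def coupling_forward(x1, x2, scale, shift):
    """Forward pass of one affine coupling layer: x1 passes through
    unchanged, x2 is scaled and shifted."""
    y2 = x2 * math.exp(scale) + shift
    return x1, y2

def coupling_inverse(y1, y2, scale, shift):
    """Exact inverse of the coupling layer."""
    x2 = (y2 - shift) * math.exp(-scale)
    return y1, x2

x1, x2 = 0.7, -1.3
y1, y2 = coupling_forward(x1, x2, scale=0.5, shift=2.0)
rx1, rx2 = coupling_inverse(y1, y2, scale=0.5, shift=2.0)
```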

10.
Article in English | MEDLINE | ID: mdl-36136920

ABSTRACT

Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances. Many few-shot visual recognition methods adopt the metric-based meta-learning paradigm, comparing the query representation with class representations to predict the category of a query instance. However, current metric-based methods generally treat all instances equally and consequently often obtain a biased class representation, as not all instances are equally significant when summarizing the instance-level representations into a class-level representation. For example, some instances may contain unrepresentative information, such as excessive background or unrelated concepts, which skews the results. To address these issues, we propose a novel metric-based meta-learning framework termed the instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition. Specifically, we develop an adaptive instance revaluing network (AIRN) that addresses the biased representation issue when generating the class representation, by learning and assigning adaptive weights to different instances according to their relative significance in the support set of the corresponding class. In addition, we design an improved bilinear instance representation and incorporate two novel structural losses, i.e., an intraclass instance clustering loss and an interclass representation distinguishing loss, to further regulate the instance revaluation process and refine the class representation. We conduct extensive experiments on four commonly adopted few-shot benchmarks: the miniImageNet, tieredImageNet, CIFAR-FS, and FC100 datasets. The experimental results, compared with state-of-the-art approaches, demonstrate the superiority of our ICRL-Net.
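The biased-vs-adaptive class representation issue can be illustrated with a weighted prototype: instead of a uniform mean over support instances, each instance contributes according to a learned weight (the weights below are hypothetical stand-ins for AIRN's output):

```python
def weighted_class_prototype(instances, weights):
    """Class representation as a weighted mean of instance embeddings,
    instead of the uniform mean most metric-based methods use. The
    weights are assumed to come from a scorer such as AIRN."""
    total = sum(weights)
    dim = len(instances[0])
    return [sum(w * inst[d] for w, inst in zip(weights, instances)) / total
            for d in range(dim)]

# The second instance is deemed three times as representative.
proto = weighted_class_prototype([[1.0, 0.0], [3.0, 4.0]], [1.0, 3.0])
```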
