Results 1 - 20 of 583
1.
Adv Sci (Weinh) ; : e2403393, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39225619

ABSTRACT

Microbes are extensively present among various cancer tissues and play critical roles in carcinogenesis and treatment responses. However, the underlying relationships between intratumoral microbes and tumors remain poorly understood. Here, a MIcrobial Cancer-association Analysis using a Heterogeneous graph transformer (MICAH) to identify intratumoral cancer-associated microbial communities is presented. MICAH integrates metabolic and phylogenetic relationships among microbes into a heterogeneous graph representation. It uses a graph transformer to holistically capture relationships between intratumoral microbes and cancer tissues, which improves the explainability of the associations between identified microbial communities and cancers. MICAH is applied to intratumoral bacterial data across 5 cancer types and 5 fungi datasets, and its generalizability and reproducibility are demonstrated. After experimentally testing a representative observation using a mouse model of tumor-microbe-immune interactions, a result consistent with MICAH's identified relationship is observed. Source tracking analysis reveals that the primary known contributor to a cancer-associated microbial community is the organs affected by the type of cancer. Overall, this graph neural network framework refines the number of microbes that can be used for follow-up experimental validation from thousands to tens, thereby helping to accelerate the understanding of the relationship between tumors and intratumoral microbiomes.

2.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39226887

ABSTRACT

Plasma protein biomarkers have been considered promising tools for diagnosing dementia subtypes due to their low variability, cost-effectiveness, and minimal invasiveness in diagnostic procedures. Machine learning (ML) methods have been applied to enhance the accuracy of biomarker discovery. However, previous ML-based studies often overlook interactions between proteins, which are crucial in complex disorders like dementia. While protein-protein interactions (PPIs) have been used in network models, these models often fail to fully capture the diverse properties of PPIs due to their local awareness. This drawback increases the chance of neglecting critical components and magnifying the impact of noisy interactions. In this study, we propose a novel graph-based ML model for dementia subtype diagnosis, the graph propagational network (GPN). By propagating the independent effect of plasma proteins over the PPI network, the GPN extracts the globally interactive effects between proteins. Experimental results showed that the interactive effect between proteins further clarified the differences between dementia subtype groups and contributed to the performance improvement, with the GPN outperforming existing methods by 10.4% on average.


Subject(s)
Biomarkers , Blood Proteins , Dementia , Machine Learning , Protein Interaction Maps , Humans , Dementia/metabolism , Dementia/diagnosis , Blood Proteins/metabolism , Protein Interaction Mapping/methods , Algorithms , Computational Biology/methods
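
The abstract above describes propagating per-protein effects over a PPI network to obtain globally interactive effects. As a rough illustration of that general idea only, the sketch below uses random walk with restart, a standard network-propagation scheme that is swapped in here as an assumption; it is not the GPN itself. The function name `propagate`, the toy network `ppi`, and the parameters `alpha` and `steps` are all hypothetical.

```python
def propagate(adjacency, scores, alpha=0.5, steps=50):
    """Diffuse per-node scores over an undirected graph (random walk with restart).

    adjacency: dict mapping node -> list of neighbour nodes (no isolated nodes).
    scores:    dict mapping node -> initial, independent score.
    alpha:     restart weight; 1.0 would keep the initial scores unchanged.
    """
    current = dict(scores)
    for _ in range(steps):
        updated = {}
        for node in adjacency:
            # Each neighbour passes on its score divided by its own degree.
            incoming = sum(current[nb] / len(adjacency[nb]) for nb in adjacency[node])
            updated[node] = alpha * scores[node] + (1 - alpha) * incoming
        current = updated
    return current


# Hypothetical three-protein chain A - B - C; only A carries an initial signal.
ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
initial = {"A": 1.0, "B": 0.0, "C": 0.0}
smoothed = propagate(ppi, initial)
```

After propagation, protein C receives a nonzero score even though it started at zero and is not directly connected to A, which is the sense in which propagation reaches beyond local neighbourhoods.
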
3.
BMC Bioinformatics ; 25(1): 287, 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39223474

ABSTRACT

BACKGROUND: Recently, advances in the processing of evolutionary information and in deep learning networks have improved protein contact prediction methods. Nevertheless, some bottlenecks remain: (1) One bottleneck is the prediction of orphan proteins and other proteins with little evolutionary information. (2) The other bottleneck is that methods for single-sequence-based prediction mainly focus on selecting protein sequence features and tuning the neural network architecture; while deeper neural networks improve prediction accuracy, they also increase the computational burden. Compared with other neural networks in the field of protein prediction, the graph neural network has the following advantages: it can reveal the topological structure of a protein, and its hierarchical structure and local connectivity help capture features at different levels of abstraction in protein molecules. When protein sequence and structure information are used for joint training, the dependencies between the two kinds of information can be better captured. It can also process protein molecular structures of different lengths and shapes, whereas traditional neural networks need to convert proteins into fixed-size vectors or matrices for processing. RESULTS: Here, we propose a single-sequence-based protein contact map predictor, PCP-GC-LM, with dual-level graph neural networks and convolution networks. Our method performs better than other single-sequence-based predictors in different independent tests. In addition, to verify the validity of our method on complex protein structures, we also compare it with other methods on two homodimer protein test sets (the DeepHomo test dataset and the CASP-CAPRI target dataset). Furthermore, we perform ablation experiments to demonstrate the necessity of a dual graph network.
In all, our framework presents new modules to accurately predict inter-chain contact maps in proteins, and it is also useful for analyzing interactions in other types of protein complexes.


Subject(s)
Neural Networks, Computer , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Sequence Analysis, Protein/methods , Databases, Protein , Deep Learning , Protein Conformation , Algorithms
4.
Neural Netw ; 179: 106562, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39142173

ABSTRACT

Multi-view learning is an emerging field of multi-modal fusion, which involves representing a single instance using multiple heterogeneous features to improve compatibility prediction. However, existing graph-based multi-view learning approaches are implemented on homogeneous assumptions and pairwise relationships, which may not adequately capture the complex interactions among real-world instances. In this paper, we design a compressed hypergraph neural network from the perspective of multi-view heterogeneous graph learning. This approach effectively captures rich multi-view heterogeneous semantic information, incorporating a hypergraph structure that simultaneously enables the exploration of higher-order correlations between samples in multi-view scenarios. Specifically, we introduce efficient hypergraph convolutional networks based on an explainable regularizer-centered optimization framework. Additionally, a low-rank approximation is adopted as hypergraphs to reformat the initial complex multi-view heterogeneous graph. Extensive experiments compared with several advanced node classification methods and multi-view classification methods have demonstrated the feasibility and effectiveness of the proposed method.

5.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39162311

ABSTRACT

The prediction of metabolite-protein interactions (MPIs) plays an important role in understanding the basic life functions of plants. Compared with the traditional experimental methods and the high-throughput genomics methods using statistical correlation, applying heterogeneous graph neural networks to the prediction of MPIs in plants can reduce the cost of manpower, resources, and time. However, to the best of our knowledge, applying heterogeneous graph neural networks to the prediction of MPIs in plants remains under-explored. In this work, we propose a novel model named heterogeneous neighbor contrastive graph attention network (HNCGAT) for the prediction of MPIs in Arabidopsis. The HNCGAT employs a type-specific attention-based neighborhood aggregation mechanism to learn node embeddings of proteins, metabolites, and functional annotations, and designs a novel heterogeneous neighbor contrastive learning framework to preserve heterogeneous network topological structures. Extensive experimental results and an ablation study demonstrate the effectiveness of the HNCGAT model for MPI prediction. In addition, a case study on our MPI prediction results supports that the HNCGAT model can effectively predict potential MPIs in plants.


Subject(s)
Arabidopsis , Neural Networks, Computer , Arabidopsis/genetics , Arabidopsis/metabolism , Algorithms , Computational Biology/methods , Plant Proteins/genetics , Plant Proteins/metabolism
6.
Neural Netw ; 179: 106549, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39089148

ABSTRACT

Traffic flow prediction is crucial for efficient traffic management. It involves predicting vehicle movement patterns to reduce congestion and enhance traffic flow. However, the highly non-linear and complex patterns commonly observed in traffic flow pose significant challenges for this task. Current Graph Neural Network (GNN) models often construct shallow networks, which limits their ability to extract deeper spatio-temporal representations. Neural ordinary differential equations for traffic prediction address over-smoothing but require significant computational resources, leading to inefficiencies, and sometimes deeper networks may lead to poorer predictions for complex traffic information. In this study, we propose an Adaptive Decision spatio-temporal Neural Ordinary Differential Network, which can adaptively determine the number of layers of ODE according to the complexity of traffic information. It can solve the over-smoothing problem better, improving overall efficiency and prediction accuracy. In addition, traditional temporal convolution methods make it difficult to deal with complex and variable traffic time information with a large time span. Therefore, we introduce a multi-kernel temporal dynamic expansive convolution to handle the traffic time information. Multi-kernel temporal dynamic expansive convolution employs a dynamic dilation strategy, dynamically adjusting the network's receptive field across levels, effectively capturing temporal dependencies, and can better adapt to the changing time data of traffic information. Additionally, multi-kernel temporal dynamic expansive convolution integrates multi-scale convolution kernels, enabling the model to learn features across diverse temporal scales. We evaluated our proposed method on several real-world traffic datasets. Experimental results show that our method outperformed state-of-the-art benchmarks.
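
The abstract above mentions a temporal convolution that uses dilation and several kernel sizes. The following is a minimal sketch of those two building blocks only, assuming a fixed dilation and fixed averaging kernels; the paper's dynamic dilation strategy and learned weights are not reproduced. The names `dilated_causal_conv` and `multi_kernel_features` and the toy series `flow` are hypothetical.

```python
def dilated_causal_conv(series, kernel, dilation):
    """1-D causal convolution with the given dilation; the past is zero-padded."""
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation time steps
            if idx >= 0:
                total += weight * series[idx]
        out.append(total)
    return out


def multi_kernel_features(series, kernels, dilation):
    """Apply kernels of different sizes; returns one feature list per kernel."""
    return [dilated_causal_conv(series, kernel, dilation) for kernel in kernels]


# Hypothetical traffic-flow readings at six consecutive time steps.
flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = multi_kernel_features(
    flow, kernels=[[0.5, 0.5], [1 / 3, 1 / 3, 1 / 3]], dilation=2
)
```

With kernel size k and dilation d, each output sees (k - 1) * d + 1 past steps, so raising the dilation widens the receptive field without adding weights.
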

7.
J Adv Res ; 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39097091

ABSTRACT

INTRODUCTION: Immune checkpoint inhibitors (ICIs) are potent and precise therapies for various cancer types, significantly improving survival rates in patients who respond positively to them. However, only a minority of patients benefit from ICI treatments. OBJECTIVES: Identifying ICI responders before treatment could greatly conserve medical resources, minimize potential drug side effects, and expedite the search for alternative therapies. Our goal is to introduce a novel deep-learning method to predict ICI treatment responses in cancer patients. METHODS: The proposed deep-learning framework leverages graph neural network and biological pathway knowledge. We trained and tested our method using ICI-treated patients' data from several clinical trials covering melanoma, gastric cancer, and bladder cancer. RESULTS: Our results demonstrate that this predictive model outperforms current state-of-the-art methods and tumor microenvironment-based predictors. Additionally, the model quantifies the importance of pathways, pathway interactions, and genes in its predictions. A web server for IRnet has been developed and deployed, providing broad accessibility to users at https://irnet.missouri.edu. CONCLUSION: IRnet is a competitive tool for predicting patient responses to immunotherapy, specifically ICIs. Its interpretability also offers valuable insights into the mechanisms underlying ICI treatments.

8.
Genome Biol ; 25(1): 207, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39103856

ABSTRACT

Cell type identification is an indispensable analytical step in single-cell data analyses. To address the high noise stemming from gene expression data, existing computational methods often overlook the biologically meaningful relationships between genes, opting to reduce all genes to a unified data space. We assume that such relationships can aid in characterizing cell type features and improving cell type recognition accuracy. To this end, we introduce scPriorGraph, a dual-channel graph neural network that integrates multi-level gene biosemantics. Experimental results demonstrate that scPriorGraph effectively aggregates feature values of similar cells using high-quality graphs, achieving state-of-the-art performance in cell type identification.


Subject(s)
Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Neural Networks, Computer , RNA-Seq/methods , Computational Biology/methods , Algorithms , Software , Single-Cell Gene Expression Analysis
9.
bioRxiv ; 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39149355

ABSTRACT

Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) methods are limited in capturing this complexity. Representation-based learning techniques improve prediction accuracy by mapping nodes to low-dimensional embeddings, yet they often struggle with interpretability and scalability. We present BioPathNet, a novel graph neural network framework based on the Neural Bellman-Ford Network (NBFNet), addressing these limitations through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability. This allows visualization of influential paths and facilitates biological validation. BioPathNet leverages a background regulatory graph (BRG) for enhanced message passing and uses stringent negative sampling to improve precision. In evaluations across various LP tasks, such as gene function annotation, drug-disease indication, synthetic lethality, and lncRNA-mRNA interaction prediction, BioPathNet consistently outperformed shallow node embedding methods, relational graph neural networks and task-specific state-of-the-art methods, demonstrating robust performance and versatility. Our study predicts novel drug indications for diseases like acute lymphoblastic leukemia (ALL) and Alzheimer's, validated by medical experts and clinical trials. We also identified new synthetic lethality gene pairs and regulatory interactions involving lncRNAs and target genes, confirmed through literature reviews. BioPathNet's interpretability will enable researchers to trace prediction paths and gain molecular insights, making it a valuable tool for drug discovery, personalized medicine and biology in general.

10.
Front Genet ; 15: 1378809, 2024.
Article in English | MEDLINE | ID: mdl-39161422

ABSTRACT

Introduction: Developing effective breast cancer survival prediction models is critical to breast cancer prognosis. With the widespread use of next-generation sequencing technologies, numerous studies have focused on survival prediction. However, previous methods predominantly relied on single-omics data, and survival prediction using multi-omics data remains a significant challenge. Methods: In this study, considering the similarity of patients and the relevance of multi-omics data, we propose a novel multi-omics stacked fusion network (MSFN) based on a stacking strategy to predict the survival of breast cancer patients. MSFN first constructs a patient similarity network (PSN) and employs a residual graph neural network (ResGCN) to obtain correlative prognostic information from the PSN. Simultaneously, it employs convolutional neural networks (CNNs) to obtain specific prognostic information from multi-omics data. Finally, MSFN stacks the prognostic information from these networks and feeds it into AdaboostRF for survival prediction. Results: Experimental results demonstrated that our method outperformed several state-of-the-art methods and was biologically validated by Kaplan-Meier and t-SNE analyses.

11.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39175133

ABSTRACT

Target identification is one of the crucial tasks in drug research and development, as it aids in uncovering the action mechanism of herbs/drugs and discovering new therapeutic targets. Although multiple algorithms for herb target prediction have been proposed, due to the incompleteness of clinical knowledge and the limitation of unsupervised models, accurate identification of herb targets still faces huge challenges in both data and models. To address this, we proposed a deep learning-based target prediction framework termed HTINet2, which comprises three key modules, namely, traditional Chinese medicine (TCM) and clinical knowledge graph embedding, residual graph representation learning, and supervised target prediction. In the first module, we constructed a large-scale knowledge graph that covers the TCM properties and clinical treatment knowledge of herbs, and designed a deep knowledge embedding component to learn embeddings of herbs and targets. In the remaining two modules, we designed a residual-like graph convolution network to capture the deep interactions among herbs and targets, and a Bayesian personalized ranking loss to conduct supervised training and target prediction. Finally, we designed comprehensive experiments, in which comparison with baselines indicated the excellent performance of HTINet2 (HR@10 increased by 122.7% and NDCG@10 by 35.7%), ablation experiments illustrated the positive effect of the designed modules of HTINet2, and a case study demonstrated the reliability of the predicted targets of Artemisia annua and Coptis chinensis based on the knowledge base, literature, and molecular docking.


Subject(s)
Drugs, Chinese Herbal , Medicine, Chinese Traditional , Neural Networks, Computer , Drugs, Chinese Herbal/chemistry , Drugs, Chinese Herbal/pharmacology , Algorithms , Humans , Deep Learning , Bayes Theorem
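
The abstract above names a Bayesian personalized ranking (BPR) loss and reports HR@10 and NDCG@10. The sketch below shows the textbook pairwise BPR loss and one common definition of each metric, as an assumption about what those terms usually mean; the exact formulations used for HTINet2 are not given in the abstract. All function names and the target identifiers `t1` to `t3` are hypothetical.

```python
import math


def bpr_loss(pos_score, neg_score):
    """BPR loss for one (positive, negative) pair: -log(sigmoid(pos - neg))."""
    diff = pos_score - neg_score
    # -log(sigmoid(diff)) = log(1 + exp(-diff)), computed in a stable form.
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def hit_rate_at_k(ranked, relevant, k):
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


# Hypothetical ranking of three candidate targets for one herb.
ranking = ["t1", "t2", "t3"]
known_targets = {"t2"}
hr = hit_rate_at_k(ranking, known_targets, k=3)
ndcg = ndcg_at_k(ranking, known_targets, k=3)
```

The loss shrinks as the known target is scored further above the sampled non-target, which is why it suits supervised ranking of candidate targets.
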
12.
ACS Appl Mater Interfaces ; 16(33): 43734-43741, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39121441

ABSTRACT

Applying machine-learning techniques to imbalanced data sets presents a significant challenge in materials science since the underrepresented characteristics of minority classes are often buried by the abundance of unrelated characteristics in the majority classes. Existing approaches to address this focus on balancing the counts of each class using oversampling or synthetic data generation techniques. However, these methods can lead to loss of valuable information or overfitting. Here, we introduce a deep learning framework to predict minority-class materials, specifically within the realm of metal-insulator transition (MIT) materials. The proposed approach, termed boosting-CGCNN, combines the crystal graph convolutional neural network (CGCNN) model with a gradient-boosting algorithm. The model effectively handled extreme class imbalances in MIT material data by sequentially building a deeper neural network. The comparative evaluations demonstrated the superior performance of the proposed model compared to other approaches. Our approach is a promising solution for handling imbalanced data sets in materials science.

13.
Water Res ; 263: 122142, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39094201

ABSTRACT

Physics-based models are computationally time-consuming and infeasible for real-time scenarios of urban drainage networks, and a surrogate model is needed to accelerate the online predictive modelling. Fully-connected neural networks (NNs) are potential surrogate models, but may suffer from low interpretability and efficiency in fitting complex targets. Owing to the state-of-the-art modelling power of graph neural networks (GNNs) and their match with urban drainage networks in the graph structure, this work proposes a GNN-based surrogate of the flow routing model for the hydraulic prediction problem of drainage networks, which regards recent hydraulic states as initial conditions, and future runoff and control policy as boundary conditions. To incorporate hydraulic constraints and physical relationships into drainage modelling, physics-guided mechanisms are designed on top of the surrogate model to restrict the prediction variables with flow balance and flooding occurrence constraints. According to case results in a stormwater network, the GNN-based model is more cost-effective with better hydraulic prediction accuracy than the NN-based model after equal training epochs, and the designed mechanisms further limit prediction errors with interpretable domain knowledge. As the model structure adheres to the flow routing mechanisms and hydraulic constraints in urban drainage networks, it provides an interpretable and effective solution for data-driven surrogate modelling. Simultaneously, the surrogate model accelerates the predictive modelling of urban drainage networks for real-time use compared with the physics-based model.


Subject(s)
Models, Theoretical , Neural Networks, Computer , Cities , Water Movements
14.
Sci Total Environ ; 951: 175411, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39134280

ABSTRACT

Efficient management of wastewater treatment plants (WWTPs) necessitates accurate forecasting of influent water quality parameters (WQPs) and flow rate (Q) to reduce energy consumption and mitigate carbon emissions. The time series of WQPs and Q are highly non-linear and influenced by various factors such as temperature (T) and precipitation (Precip). Conventional models often struggle to account for long-term temporal patterns and overlook the complex interactions of parameters within the data, leading to inaccuracies in forecasting WQPs and Q. This work introduced the Pre-training enhanced Spatio-Temporal Graph Neural Network (PT-STGNN), a novel methodology for accurate forecasting of influent COD, ammonia nitrogen (NH3-N), total phosphorus (TP), total nitrogen (TN), pH and Q in WWTPs. PT-STGNN utilizes influent data of the WWTP, air quality data and meteorological data from the service area as inputs to enhance prediction accuracy. The model employs unsupervised Transformer blocks for pre-training, with efficient masking strategies to effectively capture long-term historical patterns and contextual information, thereby significantly boosting forecasting accuracy. Furthermore, PT-STGNN integrates a unique graph structure learning mechanism to identify dependencies between parameters, further improving the model's forecasting accuracy and interpretability. Compared with the state-of-the-art models, PT-STGNN demonstrated superior predictive performance, particularly for longer-term prediction (i.e., 12 h), with MAE, RMSE and MAPE at the 12-h prediction horizon of 2.737 ± 0.040, 4.209 ± 0.060 and 13.648 ± 0.151 %, respectively, for the algebraic mean of each parameter. From the results of graph structure learning, it is observed that there are strong dependencies between NH3-N and TN, TP and Q, as well as Precip, etc.
This study innovatively applies STGNN, not only offering a novel approach for predicting influent WQPs and Q in WWTPs, but also advancing our understanding of the interrelationships among various parameters and significantly enhancing the model's interpretability.
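
The abstract above reports MAE, RMSE and MAPE. For reference, the sketch below computes these three error measures using their standard definitions; the function names and the sample readings are hypothetical and unrelated to the study's data.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(
        abs((o - p) / o) for o, p in zip(observed, predicted)
    ) / len(observed)


# Hypothetical influent readings and forecasts.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]
errors = (mae(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

RMSE penalises large misses more heavily than MAE, while MAPE is scale-free, which makes it easier to average across parameters with different units.
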

15.
Neural Netw ; 179: 106574, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39096754

ABSTRACT

Graph neural networks (GNN) are widely used in recommendation systems, but traditional centralized methods raise privacy concerns. To address this, we introduce a federated framework for privacy-preserving GNN-based recommendations. This framework allows distributed training of GNN models using local user data. Each client trains a GNN using its own user-item graph and uploads gradients to a central server for aggregation. To overcome limited data, we propose expanding local graphs using Software Guard Extension (SGX) and Local Differential Privacy (LDP). SGX computes node intersections for subgraph exchange and expansion, while local differential privacy ensures privacy. Additionally, we introduce a personalized approach with Prototype Networks (PN) and Model-Agnostic Meta-Learning (MAML) to handle data heterogeneity. This enhances the encoding abilities of the federated meta-learner, enabling precise fine-tuning and quick adaptation to diverse client graph data. We leverage SGX and local differential privacy for secure parameter sharing and defense against malicious servers. Comprehensive experiments across six datasets demonstrate our method's superiority over centralized GNN-based recommendations, while preserving user privacy.
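
The abstract above describes clients uploading gradients to a central server and using local differential privacy. The sketch below illustrates that pattern with the Laplace mechanism on L1-clipped gradients followed by plain averaging. This is an assumed, generic construction; the paper's actual noise mechanism, its SGX-based graph expansion, and its meta-learning components are not shown. The names `clip_l1`, `privatize` and `aggregate` and the values of `max_norm` and `epsilon` are hypothetical.

```python
import random


def clip_l1(gradient, max_norm):
    """Scale the gradient so that its L1 norm is at most max_norm."""
    norm = sum(abs(g) for g in gradient)
    if norm <= max_norm or norm == 0:
        return list(gradient)
    return [g * max_norm / norm for g in gradient]


def privatize(gradient, max_norm, epsilon, rng):
    """Clip, then add Laplace noise calibrated to L1 sensitivity 2 * max_norm."""
    scale = 2.0 * max_norm / epsilon
    clipped = clip_l1(gradient, max_norm)
    # The difference of two exponential draws with mean `scale` is Laplace(0, scale).
    return [
        g + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        for g in clipped
    ]


def aggregate(client_gradients):
    """Server side: coordinate-wise average of the uploaded gradients."""
    count = len(client_gradients)
    return [sum(column) / count for column in zip(*client_gradients)]


rng = random.Random(0)
local_gradients = [[3.0, -1.0], [0.5, 0.5], [-2.0, 4.0]]  # one per client
uploads = [privatize(g, max_norm=2.0, epsilon=1.0, rng=rng) for g in local_gradients]
global_update = aggregate(uploads)
```

Because noise is added on each client before upload, the server only ever sees perturbed gradients; averaging over many clients reduces the noise in the aggregate.
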

16.
Comput Biol Med ; 180: 108869, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39096607

ABSTRACT

Alzheimer's disease (AD) is a chronic neurodegenerative disease. Early diagnosis is very important for timely treatment and for delaying the progression of the disease. In the past decade, many computer-aided diagnostic (CAD) algorithms have been proposed for classification of AD. In this paper, we propose a novel graph neural network method, termed Brain Graph Attention Network (BGAN), for classification of AD. First, brain graph data are used to model classification of AD as a graph classification task. Second, a local attention layer is designed to capture and aggregate messages of interactions between node neighbors, and a global attention layer is introduced to obtain the contribution of each node to the graph representation. Finally, BGAN is used to perform AD classification. We train and test on two open public databases for the AD classification task. The experimental results show that our model is superior to six classic models. We demonstrate that BGAN is a powerful classification model for AD. In addition, our model can provide an analysis of brain regions in order to judge which regions are related to AD and which regions are related to AD progression.


Subject(s)
Alzheimer Disease , Brain , Neural Networks, Computer , Alzheimer Disease/classification , Alzheimer Disease/diagnostic imaging , Humans , Brain/diagnostic imaging , Algorithms , Databases, Factual , Diagnosis, Computer-Assisted/methods
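
The abstract above refers to a local attention layer that aggregates messages between neighbouring nodes. The sketch below shows the general shape of attention-weighted neighbour aggregation, with plain dot-product scoring swapped in for the learned scoring function that a graph attention network would use; it is an assumption-laden illustration, not the BGAN layer. The function name and the region labels `r1` to `r3` are hypothetical.

```python
import math


def attention_aggregate(features, neighbours):
    """For each node, average neighbour features with softmax attention weights.

    features:   dict mapping node -> feature list.
    neighbours: dict mapping node -> non-empty list of neighbour nodes.
    """
    out = {}
    for node, nbrs in neighbours.items():
        # Dot-product score between the node and each neighbour (an assumption).
        scores = [
            sum(a * b for a, b in zip(features[node], features[nb])) for nb in nbrs
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # stable softmax
        total = sum(weights)
        weights = [w / total for w in weights]
        dim = len(features[node])
        out[node] = [
            sum(w * features[nb][d] for w, nb in zip(weights, nbrs))
            for d in range(dim)
        ]
    return out


# Hypothetical brain graph with three regions and two-dimensional features.
features = {"r1": [1.0, 0.0], "r2": [1.0, 0.0], "r3": [0.0, 1.0]}
neighbours = {"r1": ["r2", "r3"], "r2": ["r1"], "r3": ["r1"]}
aggregated = attention_aggregate(features, neighbours)
```

In this toy graph, region `r1` weights `r2` more heavily than `r3` because their features are more similar; inspecting such weights is one way attention models support region-level analysis.
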
17.
Comput Biol Med ; 180: 108971, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39106672

ABSTRACT

BACKGROUND: The intersection of artificial intelligence and medical image analysis has ushered in a new era of innovation and changed the landscape of brain tumor detection and diagnosis. Correct detection and classification of brain tumors based on medical images is crucial for early diagnosis and effective treatment. Convolutional Neural Network (CNN) models are widely used for disease detection. However, they are sometimes unable to sufficiently recognize the complex features of medical images. METHODS: This paper proposes a fused Deep Learning (DL) model that combines Graph Neural Networks (GNN), which recognize relational dependencies of image regions, and CNN, which captures spatial features, to improve brain tumor detection. By integrating these two architectures, our model achieves a more comprehensive representation of brain tumor images and improves classification performance. The proposed model is evaluated on a public dataset of 10847 MRI images. The results show that the proposed model outperforms the existing pre-trained models and traditional CNN architectures. RESULTS: The fused DL model achieves 93.68% accuracy in brain tumor classification. The results indicate that the proposed model outperforms the existing pre-trained models and traditional CNN architectures. CONCLUSION: The numerical results suggest that the model should be further investigated for potential use in clinical trials to improve clinical decision-making.


Subject(s)
Brain Neoplasms , Deep Learning , Magnetic Resonance Imaging , Neural Networks, Computer , Humans , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Magnetic Resonance Imaging/methods , Image Interpretation, Computer-Assisted/methods , Brain/diagnostic imaging
18.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39210506

ABSTRACT

Tumorigenesis arises from the dysfunction of cancer genes, leading to uncontrolled cell proliferation through various mechanisms. Establishing a complete cancer gene catalogue will make precision oncology possible. Although existing methods based on graph neural networks (GNN) are effective in identifying cancer genes, they fall short in effectively integrating data from multiple views and interpreting predictive outcomes. To address these shortcomings, an interpretable representation learning framework IMVRL-GCN is proposed to capture both shared and specific representations from multiview data, offering significant insights into the identification of cancer genes. Experimental results demonstrate that IMVRL-GCN outperforms state-of-the-art cancer gene identification methods and several baselines. Furthermore, IMVRL-GCN is employed to identify a total of 74 high-confidence novel cancer genes, and multiview data analysis highlights the pivotal roles of shared, mutation-specific, and structure-specific representations in discriminating distinctive cancer genes. Exploration of the mechanisms behind their discriminative capabilities suggests that shared representations are strongly associated with gene functions, while mutation-specific and structure-specific representations are linked to mutagenic propensity and functional synergy, respectively. Finally, our in-depth analyses of these candidates suggest potential insights for individualized treatments: afatinib could counteract many mutation-driven risks, and targeting interactions with cancer gene SRC is a reasonable strategy to mitigate interaction-induced risks for NR3C1, RXRA, HNF4A, and SP1.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Computational Biology/methods , Neural Networks, Computer , Mutation , Genes, Neoplasm , Hepatocyte Nuclear Factor 4/genetics , Machine Learning
19.
Neural Netw ; 180: 106650, 2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39208465

ABSTRACT

Real-world graphs exhibit increasing heterophily, where nodes no longer tend to be connected to nodes with the same label, challenging the homophily assumption of classical graph neural networks (GNNs) and impeding their performance. Intriguingly, from the observation of heterophilous data, we notice that certain high-order information exhibits higher homophily, which motivates us to involve high-order information in node representation learning. However, common practices in GNNs acquire high-order information mainly by increasing model depth and altering message-passing mechanisms, which, albeit effective to a certain extent, suffer from three shortcomings: (1) over-smoothing due to excessive model depth and propagation times; (2) incomplete use of high-order information; (3) low computational efficiency. In this regard, we design a similarity-based path sampling strategy to capture smooth paths containing high-order homophily. Then we propose a lightweight model based on multi-layer perceptrons (MLP), named PathMLP, which can encode messages carried by paths via simple transformation and concatenation operations, and effectively learn node representations in heterophilous graphs through adaptive path aggregation. Extensive experiments demonstrate that our method outperforms baselines on 16 out of 20 datasets, underlining its effectiveness and superiority in alleviating the heterophily problem. In addition, our method is immune to over-smoothing and has high computational efficiency. The source code will be available at https://github.com/Graph4Sec-Team/PathMLP.
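
The abstract above mentions a similarity-based path sampling strategy. The sketch below shows one simple way such sampling could work, assuming a greedy walk that always moves to the unvisited neighbour most similar (by cosine similarity) to the start node. This is a hypothetical illustration, not the procedure in the PathMLP repository; `cosine`, `sample_path` and the toy graph are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature lists (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sample_path(start, adjacency, features, length):
    """Greedy walk: repeatedly step to the unvisited neighbour most similar to start."""
    path = [start]
    while len(path) < length:
        candidates = [n for n in adjacency[path[-1]] if n not in path]
        if not candidates:
            break  # dead end: return the shorter path
        path.append(
            max(candidates, key=lambda n: cosine(features[start], features[n]))
        )
    return path


# Hypothetical four-node graph; node 1 looks unlike node 0, nodes 2 and 3 look alike.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.9, 0.1], 3: [1.0, 0.2]}
path = sample_path(0, adjacency, features, length=3)
```

The walk skips the dissimilar neighbour and reaches node 3, two hops away, through similar nodes, which is the kind of higher-order homophilous path the abstract describes.
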

20.
J Healthc Inform Res ; 8(3): 555-575, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39131103

ABSTRACT

Electronic Health Records (EHRs) play a crucial role in shaping predictive models, yet they encounter challenges such as significant data gaps and class imbalances. Traditional Graph Neural Network (GNN) approaches have limitations in fully leveraging neighbourhood data or demand intensive computation for regularisation. To address this challenge, we introduce CliqueFluxNet, a novel framework that innovatively constructs a patient similarity graph to maximise cliques, thereby highlighting strong inter-patient connections. At the heart of CliqueFluxNet lies its stochastic edge fluxing strategy - a dynamic process involving random edge addition and removal during training. This strategy aims to enhance the model's generalisability and mitigate overfitting. Our empirical analysis, conducted on MIMIC-III and eICU datasets, focuses on the tasks of mortality and readmission prediction. It demonstrates significant progress in representation learning, particularly in scenarios with limited data availability. Qualitative assessments further underscore CliqueFluxNet's effectiveness in extracting meaningful EHR representations, solidifying its potential for advancing GNN applications in healthcare analytics.
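
The abstract above describes stochastic edge fluxing as random edge addition and removal during training. The sketch below illustrates that idea on a plain edge set, assuming independent per-edge dropping and uniformly random additions; the actual sampling rules of CliqueFluxNet are not stated in the abstract. The function `flux_edges` and its parameters `drop_prob` and `add_count` are hypothetical.

```python
import random


def flux_edges(edges, num_nodes, drop_prob, add_count, rng):
    """Return a randomly perturbed copy of an undirected edge set.

    edges:     set of (u, v) tuples with u < v.
    drop_prob: probability of removing each existing edge.
    add_count: number of random edges to try to add (duplicates are merged).
    """
    kept = {edge for edge in edges if rng.random() >= drop_prob}
    for _ in range(add_count):
        u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes
        kept.add((min(u, v), max(u, v)))
    return kept


rng = random.Random(0)
patient_graph = {(0, 1), (1, 2), (2, 3)}  # hypothetical patient similarity edges
for epoch in range(3):
    epoch_graph = flux_edges(
        patient_graph, num_nodes=4, drop_prob=0.2, add_count=1, rng=rng
    )
    # A real training loop would run one GNN update on epoch_graph here.
```

Because the model sees a slightly different graph in every epoch, it cannot rely on any single edge, which is the regularising effect the abstract attributes to the strategy.
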
