ABSTRACT
Multi-view learning is an emerging field of multi-modal fusion, which involves representing a single instance using multiple heterogeneous features to improve compatibility prediction. However, existing graph-based multi-view learning approaches rest on homogeneity assumptions and pairwise relationships, which may not adequately capture the complex interactions among real-world instances. In this paper, we design a compressed hypergraph neural network from the perspective of multi-view heterogeneous graph learning. This approach captures rich multi-view heterogeneous semantic information and incorporates a hypergraph structure that enables the exploration of higher-order correlations between samples in multi-view scenarios. Specifically, we introduce efficient hypergraph convolutional networks based on an explainable regularizer-centered optimization framework. Additionally, a low-rank approximation is adopted to reformat the initial complex multi-view heterogeneous graph as hypergraphs. Extensive experiments against several advanced node classification methods and multi-view classification methods demonstrate the feasibility and effectiveness of the proposed method.
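For readers unfamiliar with hypergraph convolution, the following is a minimal sketch of one propagation step in the widely used HGNN form, Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X. It is not the compressed, regularizer-based network proposed in the paper: the incidence matrix `H`, the features `X`, and the function name `hypergraph_conv` are illustrative assumptions, and learned weights and nonlinearities are omitted.

```python
import math

def hypergraph_conv(H, X, edge_w=None):
    """One HGNN-style propagation step without learned weights.

    H: n x m incidence matrix (H[i][e] = 1 if node i belongs to hyperedge e).
    X: n x d node features. Assumes every node and hyperedge has degree > 0.
    """
    n, m, d = len(H), len(H[0]), len(X[0])
    w = edge_w if edge_w is not None else [1.0] * m
    dv = [sum(w[e] * H[i][e] for e in range(m)) for i in range(n)]  # vertex degrees
    de = [sum(H[i][e] for i in range(n)) for e in range(m)]         # hyperedge degrees
    # Scale node features by Dv^(-1/2).
    Xs = [[X[i][k] / math.sqrt(dv[i]) for k in range(d)] for i in range(n)]
    # Gather node features into each hyperedge and normalise by its degree.
    E = [[w[e] * sum(H[i][e] * Xs[i][k] for i in range(n)) / de[e]
          for k in range(d)] for e in range(m)]
    # Scatter hyperedge features back to nodes and scale by Dv^(-1/2) again.
    return [[sum(H[i][e] * E[e][k] for e in range(m)) / math.sqrt(dv[i])
             for k in range(d)] for i in range(n)]

# Hypothetical toy hypergraph: 4 nodes, 2 hyperedges ({0, 1} and {1, 2, 3}).
H = [[1, 0], [1, 1], [0, 1], [0, 1]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
out = hypergraph_conv(H, X)
```

Because each hyperedge can join more than two nodes, a single step mixes information across whole groups of samples rather than only across pairs, which is the higher-order correlation the abstract refers to.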
Subjects
Neural Networks, Computer; Algorithms; Machine Learning; Semantics; Humans
ABSTRACT
Graphs are widely used to model interconnected entities and improve downstream predictions in various real-world applications. However, real-world graphs nowadays are often associated with complex attributes on multiple types of nodes and even links that are hard to model uniformly, while the widely used graph neural networks (GNNs) often require sufficient training toward specific downstream predictions to achieve strong performance. In this work, we take a fundamentally different approach from GNNs to simultaneously achieve deep joint modeling of complex attributes and flexible structures of real-world graphs and obtain unsupervised generic graph representations that are not limited to specific downstream predictions. Our framework, built on a natural integration of language models (LMs) and random walks (RWs), is straightforward, powerful and data-efficient. Specifically, we first perform attributed RWs on the graph and design an automated program to compose roughly meaningful textual sequences directly from the attributed RWs; then we fine-tune an LM using the RW-based textual sequences and extract embedding vectors from the LM, which encapsulate both attribute semantics and graph structures. In our experiments, we evaluate the learned node embeddings on different downstream prediction tasks across multiple real-world attributed graph datasets and observe significant improvements over a comprehensive set of state-of-the-art unsupervised node embedding methods. We believe this work opens a door for more sophisticated technical designs and empirical evaluations toward leveraging LMs for the modeling of real-world graphs.
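The first stage described above, turning attributed random walks into text, can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy graph, the node types and attributes, the uniform walk, and the sentence template in `walk_to_text` are all hypothetical, and the LM fine-tuning and embedding extraction stages are not shown.

```python
import random

# Hypothetical toy attributed graph: node id -> (node type, attribute text).
NODES = {
    "p1": ("paper", "hypergraph neural networks"),
    "p2": ("paper", "random walks on graphs"),
    "a1": ("author", "Alice"),
    "v1": ("venue", "ExampleConf"),
}
# Undirected adjacency lists (assumed; every node has at least one neighbour).
EDGES = {
    "p1": ["a1", "v1"],
    "p2": ["a1", "v1"],
    "a1": ["p1", "p2"],
    "v1": ["p1", "p2"],
}

def random_walk(start, length, rng):
    """Uniform random walk visiting `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(EDGES[walk[-1]]))
    return walk

def walk_to_text(walk):
    """Compose a rough sentence by verbalising each node's type and attribute."""
    parts = [f"{NODES[n][0]} {NODES[n][1]}" for n in walk]
    return " is linked to ".join(parts) + "."

rng = random.Random(0)  # fixed seed so the toy corpus is reproducible
walks = [random_walk(n, 4, rng) for n in NODES]
corpus = [walk_to_text(w) for w in walks]
```

Each string in `corpus` interleaves attribute text with graph neighbourhood order, so a language model fine-tuned on such sequences sees both kinds of signal in one input.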
ABSTRACT
Personalized diagnosis prediction based on electronic health records (EHR) of patients is a promising yet challenging task for AI in healthcare. Existing studies typically ignore the heterogeneity of diseases across different patients. For example, diabetes can have different complications across different patients (e.g., hyperlipidemia and circulatory disorder), which requires personalized diagnoses and treatments. Specifically, existing models fail to consider 1) varying severity of the same diseases for different patients, 2) complex interactions among syndromic diseases, and 3) dynamic progression of chronic diseases. In this work, we propose to perform personalized diagnosis prediction based on EHR data by capturing disease severity, interaction, and progression. In particular, we enable personalized disease representations via severity-driven embeddings at the disease level. Then, at the visit level, we propose to capture higher-order interactions among diseases that can collectively affect patients' health status via hypergraph-based aggregation; at the patient level, we devise a personalized generative model based on neural ordinary differential equations to capture the continuous-time disease progression underlying discrete and incomplete visits. Extensive experiments on two real-world EHR datasets show significant performance gains brought by our approach, yielding average improvements of 10.70% for diagnosis prediction over state-of-the-art competitors.
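The idea of a continuous-time latent state evolving between irregularly spaced visits can be illustrated with a small sketch. This is not the paper's model: the fixed-step Euler solver, the hand-set weights in `dynamics` (a stand-in for a learned network), the visit times, and the initial state are all assumptions chosen for illustration.

```python
import math

def euler_integrate(f, z, t0, t1, steps=100):
    """Integrate dz/dt = f(z) from t0 to t1 with fixed-step Euler."""
    h = (t1 - t0) / steps
    for _ in range(steps):
        dz = f(z)
        z = [zi + h * di for zi, di in zip(z, dz)]
    return z

def dynamics(z):
    """Hypothetical stand-in for a learned network: one fixed tanh layer."""
    W = [[-0.5, 0.2], [0.1, -0.3]]
    return [math.tanh(sum(W[i][j] * z[j] for j in range(2))) for i in range(2)]

# Irregular gaps between visits (assumed time units); the latent patient
# state is evolved across each gap instead of assuming evenly spaced steps.
visit_times = [0.0, 0.4, 1.5, 1.7]
z = [1.0, -1.0]
states = [z]
for t_prev, t_next in zip(visit_times, visit_times[1:]):
    z = euler_integrate(dynamics, z, t_prev, t_next)
    states.append(z)
```

In practice an adaptive ODE solver and a trained network would replace the Euler loop and the fixed weights; the sketch only shows how the gap length between visits enters the state update.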