Results 1 - 20 of 25
1.
Entropy (Basel) ; 26(5)2024 Apr 28.
Article in English | MEDLINE | ID: mdl-38785621

ABSTRACT

The integration of graph embedding technology and collaborative filtering algorithms has shown promise in enhancing the performance of recommendation systems. However, existing integrated recommendation algorithms often suffer from feature bias and lack effectiveness in personalized user recommendation. For instance, users' historical interactions with a certain class of items may inaccurately lead to recommendations of all items within that class, resulting in feature bias. Moreover, accommodating changes in user interests over time poses a significant challenge. This study introduces a novel recommendation model, RCKFM, which addresses these shortcomings by leveraging the CoFM model, the TransR graph embedding model, backdoor adjustment from causal inference, KL divergence, and the factorization machine model. RCKFM focuses on improving graph embedding technology, adjusting feature bias in embedding models, and achieving personalized recommendations. Specifically, it employs the TransR graph embedding model to handle various relationship types effectively, mitigates feature bias using causal inference techniques, and predicts changes in user interests through KL divergence, thereby enhancing the accuracy of personalized recommendations. Experimental evaluations conducted on publicly available datasets, including "MovieLens-1M" and "Douban dataset" from Kaggle, demonstrate the superior performance of the RCKFM model. The results indicate a significant improvement of between 3.17% and 6.81% in key indicators such as precision, recall, normalized discounted cumulative gain, and hit rate in the top-10 recommendation tasks. These findings underscore the efficacy and potential impact of the proposed RCKFM model in advancing recommendation systems.
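
The abstract does not say how KL divergence is applied to user interests. As a rough illustration only, the sketch below compares a user's earlier and more recent interaction distributions over item categories; the category counts, the function names, and the smoothing constant `eps` are assumptions made for this example and are not taken from the RCKFM paper.

```python
import math


def normalize(counts):
    """Turn non-negative interaction counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against division by zero when q assigns zero mass
    to a category that p still uses (an assumption of this sketch).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical interaction counts over three item categories.
earlier = normalize([8, 1, 1])   # older history, dominated by category 0
recent = normalize([2, 2, 6])    # recent history, dominated by category 2

# A larger value suggests the recent interests have moved away from the older ones.
interest_drift = kl_divergence(recent, earlier)
```

KL divergence is asymmetric, so `kl_divergence(recent, earlier)` and `kl_divergence(earlier, recent)` generally differ; which direction suits a recommender is a modelling choice.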

2.
Life Sci Space Res (Amst) ; 41: 64-73, 2024 May.
Article in English | MEDLINE | ID: mdl-38670654

ABSTRACT

Microgravity in the space environment can potentially have various negative effects on the human body, one of which is bone loss. Given the increasing frequency of human space activities, there is an urgent need to identify effective anti-osteoporosis drugs for the microgravity environment. Traditional microgravity experiments conducted in space suffer from limitations such as time-consuming procedures, high costs, and small sample sizes. In recent years, the in-silico drug discovery method has emerged as a promising strategy due to the advancements in bioinformatics and computer technology. In this study, we first collected a total of 184,915 literature articles related to microgravity and bone loss. We employed a combination of dependency path extraction and clustering techniques to extract data from the text. Afterwards, we conducted data cleaning and standardization to integrate data from several sources, including The Global Network of Biomedical Relationships (GNBR), Curated Drug-Drug Interactions Database (DDInter), Search Tool for Interacting Chemicals (STITCH), DrugBank, and Traditional Chinese Medicines Integrated Database (TCMID). Through this integration process, we constructed the Microgravity Biology Knowledge Graph (MBKG) consisting of 134,796 biological entities and 3,395,273 triplets. Subsequently, the TransE model was utilized to perform knowledge graph embedding. By calculating the distances between entities in the model space, the model successfully predicted potential drugs for treating osteoporosis and microgravity-induced bone loss. The results indicate that out of the top 10 ranked western medicines, 7 have been approved for the treatment of osteoporosis. Additionally, among the top 10 ranked traditional Chinese medicines, 5 have scientific literature supporting their effectiveness in treating bone loss. 
Among the top 20 predicted medicines for microgravity-induced bone loss, 15 have been studied in microgravity or simulated microgravity environments, while the remaining 5 are also applicable for treating osteoporosis. This research highlights the potential application of MBKG in the field of space drug discovery.
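
The distance-based ranking described above can be sketched with the standard TransE score, which treats a triple (head, relation, tail) as plausible when head + relation lies close to tail. The two-dimensional vectors, the entity names, and the `treats` relation below are invented for illustration and are not MBKG data.

```python
import math


def transe_distance(head, relation, tail):
    """Euclidean norm of (head + relation - tail); smaller means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_candidates(candidates, relation, tail):
    """Sort candidate head entities by ascending TransE distance to the tail."""
    scored = [(name, transe_distance(vec, relation, tail)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy embeddings (not taken from the paper).
drug_embeddings = {
    "drug_a": [0.1, 0.2],
    "drug_b": [0.9, 0.9],
    "drug_c": [0.0, 0.5],
}
treats = [0.4, 0.3]          # assumed relation vector
bone_loss = [0.5, 0.5]       # assumed disease vector

ranking = rank_candidates(drug_embeddings, treats, bone_loss)
ranked_names = [name for name, _ in ranking]
```

In a trained model the vectors would come from the learned embedding table, and the top-ranked entities would be the candidate drugs.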


Subject(s)
Osteoporosis , Weightlessness , Humans , Osteoporosis/drug therapy , Drug Discovery , Bone Density Conservation Agents/therapeutic use , Computational Biology/methods , Computer Simulation
3.
Comput Biol Med ; 174: 108398, 2024 May.
Article in English | MEDLINE | ID: mdl-38608322

ABSTRACT

The recurrence of low-stage lung cancer poses a challenge due to its unpredictable nature and diverse patient responses to treatments. Personalized care and patient outcomes heavily rely on early relapse identification, yet current predictive models, despite their potential, lack comprehensive genetic data. This inadequacy fuels our research focus: integrating specific genetic information, such as pathway scores, into clinical data. Our aim is to refine machine learning models for more precise relapse prediction in early-stage non-small cell lung cancer. To address the scarcity of genetic data, we employ imputation techniques, leveraging publicly available datasets such as The Cancer Genome Atlas (TCGA), integrating pathway scores into our patient cohort from the Cancer Long Survivor Artificial Intelligence Follow-up (CLARIFY) project. Through the integration of imputed pathway scores from the TCGA dataset with clinical data, our approach achieves notable strides in predicting relapse among a held-out test set of 200 patients. By training machine learning models on enriched knowledge graph data, inclusive of triples derived from pathway score imputation, we achieve a promising precision of 82% and specificity of 91%. These outcomes highlight the potential of our models as supplementary tools within tumour, node, and metastasis (TNM) classification systems, offering improved prognostic capabilities for lung cancer patients. In summary, our research underscores the significance of refining machine learning models for relapse prediction in early-stage non-small cell lung cancer. Our approach, centered on imputing pathway scores and integrating them with clinical data, not only enhances predictive performance but also demonstrates the promising role of machine learning in anticipating relapse and ultimately elevating patient outcomes.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Genomics , Lung Neoplasms , Machine Learning , Humans , Lung Neoplasms/genetics , Carcinoma, Non-Small-Cell Lung/genetics , Genomics/methods , Neoplasm Recurrence, Local/genetics , Female , Male , Databases, Genetic
4.
PeerJ Comput Sci ; 10: e1808, 2024.
Article in English | MEDLINE | ID: mdl-38435603

ABSTRACT

The purpose of knowledge embedding is to map entities and relations from the knowledge graph into low-dimensional dense vectors so that they can be applied to downstream tasks, such as link prediction and intelligent classification. Existing knowledge embedding methods still have many limitations, such as the contradiction between the vast amount of data and limited computing power, and the challenge of effectively representing rare entities. This article proposes a knowledge embedding learning model that incorporates a graph attention mechanism to integrate key node information. It can effectively aggregate key information from the global graph structure, shield redundant information, and represent rare nodes in the knowledge base independently of their own structure. We introduce a relation update layer to further update the relations based on the results of entity training. The experiments show that our method matches or surpasses the performance of other baseline models in link prediction on the FB15K-237 dataset. The Hits@1 metric increased by 10.9% compared with the second-ranked baseline model. In addition, we conducted further analysis on rare nodes with fewer neighbors, confirming that our model can embed rare nodes more accurately than the baseline models.

5.
Neural Netw ; 172: 106143, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38309139

ABSTRACT

Entity alignment aims to construct a complete knowledge graph (KG) by matching the same entities in multi-source KGs. Existing research on entity alignment mainly focuses on static multi-relational data in knowledge graphs. However, the relationships or attributes between entities often possess temporal characteristics as well. Neglecting these temporal characteristics can frequently lead to alignment errors. Compared with work on entity alignment in temporal knowledge graphs, there are relatively few efforts on entity alignment in cross-lingual temporal knowledge graphs. Therefore, in this paper, we put forward an entity alignment method for cross-lingual temporal knowledge graphs, namely CTEA. Based on GCN and TransE, CTEA combines entity embeddings, relation embeddings and attribute embeddings to design a joint embedding model, which is more conducive to generating transferable entity embeddings. In the meantime, the distance calculation between elements and the similarity calculation of entity pairs are combined to enhance the reliability of cross-lingual entity alignment. Experiments show that the proposed CTEA model improves Hits@m and MRR by about 0.8 to 2.4 percentage points compared with the latest methods.


Subject(s)
Knowledge , Pattern Recognition, Automated , Reproducibility of Results
6.
BMC Bioinformatics ; 24(1): 488, 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38114937

ABSTRACT

BACKGROUND: The pharmaceutical field faces a significant challenge in validating drug target interactions (DTIs) due to the time and cost involved, leading to only a fraction being experimentally verified. To expedite drug discovery, accurate computational methods are essential for predicting potential interactions. Recently, machine learning techniques, particularly graph-based methods, have gained prominence. These methods utilize networks of drugs and targets, employing knowledge graph embedding (KGE) to represent structured information from knowledge graphs in a continuous vector space. This phenomenon highlights the growing inclination to utilize graph topologies as a means to improve the precision of predicting DTIs, hence addressing the pressing requirement for effective computational methodologies in the field of drug discovery. RESULTS: The present study presents a novel approach called DTIOG for the prediction of DTIs. The methodology employed in this study involves the utilization of a KGE strategy, together with the incorporation of contextual information obtained from protein sequences. More specifically, the study makes use of Protein Bidirectional Encoder Representations from Transformers (ProtBERT) for this purpose. DTIOG utilizes a two-step process to compute embedding vectors using KGE techniques. Additionally, it employs ProtBERT to determine target-target similarity. Different similarity measures, such as Cosine similarity or Euclidean distance, are utilized in the prediction procedure. In addition to the contextual embedding, the proposed unique approach incorporates local representations obtained from the Simplified Molecular Input Line Entry Specification (SMILES) of drugs and the amino acid sequences of protein targets. CONCLUSIONS: The effectiveness of the proposed approach was assessed through extensive experimentation on datasets pertaining to Enzymes, Ion Channels, and G-protein-coupled Receptors. 
The remarkable efficacy of DTIOG was showcased through the utilization of diverse similarity measures in order to calculate the similarities between drugs and targets. The combination of these factors, along with the incorporation of various classifiers, enabled the model to outperform existing algorithms in its ability to predict DTIs. The consistent observation of this advantage across all datasets underlines the robustness and accuracy of DTIOG in the domain of DTIs. Additionally, our case study suggests that the DTIOG can serve as a valuable tool for discovering new DTIs.


Subject(s)
Drug Development , Pattern Recognition, Automated , Drug Development/methods , Proteins/chemistry , Algorithms , Knowledge Bases , Drug Interactions
7.
J Biomed Inform ; 147: 104503, 2023 11.
Article in English | MEDLINE | ID: mdl-37778673

ABSTRACT

Predicting relationships between biological entities can greatly benefit important biomedical problems. Previous studies have attempted to represent biological entities and relationships in Euclidean space using embedding methods, which evaluate their semantic similarity by representing entities as numerical vectors. However, the limitation of these methods is that they cannot prevent the loss of latent hierarchical information when embedding large graph-structured data into Euclidean space, and therefore cannot capture the semantics of entities and relationships accurately. Hyperbolic spaces, such as Poincaré ball, are better suited for hierarchical modeling than Euclidean spaces. This is because hyperbolic spaces exhibit negative curvature, causing distances to grow exponentially as they approach the boundary. In this paper, we propose HEM, a hyperbolic hierarchical knowledge graph embedding model to generate vector representations of bio-entities. By encoding the entities and relations in the hyperbolic space, HEM can capture latent hierarchical information and improve the accuracy of biological entity representation. Notably, HEM can preserve rich information with a low dimension compared with the methods that encode entities in Euclidean space. Furthermore, we explore the performance of HEM in protein-protein interaction prediction and gene-disease association prediction tasks. Experimental results demonstrate the superior performance of HEM over state-of-the-art baselines. The data and code are available at : https://github.com/Nan-ll/HEM.
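
To make the point about distances growing near the boundary concrete, the sketch below evaluates the standard Poincaré ball distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). This is the textbook metric, not necessarily the scoring function used by HEM, and the sample points are arbitrary.

```python
import math


def squared_norm(vec):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(x * x for x in vec)


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    nu, nv = squared_norm(u), squared_norm(v)
    if nu >= 1.0 or nv >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = squared_norm([a - b for a, b in zip(u, v)])
    # max() protects acosh from arguments slightly below 1 due to rounding.
    return math.acosh(max(1.0, 1.0 + 2.0 * diff / ((1.0 - nu) * (1.0 - nv))))


# Two pairs with the same Euclidean gap (0.1), one near the origin, one near the boundary.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.85, 0.0], [0.95, 0.0])
```

Here `near_boundary` comes out several times larger than `near_origin` although both pairs are 0.1 apart in Euclidean terms, which is the property that leaves room for tree-like hierarchies in few dimensions.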


Subject(s)
Knowledge , Pattern Recognition, Automated , Semantics
8.
J Med Internet Res ; 25: e45225, 2023 10 20.
Article in English | MEDLINE | ID: mdl-37862061

ABSTRACT

BACKGROUND: The global pandemics of severe acute respiratory syndrome, Middle East respiratory syndrome, and COVID-19 have caused unprecedented crises for public health. Coronaviruses are constantly evolving, and it is unknown which new coronavirus will emerge and when the next coronavirus will sweep across the world. Knowledge graphs are expected to help discover the pathogenicity and transmission mechanism of viruses. OBJECTIVE: The aim of this study was to discover potential targets and candidate drugs to repurpose for coronaviruses through a knowledge graph-based approach. METHODS: We propose a computational and evidence-based knowledge discovery approach to identify potential targets and candidate drugs for coronaviruses from biomedical literature and well-known knowledge bases. To organize the semantic triples extracted automatically from biomedical literature, a semantic conversion model was designed. The literature knowledge was associated and integrated with existing drug and gene knowledge through semantic mapping, and the coronavirus knowledge graph (CovKG) was constructed. We adopted both the knowledge graph embedding model and the semantic reasoning mechanism to discover unrecorded mechanisms of drug action as well as potential targets and drug candidates. Furthermore, we have provided evidence-based support with a scoring and backtracking mechanism. RESULTS: The constructed CovKG contains 17,369,620 triples, of which 641,195 were extracted from biomedical literature, covering 13,065 concept unique identifiers, 209 semantic types, and 97 semantic relations of the Unified Medical Language System. Through multi-source knowledge integration, 475 drugs and 262 targets were mapped to existing knowledge, and 41 new drug mechanisms of action were found by semantic reasoning, which were not recorded in the existing knowledge base. Among the knowledge graph embedding models, TransR outperformed others (mean reciprocal rank=0.2510, Hits@10=0.3505). 
A total of 33 potential targets and 18 drug candidates were identified for coronaviruses. Among them, 7 novel drugs (ie, quinine, nelfinavir, ivermectin, asunaprevir, tylophorine, Artemisia annua extract, and resveratrol) and 3 highly ranked targets (ie, angiotensin converting enzyme 2, transmembrane serine protease 2, and M protein) were further discussed. CONCLUSIONS: We showed the effectiveness of a knowledge graph-based approach in potential target discovery and drug repurposing for coronaviruses. Our approach can be extended to other viruses or diseases for biomedical knowledge discovery and relevant applications.
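
The two link-prediction metrics reported above (mean reciprocal rank and Hits@10) can be computed from the rank at which each held-out true entity appears among the scored candidates. The sketch below uses made-up ranks and only illustrates the definitions; it does not reproduce the CovKG evaluation.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
example_ranks = [1, 2, 4, 10, 20]
mrr = mean_reciprocal_rank(example_ranks)
hits10 = hits_at_k(example_ranks, 10)
```

Both metrics lie in [0, 1], and higher is better. MRR rewards placing the true entity near the very top, while Hits@10 only checks whether it made the top ten.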


Subject(s)
COVID-19 , Drug Repositioning , Humans , Pattern Recognition, Automated , Knowledge Bases , Unified Medical Language System
9.
Open Res Eur ; 3: 100, 2023.
Article in English | MEDLINE | ID: mdl-37645491

ABSTRACT

Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT-3, to semi-automate the creation of a knowledge graph of a body of scholarship concerning the antiquities trade. Methods: We give GPT-3 a prompt guiding it to identify knowledge statements around the trade. Given GPT-3's understanding of the statistical properties of language, our prompt teaches GPT-3 to append text to each article we feed it, where the appended text summarizes the knowledge in the article. The summary is in the form of a list of subject, predicate, and object relationships, representing a knowledge graph. Previously we created such lists by manually annotating the source articles. We compare the result of this automatic process with a knowledge graph created from the same sources by hand. When such knowledge graphs are projected into a multi-dimensional embedding model using a neural network (via the Ampligraph open-source Python library), the relative positioning of entities implies the probability of a connection; the direction of the positioning implies the kind of connection. Thus, we can interrogate the embedding model to discover new probable relationships. The results can generate new insight about the antiquities trade, suggesting possible avenues of research. Results: We find that our semi-automatic approach to generating the knowledge graph in the first place produces comparable results to our hand-made version, but at an enormous saving of time and a possible expansion of the amount of materials we can consider. Conclusions: These results have implications for working with other kinds of archaeological knowledge in grey literature, reports, articles, and other venues via computational means.

10.
Math Biosci Eng ; 20(6): 9607-9624, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-37322903

ABSTRACT

Knowledge graph (KG) embedding aims to embed the entities and relations of a KG into a low-dimensional continuous vector space while preserving the intrinsic semantic associations between entities and relations. One of the most important applications of knowledge graph embedding (KGE) is link prediction (LP), which aims to predict the missing fact triples in the KG. A promising approach to improving the performance of KGE for the task of LP is to increase the feature interactions between entities and relations so as to express richer semantics between them. Convolutional neural networks (CNNs) have thus become one of the most popular KGE models due to their strong expression and generalization abilities. To further enhance favorable features from increased feature interactions, we propose a lightweight CNN-based KGE model called IntSE in this paper. Specifically, IntSE not only increases the feature interactions between the components of entity and relation embeddings with more efficient CNN components but also incorporates a channel attention mechanism. This mechanism adaptively recalibrates channel-wise feature responses by modeling the interdependencies between channels, enhancing useful features while suppressing useless ones to improve LP performance. The experimental results on public datasets confirm that IntSE is superior to state-of-the-art CNN-based KGE models for link prediction in KGs.


Subject(s)
Neural Networks, Computer , Pattern Recognition, Automated , Semantics
11.
Health Inf Sci Syst ; 11(1): 7, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36703901

ABSTRACT

Purpose: The early detection of organ failure mitigates the risk of post-intensive care syndrome and long-term functional impairment. The aim of this study is to predict organ failure in real-time for critical care patients based on a data-driven and knowledge-driven machine learning method (DKM) and provide explanations for the prediction by incorporating a medical knowledge graph. Methods: The cohort of this study was a subset of the 4,386 adult Intensive Care Unit (ICU) patients from the MIMIC-III dataset collected between 2001 and 2012, and the primary outcome was the Delta Sequential Organ Failure Assessment (SOFA) score. A real-time Delta SOFA score prediction model was developed with two key components: an improved deep learning temporal convolutional network (S-TCN) and a graph-embedding feature extraction method based on a medical knowledge graph. Entities and relations related to organ failure were extracted from the Unified Medical Language System to build the medical knowledge graph, and patient data were mapped onto the graph to extract the embeddings. We measured the performance of our DKM approach with cross-validation to avoid the formation of biased assessments. Results: An area under the receiver operating characteristic curve (AUC) of 0.973, a precision of 0.923, a NPV of 0.989, and an F1 score of 0.927 were achieved using the DKM approach, which significantly outperformed the baseline methods. Additionally, the performance remained stable following external validation on the eICU dataset, which consists of 2,816 admissions (AUC = 0.981, precision = 0.860, NPV = 0.984). Visualization of feature importance for the Delta SOFA score and their relationships on the basic clinical medical (BCM) knowledge graph provided a model explanation. 
Conclusion: The use of an improved TCN model and a medical knowledge graph led to substantial improvement in prediction accuracy, providing generalizability and an independent explanation for organ failure prediction in critical care patients. These findings show the potential of incorporating prior domain knowledge into machine learning models to inform care and service planning. Supplementary Information: The online version of this article contains supplementary material available at 10.1007/s13755-023-00210-5.

12.
Complex Intell Systems ; 9(1): 1059-1095, 2023.
Article in English | MEDLINE | ID: mdl-35965491

ABSTRACT

The necessity for scholarly knowledge mining and management has grown significantly as academic literature and its linkages to authors grow enormously. Information extraction, ontology matching, and accessing academic components with relations have become more critical than ever. Therefore, with the advancement of scientific literature, scholarly knowledge graphs have become critical to various applications where semantics can impart meanings to concepts. The objective of this study is to report a literature review regarding knowledge graph construction, refinement and utilization in the scholarly domain. Based on scholarly literature, the study presents a complete assessment of current state-of-the-art techniques. We present an analytical methodology to investigate the existing status of scholarly knowledge graphs (SKG) by structuring scholarly communication. This review paper investigates the application of machine learning, rule-based learning, and natural language processing tools and approaches to construct SKG. It further reviews knowledge graph utilization and refinement to provide a view of current research efforts. In addition, we present existing applications and challenges across construction, refinement and utilization collectively. This research will help to identify frontier trends of SKG, which will motivate future researchers to carry forward their work.

13.
Article in Chinese | WPRIM (Western Pacific) | ID: wpr-987651

ABSTRACT

Alzheimer's disease (AD) has brought huge medical and economic burdens, so the discovery of therapeutic drugs for it is of great significance. In this paper, we utilized knowledge graph embedding (KGE) models to explore drug repurposing for AD on the publicly available drug repurposing knowledge graph (DRKG). Specifically, we applied four KGE models, namely TransE, DistMult, ComplEx, and RotatE, to learn the embedding vectors of entities and relations on DRKG. By using three classical knowledge graph evaluation metrics, we then evaluated and compared the performance of these models as well as the quality of the learned embedding vectors. Based on our results, we selected the RotatE model for link prediction and identified 16 drugs that might be repurposed for the treatment of AD. Previous studies have confirmed the potential therapeutic effects of 12 drugs against AD, i.e., glutathione, haloperidol, capsaicin, quercetin, estradiol, glucose, disulfire, adenosine, paroxetine, paclitaxel, glybride and amitriptyline. Our study demonstrates that drug repurposing based on KGE may provide new ideas and methods for AD drug discovery. Moreover, the RotatE model effectively integrates multi-source information of DRKG, enabling promising AD drug repurposing. The source code of this paper is available at https://github.com/LuYF-Lemon-love/AD-KGE.
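
For readers unfamiliar with RotatE, the model selected above, the sketch below shows its core idea: each relation is an element-wise rotation in the complex plane, and a triple is plausible when the rotated head lands near the tail. The two-dimensional complex vectors and phases are invented for illustration; the authors' actual implementation is in the repository linked above.

```python
import cmath
import math


def rotate_distance(head, phases, tail):
    """Sum of element-wise moduli of (head * e^{i*phase} - tail).

    head and tail are lists of complex numbers; phases are rotation angles
    in radians, so every relation element has modulus 1 by construction.
    """
    total = 0.0
    for h, phase, t in zip(head, phases, tail):
        rotated = h * cmath.exp(1j * phase)
        total += abs(rotated - t)
    return total


# Hypothetical toy embeddings: a quarter-turn relation maps the head exactly onto good_tail.
head = [1 + 0j, 0 + 1j]
quarter_turn = [math.pi / 2, math.pi / 2]
good_tail = [0 + 1j, -1 + 0j]
bad_tail = [1 + 0j, 1 + 0j]

good_score = rotate_distance(head, quarter_turn, good_tail)
bad_score = rotate_distance(head, quarter_turn, bad_tail)
```

A smaller distance means a more plausible triple, so link prediction ranks candidate tails by ascending distance.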

14.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36384050

ABSTRACT

Recent advances in Knowledge Graphs (KGs) and Knowledge Graph Embedding Models (KGEMs) have led to their adoption in a broad range of fields and applications. The current publishing system in machine learning requires newly introduced KGEMs to achieve state-of-the-art performance, surpassing at least one benchmark in order to be published. Despite this, dozens of novel architectures are published every year, making it challenging for users, even within the field, to deduce the most suitable configuration for a given application. A typical biomedical application of KGEMs is drug-disease prediction in the context of drug discovery, in which a KGEM is trained to predict triples linking drugs and diseases. These predictions can be later tested in clinical trials following extensive experimental validation. However, given the infeasibility of evaluating each of these predictions and that only a minimal number of candidates can be experimentally tested, models that yield higher precision on the top prioritized triples are preferred. In this paper, we apply the concept of ensemble learning on KGEMs for drug discovery to assess whether combining the predictions of several models can lead to an overall improvement in predictive performance. First, we trained and benchmarked 10 KGEMs to predict drug-disease triples on two independent biomedical KGs designed for drug discovery. Next, we applied different ensemble methods that aggregate the predictions of these models by leveraging the distribution or the position of the predicted triple scores. We then demonstrate how the ensemble models can achieve better results than the original KGEMs by benchmarking the precision (i.e., number of true positives prioritized) of their top predictions. Lastly, we released the source code presented in this work at https://github.com/enveda/kgem-ensembles-in-drug-discovery.
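
The abstract distinguishes ensembles that use the distribution of triple scores from those that use their position. One plausible reading, sketched below under that assumption, is z-score averaging for the former and mean-rank averaging for the latter. The function names, triple labels, and scores are hypothetical; the methods the authors actually used are in the repository linked above.

```python
from statistics import mean, pstdev


def zscore_ensemble(model_scores):
    """Average per-model z-scores of each triple (higher means more plausible)."""
    combined = {}
    for scores in model_scores:
        values = list(scores.values())
        mu = mean(values)
        sigma = pstdev(values) or 1.0   # avoid dividing by zero for constant scores
        for triple, score in scores.items():
            combined.setdefault(triple, []).append((score - mu) / sigma)
    return {triple: mean(zs) for triple, zs in combined.items()}


def rank_ensemble(model_scores):
    """Average per-model rank positions of each triple (lower means more plausible)."""
    combined = {}
    for scores in model_scores:
        ordered = sorted(scores, key=scores.get, reverse=True)
        for position, triple in enumerate(ordered, start=1):
            combined.setdefault(triple, []).append(position)
    return {triple: mean(ps) for triple, ps in combined.items()}


# Hypothetical scores from two models on different scales.
model_a = {"drug1-disease": 0.9, "drug2-disease": 0.2, "drug3-disease": 0.5}
model_b = {"drug1-disease": 5.0, "drug2-disease": 4.0, "drug3-disease": 1.0}

by_distribution = zscore_ensemble([model_a, model_b])
by_position = rank_ensemble([model_a, model_b])
top_by_distribution = max(by_distribution, key=by_distribution.get)
top_by_position = min(by_position, key=by_position.get)
```

Normalizing before averaging matters because different embedding models produce scores on incomparable scales. Rank averaging discards score magnitudes entirely, which makes it less sensitive to outlier scores.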


Subject(s)
Drug Discovery , Pattern Recognition, Automated , Knowledge , Machine Learning , Software
15.
Molecules ; 27(16)2022 Aug 12.
Article in English | MEDLINE | ID: mdl-36014371

ABSTRACT

Nowadays, drug-target interaction (DTI) prediction is a fundamental part of drug repositioning. However, on the one hand, DTI prediction models usually consider only drug or target information and ignore prior knowledge about the relations between drugs and targets. On the other hand, models incorporating prior knowledge cannot predict interactions for under-studied drugs and targets. Hence, this article proposes a novel dual-network integrated logistic matrix factorization DTI prediction scheme (Ro-DNILMF) via a knowledge graph embedding approach. This model adds prior knowledge as input data into the prediction model and inherits the advantages of the DNILMF model, which can predict under-studied drug-target interactions. Firstly, a knowledge graph embedding model based on relational rotation (RotatE) is trained to construct the interaction adjacency matrix and integrate prior knowledge. Secondly, a dual-network integrated logistic matrix factorization prediction model (DNILMF) is used to predict new drugs and targets. Finally, several experiments conducted on public datasets demonstrate that the proposed method outperforms the single baseline model and some mainstream methods in terms of efficiency.


Subject(s)
Drug Repositioning , Pattern Recognition, Automated , Algorithms , Drug Delivery Systems , Drug Interactions , Logistic Models
16.
Comput Biol Chem ; 100: 107730, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35945150

ABSTRACT

To make Knowledge Graphs (KGs) easier to manipulate, knowledge graph embedding (KGE) has been proposed and is widely used. However, the relations between entities are usually incomplete due to the performance problems of knowledge extraction methods, which also leads to the sparsity of KGs and makes it difficult for KGE methods to obtain reliable representations. Related research has not paid much attention to this challenge in the biomedicine field and has not sufficiently integrated domain knowledge into KGE methods. To alleviate this problem, we incorporate the molecular structure information of the entity into KGE. Specifically, we adopt two strategies to obtain the vector representations of the entities: text-structure-based and graph-structure-based. Then, we concatenate the two as the input of the KGE models. To validate our model, we construct a KCCR knowledge graph and validate the model's superiority in entity prediction, relation prediction, and drug-drug interaction prediction tasks. To the best of our knowledge, this is the first time that molecular structure information has been integrated into KGE methods. It is worth noting that researchers can try to improve KGE-based work by fusing other feature annotations such as Gene Ontology and protein structure.


Subject(s)
Pattern Recognition, Automated , Semantics , Gene Ontology , Knowledge , Molecular Structure
17.
J Biomed Inform ; 126: 103983, 2022 02.
Article in English | MEDLINE | ID: mdl-34990838

ABSTRACT

OBJECTIVE: This paper aims to propose knowledge-aware embedding, a critical tool for medical term normalization. METHODS: We develop CODER (Cross-lingual knowledge-infused medical term embedding) via contrastive learning based on a medical knowledge graph (KG) named the Unified Medical Language System, and similarities are calculated utilizing both terms and relation triplets from the KG. Training with relations injects medical knowledge into embeddings and can potentially improve their performance as machine learning features. RESULTS: We evaluate CODER based on zero-shot term normalization, semantic similarity, and relation classification benchmarks, and the results show that CODER outperforms various state-of-the-art biomedical word embeddings, concept embeddings, and contextual embeddings. CONCLUSION: CODER embeddings excellently reflect semantic similarity and relatedness of medical concepts. One can use CODER for embedding-based medical term normalization or to provide features for machine learning. Similar to other pretrained language models, CODER can also be fine-tuned for specific tasks. Codes and models are available at https://github.com/GanjinZero/CODER.


Subject(s)
Natural Language Processing , Unified Medical Language System , Language , Machine Learning , Semantics
18.
J Biomed Inform ; 124: 103955, 2021 12.
Article in English | MEDLINE | ID: mdl-34800722

ABSTRACT

The enormous hope placed in the efficacy of vaccines recently became a successful reality in the fight against the COVID-19 pandemic. However, vaccine hesitancy, fueled by exposure to social media misinformation about COVID-19 vaccines, became a major hurdle. Therefore, it is essential to automatically detect where misinformation about COVID-19 vaccines is spread on social media and what kind of misinformation is discussed, so that inoculation interventions can be delivered at the right time and in the right place, in addition to interventions designed to address vaccine hesitancy. This paper addresses the first step in tackling hesitancy against COVID-19 vaccines, namely the automatic detection of known misinformation about the vaccines on Twitter, the social media platform that has the highest volume of conversations about COVID-19 and its vaccines. We present CoVaxLies, a new dataset of tweets judged relevant to several misinformation targets about COVID-19 vaccines, on which a novel method of detecting misinformation was developed. Our method organizes CoVaxLies in a Misinformation Knowledge Graph and casts misinformation detection as a graph link prediction problem. The misinformation detection method detailed in this paper takes advantage of the link scoring functions provided by several knowledge embedding methods. The experimental results demonstrate the superiority of this method over the classification-based methods that are currently widely used.
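Casting misinformation detection as link prediction can be sketched as scoring a candidate link between a tweet and each misinformation target, then keeping the links whose score clears a threshold. The abstract says only that link scoring functions from several knowledge embedding methods are used, so the TransE-style L1 score, the threshold rule, and the names `link_score` and `detect_misinformation` below are assumptions for illustration.

```python
def link_score(tweet_vec, relation_vec, target_vec):
    """TransE-style L1 link score (an assumed scoring function).

    Higher (closer to zero) means the tweet is more plausibly linked
    to the misinformation target.
    """
    return -sum(abs(t + r - m) for t, r, m in zip(tweet_vec, relation_vec, target_vec))


def detect_misinformation(tweet_vec, relation_vec, targets, threshold):
    """Return the names of targets whose link score reaches the threshold.

    targets maps a misinformation target name to its embedding.
    """
    return [
        name
        for name, target_vec in targets.items()
        if link_score(tweet_vec, relation_vec, target_vec) >= threshold
    ]
```

Unlike a classifier with one output per class, this formulation scores each tweet-target pair independently, so a tweet may be linked to several targets or to none.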


Subject(s)
COVID-19 , Social Media , COVID-19 Vaccines , Communication , Humans , Pandemics , SARS-CoV-2 , Vaccination Hesitancy
19.
Front Res Metr Anal ; 6: 670206, 2021.
Article in English | MEDLINE | ID: mdl-34278204

ABSTRACT

We deal with a heterogeneous pharmaceutical knowledge graph, built from several databases, that contains textual information. The knowledge graph includes a wide variety of concepts and attributes, some of which are provided as pieces of text that have not been targeted in conventional graph completion tasks. To investigate the utility of textual information for knowledge graph completion, we generate embeddings from the textual descriptions attached to heterogeneous items, such as drugs and proteins, while learning knowledge graph embeddings. We evaluate the obtained graph embeddings on the link prediction task for knowledge graph completion, which can be used for drug discovery and repurposing. We also compare the results with existing methods and discuss the utility of the textual information.

20.
Front Neurorobot ; 15: 674428, 2021.
Article in English | MEDLINE | ID: mdl-34045950

ABSTRACT

With the rapid development of artificial intelligence, cybernetics, and other high-tech disciplines, robots have been built and used in a growing number of fields, and studies on robots have attracted increasing research interest from different communities. The knowledge graph can act as the brain of a robot and provide intelligence to support the interaction between the robot and human beings. Although large-scale knowledge graphs contain a large amount of information, they are still incomplete compared with real-world knowledge. Most existing methods for knowledge graph completion focus on entity representation learning. However, the importance of relation representation learning is ignored, as is the cross-interaction between entities and relations. In this paper, we propose an encoder-decoder model that embeds the interaction between entities and relations and adds a gate mechanism to control the attention mechanism. Experimental results show that our method achieves better link prediction performance than state-of-the-art embedding models on two benchmark datasets, WN18RR and FB15k-237.
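The abstract does not specify how the gate controls the attention mechanism. One common pattern, shown here purely as an assumed illustration rather than the paper's architecture, is an element-wise sigmoid gate that blends the attention output with the original entity vector. The names `gated_fusion`, `gate_weights`, and `gate_bias` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(entity_vec, attended_vec, gate_weights, gate_bias):
    """Blend an entity vector with its attention output, element by element.

    For each dimension, the gate g = sigmoid(w * (e + a) + b) decides how much
    of the attended value a is kept versus the original value e:
        out = g * a + (1 - g) * e
    """
    fused = []
    for e, a, w, b in zip(entity_vec, attended_vec, gate_weights, gate_bias):
        g = sigmoid(w * (e + a) + b)
        fused.append(g * a + (1.0 - g) * e)
    return fused
```

With zero weights and biases every gate equals 0.5, so the output is the plain average of the two inputs; learned parameters would let the model shift that balance per dimension.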
