Results 1 - 20 of 1,008
1.
Int J Digit Libr ; 25(2): 273-285, 2024.
Article in English | MEDLINE | ID: mdl-38948004

ABSTRACT

Due to the growing number of scholarly publications, finding relevant articles becomes increasingly difficult. Scholarly knowledge graphs can be used to organize the scholarly knowledge presented within those publications and represent it in machine-readable formats. Natural language processing (NLP) provides scalable methods to automatically extract knowledge from articles and populate scholarly knowledge graphs. However, NLP extraction is generally not sufficiently accurate and, thus, fails to generate fine-grained, high-quality data. In this work, we present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. TinyGenius is employed to populate a paper-centric knowledge graph, using five distinct NLP methods. We extend our previous work on the TinyGenius methodology in various ways. Specifically, we discuss the NLP tasks in more detail and include an explanation of the data model. Moreover, we present a user evaluation where participants validate the generated NLP statements. The results indicate that employing microtasks for statement validation is a promising approach despite the varying participant agreement for different microtasks.

2.
Clin Ther ; 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38981792

ABSTRACT

PURPOSE: To critically assess the role and added value of knowledge graphs in pharmacovigilance, focusing on their ability to predict adverse drug reactions. METHODS: A systematic scoping review was conducted in which detailed information, including objectives, technology, data sources, methodology, and performance metrics, were extracted from a set of peer-reviewed publications reporting the use of knowledge graphs to support pharmacovigilance signal detection. FINDINGS: The review, which included 47 peer-reviewed articles, found knowledge graphs were utilized for detecting/predicting single-drug adverse reactions and drug-drug interactions, with variable reported performance and sparse comparisons to legacy methods. IMPLICATIONS: Research to date suggests that knowledge graphs have the potential to augment predictive signal detection in pharmacovigilance, but further research using more reliable reference sets of adverse drug reactions and comparison with legacy pharmacovigilance methods are needed to more clearly define best practices and to establish their place in holistic pharmacovigilance systems.

3.
Anal Bioanal Chem ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38990360

ABSTRACT

Because of their pathological indications and physiological functions, bile acids (BAs) have been a research hotspot in recent decades. Although extensive efforts have been devoted to characterizing the BA sub-metabolome, the profile of one subfamily, BA glucuronides (gluA-BAs), has seldom been addressed. Here, we developed an LC-MS/MS program enabling quantitative characterization of the gluA-BAs sub-metabolome and explored the differential species in serum between intrahepatic cholestasis of pregnancy (ICP) patients and healthy subjects. To obtain as many authentic gluA-BAs as possible, liver microsomes from humans, rats, and mice were used to conjugate a glucuronyl group to authentic BAs through in vitro incubation. Eighty gluA-BAs were captured and subsequently served as authentic compounds to correlate MS/MS spectral behaviors with structural features using a squared energy-resolved MS program. The optimal collision energy (OCE) of [M-H]⁻ > [M-H-176.1]⁻ was jointly governed by the [M-H]⁻ mass and the glucuronidation site, and identical exciting energies corresponding to a 50% survival rate of the 1st-generation fragment ion (EE50) were observed only when the aglycone of a gluA-BA was consistent with the suspected structure. By integrating high-resolution m/z, OCE, and EE50 information to identify gluA-BAs in a BAs pool, 97 gluA-BAs were found and identified, and a quantitative program was then built for all annotated gluA-BAs by assigning OCEs to the [M-H]⁻ > [M-H-176.1]⁻ ion transitions. The quantitative gluA-BAs sub-metabolome of the ICP group differed from that of the healthy group: GCDCA-3-G, GDCA-3-G, TCDCA-7-G, TDCA-3-G, and T-β-MCA-3-G were more abundant in the ICP group. Overall, this study not only offers a promising analytical tool for in-depth gluA-BAs sub-metabolome characterization, but also identifies gluA-BAs that allow the differentiation of ICP patients from healthy subjects.

4.
Int J Epidemiol ; 53(4)2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38996447

ABSTRACT

BACKGROUND: Empirical evaluation of inverse probability weighting (IPW) for self-selection bias correction is inaccessible without the full source population. We aimed to: (i) investigate how self-selection biases frequency and association measures and (ii) assess self-selection bias correction using IPW in a cohort with register linkage. METHODS: The source population included 17 936 individuals invited to the Copenhagen Aging and Midlife Biobank during 2009-11 (ages 49-63 years). Participants counted 7185 (40.1%). Register data were obtained for every invited person from 7 years before invitation to the end of 2020. The association between education and mortality was estimated using Cox regression models among participants, IPW participants and the source population. RESULTS: Participants had higher socioeconomic position and fewer hospital contacts before baseline than the source population. Frequency measures of participants approached those of the source population after IPW. Compared with primary/lower secondary education, upper secondary, short tertiary, bachelor and master/doctoral were associated with reduced risk of death among participants (adjusted hazard ratio [95% CI]: 0.60 [0.46; 0.77], 0.68 [0.42; 1.11], 0.37 [0.25; 0.54], 0.28 [0.18; 0.46], respectively). IPW changed the estimates marginally (0.59 [0.45; 0.77], 0.57 [0.34; 0.93], 0.34 [0.23; 0.50], 0.24 [0.15; 0.39]) but not only towards those of the source population (0.57 [0.51; 0.64], 0.43 [0.32; 0.60], 0.38 [0.32; 0.47], 0.22 [0.16; 0.29]). CONCLUSIONS: Frequency measures of study participants may not reflect the source population in the presence of self-selection, but the impact on association measures can be limited. IPW may be useful for (self-)selection bias correction, but the returned results can still reflect residual or other biases and random errors.


Subject(s)
Mortality , Proportional Hazards Models , Socioeconomic Factors , Humans , Female , Male , Middle Aged , Denmark/epidemiology , Mortality/trends , Selection Bias , Educational Status , Probability , Registries
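
The inverse probability weighting (IPW) idea in the abstract above can be illustrated with a minimal sketch. It assumes a single categorical covariate and estimates the participation probability per stratum as the observed participation fraction; the study itself does not state its weighting model here, and all names and data below (`ipw_weights`, `strata`, `outcome`) are hypothetical toy values, not the cohort data.

```python
from collections import Counter


def ipw_weights(strata, participated):
    """Return one weight per participant: 1 / P(participation | stratum).

    The probability is estimated as (participants in stratum) / (invited in
    stratum), so the weight simplifies to invited / participants.
    """
    invited = Counter(strata)
    joined = Counter(s for s, p in zip(strata, participated) if p)
    return [invited[s] / joined[s] for s, p in zip(strata, participated) if p]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical source population of 10 invited people in two strata.
strata = ["low"] * 6 + ["high"] * 4
participated = [True, False, False, True, False, False, True, True, True, False]
outcome = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]  # toy binary outcome, known for everyone

weights = ipw_weights(strata, participated)
observed = [y for y, p in zip(outcome, participated) if p]

naive = sum(observed) / len(observed)        # participants only, unweighted
weighted = weighted_mean(observed, weights)  # participants, IPW-weighted
source = sum(outcome) / len(outcome)         # full source population

print(f"naive={naive:.2f} weighted={weighted:.2f} source={source:.2f}")
```

In this toy data the weighted frequency matches the source population only because participation within each stratum is unrelated to the outcome. When that assumption fails, weighting leaves residual bias, which is consistent with the caution in the abstract's conclusions.
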
5.
Sci Rep ; 14(1): 16587, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-39025897

ABSTRACT

Drug repurposing aims to find new therapeutic applications for existing drugs in the pharmaceutical market, leading to significant savings in time and cost. The use of artificial intelligence and knowledge graphs to propose repurposing candidates facilitates the process, as large amounts of data can be processed. However, it is important to pay attention to the explainability needed to validate the predictions. We propose a general architecture to understand several explainable methods for graph completion based on knowledge graphs and design our own architecture for drug repurposing. We present XG4Repo (eXplainable Graphs for Repurposing), a framework that takes advantage of the connectivity of any biomedical knowledge graph to link compounds to the diseases they can treat. Our method allows metapaths of different types and lengths, which are automatically generated and optimised based on data. XG4Repo focuses on providing meaningful explanations for the predictions, which are based on paths from compounds to diseases. These paths include nodes such as genes, pathways, side effects, or anatomies, so they provide information about the targets and other characteristics of the biomedical mechanisms that link compounds and diseases. Paths make predictions interpretable for experts, who can validate them and use them in further research on drug repurposing. We also describe three use cases where we analyse new uses for Epirubicin, Paclitaxel, and Prednisone and present the paths that support the predictions.


Subject(s)
Drug Repositioning , Drug Repositioning/methods , Humans , Artificial Intelligence , Algorithms

6.
Heliyon ; 10(13): e33400, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39044974

ABSTRACT

This work proves the local vertex anti-magic coloring of even regular circulant bipartite graphs $C(m; L)$. Let $G$ be either $K_{r,r}$ or $K_{r,r} - F$, where $F$ is a 1-factor. We also determine the local vertex anti-magic coloring for the union of bipartite graphs; for join graphs $G \vee H$, where $H \in \{O_r, K_r, C_r, K_{r,s}\}$; and the upper bound for the corona product $G \odot O_r$. Lau and Shiu (2023) [1] posed the following problem: for any $G_1$ and $G_2$, determine $\chi_{\ell va}(G_1 \times G_2)$. We give a partial answer to this problem by determining the following: 1. $\chi_{\ell va}(C_{2m} \times C_{2n})$; 2. $\chi_{\ell va}(C_{2m+1} \times C_{2n+2})$; and 3. $\chi_{\ell va}(P_3 \times H)$, where $H \in \{K_r, K_{m,m}\}$.

7.
Genetics ; 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39013109

ABSTRACT

As a result of recombination, adjacent nucleotides can have different paths of genetic inheritance and therefore the genealogical trees for a sample of DNA sequences vary along the genome. The structure capturing the details of these intricately interwoven paths of inheritance is referred to as an ancestral recombination graph (ARG). Classical formalisms have focused on mapping coalescence and recombination events to the nodes in an ARG. However, this approach is out of step with some modern developments, which do not represent genetic inheritance in terms of these events or explicitly infer them. We present a simple formalism that defines an ARG in terms of specific genomes and their intervals of genetic inheritance, and show how it generalizes these classical treatments and encompasses the outputs of recent methods. We discuss nuances arising from this more general structure, and argue that it forms an appropriate basis for a software standard in this rapidly growing field.

8.
Article in English | MEDLINE | ID: mdl-38946554

ABSTRACT

BACKGROUND: Acute hepatic porphyria (AHP) is a group of rare but treatable conditions associated with diagnostic delays of 15 years on average. The advent of electronic health records (EHR) data and machine learning (ML) may improve the timely recognition of rare diseases like AHP. However, prediction models can be difficult to train given the limited case numbers, unstructured EHR data, and selection biases intrinsic to healthcare delivery. We sought to train and characterize models for identifying patients with AHP. METHODS: This diagnostic study used structured and notes-based EHR data from 2 centers at the University of California, UCSF (2012-2022) and UCLA (2019-2022). The data were split into 2 cohorts (referral and diagnosis) and used to develop models that predict (1) who will be referred for testing of acute porphyria, among those who presented with abdominal pain (a cardinal symptom of AHP), and (2) who will test positive, among those referred. The referral cohort consisted of 747 patients referred for testing and 99 849 contemporaneous patients who were not. The diagnosis cohort consisted of 72 confirmed AHP cases and 347 patients who tested negative. The case cohort was 81% female and 6-75 years old at the time of diagnosis. Candidate models used a range of architectures. Feature selection was semi-automated and incorporated publicly available data from knowledge graphs. Our primary outcome was the F-score on an outcome-stratified test set. RESULTS: The best center-specific referral models achieved an F-score of 86%-91%. The best diagnosis model achieved an F-score of 92%. To further test our model, we contacted 372 current patients who lack an AHP diagnosis but were predicted by our models as potentially having it (≥10% probability of referral, ≥50% of testing positive). However, we were only able to recruit 10 of these patients for biochemical testing, all of whom were negative. Nonetheless, post hoc evaluations suggested that these models could identify 71% of cases earlier than their diagnosis date, saving 1.2 years. CONCLUSIONS: ML can reduce diagnostic delays in AHP and other rare diseases. Robust recruitment strategies and multicenter coordination will be needed to validate these models before they can be deployed.
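
The F-score used as the primary outcome above can be sketched as follows. The abstract does not say which variant was used, so this assumes the common F1 form (beta = 1); the function name `f_score` and the confusion-matrix counts in the usage line are hypothetical and are not taken from the study.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives.

    beta=1.0 gives the harmonic mean of precision and recall (F1).
    Returns 0.0 when precision and recall are both zero or undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only.
print(f"F1 = {f_score(tp=8, fp=2, fn=2):.2f}")
```

Because the score ignores true negatives, it stays informative when positives are rare, as in the referral cohort described above (747 referred versus 99 849 not referred).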

9.
Neural Netw ; 179: 106516, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-39003981

ABSTRACT

Temporal Knowledge Graphs (TKGs) enable effective modeling of knowledge dynamics and event evolution, facilitating deeper insights and analysis into temporal information. Recently, extrapolation of TKG reasoning has attracted great attention due to its remarkable ability to capture historical correlations and predict future events. Existing studies of extrapolation aim mainly at encoding the structural and temporal semantics based on snapshot sequences, which contain graph aggregators for the association within snapshots and recurrent units for the evolution. However, these methods are limited in modeling long-distance history, as they primarily focus on capturing temporal correlations over shorter periods. In addition, a few approaches rely on compiling historical repetitive statistics of TKGs for predicting future facts, but they often overlook explicit interactions in the graph structure among concurrent events. To address these issues, we propose a PotentiaL concurrEnt Aggregation and contraStive learnING (PLEASING) method for TKG extrapolation. PLEASING is a two-step reasoning framework that effectively leverages the historical and potential features of TKGs. It includes two encoders for historical and global events with an adaptive gated mechanism, acquiring predictions with an appropriate weighting of the two aspects. Specifically, PLEASING constructs two auxiliary graphs to capture temporal interaction among timestamps and correlations among potential concurrent events, respectively, enabling a holistic investigation of temporal characteristics and future possibilities in TKGs. Furthermore, PLEASING incorporates contrastive learning to strengthen its capacity to identify whether queries are related to history. Extensive experiments on seven benchmark datasets demonstrate the state-of-the-art performance of PLEASING and its comprehensive ability to model TKG semantics.

10.
Heliyon ; 10(13): e33833, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39050435

ABSTRACT

Major depressive disorder (MDD) is a debilitating mental health condition that poses significant risks and burdens. Resting-state functional magnetic resonance imaging (fMRI) has emerged as a promising tool in investigating the neural mechanisms underlying MDD. However, a comprehensive bibliometric analysis of resting-state fMRI in MDD is currently lacking. Here, we aimed to thoroughly explore the trends and frontiers of resting-state fMRI in MDD research. The relevant publications were retrieved from the Web of Science database for the period between 1998 and 2022, and the CiteSpace software was employed to identify the influence of authors, institutions, countries/regions, and the latest research trends. A total of 1501 publications met the search criteria, revealing a gradual increase in the number of annual publications over the years. China contributed the largest publication output, accounting for the highest percentage among all countries. In particular, the University of Electronic Science and Technology of China, Capital Medical University, and Harvard Medical School were identified as key institutions that have made substantial contributions to this growth. Neuroimage, Biological Psychiatry, Journal of Affective Disorders, and Proceedings of the National Academy of Sciences of the United States of America are among the influential journals in the field of resting-state fMRI research in MDD. Burst keyword analysis suggests that the emerging research frontiers in this field are characterized by prominent keywords such as dynamic functional connectivity, cognitive control network, transcranial brain stimulation, and childhood trauma. Overall, our study provides a systematic overview of the historical development, current status, and future trends of resting-state fMRI in MDD, thus offering a useful guide for researchers to plan their future research.

11.
Environ Res Lett ; 19(7): 074069, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39070017

ABSTRACT

The global health burden associated with exposure to heat is a grave concern and is projected to further increase under climate change. While physiological studies have demonstrated the role of humidity alongside temperature in exacerbating heat stress for humans, epidemiological findings remain conflicted. Understanding the intricate relationships between heat, humidity, and health outcomes is crucial to inform adaptation and drive increased global climate change mitigation efforts. This article introduces 'directed acyclic graphs' (DAGs) as causal models to elucidate the analytical complexity in observational epidemiological studies that focus on humid-heat-related health impacts. DAGs are employed to delineate implicit assumptions often overlooked in such studies, depicting humidity as a confounder, mediator, or an effect modifier. We also discuss complexities arising from using composite indices, such as wet-bulb temperature. DAGs representing the health impacts associated with wet-bulb temperature help to understand the limitations in separating the individual effect of humidity from the perceived effect of wet-bulb temperature on health. General examples for regression models corresponding to each of the causal assumptions are also discussed. Our goal is not to prioritize one causal model but to discuss the causal models suitable for representing humid-heat health impacts and highlight the implications of selecting one model over another. We anticipate that the article will pave the way for future quantitative studies on the topic and motivate researchers to explicitly characterize the assumptions underlying their models with DAGs, facilitating accurate interpretations of the findings. This methodology is applicable to similarly complex compound events.

12.
J Med Internet Res ; 26: e54263, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38968598

ABSTRACT

BACKGROUND: The medical knowledge graph provides explainable decision support, helping clinicians with prompt diagnosis and treatment suggestions. However, in real-world clinical practice, patients visit different hospitals seeking various medical services, resulting in fragmented patient data across hospitals. With data security issues, data fragmentation limits the application of knowledge graphs because single-hospital data cannot provide complete evidence for generating precise decision support and comprehensive explanations. It is important to study new methods for knowledge graph systems to integrate into multicenter, information-sensitive medical environments, using fragmented patient records for decision support while maintaining data privacy and security. OBJECTIVE: This study aims to propose an electronic health record (EHR)-oriented knowledge graph system for collaborative reasoning with multicenter fragmented patient medical data, all the while preserving data privacy. METHODS: The study introduced an EHR knowledge graph framework and a novel collaborative reasoning process for utilizing multicenter fragmented information. The system was deployed in each hospital and used a unified semantic structure and Observational Medical Outcomes Partnership (OMOP) vocabulary to standardize the local EHR data set. The system transforms local EHR data into semantic formats and performs semantic reasoning to generate intermediate reasoning findings. The generated intermediate findings used hypernym concepts to isolate original medical data. The intermediate findings and hash-encrypted patient identities were synchronized through a blockchain network. The multicenter intermediate findings were collaborated for final reasoning and clinical decision support without gathering original EHR data. 
RESULTS: The system underwent evaluation through an application study involving the utilization of multicenter fragmented EHR data to alert non-nephrology clinicians about overlooked patients with chronic kidney disease (CKD). The study covered 1185 patients in nonnephrology departments from 3 hospitals. The patients visited at least two of the hospitals. Of these, 124 patients were identified as meeting CKD diagnosis criteria through collaborative reasoning using multicenter EHR data, whereas the data from individual hospitals alone could not facilitate the identification of CKD in these patients. The assessment by clinicians indicated that 78/91 (86%) patients were CKD positive. CONCLUSIONS: The proposed system was able to effectively utilize multicenter fragmented EHR data for clinical application. The application study showed the clinical benefits of the system with prompt and comprehensive decision support.


Subject(s)
Clinical Decision Support Systems , Electronic Health Records , Humans
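
The step in the abstract above where intermediate findings and hash-encrypted patient identities are combined across hospitals can be sketched roughly as below. The abstract does not describe the actual hashing scheme or the blockchain synchronization, so everything here is an assumption for illustration: a keyed HMAC-SHA256 with a key shared by all sites, the identifier format, the finding labels, and the helper names `pseudonym` and `merge_findings`.

```python
import hashlib
import hmac
from collections import defaultdict

# Assumption: all hospitals agree on one secret key out of band.
SHARED_KEY = b"demo-key-agreed-by-all-hospitals"


def pseudonym(patient_id: str) -> str:
    """Derive a stable, non-reversible token from a patient identifier."""
    return hmac.new(SHARED_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def merge_findings(*hospital_reports):
    """Union the intermediate findings that share the same pseudonymous token."""
    merged = defaultdict(set)
    for report in hospital_reports:
        for token, findings in report.items():
            merged[token] |= set(findings)
    return dict(merged)


# Each site publishes only tokens and generalized (hypernym-level) findings,
# never the raw identifier or the original record. Labels are hypothetical.
hospital_a = {pseudonym("patient-001"): {"reduced kidney function marker"}}
hospital_b = {
    pseudonym("patient-001"): {"abnormal urine protein marker"},
    pseudonym("patient-002"): {"reduced kidney function marker"},
}

merged = merge_findings(hospital_a, hospital_b)
for token, findings in merged.items():
    print(token[:12], sorted(findings))
```

A keyed hash is used in this sketch rather than a plain hash so that someone without the key cannot recompute tokens from guessed identifiers. Whether the described system takes the same precaution is not stated in the abstract.
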
13.
Brain Inform ; 11(1): 14, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38833014

ABSTRACT

Depression is a serious mental illness that affects millions worldwide and consequently has attracted considerable research interest in recent years. Within the field of automated depression estimation, most researchers focus on neural network architectures while ignoring other research directions. Within this paper, we explore an alternate approach and study the impact of input representations on the learning ability of the models. In particular, we work with graph-based representations to highlight different aspects of input transcripts, both at the interview and corpus levels. We use sentence similarity graphs and keyword correlation graphs to exemplify the advantages of graphical representations over sequential models for binary classification problems within depression estimation. Additionally, we design multi-view architectures that split interview transcripts into question and answer views in order to take into account dialogue structure. Our experiments show the benefits of multi-view based graphical input encodings over sequential models and provide new state-of-the-art results for binary classification on the gold standard DAIC-WOZ dataset. Further analysis establishes our method as a means for generating meaningful insights and visual summaries of interview transcripts that can be used by medical professionals.

14.
Front Digit Health ; 6: 1416390, 2024.
Article in English | MEDLINE | ID: mdl-38846322

ABSTRACT

[This corrects the article DOI: 10.3389/fdgth.2023.1322428.].

15.
J Inflamm Res ; 17: 3709-3724, 2024.
Article in English | MEDLINE | ID: mdl-38882188

ABSTRACT

Purpose: Granulomatous mastitis (GLM) is a rare and complex chronic inflammatory disease of the breast with an unknown cause and a tendency to recur. As medical science advances, the cause, treatment strategies, and comprehensive management of GLM have increasingly attracted widespread attention. The aim of this study is to assess the development trends and research focal points in the GLM field over the past 24 years using bibliometric analysis. Methods: Using GLM, Granulomatous mastitis (GM), Idiopathic granulomatous lobular mastitis (IGLM), and Idiopathic granulomatous mastitis (IGM) as keywords, we retrieved publications related to GLM from 2000 to 2023 from the Web of Science, excluding articles irrelevant to this study. Citespace and VOSviewer were employed for data analysis and visualization. Results: A total of 347 publications were included in this analysis. Over the past 24 years, the number of publications has steadily increased, with Turkey being the leading contributor in terms of publications and citations. The University of Health Sciences, Istanbul University, and Istanbul University Cerrahpasa were the most influential institutions. The Breast Journal, Breast Care, and Journal of Investigative Surgery were the journals that published the most on this topic. The research primarily focused on the cause, differential diagnosis, treatment, and comprehensive management of GLM. Issues related to recurrence, hyperprolactinemia, and Corynebacterium emerged as current research hotspots. Conclusion: Our bibliometric study outlines the historical development of the GLM field and identifies recent research focuses and trends, which may aid researchers in identifying research hotspots and directions, thereby advancing the study of GLM.

16.
Acta Psychiatr Scand ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886846

ABSTRACT

BACKGROUND: Knowledge graphs (KGs) remain an underutilized tool in the field of psychiatric research. In the broader biomedical field KGs are already a significant tool mainly used as knowledge database or for novel relation detection between biomedical entities. This review aims to outline how KGs would further research in the field of psychiatry in the age of Artificial Intelligence (AI) and Large Language Models (LLMs). METHODS: We conducted a thorough literature review across a spectrum of scientific fields ranging from computer science and knowledge engineering to bioinformatics. The literature reviewed was taken from PubMed, Semantic Scholar and Google Scholar searches including terms such as "Psychiatric Knowledge Graphs", "Biomedical Knowledge Graphs", "Knowledge Graph Machine Learning Applications", "Knowledge Graph Applications for Biomedical Sciences". The resulting publications were then assessed and accumulated in this review regarding their possible relevance to future psychiatric applications. RESULTS: A multitude of papers and applications of KGs in associated research fields that are yet to be utilized in psychiatric research was found and outlined in this review. We create a thorough recommendation for other computational researchers regarding use-cases of these KG applications in psychiatry. CONCLUSION: This review illustrates use-cases of KG-based research applications in biomedicine and beyond that may aid in elucidating the complex biology of psychiatric illness and open new routes for developing innovative interventions. We conclude that there is a wealth of opportunities for KG utilization in psychiatric research across a variety of application areas including biomarker discovery, patient stratification and personalized medicine approaches.

17.
Entropy (Basel) ; 26(6)2024 May 23.
Article in English | MEDLINE | ID: mdl-38920449

ABSTRACT

The causal structure of a system imposes constraints on the joint probability distribution of variables that can be generated by the system. Archetypal constraints consist of conditional independencies between variables. However, particularly in the presence of hidden variables, many causal structures are compatible with the same set of independencies inferred from the marginal distributions of observed variables. Additional constraints allow further testing for the compatibility of data with specific causal structures. An existing family of causally informative inequalities compares the information about a set of target variables contained in a collection of variables, with a sum of the information contained in different groups defined as subsets of that collection. While procedures to identify the form of these groups-decomposition inequalities have been previously derived, we substantially enlarge the applicability of the framework. We derive groups-decomposition inequalities subject to weaker independence conditions, with weaker requirements in the configuration of the groups, and additionally allowing for conditioning sets. Furthermore, we show how constraints with higher inferential power may be derived with collections that include hidden variables, and then converted into testable constraints using data processing inequalities. For this purpose, we apply the standard data processing inequality of conditional mutual information and derive an analogous property for a measure of conditional unique information recently introduced to separate redundant, synergistic, and unique contributions to the information that a set of variables has about a target.

18.
Int J Mol Sci ; 25(12)2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38928289

ABSTRACT

Graph Neural Networks have proven to be very valuable models for the solution of a wide variety of problems on molecular graphs, as well as in many other research fields involving graph-structured data. Molecules are heterogeneous graphs composed of atoms of different species. Composite graph neural networks process heterogeneous graphs with multiple-state-updating networks, each one dedicated to a particular node type. This approach allows for the extraction of information from a graph more efficiently than standard graph neural networks that distinguish node types through a one-hot encoded type vector. We carried out extensive experimentation on eight molecular graph datasets and on a large number of both classification and regression tasks. The results we obtained clearly show that composite graph neural networks are far more efficient in this setting than standard graph neural networks.


Subject(s)
Neural Networks (Computer) , Algorithms

19.
Neural Netw ; 178: 106468, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38943862

ABSTRACT

Knowledge graph reasoning, vital for addressing incompleteness and supporting applications, faces challenges with the continuous growth of graphs. To address this challenge, several inductive reasoning models for encoding emerging entities have been proposed. However, they do not consider the multi-batch emergence scenario, where new entities and new facts are usually added to knowledge graphs (KGs) in multiple batches in the order of their emergence. To simulate the continuous growth of knowledge graphs, a novel multi-batch emergence (MBE) scenario has recently been proposed. We propose a path-based inductive model to handle multi-batch entity growth, enhancing entity encoding with type information. Specifically, we observe a noteworthy pattern in which entity types at the head and tail of the same relation exhibit relative regularity. To utilize this regularity, we introduce a pair of learnable parameters for each relation, representing entity type features linked to the relation. The type features are dedicated to encoding and updating the features of entities. Meanwhile, our model incorporates a novel attention mechanism that combines statistical co-occurrence and semantic similarity of relations to capture contextual information effectively. After generating embeddings, we employ reinforcement learning for path reasoning. To reduce sparsity and expand the action space, our model generates soft candidate facts by grounding a set of soft path rules. We also incorporate the confidence scores of these facts in the action space to help the agent better distinguish between original facts and rule-generated soft facts. Results on three multi-batch entity growth datasets demonstrate robust performance, with our model consistently outperforming state-of-the-art models.

20.
Comput Biol Med ; 178: 108768, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38936076

ABSTRACT

Biomedical knowledge graphs (KGs) serve as comprehensive data repositories that contain rich information about nodes and edges, providing modeling capabilities for complex relationships among biological entities. Many approaches either learn node features through traditional machine learning methods, or leverage graph neural networks (GNNs) to directly learn features of target nodes in the biomedical KGs and utilize them for downstream tasks. Motivated by the pre-training technique in natural language processing (NLP), we propose a framework named PT-KGNN (Pre-Training the biomedical KG with GNNs) to learn embeddings of nodes in a broader context by applying GNNs on the biomedical KG. We design several experiments to evaluate the effectiveness of our proposed framework and the impact of the scale of KGs. The results of tasks consistently improve as the scale of the biomedical KG used for pre-training increases. Pre-training on large-scale biomedical KGs significantly enhances the drug-drug interaction (DDI) and drug-disease association (DDA) prediction performance on the independent dataset. The embeddings derived from a larger biomedical KG have demonstrated superior performance compared to those obtained from a smaller KG. By applying pre-training techniques on biomedical KGs, rich semantic and structural information can be learned, leading to enhanced performance on downstream tasks. It is evident that pre-training techniques hold tremendous potential and wide-ranging applications in bioinformatics.


Subject(s)
Neural Networks (Computer) , Humans , Natural Language Processing , Machine Learning , Computational Biology/methods , Drug Interactions