1.
Sci Data ; 11(1): 363, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605048

ABSTRACT

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.


Subjects
Biological Science Disciplines , Knowledge Bases , Pattern Recognition, Automated , Algorithms , Translational Research, Biomedical
2.
PLoS One ; 19(3): e0297044, 2024.
Article in English | MEDLINE | ID: mdl-38478525

ABSTRACT

This study examines the relationship between CEO career variety, digital knowledge base extension, and digital transformation in a digital M&A context. An empirical test was conducted using regression analysis with the digital M&A events of the new generation of information technology firms in China as the research sample. The results reveal that CEO career variety has a positive effect on digital transformation in the digital M&A context and that digital knowledge-base extension plays a mediating role. Moreover, the heterogeneity impact analysis indicated that the moderating effects of geographical distance, knowledge disparity, and cultural difference between target and acquirer firms on the above relationships vary greatly: geographical distance has a negative moderating effect, cultural difference has a positive moderating effect, and the moderating effects of both geographical distance and cultural difference are realized through mediating effects, but none of the moderating effects of knowledge disparity are significant.


Subjects
Cultural Evolution , Information Technology , Information Science , China , Knowledge Bases
3.
Artif Intell Med ; 149: 102812, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38462270

ABSTRACT

Mental and physical disorders (MPD) are inextricably linked in many medical cases; psychosomatic diseases can be induced by mental concerns and psychological discomfort can ensue from physiological diseases. However, existing medical informatics studies focus on identifying mental or physical disorders from a unilateral perspective. Consequently, no existing domain knowledge base, corpus, or detection modeling approach considers mental as well as physical aspects concurrently. This paper proposes a joint modeling approach to detect MPD. First, we crawl through online medical consultation records of patients from websites and build an MPD knowledge ontology by extracting the core conceptual features of the text. Based on the ontology, an MPD knowledge graph containing 12,673 nodes and 82,195 relations is obtained using term matching with a domain thesaurus of each concept. Subsequently, an MPD corpus with fine-grained severities (None, Mild, Moderate, Severe, Dangerous) and 8909 records is constructed by formulating MPD classification criteria and a data annotation process under the guidance of domain experts. Taking the knowledge graph and corpus as the dataset, we design a multi-task learning model to detect the MPD severity, in which a knowledge graph attention network (KGAT) is embedded to better extract knowledge features. Experiments are performed to demonstrate the effectiveness of our model. Furthermore, we employ ontology-based and centrality-based methods to discover additional potential inferred knowledge, which can be captured by KGAT so as to improve the prediction performance and interpretability of our model. Our dataset has been made publicly available, so it can be further used as a medical informatics reference in the fields of psychosomatic medicine, psychiatrics, physical co-morbidity, and so on.


Subjects
Mental Disorders , Psychiatry , Humans , Pattern Recognition, Automated , Learning , Mental Disorders/diagnosis , Knowledge Bases
4.
PLoS One ; 19(3): e0296864, 2024.
Article in English | MEDLINE | ID: mdl-38536833

ABSTRACT

The modeling of uncertain information is an open problem in ontology research and is a theoretical obstacle to creating a truly semantic web. Currently, ontologies often do not model uncertainty, so stochastic subject matter must either be normalized or rejected entirely. Because uncertainty is omnipresent in the real world, knowledge engineers are often faced with the dilemma of performing prohibitively labor-intensive research or running the risk of rejecting correct information and accepting incorrect information. It would be preferable if ontologies could explicitly model real-world uncertainty and incorporate it into reasoning. We present an ontology framework which is based on a seamless synthesis of description logic and probabilistic semantics. This synthesis is powered by a link between ontology assertions and random variables that allows for automated construction of a probability distribution suitable for inferencing. Furthermore, our approach defines how to represent stochastic, uncertain, or incomplete subject matter. Additionally, this paper describes how to fuse multiple conflicting ontologies into a single knowledge base that can be reasoned with using the methods of both description logic and probabilistic inferencing. This is accomplished by using probabilistic semantics to resolve conflicts between assertions, eliminating the need to delete potentially valid knowledge and perform consistency checks. In our framework, emergent inferences can be made from a fused ontology that were not present in any of the individual ontologies, producing novel insights in a given domain.


Subjects
Biological Ontologies , Semantics , Uncertainty , Bayes Theorem , Knowledge Bases , Logic
5.
J Chem Inf Model ; 64(6): 1868-1881, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38483449

ABSTRACT

The lengthy and expensive process of developing new drugs from scratch, coupled with a high failure rate, has prompted the emergence of drug repurposing/repositioning as a more efficient and cost-effective approach. This approach involves identifying new therapeutic applications for existing approved drugs, leveraging the extensive drug-related data already gathered. However, the diversity and heterogeneity of data, along with the limited availability of known drug-disease interactions, pose significant challenges to computational drug design. To address these challenges, this study introduces EKGDR, an end-to-end knowledge graph-based approach for computational drug repurposing. EKGDR utilizes the power of a drug knowledge graph, a comprehensive repository of drug-related information that encompasses known drug interactions and various categorization information, as well as structural molecular descriptors of drugs. EKGDR employs graph neural networks, a cutting-edge graph representation learning technique, to embed the drug knowledge graph (nodes and relations) in an end-to-end manner. By doing so, EKGDR can effectively learn the underlying causes (intents) behind drug-disease interactions and recursively aggregate and combine relational messages between nodes along different multihop neighborhood paths (relational paths). This process generates representations of disease and drug nodes, enabling EKGDR to predict the interaction probability for each drug-disease pair in an end-to-end manner. The obtained results demonstrate that EKGDR outperforms previous models in all three evaluation metrics: area under the receiver operating characteristic curve (AUROC = 0.9475), area under the precision-recall curve (AUPRC = 0.9490), and recall at the top-200 recommendations (Recall@200 = 0.8315). To further validate EKGDR's effectiveness, we evaluated the top-20 candidate drugs suggested for each of Alzheimer's and Parkinson's diseases.
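The Recall@200 metric mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (the fraction of known relevant items that appear in the top k of a ranking); the abstract does not say how EKGDR aggregates the metric across diseases, and the drug names below are hypothetical.

```python
def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of relevant items that appear in the top-k of a ranking."""
    if not relevant_items:
        raise ValueError("relevant_items must not be empty")
    top_k = set(ranked_items[:k])
    hits = sum(1 for item in relevant_items if item in top_k)
    return hits / len(relevant_items)


# Hypothetical ranking of candidate drugs for one disease (illustrative names only).
ranked = ["drug_a", "drug_b", "drug_c", "drug_d", "drug_e"]
known = {"drug_b", "drug_e"}

print(recall_at_k(ranked, known, k=3))  # one of the two known drugs is in the top 3
```

With k equal to the length of the ranking, every known drug is recovered and the value is 1.0.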


Subjects
Drug Repositioning , Pattern Recognition, Automated , Drug Repositioning/methods , Neural Networks, Computer , Knowledge Bases , Drug Interactions
6.
J Biomed Semantics ; 15(1): 1, 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38438913

ABSTRACT

The increasing number of articles on adverse interactions that may occur when specific foods are consumed with certain drugs makes it difficult to keep up with the latest findings. Conflicting information is available in the scientific literature and specialized knowledge bases because interactions are described in an unstructured or semi-structured format. The FIDEO ontology aims to integrate and represent information about food-drug interactions in a structured way. This article reports on the new version of this ontology in which more than 1700 interactions are integrated from two online resources: DrugBank and Hedrine. These food-drug interactions have been represented in FIDEO in the form of precompiled concepts, each of which specifies both the food and the drug involved. Additionally, competency questions that can be answered are reviewed, and avenues for further enrichment are discussed.


Subjects
Food-Drug Interactions , Knowledge Bases
7.
BMC Bioinformatics ; 25(1): 62, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326757

ABSTRACT

BACKGROUND: Recent developments in the domain of biomedical knowledge bases (KBs) open up new ways to exploit biomedical knowledge that is available in the form of KBs. Significant work has been done in the direction of biomedical KB creation and KB completion, specifically, those having gene-disease associations and other related entities. However, the use of such biomedical KBs in combination with patients' temporal clinical data still largely remains unexplored, but has the potential to immensely benefit medical diagnostic decision support systems. RESULTS: We propose two new algorithms, LOADDx and SCADDx, to combine a patient's gene expression data with gene-disease association and other related information available in the form of a KB, to assist personalized disease diagnosis. We have tested both of the algorithms on two KBs and on four real-world gene expression datasets of respiratory viral infection caused by Influenza-like viruses of 19 subtypes. We also compare the performance of proposed algorithms with that of five existing state-of-the-art machine learning algorithms (k-NN, Random Forest, XGBoost, Linear SVM, and SVM with RBF Kernel) using two validation approaches: LOOCV and a single internal validation set. Both SCADDx and LOADDx outperform the existing algorithms when evaluated with both validation approaches. SCADDx is able to detect infections with up to 100% accuracy in the cases of Datasets 2 and 3. Overall, SCADDx and LOADDx are able to detect an infection within 72 h of infection with 91.38% and 92.66% average accuracy respectively considering all four datasets, whereas XGBoost, which performed best among the existing machine learning algorithms, can detect the infection with only 86.43% accuracy on an average. 
CONCLUSIONS: We demonstrate how our novel idea of using the most and least differentially expressed genes in combination with a KB can enable identification of the diseases that a patient is most likely to have at a particular time, from a KB with thousands of diseases. Moreover, the proposed algorithms can provide a short ranked list of the most likely diseases for each patient along with their most affected genes, and other entities linked with them in the KB, which can support health care professionals in their decision-making.
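The idea of picking the most and least differentially expressed genes can be sketched as follows. This is not the LOADDx or SCADDx algorithm: the abstract does not define the ranking criterion, so the sketch assumes genes are ranked by absolute log fold change, and the gene names and values are hypothetical.

```python
def most_and_least_changed(log_fold_changes, k):
    """Return (k most changed genes, k least changed genes).

    Assumption: "differential expression" is measured by |log fold change|.
    """
    if k <= 0 or 2 * k > len(log_fold_changes):
        raise ValueError("k must be positive and at most half the number of genes")
    # Sort gene names from largest to smallest absolute change.
    ranked = sorted(log_fold_changes, key=lambda g: abs(log_fold_changes[g]), reverse=True)
    most = ranked[:k]
    least = ranked[-k:][::-1]  # smallest absolute change first
    return most, least


# Hypothetical per-gene log fold changes for one patient sample.
sample = {"G1": 2.5, "G2": -3.1, "G3": 0.1, "G4": -0.05, "G5": 1.2}
most, least = most_and_least_changed(sample, k=2)
print(most, least)
```

The two gene lists would then be matched against gene-disease associations in a KB; that step depends on the KB schema and is left out here.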


Subjects
Knowledge Bases , Transcriptome , Humans , Algorithms , Machine Learning
8.
Elife ; 12, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38345923

ABSTRACT

Hippocampome.org is a mature open-access knowledge base of the rodent hippocampal formation focusing on neuron types and their properties. Previously, Hippocampome.org v1.0 established a foundational classification system identifying 122 hippocampal neuron types based on their axonal and dendritic morphologies, main neurotransmitter, membrane biophysics, and molecular expression (Wheeler et al., 2015). Releases v1.1 through v1.12 furthered the aggregation of literature-mined data, including among others neuron counts, spiking patterns, synaptic physiology, in vivo firing phases, and connection probabilities. Those additional properties increased the online information content of this public resource over 100-fold, enabling numerous independent discoveries by the scientific community. Hippocampome.org v2.0, introduced here, besides incorporating over 50 new neuron types, now recenters its focus on extending the functionality to build real-scale, biologically detailed, data-driven computational simulations. In all cases, the freely downloadable model parameters are directly linked to the specific peer-reviewed empirical evidence from which they were derived. Possible research applications include quantitative, multiscale analyses of circuit connectivity and spiking neural network simulations of activity dynamics. These advances can help generate precise, experimentally testable hypotheses and shed light on the neural mechanisms underlying associative memory and spatial navigation.


Subjects
Hippocampus , Rodentia , Animals , Hippocampus/physiology , Neurons/physiology , Neural Networks, Computer , Knowledge Bases
9.
Comput Methods Programs Biomed ; 246: 108051, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38301394

ABSTRACT

BACKGROUND AND OBJECTIVE: Symptom descriptions by ordinary people are often inaccurate or vague when seeking medical advice, which often leads to inaccurate preliminary clinical diagnoses. To address this issue, we propose a deep learning model named the knowledgeable diagnostic transformer (KDT) for the natural language processing (NLP)-based preliminary clinical diagnoses. METHODS: The KDT extracts symptom-disease relation triples (h,r,t) from patient symptom descriptions by using a proposed bipartite medical knowledge graph (bMKG). To avoid too many relation triples causing the knowledge noise issue, we propose a knowledge inclusion-exclusion approach (KIA) to eliminate undesirable triples (a knowledge filtering layer). Next, we combine token embedding techniques with the transformer model to predict the diseases that patients may encounter. RESULTS: To train the KDT, a medical diagnosis question-answering dataset (named MDQA dataset) containing large-scale, high-quality questions (patient syndrome description) and answering (diagnosis) corpora with 2.6M entries (1.07GB in size) in Mandarin was built. We also train the KDT with the National Institutes of Health (NIH) English dataset (MedQuAD). The KDT marks a transformative approach by achieving a remarkable accuracy of 99% for different evaluation metrics when compared with the baseline transformers used for the NLP-based preliminary clinical diagnoses approaches. CONCLUSIONS: In essence, our study not only demonstrates the effectiveness of the KDT in enhancing diagnostic precision but also underscores its potential to revolutionize the field of preliminary clinical diagnoses. By harnessing the power of knowledge-based approaches and advanced NLP techniques, we have paved the way for more accurate and reliable diagnoses, ultimately benefiting both healthcare providers and patients. 
The KDT has the potential to significantly reduce misdiagnoses and improve patient outcomes, marking a pivotal advancement in the realm of medical diagnostics.


Subjects
Benchmarking , Natural Language Processing , Humans , Knowledge Bases , Language , Referral and Consultation , United States
10.
Artif Intell Med ; 148: 102748, 2024 02.
Article in English | MEDLINE | ID: mdl-38325935

ABSTRACT

Medical automatic diagnosis aims to organize real-world diagnostic processes similar to those from human doctors and to achieve accurate diagnoses by interacting with patients. The task is formulated as a sequential decision-making problem with a series of information inquiry steps (asking about symptoms and ordering examinations) and the final diagnosis. Recent research has studied incorporating reinforcement learning for information inquiry and classification techniques for disease diagnosis, respectively. However, studies on efficiently and effectively combining the two procedures are still lacking. To address this issue, we devised an adaptive mechanism to align reinforcement learning and classification methods using distribution entropy as the medium. Additionally, we created a new dataset for patient simulation to address the lack of large-scale evaluation benchmarks. The dataset is extracted from the MedlinePlus knowledge base and contains significantly more diseases and more comprehensive symptom and examination information than existing datasets. Experimental evaluation shows that our method outperforms three current state-of-the-art methods on different datasets by achieving higher medical diagnostic accuracy with fewer inquiring turns.
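Distribution entropy, which this abstract names as the medium linking inquiry and classification, can be illustrated with a short sketch. The Shannon entropy formula is standard; the stopping rule (stop asking questions once entropy falls below a threshold) is an assumption made for illustration and is not taken from the paper.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def should_stop_inquiry(disease_probs, threshold_bits):
    """Hypothetical rule: stop inquiring once the classifier is confident enough."""
    return shannon_entropy(disease_probs) <= threshold_bits


uniform = [0.25, 0.25, 0.25, 0.25]   # classifier has no idea yet
peaked = [0.97, 0.01, 0.01, 0.01]    # classifier strongly favours one disease

print(shannon_entropy(uniform))
print(should_stop_inquiry(uniform, threshold_bits=0.5))
print(should_stop_inquiry(peaked, threshold_bits=0.5))
```

A flat distribution has high entropy, which under this assumed rule signals that more symptoms should be asked about before a diagnosis is issued.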


Subjects
Learning , Physicians , Humans , Reinforcement, Psychology , Entropy , Knowledge Bases
11.
Comput Biol Med ; 170: 108105, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38330823

ABSTRACT

Infertility affects ∼15% of couples globally, and half of cases are related to genetic disorders. Despite growing data and unprecedented improvements in high-throughput sequencing technologies, accumulated fertility-related issues concerning genetic diagnosis and potential treatment urgently need to be solved. However, there is a lack of comprehensive platforms that characterise various infertility-related records to provide research applications for exploring infertility in depth and for genetic counselling of infertile couples. To solve this problem, we provide IDDB Xtra by further integrating phenotypic manifestations, genomic datasets, epigenetics, and modulators, together with numerous interactive tools, into our previous infertility database, IDDB. IDDB Xtra houses 2369 manually curated genes of human and nine model organisms, 273 chromosomal abnormalities, 884 phenotypes, 60 genomic datasets, 464 epigenetic records, and 1144 modulators relevant to infertility diagnosis and treatment. Additionally, IDDB Xtra incorporates customized graphical applications for researchers and clinicians to decipher in-depth disease mechanisms from the perspectives of developmental atlas, mutation effects, and clinical manifestations. Users can browse genes across developmental stages of human and mouse, filter candidate genes, mine potential variants, and retrieve the infertility biomedical network in an intuitive web interface. In summary, IDDB Xtra not only captures valuable research and data, but also provides useful applications to facilitate the genetic counselling and drug discovery of infertility. IDDB Xtra is freely available at https://mdl.shsmu.edu.cn/IDDB/ and http://www.allostery.net/IDDB.


Subjects
Infertility , Humans , Mice , Animals , Databases, Factual , Mutation , Infertility/genetics , Phenotype , Knowledge Bases
12.
Bioinformatics ; 40(3), 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38383067

ABSTRACT

MOTIVATION: Creating knowledge bases and ontologies is a time consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas. RESULTS: Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against an LLM to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for matched elements. We present examples of applying SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease relationships. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction methods, but greatly surpasses an LLM's native capability of grounding entities with unique identifiers. SPIRES has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any new training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM. AVAILABILITY AND IMPLEMENTATION: SPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt.


Subjects
Knowledge Bases , Semantics , Databases, Factual
13.
NPJ Syst Biol Appl ; 10(1): 4, 2024 Jan 13.
Article in English | MEDLINE | ID: mdl-38218959

ABSTRACT

Knowledge bases have been instrumental in advancing biological research, facilitating pathway analysis and data visualization, which are now widely employed in the scientific community. Despite the establishment of several prominent knowledge bases focusing on signaling, metabolic networks, or both, integrating these networks into a unified topological network has proven to be challenging. The intricacy of molecular interactions and the diverse formats employed to store and display them contribute to the complexity of this task. In a prior study, we addressed this challenge by introducing a "meta-pathway" structure that integrated the advantages of the Simple Interaction Format (SIF) while accommodating reaction information. Nevertheless, the earlier Global Integrative Network (GIN) was limited to reliance on KEGG alone. Here, we present GIN version 2.0, which incorporates human molecular interaction data from ten distinct knowledge bases, including KEGG, Reactome, and HumanCyc, among others. We standardized the data structure, gene IDs, and chemical IDs, and conducted a comprehensive analysis of the consistency among the ten knowledge bases before combining all unified interactions into GINv2.0. Utilizing GINv2.0, we investigated the glycolysis process and its regulatory proteins, revealing coordinated regulations on glycolysis and autophagy, particularly under glucose starvation. The expanded scope and enhanced capabilities of GINv2.0 provide a valuable resource for comprehensive systems-level analyses in the field of biological research. GINv2.0 can be accessed at: https://github.com/BIGchix/GINv2.0 .


Subjects
Metabolic Networks and Pathways , Signal Transduction , Humans , Metabolic Networks and Pathways/genetics , Knowledge Bases
14.
Int J Mol Sci ; 25(2), 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38256255

ABSTRACT

SpliceProt 2.0 is a public proteogenomics database that aims to list the sequence of known proteins and potential new proteoforms in human, mouse, and rat proteomes. This updated repository provides an even broader range of computationally translated proteins and serves, for example, to aid with proteomic validation of splice variants absent from the reference UniProtKB/SwissProt database. We demonstrate the value of SpliceProt 2.0 to predict orthologous proteins between humans and murines based on transcript reconstruction, sequence annotation and detection at the transcriptome and proteome levels. In this release, the annotation data used in the reconstruction of transcripts based on the methodology of ternary matrices were acquired from new databases such as Ensembl, UniProt, and APPRIS. Another innovation implemented in the pipeline is the exclusion of transcripts predicted to be susceptible to degradation through the NMD pathway. Taken together, our repository and its applications represent a valuable resource for the proteogenomics community.


Subjects
Proteogenomics , Proteomics , Rats , Mice , Humans , Animals , Databases, Protein , Knowledge Bases , Proteome/genetics
15.
Stud Health Technol Inform ; 310: 184-188, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269790

ABSTRACT

In multicenter clinical research, case-reported clinical data are managed for each research project. Participating institutions manage the mapping between standardized codes and in-house codes. To use the data extracted from electronic medical records in case report forms, it is necessary to pay attention to the gap in the semantic hierarchy. Management of the mapping information between in-house and standardized codes is centralized in Resource Description Framework (RDF) stores. The relationship between standardized and in-house codes is described in RDF and stored in RDF stores. RESTful APIs for accessing the RDF stores with SPARQL were developed and verified. The relationship between standardized codes and in-house codes of pharmaceuticals was expressed in RDF triples. As a result of the operational verification of the implemented APIs, it was confirmed that data management with knowledge bases expressed in RDF graphs is possible. The ability to dynamically modify mapping definitions enables flexible data management and eases operational restrictions.
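The mapping of in-house codes to standardized codes as subject-predicate-object triples can be sketched in plain Python. The codes and prefixes below are hypothetical, and the use of `skos:exactMatch` as the mapping predicate is an assumption; the abstract does not say which vocabulary the authors used. The `match` helper imitates a single SPARQL triple pattern, with `None` playing the role of a variable.

```python
# Triples are (subject, predicate, object) strings. All identifiers are hypothetical.
TRIPLES = [
    ("inhouse:DRUG-0012", "skos:exactMatch", "std:S-1001"),
    ("inhouse:DRUG-0047", "skos:exactMatch", "std:S-2002"),
    ("inhouse:DRUG-0012", "rdfs:label", "Example tablet 500 mg"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts like a SPARQL variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def to_standard(triples, inhouse_code):
    """Look up the standardized code(s) mapped to an in-house code."""
    return [o for _, _, o in match(triples, s=inhouse_code, p="skos:exactMatch")]


print(to_standard(TRIPLES, "inhouse:DRUG-0012"))
```

Because the mapping lives in data rather than in code, changing a mapping means adding or removing a triple, which is the kind of dynamic modification the abstract describes.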


Subjects
Case Management , Data Management , Electronic Health Records , Knowledge Bases , Registries
16.
Stud Health Technol Inform ; 310: 639-643, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269887

ABSTRACT

Automatic extraction of relations between drugs/chemicals and proteins from ever-growing biomedical literature is required to build up-to-date knowledge bases in biomedicine. To promote the development of automated methods, BioCreative-VII organized a shared task - the DrugProt track, to recognize drug-protein entity relations from PubMed abstracts. We participated in the shared task and leveraged deep learning-based transformer models pre-trained on biomedical data to build ensemble approaches to automatically extract drug-protein relation from biomedical literature. On the main corpora of 10,750 abstracts, our best system obtained an F1-score of 77.60% (ranked 4th among 30 participating teams), and on the large-scale corpus of 2.4M documents, our system achieved micro-averaged F1-score of 77.32% (ranked 2nd among 9 system submissions). This demonstrates the effectiveness of domain-specific transformer models and ensemble approaches for automatic relation extraction from biomedical literature.
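The micro-averaged F1-score reported here pools true positives, false positives, and false negatives over all relation types before computing precision and recall. A minimal sketch, with illustrative relation labels and made-up counts:

```python
def micro_f1(counts):
    """Micro-averaged F1 from a mapping: relation type -> (tp, fp, fn)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0  # no correct predictions, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up counts for two illustrative relation types.
example_counts = {
    "INHIBITOR": (8, 2, 2),
    "SUBSTRATE": (2, 0, 4),
}
print(round(micro_f1(example_counts), 4))
```

Micro-averaging weights each prediction equally, so frequent relation types dominate the score; macro-averaging would instead average the per-type F1 values.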


Subjects
Electric Power Supplies , Knowledge Bases , PubMed
17.
PLoS One ; 19(1): e0297022, 2024.
Article in English | MEDLINE | ID: mdl-38271452

ABSTRACT

Previous studies have primarily investigated scientists' direct impact on technological performance. Expanding on this, the study explores the nuanced ways and timing through which scientists influence team-level technological performance. By integrating knowledge-based and network dynamics theories, the study establishes and assesses membership turnover as a significant mediator of the science-technological performance process. Furthermore, it investigates the moderating effects of team internationalization and coreness on the mediation effects. Employing an unbalanced panel dataset from Huawei and Intel from 2000 to 2022, the study applied the Tobit and Negative Binomial models and conducted robustness tests for data analysis. The findings support the indirect influence of scientists within an invention team on the quantity and quality of inventions through membership turnover. Moreover, team internationalization diminishes the relationship between membership turnover and the quantity and quality of inventions, thereby impairing scientists' indirect effects on technological performance through membership turnover. Team coreness enhances the relationship between membership turnover and the quantity and quality of inventions, strengthening the indirect impact of scientists on these dimensions through membership turnover.


Subjects
Data Analysis , Technology , Food Handling , Knowledge Bases , Models, Statistical , Inventions , China
18.
Nucleic Acids Res ; 52(D1): D1210-D1217, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38183204

ABSTRACT

The Catalogue Of Somatic Mutations In Cancer (COSMIC), https://cancer.sanger.ac.uk/cosmic, is an expert-curated knowledgebase providing data on somatic variants in cancer, supported by a comprehensive suite of tools for interpreting genomic data, discerning the impact of somatic alterations on disease, and facilitating translational research. The catalogue is accessed and used by thousands of cancer researchers and clinicians daily, allowing them to quickly access information from an immense pool of data curated from over 29 thousand scientific publications and large studies. Within the last 4 years, COSMIC has substantially expanded its utility by adding new resources: the Mutational Signatures catalogue, the Cancer Mutation Census, and Actionability. To improve data accessibility and interoperability, somatic variants have received stable genomic identifiers that are associated with their genomic coordinates in GRCh37 and GRCh38, and new export files with reduced data redundancy have been made available for download.


Subjects
Databases, Genetic , Genomics , Neoplasms , Humans , Databases, Factual , Knowledge Bases , Mutation , Neoplasms/genetics , Databases, Genetic/trends , Internet
19.
Artif Intell Med ; 147: 102718, 2024 01.
Article in English | MEDLINE | ID: mdl-38184346

ABSTRACT

BACKGROUND: Diagnostic errors have become the biggest threat to the safety of patients in primary health care. General practitioners, as the "gatekeepers" of primary health care, have a responsibility to accurately diagnose patients. However, many general practitioners have insufficient knowledge and clinical experience in some diseases. Clinical decision making tools need to be developed to effectively improve the diagnostic process in primary health care. The long-tailed class distributions of medical datasets are challenging for many popular decision making models based on deep learning, which have difficulty predicting few-shot diseases. Meta-learning is a new strategy for solving few-shot problems. METHODS AND MATERIALS: In this study, a few-shot disease diagnosis decision making model based on a model-agnostic meta-learning algorithm (FSDD-MAML) is proposed. The MAML algorithm is applied in a knowledge graph-based disease diagnosis model to find the optimal model parameters. Moreover, FSDD-MAML can learn learning rates for all modules of the knowledge graph-based disease diagnosis model. For n-way, k-shot learning tasks, the inner loop of FSDD-MAML performs multiple gradient update steps to learn internal features in disease classification tasks using n×k examples, and the outer loop of FSDD-MAML optimizes the meta-objective to find the associated optimal parameters and learning rates. FSDD-MAML is compared with the original knowledge graph-based disease diagnosis model and other meta-learning algorithms based on an abdominal disease dataset. RESULT: Meta-learning algorithms can greatly improve the performance of models in top-1 evaluation compared with top-3, top-5, and top-10 evaluations. The proposed decision making model FSDD-MAML outperforms all the other models, with a precision@1 of 90.02 %. We achieve state-of-the-art performance in the diagnosis of all diseases, and the prediction performance for few-shot diseases is greatly improved. 
For the two groups with the fewest examples of diseases, FSDD-MAML achieves relative increases in precision@1 of 29.13 % and 21.63 % compared with the original knowledge graph-based disease diagnosis model. In addition, we analyze the reasoning process of several few-shot disease predictions and provide an explanation for the results. CONCLUSION: The decision making model based on meta-learning proposed in this paper can support the rapid diagnosis of diseases in general practice and is especially capable of helping general practitioners diagnose few-shot diseases. This study is of profound significance for the exploration and application of meta-learning to few-shot disease assessment in general practice.
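The n-way, k-shot task setup mentioned in this abstract can be sketched as an episode sampler that draws n classes and k examples per class, giving n×k support examples. This covers only task construction, not the MAML inner and outer gradient updates, and the disease names and record identifiers are hypothetical.

```python
import random


def sample_episode(examples_by_class, n_way, k_shot, rng):
    """Draw n_way classes and k_shot examples from each: n_way * k_shot pairs."""
    # Only classes with at least k_shot examples can take part in an episode.
    eligible = sorted(c for c, ex in examples_by_class.items() if len(ex) >= k_shot)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot examples")
    classes = rng.sample(eligible, n_way)
    return [(x, c) for c in classes for x in rng.sample(examples_by_class[c], k_shot)]


# Hypothetical toy dataset: disease label -> record identifiers.
toy_data = {
    "disease_a": ["r1", "r2", "r3"],
    "disease_b": ["r4", "r5", "r6"],
    "disease_c": ["r7", "r8", "r9"],
    "disease_d": ["r10"],  # too few examples for a 2-shot task
}
episode = sample_episode(toy_data, n_way=3, k_shot=2, rng=random.Random(0))
print(len(episode))
```

Passing an explicit `random.Random` instance keeps episode sampling reproducible across runs.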


Subjects
General Practice , Humans , Algorithms , Clinical Decision-Making , Knowledge Bases , Decision Making
20.
Artif Intell Med ; 147: 102735, 2024 01.
Article in English | MEDLINE | ID: mdl-38184359

ABSTRACT

Early assessment with the help of machine learning methods can aid clinicians in optimizing the diagnosis and treatment process, giving patients critical treatment time. Owing to their effective information organization and interpretable reasoning, knowledge graph-based methods have become one of the most widely used machine learning approaches for this task. However, because they lack effective organization and use of multi-granularity and temporal information, current knowledge graph-based approaches struggle to fully exploit the information contained in medical records, which restricts their capacity to make high-quality diagnoses. To address these challenges, we study disease diagnosis applications in depth and propose a novel disease diagnosis framework named FIT-Graph. With novel medical multi-grained evolutionary graphs, FIT-Graph efficiently organizes the extracted information from various granularities and time stages, maximizing the retention of valuable information for disease inference and ensuring the comprehensiveness and validity of the final disease inference. We compare FIT-Graph with the baseline on two real-world clinical datasets from cardiology and respiratory departments. The experimental results show that it outperforms the baseline model, improving performance on the task by about 5% on multiple indices.


Subjects
Algorithms , Knowledge Bases , Humans , Machine Learning