Results 1 - 20 of 53
1.
BMC Bioinformatics ; 25(1): 273, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39169321

ABSTRACT

BACKGROUND: AI technologies such as large language models (LLMs) and machine learning have advanced considerably in their support for biomedical knowledge discovery. MAIN BODY: We propose a novel biomedical neural search service called 'VAIV Bio-Discovery', which supports enhanced knowledge discovery and document search on unstructured text such as PubMed. It mainly handles information related to chemical compounds/drugs, genes/proteins, diseases, and their interactions (chemical compound/drug-protein/gene interactions, including drug-target, drug-drug, and drug-disease interactions). To provide comprehensive knowledge, the system offers four search options: basic search, entity search, interaction search, and natural language search. We employ T5slim_dec, which adapts the autoregressive generation task of T5 (text-to-text transfer transformer) to the interaction extraction task by removing the self-attention layer in the decoder block. The system also assists in interpreting research findings by summarizing the retrieved search results for a given natural language query with Retrieval Augmented Generation (RAG). The search engine is built with a hybrid method that combines neural search with the probabilistic search method BM25. CONCLUSION: As a result, our system can better understand the context, semantics and relationships between terms within the document, enhancing search accuracy. This research contributes to the rapidly evolving biomedical field by introducing a new service to access and discover relevant knowledge.
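
The hybrid ranking mentioned in this abstract (neural search combined with BM25) could look roughly like the sketch below. It is not the VAIV Bio-Discovery implementation: the weighting parameter `alpha`, the min-max normalisation, the whitespace tokenizer, and the tiny hand-written vectors standing in for neural embeddings are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would do more).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query (Lucene-style IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for doc_tokens in tokenized:
        tf = Counter(doc_tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def min_max(values):
    # Rescale to [0, 1] so lexical and dense scores are comparable.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query, docs, query_vec, doc_vecs, alpha=0.5):
    """Return document indices ordered by a weighted sum of BM25 and cosine scores."""
    lexical = min_max(bm25_scores(query, docs))
    dense = min_max([cosine(query_vec, v) for v in doc_vecs])
    combined = [alpha * l + (1 - alpha) * d for l, d in zip(lexical, dense)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


# Toy data: the 2-D vectors are hypothetical stand-ins for neural embeddings.
docs = [
    "aspirin inhibits cox1 protein",
    "metformin treats diabetes",
    "gene expression in yeast",
]
doc_vecs = [[0.9, 0.1], [0.2, 0.8], [0.0, 1.0]]
ranking = hybrid_rank("aspirin target", docs, [1.0, 0.0], doc_vecs)
```

A linear combination of normalised scores is only one possible fusion rule; rank-based fusion is a common alternative, and the abstract does not say which the authors use.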


Subject(s)
Natural Language Processing , Data Mining/methods , Knowledge Discovery/methods , PubMed , Search Engine , Machine Learning , Information Storage and Retrieval/methods , Neural Networks, Computer
2.
Comput Biol Med ; 179: 108920, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39047506

ABSTRACT

This study introduces RheumaLinguisticpack (RheumaLpack), the first specialised linguistic web corpus designed for the field of musculoskeletal disorders. By combining web mining (i.e., web scraping) and natural language processing (NLP) techniques, as well as clinical expertise, RheumaLpack systematically captures and curates structured and unstructured data across a spectrum of web sources including clinical trials registers (i.e., ClinicalTrials.gov), bibliographic databases (i.e., PubMed), medical agencies (i.e., European Medicines Agency), social media (i.e., Reddit), and accredited health websites (i.e., MedlinePlus, Harvard Health Publishing, and Cleveland Clinic). Given the complexity of rheumatic and musculoskeletal diseases (RMDs) and their significant impact on quality of life, this resource can be proposed as a useful tool to train algorithms that could mitigate the diseases' effects. Therefore, the corpus aims to improve the training of artificial intelligence (AI) algorithms and facilitate knowledge discovery in RMDs. The development of RheumaLpack involved a systematic six-step methodology covering data identification, characterisation, selection, collection, processing, and corpus description. The result is a non-annotated, monolingual, and dynamic corpus, featuring almost 3 million records spanning from 2000 to 2023. RheumaLpack represents a pioneering contribution to rheumatology research, providing a useful resource for the development of advanced AI and NLP applications. This corpus highlights the value of web data in addressing the challenges posed by musculoskeletal diseases, illustrating its potential to improve research and treatment paradigms in rheumatology. Finally, the methodology shown can be replicated to obtain data from other medical specialities. The code and details on how to build RheumaLpack are also provided to facilitate the dissemination of this resource.


Subject(s)
Natural Language Processing , Rheumatology , Humans , Internet , Data Mining/methods , Knowledge Discovery/methods , Musculoskeletal Diseases
3.
Comput Biol Med ; 176: 108525, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38749322

ABSTRACT

Deep neural networks have become increasingly popular for analyzing ECG data because of their ability to accurately identify cardiac conditions and hidden clinical factors. However, the lack of transparency due to the black-box nature of these models is a common concern. To address this issue, explainable AI (XAI) methods can be employed. In this study, we present a comprehensive analysis of post-hoc XAI methods, investigating the glocal (aggregated local attributions over multiple samples) and global (concept-based XAI) perspectives. We have established a set of sanity checks to identify saliency as the most sensible attribution method. We provide a dataset-wide analysis across entire patient subgroups, which goes beyond anecdotal evidence, to establish the first quantitative evidence for the alignment of model behavior with cardiologists' decision rules. Furthermore, we demonstrate how these XAI techniques can be utilized for knowledge discovery, such as identifying subtypes of myocardial infarction. We believe that these proposed methods can serve as building blocks for a complementary assessment of internal validity during a certification process, as well as for knowledge discovery in the field of ECG analysis.
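
As a rough illustration of the "glocal" idea described here (local attributions aggregated over multiple samples), the sketch below computes gradient saliency for a toy model and averages its absolute value per feature. The model f(x) = tanh(w · x), the weights, and the samples are assumptions chosen so the gradient can be written by hand; they are not the ECG networks or data from the study.

```python
import math


def saliency(weights, x):
    """Gradient of f(x) = tanh(w . x) with respect to each input feature."""
    z = sum(w * xi for w, xi in zip(weights, x))
    slope = 1.0 - math.tanh(z) ** 2  # derivative of tanh at z
    return [slope * w for w in weights]


def glocal_attribution(weights, samples):
    """Mean absolute saliency per feature, aggregated over all samples."""
    totals = [0.0] * len(weights)
    for x in samples:
        for i, grad in enumerate(saliency(weights, x)):
            totals[i] += abs(grad)
    return [total / len(samples) for total in totals]


# Hypothetical toy model and inputs (three features, three samples).
toy_weights = [2.0, 0.0, -1.0]
toy_samples = [[0.1, 0.5, 0.3], [0.4, 0.2, 0.1], [0.0, 0.9, 0.7]]
feature_importance = glocal_attribution(toy_weights, toy_samples)
```

Averaging absolute gradients is one simple aggregation choice; restricting `toy_samples` to a patient subgroup would give the subgroup-level view the abstract refers to.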


Subject(s)
Deep Learning , Electrocardiography , Electrocardiography/methods , Humans , Knowledge Discovery/methods , Neural Networks, Computer , Signal Processing, Computer-Assisted
4.
J Biomed Inform ; 145: 104464, 2023 09.
Article in English | MEDLINE | ID: mdl-37541406

ABSTRACT

OBJECTIVE: We explore the framing of literature-based discovery (LBD) as link prediction and graph embedding learning, with Alzheimer's Disease (AD) as our focus disease context. The key link prediction setting of prediction window length is specifically examined in the context of a time-sliced evaluation methodology. METHODS: We propose a four-stage approach to explore literature-based discovery for Alzheimer's Disease, creating and analyzing a knowledge graph tailored to the AD context, and predicting and evaluating new knowledge based on time-sliced link prediction. The first stage is to collect an AD-specific corpus. The second stage involves constructing an AD knowledge graph with identified AD-specific concepts and relations from the corpus. In the third stage, 20 pairs of training and testing datasets are constructed with the time-slicing methodology. Finally, we infer new knowledge with graph embedding-based link prediction methods. We compare different link prediction methods in this context. The impact of limiting prediction evaluation of LBD models in the context of short-term and longer-term knowledge evolution for Alzheimer's Disease is assessed. RESULTS: We constructed an AD corpus of over 16 k papers published in 1977-2021, and automatically annotated it with concepts and relations covering 11 AD-specific semantic entity types. The knowledge graph of Alzheimer's Disease derived from this resource consisted of ∼11 k nodes and ∼394 k edges, among which 34% were genotype-phenotype relationships, 57% were genotype-genotype relationships, and 9% were phenotype-phenotype relationships. A Structural Deep Network Embedding (SDNE) model consistently showed the best performance in terms of returning the most confident set of link predictions as time progresses over 20 years. A huge improvement in model performance was observed when changing the link prediction evaluation setting to consider a more distant future, reflecting the time required for knowledge accumulation. CONCLUSION: Neural network graph-embedding link prediction methods show promise for the literature-based discovery context, although the prediction setting is extremely challenging, with graph densities of less than 1%. Varying prediction window length on the time-sliced evaluation methodology leads to hugely different results and interpretations of LBD studies. Our approach can be generalized to enable knowledge discovery for other diseases. AVAILABILITY: Code, AD ontology, and data are available at https://github.com/READ-BioMed/readbiomed-lbd.
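
The time-sliced evaluation and the effect of the prediction window can be sketched as follows. This is a minimal illustration under stated assumptions: the dated edge list is invented, node names are placeholders, and a simple common-neighbours score stands in for the graph-embedding models (such as SDNE) that the study actually compares.

```python
from itertools import combinations


def time_slice(edges, cutoff_year, window):
    """Split dated edges into a training graph and the future links to predict.

    edges: iterable of (node_a, node_b, first_year_seen), with no self-loops.
    Links first seen up to cutoff_year form the training graph; links first
    seen within `window` years after the cutoff are the prediction targets.
    """
    train, future = set(), set()
    for a, b, year in edges:
        pair = frozenset((a, b))
        if year <= cutoff_year:
            train.add(pair)
        elif year <= cutoff_year + window:
            future.add(pair)
    return train, future - train


def graph_density(pairs):
    """Fraction of possible undirected edges that are present."""
    nodes = set().union(*pairs)
    n = len(nodes)
    return 0.0 if n < 2 else len(pairs) / (n * (n - 1) / 2)


def common_neighbour_scores(train):
    """Score each unlinked node pair by its number of shared neighbours."""
    neighbours = {}
    for pair in train:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    scores = {}
    for a, b in combinations(sorted(neighbours), 2):
        pair = frozenset((a, b))
        shared = len(neighbours[a] & neighbours[b])
        if pair not in train and shared:
            scores[pair] = shared
    return scores


def hits_at_k(scores, future, k):
    """Count how many of the k highest-scored pairs appear in the future set."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for pair in ranked if pair in future)


# Invented toy graph: (node, node, year the link first appears).
edges = [("A", "B", 2000), ("B", "C", 2001), ("C", "D", 2001),
         ("A", "C", 2003), ("B", "D", 2010)]
train, short_future = time_slice(edges, cutoff_year=2002, window=1)
_, long_future = time_slice(edges, cutoff_year=2002, window=10)
scores = common_neighbour_scores(train)
short_hits = hits_at_k(scores, short_future, k=2)
long_hits = hits_at_k(scores, long_future, k=2)
```

In this toy graph the same predictions earn more hits under the longer window, because a link that only appears in 2010 counts as a miss when the window is one year. That is the kind of window sensitivity the abstract reports.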


Subject(s)
Alzheimer Disease , Knowledge Discovery , Humans , Knowledge Discovery/methods , Alzheimer Disease/diagnosis , Neural Networks, Computer , Learning , Phenotype
5.
J Biomed Inform ; 143: 104362, 2023 07.
Article in English | MEDLINE | ID: mdl-37146741

ABSTRACT

Scientific literature presents a wealth of information yet to be explored. As the number of researchers and publications increases each year, specialized fields of research are becoming more prevalent. This trend further separates interdisciplinary publications and makes keeping up to date with the literature a laborious task. Literature-based discovery (LBD) aims to mitigate these concerns by promoting information sharing among non-interacting literature while extracting potentially meaningful information. Furthermore, recent advances in neural network architectures and data representation techniques have fueled their respective research communities in achieving state-of-the-art performance in many downstream tasks. However, neural network-based methods for LBD remain underexplored. We introduce and explore a deep learning neural network-based approach for LBD. Additionally, we investigate various approaches to represent terms as concepts and analyze the effect of feature scaling the representations fed into our model. We compare the evaluation performance of our method on five hallmarks of cancer datasets utilized for closed discovery. Our results show that the representation chosen as input to our model affects evaluation performance. We found that feature scaling our input representations increases evaluation performance and decreases the number of epochs needed to achieve model generalization. We also explore two approaches to represent model output. We found that reducing the model's output to capture a subset of concepts improved evaluation performance at the cost of model generalizability. We also compare the efficacy of our method on the five hallmarks of cancer datasets to a set of randomly chosen relations between concepts. We found that these experiments confirm our method's suitability for LBD.
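
The abstract does not say which feature scaling scheme was applied to the input representations. As one common possibility, the sketch below standardises each feature to zero mean and unit variance (z-scoring); the function name and the toy concept vectors are assumptions for illustration only.

```python
import statistics


def standardize(vectors):
    """Scale every feature (column) to zero mean and unit population variance."""
    columns = list(zip(*vectors))
    means = [statistics.fmean(col) for col in columns]
    # Constant columns have zero deviation; divide by 1.0 instead to avoid an error.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / stdev for value, mean, stdev in zip(row, means, stdevs)]
        for row in vectors
    ]


# Hypothetical concept vectors whose two features differ in scale by 100x.
concept_vectors = [[1.0, 100.0], [3.0, 300.0]]
scaled_vectors = standardize(concept_vectors)
```

After scaling, both features span the same range, so neither dominates the early gradient updates. This is a plausible reason for the faster convergence the abstract reports, though the abstract itself does not give the mechanism.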


Subject(s)
Deep Learning , Neoplasms , Humans , Neural Networks, Computer , Knowledge Discovery/methods , Publications
6.
J Biomed Inform ; 142: 104383, 2023 06.
Article in English | MEDLINE | ID: mdl-37196989

ABSTRACT

OBJECTIVE: To develop and demonstrate an approach enabling individual researchers or small teams to create their own ad-hoc, lightweight knowledge bases tailored for specialized scientific interests, using text-mining over scientific literature, and to show the effectiveness of these knowledge bases in hypothesis generation and literature-based discovery (LBD). METHODS: We propose a lightweight process using an extractive search framework to create ad-hoc knowledge bases, which requires minimal training and no background in bio-curation or computer science. These knowledge bases are particularly effective for LBD and hypothesis generation using Swanson's ABC method. The personalized nature of the knowledge bases allows for a somewhat higher level of noise than "public facing" ones, as researchers are expected to have prior domain experience to separate signal from noise. Fact verification is shifted from exhaustive verification of the knowledge base to post-hoc verification of specific entries of interest, allowing researchers to assess the correctness of relevant knowledge base entries by considering the paragraphs in which the facts were introduced. RESULTS: We demonstrate the methodology by constructing several knowledge bases of different kinds: three knowledge bases that support lab-internal hypothesis generation: Drug Delivery to Ovarian Tumors (DDOT); Tissue Engineering and Regeneration; Challenges in Cancer Research; and an additional comprehensive, accurate knowledge base designated as a public resource for the wider community on the topic of Cell Specific Drug Delivery (CSDD). In each case, we show the design and construction process, along with relevant visualizations for data exploration, and hypothesis generation. For CSDD and DDOT we also show meta-analysis, human evaluation, and in vitro experimental evaluation. CONCLUSION: Our approach enables researchers to create personalized, lightweight knowledge bases for specialized scientific interests, effectively facilitating hypothesis generation and literature-based discovery (LBD). By shifting fact verification efforts to post-hoc verification of specific entries, researchers can focus on exploring and generating hypotheses based on their expertise. The constructed knowledge bases demonstrate the versatility and adaptability of our approach to diverse research interests. The web-based platform, available at https://spike-kbc.apps.allenai.org, provides researchers with a valuable tool for rapid construction of knowledge bases tailored to their needs.
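
Swanson's ABC method, which this abstract relies on for hypothesis generation, can be sketched in a few lines: if A is linked to B and B to C, but A and C are never linked directly, then A-C is a candidate hypothesis. The function name and the toy relations are assumptions (loosely modelled on Swanson's well-known fish oil and Raynaud's disease example), and nothing here reflects the internals of the platform described above.

```python
from collections import defaultdict


def abc_open_discovery(relations, start):
    """Map each candidate C term to the B terms that connect it to `start`.

    relations: iterable of undirected (term, term) pairs from a knowledge base.
    C terms already linked directly to `start` are excluded, since they are
    known rather than newly hypothesised.
    """
    neighbours = defaultdict(set)
    for a, b in relations:
        neighbours[a].add(b)
        neighbours[b].add(a)
    direct = set(neighbours[start])
    candidates = defaultdict(set)
    for b_term in direct:
        for c_term in neighbours[b_term]:
            if c_term != start and c_term not in direct:
                candidates[c_term].add(b_term)
    return dict(candidates)


# Toy knowledge base for illustration only.
toy_relations = [
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("blood viscosity", "Raynaud's disease"),
    ("platelet aggregation", "Raynaud's disease"),
]
hypotheses = abc_open_discovery(toy_relations, "fish oil")
```

Returning the connecting B terms alongside each candidate fits the post-hoc verification described in the abstract: a researcher can inspect only the entries that support a hypothesis of interest.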


Subject(s)
Data Mining , Knowledge Discovery , Humans , Data Mining/methods , Knowledge Discovery/methods , Publications
8.
J Community Psychol ; 49(6): 1718-1731, 2021 08.
Article in English | MEDLINE | ID: mdl-34004017

ABSTRACT

Large amounts of text-based data, like study abstracts, often go unanalyzed because the task is laborious. Natural language processing (NLP) uses computer-based algorithms not traditionally implemented in community psychology to effectively and efficiently process text. These methods include examining the frequency of words and phrases, the clustering of topics, and the interrelationships of words. This article applied NLP to explore the concept of equity in community psychology. The COVID-19 crisis has made pre-existing health equity gaps even more salient. Community psychology has a specific interest in working with organizations, systems, and communities to address social determinants that perpetuate inequities by refocusing interventions around achieving health and wellness for all. This article examines how community psychology has discussed equity thus far to identify strengths and gaps for future research and practice. The results showed the prominence of community-based participatory research and the diversity of settings researchers work in. However, the total number of abstracts with equity concepts was lower than expected, which suggests there is a need for a continued focus on equity.
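
Two of the NLP methods listed in this abstract, word and phrase frequencies and the interrelationships of words, can be sketched with the standard library alone. The stopword list, the two toy abstracts, and the choice of same-abstract co-occurrence as the measure of interrelationship are assumptions; topic clustering is not covered here.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list (assumption; real analyses use larger lists).
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}


def tokens(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def word_frequencies(abstracts):
    return Counter(word for text in abstracts for word in tokens(text))


def bigram_frequencies(abstracts):
    """Count adjacent word pairs, computed after stopword removal."""
    counts = Counter()
    for text in abstracts:
        words = tokens(text)
        counts.update(zip(words, words[1:]))
    return counts


def cooccurrence(abstracts):
    """Count how many abstracts contain both words of each unordered pair."""
    counts = Counter()
    for text in abstracts:
        counts.update(combinations(sorted(set(tokens(text))), 2))
    return counts


# Invented toy abstracts for illustration.
toy_abstracts = [
    "Health equity in community research",
    "Community research and health equity gaps",
]
word_counts = word_frequencies(toy_abstracts)
bigram_counts = bigram_frequencies(toy_abstracts)
pair_counts = cooccurrence(toy_abstracts)
```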


Subject(s)
Community Psychiatry/methods , Community-Based Participatory Research/methods , Health Equity/statistics & numerical data , Knowledge Discovery/methods , Natural Language Processing , Social Determinants of Health/statistics & numerical data , Humans , Periodicals as Topic