Results 1 - 20 of 53
1.
BMC Bioinformatics ; 25(1): 273, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39169321

ABSTRACT

BACKGROUND: AI technologies such as large language models (LLMs) and machine learning have advanced considerably in their support for biomedical knowledge discovery. MAIN BODY: We propose a novel biomedical neural search service called 'VAIV Bio-Discovery', which supports enhanced knowledge discovery and document search on unstructured text such as PubMed. It mainly handles information related to chemical compounds/drugs, genes/proteins, diseases, and their interactions (chemical compound/drug-protein/gene interactions, including drug-target, drug-drug, and drug-disease). To provide comprehensive knowledge, the system offers four search options: basic search, entity search, interaction search, and natural language search. We employ T5slim_dec, which adapts the autoregressive generation task of T5 (text-to-text transfer transformer) to the interaction extraction task by removing the self-attention layer in the decoder block. The system also assists in interpreting research findings by summarizing the retrieved search results for a given natural language query with Retrieval Augmented Generation (RAG). The search engine is built with a hybrid method that combines neural search with the probabilistic search method BM25. CONCLUSION: As a result, our system can better understand the context, semantics and relationships between terms within the document, enhancing search accuracy. This research contributes to the rapidly evolving biomedical field by introducing a new service to access and discover relevant knowledge.
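
The hybrid ranking mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the neural and BM25 scores are combined, so the weighted sum, the `alpha` weight, the BM25 parameters `k1` and `b`, and the term-count cosine that stands in for a neural encoder are all assumptions made for illustration. This is not the VAIV Bio-Discovery implementation, and the three documents are invented toy strings.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine_scores(query, docs):
    """Assumed stand-in for a neural encoder: cosine of term-count vectors."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    out = []
    for d in docs:
        v = Counter(tokenize(d))
        dot = sum(q[t] * v[t] for t in q)
        d_norm = math.sqrt(sum(x * x for x in v.values()))
        out.append(dot / (q_norm * d_norm) if q_norm and d_norm else 0.0)
    return out


def min_max(values):
    """Rescale scores to [0, 1] so the two score types are comparable."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def hybrid_rank(query, docs, alpha=0.5):
    """Rank document indices by a weighted sum of dense and BM25 scores."""
    lexical = min_max(bm25_scores(query, docs))
    dense = min_max(cosine_scores(query, docs))
    combined = [alpha * d + (1 - alpha) * l for d, l in zip(dense, lexical)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


docs = [
    "aspirin inhibits cyclooxygenase",
    "metformin lowers blood glucose",
    "aspirin interacts with warfarin",
]
ranking = hybrid_rank("aspirin warfarin interaction", docs)
```

In a real system the cosine stand-in would be replaced by similarities between learned embeddings; the fusion step would stay structurally the same.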


Subject(s)
Natural Language Processing , Data Mining/methods , Knowledge Discovery/methods , PubMed , Search Engine , Machine Learning , Information Storage and Retrieval/methods , Neural Networks, Computer
2.
Comput Biol Med ; 179: 108920, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39047506

ABSTRACT

This study introduces RheumaLinguisticpack (RheumaLpack), the first specialised linguistic web corpus designed for the field of musculoskeletal disorders. By combining web mining (i.e., web scraping) and natural language processing (NLP) techniques, as well as clinical expertise, RheumaLpack systematically captures and curates structured and unstructured data across a spectrum of web sources including clinical trials registers (i.e., ClinicalTrials.gov), bibliographic databases (i.e., PubMed), medical agencies (i.e., European Medicines Agency), social media (i.e., Reddit), and accredited health websites (i.e., MedlinePlus, Harvard Health Publishing, and Cleveland Clinic). Given the complexity of rheumatic and musculoskeletal diseases (RMDs) and their significant impact on quality of life, this resource can be proposed as a useful tool to train algorithms that could mitigate the diseases' effects. Therefore, the corpus aims to improve the training of artificial intelligence (AI) algorithms and facilitate knowledge discovery in RMDs. The development of RheumaLpack involved a systematic six-step methodology covering data identification, characterisation, selection, collection, processing, and corpus description. The result is a non-annotated, monolingual, and dynamic corpus, featuring almost 3 million records spanning from 2000 to 2023. RheumaLpack represents a pioneering contribution to rheumatology research, providing a useful resource for the development of advanced AI and NLP applications. This corpus highlights the value of web data to address the challenges posed by musculoskeletal diseases, illustrating the corpus's potential to improve research and treatment paradigms in rheumatology. Finally, the methodology shown can be replicated to obtain data from other medical specialities. The code and details on how to build RheumaLpack are also provided to facilitate the dissemination of this resource.


Subject(s)
Natural Language Processing , Rheumatology , Humans , Internet , Data Mining/methods , Knowledge Discovery/methods , Musculoskeletal Diseases
3.
Comput Biol Med ; 176: 108525, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38749322

ABSTRACT

Deep neural networks have become increasingly popular for analyzing ECG data because of their ability to accurately identify cardiac conditions and hidden clinical factors. However, the lack of transparency due to the black box nature of these models is a common concern. To address this issue, explainable AI (XAI) methods can be employed. In this study, we present a comprehensive analysis of post-hoc XAI methods, investigating the glocal (aggregated local attributions over multiple samples) and global (concept based XAI) perspectives. We have established a set of sanity checks to identify saliency as the most sensible attribution method. We provide a dataset-wide analysis across entire patient subgroups, which goes beyond anecdotal evidence, to establish the first quantitative evidence for the alignment of model behavior with cardiologists' decision rules. Furthermore, we demonstrate how these XAI techniques can be utilized for knowledge discovery, such as identifying subtypes of myocardial infarction. We believe that these proposed methods can serve as building blocks for a complementary assessment of the internal validity during a certification process, as well as for knowledge discovery in the field of ECG analysis.


Subject(s)
Deep Learning , Electrocardiography , Electrocardiography/methods , Humans , Knowledge Discovery/methods , Neural Networks, Computer , Signal Processing, Computer-Assisted
4.
J Biomed Inform ; 145: 104464, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37541406

ABSTRACT

OBJECTIVE: We explore the framing of literature-based discovery (LBD) as link prediction and graph embedding learning, with Alzheimer's Disease (AD) as our focus disease context. The key link prediction setting of prediction window length is specifically examined in the context of a time-sliced evaluation methodology. METHODS: We propose a four-stage approach to explore literature-based discovery for Alzheimer's Disease, creating and analyzing a knowledge graph tailored to the AD context, and predicting and evaluating new knowledge based on time-sliced link prediction. The first stage is to collect an AD-specific corpus. The second stage involves constructing an AD knowledge graph with identified AD-specific concepts and relations from the corpus. In the third stage, 20 pairs of training and testing datasets are constructed with the time-slicing methodology. Finally, we infer new knowledge with graph embedding-based link prediction methods. We compare different link prediction methods in this context. The impact of limiting prediction evaluation of LBD models in the context of short-term and longer-term knowledge evolution for Alzheimer's Disease is assessed. RESULTS: We constructed an AD corpus of over 16 k papers published in 1977-2021, and automatically annotated it with concepts and relations covering 11 AD-specific semantic entity types. The knowledge graph of Alzheimer's Disease derived from this resource consisted of ∼11 k nodes and ∼394 k edges, among which 34% were genotype-phenotype relationships, 57% were genotype-genotype relationships, and 9% were phenotype-phenotype relationships. A Structural Deep Network Embedding (SDNE) model consistently showed the best performance in terms of returning the most confident set of link predictions as time progresses over 20 years. 
A huge improvement in model performance was observed when changing the link prediction evaluation setting to consider a more distant future, reflecting the time required for knowledge accumulation. CONCLUSION: Neural network graph-embedding link prediction methods show promise for the literature-based discovery context, although the prediction setting is extremely challenging, with graph densities of less than 1%. Varying prediction window length on the time-sliced evaluation methodology leads to hugely different results and interpretations of LBD studies. Our approach can be generalized to enable knowledge discovery for other diseases. AVAILABILITY: Code, AD ontology, and data are available at https://github.com/READ-BioMed/readbiomed-lbd.
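
Two quantities in this abstract lend themselves to a short worked sketch: the graph density and the effect of the prediction window in a time-sliced split. The code below is an illustration under stated assumptions, not the authors' pipeline (their code is at the URL above). It assumes a simple undirected graph, uses the rounded node and edge counts from the abstract, and the function names `graph_density` and `time_slice` and the toy edge list are hypothetical.

```python
def graph_density(n_nodes, n_edges):
    """Density of a simple undirected graph: edges present / edges possible."""
    possible = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible


def time_slice(edges, cutoff_year, window):
    """Split (u, v, year) edges into training pairs and held-out pairs.

    Training pairs were seen up to and including `cutoff_year`; test pairs
    are pairs first seen within `window` years after the cutoff.
    """
    train = {frozenset((u, v)) for u, v, year in edges if year <= cutoff_year}
    future = {
        frozenset((u, v))
        for u, v, year in edges
        if cutoff_year < year <= cutoff_year + window
    }
    return train, future - train


# Rounded counts from the abstract (about 11 k nodes, about 394 k edges).
density = graph_density(11_000, 394_000)

# Invented toy edges: (concept, concept, year of first co-mention).
toy_edges = [
    ("c1", "c2", 2001),
    ("c1", "c3", 2004),
    ("c3", "c4", 2009),
    ("c1", "c2", 2010),
]
train, short_test = time_slice(toy_edges, cutoff_year=2003, window=2)
_, long_test = time_slice(toy_edges, cutoff_year=2003, window=10)
```

With the rounded counts, the density comes out at roughly 0.65%, consistent with the "less than 1%" stated in the abstract. On the toy data, the longer window yields more held-out links than the short one, which is the mechanism by which the evaluation setting changes the measured performance.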


Subject(s)
Alzheimer Disease , Knowledge Discovery , Humans , Knowledge Discovery/methods , Alzheimer Disease/diagnosis , Neural Networks, Computer , Learning , Phenotype
5.
J Biomed Inform ; 143: 104362, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37146741

ABSTRACT

Scientific literature presents a wealth of information yet to be explored. As the number of researchers increases with each passing year and more publications are released, specialized fields of research become more prevalent. As this trend continues, it further separates interdisciplinary publications and makes keeping up to date with the literature a laborious task. Literature-based discovery (LBD) aims to mitigate these concerns by promoting information sharing among non-interacting literature while extracting potentially meaningful information. Furthermore, recent advances in neural network architectures and data representation techniques have fueled their respective research communities in achieving state-of-the-art performance in many downstream tasks. However, neural network-based methods for LBD remain underexplored. We introduce and explore a deep learning neural network-based approach for LBD. Additionally, we investigate various approaches to represent terms as concepts and analyze the effect of feature scaling the representations fed into our model. We compare the evaluation performance of our method on five hallmarks of cancer datasets utilized for closed discovery. Our results show that the representation chosen as input to our model affects evaluation performance. We found that feature scaling our input representations increases evaluation performance and decreases the number of epochs needed to achieve model generalization. We also explore two approaches to represent model output. We found that reducing the model's output to capture a subset of concepts improved evaluation performance at the cost of model generalizability. We also compare the efficacy of our method on the five hallmarks of cancer datasets to a set of randomly chosen relations between concepts. We found that these experiments confirm our method's suitability for LBD.


Subject(s)
Deep Learning , Neoplasms , Humans , Neural Networks, Computer , Knowledge Discovery/methods , Publications
6.
J Biomed Inform ; 142: 104383, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37196989

ABSTRACT

OBJECTIVE: To demonstrate and develop an approach enabling individual researchers or small teams to create their own ad-hoc, lightweight knowledge bases tailored for specialized scientific interests, using text-mining over scientific literature, and demonstrate the effectiveness of these knowledge bases in hypothesis generation and literature-based discovery (LBD). METHODS: We propose a lightweight process using an extractive search framework to create ad-hoc knowledge bases, which require minimal training and no background in bio-curation or computer science. These knowledge bases are particularly effective for LBD and hypothesis generation using Swanson's ABC method. The personalized nature of the knowledge bases allows for a somewhat higher level of noise than "public facing" ones, as researchers are expected to have prior domain experience to separate signal from noise. Fact verification is shifted from exhaustive verification of the knowledge base to post-hoc verification of specific entries of interest, allowing researchers to assess the correctness of relevant knowledge base entries by considering the paragraphs in which the facts were introduced. RESULTS: We demonstrate the methodology by constructing several knowledge bases of different kinds: three knowledge bases that support lab-internal hypothesis generation: Drug Delivery to Ovarian Tumors (DDOT); Tissue Engineering and Regeneration; Challenges in Cancer Research; and an additional comprehensive, accurate knowledge base designated as a public resource for the wider community on the topic of Cell Specific Drug Delivery (CSDD). In each case, we show the design and construction process, along with relevant visualizations for data exploration, and hypothesis generation. For CSDD and DDOT we also show meta-analysis, human evaluation, and in vitro experimental evaluation. 
CONCLUSION: Our approach enables researchers to create personalized, lightweight knowledge bases for specialized scientific interests, effectively facilitating hypothesis generation and literature-based discovery (LBD). By shifting fact verification efforts to post-hoc verification of specific entries, researchers can focus on exploring and generating hypotheses based on their expertise. The constructed knowledge bases demonstrate the versatility and adaptability of our approach to varied research interests. The web-based platform, available at https://spike-kbc.apps.allenai.org, provides researchers with a valuable tool for rapid construction of knowledge bases tailored to their needs.
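
Swanson's ABC method, which this abstract relies on for hypothesis generation, can be sketched in a few lines: if term A is linked to B and B is linked to C, but A and C are never linked directly, then A-C is a candidate hypothesis. The sketch below is a generic open-discovery illustration, not the platform's implementation. The function name `abc_open_discovery` is hypothetical, and the toy pairs follow Swanson's well-known fish oil and Raynaud's disease illustration rather than anything from this paper's knowledge bases.

```python
from collections import defaultdict


def abc_open_discovery(pairs, a_term):
    """Return {C: [B, ...]} for terms C reachable from A only through some B."""
    neighbors = defaultdict(set)
    for x, y in pairs:
        neighbors[x].add(y)
        neighbors[y].add(x)
    direct = neighbors[a_term]
    candidates = defaultdict(list)
    for b in sorted(direct):
        for c in sorted(neighbors[b]):
            # Keep C only if it is neither A itself nor already linked to A.
            if c != a_term and c not in direct:
                candidates[c].append(b)
    return dict(candidates)


# Toy knowledge-base entries (pairs of terms that co-occur in some passage).
kb_pairs = [
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("blood viscosity", "Raynaud's disease"),
    ("platelet aggregation", "Raynaud's disease"),
    ("fish oil", "triglycerides"),
]
hypotheses = abc_open_discovery(kb_pairs, "fish oil")
```

Returning the intermediate B terms alongside each candidate C fits the post-hoc verification idea in the abstract: a researcher can go back to the passages behind each A-B and B-C entry before trusting the A-C hypothesis.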


Subject(s)
Data Mining , Knowledge Discovery , Humans , Data Mining/methods , Knowledge Discovery/methods , Publications
8.
J Community Psychol ; 49(6): 1718-1731, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34004017

ABSTRACT

Large amounts of text-based data, like study abstracts, often go unanalyzed because the task is laborious. Natural language processing (NLP) uses computer-based algorithms not traditionally implemented in community psychology to effectively and efficiently process text. These methods include examining the frequency of words and phrases, the clustering of topics, and the interrelationships of words. This article applied NLP to explore the concept of equity in community psychology. The COVID-19 crisis has made pre-existing health equity gaps even more salient. Community psychology has a specific interest in working with organizations, systems, and communities to address social determinants that perpetuate inequities by refocusing interventions around achieving health and wellness for all. This article examines how community psychology has discussed equity thus far to identify strengths and gaps for future research and practice. The results showed the prominence of community-based participatory research and the diversity of settings researchers work in. However, the total number of abstracts with equity concepts was lower than expected, which suggests there is a need for a continued focus on equity.
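
The first of the NLP methods listed in this abstract, counting the frequency of words and phrases, can be sketched as follows. The abstract does not name the tools used, so the regular-expression tokenizer, the function name `term_counts`, and the two invented sample sentences are assumptions for illustration only.

```python
import re
from collections import Counter


def term_counts(abstracts, n=1):
    """Count n-word phrases (n=1 gives single words) across a list of texts."""
    counts = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


# Invented sample texts, standing in for real study abstracts.
abstracts = [
    "Health equity requires community partnership.",
    "Community partnership supports health equity and wellness.",
]
unigrams = term_counts(abstracts, n=1)
bigrams = term_counts(abstracts, n=2)
```

Here `bigrams.most_common()` would surface recurring phrases such as "health equity"; topic clustering and word interrelationships, the other methods mentioned, would build on the same token counts.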


Subject(s)
Community Psychiatry/methods , Community-Based Participatory Research/methods , Health Equity/statistics & numerical data , Knowledge Discovery/methods , Natural Language Processing , Social Determinants of Health/statistics & numerical data , Humans , Periodicals as Topic
9.
Br J Soc Psychol ; 60(1): 1-28, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33616965

ABSTRACT

The COVID-19 pandemic points to the need for scientists to pool their efforts in order to understand this disease and respond to the ensuing crisis. Other global challenges also require such scientific cooperation. Yet in academic institutions, reward structures and incentives are based on systems that primarily fuel the competition between (groups of) scientific researchers. Competition between individual researchers, research groups, research approaches, and scientific disciplines is seen as an important selection mechanism and driver of academic excellence. These expected benefits of competition have come to define the organizational culture in academia. There are clear indications that the overreliance on competitive models undermines cooperative exchanges that might lead to higher quality insights. This damages the well-being and productivity of individual researchers and impedes efforts towards collaborative knowledge generation. Insights from social and organizational psychology on the side effects of relying on performance targets, prioritizing the achievement of success over the avoidance of failure, and emphasizing self-interest and efficiency, clarify implicit mechanisms that may spoil valid attempts at transformation. The analysis presented here elucidates that a broader change in the academic culture is needed to truly benefit from current attempts to create more open and collaborative practices for cumulative knowledge generation.


Subject(s)
Interdisciplinary Communication , Intersectoral Collaboration , Knowledge Discovery , Science/education , Curriculum , Efficiency , Humans , Knowledge Discovery/methods , Research/education
15.
PLoS One ; 16(2): e0244618, 2021.
Article in English | MEDLINE | ID: mdl-33571223

ABSTRACT

Just like everything in nature, scientific topics flourish and perish. While the existing literature captures an article's life cycle well via citation patterns, little is known about how scientific popularity and impact evolve for a specific topic. It would be most intuitive if we could 'feel' a topic's activity just as we perceive the weather by temperature. Here, we conceive knowledge temperature to quantify a topic's overall popularity and impact through citation network dynamics. Knowledge temperature includes two parts. One part depicts lasting impact by assessing knowledge accumulation with an analogy between topic evolution and isobaric expansion. The other part gauges temporal changes in knowledge structure, an embodiment of short-term popularity, through the rate of entropy change with internal energy, two thermodynamic variables approximated via node degree and edge number. Our analysis of representative topics with sizes ranging from 1000 to over 30000 articles reveals that the key to flourishing is a topic's ability to accumulate useful information for future knowledge generation. Topics particularly experience temperature surges when their knowledge structure is altered by influential articles. The spike is especially obvious when a single non-trivial novel research focus appears or when the topic structure merges. Overall, knowledge temperature manifests topics' distinct evolutionary cycles.


Subject(s)
Knowledge Discovery/methods , Publications/trends , Research/trends , Knowledge , Models, Theoretical , Popular Culture , Research Design/trends
16.
BMJ Health Care Inform ; 28(1), 2021 Jan.
Article in English | MEDLINE | ID: mdl-33419870

ABSTRACT

INTRODUCTION: Numerous scientific journal articles related to COVID-19 have been rapidly published, making navigation and understanding of relationships difficult. METHODS: A graph network was constructed from the publicly available COVID-19 Open Research Dataset (CORD-19) of COVID-19-related publications using an engine leveraging medical knowledge bases to identify discrete medical concepts and an open-source tool (Gephi) to visualise the network. RESULTS: The network shows connections between diseases, medications and procedures identified from the title and abstract of 195 958 COVID-19-related publications (CORD-19 Dataset). Connections between terms with few publications, those unconnected to the main network and those irrelevant were not displayed. Nodes were coloured by knowledge base and the size of the node related to the number of publications containing the term. The data set and visualisations were made publicly accessible via a webtool. CONCLUSION: Knowledge management approaches (text mining and graph networks) can effectively allow rapid navigation and exploration of entity inter-relationships to improve understanding of diseases such as COVID-19.
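
The network construction described in this abstract can be sketched as a co-occurrence count over per-publication concept lists. The sketch assumes the medical concepts have already been extracted (the concept-identification engine is not reproduced here), and the function name `build_cooccurrence`, the `min_pubs` threshold, and the three invented toy records are hypothetical. Node size is the number of publications containing the term, and terms with few publications are dropped, as the abstract describes.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(doc_concepts, min_pubs=1):
    """Build node sizes and weighted edges from per-publication concept lists."""
    node_size = Counter()
    edge_weight = Counter()
    for concepts in doc_concepts:
        unique = sorted(set(concepts))
        node_size.update(unique)                      # publications per term
        edge_weight.update(combinations(unique, 2))   # co-mentions per pair
    # Drop terms that appear in too few publications, and their edges.
    keep = {c for c, n in node_size.items() if n >= min_pubs}
    nodes = {c: node_size[c] for c in keep}
    edges = {
        (a, b): w
        for (a, b), w in edge_weight.items()
        if a in keep and b in keep
    }
    return nodes, edges


# Invented toy records: concepts found in each title and abstract.
docs = [
    ["COVID-19", "pneumonia", "dexamethasone"],
    ["COVID-19", "pneumonia"],
    ["COVID-19", "intubation"],
]
nodes, edges = build_cooccurrence(docs, min_pubs=2)
```

The resulting `nodes` and `edges` dictionaries could then be written out as node and edge tables for a visualisation tool such as Gephi, which the abstract names; that export step is not shown.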


Subject(s)
Artificial Intelligence , COVID-19/epidemiology , Knowledge Discovery/methods , Periodicals as Topic/statistics & numerical data , Humans , Natural Language Processing , SARS-CoV-2
18.
Nature ; 587(7833): 240-245, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33177664

ABSTRACT

The Zoonomia Project is investigating the genomics of shared and specialized traits in eutherian mammals. Here we provide genome assemblies for 131 species, of which all but 9 are previously uncharacterized, and describe a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from more than 80% of mammalian families. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.


Subject(s)
Conservation of Natural Resources , Eutheria/classification , Eutheria/genetics , Genetic Variation , Genomics/methods , Knowledge Discovery , Animals , Biodiversity , Biomedical Research , Conservation of Natural Resources/methods , Evolution, Molecular , Extinction, Biological , Genetic Speciation , Humans , Infections , Knowledge Discovery/methods , Loss of Heterozygosity , Neoplasms , Phylogeny , Risk Assessment , Selection, Genetic , Sequence Alignment , Species Specificity , Venoms
19.
Nurs Philos ; 21(3): e12309, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32537914

ABSTRACT

To revitalize nursing science, there is a need for a new approach to guide nurse scientists in addressing complex problems in health care. By applying theoretical concepts from a revolutionary philosopher of science, Paul K. Feyerabend, new nursing knowledge can be produced using creativity and pluralistic approaches. Feyerabend proposed that methods within and outside of science can produce knowledge. Despite the recognition of Feyerabendian philosophy within science, there is currently a lack of literature regarding the relevance of Feyerabendian philosophy for nursing science. We aim to (a) describe and critique Feyerabendian concepts, (b) discuss the potential application of Feyerabendian philosophy for knowledge production within gerontological nursing and (c) describe theoretical possibilities for nurse scientists in using Feyerabendian philosophy to guide nursing knowledge development. We begin by introducing Feyerabend's life and his inspirations for his theoretical concepts, epistemological anarchism, theoretical pluralism and humanitarianism, and conclude by offering suggestions of how to apply Feyerabendian philosophy in nursing research.


Subject(s)
Knowledge Discovery/methods , Nursing/methods , Philosophy , Humans , Nursing/trends
20.
PLoS One ; 15(6): e0233879, 2020.
Article in English | MEDLINE | ID: mdl-32544200

ABSTRACT

Although a great deal of attention has been paid to how conspiracy theories circulate on social media, and the deleterious effect that they, and their factual counterpart conspiracies, have on political institutions, there has been little computational work done on describing their narrative structures. Predicating our work on narrative theory, we present an automated pipeline for the discovery and description of the generative narrative frameworks of conspiracy theories that circulate on social media, and actual conspiracies reported in the news media. We base this work on two separate comprehensive repositories of blog posts and news articles describing the well-known conspiracy theory Pizzagate from 2016, and the New Jersey political conspiracy Bridgegate from 2013. Inspired by the qualitative narrative theory of Greimas, we formulate a graphical generative machine learning model where nodes represent actors/actants, and multi-edges and self-loops among nodes capture context-specific relationships. Posts and news items are viewed as samples of subgraphs of the hidden narrative framework network. The problem of reconstructing the underlying narrative structure is then posed as a latent model estimation problem. To derive the narrative frameworks in our target corpora, we automatically extract and aggregate the actants (people, places, objects) and their relationships from the posts and articles. We capture context specific actants and interactant relationships by developing a system of supernodes and subnodes. We use these to construct an actant-relationship network, which constitutes the underlying generative narrative framework for each of the corpora. We show how the Pizzagate framework relies on the conspiracy theorists' interpretation of "hidden knowledge" to link otherwise unlinked domains of human interaction, and hypothesize that this multi-domain focus is an important feature of conspiracy theories. 
We contrast this to the single domain focus of an actual conspiracy. While Pizzagate relies on the alignment of multiple domains, Bridgegate remains firmly rooted in the single domain of New Jersey politics. We hypothesize that the narrative framework of a conspiracy theory might stabilize quickly in contrast to the narrative framework of an actual conspiracy, which might develop more slowly as revelations come to light. By highlighting the structural differences between the two narrative frameworks, our approach could be used by private and public analysts to help distinguish between conspiracy theories and conspiracies.


Subject(s)
Deception , Knowledge Discovery/methods , Narration , Politics , Social Media , Software , Data Analysis , Humans , Machine Learning , United States