Results 1 - 20 of 139
1.
BMC Med Inform Decis Mak ; 21(Suppl 9): 251, 2021 11 16.
Article in English | MEDLINE | ID: mdl-34789238

ABSTRACT

BACKGROUND: Drug repurposing aims to find new indications for approved drugs, which is essential for investigating new uses of approved or investigational drugs efficiently. The Active Gene Annotation Corpus (AGAC), annotated by human experts, was developed to support knowledge discovery for drug repurposing. The AGAC track of the BioNLP Open Shared Tasks, which uses this corpus, was organized at EMNLP-BioNLP 2019, where the "selective annotation" attribute makes the AGAC track more challenging than traditional sequence labeling tasks. In this work, we present our methods for trigger word detection (Task 1) and thematic role identification (Task 2) in the AGAC track. As a step toward drug repurposing research, our work can also be applied to large-scale automatic extraction of knowledge from medical text. METHODS: To meet the challenges of the two tasks, we treat Task 1 as medical named entity recognition (NER), which captures molecular phenomena related to gene mutation, and Task 2 as relation extraction, which captures the thematic roles between entities. We exploit pre-trained biomedical language representation models (e.g., BioBERT) in an information extraction pipeline for collecting mutation-disease knowledge from PubMed. Moreover, we design the fine-tuning framework using a multi-task learning technique and extra features. We further investigate different approaches to consolidating and transferring knowledge from varying sources and illustrate the performance of our model on the AGAC corpus. Our approach is based on fine-tuning BERT, BioBERT, NCBI BERT, and ClinicalBERT with multi-task learning. Further experiments show the effectiveness of knowledge transfer and of ensembling the models of the two tasks. We compare the performance of various algorithms and conduct an ablation study on the development set of Task 1 to examine the effectiveness of each component of our method. RESULTS: Compared with competing methods, our model obtained the highest precision (0.63), recall (0.56), and F-score (0.60) in Task 1, ranking first. It outperformed the baseline method provided by the organizers by 0.10 in F-score. The model shared the same encoding layers for the named entity recognition and relation extraction parts. We also obtained the second-highest F-score (0.25) in Task 2 with a simple but effective framework. CONCLUSIONS: Experimental results on the benchmark corpus for annotation of genes with active mutation-centric function changes show that integrating pre-trained biomedical language representation models (i.e., BERT, NCBI BERT, ClinicalBERT, BioBERT) into an information extraction pipeline with multi-task learning can improve the ability to collect mutation-disease knowledge from PubMed.
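
As a reading aid for the Task 1 figures, the sketch below shows how an F-score is conventionally derived from precision and recall. It assumes the reported F-score is the standard F1 (harmonic mean); the abstract does not say so explicitly. The function name `f_score` is hypothetical. Plugging in the rounded values 0.63 and 0.56 gives about 0.59, so the reported 0.60 was presumably computed from unrounded precision and recall.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded Task 1 values taken from the abstract above.
task1_f1 = f_score(0.63, 0.56)
print(round(task1_f1, 3))
```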


Subject(s)
Natural Language Processing , Pharmaceutical Preparations , Algorithms , Humans , Information Storage and Retrieval , Knowledge Discovery
2.
Annu Rev Biomed Data Sci ; 4: 313-339, 2021 07 20.
Article in English | MEDLINE | ID: mdl-34465169

ABSTRACT

The COVID-19 (coronavirus disease 2019) pandemic has had a significant impact on society, both because of the serious health effects of COVID-19 and because of public health measures implemented to slow its spread. Many of these difficulties are fundamentally information needs; attempts to address these needs have caused an information overload for both researchers and the public. Natural language processing (NLP), the branch of artificial intelligence that interprets human language, can be applied to address many of the information needs made urgent by the COVID-19 pandemic. This review surveys approximately 150 NLP studies and more than 50 systems and datasets addressing the COVID-19 pandemic. We detail work on four core NLP tasks: information retrieval, named entity recognition, literature-based discovery, and question answering. We also describe work that directly addresses aspects of the pandemic through four additional tasks: topic modeling, sentiment and emotion analysis, caseload forecasting, and misinformation detection. We conclude by discussing observable trends and remaining challenges.


Subject(s)
COVID-19/epidemiology , Information Storage and Retrieval/methods , Natural Language Processing , Communication , Data Mining/methods , Datasets as Topic , Emotions , Humans , Knowledge Discovery , Pandemics , Periodicals as Topic , Software
4.
Waste Manag Res ; 39(11): 1331-1340, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34525881

ABSTRACT

The processes related to solid waste management (SWM) are being revised as new technologies emerge and are applied in the area to achieve greater environmental, social and economic sustainability for society. To achieve our goal, two robust review protocols (Population, Intervention, Comparison, Outcome, and Context (PICOC) and Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA)) were used to systematically analyze 62 documents extracted from the Web of Science database to identify the main techniques and tools for Knowledge Discovery in Databases (KDD) and Data Mining (DM) as applied to SWM and explore the technological potential to optimize the stages of collecting and transporting waste. Moreover, it was possible to analyze the main challenges and opportunities of KDD and DM for SWM. The results show that the most used tools for SWM are MATLAB (29.7%) and GIS (13.5%), whereas the most used techniques are Artificial Neural Networks (35.8%), Linear Regression (16.0%) and Support Vector Machine (12.3%). In addition, 15.3% of the studies were conducted with data from China, 11.1% from India and 9.7% of the studies analyzed and compared data from several other countries. Furthermore, the research showed that the main challenges in the field of study are related to the collection and treatment of data, whereas the opportunities appear to be linked mainly to the impact on the pillars of sustainable development. Thus, this study portrays important issues associated with the use of KDD and DM for optimal SWM and has the potential to assist and direct researchers and field professionals in future studies.


Subject(s)
Solid Waste , Waste Management , Data Mining , Databases, Factual , Knowledge Discovery , Solid Waste/analysis
5.
Sci Rep ; 11(1): 10949, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34040033

ABSTRACT

Deep learning-based tools may annotate and interpret medical data more quickly, consistently, and accurately than medical doctors. However, as medical doctors are ultimately responsible for clinical decision-making, any deep learning-based prediction should be accompanied by an explanation that a human can understand. We present an approach called electrocardiogram gradient class activation map (ECGradCAM), which is used to generate attention maps and explain the reasoning behind deep learning-based decision-making in ECG analysis. Attention maps may be used in the clinic to aid diagnosis, discover new medical knowledge, and identify novel features and characteristics of medical tests. In this paper, we showcase how ECGradCAM attention maps can unmask how a novel deep learning model measures both amplitudes and intervals in 12-lead electrocardiograms, and we show an example of how attention maps may be used to develop novel ECG features.
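
The abstract does not spell out how ECGradCAM is computed, so the sketch below shows only the generic gradient class activation map (Grad-CAM) idea adapted to a 1-D signal: each channel of a convolutional layer is weighted by its average gradient, the weighted channels are summed, and negative values are clipped. The function name `grad_cam_1d` and the toy numbers are assumptions for illustration, not the authors' implementation or real ECG data.

```python
from typing import List


def grad_cam_1d(activations: List[List[float]],
                gradients: List[List[float]]) -> List[float]:
    """Generic Grad-CAM over a 1-D signal.

    activations[k][t]: output of channel k of a convolutional layer at time step t.
    gradients[k][t]:   gradient of the class score with respect to that activation.
    Returns one attention value per time step, scaled to [0, 1].
    """
    n_steps = len(activations[0])
    # Channel weight = average gradient over time (global average pooling).
    weights = [sum(g) / len(g) for g in gradients]
    # Weighted sum of channels, then ReLU to keep only positive evidence.
    cam = [max(0.0, sum(w * a[t] for w, a in zip(weights, activations)))
           for t in range(n_steps)]
    peak = max(cam)
    return [v / peak for v in cam] if peak > 0 else cam


# Toy values, not real ECG data.
acts = [[0.0, 1.0, 2.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
grads = [[0.5, 0.5, 0.5, 0.5], [-1.0, -1.0, -1.0, -1.0]]
print(grad_cam_1d(acts, grads))
```

In a real model the activations and gradients would come from a forward and backward pass through the network; here they are passed in directly so the map computation itself stays visible.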


Subject(s)
Deep Learning , Electrocardiography , Knowledge Discovery , Models, Cardiovascular , Adult , Aged , Algorithms , Cardiologists , Data Accuracy , Diagnosis, Computer-Assisted , Female , Heart Diseases/diagnosis , Heart Diseases/physiopathology , Humans , Male , Middle Aged , Sex Determination Analysis
6.
Stud Health Technol Inform ; 281: 724-728, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042671

ABSTRACT

This paper explores the use of semantic- and evidence-based biomedical knowledge to build the RiskExplorer knowledge graph, which outlines causal associations between risk factors and chronic diseases or cancers. The intent of this work is to offer an interactive knowledge synthesis platform that empowers health-information-seeking individuals to learn about and mitigate modifiable risk factors. Our approach combines biomedical text (from PubMed abstracts), the Semantic MEDLINE database, evidence-based semantic associations, literature-based discovery, and a graph database to discover associations between risk factors and breast cancer. Our methodological framework involves (a) identifying relevant literature on specified chronic diseases or cancers, (b) extracting semantic associations via a knowledge mining tool, (c) building a rich semantic graph by transforming semantic associations into nodes and edges, and (d) applying frequency-based methods and using semantic edge properties to traverse the graph and identify meaningful multi-node noncommunicable disease (NCD) risk paths. Generated multi-node risk paths consist of a source node (representing the source risk factor), one or more intermediate nodes (representing biomedical phenotypes), a target node (representing a chronic disease or cancer), and edges between nodes representing meaningful semantic associations. The results demonstrate that our methodology is capable of generating biomedically valid knowledge about causal risk and protective factors related to breast cancer.
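
Steps (c) and (d) above can be sketched roughly as follows: semantic associations become edges, rarely observed edges are dropped, and a depth-first search lists the paths from a source risk factor to a target disease. Everything in the sketch is hypothetical, including the triples, their frequencies, and the names `build_graph` and `risk_paths`; it does not reproduce the RiskExplorer implementation or its results.

```python
from collections import defaultdict

# Made-up (subject, predicate, object, frequency) associations for illustration.
TRIPLES = [
    ("alcohol", "AFFECTS", "estrogen level", 12),
    ("estrogen level", "CAUSES", "breast cancer", 30),
    ("alcohol", "ASSOCIATED_WITH", "obesity", 2),
    ("obesity", "PREDISPOSES", "breast cancer", 25),
]


def build_graph(triples, min_freq):
    """Turn associations into an adjacency list, dropping edges seen fewer than min_freq times."""
    graph = defaultdict(list)
    for subj, pred, obj, freq in triples:
        if freq >= min_freq:
            graph[subj].append((pred, obj))
    return graph


def risk_paths(graph, source, target, max_edges=3):
    """List simple paths from source to target as [node, predicate, node, ...]."""
    paths = []

    def walk(node, path, seen):
        if node == target:
            paths.append(path)
            return
        if (len(path) - 1) // 2 >= max_edges:  # edge budget used up
            return
        for pred, nxt in graph.get(node, []):
            if nxt not in seen:  # never revisit a node within one path
                walk(nxt, path + [pred, nxt], seen | {nxt})

    walk(source, [source], {source})
    return paths


graph = build_graph(TRIPLES, min_freq=5)
for path in risk_paths(graph, "alcohol", "breast cancer"):
    print(" -> ".join(path))
```

Raising `min_freq` is a crude stand-in for the frequency-based filtering the abstract mentions; the semantic edge properties it also refers to are not modeled here.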


Subject(s)
Breast Neoplasms , Pattern Recognition, Automated , Breast Neoplasms/epidemiology , Humans , Incidence , Knowledge Discovery , Risk Factors , Semantics
7.
J Community Psychol ; 49(6): 1718-1731, 2021 08.
Article in English | MEDLINE | ID: mdl-34004017

ABSTRACT

Large amounts of text-based data, like study abstracts, often go unanalyzed because the task is laborious. Natural language processing (NLP) uses computer-based algorithms not traditionally implemented in community psychology to effectively and efficiently process text. These methods include examining the frequency of words and phrases, the clustering of topics, and the interrelationships of words. This article applied NLP to explore the concept of equity in community psychology. The COVID-19 crisis has made pre-existing health equity gaps even more salient. Community psychology has a specific interest in working with organizations, systems, and communities to address social determinants that perpetuate inequities by refocusing interventions around achieving health and wellness for all. This article examines how community psychology has discussed equity thus far to identify strengths and gaps for future research and practice. The results showed the prominence of community-based participatory research and the diversity of settings researchers work in. However, the total number of abstracts with equity concepts was lower than expected, which suggests there is a need for a continued focus on equity.
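
The simplest of the methods named above, counting the frequency of words and phrases across abstracts, might look like the sketch below. The two sample sentences and the names `tokenize` and `ngram_counts` are invented for illustration; the article's own pipeline, stop-word handling, and corpus are not described in this abstract.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs (hyphenated words are split)."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(texts, n=1):
    """Count n-word phrases across a collection of texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(" ".join(tokens[i:i + n])
                      for i in range(len(tokens) - n + 1))
    return counts


# Invented sample sentences standing in for study abstracts.
abstracts = [
    "Health equity in community settings.",
    "Community health and equity.",
]
print(ngram_counts(abstracts, n=1).most_common(3))
print(ngram_counts(abstracts, n=2)["health equity"])
```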


Subject(s)
Community Psychiatry/methods , Community-Based Participatory Research/methods , Health Equity/statistics & numerical data , Knowledge Discovery/methods , Natural Language Processing , Social Determinants of Health/statistics & numerical data , Humans , Periodicals as Topic
8.
Tijdschr Psychiatr ; 63(4): 294-300, 2021.
Article in Dutch | MEDLINE | ID: mdl-33913146

ABSTRACT

BACKGROUND: Today, almost every psychiatric care institution registers information concerning the care they provide in an electronic health record (EHR). By analyzing these health care data with innovative and advanced techniques, they can be an important source of new knowledge in the near future, and thereby contribute to improving psychiatric care. AIM: To investigate how data from EHRs can provide relevant knowledge and insights for psychiatric care. METHOD: We designed and discussed solutions for some technical, organizational and ethical barriers surrounding unlocking health care data, in order to make analysis possible. We then analyzed the obtained health care data using techniques from knowledge discovery, the process in which new and useful information is extracted from data. We used techniques from data visualization, machine learning and natural language processing, among others, to demonstrate which types of results can be achieved. RESULTS: Our approach showed that it is possible to find new and interesting insights that are hidden in EHRs on an aggregated level, in collaboration with healthcare professionals and patients. In particular we showed how the risk of violent behavior can effectively and accurately be assessed based on clinical text in the EHR. CONCLUSION: After addressing some of the important challenges surrounding analyzing EHR data, learning from data from EHRs is a new and interesting approach with clear potential for improving psychiatric care.


Subject(s)
Electronic Health Records , Knowledge Discovery , Psychiatry , Humans
9.
J Biomed Inform ; 117: 103743, 2021 05.
Article in English | MEDLINE | ID: mdl-33753268

ABSTRACT

Accurate forecasting of medical service requirements is an important big data problem that is crucial for resource management in critical times such as natural disasters and pandemics. With the global spread of coronavirus disease 2019 (COVID-19), several concerns have been raised regarding the ability of medical systems to handle sudden changes in the daily routines of healthcare providers. One significant problem is the management of ambulance dispatch and control during a pandemic. To help address this problem, we first analyze ambulance dispatch data records from April 2014 to August 2020 for Nagoya City, Japan. Significant changes were observed in the data during the pandemic, including the state of emergency (SoE) declared across Japan. In this study, we propose a deep learning framework based on recurrent neural networks to estimate the number of emergency ambulance dispatches (EADs) during a SoE. The fusion of data includes environmental factors, the localization data of mobile phone users, and the past history of EADs, thereby providing a general framework for knowledge discovery and better resource management. The results indicate that the proposed blend of training data can be used efficiently in a real-world estimation of EAD requirements during periods of high uncertainties such as pandemics.
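
Before a recurrent network can be trained on fused data of this kind, the daily series usually have to be cut into fixed-length input sequences paired with the value to predict. The sketch below shows that sliding-window step only, not the network itself. The feature choice (one temperature value and one mobility value per day), the window length, the numbers, and the name `make_windows` are all assumptions made for illustration.

```python
def make_windows(temperature, mobility, dispatches, window=3):
    """Pair each run of `window` consecutive days with the next day's dispatch count.

    Each day is fused into one feature tuple: (temperature, mobility, dispatches).
    """
    days = list(zip(temperature, mobility, dispatches))
    samples = []
    for end in range(window, len(days)):
        history = days[end - window:end]  # input sequence for the recurrent model
        target = dispatches[end]          # value the model should estimate
        samples.append((history, target))
    return samples


# Invented daily values, not the Nagoya City records.
temperature = [10, 12, 11, 15, 14]
mobility = [0.9, 0.8, 0.5, 0.4, 0.6]
dispatches = [300, 310, 280, 260, 270]

for history, target in make_windows(temperature, mobility, dispatches):
    print(history, "->", target)
```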


Subject(s)
Ambulances , COVID-19 , Emergency Medical Services , Knowledge Discovery , Deep Learning , Health Resources , Humans , Japan , Neural Networks, Computer , Pandemics
10.
Appl Clin Inform ; 12(2): 245-250, 2021 03.
Article in English | MEDLINE | ID: mdl-33763846

ABSTRACT

BACKGROUND: Clinicians express concern that they may be unaware of important information in voluminous scanned and other outside documents contained in electronic health records (EHRs). An example is "unrecognized EHR risk factor information," defined as risk factors for heritable cancer that exist within a patient's EHR but are not known by current treating providers. In a related study using manual EHR chart review, we found that half of the women whose EHR contained risk factor information met criteria for further genetic risk evaluation for heritable forms of breast and ovarian cancer. They were not referred for genetic counseling. OBJECTIVES: The purpose of this study was to compare automated methods (optical character recognition with natural language processing) with human review in their ability to identify risk factors for heritable breast and ovarian cancer within EHR scanned documents. METHODS: We evaluated the accuracy of the chart review by comparing our criterion standard (physician chart review) with an automated method involving Amazon's Textract service (Amazon.com, Seattle, Washington, United States), a clinical language annotation modeling and processing toolkit (CLAMP) (Center for Computational Biomedicine at The University of Texas Health Science, Houston, Texas, United States), and a custom-written Java application. RESULTS: We found that automated methods identified most cancer risk factor information that would otherwise require manual review by a clinician and is therefore at risk of being missed. CONCLUSION: The use of automated methods for identification of heritable risk factors within EHRs may provide an accurate yet rapid review of patients' past medical histories. These methods could be further strengthened via improved analysis of handwritten notes, tables, and colloquial phrases.


Subject(s)
Knowledge Discovery , Electronic Health Records , Female , Humans , Natural Language Processing , Risk Factors , Texas , United States
11.
BMC Bioinformatics ; 22(1): 107, 2021 Mar 04.
Article in English | MEDLINE | ID: mdl-33663372

ABSTRACT

BACKGROUND: Visual exploration of gene product behavior across multiple omic datasets can pinpoint technical limitations in data and reveal biological trends. Still, such exploration is challenging as there is a need for visualizations that are tailored for the purpose. RESULTS: The OmicLoupe software was developed to facilitate visual data exploration and provides more than 15 interactive cross-dataset visualizations for omics data. It expands visualizations to multiple datasets for quality control, statistical comparisons and overlap and correlation analyses, while allowing for rapid inspection and downloading of selected features. The usage of OmicLoupe is demonstrated in three different studies, where it allowed for detection of both technical data limitations and biological trends across different omic layers. An example is an analysis of SARS-CoV-2 infection based on two previously published studies, where OmicLoupe facilitated the identification of gene products with consistent expression changes across datasets at both the transcript and protein levels. CONCLUSIONS: OmicLoupe provides fast exploration of omics data with tailored visualizations for comparisons within and across data layers. The interactive visualizations are highly informative and are expected to be useful in various analyses of both newly generated and previously published data. OmicLoupe is available at quantitativeproteomics.org/omicloupe.


Subject(s)
Computational Biology/instrumentation , Knowledge Discovery , Software , COVID-19/genetics , Data Interpretation, Statistical , Humans , Proteome , Transcriptome
12.
Curr Protoc ; 1(3): e90, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33780170

ABSTRACT

Profiling samples from patients, tissues, and cells with genomics, transcriptomics, epigenomics, proteomics, and metabolomics ultimately produces lists of genes and proteins that need to be further analyzed and integrated in the context of known biology. Enrichr (Chen et al., 2013; Kuleshov et al., 2016) is a gene set search engine that enables the querying of hundreds of thousands of annotated gene sets. Enrichr uniquely integrates knowledge from many high-profile projects to provide synthesized information about mammalian genes and gene sets. The platform provides various methods to compute gene set enrichment, and the results are visualized in several interactive ways. This protocol provides a summary of the key features of Enrichr, which include using Enrichr programmatically and embedding an Enrichr button on any website. © 2021 Wiley Periodicals LLC.
Basic Protocol 1: Analyzing lists of differentially expressed genes from transcriptomics, proteomics and phosphoproteomics, GWAS studies, or other experimental studies
Basic Protocol 2: Searching Enrichr by a single gene or key search term
Basic Protocol 3: Preparing raw or processed RNA-seq data through BioJupies in preparation for Enrichr analysis
Basic Protocol 4: Analyzing gene sets for model organisms using modEnrichr
Basic Protocol 5: Using Enrichr in Geneshot
Basic Protocol 6: Using Enrichr in ARCHS4
Basic Protocol 7: Using the enrichment analysis visualization Appyter to visualize Enrichr results
Basic Protocol 8: Using the Enrichr API
Basic Protocol 9: Adding an Enrichr button to a website.
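
The abstract says the platform computes gene set enrichment but does not describe the statistics involved. As a general illustration of what such a computation can look like, the sketch below implements a textbook over-representation test: the one-sided hypergeometric tail probability, which is equivalent to a one-sided Fisher exact test. The function name `overrepresentation_p` and the example counts are hypothetical, and this should not be read as Enrichr's own scoring method.

```python
from math import comb


def overrepresentation_p(overlap, query_size, set_size, universe_size):
    """Probability of seeing at least `overlap` annotated genes by chance.

    Models drawing `query_size` genes without replacement from a universe of
    `universe_size` genes, of which `set_size` belong to the annotated gene set.
    """
    upper = min(query_size, set_size)
    total = comb(universe_size, query_size)
    tail = sum(comb(set_size, k) * comb(universe_size - set_size, query_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical counts: 8 of 50 query genes fall in a 200-gene set,
# out of a 20,000-gene universe.
print(overrepresentation_p(overlap=8, query_size=50, set_size=200, universe_size=20000))
```

A small p-value suggests that the query list overlaps the annotated set more than random sampling would explain; when many gene sets are tested, the p-values would normally be corrected for multiple testing.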


Subject(s)
Knowledge Discovery , Software , Animals , Computational Biology , Genomics , Humans , RNA-Seq
14.
J Biomed Inform ; 116: 103716, 2021 04.
Article in English | MEDLINE | ID: mdl-33647519

ABSTRACT

Corpora are one of the most valuable resources at present for building machine learning systems. However, building new corpora is an expensive task, which makes the automatic extension of corpora a highly attractive task to develop. Hence, finding new strategies that reduce the cost and effort involved in this task, while at the same time guaranteeing quality, remains an open and important challenge for the research community. In this paper, we present a set of ensembling strategies oriented toward entity and relation extraction tasks. The main goal is to combine several automatically annotated versions of corpora to produce a single version with improved quality. An ensembler is built by exploring a configuration space in search of the combination that maximizes the fitness of the ensembled collection according to a reference collection. The eHealth-KD 2019 challenge was chosen for the case study. The submitted systems' outputs were ensembled, resulting in the construction of an automatically annotated collection of 8000 sentences. We show that using this collection as additional training input for a baseline algorithm has a positive impact on its performance. Additionally, the ensembling pipeline was used as a participant system in the 2020 edition of the challenge. The ensembled run achieved a slightly better performance than the individual runs.
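
One very simple point in such a configuration space is a voting ensemble whose only parameter is the number of systems that must agree on an annotation. The sketch below searches that single parameter for the value that maximizes F1 against a reference collection. The annotation tuples, the three system outputs, and the names `vote_ensemble`, `f1`, and `best_threshold` are invented for illustration; the ensembling strategies explored in the paper are broader than this.

```python
from collections import Counter


def vote_ensemble(system_outputs, min_votes):
    """Keep annotations proposed by at least `min_votes` systems."""
    votes = Counter()
    for annotations in system_outputs:
        votes.update(set(annotations))  # each system votes once per annotation
    return {ann for ann, count in votes.items() if count >= min_votes}


def f1(predicted, reference):
    """F1 of a predicted annotation set against a reference set."""
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)


def best_threshold(system_outputs, reference):
    """Pick the vote threshold whose ensemble scores highest on the reference."""
    thresholds = range(1, len(system_outputs) + 1)
    return max(thresholds,
               key=lambda m: f1(vote_ensemble(system_outputs, m), reference))


# Invented annotations: (sentence id, start, end, label).
sys_a = {(0, 0, 6, "Concept"), (0, 10, 15, "Action")}
sys_b = {(0, 0, 6, "Concept")}
sys_c = {(0, 0, 6, "Concept"), (0, 10, 15, "Action"), (0, 20, 25, "Concept")}
reference = {(0, 0, 6, "Concept"), (0, 10, 15, "Action")}

print(best_threshold([sys_a, sys_b, sys_c], reference))
```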


Subject(s)
Knowledge Discovery , Telemedicine , Algorithms , Humans , Language , Machine Learning , Natural Language Processing
15.
J Biomed Inform ; 115: 103696, 2021 03.
Article in English | MEDLINE | ID: mdl-33571675

ABSTRACT

OBJECTIVE: To discover candidate drugs to repurpose for COVID-19 using literature-derived knowledge and knowledge graph completion methods. METHODS: We propose a novel, integrative, and neural network-based literature-based discovery (LBD) approach to identify drug candidates from PubMed and other COVID-19-focused research literature. Our approach relies on semantic triples extracted using SemRep (via SemMedDB). We identified an informative and accurate subset of semantic triples using filtering rules and an accuracy classifier developed on a BERT variant. We used this subset to construct a knowledge graph, and applied five state-of-the-art, neural knowledge graph completion algorithms (i.e., TransE, RotatE, DistMult, ComplEx, and STELP) to predict drug repurposing candidates. The models were trained and assessed using a time slicing approach and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. These models were complemented by a discovery pattern-based approach. RESULTS: The accuracy classifier based on PubMedBERT achieved the best performance (F1 = 0.854) in identifying accurate semantic predications. Among the five knowledge graph completion models, TransE outperformed the others (MR = 0.923, Hits@1 = 0.417). Some known drugs linked to COVID-19 in the literature were identified, as well as others that have not yet been studied. Discovery patterns enabled identification of additional candidate drugs and generation of plausible hypotheses regarding the links between the candidate drugs and COVID-19. Among them, five highly ranked and novel drugs (i.e., paclitaxel, SB 203580, alpha 2-antiplasmin, metoclopramide, and oxymatrine) and the mechanistic explanations for their potential use are further discussed. CONCLUSION: We showed that an LBD approach can be feasible not only for discovering drug candidates for COVID-19, but also for generating mechanistic explanations. Our approach can be generalized to other diseases as well as to other clinical questions. Source code and data are available at https://github.com/kilicogluh/lbd-covid.
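
To make the knowledge graph completion step more concrete, the sketch below shows the scoring idea behind TransE, the best-performing model named above: a triple (head, relation, tail) is considered plausible when the head embedding plus the relation embedding lands close to the tail embedding. It also shows how a Hits@k figure is obtained from ranked candidates. The two-dimensional embeddings, the entity names, and the helper names are made up for illustration; trained embeddings and the authors' evaluation code live in the repository linked above.

```python
import math


def transe_score(head, relation, tail):
    """Negative L2 distance between head + relation and tail (higher is more plausible)."""
    return -math.sqrt(sum((h + r - t) ** 2
                          for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(candidates,
                  key=lambda name: transe_score(head, relation, candidates[name]),
                  reverse=True)


def hits_at_k(rankings, gold, k):
    """Fraction of queries whose correct entity appears in the top k."""
    hits = sum(1 for ranked, answer in zip(rankings, gold) if answer in ranked[:k])
    return hits / len(gold)


# Made-up 2-D embeddings; real models learn these from the knowledge graph.
drug_a = [1.0, 0.0]
treats = [0.0, 1.0]
diseases = {
    "disease_x": [1.0, 1.0],
    "disease_y": [3.0, 0.0],
    "disease_z": [1.0, 2.0],
}

ranking = rank_tails(drug_a, treats, diseases)
print(ranking)
print(hits_at_k([ranking], ["disease_x"], k=1))
```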


Subject(s)
COVID-19/drug therapy , Drug Repositioning , Knowledge Discovery , Algorithms , Antiviral Agents/therapeutic use , COVID-19/virology , Humans , Neural Networks, Computer , SARS-CoV-2/isolation & purification
20.
Br J Soc Psychol ; 60(1): 1-28, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33616965

ABSTRACT

The COVID-19 pandemic points to the need for scientists to pool their efforts in order to understand this disease and respond to the ensuing crisis. Other global challenges also require such scientific cooperation. Yet in academic institutions, reward structures and incentives are based on systems that primarily fuel the competition between (groups of) scientific researchers. Competition between individual researchers, research groups, research approaches, and scientific disciplines is seen as an important selection mechanism and driver of academic excellence. These expected benefits of competition have come to define the organizational culture in academia. There are clear indications that the overreliance on competitive models undermines cooperative exchanges that might lead to higher quality insights. This damages the well-being and productivity of individual researchers and impedes efforts towards collaborative knowledge generation. Insights from social and organizational psychology on the side effects of relying on performance targets, prioritizing the achievement of success over the avoidance of failure, and emphasizing self-interest and efficiency, clarify implicit mechanisms that may spoil valid attempts at transformation. The analysis presented here elucidates that a broader change in the academic culture is needed to truly benefit from current attempts to create more open and collaborative practices for cumulative knowledge generation.


Subject(s)
Interdisciplinary Communication , Intersectoral Collaboration , Knowledge Discovery , Science/education , Curriculum , Efficiency , Humans , Knowledge Discovery/methods , Research/education
...