Results 1 - 8 of 8
1.
J Biomed Inform; 146: 104486, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37722445

ABSTRACT

Large neural Pre-trained Language Models (PLMs) have recently gained much attention due to their noteworthy performance in many downstream Information Retrieval (IR) and Natural Language Processing (NLP) tasks. PLMs can be categorized as either general-purpose, trained on resources such as large-scale Web corpora, or domain-specific, trained on in-domain or mixed-domain corpora. While domain-specific PLMs have shown promising performance on domain-specific tasks, they are significantly more computationally expensive than general-purpose PLMs because they must be retrained or trained from scratch. The objective of this paper is to explore whether general-purpose PLMs can be leveraged to achieve performance competitive with domain-specific PLMs, without the expense of retraining them for domain-specific tasks. Focusing on the recent BioASQ Biomedical Question Answering task, we show that different general-purpose PLMs exhibit synergistic behaviour and can yield notable overall performance improvements when used in tandem. More concretely, given a set of general-purpose PLMs, we propose a self-supervised method for training a classifier that systematically selects, on a per-input basis, the PLM most likely to answer the question correctly. We show that, through such a selection strategy, the performance of general-purpose PLMs can become competitive with domain-specific PLMs while remaining computationally light, since there is no need to retrain the large language model itself. In experiments on the BioASQ dataset, a large-scale biomedical question-answering benchmark, the proposed selection strategy yields statistically significant performance improvements for general-purpose language models: an average of 16.7% when using only lighter models such as DistilBERT and DistilRoBERTa, and 14.2% when using relatively larger models such as BERT and RoBERTa, making their performance competitive with domain-specific large language models such as PubMedBERT.
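The per-input selection strategy described above can be pictured as a lightweight "router" trained on top of frozen question-answering models. The sketch below is an illustrative Python approximation, not the authors' implementation: it assumes each general-purpose PLM is wrapped as a black-box answer function and uses simple TF-IDF features with a logistic-regression classifier in place of whatever representation the paper actually uses.

```python
# Hypothetical sketch of a per-input PLM selector, under the assumptions above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def train_selector(questions, gold_answers, plms):
    """plms: list of callables, each mapping a question string to an answer string."""
    vec = TfidfVectorizer(max_features=5000)
    X = vec.fit_transform(questions)
    # Self-supervised labels: index of the first PLM whose answer matches the gold
    # answer (fall back to 0 when none does). Exact string match is a simplification.
    y = []
    for q, gold in zip(questions, gold_answers):
        label = 0
        for i, plm in enumerate(plms):
            if plm(q).strip().lower() == gold.strip().lower():
                label = i
                break
        y.append(label)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    return vec, clf

def answer(question, vec, clf, plms):
    # Route the incoming question to the PLM predicted to answer it correctly.
    idx = clf.predict(vec.transform([question]))[0]
    return plms[idx](question)
```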

2.
J Biomed Inform; 71: 91-109, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28552401

ABSTRACT

Recently, both researchers and practitioners have explored the possibility of semantically annotating large and continuously evolving collections of biomedical texts such as research papers, medical reports, and physician notes in order to enable their efficient and effective management and use in clinical practice or research laboratories. Such annotations can be automatically generated by biomedical semantic annotators - tools that are specifically designed for detecting and disambiguating biomedical concepts mentioned in text. The biomedical community has already presented several solid automated semantic annotators. However, the existing tools are either strong in their disambiguation capacity, i.e., the ability to identify the correct biomedical concept for a given piece of text among several candidate concepts, or they excel in their processing time, i.e., work very efficiently, but none of the semantic annotation tools reported in the literature has both of these qualities. In this paper, we present RysannMD (Ryerson Semantic Annotator for Medical Domain), a biomedical semantic annotation tool that strikes a balance between processing time and performance while disambiguating biomedical terms. In other words, RysannMD provides reasonable disambiguation performance when choosing the right sense for a biomedical term in a given context, and does so in a reasonable time. To examine how RysannMD stands with respect to state-of-the-art biomedical semantic annotators, we have conducted a series of experiments using standard benchmarking corpora, including both gold and silver standards, and four modern biomedical semantic annotators, namely cTAKES, MetaMap, NOBLE Coder, and Neji. The annotators were compared with respect to the quality of the produced annotations, measured against gold and silver standards using precision, recall, and F1 measure, as well as speed, i.e., processing time. In the experiments, RysannMD achieved the best median F1 measure across the benchmarking corpora, independent of the standard used (silver/gold), biomedical subdomain, and document size. In terms of annotation speed, RysannMD scored the second-best median processing time across all the experiments. The obtained results indicate that RysannMD offers the best performance among the examined semantic annotators when both quality of annotation and speed are considered simultaneously.
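The evaluation described above - comparing annotators against gold and silver standards on both annotation quality and processing time - can be sketched as follows. The representation of annotations as (document, span, concept) tuples and the `annotate` callable are assumptions made for illustration, not RysannMD's or the benchmark corpora's actual interfaces.

```python
# Illustrative benchmarking harness: precision, recall, F1, and wall-clock time.
import time

def evaluate(annotate, documents, gold):
    """annotate: callable mapping a document to a set of (doc_id, span, concept) tuples.
       gold: the reference set of such tuples for the whole corpus."""
    predicted = set()
    start = time.perf_counter()
    for doc in documents:
        predicted |= annotate(doc)
    elapsed = time.perf_counter() - start

    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "seconds": elapsed}
```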


Subject(s)
Data Curation, Natural Language Processing, Semantics, Data Mining, Humans
3.
BMJ Open; 14(7): e084124, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38969371

ABSTRACT

BACKGROUND: Systematic reviews (SRs) are being published at an accelerated rate. Decision-makers may struggle with comparing and choosing among multiple SRs on the same topic. We aimed to understand how healthcare decision-makers (eg, practitioners, policymakers, researchers) use SRs to inform decision-making and to explore the potential role of a proposed artificial intelligence (AI) tool to assist in critical appraisal and choosing among SRs. METHODS: We developed a survey with 21 open and closed questions. We followed a knowledge translation plan to disseminate the survey through social media and professional networks. RESULTS: Our survey response rate was lower than expected (7.9% of distributed emails). Of the 684 respondents, 58.2% identified as researchers, 37.1% as practitioners, 19.2% as students and 13.5% as policymakers. Respondents frequently sought out SRs (97.1%) as a source of evidence to inform decision-making. They frequently (97.9%) found more than one SR on a given topic of interest to them. Just over half (50.8%) struggled to choose the most trustworthy SR among multiple. These difficulties were related to lack of time (55.2%), or to difficulties comparing SRs because of their varying methodological quality (54.2%), differences in results and conclusions (49.7%) or variation in the included studies (44.6%). Respondents compared SRs based on relevance to their question of interest, methodological quality, and recency of the SR search. Most respondents (87.0%) were interested in an AI tool to help appraise and compare SRs. CONCLUSIONS: Given the identified barriers to using SR evidence, an AI tool that facilitates comparison of SRs' relevance, search, and methodological quality could help users efficiently choose among SRs and make healthcare decisions.


Subject(s)
Artificial Intelligence, Decision Making, Systematic Reviews as Topic, Humans, Systematic Reviews as Topic/methods, Surveys and Questionnaires, Decision Support Techniques, Delivery of Health Care
4.
Syst Rev; 10(1): 156, 2021 May 26.
Article in English | MEDLINE | ID: mdl-34039433

ABSTRACT

BACKGROUND: Current text mining tools supporting abstract screening in systematic reviews are not widely used, in part because they lack sensitivity and precision. We set out to develop an accessible, semi-automated "workflow" to conduct abstract screening for systematic reviews and other knowledge synthesis methods. METHODS: We adopted widely recommended text-mining and machine-learning methods to (1) process title-abstracts into numerical training data and (2) train a classification model to predict eligible abstracts. The predicted abstracts are screened by human reviewers for "true" eligibility, and the newly eligible abstracts are used to identify similar abstracts via near-neighbor methods, which are also screened. These abstracts, together with their eligibility results, are used to update the classification model, and the above steps are iterated until no new eligible abstracts are identified. The workflow was implemented in R and evaluated using a systematic review of insulin formulations for type-1 diabetes (14,314 abstracts) and a scoping review of knowledge-synthesis methods (17,200 abstracts). Workflow performance was evaluated against the recommended practice of independent screening of abstracts by 2 reviewers. Standard measures were examined: sensitivity (inclusion of all truly eligible abstracts), specificity (exclusion of all truly ineligible abstracts), precision (proportion of truly eligible abstracts among all abstracts screened as eligible), F1-score (harmonic mean of sensitivity and precision), and accuracy (correctly predicted eligible or ineligible abstracts). Workload reduction was measured as the hours the workflow saved, given that only a subset of abstracts needed human screening. RESULTS: For the systematic and scoping reviews respectively, the workflow attained 88%/89% sensitivity, 99%/99% specificity, 71%/72% precision, an F1-score of 79%/79%, 98%/97% accuracy, 63%/55% workload reduction, with 12%/11% fewer abstracts for full-text retrieval and screening, and 0%/1.5% missed studies in the completed reviews. CONCLUSION: The workflow was a sensitive, precise, and efficient alternative to the recommended practice of screening abstracts with 2 reviewers. All eligible studies were identified in the first case, while 6 studies (1.5%) were missed in the second; these would likely not have impacted the review's conclusions. We have described the workflow in language accessible to reviewers with limited exposure to natural language processing and machine learning, and have made the code available to reviewers.
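A minimal sketch of the iterative screening loop described above follows. The published workflow was implemented in R; this Python version with TF-IDF features, a logistic-regression classifier, and cosine near neighbours is an illustrative approximation only, and `human_screen` stands in for the reviewers' eligibility decisions.

```python
# Assumed sketch of the iterate-until-no-new-eligible-abstracts loop.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def screen(abstracts, seed_labels, human_screen, max_iter=20):
    """abstracts: list of title-abstract strings.
       seed_labels: dict {index: bool} of abstracts already screened (both classes needed).
       human_screen: callable returning the "true" eligibility of an abstract."""
    vec = TfidfVectorizer(stop_words="english", max_features=20000)
    X = vec.fit_transform(abstracts)
    labels = dict(seed_labels)                     # index -> eligible (True/False)
    nn = NearestNeighbors(n_neighbors=5, metric="cosine").fit(X)

    for _ in range(max_iter):
        idx = sorted(labels)
        clf = LogisticRegression(max_iter=1000).fit(X[idx], [labels[i] for i in idx])
        # Unscreened abstracts the current model predicts to be eligible.
        queue = [i for i in range(len(abstracts))
                 if i not in labels and clf.predict(X[i])[0]]
        found_new = False
        while queue:
            i = queue.pop()
            if i in labels:
                continue
            labels[i] = bool(human_screen(abstracts[i]))   # reviewer decision
            if labels[i]:
                found_new = True
                # Also queue near neighbours of the newly eligible abstract.
                _, nbrs = nn.kneighbors(X[i])
                queue.extend(int(j) for j in nbrs[0] if int(j) not in labels)
        if not found_new:                          # stop: no new eligible abstracts
            break
    return labels
```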


Subject(s)
Data Mining, Natural Language Processing, Humans, Machine Learning, Systematic Reviews as Topic, Workflow
5.
J Am Med Inform Assoc; 25(7): 819-826, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29648604

ABSTRACT

Objective: The goal of this work is to map Unified Medical Language System (UMLS) concepts to DBpedia resources using widely accepted ontology relations from the Simple Knowledge Organization System (skos:exactMatch, skos:closeMatch) and from the Resource Description Framework Schema (rdfs:seeAlso). The resulting complete mapping from UMLS 2016AA to DBpedia 2015-10 is made publicly available and includes 221,690 skos:exactMatch, 26,276 skos:closeMatch, and 6,784,322 rdfs:seeAlso mappings. Methods: We propose a method called circular resolution that utilizes a combination of semantic annotators to map UMLS concepts to DBpedia resources. One set of annotators annotates definitions of UMLS concepts, returning DBpedia resources, while another set annotates DBpedia resource abstracts, returning UMLS concepts. Our pipeline aligns these 2 sets of annotations to determine appropriate mappings from UMLS to DBpedia. Results: We evaluate our proposed method using structured data from the Wikidata knowledge base as the ground truth, which consists of 4899 existing UMLS-to-DBpedia mappings. Our results show 83% recall and 77% precision-at-one (P@1) in mapping UMLS concepts to DBpedia resources on this test set. Conclusions: The proposed circular resolution method is a simple yet effective technique for linking UMLS concepts to DBpedia resources. Experiments using Wikidata-based ground truth reveal a high mapping accuracy. In addition to the complete UMLS mapping, which is downloadable in N-Triples format, we provide an online browser and a RESTful service to explore the mappings.
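The circular-resolution idea can be sketched roughly as follows: keep a UMLS-to-DBpedia candidate when annotating the DBpedia resource's abstract points back to the same UMLS concept. The annotator callables and the simple mutual-agreement rule are assumptions for illustration; the paper's actual pipeline and its assignment of skos:exactMatch, skos:closeMatch, or rdfs:seeAlso are not reproduced here.

```python
# Simplified circular resolution under the assumptions stated above.
def circular_resolution(umls_definitions, dbpedia_abstracts,
                        annotate_to_dbpedia, annotate_to_umls):
    """umls_definitions: dict CUI -> definition text.
       dbpedia_abstracts: dict DBpedia URI -> abstract text.
       annotate_to_dbpedia / annotate_to_umls: callables returning sets of identifiers."""
    forward = {cui: annotate_to_dbpedia(text)          # CUI -> candidate URIs
               for cui, text in umls_definitions.items()}
    backward = {uri: annotate_to_umls(text)            # URI -> candidate CUIs
                for uri, text in dbpedia_abstracts.items()}

    mappings = {}
    for cui, uris in forward.items():
        # Keep candidates whose abstracts annotate back to the same UMLS concept.
        confirmed = [uri for uri in uris if cui in backward.get(uri, set())]
        if confirmed:
            mappings[cui] = confirmed
    return mappings
```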


Subject(s)
Algorithms, Knowledge Bases, Unified Medical Language System, Vocabulary, Controlled, Cloud Computing, Datasets as Topic, Medical Informatics Applications, Semantic Web
6.
J Psychosom Res; 106: 70-72, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29455902

ABSTRACT

BACKGROUND: About 8% of U.S. women are prescribed antidepressant medications around the time of pregnancy. Decisions about medication use in pregnancy can be swayed by the opinions of family, friends, and online media, sometimes beyond the advice offered by healthcare providers. Exploring the online social network response to research on antidepressant use in pregnancy could provide insight into how to optimize decision-making in this complex area. METHODS: For all 17 research articles on the safety of antidepressant use in pregnancy published in 2012, we explored online social network activity regarding antidepressant use in pregnancy on Twitter in the 48 hours after each study was published, compared with the social network activity in the same period 1 week prior to each article's publication. RESULTS: Online social network activity about antidepressants in pregnancy quickly doubled upon study publication. The increased activity was driven by studies demonstrating harm associated with antidepressants, lower-quality studies, and studies whose abstracts presented relative rather than absolute risks. IMPLICATIONS: These findings support a call for leadership from medical journals to consider how best to incentivize and support a balanced and clear translation of knowledge around antidepressant safety in pregnancy to their readership and the public.


Subject(s)
Antidepressive Agents/therapeutic use, Social Media, Social Networking, Adult, Attitude, Decision Making, Female, Health Knowledge, Attitudes, Practice, Health Personnel, Humans, Pregnancy
7.
J Clin Epidemiol; 103: 101-111, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30297037

ABSTRACT

OBJECTIVES: To illustrate the use of process mining concepts, techniques, and tools to improve the systematic review process. STUDY DESIGN AND SETTING: We simulated the review activities and step-specific methods of systematic reviews conducted by one research team over 1 year to generate an event log of activities, with start/end dates, reviewer assignment by expertise, and person-hours worked. Process mining techniques were applied to the event log to "discover" process models, which allowed visual display, animation, or replay of the simulated review activities. Summary statistics were calculated for person-time and timelines. We also analyzed the social networks of team interactions. RESULTS: The 12 simulated reviews included an average of 3,831 titles/abstracts (range: 1,565-6,368) and 20 studies (range: 6-42). The average review completion time was 463 days (range: 289-629), or 881 person-hours (range: 243-1,752). On average, person-hours were distributed across activities as follows: study selection 26%, data collection 24%, report preparation 23%, and meta-analysis 17%. Social network analyses showed the organizational interaction of team members, including how they worked together to complete review tasks and handed over tasks upon completion. CONCLUSION: Event logs and process mining can be valuable tools for research teams interested in improving and modernizing the systematic review process.
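The kind of event log and summary statistics described above can be illustrated with a small, entirely invented example; the column names and values below are placeholders, and dedicated process-mining tools would be used for actual model discovery, animation, and replay.

```python
# Illustrative event log (invented data) and two of the reported summary statistics.
import pandas as pd

events = pd.DataFrame([
    {"review": "SR-01", "activity": "study selection", "reviewer": "A",
     "start": "2017-01-09", "end": "2017-02-20", "person_hours": 110},
    {"review": "SR-01", "activity": "data collection", "reviewer": "B",
     "start": "2017-02-21", "end": "2017-04-03", "person_hours": 95},
    {"review": "SR-01", "activity": "meta-analysis", "reviewer": "A",
     "start": "2017-04-04", "end": "2017-05-01", "person_hours": 60},
])
events["start"] = pd.to_datetime(events["start"])
events["end"] = pd.to_datetime(events["end"])

# Share of total person-hours by activity (cf. the percentages reported above).
hours = events.groupby("activity")["person_hours"].sum()
print((hours / hours.sum()).round(2))

# Elapsed calendar days per review, from first activity start to last activity end.
spans = events.groupby("review").agg(first=("start", "min"), last=("end", "max"))
print((spans["last"] - spans["first"]).dt.days)
```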


Subject(s)
Data Mining, Research Design/standards, Systematic Reviews as Topic, Humans, Interprofessional Relations, Models, Theoretical, Quality Improvement, Research Personnel, Social Networking, Time Factors
8.
J Biomed Semantics; 8(1): 44, 2017 Sep 22.
Article in English | MEDLINE | ID: mdl-28938912

ABSTRACT

The abundance and unstructured nature of biomedical texts, be it clinical or research content, impose significant challenges for the effective and efficient use of the information and knowledge stored in such texts. Annotation of biomedical documents with machine-intelligible semantics facilitates advanced, semantics-based text management, curation, indexing, and search. This paper focuses on the annotation of biomedical entity mentions with concepts from relevant biomedical knowledge bases such as UMLS. As a result, the meaning of those mentions is unambiguously and explicitly defined, and thus made readily available for automated processing. This process is widely known as semantic annotation, and the tools that perform it are known as semantic annotators. Over the last dozen years, the biomedical research community has invested significant effort in the development of biomedical semantic annotation technology. Aiming to establish grounds for further developments in this area, we review a selected set of state-of-the-art biomedical semantic annotators, focusing particularly on general-purpose annotators, that is, semantic annotation tools that can be customized to work with texts from any area of biomedicine. We also examine potential directions for further improvements to today's annotators which could make them even more capable of meeting the needs of real-world applications. To motivate and encourage further developments in this area, along the suggested and/or related directions, we review existing and potential practical applications and benefits of semantic annotators.


Subject(s)
Biological Ontologies, Biomedical Research, Semantics, Electronic Health Records, Humans, Information Storage and Retrieval