Búsqueda | BVS CLAP/SMR-OPS/OMS

1.

Large language models encode clinical knowledge.

Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; Mahdavi, S Sara; Wei, Jason; Chung, Hyung Won; Scales, Nathan; Tanwani, Ajay; Cole-Lewis, Heather; Pfohl, Stephen; Payne, Perry; Seneviratne, Martin; Gamble, Paul; Kelly, Chris; Babiker, Abubakr; Schärli, Nathanael; Chowdhery, Aakanksha; Mansfield, Philip; Demner-Fushman, Dina; Agüera Y Arcas, Blaise; Webster, Dale; Corrado, Greg S; Matias, Yossi; Chou, Katherine; Gottweis, Juraj; Tomasev, Nenad; Liu, Yun; Rajkomar, Alvin; Barral, Joelle; Semturs, Christopher; Karthikesalingam, Alan; Natarajan, Vivek.

Nature ; 620(7972): 172-180, 2023 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-37438534

RESUMEN

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model1 (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM2 on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA3, MedMCQA4, PubMedQA5 and Measuring Massive Multitask Language Understanding (MMLU) clinical topics6), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.

Asunto(s)

Benchmarking , Simulación por Computador , Conocimiento , Medicina , Procesamiento de Lenguaje Natural , Sesgo , Competencia Clínica , Comprensión , Conjuntos de Datos como Asunto , Concesión de Licencias , Medicina/métodos , Medicina/normas , Seguridad del Paciente , Médicos

2.

Publisher Correction: Large language models encode clinical knowledge.

Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; Mahdavi, S Sara; Wei, Jason; Chung, Hyung Won; Scales, Nathan; Tanwani, Ajay; Cole-Lewis, Heather; Pfohl, Stephen; Payne, Perry; Seneviratne, Martin; Gamble, Paul; Kelly, Chris; Babiker, Abubakr; Schärli, Nathanael; Chowdhery, Aakanksha; Mansfield, Philip; Demner-Fushman, Dina; Agüera Y Arcas, Blaise; Webster, Dale; Corrado, Greg S; Matias, Yossi; Chou, Katherine; Gottweis, Juraj; Tomasev, Nenad; Liu, Yun; Rajkomar, Alvin; Barral, Joelle; Semturs, Christopher; Karthikesalingam, Alan; Natarajan, Vivek.

Nature ; 620(7973): E19, 2023 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-37500979

3.

Medical Image Retrieval via Nearest Neighbor Search on Pre-trained Image Features.

Gupta, Deepak; Loane, Russell; Gayen, Soumya; Demner-Fushman, Dina.

Knowl Based Syst ; 2782023 Oct 25.

Artículo en Inglés | MEDLINE | ID: mdl-37780058

RESUMEN

Nearest neighbor search, also known as NNS, is a technique used to locate the points in a high-dimensional space closest to a given query point. This technique has multiple applications in medicine, such as searching large medical imaging databases, disease classification, and diagnosis. However, when the number of points is significantly large, the brute-force approach for finding the nearest neighbor becomes computationally infeasible. Therefore, various approaches have been developed to make the search faster and more efficient to support the applications. With a focus on medical imaging, this paper proposes DenseLinkSearch (DLS), an effective and efficient algorithm that searches and retrieves the relevant images from heterogeneous sources of medical images. Towards this, given a medical database, the proposed algorithm builds an index that consists of pre-computed links of each point in the database. The search algorithm utilizes the index to efficiently traverse the database in search of the nearest neighbor. We also explore the role of medical image feature representation in content-based medical image retrieval tasks. We propose a Transformer-based feature representation technique that outperformed the existing pre-trained Transformer-based approaches on benchmark medical image retrieval datasets. We extensively tested the proposed NNS approach and compared the performance with state-of-the-art NNS approaches on benchmark datasets and our created medical image datasets. The proposed approach outperformed the existing approaches in terms of retrieving accurate neighbors and retrieval speed. In comparison to the existing approximate NNS approaches, our proposed DLS approach outperformed them in terms of lower average time per query and ≥ 99% R@10 on 11 out of 13 benchmark datasets. We also found that the proposed medical feature representation approach is better for representing medical images compared to the existing pre-trained image models. The proposed feature extraction strategy obtained an improvement of 9.37%, 7.0%, and 13.33% in terms of P@5, P@10, and P@20, respectively, in comparison to the best-performing pre-trained image model. The source code and datasets of our experiments are available at https://github.com/deepaknlp/DLS.

4.

Question-aware transformer models for consumer health question summarization.

Yadav, Shweta; Gupta, Deepak; Abacha, Asma Ben; Demner-Fushman, Dina.

J Biomed Inform ; 128: 104040, 2022 04.

Artículo en Inglés | MEDLINE | ID: mdl-35259544

RESUMEN

Searching for health information online is becoming customary for more and more consumers every day, which makes the need for efficient and reliable question answering systems more pressing. An important contributor to the success rates of these systems is their ability to fully understand the consumers' questions. However, these questions are frequently longer than needed and mention peripheral information that is not useful in finding relevant answers. Question summarization is one of the potential solutions to simplifying long and complex consumer questions before attempting to find an answer. In this paper, we study the task of abstractive summarization for real-world consumer health questions. We develop an abstractive question summarization model that leverages the semantic interpretation of a question via recognition of medical entities, which enables generation of informative summaries. Towards this, we propose multiple Cloze tasks (i.e. the task of filing missing words in a given context) to identify the key medical entities that enforce the model to have better coverage in question-focus recognition. Additionally, we infuse the decoder inputs with question-type information to generate question-type driven summaries. When evaluated on the MeQSum benchmark corpus, our framework outperformed the state-of-the-art method by 10.2 ROUGE-L points. We also conducted a manual evaluation to assess the correctness of the generated summaries.

Asunto(s)

Semántica

5.

Searching for scientific evidence in a pandemic: An overview of TREC-COVID.

Roberts, Kirk; Alam, Tasmeer; Bedrick, Steven; Demner-Fushman, Dina; Lo, Kyle; Soboroff, Ian; Voorhees, Ellen; Wang, Lucy Lu; Hersh, William R.

J Biomed Inform ; 121: 103865, 2021 09.

Artículo en Inglés | MEDLINE | ID: mdl-34245913

RESUMEN

We present an overview of the TREC-COVID Challenge, an information retrieval (IR) shared task to evaluate search on scientific literature related to COVID-19. The goals of TREC-COVID include the construction of a pandemic search test collection and the evaluation of IR methods for COVID-19. The challenge was conducted over five rounds from April to July 2020, with participation from 92 unique teams and 556 individual submissions. A total of 50 topics (sets of related queries) were used in the evaluation, starting at 30 topics for Round 1 and adding 5 new topics per round to target emerging topics at that state of the still-emerging pandemic. This paper provides a comprehensive overview of the structure and results of TREC-COVID. Specifically, the paper provides details on the background, task structure, topic structure, corpus, participation, pooling, assessment, judgments, results, top-performing systems, lessons learned, and benchmark datasets.

Asunto(s)

COVID-19 , Pandemias , Humanos , Almacenamiento y Recuperación de la Información , SARS-CoV-2

6.

Understanding spatial language in radiology: Representation framework, annotation, and spatial relation extraction from chest X-ray reports using deep learning.

Datta, Surabhi; Si, Yuqi; Rodriguez, Laritza; Shooshan, Sonya E; Demner-Fushman, Dina; Roberts, Kirk.

J Biomed Inform ; 108: 103473, 2020 08.

Artículo en Inglés | MEDLINE | ID: mdl-32562898

RESUMEN

Radiology reports contain a radiologist's interpretations of images, and these images frequently describe spatial relations. Important radiographic findings are mostly described in reference to an anatomical location through spatial prepositions. Such spatial relationships are also linked to various differential diagnoses and often described through uncertainty phrases. Structured representation of this clinically significant spatial information has the potential to be used in a variety of downstream clinical informatics applications. Our focus is to extract these spatial representations from the reports. For this, we first define a representation framework based on the Spatial Role Labeling (SpRL) scheme, which we refer to as Rad-SpRL. In Rad-SpRL, common radiological entities tied to spatial relations are encoded through four spatial roles: Trajector, Landmark, Diagnosis, and Hedge, all identified in relation to a spatial preposition (or Spatial Indicator). We annotated a total of 2,000 chest X-ray reports following Rad-SpRL. We then propose a deep learning-based natural language processing (NLP) method involving word and character-level encodings to first extract the Spatial Indicators followed by identifying the corresponding spatial roles. Specifically, we use a bidirectional long short-term memory (Bi-LSTM) conditional random field (CRF) neural network as the baseline model. Additionally, we incorporate contextualized word representations from pre-trained language models (BERT and XLNet) for extracting the spatial information. We evaluate both gold and predicted Spatial Indicators to extract the four types of spatial roles. The results are promising, with the highest average F1 measure for Spatial Indicator extraction being 91.29 (XLNet); the highest average overall F1 measure considering all the four spatial roles being 92.9 using gold Indicators (XLNet); and 85.6 using predicted Indicators (BERT pre-trained on MIMIC notes). The corpus is available in Mendeley at http://dx.doi.org/10.17632/yhb26hfz8n.1 and https://github.com/krobertslab/datasets/blob/master/Rad-SpRL.xml.

Asunto(s)

Aprendizaje Profundo , Radiología , Lenguaje , Procesamiento de Lenguaje Natural , Rayos X

7.

A question-entailment approach to question answering.

Ben Abacha, Asma; Demner-Fushman, Dina.

BMC Bioinformatics ; 20(1): 511, 2019 Oct 22.

Artículo en Inglés | MEDLINE | ID: mdl-31640539

RESUMEN

BACKGROUND: One of the challenges in large-scale information retrieval (IR) is developing fine-grained and domain-specific methods to answer natural language questions. Despite the availability of numerous sources and datasets for answer retrieval, Question Answering (QA) remains a challenging problem due to the difficulty of the question understanding and answer extraction tasks. One of the promising tracks investigated in QA is mapping new questions to formerly answered questions that are "similar". RESULTS: We propose a novel QA approach based on Recognizing Question Entailment (RQE) and we describe the QA system and resources that we built and evaluated on real medical questions. First, we compare logistic regression and deep learning methods for RQE using different kinds of datasets including textual inference, question similarity, and entailment in both the open and clinical domains. Second, we combine IR models with the best RQE method to select entailed questions and rank the retrieved answers. To study the end-to-end QA approach, we built the MedQuAD collection of 47,457 question-answer pairs from trusted medical sources which we introduce and share in the scope of this paper. Following the evaluation process used in TREC 2017 LiveQA, we find that our approach exceeds the best results of the medical task with a 29.8% increase over the best official score. CONCLUSIONS: The evaluation results support the relevance of question entailment for QA and highlight the effectiveness of combining IR and RQE for future QA efforts. Our findings also show that relying on a restricted set of reliable answer sources can bring a substantial improvement in medical QA.

Asunto(s)

Aprendizaje Profundo , Almacenamiento y Recuperación de la Información/métodos , Informática Médica , Modelos Logísticos , Unified Medical Language System

8.

Semantic annotation of consumer health questions.

Kilicoglu, Halil; Ben Abacha, Asma; Mrabet, Yassine; Shooshan, Sonya E; Rodriguez, Laritza; Masterton, Kate; Demner-Fushman, Dina.

BMC Bioinformatics ; 19(1): 34, 2018 02 06.

Artículo en Inglés | MEDLINE | ID: mdl-29409442

RESUMEN

BACKGROUND: Consumers increasingly use online resources for their health information needs. While current search engines can address these needs to some extent, they generally do not take into account that most health information needs are complex and can only fully be expressed in natural language. Consumer health question answering (QA) systems aim to fill this gap. A major challenge in developing consumer health QA systems is extracting relevant semantic content from the natural language questions (question understanding). To develop effective question understanding tools, question corpora semantically annotated for relevant question elements are needed. In this paper, we present a two-part consumer health question corpus annotated with several semantic categories: named entities, question triggers/types, question frames, and question topic. The first part (CHQA-email) consists of relatively long email requests received by the U.S. National Library of Medicine (NLM) customer service, while the second part (CHQA-web) consists of shorter questions posed to MedlinePlus search engine as queries. Each question has been annotated by two annotators. The annotation methodology is largely the same between the two parts of the corpus; however, we also explain and justify the differences between them. Additionally, we provide information about corpus characteristics, inter-annotator agreement, and our attempts to measure annotation confidence in the absence of adjudication of annotations. RESULTS: The resulting corpus consists of 2614 questions (CHQA-email: 1740, CHQA-web: 874). Problems are the most frequent named entities, while treatment and general information questions are the most common question types. Inter-annotator agreement was generally modest: question types and topics yielded highest agreement, while the agreement for more complex frame annotations was lower. Agreement in CHQA-web was consistently higher than that in CHQA-email. Pairwise inter-annotator agreement proved most useful in estimating annotation confidence. CONCLUSIONS: To our knowledge, our corpus is the first focusing on annotation of uncurated consumer health questions. It is currently used to develop machine learning-based methods for question understanding. We make the corpus publicly available to stimulate further research on consumer health QA.

Asunto(s)

Estado de Salud , Encuestas y Cuestionarios , Correo Electrónico , Humanos , Semántica , Navegador Web

9.

The role of fine-grained annotations in supervised recognition of risk factors for heart disease from EHRs.

Roberts, Kirk; Shooshan, Sonya E; Rodriguez, Laritza; Abhyankar, Swapna; Kilicoglu, Halil; Demner-Fushman, Dina.

J Biomed Inform ; 58 Suppl: S111-S119, 2015 Dec.

Artículo en Inglés | MEDLINE | ID: mdl-26122527

RESUMEN

This paper describes a supervised machine learning approach for identifying heart disease risk factors in clinical text, and assessing the impact of annotation granularity and quality on the system's ability to recognize these risk factors. We utilize a series of support vector machine models in conjunction with manually built lexicons to classify triggers specific to each risk factor. The features used for classification were quite simple, utilizing only lexical information and ignoring higher-level linguistic information such as syntax and semantics. Instead, we incorporated high-quality data to train the models by annotating additional information on top of a standard corpus. Despite the relative simplicity of the system, it achieves the highest scores (micro- and macro-F1, and micro- and macro-recall) out of the 20 participants in the 2014 i2b2/UTHealth Shared Task. This system obtains a micro- (macro-) precision of 0.8951 (0.8965), recall of 0.9625 (0.9611), and F1-measure of 0.9276 (0.9277). Additionally, we perform a series of experiments to assess the value of the annotated data we created. These experiments show how manually-labeled negative annotations can improve information extraction performance, demonstrating the importance of high-quality, fine-grained natural language annotations.

Asunto(s)

Enfermedad de la Arteria Coronaria/epidemiología , Minería de Datos/métodos , Complicaciones de la Diabetes/epidemiología , Registros Electrónicos de Salud/organización & administración , Procesamiento de Lenguaje Natural , Aprendizaje Automático Supervisado , Anciano , Estudios de Cohortes , Comorbilidad , Seguridad Computacional , Confidencialidad , Enfermedad de la Arteria Coronaria/diagnóstico , Complicaciones de la Diabetes/diagnóstico , Femenino , Humanos , Incidencia , Estudios Longitudinales , Masculino , Maryland/epidemiología , Persona de Mediana Edad , Narración , Reconocimiento de Normas Patrones Automatizadas/métodos , Medición de Riesgo/métodos , Vocabulario Controlado

10.

Peer review of GPT-4 technical report and systems card.

Gallifant, Jack; Fiske, Amelia; Levites Strekalova, Yulia A; Osorio-Valencia, Juan S; Parke, Rachael; Mwavu, Rogers; Martinez, Nicole; Gichoya, Judy Wawira; Ghassemi, Marzyeh; Demner-Fushman, Dina; McCoy, Liam G; Celi, Leo Anthony; Pierce, Robin.

PLOS Digit Health ; 3(1): e0000417, 2024 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-38236824

RESUMEN

The study provides a comprehensive review of OpenAI's Generative Pre-trained Transformer 4 (GPT-4) technical report, with an emphasis on applications in high-risk settings like healthcare. A diverse team, including experts in artificial intelligence (AI), natural language processing, public health, law, policy, social science, healthcare research, and bioethics, analyzed the report against established peer review guidelines. The GPT-4 report shows a significant commitment to transparent AI research, particularly in creating a systems card for risk assessment and mitigation. However, it reveals limitations such as restricted access to training data, inadequate confidence and uncertainty estimations, and concerns over privacy and intellectual property rights. Key strengths identified include the considerable time and economic investment in transparent AI research and the creation of a comprehensive systems card. On the other hand, the lack of clarity in training processes and data raises concerns about encoded biases and interests in GPT-4. The report also lacks confidence and uncertainty estimations, crucial in high-risk areas like healthcare, and fails to address potential privacy and intellectual property issues. Furthermore, this study emphasizes the need for diverse, global involvement in developing and evaluating large language models (LLMs) to ensure broad societal benefits and mitigate risks. The paper presents recommendations such as improving data transparency, developing accountability frameworks, establishing confidence standards for LLM outputs in high-risk settings, and enhancing industry research review processes. It concludes that while GPT-4's report is a step towards open discussions on LLMs, more extensive interdisciplinary reviews are essential for addressing bias, harm, and risk concerns, especially in high-risk domains. The review aims to expand the understanding of LLMs in general and highlights the need for new reflection forms on how LLMs are reviewed, the data required for effective evaluation, and addressing critical issues like bias and risk.

11.

Comparative effectiveness research designs: an analysis of terms and coverage in Medical Subject Headings (MeSH) and Emtree.

Bekhuis, Tanja; Demner-Fushman, Dina; Crowley, Rebecca S.

J Med Libr Assoc ; 101(2): 92-100, 2013 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-23646024

RESUMEN

OBJECTIVES: We analyzed the extent to which comparative effectiveness research (CER) organizations share terms for designs, analyzed coverage of CER designs in Medical Subject Headings (MeSH) and Emtree, and explored whether scientists use CER design terms. METHODS: We developed local terminologies (LTs) and a CER design terminology by extracting terms in documents from five organizations. We defined coverage as the distribution over match type in MeSH and Emtree. We created a crosswalk by recording terms to which design terms mapped in both controlled vocabularies. We analyzed the hits for queries restricted to titles and abstracts to explore scientists' language. RESULTS: Pairwise LT overlap ranged from 22.64% (12/53) to 75.61% (31/41). The CER design terminology (nâ=â78 terms) consisted of terms for primary study designs and a few terms useful for evaluating evidence, such as opinion paper and systematic review. Patterns of coverage were similar in MeSH and Emtree (gammaâ=â0.581, Pâ=â0.002). CONCLUSIONS: Stakeholder terminologies vary, and terms are inconsistently covered in MeSH and Emtree. The CER design terminology and crosswalk may be useful for expert searchers. For partially mapped terms, queries could consist of free text for modifiers such as nonrandomized or interrupted added to broad or related controlled terms.

Asunto(s)

Investigación sobre la Eficacia Comparativa/métodos , Bases de Datos Bibliográficas , Almacenamiento y Recuperación de la Información/métodos , MEDLINE/organización & administración , Medical Subject Headings , Terminología como Asunto , Humanos , National Library of Medicine (U.S.) , Estados Unidos

12.

A dataset for plain language adaptation of biomedical abstracts.

Attal, Kush; Ondov, Brian; Demner-Fushman, Dina.

Sci Data ; 10(1): 8, 2023 01 04.

Artículo en Inglés | MEDLINE | ID: mdl-36599892

RESUMEN

Though exponentially growing health-related literature has been made available to a broad audience online, the language of scientific articles can be difficult for the general public to understand. Therefore, adapting this expert-level language into plain language versions is necessary for the public to reliably comprehend the vast health-related literature. Deep Learning algorithms for automatic adaptation are a possible solution; however, gold standard datasets are needed for proper evaluation. Proposed datasets thus far consist of either pairs of comparable professional- and general public-facing documents or pairs of semantically similar sentences mined from such documents. This leads to a trade-off between imperfect alignments and small test sets. To address this issue, we created the Plain Language Adaptation of Biomedical Abstracts dataset. This dataset is the first manually adapted dataset that is both document- and sentence-aligned. The dataset contains 750 adapted abstracts, totaling 7643 sentence pairs. Along with describing the dataset, we benchmark automatic adaptation on the dataset with state-of-the-art Deep Learning approaches, setting baselines for future research.

13.

A dataset for medical instructional video classification and question answering.

Gupta, Deepak; Attal, Kush; Demner-Fushman, Dina.

Sci Data ; 10(1): 158, 2023 03 22.

Artículo en Inglés | MEDLINE | ID: mdl-36949119

RESUMEN

This paper introduces a new challenge and datasets to foster research toward designing systems that can understand medical videos and provide visual answers to natural language questions. We believe medical videos may provide the best possible answers to many first aid, medical emergency, and medical education questions. Toward this, we created the MedVidCL and MedVidQA datasets and introduce the tasks of Medical Video Classification (MVC) and Medical Visual Answer Localization (MVAL), two tasks that focus on cross-modal (medical language and medical video) understanding. The proposed tasks and datasets have the potential to support the development of sophisticated downstream applications that can benefit the public and medical practitioners. Our datasets consist of 6,117 fine-grained annotated videos for the MVC task and 3,010 questions and answers timestamps from 899 videos for the MVAL task. These datasets have been verified and corrected by medical informatics experts. We have also benchmarked each task with the created MedVidCL and MedVidQA datasets and propose the multimodal learning methods that set competitive baselines for future research.

Asunto(s)

Informática Médica , Lenguaje , Procesamiento de Lenguaje Natural , Semántica

14.

The NLM indexer assignment dataset: a new large-scale dataset for reviewer assignment research.

Rae, Alastair R; Mork, James G; Demner-Fushman, Dina.

J Assoc Inf Sci Technol ; 74(2): 205-218, 2023 Feb.

Artículo en Inglés | MEDLINE | ID: mdl-36819642

RESUMEN

MEDLINE is the National Library of Medicine's (NLM) journal citation database. It contains over 28 million references to biomedical and life science journal articles, and a key feature of the database is that all articles are indexed with NLM Medical Subject Headings (MeSH). The library employs a team of MeSH indexers, and in recent years they have been asked to index close to 1 million articles per year in order to keep MEDLINE up to date. An important part of the MEDLINE indexing process is the assignment of articles to indexers. High quality and timely indexing is only possible when articles are assigned to indexers with suitable expertise. This paper introduces the NLM indexer assignment dataset: a large dataset of 4.2 million indexer article assignments for articles indexed between 2011 and 2019. The dataset is shown to be a valuable testbed for expert matching and assignment algorithms, and indexer article assignment is also found to be useful domain-adaptive pre-training for the closely related task of reviewer assignment.

15.

CHQ- SocioEmo: Identifying Social and Emotional Support Needs in Consumer-Health Questions.

Alasmari, Ashwag; Kudryashov, Luke; Yadav, Shweta; Lee, Heera; Demner-Fushman, Dina.

Sci Data ; 10(1): 329, 2023 05 27.

Artículo en Inglés | MEDLINE | ID: mdl-37244917

RESUMEN

General public, often called consumers, are increasingly seeking health information online. To be satisfactory, answers to health-related questions often have to go beyond informational needs. Automated approaches to consumer health question answering should be able to recognize the need for social and emotional support. Recently, large scale datasets have addressed the issue of medical question answering and highlighted the challenges associated with question classification from the standpoint of informational needs. However, there is a lack of annotated datasets for the non-informational needs. We introduce a new dataset for non-informational support needs, called CHQ-SocioEmo. The Dataset of Consumer Health Questions was collected from a community question answering forum and annotated with basic emotions and social support needs. This is the first publicly available resource for understanding non-informational support needs in consumer health-related questions online. We benchmark the corpus against multiple state-of-the-art classification models to demonstrate the dataset's effectiveness.

16.

Caregivers Attitude Detection From Clinical Notes.

Manzo, Gaetano; Celi, Leo Anthony; Shabazz, Yasmeen; Mulcahey, Rory; Flores, Lorenzo Jaime; Demner-Fushman, Dina.

AMIA Annu Symp Proc ; 2023: 1125-1134, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-38222330

RESUMEN

Caregivers' attitudes impact healthcare quality and disparities. Clinical notes contain highly specialized and ambiguous language that requires extensive domain knowledge to understand, and using negative language does not necessarily imply a negative attitude. This study discusses the challenge of detecting caregivers' attitudes from their clinical notes. To address these challenges, we annotate MIMIC clinical notes and train state-of-the-art language models from the Hugging Face platform. The study focuses on the Neonatal Intensive Care Unit and evaluates models in zero-shot, few-shot, and fully-trained scenarios. Among the chosen models, RoBERTa identifies caregivers' attitudes from clinical notes with an F1-score of 0.75. This approach not only enhances patient satisfaction, but opens up exciting possibilities for detecting and preventing care provider syndromes, such as fatigue, stress, and burnout. The paper concludes by discussing limitations and potential future work.

Asunto(s)

Agotamiento Profesional , Cuidadores , Recién Nacido , Humanos , Actitud , Calidad de la Atención de Salud

17.

Effects of Porting Essie Tokenization and Normalization to Solr.

Gayen, Soumya; Gupta, Deepak; F Loane, Russell; Ide, Nicholas C; Demner-Fushman, Dina.

AMIA Annu Symp Proc ; 2023: 369-378, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-38222430

RESUMEN

Search for information is now an integral part of healthcare. Searches are enabled by search engines whose objective is to efficiently retrieve the relevant information for the user query. When it comes to retrieving biomedical text and literature, Essie search engine developed at the National Library of Medicine (NLM) performs exceptionally well. However, Essie is a software system developed for NLM that has ceased development and support. On the other hand, Solr is a popular opensource enterprise search engine used by many of the world's largest internet sites, offering continuous developments and improvements along with the state-of-the-art features. In this paper, we present our approach to porting the key features of Essie and developing custom components to be used in Solr. We demonstrate the effectiveness of the added components on three benchmark biomedical datasets. The custom components may aid the community in improving search methods for biomedical text retrieval.

Asunto(s)

Almacenamiento y Recuperación de la Información , Programas Informáticos , Estados Unidos , Humanos , Motor de Búsqueda , National Library of Medicine (U.S.) , Benchmarking , Internet

18.

Auditing Learned Associations in Deep Learning Approaches to Extract Race and Ethnicity from Clinical Text.

Bear Don't Walk Iv, Oliver J; Pichon, Adrienne; Nieva, Harry Reyes; Sun, Tony; Altosaar, Jaan; Natarajan, Karthik; Perotte, Adler; Tarczy-Hornoch, Peter; Demner-Fushman, Dina; Elhadad, Noémie.

AMIA Annu Symp Proc ; 2023: 289-298, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-38222422

RESUMEN

Complete and accurate race and ethnicity (RE) patient information is important for many areas of biomedical informatics research, such as defining and characterizing cohorts, performing quality assessments, and identifying health inequities. Patient-level RE data is often inaccurate or missing in structured sources, but can be supplemented through clinical notes and natural language processing (NLP). While NLP has made many improvements in recent years with large language models, bias remains an often-unaddressed concern, with research showing that harmful and negative language is more often used for certain racial/ethnic groups than others. We present an approach to audit the learned associations of models trained to identify RE information in clinical text by measuring the concordance between model-derived salient features and manually identified RE-related spans of text. We show that while models perform well on the surface, there exist concerning learned associations and potential for future harms from RE-identification models if left unaddressed.

Asunto(s)

Aprendizaje Profundo , Etnicidad , Humanos , Lenguaje , Procesamiento de Lenguaje Natural

19.

Lower short- and long-term mortality associated with overweight and obesity in a large cohort study of adult intensive care unit patients.

Abhyankar, Swapna; Leishear, Kira; Callaghan, Fiona M; Demner-Fushman, Dina; McDonald, Clement J.

Crit Care ; 16(6): R235, 2012 Dec 18.

Artículo en Inglés | MEDLINE | ID: mdl-23249446

RESUMEN

INTRODUCTION: Two thirds of United States adults are overweight or obese, which puts them at higher risk of developing chronic diseases and of death compared with normal-weight individuals. However, recent studies have found that overweight and obesity by themselves may be protective in some contexts, such as hospitalization in an intensive care unit (ICU). Our objective was to determine the relation between body mass index (BMI) and mortality at 30 days and 1 year after ICU admission. METHODS: We performed a cohort analysis of 16,812 adult patients from MIMIC-II, a large database of ICU patients at a tertiary care hospital in Boston, Massachusetts. The data were originally collected during the course of clinical care, and we subsequently extracted our dataset independent of the study outcome. RESULTS: Compared with normal-weight patients, obese patients had 26% and 43% lower mortality risk at 30 days and 1 year after ICU admission, respectively (odds ratio (OR), 0.74; 95% confidence interval (CI), 0.64 to 0.86) and 0.57 (95% CI, 0.49 to 0.67)); overweight patients had nearly 20% and 30% lower mortality risk (OR, 0.81; 95% CI, 0.70 to 0.93) and OR, 0.68 (95% CI, 0.59 to 0.79)). Severely obese patients (BMI ≥ 40 kg/m2) did not have a significant survival advantage at 30 days (OR, 0.94; 95% CI, 0.74 to 1.20), but did have 30% lower mortality risk at 1 year (OR, 0.70 (95% CI, 0.54 to 0.90)). No significant difference in admission acuity or ICU and hospital length of stay was found across BMI categories. CONCLUSION: Our study supports the hypothesis that patients who are overweight or obese have improved survival both 30 days and 1 year after ICU admission.

Asunto(s)

Enfermedad Crítica/mortalidad , Unidades de Cuidados Intensivos/estadística & datos numéricos , Obesidad/mortalidad , Sobrepeso/mortalidad , Adulto , Anciano , Anciano de 80 o más Años , Índice de Masa Corporal , Boston/epidemiología , Estudios de Cohortes , Femenino , Humanos , Masculino , Persona de Mediana Edad , Obesidad/complicaciones , Sobrepeso/complicaciones , Modelos de Riesgos Proporcionales , Factores de Riesgo , Índice de Severidad de la Enfermedad , Análisis de Supervivencia

20.

Standardizing clinical laboratory data for secondary use.

Abhyankar, Swapna; Demner-Fushman, Dina; McDonald, Clement J.

J Biomed Inform ; 45(4): 642-50, 2012 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-22561944

RESUMEN

Clinical databases provide a rich source of data for answering clinical research questions. However, the variables recorded in clinical data systems are often identified by local, idiosyncratic, and sometimes redundant and/or ambiguous names (or codes) rather than unique, well-organized codes from standard code systems. This reality discourages research use of such databases, because researchers must invest considerable time in cleaning up the data before they can ask their first research question. Researchers at MIT developed MIMIC-II, a nearly complete collection of clinical data about intensive care patients. Because its data are drawn from existing clinical systems, it has many of the problems described above. In collaboration with the MIT researchers, we have begun a process of cleaning up the data and mapping the variable names and codes to LOINC codes. Our first step, which we describe here, was to map all of the laboratory test observations to LOINC codes. We were able to map 87% of the unique laboratory tests that cover 94% of the total number of laboratory tests results. Of the 13% of tests that we could not map, nearly 60% were due to test names whose real meaning could not be discerned and 29% represented tests that were not yet included in the LOINC table. These results suggest that LOINC codes cover most of laboratory tests used in critical care. We have delivered this work to the MIMIC-II researchers, who have included it in their standard MIMIC-II database release so that researchers who use this database in the future will not have to do this work.

Asunto(s)

Investigación Biomédica/normas , Sistemas de Información en Laboratorio Clínico , Bases de Datos Factuales/normas , Registros Electrónicos de Salud , Informática Médica/normas , Vocabulario Controlado , Codificación Clínica , Humanos , Interfaz Usuario-Computador

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA