Results 1 - 20 of 104
1.
Neurosci Inform ; 4(1)2024 Mar.
Article in English | MEDLINE | ID: mdl-38433986

ABSTRACT

Introduction: While linguistic retrogenesis has been extensively investigated in the neuroscientific and behavioral literature, there has been little work on retrogenesis using computerized approaches to language analysis. Methods: We bridge this gap by introducing a method based on comparing output of a pre-trained neural language model (NLM) with an artificially degraded version of itself to examine the transcripts of speech produced by seniors with and without dementia and healthy children during spontaneous language tasks. We compare a range of linguistic characteristics including language model perplexity, syntactic complexity, lexical frequency and part-of-speech use across these groups. Results: Our results indicate that healthy seniors and children older than 8 years share similar linguistic characteristics, as do dementia patients and children who are younger than 8 years. Discussion: Our study aligns with the growing evidence that language deterioration in dementia mirrors language acquisition in development using computational linguistic methods based on NLMs. This insight underscores the importance of further research to refine its application in guiding developmentally appropriate patient care, particularly in early stages.
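For reference, language model perplexity, one of the characteristics compared above, is the exponentiated average negative log-likelihood that a model assigns to a token sequence. The following is a minimal sketch, assuming per-token natural-log probabilities have already been obtained from some language model; the function name and sample values are illustrative and are not taken from the study.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities (hypothetical helper)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token, then exponentiate.
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


# Illustrative input: a model that assigns probability 0.25 to each of 4 tokens.
uniform_logprobs = [math.log(0.25)] * 4
print(perplexity(uniform_logprobs))
```

Higher perplexity means the model found the transcript less predictable; comparing an intact model with a degraded one amounts to comparing two such values for the same transcript.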

2.
J Biomed Inform ; 150: 104598, 2024 02.
Article in English | MEDLINE | ID: mdl-38253228

ABSTRACT

OBJECTIVES: We aimed to investigate how errors from automatic speech recognition (ASR) systems affect dementia classification accuracy, specifically in the "Cookie Theft" picture description task. We aimed to assess whether imperfect ASR-generated transcripts could provide valuable information for distinguishing between language samples from cognitively healthy individuals and those with Alzheimer's disease (AD). METHODS: We conducted experiments using various ASR models, refining their transcripts with post-editing techniques. Both these imperfect ASR transcripts and manually transcribed ones were used as inputs for the downstream dementia classification. We conducted comprehensive error analysis to compare model performance and assess ASR-generated transcript effectiveness in dementia classification. RESULTS: Imperfect ASR-generated transcripts surprisingly outperformed manual transcription for distinguishing between individuals with AD and those without in the "Cookie Theft" task. These ASR-based models surpassed the previous state-of-the-art approach, indicating that ASR errors may contain valuable cues related to dementia. The synergy between ASR and classification models improved overall accuracy in dementia classification. CONCLUSION: Imperfect ASR transcripts effectively capture linguistic anomalies linked to dementia, improving accuracy in classification tasks. This synergy between ASR and classification models underscores ASR's potential as a valuable tool in assessing cognitive impairment and related clinical applications.
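ASR error is commonly quantified as word error rate (WER): the word-level edit distance between a manual reference transcript and the ASR hypothesis, divided by the reference length. The abstract does not state which error metric the authors used, so the sketch below is only an illustration of the standard measure; the function name and example sentences are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Illustrative pair: one substitution and one deletion over five reference words.
print(word_error_rate("the boy takes the cookie", "the boy take cookie"))
```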


Subject(s)
Alzheimer Disease , Cognitive Dysfunction , Speech Perception , Humans , Speech , Language , Alzheimer Disease/diagnosis
3.
Pac Symp Biocomput ; 29: 24-38, 2024.
Article in English | MEDLINE | ID: mdl-38160267

ABSTRACT

We present a fully automated AI-based system for intensive monitoring of cognitive symptoms of neurotoxicity that frequently appear as a result of immunotherapy of hematologic malignancies. Early manifestations of these symptoms are evident in the patient's speech in the form of mild aphasia and confusion and can be detected and effectively treated prior to onset of more serious and potentially life-threatening impairment. We have developed the Automated Neural Nursing Assistant (ANNA) system designed to conduct a brief cognitive assessment several times per day over the telephone for 5-14 days following infusion of the immunotherapy medication. ANNA uses a conversational agent based on a large language model to elicit spontaneous speech in a semi-structured dialogue, followed by a series of brief language-based neurocognitive tests. In this paper we share ANNA's design and implementation, results of a pilot functional evaluation study, and discuss technical and logistic challenges facing the introduction of this type of technology in clinical practice. A large-scale clinical evaluation of ANNA will be conducted in an observational study of patients undergoing immunotherapy at the University of Minnesota Masonic Cancer Center starting in the Fall 2023.


Subject(s)
Computational Biology , Language , Humans
4.
AMIA Jt Summits Transl Sci Proc ; 2023: 360-369, 2023.
Article in English | MEDLINE | ID: mdl-37350929

ABSTRACT

The evidence is growing that machine and deep learning methods can learn the subtle differences between the language produced by people with various forms of cognitive impairment such as dementia and cognitively healthy individuals. Valuable public data repositories such as TalkBank have made it possible for researchers in the computational community to join forces and learn from each other to make significant advances in this area. However, due to variability in approaches and data selection strategies used by various researchers, results obtained by different groups have been difficult to compare directly. In this paper, we present TRESTLE (Toolkit for Reproducible Execution of Speech Text and Language Experiments), an open source platform that focuses on two datasets from the TalkBank repository with dementia detection as an illustrative domain. Successfully deployed in the hackallenge (Hackathon/Challenge) of the International Workshop on Health Intelligence at AAAI 2022, TRESTLE provides a precise digital blueprint of the data pre-processing and selection strategies that can be reused via TRESTLE by other researchers seeking comparable results with their peers and current state-of-the-art (SOTA) approaches.

5.
Pac Symp Biocomput ; 28: 43-54, 2023.
Article in English | MEDLINE | ID: mdl-36540963

ABSTRACT

Consumer-grade heart rate (HR) sensors including chest straps, wrist-worn watches and rings have become very popular in recent years for tracking individual physiological state, training for sports and even measuring stress levels and emotional changes. While the majority of these consumer sensors are not medical devices, they can still offer insights for consumers and researchers if used correctly taking into account their limitations. Multiple previous studies have been done using a large variety of consumer sensors including Polar® devices, Apple® watches, and Fitbit® wrist bands. The vast majority of prior studies have been done in laboratory settings where collecting data is relatively straightforward. However, using consumer sensors in naturalistic settings that present significant challenges, including noise artefacts and missing data, has not been as extensively investigated. Additionally, the majority of prior studies focused on wrist-worn optical HR sensors. Arm-worn sensors have not been extensively investigated either. In the present study, we validate HR measurements obtained with an arm-worn optical sensor (Polar OH1) against those obtained with a chest-strap electrical sensor (Polar H10) from 16 participants over a 2-week study period in naturalistic settings. We also investigated the impact of physical activity measured with 3-D accelerometers embedded in the H10 chest strap and OH1 armband sensors on the agreement between the two sensors. Overall, we find that the arm-worn optical Polar OH1 sensor provides a good estimate of HR (Pearson r = 0.90, p <0.01). Filtering the signal that corresponds to physical activity further improves the HR estimates but only slightly (Pearson r = 0.91, p <0.01). Based on these preliminary findings, we conclude that the arm-worn Polar OH1 sensor provides usable HR measurements in daily living conditions, with some caveats discussed in the paper.
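The agreement statistic reported above is the Pearson correlation coefficient between paired HR readings from the two sensors. A minimal sketch of that computation follows; the function name and the paired readings are invented for illustration and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired readings (beats per minute), time-aligned beforehand.
chest_strap = [62, 75, 90, 110, 84]
armband = [64, 74, 93, 105, 85]
print(round(pearson_r(chest_strap, armband), 3))
```

Note that a high correlation shows the two signals move together; it does not by itself rule out a constant offset between the sensors.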


Subject(s)
Computational Biology , Fitness Trackers , Humans , Heart Rate/physiology , Feasibility Studies , Exercise/physiology
6.
AMIA Annu Symp Proc ; 2023: 923-932, 2023.
Article in English | MEDLINE | ID: mdl-38222433

ABSTRACT

Natural Language Processing (NLP) methods have been broadly applied to clinical tasks. Machine learning and deep learning approaches have been used to improve the performance of clinical NLP. However, these approaches require sufficiently large datasets for training, and trained models have been shown to transfer poorly across sites. These issues have led to the promotion of data collection and integration across different institutions for accurate and portable models. However, this can introduce a form of bias called confounding by provenance. When source-specific data distributions differ at deployment, this may harm model performance. To address this issue, we evaluate the utility of backdoor adjustment for text classification in a multi-site dataset of clinical notes annotated for mentions of substance abuse. Using an evaluation framework devised to measure robustness to distributional shifts, we assess the utility of backdoor adjustment. Our results indicate that backdoor adjustment can effectively mitigate confounding shift.
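In its textbook form, backdoor adjustment replaces the confounded prediction P(y | x) with a sum over the confounder z (here, the source site): P(y | do(x)) = Σ_z P(y | x, z) · P(z). The sketch below illustrates only that formula; it is not the authors' implementation, and the function name, site labels, and probabilities are assumptions made for the example.

```python
def backdoor_adjust(p_y_given_x_z, p_z):
    """Marginalise a site-conditional prediction over the site prior.

    p_y_given_x_z: {site: P(y=1 | x, site)} for one document x (hypothetical input)
    p_z:           {site: P(site)}, the prior over sites, which must sum to 1
    """
    if abs(sum(p_z.values()) - 1.0) > 1e-9:
        raise ValueError("site prior must sum to 1")
    # Weight each site-specific prediction by the site prior, not by
    # P(site | x), so the text cannot act as a proxy for its provenance.
    return sum(p_y_given_x_z[site] * p_z[site] for site in p_z)


# Illustrative numbers for a single clinical note.
site_predictions = {"site_a": 0.9, "site_b": 0.5}
site_prior = {"site_a": 0.5, "site_b": 0.5}
print(backdoor_adjust(site_predictions, site_prior))
```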


Subject(s)
Electronic Health Records , Substance-Related Disorders , Humans , Data Collection , Machine Learning , Natural Language Processing , Multicenter Studies as Topic
7.
BMC Med Inform Decis Mak ; 22(Suppl 1): 153, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35799177

ABSTRACT

BACKGROUND: Dietary supplements (DS) have been widely used by consumers, but the information around the efficacy and safety of DS is disparate or incomplete, thus creating barriers for consumers to find information effectively. Conversational agent (CA) systems have been applied to the healthcare domain, but no such system exists to answer consumers' questions about DS use, despite the widespread use of DS. In this study, we develop the first CA system for DS use. METHODS: Our CA system for DS use, developed on the MindMeld framework, consists of three components: question understanding, DS knowledge base, and answer generation. We collected and annotated 1509 questions to develop a natural language understanding module (e.g., question type classifier, named entity recognizer), which was then integrated into the MindMeld framework. The CA then queries the DS knowledge base (i.e., iDISK) and generates answers using rule-based slot filling techniques. We evaluated the algorithms of each component and the CA system as a whole. RESULTS: CNN is the best question classifier with an F1 score of 0.81, and CRF is the best named entity recognizer with an F1 score of 0.87. The system achieves an overall accuracy of 81% and an average score of 1.82, with a succ@3+ score of 76.2% and a succ@2+ score of approximately 66%. CONCLUSION: This study develops the first CA system for DS use using the MindMeld framework and iDISK domain knowledge base.


Subject(s)
Algorithms , Natural Language Processing , Dietary Supplements , Humans , Language
8.
Arch Phys Med Rehabil ; 103(10): 2001-2008, 2022 10.
Article in English | MEDLINE | ID: mdl-35569640

ABSTRACT

OBJECTIVE: To examine the frequency of postacute sequelae of SARS-CoV-2 (PASC) and the factors associated with rehabilitation utilization in a large adult population with PASC. DESIGN: Retrospective study. SETTING: Midwest hospital health system. PARTICIPANTS: 19,792 patients with COVID-19 from March 10, 2020, to January 17, 2021. INTERVENTION: Not applicable. MAIN OUTCOME MEASURES: Descriptive analyses were conducted across the entire cohort along with an adult subgroup analysis. A logistic regression was performed to assess factors associated with PASC development and rehabilitation utilization. RESULTS: In an analysis of 19,792 patients, the frequency of PASC was 42.8% in the adult population. Patients with PASC compared with those without had a higher utilization of rehabilitation services (8.6% vs 3.8%, P<.001). Risk factors for rehabilitation utilization in patients with PASC included younger age (odds ratio [OR], 0.99; 95% confidence interval [CI], 0.98-1.00; P=.01). In addition to several comorbidities and demographic factors, risk factors for rehabilitation utilization solely in the inpatient population included male sex (OR, 1.24; 95% CI, 1.02-1.50; P=.03), with patients on angiotensin-converting-enzyme inhibitors or angiotensin-receptor blockers 3 months prior to COVID-19 infections having a decreased risk of needing rehabilitation (OR, 0.80; 95% CI, 0.64-0.99; P=.04). CONCLUSIONS: Patients with PASC had higher rehabilitation utilization. We identified several clinical and demographic factors associated with the development of PASC and rehabilitation utilization.


Subject(s)
COVID-19 , Adult , Angiotensin-Converting Enzyme Inhibitors , Angiotensins , COVID-19/epidemiology , Humans , Male , Retrospective Studies , SARS-CoV-2
9.
J Biomed Inform ; 126: 103998, 2022 02.
Article in English | MEDLINE | ID: mdl-35063668

ABSTRACT

Formal thought disorder (ThD) is a clinical sign of schizophrenia amongst other serious mental health conditions. ThD can be recognized by observing incoherent speech - speech in which it is difficult to perceive connections between successive utterances and which lacks a clear global theme. Automated assessment of the coherence of speech in patients with schizophrenia has been an active area of research for over a decade, in an effort to develop an objective and reliable instrument through which to quantify ThD. However, this work has largely been conducted in controlled settings using structured interviews and depended upon manual transcription services to render audio recordings amenable to computational analysis. In this paper, we present an evaluation of such automated methods in the context of a fully automated system using Automated Speech Recognition (ASR) in place of a manual transcription service, with "audio diaries" collected in naturalistic settings from participants experiencing Auditory Verbal Hallucinations (AVH). We show that performance lost due to ASR errors can often be restored through the application of Time-Series Augmented Representations for Detection of Incoherent Speech (TARDIS), a novel approach that involves treating the sequence of coherence scores from a transcript as a time-series, providing features for machine learning. With ASR, TARDIS improves average AUC across coherence metrics for detection of severe ThD by 0.09; average correlation with human-labeled derailment scores by 0.10; and average correlation between coherence estimates from manual and ASR-derived transcripts by 0.29. In addition, TARDIS improves the agreement between coherence estimates from manual transcripts and human judgment and correlation with self-reported estimates of AVH symptom severity. As such, TARDIS eliminates a fundamental barrier to the deployment of automated methods to detect linguistic indicators of ThD to monitor and improve clinical care in serious mental illness.
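The core idea described above, treating a transcript's sequence of coherence scores as a time series and deriving features from it, can be illustrated with a few summary statistics. The abstract does not list the features TARDIS actually uses, so the feature set, function name, and scores below are assumptions chosen only to show the shape of the approach.

```python
import statistics


def series_features(coherence_scores):
    """Summarise a sequence of per-utterance coherence scores (illustrative features)."""
    if not coherence_scores:
        raise ValueError("need at least one coherence score")
    # Drop in coherence between each pair of consecutive utterances.
    drops = [a - b for a, b in zip(coherence_scores, coherence_scores[1:])]
    return {
        "min": min(coherence_scores),
        "mean": statistics.fmean(coherence_scores),
        "stdev": statistics.pstdev(coherence_scores),
        "largest_drop": max(drops, default=0.0),
    }


# Hypothetical scores, e.g. similarity between successive utterances.
print(series_features([0.8, 0.6, 0.1, 0.5]))
```

A feature vector like this could then be passed to any standard classifier; a single transcript-level average would hide the sharp mid-transcript drop that the "min" and "largest_drop" entries retain.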


Subject(s)
Schizophrenia , Speech , Hallucinations , Humans , Linguistics , Machine Learning
10.
JAMIA Open ; 4(3): ooab070, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34423261

ABSTRACT

OBJECTIVE: With COVID-19, there was a need for a rapidly scalable annotation system that facilitated real-time integration with clinical decision support systems (CDS). Current annotation systems suffer from high resource utilization and poor scalability, limiting real-world integration with CDS. A potential solution to mitigate these issues is to use the rule-based gazetteer developed at our institution. MATERIALS AND METHODS: Performance, resource utilization, and runtime of the rule-based gazetteer were compared with five annotation systems: BioMedICUS, cTAKES, MetaMap, CLAMP, and MedTagger. RESULTS: This rule-based gazetteer was the fastest, had a low resource footprint, and showed similar performance for weighted microaverage and macroaverage measures of precision, recall, and f1-score compared to other annotation systems. DISCUSSION: Opportunities to increase its performance include fine-tuning lexical rules for symptom identification. Additionally, it could run on multiple compute nodes for faster runtime. CONCLUSION: This rule-based gazetteer overcame key technical limitations facilitating real-time symptomatology identification for COVID-19 and integration of unstructured data elements into our CDS. It is ideal for large-scale deployment across a wide variety of healthcare settings for surveillance of acute COVID-19 symptoms for integration into prognostic modeling. Such a system is currently being leveraged for monitoring of postacute sequelae of COVID-19 (PASC) progression in COVID-19 survivors. This study conducted the first in-depth analysis and developed a rule-based gazetteer for COVID-19 symptom extraction with the following key features: low processor and memory utilization, faster runtime, and similar weighted microaverage and macroaverage measures for precision, recall, and f1-score compared to industry-standard annotation systems.
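A gazetteer in this sense is a lexicon of surface forms mapped to concepts and matched against text by rule. The sketch below shows the general technique with a tiny invented lexicon; it is not the institution's gazetteer, and the entries, concept labels, and function names are all hypothetical.

```python
import re

# Hypothetical lexicon: surface form -> normalised symptom concept.
SYMPTOM_LEXICON = {
    "fever": "fever",
    "febrile": "fever",
    "shortness of breath": "dyspnea",
    "sob": "dyspnea",
    "cough": "cough",
}


def build_matcher(lexicon):
    """Compile one case-insensitive pattern covering every lexicon entry."""
    # Longest phrases first so multi-word entries win over shorter ones.
    phrases = sorted(lexicon, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def extract_symptoms(text, lexicon=SYMPTOM_LEXICON):
    """Return (start, end, concept) for each lexicon match in the text."""
    matcher = build_matcher(lexicon)
    return [(m.start(), m.end(), lexicon[m.group(0).lower()])
            for m in matcher.finditer(text)]


print(extract_symptoms("Pt febrile with cough and SOB."))
```

This sketch does no negation or context handling ("denies cough" would still match), which a production system would need to address with additional rules.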

11.
J Am Med Inform Assoc ; 28(10): 2193-2201, 2021 09 18.
Article in English | MEDLINE | ID: mdl-34272955

ABSTRACT

OBJECTIVE: Developing clinical natural language processing systems often requires access to many clinical documents, which are not widely available to the public due to privacy and security concerns. To address this challenge, we propose to develop methods to generate synthetic clinical notes and evaluate their utility in real clinical natural language processing tasks. MATERIALS AND METHODS: We implemented 4 state-of-the-art text generation models, namely CharRNN, SegGAN, GPT-2, and CTRL, to generate clinical text for the History and Present Illness section. We then manually annotated clinical entities for 500 randomly selected History and Present Illness notes generated from the best-performing algorithm. To compare the utility of natural and synthetic corpora, we trained named entity recognition (NER) models from all 3 corpora and evaluated their performance on 2 independent natural corpora. RESULTS: Our evaluation shows GPT-2 achieved the best BLEU (bilingual evaluation understudy) score (with a BLEU-2 of 0.92). NER models trained on the synthetic corpus generated by GPT-2 showed slightly better performance on 2 independent corpora: strict F1 scores of 0.709 and 0.748, respectively, when compared with the NER models trained on the natural corpus (F1 scores of 0.706 and 0.737, respectively), indicating the good utility of synthetic corpora in clinical NER model development. In addition, we also demonstrated that an augmented method that combines both natural and synthetic corpora achieved better performance than one that uses the natural corpus only. CONCLUSIONS: Recent advances in text generation have made it possible to generate synthetic clinical notes that could be useful for training NER models for information extraction from natural clinical notes, thus lowering the privacy concern and increasing data availability. Further investigation is needed to apply this technology to practice.


Subject(s)
Information Storage and Retrieval , Natural Language Processing , Algorithms
12.
J Am Med Inform Assoc ; 27(9): 1437-1442, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32569358

ABSTRACT

Large observational data networks that leverage routine clinical practice data in electronic health records (EHRs) are critical resources for research on coronavirus disease 2019 (COVID-19). Data normalization is a key challenge for the secondary use of EHRs for COVID-19 research across institutions. In this study, we addressed the challenge of automating the normalization of COVID-19 diagnostic tests, which are critical data elements, but for which controlled terminology terms were published after clinical implementation. We developed a simple but effective rule-based tool called COVID-19 TestNorm to automatically normalize local COVID-19 testing names to standard LOINC (Logical Observation Identifiers Names and Codes) codes. COVID-19 TestNorm was developed and evaluated using 568 test names collected from 8 healthcare systems. Our results show that it could achieve an accuracy of 97.4% on an independent test set. COVID-19 TestNorm is available as an open-source package for developers and as an online Web application for end users (https://clamp.uth.edu/covid/loinc.php). We believe that it will be a useful tool to support secondary use of EHRs for research on COVID-19.


Subject(s)
Betacoronavirus , Clinical Laboratory Techniques/classification , Coronavirus Infections/diagnosis , Logical Observation Identifiers Names and Codes , Pneumonia, Viral/diagnosis , Terminology as Topic , COVID-19 , COVID-19 Testing , Coronavirus Infections/classification , Electronic Health Records , Humans , Pandemics , SARS-CoV-2
13.
PLoS One ; 15(3): e0229942, 2020.
Article in English | MEDLINE | ID: mdl-32210441

ABSTRACT

Psychosocial stress is a major risk factor for morbidity and mortality related to a wide range of health conditions and has a significant negative impact on public health. Quantifying exposure to stress in the naturalistic environment can help to better understand its health effects and identify strategies for timely intervention. The objective of the current project was to develop and test the infrastructure and methods necessary for using wearable technology to quantify individual response to stressful situations and to determine if popular and accessible fitness trackers such as Fitbit® equipped with an optical heart rate (HR) monitor could be used to detect physiological response to psychosocial stress in everyday life. The participants in this study were University of Minnesota students (n = 18) who owned a Fitbit® tracker and had at least one upcoming examination. Continuous HR and activity measurements were obtained during a 7-day observation period containing examinations self-reported by the participants. Participants responded to six ecological momentary assessment surveys per day (~2-hour intervals) to indicate occurrence of stressful events. We compared HR during stressful events (e.g., exams) to baseline HR during periods indicated as non-stressful using mixed effects modeling. Our results show that HR was elevated by 8.9 beats per minute during exams and by 3.2 beats per minute during non-exam stressors. These results are consistent with prior laboratory findings and indicate that consumer wearable fitness trackers could serve as a valuable source of information on exposure to psychosocial stressors encountered in the naturalistic environment.


Subject(s)
Exercise/physiology , Monitoring, Physiologic , Stress, Psychological/physiopathology , Wearable Electronic Devices , Accelerometry/trends , Adult , Female , Fitness Trackers/trends , Heart Rate/physiology , Humans , Male , Pilot Projects , Remote Sensing Technology , Telephone , Young Adult
14.
J Trauma Acute Care Surg ; 88(5): 607-614, 2020 05.
Article in English | MEDLINE | ID: mdl-31977990

ABSTRACT

BACKGROUND: Incomplete prehospital trauma care is a significant contributor to preventable deaths. Current databases lack easily constructible timelines of clinical events. Temporal associations and procedural indications are critical to characterize treatment appropriateness. Natural language processing (NLP) methods present a novel approach to bridge this gap. We sought to evaluate the efficacy of a novel and automated NLP pipeline to determine treatment appropriateness from a sample of prehospital EMS motor vehicle crash records. METHODS: A total of 142 records were used to extract airway procedures, intraosseous/intravenous access, packed red blood cell transfusion, crystalloid bolus, chest compression system, tranexamic acid bolus, and needle decompression. Reports were processed using four clinical NLP systems and augmented via a word2phrase method leveraging a large integrated health system clinical note repository to identify terms semantically similar with treatment indications. Indications were matched with treatments and categorized as indicated, missed (indicated but not performed), or nonindicated. Automated results were then compared with manual review, and precision and recall were calculated for each treatment determination. RESULTS: Natural language processing identified 184 treatments. Automated timeline summarization was completed for all patients. Treatments were characterized as indicated in a subset of cases including the following: 69% (18 of 26 patients) for airway, 54.5% (6 of 11 patients) for intraosseous access, 11.1% (1 of 9 patients) for needle decompression, 55.6% (10 of 18 patients) for tranexamic acid, 60% (9 of 15 patients) for packed red blood cell, 12.9% (4 of 31 patients) for crystalloid bolus, and 60% (3 of 5 patients) for chest compression system. The most commonly nonindicated treatment was crystalloid bolus (22 of 142 patients). Overall, the automated NLP system performed with high precision and recall with over 70% of comparisons achieving precision and recall of greater than 80%. CONCLUSION: Natural language processing methodologies show promise for enabling automated extraction of procedural indication data and timeline summarization. Future directions should focus on optimizing and expanding these techniques to scale and facilitate broader trauma care performance monitoring. LEVEL OF EVIDENCE: Diagnostic tests or criteria, level III.
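The three-way categorisation described in the methods (indicated, missed, nonindicated) and the precision/recall comparison against manual review can both be expressed as set operations. The sketch below assumes the indications and treatments for a record have already been extracted; the function names and example values are illustrative, not the study's pipeline.

```python
def categorize_treatments(indicated, performed):
    """Split treatments for one record into the three categories used above."""
    indicated, performed = set(indicated), set(performed)
    return {
        "indicated": performed & indicated,     # performed and warranted
        "missed": indicated - performed,        # warranted but not performed
        "nonindicated": performed - indicated,  # performed without an indication
    }


def precision_recall(predicted, gold):
    """Precision and recall of automated determinations against manual review."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical record: what was indicated versus what was actually done.
result = categorize_treatments(
    indicated={"airway", "tranexamic acid"},
    performed={"airway", "crystalloid bolus"},
)
print(result)
```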


Subject(s)
Electronic Health Records/statistics & numerical data , Emergency Medical Services/organization & administration , Natural Language Processing , Quality Assurance, Health Care/methods , Wounds and Injuries/therapy , Emergency Medical Services/statistics & numerical data , Humans , Pilot Projects , Quality Improvement , Wounds and Injuries/diagnosis
15.
Exp Gerontol ; 130: 110794, 2020 02.
Article in English | MEDLINE | ID: mdl-31790801

ABSTRACT

Epidemiological studies have linked age-related hearing loss (ARHL) with an increased risk of neurocognitive decline. Difficulties in speech perception with subsequent changes in brain morphometry, including regions important for lexical-semantic memory, are thought to be a possible mechanism for this relationship. This study investigated differences in automatic and executive lexical-semantic processes on verbal fluency tasks in individuals with acquired hearing loss. The primary outcomes were indices of automatic (clustering/word retrieval at start of task) and executive (switching/word retrieval after start of the task) processes from semantic and phonemic fluency tasks. To extract indices of clustering and switching, we used both manual and computerised methods. There were no differences between groups on indices of executive fluency processes or on any indices from the semantic fluency task. The hearing loss group demonstrated weaker automatic processes on the phonemic fluency task. Further research into differences in lexical-semantic processes with ARHL is warranted.


Subject(s)
Presbycusis/physiopathology , Verbal Behavior/physiology , Aged , Aged, 80 and over , Cognitive Dysfunction/physiopathology , Female , Humans , Male , Memory , Middle Aged , Neuropsychological Tests , Semantics
16.
JAMIA Open ; 2(2): 246-253, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31825016

ABSTRACT

OBJECTIVE: The objective of this study is to demonstrate the feasibility of applying word embeddings to expand the terminology of dietary supplements (DS) using over 26 million clinical notes. METHODS: Word embedding models (ie, word2vec and GloVe) trained on clinical notes were used to predefine a list of top 40 semantically related terms for each of 14 commonly used DS. Each list was further evaluated by experts to generate semantically similar terms. We investigated the effect of corpus size and other settings (ie, vector size and window size) as well as the 2 word embedding models on performance for DS term expansion. We compared the number of clinical notes (and patients they represent) that were retrieved using the word embedding expanded terms to both the baseline terms and external DS sources expanded terms. RESULTS: Using the word embedding models trained on clinical notes, we could identify 1-12 semantically similar terms for each DS. Using the word embedding expanded terms, we were able to retrieve on average 8.39% more clinical notes and 11.68% more patients for each DS compared with 2 sets of terms. Increasing the corpus size results in more misspellings, but not more semantic variants or brand names. The word2vec model was also found to be more capable of detecting semantically similar terms than GloVe. CONCLUSION: Our study demonstrates the utility of word embeddings on clinical notes for terminology expansion on 14 DS. We propose that this method can be potentially applied to create a DS vocabulary for downstream applications, such as information extraction.
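Retrieving the top semantically related terms from an embedding model is a nearest-neighbour search by cosine similarity. The following sketch shows that step with a toy three-dimensional vocabulary; the vectors, terms, and function names are invented for illustration, whereas the study used word2vec and GloVe models trained on clinical notes and a cutoff of 40 candidates.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(term, embeddings, k=3):
    """Return the k vocabulary terms closest to `term`, best first."""
    if term not in embeddings:
        raise KeyError(term)
    query = embeddings[term]
    scored = [(other, cosine(query, vec))
              for other, vec in embeddings.items() if other != term]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors: a misspelling sits close to the seed term, an unrelated drug does not.
toy_embeddings = {
    "melatonin": [0.90, 0.10, 0.00],
    "melatonine": [0.85, 0.15, 0.00],
    "aspirin": [0.00, 0.20, 0.90],
}
print(most_similar("melatonin", toy_embeddings, k=2))
```

The candidate list produced this way still needs expert review, as in the study, because high similarity also surfaces related but non-equivalent terms.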

17.
BMC Med Inform Decis Mak ; 19(1): 183, 2019 09 07.
Article in English | MEDLINE | ID: mdl-31493797

ABSTRACT

BACKGROUND: Medical data sharing is a big challenge in biomedicine, which often hinders collaborative research. Due to privacy concerns, clinical notes cannot be directly shared. Considerable effort has been dedicated to de-identifying clinical notes, but it is still very challenging to accurately locate and scrub all sensitive elements from notes in an automatic manner. An alternative approach is to remove sentences that might contain sensitive terms related to personal information. METHODS: A previous study introduced a frequency-based filtering approach that removes sentences containing low frequency bigrams to improve the privacy protection without significantly decreasing the utility. Our work extends this method to consider clinical notes from distributed sources with security and privacy considerations. We developed a novel secure protocol based on private set intersection and secure thresholding to identify uncommon and low-frequency terms, which can be used to guide sentence filtering. RESULTS: As the computational cost of our proposed framework mostly depends on the cardinality of the intersection of the sets and the number of data owners, we evaluated the framework in terms of these two factors. Experimental results demonstrate that our proposed method is scalable in various experimental settings. In addition, we evaluated our framework in terms of data utility. This evaluation shows that the proposed method is able to retain enough information for data analysis. CONCLUSION: This work demonstrates the feasibility of using homomorphic encryption to develop a secure and efficient multi-party protocol.
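The frequency-based filtering that the secure protocol builds on can be shown in its plain, single-site form: count bigrams across the corpus and drop any sentence containing a bigram seen fewer than a threshold number of times. The sketch below omits the cryptographic layer entirely (private set intersection, secure thresholding), and the whitespace tokenisation, threshold value, and function names are assumptions.

```python
from collections import Counter


def bigrams(sentence):
    """Lower-cased, whitespace-tokenised word bigrams of a sentence."""
    tokens = sentence.lower().split()
    return list(zip(tokens, tokens[1:]))


def filter_sentences(sentences, min_count=2):
    """Keep only sentences whose every bigram occurs at least min_count times."""
    counts = Counter(bg for s in sentences for bg in bigrams(s))
    # Sentences with fewer than two tokens have no bigrams and are kept.
    return [s for s in sentences
            if all(counts[bg] >= min_count for bg in bigrams(s))]


# Illustrative corpus: the rare sentence mentioning a name is dropped.
corpus = [
    "patient denies chest pain",
    "patient denies chest pain",
    "seen by dr smith",
]
print(filter_sentences(corpus))
```

The intuition is that identifying details such as names tend to form rare bigrams, while routine clinical phrasing recurs across notes.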


Subject(s)
Artifacts , Computer Security , Information Dissemination , Electronic Health Records , Humans
18.
Stud Health Technol Inform ; 264: 198-202, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31437913

ABSTRACT

Although a number of foundational natural language processing (NLP) tasks like text segmentation are considered simple problems in the general English domain dominated by well-formed text, complexities of clinical documentation lead to poor performance of existing solutions designed for the general English domain. We present an alternative solution that relies on a convolutional neural network layer followed by a bidirectional long short-term memory layer (CNN-Bi-LSTM) for the task of sentence boundary disambiguation and describe an ensemble approach for domain adaptation using two training corpora. Implementations using the Keras neural-networks API are available at https://github.com/NLPIE/clinical-sentences.


Subject(s)
Natural Language Processing , Neural Networks, Computer , Documentation , Language
19.
Stud Health Technol Inform ; 264: 1586-1587, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31438244

ABSTRACT

Natural language processing (NLP) methods would improve outcomes in the area of prehospital Emergency Medical Services (EMS) data collection and abstraction. This study evaluated off-the-shelf solutions for automating labelling of clinically relevant data from EMS reports. A qualitative approach for choosing the best possible ensemble of pretrained NLP systems was developed and validated along with a feature using word embeddings to test phrase synonymy. The ensemble showed increased performance over individual systems.


Subject(s)
Emergency Medical Services , Natural Language Processing
20.
Stud Health Technol Inform ; 264: 1684-1685, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31438292

ABSTRACT

This study used eye-tracking to understand how the order of note sections influences the way physicians read electronic progress notes. Participants (n = 7) wore an eye-tracking device while reviewing progress notes for four patient cases and then provided a verbal summary. We reviewed and analyzed verbal summaries and eye tracking recordings. Wide variation in reading behaviors existed. There was no relationship between time spent reading a section and section origin of verbal summaries.


Subject(s)
Reading , Comprehension , Electronic Health Records , Eye , Humans
...