Results 1 - 20 of 113
1.
J Biomed Inform ; 154: 104653, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38734158

ABSTRACT

Many approaches in biomedical informatics (BMI) rely on the ability to define, gather, and manipulate biomedical data to support health through a cyclical research-practice lifecycle. Researchers within this field are often fortunate to work closely with healthcare and public health systems to influence data generation and capture and have access to a vast amount of biomedical data. Many informaticists also have the expertise to engage with stakeholders, develop new methods and applications, and influence policy. However, research and policy that explicitly seek to address the systemic drivers of health would more effectively support health. Intersectionality is a theoretical framework that can facilitate such research. It holds that individual human experiences reflect larger socio-structural level systems of privilege and oppression, and cannot be truly understood if these systems are examined in isolation. Intersectionality explicitly accounts for the interrelated nature of systems of privilege and oppression, providing a lens through which to examine and challenge inequities. In this paper, we propose intersectionality as an intervention into how we conduct BMI research. We begin by discussing intersectionality's history and core principles as they apply to BMI. We then elaborate on the potential for intersectionality to stimulate BMI research. Specifically, we posit that our efforts in BMI to improve health should address intersectionality's five key considerations: (1) systems of privilege and oppression that shape health; (2) the interrelated nature of upstream health drivers; (3) the nuances of health outcomes within groups; (4) the problematic and power-laden nature of categories that we assign to people in research and in society; and (5) research to inform and support social change.

2.
Psychiatry Res ; 336: 115893, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38657475

ABSTRACT

Abnormal emotion processing is a core feature of schizophrenia spectrum disorders (SSDs) that encompasses multiple operations. While deficits in some areas have been well-characterized, we understand less about abnormalities in the emotion processing that happens through language, which is highly relevant for social life. Here, we introduce a novel method using deep learning to estimate emotion processing rapidly from spoken language, testing this approach in male-identified patients with SSDs (n = 37) and healthy controls (n = 51). Using free responses to evocative stimuli, we derived a measure of appropriateness, or "emotional alignment" (EA). We examined psychometric characteristics of EA and its sensitivity to a single-dose challenge of oxytocin, a neuropeptide shown to enhance the salience of socioemotional information in SSDs. Patients showed impaired EA relative to controls, and impairment correlated with poorer social cognitive skill and more severe motivation and pleasure deficits. Adding EA to a logistic regression model with language-based measures of formal thought disorder (FTD) improved classification of patients versus controls. Lastly, oxytocin administration improved EA but not FTD among patients. While additional validation work is needed, these initial results suggest that an automated assay using spoken language may be a promising approach to assess emotion processing in SSDs.


Subjects
Emotions , Oxytocin , Schizophrenia , Humans , Male , Adult , Schizophrenia/physiopathology , Emotions/physiology , Middle Aged , Oxytocin/administration & dosage , Deep Learning , Schizophrenic Psychology
3.
Article in English | MEDLINE | ID: mdl-38657567

ABSTRACT

OBJECTIVES: Generative large language models (LLMs) are a subset of transformer-based neural network architecture models. LLMs have successfully leveraged a combination of an increased number of parameters, improvements in computational efficiency, and large pre-training datasets to perform a wide spectrum of natural language processing (NLP) tasks. Using a few examples (few-shot) or no examples (zero-shot) for prompt-tuning has enabled LLMs to achieve state-of-the-art performance in a broad range of NLP applications. This article by the American Medical Informatics Association (AMIA) NLP Working Group characterizes the opportunities, challenges, and best practices for our community to leverage and advance the integration of LLMs in downstream NLP applications effectively. This can be accomplished through a variety of approaches, including augmented prompting, instruction prompt tuning, and reinforcement learning from human feedback (RLHF). TARGET AUDIENCE: Our focus is on making LLMs accessible to the broader biomedical informatics community, including clinicians and researchers who may be unfamiliar with NLP. Additionally, NLP practitioners may gain insight from the described best practices. SCOPE: We focus on 3 broad categories of NLP tasks, namely natural language understanding, natural language inferencing, and natural language generation. We review the emerging trends in prompt tuning, instruction fine-tuning, and evaluation metrics used for LLMs while drawing attention to several issues that impact biomedical NLP applications, including falsehoods in generated text (confabulation/hallucinations), toxicity, and dataset contamination leading to overfitting. We also review potential approaches to address some of these current challenges in LLMs, such as chain of thought prompting, and the phenomenon of emergent capabilities observed in LLMs that can be leveraged to address complex NLP challenges in biomedical applications.

4.
Neurosci Inform ; 4(1)2024 Mar.
Article in English | MEDLINE | ID: mdl-38433986

ABSTRACT

Introduction: While linguistic retrogenesis has been extensively investigated in the neuroscientific and behavioral literature, there has been little work on retrogenesis using computerized approaches to language analysis. Methods: We bridge this gap by introducing a method based on comparing output of a pre-trained neural language model (NLM) with an artificially degraded version of itself to examine the transcripts of speech produced by seniors with and without dementia and healthy children during spontaneous language tasks. We compare a range of linguistic characteristics including language model perplexity, syntactic complexity, lexical frequency and part-of-speech use across these groups. Results: Our results indicate that healthy seniors and children older than 8 years share similar linguistic characteristics, as do dementia patients and children who are younger than 8 years. Discussion: Using computational linguistic methods based on NLMs, our study aligns with the growing evidence that language deterioration in dementia mirrors language acquisition in development. This insight underscores the importance of further research to refine the application of this approach in guiding developmentally appropriate patient care, particularly in early stages.
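
As a rough illustration of the language model perplexity measure named in the abstract above, the sketch below computes perplexity from per-token log-probabilities. The function name and the sample values are hypothetical; the study's own models, degradation procedure, and transcripts are not reproduced here.

```python
import math

def perplexity(token_logprobs):
    """Perplexity: exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical log-probabilities a language model might assign to four tokens.
predictable = [math.log(0.5)] * 4    # each token is fairly expected
surprising = [math.log(0.05)] * 4    # each token is unexpected to the model

print(perplexity(predictable), perplexity(surprising))
```

Higher perplexity means the model found the text less predictable, which is why it can serve as one comparison point between groups of transcripts.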

5.
J Biomed Inform ; 149: 104580, 2024 01.
Article in English | MEDLINE | ID: mdl-38163514

ABSTRACT

The complex linguistic structures and specialized terminology of expert-authored content limit the accessibility of biomedical literature to the general public. Automated methods have the potential to render this literature more interpretable to readers with different educational backgrounds. Prior work has framed such lay language generation as a summarization or simplification task. However, adapting biomedical text for the lay public includes the additional and distinct task of background explanation: adding external content in the form of definitions, motivation, or examples to enhance comprehensibility. This task is especially challenging because the source document may not include the required background knowledge. Furthermore, background explanation capabilities have yet to be formally evaluated, and little is known about how best to enhance them. To address this problem, we introduce Retrieval-Augmented Lay Language (RALL) generation, which intuitively fits the need for external knowledge beyond that in expert-authored source documents. In addition, we introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation. To evaluate RALL, we augmented state-of-the-art text generation models with information retrieval of either term definitions from the UMLS and Wikipedia, or embeddings of explanations from Wikipedia documents. Of these, embedding-based RALL models improved summary quality and simplicity while maintaining factual correctness, suggesting that Wikipedia is a helpful source for background explanation in this context. We also evaluated the ability of both an open-source Large Language Model (Llama 2) and a closed-source Large Language Model (GPT-4) in background explanation, with and without retrieval augmentation. Results indicate that these LLMs can generate simplified content, but that the summary quality is not ideal. 
Taken together, this work presents the first comprehensive study of background explanation for lay language generation, paving the path for disseminating scientific knowledge to a broader audience. Our code and data are publicly available at: https://github.com/LinguisticAnomalies/pls_retrieval.


Subjects
Language , Natural Language Processing , Information Storage and Retrieval , Linguistics , Unified Medical Language System
6.
J Biomed Inform ; 150: 104598, 2024 02.
Article in English | MEDLINE | ID: mdl-38253228

ABSTRACT

OBJECTIVES: We aimed to investigate how errors from automatic speech recognition (ASR) systems affect dementia classification accuracy, specifically in the "Cookie Theft" picture description task. We aimed to assess whether imperfect ASR-generated transcripts could provide valuable information for distinguishing between language samples from cognitively healthy individuals and those with Alzheimer's disease (AD). METHODS: We conducted experiments using various ASR models, refining their transcripts with post-editing techniques. Both these imperfect ASR transcripts and manually transcribed ones were used as inputs for the downstream dementia classification. We conducted comprehensive error analysis to compare model performance and assess ASR-generated transcript effectiveness in dementia classification. RESULTS: Imperfect ASR-generated transcripts surprisingly outperformed manual transcription for distinguishing between individuals with AD and those without in the "Cookie Theft" task. These ASR-based models surpassed the previous state-of-the-art approach, indicating that ASR errors may contain valuable cues related to dementia. The synergy between ASR and classification models improved overall accuracy in dementia classification. CONCLUSION: Imperfect ASR transcripts effectively capture linguistic anomalies linked to dementia, improving accuracy in classification tasks. This synergy between ASR and classification models underscores ASR's potential as a valuable tool in assessing cognitive impairment and related clinical applications.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Speech Perception , Humans , Speech , Language , Alzheimer Disease/diagnosis
7.
J Clin Endocrinol Metab ; 109(2): 402-412, 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-37683082

ABSTRACT

CONTEXT: Thyroid nodule ultrasound-based risk stratification schemas rely on the presence of high-risk sonographic features. However, some malignant thyroid nodules have benign appearance on thyroid ultrasound. New methods for thyroid nodule risk assessment are needed. OBJECTIVE: We investigated polygenic risk score (PRS) accounting for inherited thyroid cancer risk combined with ultrasound-based analysis for improved thyroid nodule risk assessment. METHODS: The convolutional neural network classifier was trained on thyroid ultrasound still images and cine clips from 621 thyroid nodules. Phenome-wide association study (PheWAS) and PRS PheWAS were used to optimize PRS for distinguishing benign and malignant nodules. PRS was evaluated in 73 346 participants in the Colorado Center for Personalized Medicine Biobank. RESULTS: When the deep learning model output was combined with thyroid cancer PRS and genetic ancestry estimates, the area under the receiver operating characteristic curve (AUROC) of the benign vs malignant thyroid nodule classifier increased from 0.83 to 0.89 (DeLong, P value = .007). The combined deep learning and genetic classifier achieved a clinically relevant sensitivity of 0.95, 95% CI [0.88-0.99], specificity of 0.63 [0.55-0.70], and positive and negative predictive values of 0.47 [0.41-0.58] and 0.97 [0.92-0.99], respectively. AUROC improvement was consistent in European ancestry-stratified analysis (0.83 and 0.87 for deep learning and deep learning combined with PRS classifiers, respectively). Elevated PRS was associated with a greater risk of thyroid cancer structural disease recurrence (ordinal logistic regression, P value = .002). CONCLUSION: Augmenting ultrasound-based risk assessment with PRS improves diagnostic accuracy.


Subjects
Thyroid Neoplasms , Thyroid Nodule , Humans , Thyroid Nodule/diagnostic imaging , Thyroid Nodule/genetics , Genetic Risk Score , Sensitivity and Specificity , Local Neoplasm Recurrence , Thyroid Neoplasms/diagnostic imaging , Thyroid Neoplasms/genetics , Ultrasonography/methods
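
The thyroid nodule abstract above reports sensitivity, specificity, and positive and negative predictive values. As a minimal sketch of how these four quantities relate to a 2x2 confusion matrix, the code below uses invented counts; they are not the study's data, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodules correctly flagged
        "specificity": tn / (tn + fp),  # benign nodules correctly cleared
        "ppv": tp / (tp + fp),          # flagged nodules that are malignant
        "npv": tn / (tn + fn),          # cleared nodules that are benign
    }

# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=95, fp=37, fn=5, tn=63)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the classifier, whereas the predictive values also depend on how common malignancy is in the sample, so they do not transfer directly between populations.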
8.
AMIA Jt Summits Transl Sci Proc ; 2023: 360-369, 2023.
Article in English | MEDLINE | ID: mdl-37350929

ABSTRACT

Evidence is growing that machine and deep learning methods can learn the subtle differences between the language produced by people with various forms of cognitive impairment, such as dementia, and that produced by cognitively healthy individuals. Valuable public data repositories such as TalkBank have made it possible for researchers in the computational community to join forces and learn from each other to make significant advances in this area. However, due to variability in approaches and data selection strategies used by various researchers, results obtained by different groups have been difficult to compare directly. In this paper, we present TRESTLE (Toolkit for Reproducible Execution of Speech Text and Language Experiments), an open source platform that focuses on two datasets from the TalkBank repository with dementia detection as an illustrative domain. Successfully deployed in the hackallenge (Hackathon/Challenge) of the International Workshop on Health Intelligence at AAAI 2022, TRESTLE provides a precise digital blueprint of the data pre-processing and selection strategies that can be reused via TRESTLE by other researchers seeking comparable results with their peers and current state-of-the-art (SOTA) approaches.

9.
JAMA ; 329(23): 2028-2037, 2023 06 20.
Article in English | MEDLINE | ID: mdl-37210665

ABSTRACT

Importance: Discussions about goals of care are important for high-quality palliative care yet are often lacking for hospitalized older patients with serious illness. Objective: To evaluate a communication-priming intervention to promote goals-of-care discussions between clinicians and hospitalized older patients with serious illness. Design, Setting, and Participants: A pragmatic, randomized clinical trial of a clinician-facing communication-priming intervention vs usual care was conducted at 3 US hospitals within 1 health care system, including a university, county, and community hospital. Eligible hospitalized patients were aged 55 years or older with any of the chronic illnesses used by the Dartmouth Atlas project to study end-of-life care or were aged 80 years or older. Patients with documented goals-of-care discussions or a palliative care consultation between hospital admission and eligibility screening were excluded. Randomization occurred between April 2020 and March 2021 and was stratified by study site and history of dementia. Intervention: Physicians and advance practice clinicians who were treating the patients randomized to the intervention received a 1-page, patient-specific intervention (Jumpstart Guide) to prompt and guide goals-of-care discussions. Main Outcomes and Measures: The primary outcome was the proportion of patients with electronic health record-documented goals-of-care discussions within 30 days. There was also an evaluation of whether the effect of the intervention varied by age, sex, history of dementia, minoritized race or ethnicity, or study site. Results: Of 3918 patients screened, 2512 were enrolled (mean age, 71.7 [SD, 10.8] years and 42% were women) and randomized (1255 to the intervention group and 1257 to the usual care group). The patients were American Indian or Alaska Native (1.8%), Asian (12%), Black (13%), Hispanic (6%), Native Hawaiian or Pacific Islander (0.5%), non-Hispanic (93%), and White (70%). 
The proportion of patients with electronic health record-documented goals-of-care discussions within 30 days was 34.5% (433 of 1255 patients) in the intervention group vs 30.4% (382 of 1257 patients) in the usual care group (hospital- and dementia-adjusted difference, 4.1% [95% CI, 0.4% to 7.8%]). The analyses of the treatment effect modifiers suggested that the intervention had a larger effect size among patients with minoritized race or ethnicity. Among 803 patients with minoritized race or ethnicity, the hospital- and dementia-adjusted proportion with goals-of-care discussions was 10.2% (95% CI, 4.0% to 16.5%) higher in the intervention group than in the usual care group. Among 1641 non-Hispanic White patients, the adjusted proportion with goals-of-care discussions was 1.6% (95% CI, -3.0% to 6.2%) higher in the intervention group than in the usual care group. There was no evidence of differential treatment effects of the intervention on the primary outcome by age, sex, history of dementia, or study site. Conclusions and Relevance: Among hospitalized older adults with serious illness, a pragmatic clinician-facing communication-priming intervention significantly improved documentation of goals-of-care discussions in the electronic health record, with a greater effect size in racially or ethnically minoritized patients. Trial Registration: ClinicalTrials.gov Identifier: NCT04281784.


Subjects
Dementia , Terminal Care , Humans , Female , Aged , Male , Communication , Hospitalization , Dementia/therapy , Patient Care Planning
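
The trial abstract above reports 433 of 1255 intervention patients and 382 of 1257 usual care patients with documented goals-of-care discussions. The sketch below computes a crude risk difference with a Wald 95% confidence interval from those counts. This is an unadjusted approximation chosen for illustration; the published estimate is adjusted for hospital and dementia history, so the two need not agree exactly. The function name is hypothetical.

```python
from statistics import NormalDist

def risk_difference(events_a, n_a, events_b, n_b, level=0.95):
    """Unadjusted risk difference (a minus b) with a Wald confidence interval."""
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_a - p_b
    # Standard error for the difference of two independent proportions.
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se

# Counts taken from the abstract; the interval method is an assumption.
diff, lower, upper = risk_difference(433, 1255, 382, 1257)
print(f"risk difference {diff:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```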
10.
JMIR Infodemiology ; 3: e40156, 2023.
Article in English | MEDLINE | ID: mdl-37113378

ABSTRACT

Background: Despite increasing awareness about and advances in addressing social media misinformation, the free flow of false COVID-19 information has continued, affecting individuals' preventive behaviors, including masking, testing, and vaccine uptake. Objective: In this paper, we describe our multidisciplinary efforts with a specific focus on methods to (1) gather community needs, (2) develop interventions, and (3) conduct large-scale agile and rapid community assessments to examine and combat COVID-19 misinformation. Methods: We used the Intervention Mapping framework to perform community needs assessment and develop theory-informed interventions. To supplement these rapid and responsive efforts through large-scale online social listening, we developed a novel methodological framework, comprising qualitative inquiry, computational methods, and quantitative network models to analyze publicly available social media data sets to model content-specific misinformation dynamics and guide content tailoring efforts. As part of community needs assessment, we conducted 11 semistructured interviews, 4 listening sessions, and 3 focus groups with community scientists. Further, we used our data repository with 416,927 COVID-19 social media posts to gather information diffusion patterns through digital channels. Results: Our results from community needs assessment revealed the complex intertwining of personal, cultural, and social influences of misinformation on individual behaviors and engagement. Our social media interventions resulted in limited community engagement and indicated the need for consumer advocacy and influencer recruitment. The linking of theoretical constructs underlying health behaviors to COVID-19-related social media interactions through semantic and syntactic features using our computational models has revealed frequent interaction typologies in factual and misleading COVID-19 posts and indicated significant differences in network metrics such as degree. 
The performance of our deep learning classifiers was reasonable, with an F-measure of 0.80 for speech acts and 0.81 for behavior constructs. Conclusions: Our study highlights the strengths of community-based field studies and emphasizes the utility of large-scale social media data sets in enabling rapid intervention tailoring to adapt grassroots community interventions to thwart misinformation seeding and spread among minority communities. Implications for consumer advocacy, data governance, and industry incentives are discussed for the sustainable role of social media solutions in public health.

11.
J Am Med Inform Assoc ; 30(6): 1068-1078, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37043748

ABSTRACT

OBJECTIVE: Compared to natural language processing research investigating suicide risk prediction with social media (SM) data, research utilizing data from clinical settings is scarce. Moreover, the utility of models trained on SM data when applied to text from clinical settings remains unclear. In addition, commonly used performance metrics do not directly translate to operational value in a real-world deployment. The objectives of this study were to evaluate the utility of SM-derived training data for suicide risk prediction in a clinical setting and to develop a metric of the clinical utility of automated triage of patient messages for suicide risk. MATERIALS AND METHODS: Using clinical data, we developed a Bidirectional Encoder Representations from Transformers-based suicide risk detection model to identify messages indicating potential suicide risk. We used both annotated and unlabeled suicide-related SM posts for multi-stage transfer learning, leveraging customized contemporary learning rate schedules. We also developed a novel metric estimating predictive models' potential to reduce follow-up delays with patients in distress and used it to assess model utility. RESULTS: Multi-stage transfer learning from SM data outperformed baseline approaches by traditional classification performance metrics, improving performance from 0.734 to a best F1 score of 0.797. Using this approach for automated triage could reduce response times by 15 minutes per urgent message. DISCUSSION: Despite differences in data characteristics and distribution, publicly available SM data benefit clinical suicide risk prediction when used in conjunction with contemporary transfer learning techniques. Estimates of time saved due to automated triage indicate the potential for the practical impact of such models when deployed as part of established suicide prevention interventions. 
CONCLUSIONS: This work demonstrates a pathway for leveraging publicly available SM data toward improving risk assessment, paving the way for better clinical care and improved clinical outcomes.


Subjects
Social Media , Suicide , Text Messaging , Humans , Benchmarking , Machine Learning
12.
JAMA Netw Open ; 6(3): e231204, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36862411

ABSTRACT

Importance: Many clinical trial outcomes are documented in free-text electronic health records (EHRs), making manual data collection costly and infeasible at scale. Natural language processing (NLP) is a promising approach for measuring such outcomes efficiently, but ignoring NLP-related misclassification may lead to underpowered studies. Objective: To evaluate the performance, feasibility, and power implications of using NLP to measure the primary outcome of EHR-documented goals-of-care discussions in a pragmatic randomized clinical trial of a communication intervention. Design, Setting, and Participants: This diagnostic study compared the performance, feasibility, and power implications of measuring EHR-documented goals-of-care discussions using 3 approaches: (1) deep-learning NLP, (2) NLP-screened human abstraction (manual verification of NLP-positive records), and (3) conventional manual abstraction. The study included hospitalized patients aged 55 years or older with serious illness enrolled between April 23, 2020, and March 26, 2021, in a pragmatic randomized clinical trial of a communication intervention in a multihospital US academic health system. Main Outcomes and Measures: Main outcomes were natural language processing performance characteristics, human abstractor-hours, and misclassification-adjusted statistical power of methods of measuring clinician-documented goals-of-care discussions. Performance of NLP was evaluated with receiver operating characteristic (ROC) curves and precision-recall (PR) analyses, and the effects of misclassification on power were examined using mathematical substitution and Monte Carlo simulation. Results: A total of 2512 trial participants (mean [SD] age, 71.7 [10.8] years; 1456 [58%] female) amassed 44 324 clinical notes during 30-day follow-up. 
In a validation sample of 159 participants, deep-learning NLP trained on a separate training data set identified patients with documented goals-of-care discussions with moderate accuracy (maximal F1 score, 0.82; area under the ROC curve, 0.924; area under the PR curve, 0.879). Manual abstraction of the outcome from the trial data set would require an estimated 2000 abstractor-hours and would power the trial to detect a risk difference of 5.4% (assuming 33.5% control-arm prevalence, 80% power, and 2-sided α = .05). Measuring the outcome by NLP alone would power the trial to detect a risk difference of 7.6%. Measuring the outcome by NLP-screened human abstraction would require 34.3 abstractor-hours to achieve estimated sensitivity of 92.6% and would power the trial to detect a risk difference of 5.7%. Monte Carlo simulations corroborated misclassification-adjusted power calculations. Conclusions and Relevance: In this diagnostic study, deep-learning NLP and NLP-screened human abstraction had favorable characteristics for measuring an EHR outcome at scale. Adjusted power calculations accurately quantified power loss from NLP-related misclassification, suggesting that incorporation of this approach into the design of studies using NLP would be beneficial.


Subjects
Clinical Trials as Topic , Data Collection , Electronic Health Records , Natural Language Processing , Patient Care Planning , Aged , Female , Humans , Male , Computer Simulation , Feasibility Studies , Deep Learning , Data Collection/methods , Middle Aged , Hospitalization
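
The diagnostic study above describes how outcome misclassification by NLP reduces statistical power. The sketch below illustrates the general idea with a normal-approximation power calculation for two proportions, in which imperfect sensitivity and specificity shrink the observable difference between arms. The control-arm prevalence (33.5%), risk difference (5.4%), and two-sided α = .05 come from the abstract; the equal arm size of 1256, the sensitivity and specificity values, and all function names are assumptions made for illustration, and this is not claimed to be the authors' exact procedure.

```python
from statistics import NormalDist

def apparent_rate(true_rate, sensitivity, specificity):
    """Outcome rate as measured by an imperfect classifier."""
    return sensitivity * true_rate + (1.0 - specificity) * (1.0 - true_rate)

def power_two_proportions(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    se = ((p_control * (1 - p_control) + p_treated * (1 - p_treated)) / n_per_arm) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The negligible probability of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(abs(p_treated - p_control) / se - z_crit)

def power_with_misclassification(p_control, risk_difference, n_per_arm,
                                 sensitivity, specificity, alpha=0.05):
    """Power when both arms are measured with the same imperfect classifier."""
    observed_control = apparent_rate(p_control, sensitivity, specificity)
    observed_treated = apparent_rate(p_control + risk_difference, sensitivity, specificity)
    return power_two_proportions(observed_control, observed_treated, n_per_arm, alpha)

# Perfect measurement versus a hypothetical classifier (sens 0.85, spec 0.95).
perfect = power_with_misclassification(0.335, 0.054, 1256, 1.0, 1.0)
imperfect = power_with_misclassification(0.335, 0.054, 1256, 0.85, 0.95)
print(f"perfect: {perfect:.2f}, imperfect: {imperfect:.2f}")
```

Under nondifferential misclassification the observed risk difference equals the true difference multiplied by (sensitivity + specificity - 1), which is why the detectable difference grows as classifier accuracy falls.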
13.
J Biomed Inform ; 140: 104324, 2023 04.
Article in English | MEDLINE | ID: mdl-36842490

ABSTRACT

BACKGROUND: Online health communities (OHCs) have emerged as prominent platforms for behavior modification, and the digitization of online peer interactions has afforded researchers unique opportunities to model multilevel mechanisms that drive behavior change. Existing studies, however, have been limited by a lack of methods that allow the capture of conversational context and socio-behavioral dynamics at scale, as manifested in these digital platforms. OBJECTIVE: We develop, evaluate, and apply a novel methodological framework, Pragmatics to Reveal Intent in Social Media (PRISM), to facilitate granular characterization of peer interactions by combining multidimensional facets of human communication. METHODS: We developed and applied PRISM to analyze peer interactions (N = 2.23 million) in QuitNet, an OHC for tobacco cessation. First, we generated a labeled set of peer interactions (n = 2,005) through manual annotation along three dimensions: communication themes (CTs), behavior change techniques (BCTs), and speech acts (SAs). Second, we used deep learning models to apply our qualitative codes at scale. Third, we applied our validated model to perform a retrospective analysis. Finally, using social network analysis (SNA), we portrayed large-scale patterns and relationships among the aforementioned communication dimensions embedded in peer interactions in QuitNet. RESULTS: Qualitative analysis showed that the themes of social support and behavioral progress were common. The most used BCTs were feedback and monitoring and comparison of behavior, and users most commonly expressed their intentions using SAs-expressive and emotion. With additional in-domain pre-training, bidirectional encoder representations from Transformers (BERT) outperformed other deep learning models on the classification tasks. 
Content-specific SNA revealed that users' engagement or abstinence status is associated with the prevalence of various categories of BCTs and SAs, which also was evident from the visualization of network structures. CONCLUSIONS: Our study describes the interplay of multilevel characteristics of online communication and their association with individual health behaviors.


Subjects
Social Media , Humans , Retrospective Studies , Intention , Social Support , Communication
14.
BMC Med Inform Decis Mak ; 23(1): 2, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36609379

ABSTRACT

BACKGROUND: Low back pain (LBP) is a common condition made up of a variety of anatomic and clinical subtypes. Lumbar disc herniation (LDH) and lumbar spinal stenosis (LSS) are two subtypes highly associated with LBP. Patients with LDH/LSS often start with non-surgical treatments and, if those are not effective, go on to have decompression surgery. However, recommendation of surgery is complicated as the outcome may depend on the patient's health characteristics. We developed a deep learning (DL) model to predict decompression surgery for patients with LDH/LSS. MATERIALS AND METHODS: We used datasets of 8387 and 8620 patients from a prospective study that collected data from four healthcare systems to predict early (within 2 months) and late surgery (within 12 months after a 2 month gap), respectively. We developed a DL model to use patients' demographics, diagnosis and procedure codes, drug names, and diagnostic imaging reports to predict surgery. For each prediction task, we evaluated the model's performance using classical and generalizability evaluation. For classical evaluation, we split the data into training (80%) and testing (20%). For generalizability evaluation, we split the data based on the healthcare system. We used the area under the curve (AUC) to assess performance for each evaluation. We compared results to a benchmark model (i.e. LASSO logistic regression). RESULTS: For classical performance, the DL model outperformed the benchmark model for early surgery with an AUC of 0.725 compared to 0.597. For late surgery, the DL model outperformed the benchmark model with an AUC of 0.655 compared to 0.635. For generalizability performance, the DL model outperformed the benchmark model for early surgery. For late surgery, the benchmark model outperformed the DL model. CONCLUSIONS: For early surgery, the DL model was preferred for classical and generalizability evaluation. 
However, for late surgery, the benchmark and DL model had comparable performance. Depending on the prediction task, the balance of performance may shift between DL and a conventional ML method. As a result, thorough assessment is needed to quantify the value of DL, a relatively computationally expensive, time-consuming and less interpretable method.


Subjects
Deep Learning , Intervertebral Disc Displacement , Low Back Pain , Spinal Stenosis , Humans , Decompression, Surgical/adverse effects , Decompression, Surgical/methods , Prospective Studies , Lumbar Vertebrae/surgery , Low Back Pain/diagnosis , Low Back Pain/surgery , Low Back Pain/complications , Intervertebral Disc Displacement/surgery , Spinal Stenosis/surgery , Treatment Outcome , Retrospective Studies
15.
J Pain Symptom Manage ; 65(3): 233-241, 2023 03.
Article in English | MEDLINE | ID: mdl-36423800

ABSTRACT

CONTEXT: Goals-of-care discussions are important for patient-centered care among hospitalized patients with serious illness. However, there are little data on the occurrence, predictors, and timing of these discussions. OBJECTIVES: To examine the occurrence, predictors, and timing of electronic health record (EHR)-documented goals-of-care discussions for hospitalized patients. METHODS: This retrospective cohort study used natural language processing (NLP) to examine EHR-documented goals-of-care discussions for adults with chronic life-limiting illness or age ≥80 hospitalized 2015-2019. The primary outcome was NLP-identified documentation of a goals-of-care discussion during the index hospitalization. We used multivariable logistic regression to evaluate associations with baseline characteristics. RESULTS: Of 16,262 consecutive, eligible patients without missing data, 5,918 (36.4%) had a documented goals-of-care discussion during hospitalization; approximately 57% of these discussions occurred within 24 hours of admission. In multivariable analysis, documented goals-of-care discussions were more common for women (OR=1.26, 95%CI 1.18-1.36), older patients (OR=1.04 per year, 95%CI 1.03-1.04), and patients with more comorbidities (OR=1.11 per Deyo-Charlson point, 95%CI 1.10-1.13), cancer (OR=1.88, 95%CI 1.72-2.06), dementia (OR=2.60, 95%CI 2.29-2.94), higher acute illness severity (OR=1.12 per National Early Warning Score point, 95%CI 1.11-1.14), or prior advance care planning documents (OR=1.18, 95%CI 1.08-1.30). Documentation of these discussions was less common for racially or ethnically minoritized patients (OR=0.823, 95%CI 0.75-0.90). CONCLUSION: Among hospitalized patients with serious illness, documented goals-of-care discussions identified by NLP were more common among patients with older age and increased burden of acute or chronic illness, and less common among racially or ethnically minoritized patients. This suggests important disparities in goals-of-care discussions.


Subjects
Advance Care Planning , Terminal Care , Adult , Humans , Female , Retrospective Studies , Goals , Chronic Disease
16.
Psychiatr Serv ; 74(4): 407-410, 2023 04 01.
Article in English | MEDLINE | ID: mdl-36164769

ABSTRACT

OBJECTIVE: The authors tested whether natural language processing (NLP) methods can detect and classify cognitive distortions in text messages between clinicians and people with serious mental illness as effectively as clinically trained human raters. METHODS: Text messages (N=7,354) were collected from 39 clients in a randomized controlled trial of a 12-week texting intervention. Clinical annotators labeled messages for common cognitive distortions: mental filtering, jumping to conclusions, catastrophizing, "should" statements, and overgeneralizing. Multiple NLP classification methods were applied to the same messages, and performance was compared. RESULTS: A tuned model that used bidirectional encoder representations from transformers (F1=0.62) achieved performance comparable to that of clinical raters in classifying texts with any distortion (F1=0.63) and superior to that of other models. CONCLUSIONS: NLP methods can be used to effectively detect and classify cognitive distortions in text exchanges, and they have the potential to inform scalable automated tools for clinical support during message-based care for people with serious mental illness.
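
The F1 scores reported in this abstract combine precision and recall into one number. The sketch below shows how a binary F1 can be computed for "any distortion" labels; the function name and the toy label lists are hypothetical and are not drawn from the study.

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = message contains a distortion, 0 = it does not.
reference_labels = [1, 1, 0, 0, 1, 0]
model_labels = [1, 0, 0, 1, 1, 0]

score = f1_score(reference_labels, model_labels)
print(f"F1 = {score:.3f}")
```

The same function can score a human rater against the reference labels, which is one way a model and a rater could be placed on a common scale.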


Subjects
Mental Disorders , Text Messaging , Humans , Natural Language Processing , Mental Disorders/diagnosis , Cognition
17.
AMIA Annu Symp Proc ; 2023: 436-445, 2023.
Article in English | MEDLINE | ID: mdl-38222441

ABSTRACT

Despite the high prevalence and burden of mental health conditions, there is a global shortage of mental health providers. Artificial Intelligence (AI) methods have been proposed as a way to address this shortage, by supporting providers with less extensive training as they deliver care. To this end, we developed the AI-Assisted Provider Platform (A2P2), a text-based virtual therapy interface that includes a response suggestion feature, which supports providers in delivering protocolized therapies empathetically. We studied providers with and without expertise in mental health treatment delivering a therapy session using the platform with (intervention) and without (control) AI-assistance features. Upon evaluation, the AI-assisted system significantly decreased response times by 29.34% (p=0.002), tripled empathic response accuracy (p=0.0001), and increased goal recommendation accuracy by 66.67% (p=0.001) across both user groups compared to the control. Both groups rated the system as having excellent usability.


Subjects
Artificial Intelligence , Mental Disorders , Humans
18.
AMIA Annu Symp Proc ; 2023: 1226-1235, 2023.
Article in English | MEDLINE | ID: mdl-38222407

ABSTRACT

Prior work has shown that analyzing the use of first-person singular pronouns can provide insight into individuals' mental status, especially depression symptom severity. These findings were generated by counting frequencies of first-person singular pronouns in text data. However, counting doesn't capture how these pronouns are used. Recent advances in neural language modeling have leveraged methods generating contextual embeddings. In this study, we sought to utilize the embeddings of first-person pronouns obtained from contextualized language representation models to capture ways these pronouns are used, to analyze mental status. De-identified text messages sent during online psychotherapy with weekly assessment of depression severity were used for evaluation. Results indicate the advantage of contextualized first-person pronoun embeddings over standard classification token embeddings and frequency-based pronoun analysis results in predicting depression symptom severity. This suggests contextual representations of first-person pronouns can enhance the predictive utility of language used by people with depression symptoms.


Subjects
Depression , Text Messaging , Humans , Depression/diagnosis , Language
19.
AMIA Annu Symp Proc ; 2023: 923-932, 2023.
Article in English | MEDLINE | ID: mdl-38222433

ABSTRACT

Natural Language Processing (NLP) methods have been broadly applied to clinical tasks. Machine learning and deep learning approaches have been used to improve the performance of clinical NLP. However, these approaches require sufficiently large datasets for training, and trained models have been shown to transfer poorly across sites. These issues have led to the promotion of data collection and integration across different institutions for accurate and portable models. However, this can introduce a form of bias called confounding by provenance. When source-specific data distributions differ at deployment, this may harm model performance. To address this issue, we evaluate the utility of backdoor adjustment for text classification in a multi-site dataset of clinical notes annotated for mentions of substance abuse. Using an evaluation framework devised to measure robustness to distributional shifts, we assess the utility of backdoor adjustment. Our results indicate that backdoor adjustment can effectively mitigate confounding shift.
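
The abstract does not spell out how backdoor adjustment is applied. One common formulation, assumed here purely for illustration, averages a site-conditional prediction over the marginal distribution of the confounder z (the data source): P(y | x) = sum over z of P(y | x, z) P(z). The function name, site names, and probabilities below are hypothetical.

```python
def backdoor_adjusted(p_y_given_x_z, p_z):
    """Return sum_z P(y=1 | x, z) * P(z), marginalizing out the source z."""
    if abs(sum(p_z.values()) - 1.0) > 1e-9:
        raise ValueError("P(z) must sum to 1")
    return sum(p_y_given_x_z[z] * p_z[z] for z in p_z)


# Made-up values: the classifier's positive-class probability for one note,
# conditioned on each possible source, and the share of notes per source.
p_y_given_x_z = {"site_A": 0.9, "site_B": 0.3}
p_z = {"site_A": 0.5, "site_B": 0.5}

adjusted = backdoor_adjusted(p_y_given_x_z, p_z)
print(f"Adjusted P(y=1 | x) = {adjusted:.2f}")
```

Because the source is averaged out at prediction time, the adjusted score depends less on which site a note came from, which is the intuition behind robustness to a shift in the site mix.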


Subjects
Electronic Health Records , Substance-Related Disorders , Humans , Data Collection , Machine Learning , Natural Language Processing , Multicenter Studies as Topic
20.
J Surg Case Rep ; 2022(11): rjac494, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36389435

ABSTRACT

Endometrial stromal sarcomas are the second most common type of mesenchymal uterine tumors, and they represent 1% of all uterine malignancies. Metastasis of this tumor occurs in about one third of patients, usually to the pelvis and lower genital tract. Metastases to the diaphragm or liver are exceedingly rare, with only a few published cases in the literature. This case presents a 28-year-old woman with a subphrenic endometrial stromal sarcoma metastasis between the right diaphragm and segment IVa of the liver that was treated with surgical resection.
