ABSTRACT
Objective: To use natural language processing (NLP) of clinical notes to augment existing structured electronic health record (EHR) data for classification of a patient's menopausal status. Materials and methods: A rule-based NLP system was designed to capture evidence of a patient's menopause status including dates of a patient's last menstrual period, reproductive surgeries, and postmenopause diagnosis as well as their use of birth control and menstrual interruptions. NLP-derived output was used in combination with structured EHR data to classify a patient's menopausal status. NLP processing and patient classification were performed on a cohort of 307 512 female Veterans receiving healthcare at the US Department of Veterans Affairs (VA). Results: NLP was validated at 99.6% precision. Including the NLP-derived data in a menopause phenotype increased the number of patients with data relevant to their menopausal status by 118%. Using structured codes alone, 81 173 (27.0%) patients could be classified as postmenopausal or premenopausal. With the inclusion of NLP, this number increased to 167 804 (54.6%) patients. The premenopausal category grew by 532.7% with the inclusion of NLP data. Discussion: By employing NLP, it became possible to identify documented data elements that predate VA care, originate outside VA networks, or have no corresponding structured field in the VA EHR that would be otherwise inaccessible for further analysis. Conclusion: NLP can be used to identify concepts relevant to a patient's menopausal status in clinical notes. Adding NLP-derived data to an algorithm classifying a patient's menopausal status significantly increases the number of patients classified using EHR data, ultimately enabling more detailed assessments of the impact of menopause on health outcomes.
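To make the idea of a rule-based extractor concrete, the following is a minimal sketch of how a single rule for last-menstrual-period dates might look. It is not the system described in the abstract: the pattern, the function name `extract_lmp_dates`, the MM/DD/YYYY date format, and the example note are all hypothetical, and real clinical text varies far more than one regular expression can capture.

```python
import re
from datetime import date

# Hypothetical rule: "LMP" or "last menstrual period", then up to 20
# non-digit characters, then a date written as MM/DD/YYYY.
LMP_PATTERN = re.compile(
    r"\b(?:LMP|last menstrual period)\b[^0-9]{0,20}(\d{1,2})/(\d{1,2})/(\d{4})",
    re.IGNORECASE,
)


def extract_lmp_dates(note: str) -> list[date]:
    """Return every last-menstrual-period date the rule finds in a note."""
    found = []
    for month, day, year in LMP_PATTERN.findall(note):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            # Skip strings that look like dates but are not valid calendar dates.
            continue
    return found


# Invented example text, not taken from any real record.
example_note = "Pt reports LMP: 03/14/2015. No hormonal contraception."
print(extract_lmp_dates(example_note))
```

A production system would need many such rules, plus handling for negation, relative dates, and other date formats, before its output could feed a classification algorithm.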
ABSTRACT
BACKGROUND: While evidence-based psychotherapy (EBP) for posttraumatic stress disorder (PTSD) is a first-line treatment, its real-world effectiveness is unknown. We compared cognitive processing therapy (CPT) and prolonged exposure (PE) each to an individual psychotherapy comparator group, and CPT to PE in a large national healthcare system. METHODS: We utilized effectiveness and comparative effectiveness emulated trials using retrospective cohort data from electronic medical records. Participants were veterans with PTSD initiating mental healthcare (N = 265 566). The primary outcome was PTSD symptoms measured by the PTSD Checklist (PCL) at baseline and 24-week follow-up. Emulated trials were comprised of 'person-trials,' representing 112 discrete 24-week periods of care (10/07-6/17) for each patient. Treatment group comparisons were made with generalized linear models, utilizing propensity score matching and inverse probability weights to account for confounding, selection, and non-adherence bias. RESULTS: There were 636 CPT person-trials matched to 636 non-EBP person-trials. Completing ⩾8 CPT sessions was associated with a 6.4-point greater improvement on the PCL (95% CI 3.1-10.0). There were 272 PE person-trials matched to 272 non-EBP person-trials. Completing ⩾8 PE sessions was associated with a 9.7-point greater improvement on the PCL (95% CI 5.4-13.8). There were 232 PE person-trials matched to 232 CPT person-trials. Those completing ⩾8 PE sessions had slightly greater, but not statistically significant, improvement on the PCL (8.3-points; 95% CI 5.9-10.6) than those completing ⩾8 CPT sessions (7.0-points; 95% CI 5.5-8.5). CONCLUSIONS: PTSD symptom improvement was similar and modest for both EBPs. Although EBPs are helpful, research to further improve PTSD care is critical.
Subjects
Posttraumatic Stress Disorders, Veterans, Humans, Posttraumatic Stress Disorders/psychology, Retrospective Studies, Psychotherapy, Veterans/psychology, Electronic Health Records, Treatment Outcome
ABSTRACT
BACKGROUND AND AIMS: Postpolypectomy risk stratification for subsequent metachronous advanced neoplasia (MAN) is imprecise and does not account for colonoscopist adenoma detection rate (ADR). Our aim was to assess association of ADR with MAN and create a prediction model for postpolypectomy risk stratification incorporating ADR and other factors. METHODS: We conducted a retrospective cohort study of individuals with baseline polypectomy and subsequent surveillance colonoscopy from 2004 to 2016 within the U.S. Department of Veterans Affairs (VA). Clinical factors, polyp findings, and baseline colonoscopist ADR were considered for the model. Model performance (sensitivity, specificity, and area under the curve) for identifying individuals with MAN was compared with 2020 U.S. Multi-Society Task Force on Colorectal Cancer (USMSTF) surveillance recommendations. RESULTS: A total of 30,897 individuals were randomly assigned 2:1 into independent model training and validation sets. Increasing age, male sex, diabetes, current smoking, adenoma number, polyp location, adenoma ≥10 mm or with tubulovillous/villous features, and decreasing colonoscopist ADR were independently associated with MAN. A range of 1.48- to 1.66-fold increased risk for MAN was observed for ADR in the lowest 3 quintiles (ADR <19.7%-39.3%) vs the highest quintile (ADR >47.0%). When the final model selected based on the training set was applied to the validation set, improved sensitivity and specificity over 2020 USMSTF risk stratification were achieved (P = .001), with an area under the curve of 0.62 (95% confidence interval, 0.60-0.64). CONCLUSIONS: Colonoscopist ADR is associated with MAN. Combining clinical factors and ADR for risk stratification has potential to improve postpolypectomy risk stratification. Improving ADR is likely to improve postpolypectomy outcomes.
Subjects
Adenoma, Colonic Polyps, Colorectal Neoplasms, Second Primary Neoplasms, Polyps, Humans, Male, Colorectal Neoplasms/diagnosis, Colorectal Neoplasms/epidemiology, Colorectal Neoplasms/surgery, Retrospective Studies, Adenoma/diagnosis, Adenoma/epidemiology, Colonoscopy, Colonic Polyps/diagnosis, Colonic Polyps/surgery
ABSTRACT
BACKGROUND AND AIMS: Traditional serrated adenomas (TSAs) may confer increased risk for colorectal cancer (CRC). Our objective with this study was to examine clinical characteristics and long-term outcomes associated with TSA diagnosis. METHODS: We conducted a retrospective cohort study of U.S. Veterans ≥18 years of age with ≥1 TSA between 1999 and 2018. Baseline characteristics, colonoscopy findings, and diagnosis of incident and fatal CRC were abstracted. Advanced neoplasia was defined by CRC or adenoma with high-grade dysplasia, villous histology, or size ≥1 cm. Follow-up was through CRC diagnosis, death, or end of study (December 31, 2018). RESULTS: A total of 853 Veterans with a baseline TSA were identified; 74% were ≥60 years of age, 96% were men, 14% were Black, and 73% were non-Hispanic White. About 64% were current or former smokers. Over 2044 total person-years of follow-up, there were 11 incident CRC cases and 1 CRC death. Cumulative CRC incidence was 1.34% (95% confidence interval [CI], 0.67%-2.68%), and cumulative CRC death was 0.12% (95% CI, 0.00%-0.35%). Among the subset of 378 TSA patients with ≥1 surveillance colonoscopy, 65.1% had high-risk neoplasia on follow-up. CRC incidence among TSA patients was significantly higher than in a comparison cohort of patients with normal baseline colonoscopy (hazard ratio, 3.70; 95% CI, 1.63-8.41) and similar to a comparison cohort with baseline conventional advanced adenoma (hazard ratio, 0.86; 95% CI, 0.45-1.64). CONCLUSION: Individuals with TSA have substantial risk for CRC based on their cumulative CRC incidence, as well as significant risk of developing other high-risk neoplasia at follow-up surveillance colonoscopy. These data underscore the importance of current recommendations for close colonoscopy surveillance after TSA diagnosis.
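The case count and person-time reported above are enough to work out a crude incidence rate, which can help when comparing against rates quoted per 1000 person-years elsewhere. The sketch below is an illustrative calculation only: the abstract itself reports cumulative incidence rather than a rate, and the function name is hypothetical.

```python
def crude_rate_per_1000(events: int, person_years: float) -> float:
    """Crude incidence rate: events divided by person-time, scaled to 1000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000 * events / person_years


# Figures taken from the abstract: 11 incident CRC cases over 2044 person-years.
crc_rate = crude_rate_per_1000(events=11, person_years=2044)
print(f"{crc_rate:.2f} incident CRC cases per 1000 person-years")
```

A crude rate of this kind ignores how follow-up time is distributed across patients, so it is not interchangeable with the cumulative incidence or hazard ratios reported in the abstract.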
Subjects
Adenoma, Colonic Polyps, Colorectal Neoplasms, Gastrointestinal Neoplasms, Male, Humans, Female, Cohort Studies, Colonic Polyps/pathology, Colorectal Neoplasms/diagnosis, Colorectal Neoplasms/epidemiology, Colorectal Neoplasms/pathology, Retrospective Studies, Risk Factors, Adenoma/diagnosis, Colonoscopy
ABSTRACT
Importance: Reported risk of incident peripheral artery disease (PAD) by sex and race varies significantly and has not been reported in national cohorts among individuals free of baseline PAD. Objective: To evaluate the association of sex and race, as well as prevalent cardiovascular risk factors, with limb outcomes in a national cohort of people with normal baseline ankle-brachial indices (ABIs). Design, setting, and participants: This cohort study was conducted using data from participants in the Veterans Affairs Birth Cohort Study (born 1945-1965), with follow-up data between January 1, 2000, and December 31, 2016. Baseline demographics were collected from 77 041 participants receiving care from the Veterans Health Administration with baseline ABIs of 0.90 to 1.40 and no history of PAD. Data were analyzed from October 2019 through September 2022. Exposures: Sex, race, diabetes, and smoking status. Main Outcomes and Measures: Incident PAD, defined as subsequent ABI less than 0.90, surgical or percutaneous revascularization, or nontraumatic amputation. Results: Of 77 041 participants with normal ABIs (73 822 [95.8%] men; mean [SD] age, 60.2 [5.9] years; 13 080 Black [18.2%] and 54 377 White [75.6%] among 71 911 participants with race and ethnicity data), there were 6692 incident PAD events over a median [IQR] of 3.9 [1.7-6.9] years. Incidence rates were lower for women than men (incidence rates [IRs] per 1000 person-years, 7.4 incidents [95% CI, 6.2-8.8 incidents] vs 19.2 incidents [95% CI, 18.7-19.6 incidents]), with a lower risk of incident PAD (adjusted hazard ratio [aHR], 0.49 [95% CI, 0.41-0.59]). IRs per 1000 person-years of incident PAD were similar for Black and White participants (18.9 incidents [95% CI, 17.9-20.1 incidents] vs 18.8 incidents [95% CI, 18.3-19.4 incidents]). Compared with White participants, Black participants had increased risk of total PAD (aHR, 1.09 [95% CI, 1.02-1.16]) and nontraumatic amputation (aHR, 1.20 [95% CI, 1.06-1.36]) but not surgical or percutaneous revascularization (aHR, 1.10 [95% CI, 0.98-1.23]) or subsequent ABI less than 0.90 (aHR, 1.04 [95% CI, 0.95-1.13]). Diabetes (aHR, 1.62 [95% CI, 1.53-1.72]) and smoking (eg, current vs never: aHR, 1.76 [95% CI, 1.64-1.89]) were associated with incident PAD. Incident PAD was rare among individuals without a history of smoking or diabetes (eg, among 632 women: IR per 1000 person-years, 2.1 incidents [95% CI, 1.0-4.5 incidents]) despite an otherwise-high-risk cardiovascular profile (eg, 527 women [83.4%] with hypertension). Conclusions and Relevance: This study found that the risk of PAD was approximately 50% lower in women than men and less than 10% higher for Black vs White participants, while the risk of nontraumatic amputation was 20% higher among Black compared with White participants.
Subjects
Diabetes Mellitus, Peripheral Arterial Disease, Veterans, Male, Female, Humans, Middle Aged, Ankle Brachial Index, Cohort Studies, Peripheral Arterial Disease/epidemiology, Diabetes Mellitus/epidemiology
ABSTRACT
Importance: Telehealth enables access to genetics clinicians, but impact on care coordination is unknown. Objective: To assess care coordination and equity of genetic care delivered by centralized telehealth and traditional genetic care models. Design, Setting, and Participants: This cross-sectional study included patients referred for genetic consultation from 2010 to 2017 with 2 years of follow-up in the US Department of Veterans Affairs (VA) health care system. Patients were excluded if they were referred for research, cytogenetic, or infectious disease testing, or if their care model could not be determined. Exposures: Genetic care models, which included VA-telehealth (ie, a centralized team of genetic counselors serving VA facilities nationwide), VA-traditional (ie, a regional service by clinical geneticists and genetic counselors), and non-VA care (ie, community care purchased by the VA). Main Outcomes and Measures: Multivariate regression models were used to assess associations between patient and consultation characteristics and the type of genetic care model referral; consultation completion; and having 0, 1, or 2 or more cancer surveillance (eg, colonoscopy) and risk-reducing procedures (eg, bilateral mastectomy) within 2 years following referral. Results: In this study, 24 778 patients with genetics referrals were identified, including 12 671 women (51.1%), 13 193 patients aged 50 years or older (53.2%), 15 639 White patients (63.1%), and 15 438 patients with cancer-related referrals (62.3%). The VA-telehealth model received 14 580 of the 24 778 consultations (58.8%). Asian patients, American Indian or Alaskan Native patients, and Hawaiian or Pacific Islander patients were less likely to be referred to VA-telehealth than White patients (OR, 0.54; 95% CI, 0.35-0.84) compared with the VA-traditional model. Completing consultations was less likely with non-VA care than the VA-traditional model (OR, 0.45; 95% CI, 0.35-0.57); there were no differences in completing consultations between the VA models. Black patients were less likely to complete consultations than White patients (OR, 0.84; 95% CI, 0.76-0.93), but only if referred to the VA-telehealth model. Patients were more likely to have multiple cancer preventive procedures if they completed their consultations (OR, 1.55; 95% CI, 1.40-1.72) but only if their consultations were completed with the VA-traditional model. Conclusions and Relevance: In this cross-sectional study, the VA-telehealth model was associated with improved access to genetics clinicians but also with exacerbated health care disparities and hindered care coordination. Addressing structural barriers and the needs and preferences of vulnerable subpopulations may complement the centralized telehealth approach, improve care coordination, and help mitigate health care disparities.
Subjects
Breast Neoplasms, Telemedicine, Veterans, Cross-Sectional Studies, Demography, Female, Healthcare Disparities, Humans, Mastectomy, Referral and Consultation, Retrospective Studies, United States, United States Department of Veterans Affairs
ABSTRACT
Cognitive processing therapy (CPT) and prolonged exposure therapy (PE) are effective psychotherapies for post-traumatic stress disorder (PTSD). However, these treatments also have high rates of dropout and non-response. Therefore, patients may need a second course of treatment. We compared outcomes for patients who switched between CPT/PE and those who repeated CPT/PE during a second course of treatment. We collected data from Iraq and Afghanistan war veterans (n = 2,958) who received a second course of CPT/PE in the Veterans Health Administration from 2001 to 2017 and had symptom outcomes (PTSD checklist; PCL). We measured the association between treatment sequence and change in PCL score over the second course of treatment using hierarchical Bayesian regression, adjusted for sociodemographic and clinical characteristics. All treatment sequences showed a significant reduction in PCL score over time (β = -4.80; HDI95: -5.74, -3.86). Veterans who switched from CPT to PE had modestly greater PCL reductions during the second course than those who repeated CPT. However, no significant difference in PCL change during the second course was observed between veterans who repeated PE and those who switched from PE to CPT. Veterans participating in a second course of CPT/PE can benefit, and switching treatment may be slightly more beneficial following CPT.
Subjects
Cognitive Behavioral Therapy, Implosive Therapy, Posttraumatic Stress Disorders, Veterans, Bayes Theorem, Humans, Posttraumatic Stress Disorders/psychology, Posttraumatic Stress Disorders/therapy, Treatment Outcome, United States, United States Department of Veterans Affairs, Veterans/psychology
ABSTRACT
BACKGROUND: In this study we sought to explore the possibility of using patient centered care (PCC) documentation as a measure of the delivery of PCC in a health system. METHODS: We first selected 6 VA medical centers based on their scores for a measure of support for self-management subscale from a national patient satisfaction survey (the Survey for Healthcare Experience-Patients). We accessed clinical notes related to either smoking cessation or weight management consults. We then annotated this dataset of notes for documentation of PCC concepts including: patient goals, provider support for goal progress, social context, shared decision making, mention of caregivers, and use of the patient's voice. We examined the association of documentation of PCC with patients' perception of support for self-management with regression analyses. RESULTS: Two health centers had < 50 notes related to either tobacco cessation or weight management consults and were removed from further analysis. The resulting dataset includes 477 notes related to 311 patients total from 4 medical centers. For a majority of patients (201 out of 311; 64.8%) at least one PCC concept was present in their clinical notes. The most common PCC concepts documented were patient goals (patients n = 126, 63%; clinical notes n = 302, 63%), patient voice (patients n = 165, 82%; clinical notes n = 323, 68%), social context (patients n = 105, 52%; clinical notes n = 181, 38%), and provider support for goal progress (patients n = 124, 62%; clinical notes n = 191, 40%). Documentation of goals for weight loss notes was greater at health centers with higher satisfaction scores compared to low. No such relationship was found for notes related to tobacco cessation. CONCLUSION: Providers document PCC concepts in their clinical notes. In this pilot study we explored the feasibility of using these data as a means to measure the degree to which care in a health center is patient centered. PRACTICE IMPLICATIONS: Clinical EHR notes are a rich source of information about PCC that could potentially be used to assess PCC over time and across systems with scalable technologies such as natural language processing.
Subjects
Documentation, Electronic Health Records, Humans, Patient Satisfaction, Patient-Centered Care, Pilot Projects
ABSTRACT
BACKGROUND: Deaths from pneumonia were decreasing globally prior to the COVID-19 pandemic, but it is unclear whether this was due to changes in patient populations, illness severity, diagnosis, hospitalization thresholds, or treatment. Using clinical data from the electronic health record among a national cohort of patients initially diagnosed with pneumonia, we examined temporal trends in severity of illness, hospitalization, and short- and long-term deaths. DESIGN: Retrospective cohort. PARTICIPANTS: All patients >18 years presenting to emergency departments (EDs) at 118 VA Medical Centers between 1/1/2006 and 12/31/2016 with an initial clinical diagnosis of pneumonia and confirmed by chest imaging report. EXPOSURES: Year of encounter. MAIN MEASURES: Hospitalization and 30-day and 90-day mortality. Illness severity was defined as the probability of each outcome predicted by machine learning predictive models using age, sex, comorbidities, vital signs, and laboratory data from encounters during years 2006-2007, and similar models trained on encounters from years 2015 to 2016. We estimated the changes in hospitalizations and 30-day and 90-day mortality between the first and the last 2 years of the study period accounted for by illness severity using time covariate decompositions with model estimates. RESULTS: Among 196,899 encounters across the study period, hospitalization decreased from 71 to 63%, 30-day mortality 10 to 7%, 90-day mortality 16 to 12%, and 1-year mortality 29 to 24%. Comorbidity risk increased, but illness severity decreased. Decreases in illness severity accounted for 21-31% of the decrease in hospitalizations, and 45-47%, 32-24%, and 17-19% of the decrease in 30-day, 90-day, and 1-year mortality. Findings were similar among underrepresented patients and those with only hospital discharge diagnosis codes. CONCLUSIONS: Outcomes for community-onset pneumonia have improved across the VA healthcare system after accounting for illness severity, despite an increase in cases and comorbidity burden.
Subjects
COVID-19, Pneumonia, Veterans, Humans, United States/epidemiology, Retrospective Studies, Pandemics, COVID-19/therapy, Hospitalization, Patient Acuity, Hospitals
ABSTRACT
BACKGROUND: Despite its high prevalence and clinical impact, research on peripheral artery disease (PAD) remains limited due to poor accuracy of billing codes. Ankle-brachial index (ABI) and toe-brachial index can be used to identify PAD patients with high accuracy within electronic health records. METHODS: We developed a novel natural language processing (NLP) algorithm for extracting ABI and toe-brachial index values and laterality (right or left) from ABI reports. A random sample of 800 reports from 94 Veterans Affairs facilities during 2015 to 2017 was selected and annotated by clinical experts. We trained the NLP system using random forest models and optimized it through sequential iterations of 10-fold cross-validation and error analysis on 600 test reports and evaluated its final performance on a separate set of 200 reports. We also assessed the accuracy of NLP-extracted ABI and toe-brachial index values for identifying patients with PAD in a separate cohort undergoing ABI testing. RESULTS: The NLP system had an overall precision (positive predictive value) of 0.85, recall (sensitivity) of 0.93, and F1 measure (accuracy) of 0.89 to correctly identify ABI/toe-brachial index values and laterality. Among 261 patients with ABI testing (49% PAD), the NLP system achieved a positive predictive value of 92.3%, sensitivity of 83.1%, and specificity of 93.1% to identify PAD when compared with a structured chart review. The above findings were consistent in a range of sensitivity analyses. CONCLUSIONS: We successfully developed and validated an NLP system for identifying patients with PAD within the Veterans Affairs electronic health record. Our findings have broad implications for PAD research and quality improvement.
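The three extraction metrics reported above are linked: the F1 measure is the harmonic mean of precision and recall. The short sketch below shows that relationship using the two reported values; the function name is illustrative and the code is not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(precision=0.85, recall=0.93)
print(round(f1, 2))
```

Because it is a harmonic mean, F1 sits closer to the lower of the two inputs, which is why it is often preferred over a simple average when precision and recall diverge.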
Subjects
Ankle Brachial Index, Peripheral Arterial Disease, Ankle, Ankle Brachial Index/methods, Humans, Lower Extremity, Peripheral Arterial Disease/diagnosis, Peripheral Arterial Disease/epidemiology, Predictive Value of Tests, Treatment Outcome
ABSTRACT
RATIONALE, AIMS AND OBJECTIVES: As quality measurement becomes increasingly reliant on the availability of structured electronic medical record (EMR) data, clinicians are asked to perform documentation using tools that facilitate data capture. These tools may not be available, feasible, or acceptable in all clinical scenarios. Alternative methods of assessment, including natural language processing (NLP) of clinical notes, may improve the completeness of quality measurement in real-world practice. Our objective was to measure the quality of care for a set of evidence-based practices using structured EMR data alone, and then supplement those measures with additional data derived from NLP. METHOD: As a case example, we studied the quality of care for posttraumatic stress disorder (PTSD) in the United States Department of Veterans Affairs (VA) over a 20-year period. We measured two aspects of PTSD care, including delivery of evidence-based psychotherapy (EBP) and associated use of measurement-based care (MBC), using structured EMR data. We then recalculated these measures using additional data derived from NLP of clinical note text. RESULTS: There were 2 098 389 VA patients with a diagnosis of PTSD between 2000 and 2019, 72% (n = 1 515 345) of whom had not previously received EBP for PTSD and were treated after a 2015 mandate to document EBP using templates that generate structured EMR data. Using structured EMR data, we determined that 3.2% (n = 48 004) of those patients met our EBP for PTSD quality standard between 2015 and 2019, and 48.1% (n = 23 088) received associated MBC. With the addition of NLP-derived data, estimates increased to 4.1% (n = 62 789) and 58.0% (n = 36 435), respectively. CONCLUSION: Healthcare quality data can be significantly improved by supplementing structured EMR data with NLP-derived data. By using NLP, health systems may be able to fill the gaps in documentation when structured tools are not yet available or there are barriers to using them in clinical practice.
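The percentages in the results can be reproduced from the reported counts, which also makes the denominators explicit. The sketch below assumes, based only on the arithmetic, that the EBP rate is taken over the 1 515 345 eligible patients and that the MBC rate is taken over patients who met the EBP standard; the variable and function names are illustrative.

```python
# Counts as reported in the abstract.
ELIGIBLE_PATIENTS = 1_515_345

structured_only = {"ebp": 48_004, "mbc": 23_088}
structured_plus_nlp = {"ebp": 62_789, "mbc": 36_435}


def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


for label, counts in (
    ("structured only", structured_only),
    ("structured + NLP", structured_plus_nlp),
):
    # Assumed denominators: all eligible patients for EBP, EBP recipients for MBC.
    ebp_rate = percent(counts["ebp"], ELIGIBLE_PATIENTS)
    mbc_rate = percent(counts["mbc"], counts["ebp"])
    print(f"{label}: EBP {ebp_rate}%, MBC among EBP recipients {mbc_rate}%")
```

Under these assumed denominators the computed values match the percentages stated in the abstract for both the structured-only and NLP-supplemented measures.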
Subjects
Natural Language Processing, Posttraumatic Stress Disorders, Electronic Health Records, Humans, Psychotherapy, Posttraumatic Stress Disorders/therapy, United States, United States Department of Veterans Affairs
ABSTRACT
Background Sacubitril/valsartan, a first-in-class angiotensin receptor neprilysin inhibitor, received US Food and Drug Administration approval in 2015 for heart failure with reduced ejection fraction (HFrEF). Our objective was to describe the sacubitril/valsartan initiation rate, associated characteristics, and 6-month follow-up dosing among veterans with HFrEF who are renin-angiotensin-aldosterone system inhibitor (RAASi) naïve. Methods and Results Retrospective cohort study of veterans with HFrEF who are RAASi naïve defined as left ventricular ejection fraction (LVEF) ≤40%; ≥1 in/outpatient heart failure visit, first RAASi (sacubitril/valsartan, angiotensin-converting enzyme inhibitor [ACEI], or angiotensin-II receptor blocker [ARB]) fill from July 2015 to June 2019. Characteristics associated with sacubitril/valsartan initiation were identified using Poisson regression models. From July 2015 to June 2019, we identified 3458 sacubitril/valsartan and 29 367 ACEI or ARB initiators among veterans with HFrEF who are RAASi naïve. Sacubitril/valsartan initiation increased from 0% to 26.5%. Sacubitril/valsartan (versus ACEI or ARB) initiators were less likely to have histories of stroke, myocardial infarction, or hypertension and more likely to be older and have diabetes mellitus and lower LVEF. At 6-month follow-up, the prevalence of ≥50% target daily dose for sacubitril/valsartan, ACEI, and ARB initiators was 23.5%, 43.2%, and 47.1%, respectively. Conclusions Sacubitril/valsartan initiation for HFrEF in the Veterans Administration increased in the 4 years immediately following Food and Drug Administration approval. Sacubitril/valsartan (versus ACEI or ARB) initiators had fewer baseline cardiovascular comorbidities and the lowest proportion on ≥50% target daily dose at 6-month follow-up. Identifying the reasons for lower follow-up dosing of sacubitril/valsartan could support guideline recommendations and quality improvement strategies for patients with HFrEF.
Subjects
Aminobutyrates/therapeutic use, Biphenyl Compounds/therapeutic use, Heart Failure, Valsartan/therapeutic use, Veterans, Aldosterone, Angiotensin Receptor Antagonists/therapeutic use, Angiotensin-Converting Enzyme Inhibitors/therapeutic use, Heart Failure/diagnosis, Heart Failure/drug therapy, Heart Failure/epidemiology, Humans, Renin-Angiotensin System, Retrospective Studies, Stroke Volume, Tetrazoles/therapeutic use, Left Ventricular Function
ABSTRACT
OBJECTIVE: Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs. MATERIALS AND METHODS: A broad literature search was conducted in February 2021 using 3 scholarly databases (ACL Anthology, PubMed, and Scopus) following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 6402 publications were initially identified, and after applying the study inclusion criteria, 82 publications were selected for the final review. RESULTS: Smoking status (n = 27), substance use (n = 21), homelessness (n = 20), and alcohol use (n = 15) are the most frequently studied SDoH categories. Homelessness (n = 7) and other less-studied SDoH (eg, education, financial problems, social isolation and support, family problems) are mostly identified using rule-based approaches. In contrast, machine learning approaches are popular for identifying smoking status (n = 13), substance use (n = 9), and alcohol use (n = 9). CONCLUSION: NLP offers significant potential to extract SDoH data from narrative clinical notes, which in turn can aid in the development of screening tools, risk prediction models, and clinical decision support systems.
Subjects
Electronic Health Records, Natural Language Processing, Data Management, Humans, Machine Learning, Social Determinants of Health
ABSTRACT
PURPOSE: Prostate cancer (PCa) is among the leading causes of cancer deaths. While localized PCa has a 5-year survival rate approaching 100%, this rate drops to 31% for metastatic prostate cancer (mPCa). Thus, timely identification of mPCa is a crucial step toward measuring and improving access to innovations that reduce PCa mortality. Yet, methods to identify patients diagnosed with mPCa remain elusive. Cancer registries provide detailed data at diagnosis but are not updated throughout treatment. This study reports on the development and validation of a natural language processing (NLP) algorithm deployed on oncology, urology, and radiology clinical notes to identify patients with a diagnosis or history of mPCa in the Department of Veterans Affairs. PATIENTS AND METHODS: Using a broad set of diagnosis and histology codes, the Veterans Affairs Corporate Data Warehouse was queried to identify all Veterans with PCa. An NLP algorithm was developed to identify patients with any history or progression of mPCa. The NLP algorithm was prototyped and developed iteratively using patient notes, grouped into development, training, and validation subsets. RESULTS: A total of 1,144,610 Veterans were diagnosed with PCa between January 2000 and October 2020, among which 76,082 (6.6%) were identified by NLP as having mPCa at some point during their care. The NLP system performed with a specificity of 0.979 and sensitivity of 0.919. CONCLUSION: Clinical documentation of mPCa is highly reliable. NLP can be leveraged to improve PCa data. When compared to other methods, NLP identified a significantly greater number of patients. NLP can be used to augment cancer registry data, facilitate research inquiries, and identify patients who may benefit from innovations in mPCa treatment.
Subjects
Prostatic Neoplasms, Veterans, Algorithms, Electronic Health Records, Humans, Male, Natural Language Processing, Prostatic Neoplasms/diagnosis, Prostatic Neoplasms/therapy
ABSTRACT
BACKGROUND: Pulmonary hypertension incidence based on echocardiographic estimates of pulmonary artery systolic pressure in people living with HIV remains unstudied. We aimed to determine whether people living with HIV have higher incidence and risk of pulmonary hypertension than uninfected individuals. METHODS: In this retrospective cohort study, we evaluated data from participants in the Veterans Aging Cohort Study (VACS) referred for echocardiography with baseline pulmonary artery systolic pressure measures of 35 mm Hg or less. Incident pulmonary hypertension was defined as pulmonary artery systolic pressure higher than 35 mm Hg on subsequent echocardiogram. We used Poisson regression to estimate incidence rates (IRs) of pulmonary hypertension by HIV status. We then estimated hazard ratios (HRs) by HIV status using Cox proportional hazards regression. We further categorised veterans with HIV by CD4 count or HIV viral load to assess the association between pulmonary hypertension risk and HIV severity. Models included age, sex, race or ethnicity, prevalent heart failure, chronic obstructive pulmonary disease, hypertension, smoking status, diabetes, body-mass index, estimated glomerular filtration rate, hepatitis C virus infection, liver cirrhosis, and drug use as covariates. FINDINGS: Of 21 314 VACS participants with at least one measured PASP on or after April 1, 2003, 13 028 VACS participants were included in the analytic sample (4174 [32%] with HIV and 8854 [68%] without HIV). Median age was 58 years and 12 657 (97%) were male. Median follow-up time was 3·1 years (IQR 0·9-6·8) spanning from April 1, 2003, to Sept 30, 2017. Unadjusted IRs per 1000 person-years were higher in veterans with HIV (IR 28·6 [95% CI 26·1-31·3]) than in veterans without HIV (IR 23·4 [21·9-24·9]; p=0·0004). The risk of incident pulmonary hypertension was higher among veterans with HIV than among veterans without HIV (unadjusted HR 1·25 [95% CI 1·12-1·40], p<0·0001). 
After multivariable adjustment, this association was slightly attenuated but remained significant (HR 1·18 [1·05-1·34], p=0·0062). Veterans with HIV who had a CD4 count lower than 200 cells per µL or of 200-499 cells per µL had a higher risk of pulmonary hypertension than did veterans without HIV (HR 1·94 [1·49-2·54], p<0·0001, for those with <200 cells per µL and HR 1·29 [1·08-1·53], p=0·0048, for those with 200-499 cells per µL). Similarly, veterans with HIV who had HIV viral loads of 500 copies per mL or more had a higher risk of pulmonary hypertension than did veterans without HIV (HR 1·88 [1·46-2·42], p<0·0001). INTERPRETATION: HIV is associated with pulmonary hypertension incidence, adjusting for risk factors. Low CD4 cell count and high HIV viral load contribute to increased pulmonary hypertension risk among veterans with HIV. Thus, as with other cardiopulmonary diseases, suppression of HIV should be prioritised to lessen the burden of pulmonary hypertension in people living with HIV. FUNDING: National Heart, Lung, and Blood Institute (National Institutes of Health, USA); National Institute on Alcohol Abuse and Alcoholism (National Institutes of Health, USA).
Subjects
HIV Infections, HIV Seropositivity, HIV-1, Pulmonary Hypertension, Veterans, Cohort Studies, Female, Humans, Male, Middle Aged, Retrospective Studies
ABSTRACT
BACKGROUND: Patient travel history can be crucial in evaluating evolving infectious disease events. Such information can be challenging to acquire in electronic health records, as it is often available only in unstructured text. OBJECTIVE: This study aims to assess the feasibility of annotating and automatically extracting travel history mentions from unstructured clinical documents in the Department of Veterans Affairs across disparate health care facilities and among millions of patients. Information about travel exposure augments existing surveillance applications for increased preparedness in responding quickly to public health threats. METHODS: Clinical documents related to arboviral disease were annotated following selection using a semiautomated bootstrapping process. Using annotated instances as training data, models were developed to extract from unstructured clinical text any mention of affirmed travel locations outside of the continental United States. Automated text processing models were evaluated, involving machine learning and neural language models for extraction accuracy. RESULTS: Among 4584 annotated instances, 2659 (58%) contained an affirmed mention of travel history, while 347 (7.6%) were negated. Interannotator agreement resulted in a document-level Cohen kappa of 0.776. Automated text processing accuracy (F1 85.6, 95% CI 82.5-87.9) and computational burden were acceptable such that the system can provide a rapid screen for public health events. CONCLUSIONS: Automated extraction of patient travel history from clinical documents is feasible for enhanced passive surveillance public health systems. Without such a system, it would usually be necessary to manually review charts to identify recent travel or lack of travel, use an electronic health record that enforces travel history documentation, or ignore this potential source of information altogether. The development of this tool was initially motivated by emergent arboviral diseases. 
More recently, this system was used in the early phases of response to COVID-19 in the United States, although its utility was limited to a relatively brief window due to the rapid domestic spread of the virus. Such systems may aid future efforts to prevent and contain the spread of infectious diseases.
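The document-level Cohen kappa reported above measures how far two annotators agree beyond what chance alone would produce. A minimal Python sketch of that calculation follows. The function name `cohen_kappa` and the toy labels are illustrative assumptions, not the study's code or data.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same documents."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of documents given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical document-level labels (1 = affirmed travel mention, 0 = none).
annotator_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the annotators agree on 8 of 10 documents, but because half of that agreement is expected by chance, kappa is noticeably lower than the raw agreement rate.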
Subjects
Emerging Communicable Diseases/diagnosis, Electronic Health Records, Information Storage and Retrieval/methods, Public Health Surveillance/methods, Travel/statistics & numerical data, Algorithms, COVID-19/epidemiology, Emerging Communicable Diseases/epidemiology, Feasibility Studies, Female, Humans, Machine Learning, Male, Middle Aged, Natural Language Processing, Reproducibility of Results, United States/epidemiology
ABSTRACT
Rationale: Computerized severity assessment for community-acquired pneumonia could improve consistency and reduce clinician burden. Objectives: To develop and compare 30-day mortality-prediction models using electronic health record data, including a computerized score with all variables from the original Pneumonia Severity Index (PSI) except confusion and pleural effusion ("ePSI score") versus models with additional variables. Methods: Among adults with community-acquired pneumonia presenting to emergency departments at 117 Veterans Affairs Medical Centers between January 1, 2006, and December 31, 2016, we compared an ePSI score with 10 novel models employing logistic regression, spline, and machine learning methods using PSI variables, age, sex, and 26 physiologic variables as well as all 69 PSI variables. Models were trained using encounters before January 1, 2015; tested on encounters on or after January 1, 2015; and compared using the areas under the receiver operating characteristic curve, confidence intervals, and patient event rates at a threshold PSI score of ≤70. Results: Among 297,498 encounters, 7% resulted in death within 30 days. Compared with the ePSI score (confidence interval [CI] for the area under the receiver operating characteristic curve, 0.77-0.78), performance increased with model complexity (CI for the logistic regression PSI model, 0.79-0.80; CI for the boosted decision-tree algorithm machine learning PSI model using the Extreme Gradient Boosting algorithm [mlPSI] with the 19 original PSI factors, 0.83-0.85) and the number of variables (CI for the logistic regression PSI model using all 69 variables, 0.84-0.85; CI for the mlPSI with all 69 variables, 0.86-0.87). Models limited to age, sex, and physiologic variables also demonstrated high performance (CI for the mlPSI with age, sex, and 26 physiologic factors, 0.84-0.85). 
At an ePSI score of ≤70 and a mortality-risk cutoff of <2.7%, the ePSI score identified 31% of all patients as being at "low risk"; the mlPSI with age, sex, and 26 physiologic factors identified 53% of all patients as being at low risk; and the mlPSI with all 69 variables identified 56% of all patients as being at low risk, with similar rates of mortality, hospitalization, and 7-day secondary hospitalization. Conclusions: Computerized versions of the PSI accurately identified patients with pneumonia who were at low risk of death. More complex models classified more patients as being at low risk of death, with similar rates of adverse outcomes.
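The evaluation design described above, training on earlier encounters and comparing models by area under the receiver operating characteristic curve (AUROC) on later ones, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the `auroc` helper, the `encounters` records, and their risk values are hypothetical and are not the study's models or data.

```python
from datetime import date

def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical encounters: (date, predicted 30-day mortality risk, died within 30 days).
encounters = [
    (date(2014, 3, 1), 0.02, 0),
    (date(2014, 9, 15), 0.30, 1),
    (date(2015, 1, 1), 0.05, 0),
    (date(2015, 6, 10), 0.40, 1),
    (date(2016, 2, 20), 0.10, 0),
    (date(2016, 11, 5), 0.08, 1),
]

# Temporal split mirroring the abstract: train before the cutoff, test on or after it.
cutoff = date(2015, 1, 1)
train = [e for e in encounters if e[0] < cutoff]
held_out = [e for e in encounters if e[0] >= cutoff]

test_auc = auroc([e[2] for e in held_out], [e[1] for e in held_out])
print(len(train), len(held_out), test_auc)
```

The pairwise loop is quadratic in the number of encounters, which is fine for a toy example; a rank-based formulation would be the usual choice for a cohort of hundreds of thousands of encounters.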
Subjects
Community-Acquired Infections, Pneumonia, Veterans, Adult, Humans, Logistic Models, Prognosis, ROC Curve, Severity of Illness Index
ABSTRACT
Use of Electronic Nicotine Delivery Systems (ENDS, colloquially known as "electronic cigarettes") has increased substantially in the United States in the decade since 2010. However, relatively little is known about how ENDS use is documented in clinical notes. In this study, we describe the development of an annotation scheme (and associated annotated corpus) consisting of 4,351 ENDS mentions derived from Department of Veterans Affairs clinical notes during the period 2010-2020. Analysis of our corpus provides important insights into ENDS documentation practices at the VA, in addition to providing a resource for the future development and validation of Natural Language Processing algorithms capable of reliably identifying ENDS-use status.
Subjects
Electronic Nicotine Delivery Systems, Vaping, Veterans, Documentation, Humans, Natural Language Processing, United States
ABSTRACT
Despite the impressive success of machine learning algorithms in clinical natural language processing (cNLP), rule-based approaches still have a prominent role. In this paper, we introduce medspaCy, an extensible, open-source cNLP library based on the spaCy framework that allows flexible integration of rule-based and machine learning-based algorithms adapted to clinical text. MedspaCy includes a variety of components that meet common cNLP needs such as context analysis and mapping to standard terminologies. By utilizing spaCy's clear and easy-to-use conventions, medspaCy enables development of custom pipelines that integrate easily with other spaCy-based modules. Our toolkit includes several core components and facilitates rapid development of pipelines for clinical text.
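To make the idea of rule-based concept matching with context analysis concrete, here is a minimal pure-Python sketch. It deliberately does not use medspaCy's API. The names `TARGET_TERMS`, `NEGATION_TRIGGERS`, `WINDOW`, and `find_concepts` are hypothetical, and the fixed look-back window is a simplification of ConText-style negation rules rather than the library's actual algorithm.

```python
import re

# Hypothetical target concepts and negation triggers (illustrative only).
TARGET_TERMS = {"pneumonia": "PROBLEM", "fever": "PROBLEM"}
NEGATION_TRIGGERS = ("no evidence of", "denies", "no", "without")
WINDOW = 5  # how many tokens before a target to search for a trigger

def find_concepts(text):
    """Return (term, label, is_negated) for each target term found in text."""
    results = []
    # Split into sentences so a trigger cannot negate a later sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        for term, label in TARGET_TERMS.items():
            for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
                # Look only at the last WINDOW tokens before the target.
                tokens = lowered[:match.start()].split()[-WINDOW:]
                window_text = " ".join(tokens)
                negated = any(
                    re.search(r"\b" + re.escape(trigger) + r"\b", window_text)
                    for trigger in NEGATION_TRIGGERS
                )
                results.append((term, label, negated))
    return results

note = "Patient denies fever. Chest x-ray shows pneumonia."
concepts = find_concepts(note)
print(concepts)
```

A real pipeline would add tokenization, section detection, and terminology mapping on top of rules like these, which is the kind of integration the toolkit described above is meant to provide.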