ABSTRACT
Objective: Social Determinants of Health (SDOH) greatly influence health outcomes. SDOH surveys, such as the Assessing Circumstances & Offering Resources for Needs (ACORN) survey, have been developed to screen for SDOH in Veterans. The purpose of this study is to determine the terminological representation of the ACORN survey, to aid in natural language processing (NLP). Methods: Each ACORN survey question was read to determine its concepts. Next, Solor was searched for each of the concepts and for the appropriate attributes. If no attributes or concepts existed, they were proposed. Then, each question's concepts and attributes were arranged into subject-relation-object triples. Results: Eleven unique attributes and 18 unique concepts were proposed. These results demonstrate a gap in representing SDOH with terminologies. We believe that using these new concepts and relations will improve NLP, and thus, the care provided to Veterans.
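The subject-relation-object arrangement described in the Methods can be illustrated with a minimal sketch. The concept and attribute names below are hypothetical placeholders, not actual ACORN questions or Solor content, and `Triple` and `build_triples` are names introduced only for this illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-relation-object statement."""
    subject: str
    relation: str
    obj: str


def build_triples(subject, attributes):
    """Pair one subject concept with each (relation, object) attribute."""
    return [Triple(subject, relation, obj) for relation, obj in attributes]


# Hypothetical placeholders: NOT real ACORN items or Solor concepts.
triples = build_triples(
    "Patient",
    [
        ("has housing status", "Unstable housing"),
        ("has concern about", "Food access"),
    ],
)

for t in triples:
    print(f"({t.subject}) -[{t.relation}]-> ({t.obj})")
```

Keeping each statement as a flat triple makes it straightforward to list which relations and objects are missing from a terminology and would have to be proposed.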
ABSTRACT
BACKGROUND: There have been over 772 million confirmed cases of COVID-19 worldwide. A significant portion of these infections will lead to long COVID (post-COVID-19 condition) and its attendant morbidities and costs. Numerous life-altering complications have already been associated with the development of long COVID, including chronic fatigue, brain fog, and dangerous heart rhythms. OBJECTIVE: We aim to derive an actionable long COVID case definition consisting of significantly increased signs, symptoms, and diagnoses to support pandemic-related clinical, public health, research, and policy initiatives. METHODS: This research employs a case-crossover population-based study using International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) data generated at Veterans Affairs medical centers nationwide between January 1, 2020, and August 18, 2022. In total, 367,148 individuals with ICD-10-CM data both before and after a positive COVID-19 test were selected for analysis. We compared ICD-10-CM codes assigned 1 to 7 months following each patient's positive test with those assigned up to 6 months prior. Further, 350,315 patients had novel codes assigned during this window of time. We defined signs, symptoms, and diagnoses as being associated with long COVID if they had a novel case frequency of ≥1:1000, and they significantly increased in our entire cohort after a positive test. We present odds ratios with CIs for long COVID signs, symptoms, and diagnoses, organized by ICD-10-CM functional groups and medical specialty. We used our definition to assess long COVID risk based on a patient's demographics, Elixhauser score, vaccination status, and COVID-19 disease severity. RESULTS: We developed a long COVID definition consisting of 323 ICD-10-CM diagnosis codes grouped into 143 ICD-10-CM functional groups that were significantly increased in our 367,148 patient post-COVID-19 population. 
We defined 17 medical-specialty long COVID subtypes such as cardiology long COVID. Patients who were COVID-19-positive developed signs, symptoms, or diagnoses included in our long COVID definition at a proportion of at least 59.7% (268,320/449,450, based on a denominator of all patients who were COVID-19-positive). The long COVID cohort was 8 years older with more comorbidities (2-year Elixhauser score 7.97 in the patients with long COVID vs 4.21 in the patients with non-long COVID). Patients who had a more severe bout of COVID-19, as judged by their minimum oxygen saturation level, were also more likely to develop long COVID. CONCLUSIONS: An actionable, data-driven definition of long COVID can help clinicians screen for and diagnose long COVID, allowing identified patients to be admitted into appropriate monitoring and treatment programs. This long COVID definition can also support public health, research, and policy initiatives. Patients with COVID-19 who are older or have low oxygen saturation levels during their bout of COVID-19, or those who have multiple comorbidities should be preferentially watched for the development of long COVID.
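The frequency part of the definition (codes that are novel after a positive test and reach a case frequency of at least 1:1000) might be sketched as follows. This is only an illustration under stated assumptions: the denominator is taken to be all patients in the input, the significance test is not shown because the abstract does not specify it, and the function names and the toy cohort are hypothetical.

```python
from collections import Counter


def novel_code_frequencies(patients):
    """Return the fraction of patients in whom each code is novel.

    `patients` is a list of (codes_before, codes_after) pairs; a code is
    novel for a patient if it appears after the test but not before.
    """
    counts = Counter()
    total = 0
    for before, after in patients:
        total += 1
        counts.update(set(after) - set(before))
    return {code: n / total for code, n in counts.items()}


def passes_frequency_screen(freqs, threshold=1 / 1000):
    """Codes whose novel case frequency meets the threshold, sorted."""
    return sorted(code for code, f in freqs.items() if f >= threshold)


# Toy cohort (illustrative only): R53.83 fatigue, R06.02 shortness of
# breath, I10 hypertension.
cohort = [
    ({"I10"}, {"I10", "R53.83"}),
    (set(), {"R53.83", "R06.02"}),
    ({"R06.02"}, {"R06.02"}),
    ({"I10"}, {"R53.83"}),
]
freqs = novel_code_frequencies(cohort)
# A higher threshold is used here only because the toy cohort is tiny.
print(passes_frequency_screen(freqs, threshold=0.5))
```

In a real analysis, the codes passing this screen would still need to show a statistically significant increase before being included in the definition.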
Subjects
COVID-19; Cross-Over Studies; Post-Acute COVID-19 Syndrome; Humans; COVID-19/epidemiology; COVID-19/complications; Risk Factors; Male; Female; Middle Aged; United States/epidemiology; Aged; International Classification of Diseases; Adult
ABSTRACT
Order sets that adhere to disease-specific guidelines have been shown to increase clinician efficiency and patient safety, but curating these order sets, particularly for consistency across multiple sites, is difficult and time-consuming. We created software called CDS-Compare to alleviate the burden on expert reviewers in rapidly and effectively curating large databases of order sets. We applied our clustering-based software to a database of NLP-processed order sets extracted from VA's Electronic Health Record, then had subject-matter experts review the web application version of our software for clustering validity.
Subjects
Machine Learning; Software; Databases, Factual; Electronic Health Records; Humans
ABSTRACT
Our aim is to demonstrate a general-purpose data and knowledge validation approach that enables reproducible metrics for data and knowledge quality and safety. We researched widely accepted statistical process control methods from high-quality, high-safety industries and applied them to pharmacy prescription data being migrated between EHRs. Natural language medication instructions from prescriptions were independently categorized by two terminologists as a first step toward encoding those medication instructions using standardized terminology. Overall, the weighted average of medication instructions that were matched by reviewers was 43%, with strong agreement between reviewers for short instructions (K=0.82) and long instructions (K=0.85), and moderate agreement for medium instructions (K=0.61). Category definitions will be refined in future work to mitigate discrepancies. We recommend incorporating appropriate statistical tests, such as evaluating inter-rater and intra-rater reliability and bivariate comparison of reviewer agreement over an adequate statistical sample, when developing benchmarks for health data and knowledge quality and safety.
Subjects
Pharmacy; Trust; Humans; Reproducibility of Results; Benchmarking; Pharmaceutical Preparations
ABSTRACT
OBJECTIVE: One important concept in informatics is data that meets the principles of Findability, Accessibility, Interoperability and Reusability (FAIR). Standards, such as terminologies (findability), assist with important tasks like interoperability, Natural Language Processing (NLP) (accessibility) and decision support (reusability). One terminology, Solor, integrates SNOMED CT, LOINC and RxNorm. We describe Solor, HL7 Analysis Normal Form (ANF), and their use with the high definition natural language processing (HD-NLP) program. METHODS: We used HD-NLP to process 694 clinical narratives previously modeled by human experts into Solor and ANF. We compared HD-NLP output to the expert gold standard for 20% of the sample. Each clinical statement was judged "correct" if HD-NLP output matched ANF structure and Solor concepts, or "incorrect" if any ANF structure or Solor concepts were missing or incorrect. Judgements were summed to give totals for "correct" and "incorrect". RESULTS: Of the statements reviewed, 113 (80.7%) were correct and 26 (18.6%) were incorrect, with 1 error. Inter-rater reliability was 97.5% with Cohen's kappa of 0.948. CONCLUSION: The HD-NLP software provides usable complex standards-based representations for important clinical statements designed to drive CDS.
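For readers unfamiliar with the agreement statistic reported in the Results, the following is a minimal sketch of how Cohen's kappa is computed from two raters' judgments. The example ratings are invented for illustration and are not the study's data; `cohens_kappa` is a name introduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented judgments for eight statements (not the study's data).
rater_a = ["correct"] * 5 + ["incorrect"] * 3
rater_b = ["correct"] * 4 + ["incorrect"] * 4
kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Kappa discounts the agreement two raters would reach by chance, which is why it is lower than the raw percentage agreement (here 87.5% raw agreement against a kappa of 0.75).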
Subjects
Natural Language Processing; RxNorm; Humans; Reproducibility of Results; Systematized Nomenclature of Medicine; Vocabulary, Controlled
ABSTRACT
The authors sought to evaluate how well the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) controlled vocabulary represents terms commonly used clinically when documenting posttraumatic stress disorder (PTSD). A list was constructed based on the PTSD criteria in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV; American Psychiatric Association, 1994), symptom assessment instruments, and publications. Although two teams mapping the terms to SNOMED-CT differed in their approach, the consensus mapping accounted for 91% of the 153 PTSD terms. They found that the words used by clinicians in describing PTSD symptoms are represented in SNOMED-CT. These results can be used to codify mental health text reports for health information technology applications such as automated chart abstraction, algorithms for identifying documentation of symptoms representing PTSD in clinical notes, and clinical decision support.
Subjects
Stress Disorders, Post-Traumatic/physiopathology; Systematized Nomenclature of Medicine; Terminology as Topic; Humans; Stress Disorders, Post-Traumatic/diagnosis
ABSTRACT
PURPOSE: Computerized clinical decision support systems (CDSS) are an emerging means for improving healthcare safety, quality and efficiency, but findings from meta-analyses are mixed. This meta-synthesis aggregates qualitative research findings as possible explanations for variable quantitative research outcomes. INCLUSION CRITERIA: Qualitative studies published in English between 2000 and 2013 on the experience of physicians and of registered and advanced practice nurses with CDSS use in clinical practice were included. SEARCH STRATEGY: PubMed and CINAHL databases were searched. Study titles and abstracts were screened against inclusion criteria. Retained studies were appraised against quality criteria. Findings were extracted iteratively from studies in the 4th quartile of quality scores. Two reviewers constructed themes inductively. A third reviewer applied the defined themes deductively, achieving 92% agreement. RESULTS: 3798 unique records were returned; 56 met inclusion criteria and were reviewed against quality criteria. 9 studies were of sufficiently high quality for synthetic analysis. Five major themes (clinician-patient-system integration; user interface usability; the need for better 'algorithms'; system maturity; patient safety) were defined. CONCLUSIONS: Despite ongoing development, CDSS remains an emerging technology. Lack of understanding about and lack of consideration for the interaction between human decision makers and CDSS is a major reason for poor system adoption and use. Further high-quality qualitative research is needed to better understand human-system interaction issues. These issues may continue to confound quantitative study results if not addressed.
Subjects
Decision Support Systems, Clinical/organization & administration; Health Services Research/organization & administration; Meta-Analysis as Topic; Qualitative Research; Systems Integration; Workflow
ABSTRACT
BACKGROUND: Clinical practice and epidemiological information aggregation require knowing when, how long, and in what sequence medically relevant events occur. The Temporal Awareness and Reasoning Systems for Question Interpretation (TARSQI) Toolkit (TTK) is a complete, open source software package for the temporal ordering of events within narrative text documents. TTK was developed on newspaper articles. We extended TTK to support medical notes using Veterans Affairs (VA) clinical notes and compared the extended version with the original TTK. METHODS: We used a development set consisting of 200 VA clinical notes to modify and append rules to TTK's time tagger, creating Med-TTK. We then evaluated the performances of TTK and Med-TTK on an independent random selection of 100 clinical notes. Evaluation tasks were to identify and classify time-referring expressions as one of four temporal classes (DATE, TIME, DURATION, and SET). The reference standard for this test set was generated by dual human manual review with disagreements resolved by a third reviewer. Outcome measures included recall and precision for each class, and inter-rater agreement scores. RESULTS: There were 3146 temporal expressions in the reference standard. TTK identified 1595 temporal expressions. Recall was 0.15 (95% confidence interval [CI] 0.12-0.15) and precision was 0.27 (95% CI 0.25-0.29) for TTK. Med-TTK identified 3174 expressions. Recall was 0.86 (95% CI 0.84-0.87) and precision was 0.85 (95% CI 0.84-0.86) for Med-TTK. CONCLUSION: The algorithms for identifying and classifying temporal expressions in medical narratives developed within Med-TTK significantly improved performance compared to TTK. Natural language processing applications such as Med-TTK provide a foundation for meaningful longitudinal mapping of patient history events among electronic health records. The tool can be accessed at the following site: http://code.google.com/p/med-ttk/.
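The outcome measures above can be illustrated with a short sketch that computes precision and recall with 95% confidence intervals. The abstract does not state how its intervals were derived, so the normal-approximation (Wald) interval used here is an assumption, and the counts in the example are invented rather than taken from the study.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) using a Wald interval."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def precision_recall(tp, fp, fn):
    """Precision = tp/(tp+fp); recall = tp/(tp+fn); each with a CI."""
    return {
        "precision": proportion_with_wald_ci(tp, tp + fp),
        "recall": proportion_with_wald_ci(tp, tp + fn),
    }


# Invented counts for illustration only (not the Med-TTK results).
scores = precision_recall(tp=80, fp=20, fn=20)
for name, (estimate, low, high) in scores.items():
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Here `tp` counts system expressions that match the reference standard, `fp` counts system expressions with no match, and `fn` counts reference expressions the system missed.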
Subjects
Electronic Health Records/statistics & numerical data; Health Records, Personal; Narration; Natural Language Processing; Pattern Recognition, Automated/methods; Time Factors; Vocabulary, Controlled; Software; United States
ABSTRACT
BACKGROUND: A practical data point for assessing information quality and value in the Electronic Health Record (EHR) is the professional category of the EHR author. We evaluated and compared free-form electronic signatures against LOINC note titles in categorizing the profession of EHR authors. METHODS: A random sample of 1000 clinical documents was selected and divided into training and testing sets of 500 documents each. The gold standard for provider classification was generated by dual clinician manual review, with disagreements resolved by a third reviewer. Text matching algorithms that used document titles and author electronic signatures for provider classification were developed on the training set. RESULTS: Overall, detection of professional classification by note titles alone resulted in 76.1% sensitivity and 69.4% specificity. The aggregate of note titles with electronic signatures resulted in 95.7% sensitivity and 98.5% specificity. CONCLUSIONS: Note titles alone provided fair professional classification. Inclusion of author electronic signatures significantly boosted classification performance.
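A text-matching classifier of the kind described in the Methods might look like the sketch below. The abstract does not give the actual rules, so every pattern, the profession labels, and the choice to check the signature before the title are assumptions made purely for illustration.

```python
import re

# Hypothetical patterns: the study's real matching rules are not given.
SIGNATURE_PATTERNS = {
    "physician": re.compile(r"\b(MD|DO)\b"),
    "nurse": re.compile(r"\b(RN|LPN|NP)\b"),
    "pharmacist": re.compile(r"\b(PharmD|RPh)\b", re.IGNORECASE),
}
TITLE_PATTERNS = {
    "physician": re.compile(r"\bphysician\b", re.IGNORECASE),
    "nurse": re.compile(r"\bnurs(e|ing)\b", re.IGNORECASE),
    "pharmacist": re.compile(r"\bpharmac(y|ist)\b", re.IGNORECASE),
}


def classify_author(title, signature):
    """Guess the author's profession from signature, then note title."""
    # Assumption: credentials in the signature outrank the note title.
    for profession, pattern in SIGNATURE_PATTERNS.items():
        if pattern.search(signature):
            return profession
    for profession, pattern in TITLE_PATTERNS.items():
        if pattern.search(title):
            return profession
    return "unknown"


print(classify_author("Primary care note", "Jane Doe, RN"))
print(classify_author("Nursing admission note", "J. Smith"))
print(classify_author("Progress note", "J. Smith"))
```

The credential patterns are case-sensitive where a lowercase match would be ambiguous, so a surname such as "Doe" is not mistaken for the credential "DO".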
Subjects
Algorithms; Authorship; Electronic Health Records; Logical Observation Identifiers Names and Codes; Humans; Information Systems; United States; United States Department of Veterans Affairs/organization & administration; Veterans
ABSTRACT
BACKGROUND: Traumatic Brain Injury (TBI) is a "signature" injury of the current wars in Iraq and Afghanistan. Structured electronic data regarding TBI findings is important for research, population health and other secondary uses but requires appropriate underlying standard terminologies to ensure interoperability and reuse. Currently the U.S. Department of Veterans Affairs (VA) uses the terminology SNOMED CT and the Department of Defense (DOD) uses Medcin. METHODS: We developed a comprehensive case definition of mild TBI composed of 68 clinical terms. Using automated and manual techniques, we evaluated how well the mild TBI case definition terms could be represented by SNOMED CT and Medcin, and compared the results. We performed additional analysis stratified by whether the concepts were rated by a TBI expert panel as having High, Medium, or Low importance to the definition of mild TBI. RESULTS: SNOMED CT sensitivity (recall) was 90% overall for coverage of mild TBI concepts, and Medcin sensitivity was 49%, p < 0.001 (using McNemar's chi square). Positive predictive value (precision) for each was 100%. SNOMED CT outperformed Medcin for concept coverage independent of importance rating by our TBI experts. DISCUSSION: SNOMED CT was significantly better able to represent mild TBI concepts than Medcin. This finding may inform data gathering, management and sharing, and data exchange strategies between the VA and DOD for active duty soldiers and veterans with mild TBI. Since mild TBI is an important condition in the civilian population as well, the current study results may also be useful in the general medical setting.
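Because both terminologies were evaluated on the same terms, the comparison is paired, which is why McNemar's test applies. The sketch below computes the uncorrected McNemar chi-square statistic and its p-value from the two discordant counts. The counts in the example are invented, since the abstract does not report them, and whether the authors applied a continuity correction is not stated.

```python
import math


def mcnemar_chi_square(only_first, only_second):
    """McNemar chi-square (1 df, no continuity correction).

    only_first:  items covered by the first terminology but not the second
    only_second: items covered by the second terminology but not the first
    """
    discordant = only_first + only_second
    if discordant == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence either way
    statistic = (only_first - only_second) ** 2 / discordant
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Invented discordant counts for illustration (not the study's data).
statistic, p_value = mcnemar_chi_square(only_first=30, only_second=2)
print(f"chi-square = {statistic:.2f}, p = {p_value:.2g}")
```

Only the discordant pairs enter the statistic; terms that both terminologies cover, or that neither covers, carry no information about which one performs better.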
Subjects
Brain Injuries/classification; Systematized Nomenclature of Medicine; Vocabulary, Controlled; Humans; Medical Records Systems, Computerized; United States; United States Department of Defense; United States Department of Veterans Affairs
ABSTRACT
BACKGROUND: Two candidate terminologies to support entry of general medical data are SNOMED CT and MEDCIN. We compare the ability of SNOMED CT and MEDCIN to represent concepts and interface terms from a VA general medical examination template. METHODS: We parsed the VA general medical evaluation template and mapped the resulting expressions into SNOMED CT and MEDCIN. Internists conducted double independent reviews on 864 expressions. Exact concept-level matches were used to evaluate reference coverage. Exact term-level matches were required for interface terms. RESULTS: Sensitivity of SNOMED CT as a reference terminology was 83% vs. 25% for MEDCIN (p<0.001). The sensitivity of SNOMED CT as an interface terminology was 53% vs. 7% for MEDCIN (p<0.001). DISCUSSION: SNOMED CT outperformed MEDCIN in content coverage both as a reference terminology and as an interface terminology. We did not evaluate other aspects of interface terminologies such as richness of clinical linkages.