ABSTRACT
INTRODUCTION: Adrenal gland incidentalomas (AGIs) are found in up to 5% of cross-sectional images. However, rates of guideline-based workup for AGIs are notoriously low. We sought to determine if a natural language processing (NLP)-informed AGI clinic could improve the rates of indicated biochemical evaluation and adrenal-specific imaging. METHODS: An NLP algorithm was created to detect clinically significant adrenal nodules from radiology reports of cross-sectional images at an academic institution. The NLP algorithm was applied to scans occurring between June 2020 and July 2021 to form a baseline cohort. The NLP algorithm was re-applied to scans from August 2021 to February 2023, and identified patients were invited to join an outpatient clinic dedicated to AGIs. Patients evaluated in the clinic from March 2022 to February 2023 were included in the intervention cohort. Statistical analysis utilized chi-square, t-test, and a multivariable logistic regression. RESULTS: The baseline and intervention cohorts included 1784 and 322 unique patients, respectively. Patients in the intervention cohort were more likely to be female (59% vs. 51%, p = 0.01), be younger (60 ± 13.1 vs. 64 ± 13.2 years, p < 0.001), have smaller nodules (1.7 cm, IQR 1.4-2.1 vs. 1.8 cm, IQR 1.4-2.5 cm, p = 0.017), have had biochemical workup (99% vs. 13%, p < 0.001), and have had adrenal-specific imaging (40% vs. 11%, p < 0.001). In a multivariable analysis, intervention cohort patients were significantly more likely to have had biochemical workup (odds ratio [OR] 1209, confidence interval [CI] 434-5117, p < 0.001) and adrenal-specific imaging (OR 8.89, CI 6.42-12.4, p < 0.001). CONCLUSION: The implementation of an NLP-informed AGI clinic was associated with a seven-fold increase in biochemical workup and a three-fold increase in adrenal-specific imaging in participating patients.
ABSTRACT
BACKGROUND: Mycoplasma pneumoniae is a common pathogen that causes upper and lower respiratory tract infections in people of all ages, responsible for up to 40% of community-acquired pneumonias. It also causes a wide array of extrapulmonary infections and autoimmune phenomena. Phylogenetic studies of the organism have been generally restricted to specific genes or regions of the genome, because whole genome sequencing has been completed for only 4 strains. To better understand the physiology and pathogenicity of this important human pathogen, we performed comparative genomic analysis of 15 strains of M. pneumoniae that were isolated between the 1940s and 2009 from respiratory specimens and cerebrospinal fluid originating from the USA, China and England. RESULTS: Illumina MiSeq whole genome sequencing was performed on the 15 strains and all genome sequences were completed. Results from the comparative genomic analysis indicate that although about 1500 SNP and indel variants exist between type 1 and type 2 strains, there is an overall high degree of sequence similarity among the strains (>99% identical to each other). Within the two subtypes, conservation of most genes, including the CARDS toxin gene and arginine deiminase genes, was observed. The major variation occurs in the P1 and ORF6 genes associated with the adhesin complex. Multiple hsdS genes (encoding the S subunit of the type I restriction enzyme) with variable tandem repeat copy numbers were found in all 15 genomes. CONCLUSIONS: These data indicate that despite conclusions drawn from 16S rRNA sequences suggesting rapid evolution, the M. pneumoniae genome is extraordinarily stable over time and geographic distance across the globe with a striking lack of evidence of horizontal gene transfer.
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Mycoplasma pneumoniae/classification , Mycoplasma pneumoniae/isolation & purification , DNA Sequence Analysis/methods , China , Comparative Genomic Hybridization , England , Molecular Evolution , Genetic Variation , Bacterial Genome , Humans , Mycoplasma pneumoniae/genetics , Phylogeny , Nucleic Acid Sequence Homology , United States
ABSTRACT
Natural language processing can be used to identify opioid use disorder in patients from clinical text. We annotate a corpus of clinical text for mentions of concepts associated with unhealthy use of opiates, including concept modifiers such as negation, subject, uncertainty, relation to document time, and illicit use.
Subject(s)
Natural Language Processing , Opioid-Related Disorders , Humans , Opioid-Related Disorders/epidemiology , Uncertainty
ABSTRACT
BACKGROUND: Delirium is a common complication during acute care hospitalizations in older adults. A substantial percentage of admissions are for ambulatory care-sensitive conditions (ACSCs) or potentially avoidable hospitalizations, that is, conditions that might be treated early in the outpatient setting to prevent hospitalization and hospital complications. METHODS: This retrospective cross-sectional study examined rates of delirium among older adults hospitalized for ACSCs. Participants were 39 933 older adults ≥65 years of age admitted from January 1, 2015 to December 31, 2019 to general inpatient units and ICUs of a large Southeastern academic medical center. Delirium was defined as a score ≥ 2 on the Nursing Delirium Screening Scale or positive on the Confusion Assessment Method for the Intensive Care Unit during admission, and ACSCs were identified from the primary admission diagnosis using standardized definitions. Generalized linear mixed models were used to examine the association between ACSCs and delirium, compared with admissions for non-ACSC diagnoses, adjusting for covariates and repeated observations for individuals with multiple admissions. RESULTS: Delirium occurred in 15.6% of admissions for older adults. Rates were lower for ACSC admissions versus admissions for other conditions (13.9% vs 15.8%, p < .001). Older age and higher comorbidity were significant predictors of the development of delirium. CONCLUSIONS: Rates of delirium among older adults hospitalized for ACSCs were lower than rates for non-ACSC hospitalization but still substantial. Optimizing the treatment of ACSCs in the outpatient setting is an important goal not only for reducing hospitalizations but also for reducing risks for hospital-associated complications such as delirium.
Subject(s)
Delirium , Hospitalization , Humans , Aged , Retrospective Studies , Cross-Sectional Studies , Delirium/diagnosis , Delirium/epidemiology , Delirium/etiology , Ambulatory Care
ABSTRACT
BACKGROUND: The semantics of entities extracted from a clinical text can be dramatically altered by modifiers, including entity negation, uncertainty, conditionality, severity, and subject. Existing models for determining modifiers of clinical entities involve regular expressions or feature weights that are trained independently for each modifier. METHODS: We develop and evaluate a multi-task transformer architecture design where modifiers are learned and predicted jointly using the publicly available SemEval 2015 Task 14 corpus and a new Opioid Use Disorder (OUD) data set that contains modifiers shared with SemEval as well as novel modifiers specific to OUD. We evaluate the effectiveness of our multi-task learning approach versus previously published systems and assess the feasibility of transfer learning for clinical entity modifiers when only a portion of clinical modifiers are shared. RESULTS: Our approach achieved state-of-the-art results on the ShARe corpus from SemEval 2015 Task 14, showing an increase of 1.1% on weighted accuracy, 1.7% on unweighted accuracy, and 10% on micro F1 scores. CONCLUSIONS: We show that learned weights from our shared model can be effectively transferred to a new partially matched data set, validating the use of transfer learning for clinical text modifiers.
Subject(s)
Opioid-Related Disorders , Humans , Machine Learning , Semantics , Natural Language Processing
ABSTRACT
Several mycoplasma species have been shown to form biofilms that confer resistance to antimicrobials and which may affect the host immune system, thus making treatment and eradication of the pathogens difficult. The present study shows that the biofilms formed by two strains of the human pathogen Mycoplasma pneumoniae differ quantitatively and qualitatively. Compared with strain UAB PO1, strain M129 grows well but forms biofilms that are less robust, with towers that are less smooth at the margins. A polysaccharide containing N-acetylglucosamine is secreted by M129 into the culture medium but found in tight association with the cells of UAB PO1. The polysaccharide may have a role in biofilm formation, contributing to differences in virulence, chronicity and treatment outcome between strains of M. pneumoniae. The UAB PO1 genome was found to be that of a type 2 strain of M. pneumoniae, whereas M129 is type 1. Examination of other M. pneumoniae isolates suggests that the robustness of the biofilm correlates with the strain type.
Subject(s)
Biofilms/classification , Mycoplasma pneumoniae/growth & development , Acetylglucosamine/metabolism , Bacterial Adhesion , Biofilms/growth & development , Microbial Colony Count , Conditioned Culture Media/chemistry , Humans , Mycoplasma pneumoniae/classification , Mycoplasma pneumoniae/genetics , Mycoplasma pneumoniae/pathogenicity , Species Specificity
ABSTRACT
Widespread adoption of electronic health records (EHR) in the U.S. has been followed by unintended consequences, overexposing clinicians to widely reported EHR limitations. In an attempt to fix the EHR, we propose the use of a clinical context ontology (CCO), applied to turn implicit contextual statements into formally represented data in the form of concept-relationship-concept tuples. These tuples form what we call a patient-specific knowledge base (PSKB), a collection of formally defined tuples containing facts about the patient's care context. We report the process to create a CCO, which guides annotation of structured and narrative patient data to produce a PSKB. We also present an application of our PSKB using real patient data displayed on a semantically oriented patient summary to improve EHR navigation. Our approach can potentially save precious time spent by clinicians using today's EHRs, by showing, with minimum effort, a chronological view of the patient's record along with the contextual statements needed for care decisions. To guide future research, we propose several other applications of a PSKB to improve multiple EHR functions.
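The concept-relationship-concept tuples described above can be pictured with a minimal Python sketch. The abstract does not specify a data model, so the `Fact` type, the `facts_about` helper, the relationship names, and the sample facts are all hypothetical and serve only as illustration.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One concept-relationship-concept tuple (field names are hypothetical)."""
    subject: str
    relationship: str
    obj: str


def facts_about(pskb: list[Fact], concept: str) -> list[Fact]:
    """Return every tuple in which the concept appears on either side."""
    return [fact for fact in pskb if concept in (fact.subject, fact.obj)]


# Invented example facts; they do not come from the study's patient data.
pskb = [
    Fact("metformin", "treats", "type 2 diabetes"),
    Fact("type 2 diabetes", "diagnosed_on", "2015-03-02"),
    Fact("lisinopril", "treats", "hypertension"),
]

for fact in facts_about(pskb, "type 2 diabetes"):
    print(fact.subject, fact.relationship, fact.obj)
```

Storing facts as flat tuples keeps lookups by concept simple; a real PSKB would presumably bind each element to an ontology identifier rather than a free-text string.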
Subject(s)
Electronic Health Records , Narration , Humans , Knowledge Bases
ABSTRACT
BACKGROUND: Hospital-associated disability (HAD) is a common complication during the course of acute care hospitalizations in older adults. Many admissions are for ambulatory care sensitive conditions (ACSCs), considered potentially avoidable hospitalizations, that is, conditions that might be treated in outpatient settings to prevent hospitalization and HAD. We compared the incidence of HAD between older adults hospitalized for ACSCs versus those hospitalized for other diagnoses. METHODS: We conducted a retrospective cohort study in inpatient (non-ICU) medical and surgical units of a large southeastern regional academic medical center. Participants were 38,960 older adults ≥ 65 years of age admitted from January 1, 2015, to December 31, 2019. The primary outcome was HAD, defined as decline on the Katz Activities of Daily Living (ADL) scale from hospital admission to discharge. We used generalized linear mixed models to examine differences in HAD between hospitalizations with a primary diagnosis for an ACSC using standard definitions versus primary diagnosis for other conditions, adjusting for covariates and repeated observations for individuals with multiple hospitalizations. RESULTS: We found that 10% of older adults were admitted for an ACSC, with rates of HAD in those admitted for ACSCs lower than those admitted for other conditions (16% vs. 20.7%, p < 0.001). Age, comorbidity, admission functional status, and admission cognitive impairment were significant predictors for development of HAD. ACSC admissions to medical and medical/surgical services had lower odds of HAD compared with admissions for other conditions, with no significant differences between ACSC and non-ACSC admissions to surgical services. CONCLUSIONS: Rates of HAD among older adults hospitalized for ACSCs are substantial, though lower than rates of HAD with hospitalization for other conditions, reflecting that acute care hospitalization is not a benign event in this population. Treatment of ACSCs in the outpatient setting could be an important component of efforts to reduce HAD.
Subject(s)
Activities of Daily Living , Hospitalization , Humans , Aged , Retrospective Studies , Patient Discharge , Hospitals , Ambulatory Care
ABSTRACT
OBJECTIVE: Patients with acute gout are frequently treated in the emergency department (ED) and represent a typically underresourced and understudied population. A key limitation for gout research in the ED is the timely ability to identify acute gout patients. Our goal was to refine a multicriteria, electronic medical record alert for gout flares and to determine its diagnostic characteristics in the ED. METHODS: The gout flare alert used electronic medical record data from ED nursing notes and was triggered by the term 'gout' preceding past medical history in the chief complaint, the term 'gout' and a musculoskeletal problem in the chief complaint, or the term 'gout' in the problem list and a musculoskeletal chief complaint. We validated its diagnostic properties to assess presence/absence of gout through manual medical record review using adjudicated expert consensus as the gold standard. RESULTS: In January 2020, we analyzed 202 patient records from 2 university-based EDs; from these records, 57 patients were identified by our gout flare alert, and 145 were identified by other means as potentially having an acute gout flare. The gout flare alert's positive predictive value was 47% (95% confidence interval [95% CI] 34-60%), negative predictive value was 94% (95% CI 90-98%), sensitivity was 75% (95% CI 61-89%), and specificity was 82% (95% CI 76-88%). The diagnostic properties were similar at both institutions. CONCLUSION: Our multicomponent gout flare alert had reasonable sensitivity and specificity, albeit a modest positive predictive value. An electronic gout flare alert may help enable the conduct of gout research in the ED setting.
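The three trigger criteria described in the METHODS can be sketched as a simple rule check. This is an illustrative reading of the abstract, not the study's implementation: the musculoskeletal keyword list, the past-medical-history marker pattern, and the function names are all assumptions, and criterion 1 is interpreted here as 'gout' appearing before a past-medical-history marker in the chief complaint.

```python
import re

# Hypothetical keyword list; the study's actual musculoskeletal terms are not given.
MSK_TERMS = ("pain", "swelling", "swollen", "joint", "knee", "ankle", "foot", "toe")
# Hypothetical marker for the past-medical-history part of a triage note.
PMH_MARKER = re.compile(r"\b(pmh|past medical history)\b", re.IGNORECASE)
GOUT = re.compile(r"\bgout\b", re.IGNORECASE)


def has_msk_problem(text: str) -> bool:
    """Crude substring check for a musculoskeletal complaint."""
    lowered = text.lower()
    return any(term in lowered for term in MSK_TERMS)


def gout_flare_alert(chief_complaint: str, problem_list: list[str]) -> bool:
    """Return True if any of the three criteria from the abstract is met."""
    gout_match = GOUT.search(chief_complaint)
    pmh_match = PMH_MARKER.search(chief_complaint)

    # Criterion 1: 'gout' precedes past medical history in the chief complaint.
    if gout_match and pmh_match and gout_match.start() < pmh_match.start():
        return True
    # Criterion 2: 'gout' and a musculoskeletal problem in the chief complaint.
    if gout_match and has_msk_problem(chief_complaint):
        return True
    # Criterion 3: 'gout' in the problem list and a musculoskeletal chief complaint.
    gout_in_problems = any(GOUT.search(item) for item in problem_list)
    return gout_in_problems and has_msk_problem(chief_complaint)
```

For example, `gout_flare_alert("right knee swelling", ["Gout", "HTN"])` fires through criterion 3, while `gout_flare_alert("cough, pmh gout", [])` does not fire because 'gout' follows the history marker and no musculoskeletal term is present.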
Subject(s)
Gout , Humans , Gout/diagnosis , Gout/epidemiology , Electronic Health Records , Symptom Flare Up , Sensitivity and Specificity , Hospital Emergency Service
ABSTRACT
OBJECTIVE: To examine whether delirium predicts occurrence of hospital-associated disability (HAD), or functional decline after admission, among hospitalized older adults. DESIGN: Retrospective cross-sectional study. SETTING AND PARTICIPANTS: General inpatient (non-ICU) units of a large regional Southeastern US academic medical center, involving 33,111 older adults ≥65 years of age admitted from January 1, 2015, to December 31, 2019. METHODS: Delirium was defined as a score ≥2 on the Nursing Delirium Screening Scale (NuDESC) during hospital admission. HAD was defined as a decline on the Katz Activities of Daily Living (ADL) scale from hospital admission to discharge. Generalized linear mixed models were used to examine the association between delirium and HAD, adjusting for covariates and repeated observations with multiple admissions. We performed multivariate and mediation analyses to examine strength and direction of association between delirium and HAD. RESULTS: One-fifth (21.6%) of older adults developed HAD during hospitalization and experienced higher delirium rates compared to those not developing HAD (24.3% vs 14.3%, P < .001). Age, presence of delirium, Elixhauser Comorbidity Score, admission cognitive status, admission ADL function, and length of stay were associated (all P < .001) with incident HAD. Mediation analyses found that 46.7% of the effect of dementia and 16.7% of the effect of comorbidity were due to delirium (P < .001). CONCLUSIONS AND IMPLICATIONS: Delirium significantly increased the likelihood of HAD within a multivariate predictor model that included comorbidity, demographics, and length of stay. For dementia and comorbidity, mediation analysis showed a significant portion of their effect attributable to delirium. Overall, these findings suggest that reducing delirium rates may diminish HAD rates.
Subject(s)
Delirium , Dementia , Humans , Aged , Delirium/diagnosis , Activities of Daily Living , Retrospective Studies , Incidence , Cross-Sectional Studies , Risk Factors , Prospective Studies , Hospitalization , Hospitals , Dementia/diagnosis
ABSTRACT
Despite recent methodological advancements in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. Concurrently, these factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
Subject(s)
COVID-19 , Natural Language Processing , Humans , Electronic Health Records , Algorithms
ABSTRACT
Allergy mention normalization is challenging because of the wide range of possible allergens including medications, foods, plants, animals, and consumer products. This paper describes the process of mapping free-text allergy information from an electronic health record (EHR) system in a university hospital to standard terminologies and migration of those data into an enterprise EHR system. The review, mapping, and migration revealed interesting issues and challenges with the free-text allergy information and the mapping in preparation for implementation in the new EHR system. These findings provide insights that can form the basis of guidelines for future mapping and migration efforts involving free-text allergy data. As part of this process, we generate and make freely available AllergyMap, a mapping between free-text entered allergy medication to standard non-proprietary ontologies. To our knowledge, this is the first such mapping available and could serve as a public resource for allergy mention normalization and system evaluation.
Subject(s)
Allergens , Data Mining , Electronic Health Records/standards , Hypersensitivity , Natural Language Processing , Humans , Computerized Medical Records Systems , Hospital Medication Systems , RxNorm , Systems Integration
ABSTRACT
Many patients with gout flares treated in the Emergency Department (ED) do not receive optimal continuity of care after an ED visit. Thus, methods are needed to identify patients with gout flares in the ED and refer them to appropriate outpatient gout care. While Natural Language Processing (NLP) has been used to detect gout flares retrospectively, it is much more challenging to identify patients prospectively during an ED visit, where documentation is usually minimal. We annotate a corpus of ED triage nurse chief complaint notes for the presence of gout flares and implement a simple algorithm for gout flare ED alerts. We show that the chief complaint alone has strong predictive power for gout flares. We make available a de-identified version of this corpus annotated for gout mentions, which is to our knowledge the first free-text chief complaint clinical corpus available.
Subject(s)
Hospital Emergency Service , Gout/diagnosis , Natural Language Processing , Symptom Flare Up , Algorithms , Humans , Retrospective Studies , Text Messaging , Triage
ABSTRACT
BACKGROUND: The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. RESULTS: We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. CONCLUSION: The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease-containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating the human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of the human genome.
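The recall and precision measurements used for validation can be read as a set comparison between predicted and reference gene-disease pairs. The sketch below uses the standard definitions; the function name and the placeholder pairs are illustrative assumptions and are not taken from the study's data.

```python
def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Compute precision and recall of predicted pairs against a gold standard."""
    true_positives = len(predicted & gold)
    # Precision: share of predicted pairs that are in the gold standard.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard pairs that were predicted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Placeholder gene-disease pairs, invented for illustration only.
predicted_pairs = {("GENE_A", "disease 1"), ("GENE_B", "disease 2"), ("GENE_C", "disease 9")}
gold_pairs = {
    ("GENE_A", "disease 1"),
    ("GENE_B", "disease 2"),
    ("GENE_D", "disease 3"),
    ("GENE_E", "disease 4"),
}

precision, recall = precision_recall(predicted_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these definitions, a source can reach high precision with low recall, as reported for OMIM, when nearly all of its annotations are correct but it covers only a small part of the reference set.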
Subject(s)
Genetic Databases , Human Genome , Software , Unified Medical Language System , Computational Biology/methods , Humans
ABSTRACT
BACKGROUND: Traditionally text mention normalization corpora have normalized concepts to single ontology identifiers ("pre-coordinated concepts"). Less frequently, normalization corpora have used concepts with multiple identifiers ("post-coordinated concepts") but the additional identifiers have been restricted to a defined set of relationships to the core concept. This approach limits the ability of the normalization process to express semantic meaning. We generated a freely available corpus using post-coordinated concepts without a defined set of relationships that we term "compositional concepts" to evaluate their use in clinical text. METHODS: We annotated 5397 disorder mentions from the ShARe corpus to SNOMED CT that were previously normalized as "CUI-less" in the "SemEval-2015 Task 14" shared task because they lacked a pre-coordinated mapping. Unlike the previous normalization method, we do not restrict concept mappings to a particular set of the Unified Medical Language System (UMLS) semantic types and allow normalization to occur to multiple UMLS Concept Unique Identifiers (CUIs). We computed annotator agreement and assessed semantic coverage with this method. RESULTS: We generated the largest clinical text normalization corpus to date with mappings to multiple identifiers and made it freely available. All but 8 of the 5397 disorder mentions were normalized using this methodology. Annotator agreement ranged from 52.4% using the strictest metric (exact matching) to 78.2% using a hierarchical agreement that measures the overlap of shared ancestral nodes. CONCLUSION: Our results provide evidence that compositional concepts can increase semantic coverage in clinical text. To our knowledge we provide the first freely available corpus of compositional concept annotation in clinical text.
Subject(s)
Natural Language Processing , Systematized Nomenclature of Medicine , Software
ABSTRACT
Precision medicine requires that groups of patients matching clinical or genetic characteristics be identified in a clinical care setting and treated with the appropriate intervention. In the clinical setting, this process is often facilitated by a patient registry. While the software architecture of federated patient registries for research has been well characterized, local registries focused on clinical quality and care have received less attention. Many clinical registries appear to be one-off projects that lack generalizability and the ability to scale to multiple diseases. We evaluate the applicability of existing registry guidelines for registries designed for clinical intervention, propose a software architecture more practical for single-institution clinical registries and report the implementation of a generalizable clinical patient registry architecture at the University of Alabama at Birmingham (UAB).
Subject(s)
Phenotype , Registries , Software , Alabama , Computer Security , Data Anonymization , Guidelines as Topic , Health Facilities , Humans , Precision Medicine , Registries/standards , User-Computer Interface
ABSTRACT
Detailed instructions are provided for mapping unstructured, free-text data into common biomedical concepts (drugs, diseases, anatomy, and so on) found in the Unified Medical Language System using MetaMap Transfer (MMTx). MMTx can be used in applications including mining and inferring relationships between concepts in MEDLINE publications by transforming free text into computable concepts. MMTx is in general not designed to be an end-user program; therefore, a simple analysis is described using MMTx for users without any programming knowledge. In addition, two Java template files are provided for automated processing of the output from MMTx, and users can adapt these with minimal Java programming experience.
Subject(s)
Computational Biology/statistics & numerical data , Natural Language Processing , Unified Medical Language System , Statistical Data Interpretation , Information Storage and Retrieval/statistics & numerical data , MEDLINE , Software , Software Design
ABSTRACT
Methods are described to take a list of genes generated from a microarray experiment and interpret these results using various tools and ontologies. A workflow is described that details how to convert gene identifiers with SOURCE and MatchMiner and then use these converted gene lists to search the gene ontology (GO) and the medical subject headings (MeSH) ontology. Examples of searching GO with DAVID, EASE, and GOMiner are provided along with an interpretation of results. The mining of MeSH using high-density array pattern interpreter with a set of gene identifiers is also described.
Subject(s)
Genes , Medical Subject Headings , Molecular Biology/methods , Oligonucleotide Array Sequence Analysis/methods , Cluster Analysis , Database Management Systems , Genetic Databases , Gene Expression Profiling , Humans , Internet
ABSTRACT
OBJECTIVE: To help cancer registrars efficiently and accurately identify reportable cancer cases. MATERIAL AND METHODS: The Cancer Registry Control Panel (CRCP) was developed to detect mentions of reportable cancer cases using a pipeline built on the Unstructured Information Management Architecture - Asynchronous Scaleout (UIMA-AS) architecture containing the National Library of Medicine's UIMA MetaMap annotator as well as a variety of rule-based UIMA annotators that primarily act to filter out concepts referring to nonreportable cancers. CRCP inspects pathology reports nightly to identify pathology records containing relevant cancer concepts and combines this with diagnosis codes from the Clinical Electronic Data Warehouse to identify candidate cancer patients using supervised machine learning. Cancer mentions are highlighted in all candidate clinical notes and then sorted in CRCP's web interface for faster validation by cancer registrars. RESULTS: CRCP achieved an accuracy of 0.872 and detected reportable cancer cases with a precision of 0.843 and a recall of 0.848. CRCP increases throughput by 22.6% over a baseline (manual review) pathology report inspection system while achieving a higher precision and recall. Depending on registrar time constraints, CRCP can increase recall to 0.939 at the expense of precision by incorporating a data source information feature. CONCLUSION: CRCP demonstrates accurate results when applying natural language processing features to the problem of detecting patients with cases of reportable cancer from clinical notes. We show that implementing only a portion of cancer reporting rules in the form of regular expressions is sufficient to increase the precision, recall, and speed of the detection of reportable cancer cases when combined with off-the-shelf information extraction software and machine learning.