1.
Epilepsia ; 64(7): 1862-1872, 2023 07.
Article in English | MEDLINE | ID: mdl-37150944

ABSTRACT

OBJECTIVE: Epilepsy is largely a treatable condition with antiseizure medication (ASM). Recent national administrative claims data suggest one third of newly diagnosed adult epilepsy patients remain untreated 3 years after diagnosis. We aimed to quantify and characterize this treatment gap within a large US academic health system leveraging the electronic health record for enriched clinical detail. METHODS: This retrospective cohort study evaluated the proportion of adult patients in the health system from 2012 to 2020 who remained untreated 3 years after initial epilepsy diagnosis. To identify incident epilepsy, we applied validated administrative health data criteria of two encounters for epilepsy/seizures and/or convulsions, and we required no ASM prescription preceding the first encounter. Engagement with the health system at least 2 years before and at least 3 years after diagnosis was required. Among subjects who met administrative data diagnosis criteria, we manually reviewed medical records for a subset of 240 subjects to verify epilepsy diagnosis, confirm treatment status, and elucidate reason for nontreatment. These results were applied to estimate the proportion of the full cohort with untreated epilepsy. RESULTS: Of 831 patients who were automatically classified as having incident epilepsy by inclusion criteria, 80 (10%) remained untreated 3 years after incident epilepsy diagnosis. Manual chart review of incident epilepsy classification revealed only 33% (78/240) had true incident epilepsy. We found untreated patients were more frequently misclassified (p < .001). Using corrected counts, we extrapolated to the full cohort (831) and estimated <1%-3% had true untreated epilepsy. SIGNIFICANCE: We found a substantially lower proportion of patients with newly diagnosed epilepsy remained untreated compared to previous estimates from administrative data analysis. 
Manual chart review revealed patients were frequently misclassified as having incident epilepsy, particularly patients who were not treated with an ASM. Administrative data analyses utilizing only diagnosis codes may misclassify patients as having incident epilepsy.


Subject(s)
Anticonvulsants , Epilepsy , Humans , Adult , United States/epidemiology , Retrospective Studies , Anticonvulsants/therapeutic use , Epilepsy/diagnosis , Epilepsy/drug therapy , Epilepsy/epidemiology , Seizures/drug therapy , Electronic Health Records
2.
J Biomed Inform ; 139: 104269, 2023 03.
Article in English | MEDLINE | ID: mdl-36621750

ABSTRACT

Electronic health records (EHR) are collected as a routine part of healthcare delivery, and have great potential to be utilized to improve patient health outcomes. They contain multiple years of health information to be leveraged for risk prediction, disease detection, and treatment evaluation. However, they do not have a consistent, standardized format across institutions, particularly in the United States, and can present significant analytical challenges: they contain multi-scale data from heterogeneous domains and include both structured and unstructured data. Data for individual patients are collected at irregular time intervals and with varying frequencies. In addition to the analytical challenges, EHR can reflect inequity: patients belonging to different groups will have differing amounts of data in their health records. Many of these issues can contribute to biased data collection. The consequence is that the data for under-served groups may be less informative partly due to more fragmented care, which can be viewed as a type of missing data problem. For EHR data in this complex form, there is currently no framework for introducing realistic missing values. There has also been little to no work in assessing the impact of missing data in EHR. In this work, we first introduce a terminology to define three levels of EHR data and then propose a novel framework for simulating realistic missing data scenarios in EHR to adequately assess their impact on predictive modeling. We incorporate the use of a medical knowledge graph to capture dependencies between medical events to create a more realistic missing data framework. In an intensive care unit setting, we found that missing data have a greater negative impact on the performance of disease prediction models in groups that tend to have less access to healthcare, or seek less healthcare.
We also found that the impact of missing data on disease prediction models is stronger when using the knowledge graph framework to introduce realistic missing values as opposed to random event removal.


Subject(s)
Delivery of Health Care , Electronic Health Records , Humans , United States , Intensive Care Units
3.
J Biomed Inform ; 146: 104483, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37657712

ABSTRACT

OBJECTIVE: To evaluate the technical feasibility and potential value of a digital assistant that prompts intensive care unit (ICU) rounding teams to use evidence-based practices based on analysis of their real-time discussions. METHODS: We evaluated a novel voice-based digital assistant which audio records and processes the ICU care team's rounding discussions to determine which evidence-based practices are applicable to the patient but have yet to be addressed by the team. The system would then prompt the team to consider indicated but not yet delivered practices, thereby reducing cognitive burden compared to traditional rigid rounding checklists. In a retrospective analysis, we applied automatic transcription, natural language processing, and a rule-based expert system to generate personalized prompts for each patient in 106 audio-recorded ICU rounding discussions. To assess technical feasibility, we compared the system's prompts to those created by experienced critical care nurses who directly observed rounds. To assess potential value, we also compared the system's prompts to a hypothetical paper checklist containing all evidence-based practices. RESULTS: The positive predictive value, negative predictive value, true positive rate, and true negative rate of the system's prompts were 0.45 ± 0.06, 0.83 ± 0.04, 0.68 ± 0.07, and 0.66 ± 0.04, respectively. If implemented in lieu of a paper checklist, the system would generate 56% fewer prompts per patient, with 50%±17% greater precision. CONCLUSION: A voice-based digital assistant can reduce prompts per patient compared to traditional approaches for improving evidence uptake on ICU rounds. Additional work is needed to evaluate field performance and team acceptance.
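
The four measures reported in the abstract above (positive predictive value, negative predictive value, true positive rate, true negative rate) all derive from one 2x2 comparison of system prompts against a reference standard. The following is a minimal sketch of that arithmetic; the class name, function name, and counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing system prompts with a reference standard."""
    tp: int  # system prompted, reference agrees the practice was indicated
    fp: int  # system prompted, reference says no prompt was needed
    fn: int  # system stayed silent, reference says a prompt was needed
    tn: int  # system stayed silent, reference agrees


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a row or column of the table is empty.
    return numerator / denominator if denominator else float("nan")


def prompt_metrics(c: ConfusionCounts) -> dict:
    return {
        "ppv": _ratio(c.tp, c.tp + c.fp),  # precision of issued prompts
        "npv": _ratio(c.tn, c.tn + c.fn),
        "tpr": _ratio(c.tp, c.tp + c.fn),  # sensitivity / recall
        "tnr": _ratio(c.tn, c.tn + c.fp),  # specificity
    }


# Hypothetical counts for illustration only.
example = ConfusionCounts(tp=45, fp=55, fn=20, tn=100)
metrics = prompt_metrics(example)
```

With these illustrative counts, fewer than half of the issued prompts are correct even though most needed prompts are caught, which is the general pattern a low PPV alongside a higher TPR describes.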

4.
J Biomed Inform ; 139: 104306, 2023 03.
Article in English | MEDLINE | ID: mdl-36738870

ABSTRACT

BACKGROUND: In electronic health records, patterns of missing laboratory test results could capture patients' course of disease as well as reflect clinicians' concerns or worries for possible conditions. These patterns are often understudied and overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients. METHODS: We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites, missingness stratification by demographic variables, temporal trends of missingness, correlations between labs based on missingness indicators over time, and clustering of groups of labs based on their missingness/ordering pattern. RESULTS: With these analyses, we identified mapping issues faced in seven out of 15 sites. We also identified nuances in data collection and variable definition for the various sites. Temporal trend analyses may support the use of laboratory test result missingness patterns in identifying severe COVID-19 patients. Lastly, using missingness patterns, we determined relationships between various labs that reflect clinical behaviors. CONCLUSION: In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity of looking at COVID-19 over time and at multiple sites, where there might be different phases, policies, etc. Changes in missingness could suggest a change in a patient's condition, and patterns of missingness among laboratory measurements could potentially identify clinical outcomes. This allows sites to consider missing data as informative to analyses and helps researchers identify which sites are better poised to study particular questions.


Subject(s)
COVID-19 , Electronic Health Records , Humans , Data Collection , Records , Cluster Analysis
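
The abstract for this record mentions correlations between labs based on missingness indicators. As a rough illustration of that idea only, the sketch below turns a toy table of lab results into 0/1 missingness indicators and computes pairwise Pearson correlations between them. The lab names, records, and helper functions are hypothetical; the study's actual pipeline is not described in enough detail here to reproduce.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy records: None marks a lab that was not measured.
records = [
    {"crp": 12.0, "ferritin": 300.0, "d_dimer": None},
    {"crp": None, "ferritin": None, "d_dimer": 0.4},
    {"crp": 8.5, "ferritin": 410.0, "d_dimer": None},
    {"crp": None, "ferritin": None, "d_dimer": 1.1},
    {"crp": 20.1, "ferritin": None, "d_dimer": 0.9},
]
labs = ["crp", "ferritin", "d_dimer"]


def missing_indicator(rows, lab):
    """Return 1 where the lab is missing and 0 where it was measured."""
    return [1 if row.get(lab) is None else 0 for row in rows]


def pearson(x, y):
    """Pearson correlation; NaN if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


indicators = {lab: missing_indicator(records, lab) for lab in labs}
missing_rate = {lab: sum(flags) / len(flags) for lab, flags in indicators.items()}
pairwise = {
    (a, b): pearson(indicators[a], indicators[b])
    for a, b in combinations(labs, 2)
}
```

In this toy data, the two labs that tend to be ordered together show positively correlated missingness, while a lab ordered in their place shows a negative correlation.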
5.
Genet Med ; 24(3): 601-609, 2022 03.
Article in English | MEDLINE | ID: mdl-34906489

ABSTRACT

PURPOSE: Genome-wide association studies have identified hundreds of single nucleotide variations (formerly single nucleotide polymorphisms) associated with several cancers, but the predictive ability of polygenic risk scores (PRSs) is unclear, especially among non-Whites. METHODS: PRSs were derived from genome-wide significant single-nucleotide variations for 15 cancers in 20,079 individuals in an academic biobank. We evaluated the improvement in discriminatory accuracy by including cancer-specific PRS in patients of genetically-determined African and European ancestry. RESULTS: Among the individuals of European genetic ancestry, PRSs for breast, colon, melanoma, and prostate were significantly associated with their respective cancers. Among the individuals of African genetic ancestry, PRSs for breast, colon, prostate, and thyroid were significantly associated with their respective cancers. The area under the curve of the model consisting of age, sex, and principal components was 0.621 to 0.710, and it increased by 1% to 4% with the inclusion of PRS in individuals of European genetic ancestry. In individuals of African genetic ancestry, area under the curve was overall higher in the model without the PRS (0.723-0.810) but increased by <1% with the inclusion of PRS for most cancers. CONCLUSION: PRS moderately increased the ability to discriminate the cancer status in individuals of European but not African ancestry. Further large-scale studies are needed to identify ancestry-specific genetic factors in non-White populations to incorporate PRS into cancer risk assessment.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Neoplasms , Biological Specimen Banks , Black People/genetics , Female , Genetic Predisposition to Disease , Humans , Male , Neoplasms/ethnology , Neoplasms/genetics , Risk Factors , White People/genetics
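
The abstract for this record does not state how its polygenic risk scores were computed. A common construction, assumed here purely for illustration, is a weighted sum of risk-allele dosages, with each variant's weight taken from a published effect size. The variant identifiers, weights, and dosages below are invented.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of allele dosages over variants present in both mappings.

    effect_sizes: variant ID -> per-allele weight (for example, a log odds ratio)
    dosages:      variant ID -> number of risk alleles carried (0, 1, or 2)
    """
    shared = effect_sizes.keys() & dosages.keys()
    return sum(effect_sizes[v] * dosages[v] for v in shared)


# Hypothetical weights and genotype for illustration only.
weights = {"variant_a": 0.10, "variant_b": 0.25, "variant_c": -0.05}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(weights, person)
```

Variants missing from either mapping are skipped in this sketch; a real analysis would need an explicit policy for missing genotypes.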
6.
J Biomed Inform ; 134: 104176, 2022 10.
Article in English | MEDLINE | ID: mdl-36007785

ABSTRACT

OBJECTIVE: For multi-center heterogeneous Real-World Data (RWD) with time-to-event outcomes and high-dimensional features, we propose the SurvMaximin algorithm to estimate Cox model feature coefficients for a target population by borrowing summary information from a set of health care centers without sharing patient-level information. MATERIALS AND METHODS: For each of the centers from which we want to borrow information to improve the prediction performance for the target population, a penalized Cox model is fitted to estimate feature coefficients for the center. Using estimated feature coefficients and the covariance matrix of the target population, we then obtain a SurvMaximin estimated set of feature coefficients for the target population. The target population can be an entire cohort comprised of all centers, corresponding to federated learning, or a single center, corresponding to transfer learning. RESULTS: Simulation studies and a real-world international electronic health records application study, with 15 participating health care centers across three countries (France, Germany, and the U.S.), show that the proposed SurvMaximin algorithm achieves comparable or higher accuracy compared with the estimator using only the information of the target site and other existing methods. The SurvMaximin estimator is robust to variations in sample sizes and estimated feature coefficients between centers, which amounts to significantly improved estimates for target sites with fewer observations. CONCLUSIONS: The SurvMaximin method is well suited for both federated and transfer learning in the high-dimensional survival analysis setting. SurvMaximin only requires a one-time summary information exchange from participating centers. Estimated regression vectors can be very heterogeneous. SurvMaximin provides robust Cox feature coefficient estimates without outcome information in the target population and is privacy-preserving.


Subject(s)
Algorithms , Electronic Health Records , Humans , Privacy , Proportional Hazards Models , Survival Analysis
7.
J Med Internet Res ; 24(6): e36151, 2022 06 29.
Article in English | MEDLINE | ID: mdl-35767327

ABSTRACT

BACKGROUND: Free-text communication between patients and providers plays an increasing role in chronic disease management, through platforms varying from traditional health care portals to novel mobile messaging apps. These text data are rich resources for clinical purposes, but their sheer volume renders them difficult to manage. Even automated approaches, such as natural language processing, require labor-intensive manual classification for developing training data sets. Automated approaches to organizing free-text data are necessary to facilitate use of free-text communication for clinical care. OBJECTIVE: The aim of this study was to apply unsupervised learning approaches to (1) understand the types of topics discussed and (2) learn medication-related intents from messages sent between patients and providers through a bidirectional text messaging system for managing participant blood pressure (BP). METHODS: This study was a secondary analysis of deidentified messages from a remote, mobile, text-based employee hypertension management program at an academic institution. We trained a latent Dirichlet allocation (LDA) model for each message type (ie, inbound patient messages and outbound provider messages) and identified the distribution of major topics and significant topics (probability >.20) across message types. Next, we annotated all medication-related messages with a single medication intent. Then, we trained a second medication-specific LDA (medLDA) model to assess how well the unsupervised method could identify more fine-grained medication intents. We encoded each medication message with n-grams (n=1-3 words) using spaCy, clinical named entities using Stanza, and medication categories using MedEx; we then applied chi-square feature selection to learn the most informative features associated with each medication intent.
RESULTS: In total, 253 participants and 5 providers engaged in the program, generating 12,131 total messages: 46.90% (n=5689) patient messages and 53.10% (n=6442) provider messages. Most patient messages corresponded to BP reporting, BP encouragement, and appointment scheduling; most provider messages corresponded to BP reporting, medication adherence, and confirmatory statements. Most patient and provider messages contained 1 topic and few contained more than 3 topics identified using LDA. In total, 534 medication messages were annotated with a single medication intent. Of these, 282 (52.8%) were patient medication messages: most referred to the medication request intent (n=134, 47.5%). Most of the 252 (47.2%) provider medication messages referred to the medication question intent (n=173, 68.7%). Although the medLDA model could identify a majority intent within each topic, it could not distinguish medication intents with low prevalence within patient or provider messages. Richer feature engineering identified informative lexical-semantic patterns associated with each medication intent class. CONCLUSIONS: LDA can be an effective method for generating subgroups of messages with similar term usage and facilitating the review of topics to inform annotations. However, few training cases and shared vocabulary between intents preclude the use of LDA for fully automated, deep, medication intent classification. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.1101/2021.12.23.21268061.


Subject(s)
Hypertension , Text Messaging , Humans , Hypertension/drug therapy , Pilot Projects , Retrospective Studies , Unsupervised Machine Learning
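
The methods in this record encode each medication message with n-grams of 1 to 3 words. The sketch below shows what that encoding produces for a single message. It uses plain whitespace tokenization as a stand-in for the spaCy tokenizer named in the abstract, and the example message is invented.

```python
def word_ngrams(tokens, n_min=1, n_max=3):
    """Return all contiguous word n-grams with n_min <= n <= n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


# Hypothetical patient message; lowercased and split on whitespace.
message = "please refill my lisinopril"
features = word_ngrams(message.lower().split())
```

For a four-word message this yields four unigrams, three bigrams, and two trigrams. A feature-selection step such as the chi-square test mentioned in the abstract would then score each n-gram against the annotated intent labels.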
8.
J Med Internet Res ; 23(10): e31400, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34533459

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). 
Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.


Subject(s)
COVID-19 , Pandemics , Adult , Aged , Female , Hospitalization , Hospitals , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2
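
The methods in this record pool per-site effect sizes with random-effects meta-analyses but do not name the estimator. The sketch below assumes the DerSimonian-Laird method, a widely used choice, to show how site-level estimates and their variances might be combined. All inputs are invented.

```python
from math import sqrt


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled effect, standard error, between-site variance tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures how far site estimates scatter around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical site-level effect sizes and variances for illustration only.
homogeneous = random_effects_pool([-0.10, -0.05, -0.15], [0.01, 0.01, 0.01])
heterogeneous = random_effects_pool([0.0, 1.0], [0.01, 0.01])
```

When sites agree, the between-site variance estimate is truncated at zero and the result matches a fixed-effect pool. When sites disagree, the variance estimate grows and the pooled standard error widens accordingly.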
10.
J Pharm Technol ; 37(2): 89-94, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34752556

ABSTRACT

Background: Currently, there are no guidelines regarding the optimal daily timing of inpatient warfarin administration. Objective: The purpose of this study was to determine whether dosing warfarin in the morning will have a significant impact on therapeutic international normalized ratio (INR) achievement compared with evening administration in mechanical mitral valve patients initiated on warfarin following cardiac surgery. Methods: This was a single-center, pre- and post-retrospective cohort conducted between 2014 and 2018. One-hundred fifty-four adult patients who underwent a mechanical mitral valve replacement or alternative cardiac surgery with a history of a mechanical mitral valve were enrolled. The primary outcome was achievement of therapeutic INR at any time point after initiation of warfarin. Pre-intervention administration timing was 6 pm and post-intervention timing was 10 am. Results: Baseline characteristics including age, sex, and race were similar between the 2 groups (P = NS for each characteristic). Therapeutic INR achievement was significantly improved at all time points following 10 am warfarin administration compared with 6 pm (hazard ratio = 1.69; P = .005). Mean time-to-therapeutic INR was 7.37 days in the post-intervention group and 8.39 days in the pre-intervention group (P = .073). There were no significant differences in INR >4, bleeding, or thrombotic complications between groups. Conclusion and Relevance: This retrospective analysis suggests that there may be a postoperative benefit in therapeutic INR achievement in mechanical valve patients when dosing warfarin in the morning compared with evening administration. Large-scale studies should be conducted to further elucidate the potential benefit across more heterogeneous populations.

11.
J Biomed Inform ; 112S: 100086, 2020.
Article in English | MEDLINE | ID: mdl-34417005

ABSTRACT

Standardizing clinical information in a semantically rich data model is useful for promoting interoperability and facilitating high quality research. Semantic Web technologies such as Resource Description Framework can be utilized to their full potential when a model accurately reflects the semantics of the clinical situation it describes. To this end, ontologies that abide by sound organizational principles can be used as the building blocks of a semantically rich model for the storage of clinical data. However, it is a challenge to programmatically define such a model and load data from disparate sources. The PennTURBO Semantic Engine is a tool developed at the University of Pennsylvania that transforms concise RDF data into a source-independent, semantically rich model. This system sources classes from an application ontology and specifically defines how instances of those classes may relate to each other. Additionally, the system defines and executes RDF data transformations by launching dynamically generated SPARQL update statements. The Semantic Engine was designed as a generalizable data standardization tool, and is able to work with various data models and incoming data sources. Its human-readable configuration files can easily be shared between institutions, providing the basis for collaboration on a standard data model.

12.
J Med Internet Res ; 22(12): e22493, 2020 12 03.
Article in English | MEDLINE | ID: mdl-33270032

ABSTRACT

BACKGROUND: Automated texting platforms have emerged as a tool to facilitate communication between patients and health care providers with variable effects on achieving target blood pressure (BP). Understanding differences in the way patients interact with these communication platforms can inform their use and design for hypertension management. OBJECTIVE: Our primary aim was to explore the unique phenotypes of patient interactions with an automated text messaging platform for BP monitoring. Our secondary aim was to estimate associations between interaction phenotypes and BP control. METHODS: This study was a secondary analysis of data from a randomized controlled trial for adults with poorly controlled hypertension. A total of 201 patients with established primary care were assigned to the automated texting platform; messages exchanged throughout the 4-month program were analyzed. We used the k-means clustering algorithm to characterize two different interaction phenotypes: program conformity and engagement style. First, we identified unique clusters signifying differences in program conformity based on the frequency over time of error alerts, which were generated to patients when they deviated from the requested text message format (eg, ###/## for BP). Second, we explored overall engagement styles, defined by error alerts and responsiveness to text prompts, unprompted messages, and word count averages. Finally, we applied the chi-square test to identify associations between each interaction phenotype and achieving the target BP. RESULTS: We observed 3 categories of program conformity based on their frequency of error alerts: those who immediately and consistently submitted texts without system errors (perfect users, 51/201), those who did so after an initial learning period (adaptive users, 66/201), and those who consistently submitted messages generating errors to the platform (nonadaptive users, 38/201). 
Next, we observed 3 categories of engagement style: the enthusiast, who tended to submit unprompted messages with high word counts (17/155); the student, who inconsistently engaged (35/155); and the minimalist, who engaged only when prompted (103/155). Of all 6 phenotypes, we observed a statistically significant association between patients demonstrating the minimalist communication style (high adherence, few unprompted messages, limited information sharing) and achieving target BP (P<.001). CONCLUSIONS: We identified unique interaction phenotypes among patients engaging with an automated text message platform for remote BP monitoring. Only the minimalist communication style was associated with achieving target BP. Identifying and understanding interaction phenotypes may be useful for tailoring future automated texting interactions and designing future interventions to achieve better BP control.


Subject(s)
Blood Pressure/physiology , Hypertension/therapy , Monitoring, Physiologic/methods , Text Messaging/standards , Adolescent , Adult , Aged , Female , Humans , Male , Middle Aged , Phenotype , Young Adult
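
This record describes error alerts sent when a patient's text deviated from the requested format (eg, ###/## for BP). The sketch below shows one way such a format check could work. The regular expression, its tolerance for 2 or 3 digits on each side, and the function name are assumptions; the platform's actual validation rules are not given in the abstract.

```python
import re

# Assumed pattern: systolic/diastolic, each 2-3 digits, optional surrounding spaces.
BP_PATTERN = re.compile(r"^\s*(\d{2,3})\s*/\s*(\d{2,3})\s*$")


def parse_bp(text):
    """Return (systolic, diastolic) if the text matches the format, else None."""
    match = BP_PATTERN.match(text)
    if match is None:
        return None  # a platform like the one described would send an error alert here
    return int(match.group(1)), int(match.group(2))


conforming = parse_bp("128/82")
nonconforming = parse_bp("my bp is 128 over 82")
```

Counting how often `parse_bp` returns `None` per patient over time would give the kind of error-alert frequency series the study clustered to define program conformity.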
13.
BMC Med Inform Decis Mak ; 20(Suppl 11): 338, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33380319

ABSTRACT

BACKGROUND: Age and time information stored within the histories of clinical notes can provide valuable insights for assessing a patient's disease risk, understanding disease progression, and studying therapeutic outcomes. However, details of age and temporally-specified clinical events are not well captured, consistently codified, and readily available to research databases for study. METHODS: We expanded upon existing annotation schemes to capture additional age and temporal information, conducted an annotation study to validate our expanded schema, and developed a prototypical, rule-based Named Entity Recognizer to extract our novel clinical named entities (NE). The annotation study was conducted on 138 discharge summaries from the pre-annotated 2014 ShARe/CLEF eHealth Challenge corpus. In addition to existing NE classes (TIMEX3, SUBJECT_CLASS, DISEASE_DISORDER), our schema proposes 3 additional NEs (AGE, PROCEDURE, OTHER_EVENTS). We also propose new attributes, e.g., "degree_relation" which captures the degree of biological relation for subjects annotated under SUBJECT_CLASS. As a proof of concept, we applied the schema to 49 H&P notes to encode pertinent history information for a lung cancer cohort study. RESULTS: An abundance of information was captured under the new OTHER_EVENTS, PROCEDURE and AGE classes, with 23%, 10% and 8% of all annotated NEs belonging to the above classes, respectively. We observed high inter-annotator agreement of >80% for AGE and TIMEX3; the automated NLP system achieved F1 scores of 86% (AGE) and 86% (TIMEX3). Age and temporally-specified mentions within past medical, family, surgical, and social histories were common in our lung cancer data set; annotation is ongoing to support this translational research study. CONCLUSIONS: Our annotation schema and NLP system can encode historical events from clinical notes to support clinical and translational research studies.


Subject(s)
Natural Language Processing , Aged, 80 and over , Cohort Studies , Humans
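
This record describes a rule-based Named Entity Recognizer that extracts, among other classes, AGE mentions from clinical notes. The abstract does not publish the rules, so the sketch below is only a hypothetical example of what a single regular-expression rule for AGE might look like. The pattern and the sample sentence are invented.

```python
import re

# Hypothetical rule: "45-year-old", "45 yr old", "age 62", or "aged 62".
AGE_PATTERN = re.compile(
    r"\b(?:(?P<before>\d{1,3})[\s-]*(?:years?|yrs?)[\s-]*old"
    r"|aged?\s+(?P<after>\d{1,3}))\b",
    re.IGNORECASE,
)


def extract_ages(text):
    """Return the integer ages mentioned in the text, in order of appearance."""
    ages = []
    for match in AGE_PATTERN.finditer(text):
        ages.append(int(match.group("before") or match.group("after")))
    return ages


note = "Mother diagnosed with lung cancer at age 62; patient is a 45-year-old male."
ages = extract_ages(note)
```

A real recognizer would also need to record which subject each age belongs to (here, the mother versus the patient), which is the role the SUBJECT_CLASS annotations play in the schema described above.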
14.
J Med Internet Res ; 19(2): e48, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28246066

ABSTRACT

BACKGROUND: With a lifetime prevalence of 16.2%, major depressive disorder is the fifth biggest contributor to the disease burden in the United States. OBJECTIVE: The aim of this study, building on previous work qualitatively analyzing depression-related Twitter data, was to describe the development of a comprehensive annotation scheme (ie, coding scheme) for manually annotating Twitter data with Diagnostic and Statistical Manual of Mental Disorders, Edition 5 (DSM 5) major depressive symptoms (eg, depressed mood, weight change, psychomotor agitation, or retardation) and Diagnostic and Statistical Manual of Mental Disorders, Edition IV (DSM-IV) psychosocial stressors (eg, educational problems, problems with primary support group, housing problems). METHODS: Using this annotation scheme, we developed an annotated corpus, Depressive Symptom and Psychosocial Stressors Acquired Depression, the SAD corpus, consisting of 9300 tweets randomly sampled from the Twitter application programming interface (API) using depression-related keywords (eg, depressed, gloomy, grief). An analysis of our annotated corpus yielded several key results. RESULTS: First, 72.09% (6829/9473) of tweets containing relevant keywords were nonindicative of depressive symptoms (eg, "we're in for a new economic depression"). Second, the most prevalent symptoms in our dataset were depressed mood and fatigue or loss of energy. Third, less than 2% of tweets contained more than one depression related category (eg, diminished ability to think or concentrate, depressed mood). Finally, we found very high positive correlations between some depression-related symptoms in our annotated dataset (eg, fatigue or loss of energy and educational problems; educational problems and diminished ability to think). 
CONCLUSIONS: We successfully developed an annotation scheme and an annotated corpus, the SAD corpus, consisting of 9300 tweets randomly selected from the Twitter application programming interface using depression-related keywords. Our analyses suggest that keyword queries alone might not be suitable for public health monitoring because context can change the meaning of a keyword in a statement. However, postprocessing approaches could be useful for reducing the noise and improving the signal needed to detect depression symptoms using social media.


Subject(s)
Depression/diagnosis , Depressive Disorder, Major/diagnosis , Internet/statistics & numerical data , Social Media/statistics & numerical data , Stress, Psychological/diagnosis , Depression/epidemiology , Depressive Disorder, Major/epidemiology , Humans , Machine Learning , Psychology , Stress, Psychological/epidemiology
15.
Neuroepidemiology ; 47(3-4): 201-209, 2016.
Article in English | MEDLINE | ID: mdl-28135707

ABSTRACT

BACKGROUND: Direct oral anticoagulants (DOACs) have the potential to improve stroke prevention among atrial fibrillation (AF) patients. We sought to determine if oral anticoagulation (OAC) treatment rates have increased since the approval of DOACs. METHODS: We identified 6,688 patients with AF at an academic medical center from January 2008 to June 2015. We examined OAC prescription rates over time and according to CHA2DS2-VASc score using multivariable Poisson regression models, with an interaction term between risk score and year of AF diagnosis. RESULTS: Among 6,688 AF patients, 78% had CHA2DS2-VASc scores ≥2, 51.6% of whom received an OAC prescription within 90 days of diagnosis. The OAC prescription rate was 47.8% in the pre-DOAC era and peaked at 56.4% in 2014. Relative to the pre-DOAC era, prescription rates increased in 2012 and leveled off thereafter. The prescription rate for the highest risk group was 58.5%, compared with 45.0% in patients with a CHA2DS2-VASc score of 2 (p < 0.01). In the adjusted analysis, prescription rates were higher for the higher risk group (adjusted relative risk 1.24 for CHA2DS2-VASc score 7-9 vs. 2, 95% CI 1.09-1.40). CONCLUSIONS: OAC treatment rates have increased since DOAC introduction, but substantial treatment gaps remain, specifically among the higher risk patients.
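For orientation, the crude (unadjusted) rate ratio implied by the two reported prescription rates can be worked out in a few lines of Python. This is only arithmetic on the percentages in the abstract, assuming the "highest risk group" is the score 7-9 group; the crude ratio itself is not reported by the authors, and the adjusted estimate of 1.24 comes from their multivariable Poisson model, which this sketch does not reproduce.

```python
def crude_rate_ratio(rate_exposed_pct: float, rate_reference_pct: float) -> float:
    """Ratio of two prescription rates given as percentages."""
    return rate_exposed_pct / rate_reference_pct

# Rates from the abstract: highest risk group vs. patients with a score of 2.
highest_risk_rate = 58.5
score_two_rate = 45.0

ratio = crude_rate_ratio(highest_risk_rate, score_two_rate)
print(f"crude rate ratio: {ratio:.2f}")  # adjusted estimate in the abstract: 1.24
```

The crude ratio being somewhat larger than the adjusted one is consistent with part of the difference being explained by the covariates in the model.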


Subject(s)
Anticoagulants/administration & dosage , Atrial Fibrillation/complications , Drug Prescriptions/statistics & numerical data , Practice Patterns, Physicians'/statistics & numerical data , Stroke/prevention & control , Administration, Oral , Adolescent , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Risk Factors , Stroke/etiology , Young Adult
16.
J Biomed Inform ; 50: 162-72, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24859155

ABSTRACT

The Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor method requires removal of 18 types of protected health information (PHI) from clinical documents to be considered "de-identified" prior to use for research purposes. Human review of PHI elements from a large corpus of clinical documents can be tedious and error-prone. Indeed, multiple annotators may be required to consistently redact information that represents each PHI class. Automated de-identification has the potential to improve annotation quality and reduce annotation time. For instance, machine-assisted annotation can combine de-identification system outputs, used as pre-annotations, with an interactive annotation interface to provide annotators with PHI annotations for "curation" rather than manual annotation from "scratch" on raw clinical documents. In order to assess whether machine-assisted annotation improves the reliability and accuracy of the reference standard and reduces annotation effort, we conducted an annotation experiment. In this annotation study, we assessed the generalizability of the VA Consortium for Healthcare Informatics Research (CHIR) annotation schema and guidelines applied to a corpus of publicly available clinical documents called MTSamples. Specifically, our goals were to (1) characterize a heterogeneous corpus of clinical documents manually annotated for risk-ranked PHI and other annotation types (clinical eponyms and person relations), (2) evaluate how well annotators apply the CHIR schema to the heterogeneous corpus, (3) compare whether machine-assisted annotation (experiment) improves annotation quality and reduces annotation time compared to manual annotation (control), and (4) assess the change in quality of reference standard coverage with each added annotator's annotations.


Subject(s)
Electronic Health Records , User-Computer Interface , Health Insurance Portability and Accountability Act , United States
17.
Stud Health Technol Inform ; 310: 614-618, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269882

ABSTRACT

In the United States, more than 12% of the population will experience thyroid dysfunction. Patient symptoms often reported with thyroid dysfunction include fatigue and weight change. However, little is understood about the relationship between these symptoms documented in the outpatient setting and ordering patterns for thyroid testing among various patient groups by age and sex. We developed a natural language processing and deep learning pipeline to identify patient-reported outcomes of weight change and fatigue among patients with a thyroid stimulating hormone test. We built upon prior work by comparing 5 open-source Bidirectional Encoder Representations from Transformers (BERT) models to determine which could accurately identify these symptoms from clinical texts. For both fatigue (f) and weight change (wc), Bio_ClinicalBERT achieved the highest F1-score (f: 0.900; wc: 0.906) compared with BERT (f: 0.899; wc: 0.890), DistilBERT (f: 0.852; wc: 0.912), Biomedical RoBERTa (f: 0.864; wc: 0.904), and PubMedBERT (f: 0.882; wc: 0.892).


Subject(s)
Natural Language Processing , Thyroid Gland , Humans , Outpatients , Electric Power Supplies , Fatigue
18.
JMIR Form Res ; 8: e52200, 2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38277207

ABSTRACT

BACKGROUND: Atopic dermatitis (AD) is a chronic skin condition that millions of people around the world live with each day. Research into the causes and treatment of this disease has great potential to benefit these individuals. However, AD clinical trial recruitment is not a trivial task due to the variance in diagnostic precision and phenotypic definitions leveraged by different clinicians, as well as the time clinicians spend finding, recruiting, and enrolling patients as study participants. Thus, there is a need for automatic and effective patient phenotyping for cohort recruitment. OBJECTIVE: This study aims to present an approach for identifying patients whose electronic health records suggest that they may have AD. METHODS: We created a vectorized representation of each patient and trained various supervised machine learning methods to classify whether a patient has AD. Each patient is represented by a vector of either probabilities or binary values, where each value indicates whether they meet a different criterion for AD diagnosis. RESULTS: The most accurate AD classifier performed with a class-balanced accuracy of 0.8036, a precision of 0.8400, and a recall of 0.7500 when using XGBoost (Extreme Gradient Boosting). CONCLUSIONS: Creating an automated approach for identifying patient cohorts has the potential to accelerate, standardize, and automate the process of patient recruitment for AD studies, thereby reducing clinician burden and informing the discovery of better treatment options for AD.
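The three reported metrics can be related to each other with a short Python sketch. Neither derived value appears in the abstract: the F1-score follows from the standard harmonic-mean definition, and the specificity is implied only under the assumption that "class-balanced accuracy" means the average of recall (sensitivity) and specificity, as in the usual binary definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def implied_specificity(balanced_accuracy: float, recall: float) -> float:
    """Solve balanced_accuracy = (recall + specificity) / 2 for specificity.

    Assumes the binary definition of balanced accuracy; the abstract does
    not state which definition was used.
    """
    return 2 * balanced_accuracy - recall

# Values reported for the XGBoost classifier.
precision, recall, balanced_accuracy = 0.8400, 0.7500, 0.8036

print(f"F1 (derived):          {f1_score(precision, recall):.4f}")
print(f"specificity (implied): {implied_specificity(balanced_accuracy, recall):.4f}")
```

Under these assumptions the classifier misses one in four true AD patients while rejecting most non-AD patients, which is the trade-off a recruitment workflow would need to weigh.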

19.
JMIR Ment Health ; 11: e53366, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38224481

ABSTRACT

BACKGROUND: Information regarding opioid use disorder (OUD) status and severity is important for patient care. Clinical notes provide valuable information for detecting and characterizing problematic opioid use, necessitating development of natural language processing (NLP) tools, which in turn requires reliably labeled OUD-relevant text and understanding of documentation patterns. OBJECTIVE: To inform automated NLP methods, we aimed to develop and evaluate an annotation schema for characterizing OUD and its severity, and to document patterns of OUD-relevant information within clinical notes of heterogeneous patient cohorts. METHODS: We developed an annotation schema to characterize OUD severity based on criteria from the Diagnostic and Statistical Manual of Mental Disorders, 5th edition. In total, 2 annotators reviewed clinical notes from key encounters of 100 adult patients with varied evidence of OUD, including patients with and those without chronic pain, with and without medication treatment for OUD, and a control group. We completed annotations at the sentence level. We calculated severity scores based on annotation of note text with 18 classes aligned with criteria for OUD severity and determined positive predictive values for OUD severity. RESULTS: The annotation schema contained 27 classes. We annotated 1436 sentences from 82 patients; notes of 18 patients (11 of whom were controls) contained no relevant information. Interannotator agreement was above 70% for 11 of 15 batches of reviewed notes. Severity scores for control group patients were all 0. Among noncontrol patients, the mean severity score was 5.1 (SD 3.2), indicating moderate OUD, and the positive predictive value for detecting moderate or severe OUD was 0.71. Progress notes and notes from emergency department and outpatient settings contained the most and greatest diversity of information. Substance misuse and psychiatric classes were most prevalent and highly correlated across note types with high co-occurrence across patients. CONCLUSIONS: Implementation of the annotation schema demonstrated strong potential for inferring OUD severity based on key information in a small set of clinical notes and highlighting where such information is documented. These advancements will facilitate NLP tool development to improve OUD prevention, diagnosis, and treatment.
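How a severity score maps to a severity level can be sketched in Python using the standard DSM-5 cutoffs for substance use disorders (2-3 criteria: mild; 4-5: moderate; 6 or more: severe). The abstract does not describe how the 18 annotation classes were combined into a score, so counting distinct criteria is an assumption here, and the criterion labels below are hypothetical names invented for illustration.

```python
def severity_score(criteria_found: list[str]) -> int:
    """Count distinct criteria evidenced in a patient's annotated sentences.

    Assumption: one point per distinct criterion; the study's actual
    scoring rule is not given in the abstract.
    """
    return len(set(criteria_found))

def severity_label(score: int) -> str:
    """Map a criterion count to a DSM-5 severity level."""
    if score >= 6:
        return "severe"
    if score >= 4:
        return "moderate"
    if score >= 2:
        return "mild"
    return "below diagnostic threshold"

# Hypothetical sentence-level annotations for one patient (labels invented).
annotated = ["craving", "tolerance", "withdrawal", "craving",
             "failed_cut_down", "role_failure"]

score = severity_score(annotated)
print(score, severity_label(score))
```

With these cutoffs, a score of 5 falls in the moderate band, which agrees with the abstract's reading of a mean score of 5.1 as moderate OUD.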


Subject(s)
Chronic Pain , Opioid-Related Disorders , Adult , Humans , Natural Language Processing , Outpatients , Control Groups , Opioid-Related Disorders/diagnosis
20.
Stud Health Technol Inform ; 310: 619-623, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269883

ABSTRACT

According to the World Stroke Organization, 12.2 million people worldwide will have their first stroke this year, almost half of whom will die as a result. Natural Language Processing (NLP) may improve stroke phenotyping; however, existing rule-based classifiers are rigid, resulting in inadequate performance. We report findings from a pilot study using NLP to improve relation detection for stroke assertion detection to support research studies and healthcare operations.


Subject(s)
Natural Language Processing , Stroke , Humans , Pilot Projects , Stroke/diagnosis