Search | Virtual Health Library

1.

A general framework for developing computable clinical phenotype algorithms.

Carrell, David S; Floyd, James S; Gruber, Susan; Hazlehurst, Brian L; Heagerty, Patrick J; Nelson, Jennifer L; Williamson, Brian D; Ball, Robert.

J Am Med Inform Assoc ; 2024 May 15.

Article in English | MEDLINE | ID: mdl-38748991

ABSTRACT

OBJECTIVE: Present a general framework providing high-level guidance to developers of computable algorithms for identifying patients with specific clinical conditions (phenotypes) through a variety of approaches, including but not limited to machine learning and natural language processing methods to incorporate rich electronic health record data. MATERIALS/METHODS: Drawing on extensive prior phenotyping experiences and insights derived from three algorithm development projects conducted specifically for this purpose, our team with expertise in clinical medicine, statistics, informatics, pharmacoepidemiology, and healthcare data science methods conceptualized stages of development and corresponding sets of principles, strategies, and practical guidelines for improving the algorithm development process. RESULTS: We propose five stages of algorithm development and corresponding principles, strategies, and guidelines: 1) assessing fitness-for-purpose, 2) creating gold standard data, 3) feature engineering, 4) model development, and 5) model evaluation. DISCUSSION/CONCLUSION: This framework is intended to provide practical guidance and serve as a basis for future elaboration and extension.

2.

Finding uncoded anaphylaxis in electronic health records to estimate the sensitivity of ICD10 codes.

Hazlehurst, Brian; Carrell, David S; Bann, Maralyssa A; Nelson, Jennifer; Gruber, Susan; Slaughter, Matthew; Cronkite, David J; Ball, Robert; Floyd, James S.

Am J Epidemiol ; 2024 May 16.

Article in English | MEDLINE | ID: mdl-38751242

3.

Data-driven automated classification algorithms for acute health conditions: applying PheNorm to COVID-19 disease.

Smith, Joshua C; Williamson, Brian D; Cronkite, David J; Park, Daniel; Whitaker, Jill M; McLemore, Michael F; Osmanski, Joshua T; Winter, Robert; Ramaprasan, Arvind; Kelley, Ann; Shea, Mary; Wittayanukorn, Saranrat; Stojanovic, Danijela; Zhao, Yueqin; Toh, Sengwee; Johnson, Kevin B; Aronoff, David M; Carrell, David S.

J Am Med Inform Assoc ; 31(3): 574-582, 2024 Feb 16.

Article in English | MEDLINE | ID: mdl-38109888

ABSTRACT

OBJECTIVES: Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its potential feasibility for rapid phenotyping of acute health conditions. MATERIALS AND METHODS: PheNorm is a general-purpose automated approach to creating computable phenotype algorithms based on natural language processing, machine learning, and (low cost) silver-standard training labels. We applied PheNorm to cohorts of potential COVID-19 patients from 2 institutions and used gold-standard manual chart review data to investigate the impact on performance of alternative feature engineering options and implementing externally trained models without local retraining. RESULTS: Models at the 2 institutions achieved AUC, sensitivity, and positive predictive value of 0.853, 0.879, and 0.851 at one and 0.804, 0.976, and 0.885 at the other, at quantiles of model-predicted risk that maximize F1. We report performance metrics for all combinations of silver labels, feature engineering options, and models trained internally versus externally. DISCUSSION: Phenotyping algorithms developed using PheNorm performed well at both institutions. Performance varied with different silver-standard labels and feature engineering options. Models developed locally at one site also worked well when implemented externally at the other site. CONCLUSION: PheNorm models successfully identified an acute health condition, symptomatic COVID-19. The simplicity of the PheNorm approach allows it to be applied at multiple study sites with substantially reduced overhead compared to traditional approaches.

Subjects

Algorithms; COVID-19; Humans; Electronic Health Records; Machine Learning; Natural Language Processing
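The abstract of entry 3 reports sensitivity and positive predictive value at the cutoff of model-predicted risk that maximizes F1. The sketch below illustrates that idea only: it scans every distinct predicted score (equivalent to scanning the empirical quantiles of the scores) and keeps the cutoff with the highest F1. The function names and the toy scores and labels are hypothetical and are not taken from the PheNorm study.

```python
def f1_at_threshold(scores, labels, threshold):
    """Return (f1, sensitivity, ppv) when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity) if ppv + sensitivity else 0.0
    return f1, sensitivity, ppv


def best_f1_cutoff(scores, labels):
    """Try each distinct score as a cutoff; return (threshold, f1, sensitivity, ppv)."""
    best = None
    for t in sorted(set(scores)):
        f1, sens, ppv = f1_at_threshold(scores, labels, t)
        if best is None or f1 > best[1]:
            best = (t, f1, sens, ppv)
    return best


# Hypothetical model-predicted risks and gold-standard labels (1 = case).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

threshold, f1, sensitivity, ppv = best_f1_cutoff(scores, labels)
print(f"cutoff={threshold:.2f} F1={f1:.3f} sensitivity={sensitivity:.3f} PPV={ppv:.3f}")
```

On this toy data the selected cutoff is 0.60, where all four cases are captured at the cost of one false positive.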

4.

Scalable Incident Detection via Natural Language Processing and Probabilistic Language Models.

Walsh, Colin G; Wilimitis, Drew; Chen, Qingxia; Wright, Aileen; Kolli, Jhansi; Robinson, Katelyn; Ripperger, Michael A; Johnson, Kevin B; Carrell, David; Desai, Rishi J; Mosholder, Andrew; Dharmarajan, Sai; Adimadhyam, Sruthi; Fabbri, Daniel; Stojanovic, Danijela; Matheny, Michael E; Bejan, Cosmin A.

medRxiv ; 2023 Dec 01.

Article in English | MEDLINE | ID: mdl-38076830

ABSTRACT

Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision. Clinical textual data might bridge these gaps, and natural language processing (NLP) has been shown to aid in scalable phenotyping across healthcare records in multiple clinical domains. In this study, we developed and validated a novel incident phenotyping approach using unstructured clinical textual data agnostic to Electronic Health Record (EHR) and note type. It is based on a published, validated approach (PheRe) used to ascertain social determinants of health and suicidality across entire healthcare records. To demonstrate generalizability, we validated this approach on two separate phenotypes that share common challenges with respect to accurate ascertainment: 1) suicide attempt; 2) sleep-related behaviors. With samples of 89,428 records and 35,863 records for suicide attempt and sleep-related behaviors, respectively, we conducted silver standard (diagnostic coding) and gold standard (manual chart review) validation. We showed Area Under the Precision-Recall Curve (AUPR) of ~0.77 (95% CI 0.75-0.78) for suicide attempt and AUPR of ~0.31 (95% CI 0.28-0.34) for sleep-related behaviors. We also evaluated performance by coded race and demonstrated that differences in performance by race were dissimilar across phenotypes and require algorithmovigilance and debiasing prior to implementation.
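The abstract of entry 4 summarizes performance as the area under the precision-recall curve. As a minimal sketch, average precision is one common estimator of that area; the abstract does not say which estimator the authors used, so this is an assumption. The function name and toy data are hypothetical, and tied scores are not given special handling.

```python
def average_precision(scores, labels):
    """Mean of the precision values observed at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    true_positives = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this rank
    return total / n_pos


# Hypothetical scores and labels (1 = chart-confirmed event).
ap = average_precision([0.9, 0.8, 0.7, 0.6, 0.5], [1, 0, 1, 0, 0])
print(round(ap, 4))
```

Unlike the ROC AUC, this quantity depends on how common the phenotype is, which is one reason values for a rarer phenotype can be much lower.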

5.

Author Correction: Genetic variation in the human leukocyte antigen region confers susceptibility to Clostridioides difficile infection.

Ferar, Kathleen; Hall, Taryn O; Crawford, Dana C; Rowley, Robb; Satterfield, Benjamin A; Li, Rongling; Gragert, Loren; Karlson, Elizabeth W; de Andrade, Mariza; Kullo, Iftikhar J; McCarty, Catherine A; Kho, Abel; Hayes, M Geoffrey; Ritchie, Marylyn D; Crane, Paul K; Mirel, Daniel B; Carlson, Christopher; Connolly, John J; Hakonarson, Hakon; Crenshaw, Andrew T; Carrell, David; Luo, Yuan; Dikilitas, Ozan; Denny, Joshua C; Jarvik, Gail P; Crosslin, David R.

Sci Rep ; 13(1): 19972, 2023 Nov 15.

Article in English | MEDLINE | ID: mdl-37968452

6.

Genetic variation in the human leukocyte antigen region confers susceptibility to Clostridioides difficile infection.

Ferar, Kathleen; Hall, Taryn O; Crawford, Dana C; Rowley, Robb; Satterfield, Benjamin A; Li, Rongling; Gragert, Loren; Karlson, Elizabeth W; de Andrade, Mariza; Kullo, Iftikhar J; McCarty, Catherine A; Kho, Abel; Hayes, M Geoffrey; Ritchie, Marylyn D; Crane, Paul K; Mirel, Daniel B; Carlson, Christopher; Connolly, John J; Hakonarson, Hakon; Crenshaw, Andrew T; Carrell, David; Luo, Yuan; Dikilitas, Ozan; Denny, Joshua C; Jarvik, Gail P; Crosslin, David R.

Sci Rep ; 13(1): 18532, 2023 Oct 28.

Article in English | MEDLINE | ID: mdl-37898691

ABSTRACT

Clostridioides difficile (C. diff.) infection (CDI) is a leading cause of hospital-acquired diarrhea in North America and Europe and a major cause of morbidity and mortality. Known risk factors do not fully explain CDI susceptibility, and genetic susceptibility is suggested by the fact that some patients with colons that are colonized with C. diff. do not develop any infection while others develop severe or recurrent infections. To identify common genetic variants associated with CDI, we performed a genome-wide association analysis in 19,861 participants (1349 cases; 18,512 controls) from the Electronic Medical Records and Genomics (eMERGE) Network. Using logistic regression, we found strong evidence for genetic variation in the DRB locus of the MHC (HLA) II region that predisposes individuals to CDI (P > 1.0 × 10⁻¹⁴; OR 1.56). Altered transcriptional regulation in the HLA region may play a role in conferring susceptibility to this opportunistic enteric pathogen.

Subjects

Clostridium Infections; Genome-Wide Association Study; Humans; Clostridium Infections/genetics; Diarrhea; Histocompatibility Antigens; HLA Antigens/genetics; Histocompatibility Antigens Class II; Genetic Variation

7.

Multi-ancestry genome- and phenome-wide association studies of diverticular disease in electronic health records with natural language processing enriched phenotyping algorithm.

Joo, Yoonjung Yoonie; Pacheco, Jennifer A; Thompson, William K; Rasmussen-Torvik, Laura J; Rasmussen, Luke V; Lin, Frederick T J; Andrade, Mariza de; Borthwick, Kenneth M; Bottinger, Erwin; Cagan, Andrew; Carrell, David S; Denny, Joshua C; Ellis, Stephen B; Gottesman, Omri; Linneman, James G; Pathak, Jyotishman; Peissig, Peggy L; Shang, Ning; Tromp, Gerard; Veerappan, Annapoorani; Smith, Maureen E; Chisholm, Rex L; Gawron, Andrew J; Hayes, M Geoffrey; Kho, Abel N.

PLoS One ; 18(5): e0283553, 2023.

Article in English | MEDLINE | ID: mdl-37196047

ABSTRACT

OBJECTIVE: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique. MATERIALS AND METHODS: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs. We performed genome-wide association studies (GWAS) of DD in European, African and multi-ancestry participants, followed by phenome-wide association studies (PheWAS) of the risk variants to identify their potential comorbid/pleiotropic effects in clinical phenotypes. RESULTS: Our developed algorithm showed a significant improvement in patient classification performance for DD analysis (algorithm PPVs ≥ 0.94), with up to a 3.5-fold increase in the number of identified patients compared with the traditional method. Ancestry-stratified analyses of diverticulosis and diverticulitis of the identified subjects replicated the well-established associations between ARHGAP15 loci and DD, showing overall intensified GWAS signals in diverticulitis patients compared to diverticulosis patients. Our PheWAS analyses identified significant associations between the DD GWAS variants and circulatory system, genitourinary, and neoplastic EHR phenotypes. DISCUSSION: As the first multi-ancestry GWAS-PheWAS study, we showcased that heterogeneous EHR data can be mapped through an integrative analytical pipeline and reveal significant genotype-phenotype associations with clinical interpretation.
CONCLUSION: A systematic framework to process unstructured EHR data with NLP could advance a deep and scalable phenotyping for better patient identification and facilitate etiological investigation of a disease with multilayered data.

Subjects

Diverticular Diseases; Diverticulitis; Diverticulum; Humans; Electronic Health Records; Genome-Wide Association Study/methods; Natural Language Processing; Phenotype; Algorithms; Polymorphism, Single Nucleotide

8.

Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network.

Pacheco, Jennifer A; Rasmussen, Luke V; Wiley, Ken; Person, Thomas Nate; Cronkite, David J; Sohn, Sunghwan; Murphy, Shawn; Gundelach, Justin H; Gainer, Vivian; Castro, Victor M; Liu, Cong; Mentch, Frank; Lingren, Todd; Sundaresan, Agnes S; Eickelberg, Garrett; Willis, Valerie; Furmanchuk, Al'ona; Patel, Roshan; Carrell, David S; Deng, Yu; Walton, Nephi; Satterfield, Benjamin A; Kullo, Iftikhar J; Dikilitas, Ozan; Smith, Joshua C; Peterson, Josh F; Shang, Ning; Kiryluk, Krzysztof; Ni, Yizhao; Li, Yikuan; Nadkarni, Girish N; Rosenthal, Elisabeth A; Walunas, Theresa L; Williams, Marc S; Karlson, Elizabeth W; Linder, Jodell E; Luo, Yuan; Weng, Chunhua; Wei, WeiQi.

Sci Rep ; 13(1): 1971, 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36737471

ABSTRACT

The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP resulted in improved, or the same, precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can improve communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms are essential to support local customizations.

Subjects

Electronic Health Records; Natural Language Processing; Genomics; Algorithms; Phenotype

9.

Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning.

Carrell, David S; Gruber, Susan; Floyd, James S; Bann, Maralyssa A; Cushing-Haugen, Kara L; Johnson, Ron L; Graham, Vina; Cronkite, David J; Hazlehurst, Brian L; Felcher, Andrew H; Bejan, Cosmin A; Kennedy, Adee; Shinde, Mayura U; Karami, Sara; Ma, Yong; Stojanovic, Danijela; Zhao, Yueqin; Ball, Robert; Nelson, Jennifer C.

Am J Epidemiol ; 192(2): 283-295, 2023 Feb 01.

Article in English | MEDLINE | ID: mdl-36331289

ABSTRACT

We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.

Subjects

Anaphylaxis; Natural Language Processing; Humans; Anaphylaxis/diagnosis; Anaphylaxis/epidemiology; Machine Learning; Algorithms; Emergency Service, Hospital; Electronic Health Records
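The abstract of entry 9 compares models by area under the ROC curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen confirmed event scores higher than a randomly chosen non-event, with ties counted as one half. It omits the cross-validation the study used, and the function name and toy data are hypothetical. The last lines simply re-derive the adjudication percentages from the counts stated in the abstract.

```python
def auc(scores, labels):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and adjudicated outcomes.
print(auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))

# Counts reported in the abstract: confirmed events / potential events per site.
development_share = 154 / 239
validation_share = 180 / 277
print(round(100 * development_share), round(100 * validation_share))
```

An AUC of 0.5 corresponds to chance ranking, which puts the reported improvement from 0.58 to 0.70 in context.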

10.

Characterizing variability of electronic health record-driven phenotype definitions.

Brandt, Pascal S; Kho, Abel; Luo, Yuan; Pacheco, Jennifer A; Walunas, Theresa L; Hakonarson, Hakon; Hripcsak, George; Liu, Cong; Shang, Ning; Weng, Chunhua; Walton, Nephi; Carrell, David S; Crane, Paul K; Larson, Eric B; Chute, Christopher G; Kullo, Iftikhar J; Carroll, Robert; Denny, Josh; Ramirez, Andrea; Wei, Wei-Qi; Pathak, Jyoti; Wiley, Laura K; Richesson, Rachel; Starren, Justin B; Rasmussen, Luke V.

J Am Med Inform Assoc ; 30(3): 427-437, 2023 Feb 16.

Article in English | MEDLINE | ID: mdl-36474423

ABSTRACT

OBJECTIVE: The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used. MATERIALS AND METHODS: A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries. RESULTS: Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts. Most use 4 or fewer medical terminologies. The number of codes used ranges from 5 to 6865, and value sets from 1 to 19. We found that the most common expressions used were literal, data, and logical expressions. Aggregate and arithmetic expressions are the least common. Expression depth ranges from 4 to 27. DISCUSSION: Despite the range of conditions, we found that all of the phenotype definitions consisted of logical criteria, representing both clinical and operational logic, and tabular data, consisting of codes from standard terminologies and keywords for natural language processing. The total number and variety of expressions are low, which may be to simplify implementation, or authors may limit complexity due to data availability constraints. CONCLUSIONS: The phenotype definitions analyzed show significant variation in specific logical, arithmetic, and other operators but are all composed of the same high-level components, namely tabular data and logical expressions. A standard representation for phenotype definitions should support these formats and be modular to support localization and shared logic.

Subjects

Electronic Health Records; Language; Phenotype; Narration
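The abstract of entry 10 reports expression depth and the relative frequency of expression types in computable phenotype definitions. The sketch below shows how such measures could be computed over a nested expression tree. The dictionary layout (`type` and `operands` keys), the example expression, and the convention that a leaf has depth 1 are assumptions made for illustration; they are not the study's tooling or the actual CQL serialization.

```python
from collections import Counter


def expression_depth(node):
    """Depth of a nested expression tree; a node without operands has depth 1."""
    children = node.get("operands", [])
    if not children:
        return 1
    return 1 + max(expression_depth(child) for child in children)


def count_expression_types(node, counts=None):
    """Tally how often each expression type appears anywhere in the tree."""
    if counts is None:
        counts = Counter()
    counts[node["type"]] += 1
    for child in node.get("operands", []):
        count_expression_types(child, counts)
    return counts


# Hypothetical rule: "a matching record exists AND a lab value >= a literal".
expr = {
    "type": "And",
    "operands": [
        {"type": "Exists", "operands": [{"type": "Retrieve"}]},
        {
            "type": "GreaterOrEqual",
            "operands": [{"type": "Property"}, {"type": "Literal"}],
        },
    ],
}

depth = expression_depth(expr)
counts = count_expression_types(expr)
print(depth, dict(counts))
```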

11.

Validation of Acute Pancreatitis Among Adults in an Integrated Healthcare System.

Floyd, James S; Bann, Maralyssa A; Felcher, Andrew H; Sapp, Daniel; Nguyen, Michael D; Ajao, Adebola; Ball, Robert; Carrell, David S; Nelson, Jennifer C; Hazlehurst, Brian.

Epidemiology ; 34(1): 33-37, 2023 Jan 01.

Article in English | MEDLINE | ID: mdl-36007092

ABSTRACT

BACKGROUND: Acute pancreatitis is a serious gastrointestinal disease that is an important target for drug safety surveillance. Little is known about the accuracy of ICD-10 codes for acute pancreatitis in the United States, or their performance in specific clinical settings. We conducted a validation study to assess the accuracy of acute pancreatitis ICD-10 diagnosis codes in inpatient, emergency department (ED), and outpatient settings. METHODS: We reviewed electronic medical records for encounters with acute pancreatitis diagnosis codes in an integrated healthcare system from October 2015 to December 2019. Trained abstractors and physician adjudicators determined whether events met criteria for acute pancreatitis. RESULTS: Out of 1,844 eligible events, we randomly sampled 300 for review. Across all clinical settings, 182 events met validation criteria for an overall positive predictive value (PPV) of 61% (95% confidence intervals [CI] = 55, 66). The PPV was 87% (95% CI = 79, 92%) for inpatient codes, but only 45% for ED (95% CI = 35, 54%) and outpatient (95% CI = 34, 55%) codes. ED and outpatient encounters accounted for 43% of validated events. Acute pancreatitis codes from any encounter type with lipase >3 times the upper limit of normal had a PPV of 92% (95% CI = 86, 95%) and identified 85% of validated events (95% CI = 79, 89%), while codes with lipase <3 times the upper limit of normal had a PPV of only 22% (95% CI = 16, 30%). CONCLUSIONS: These results suggest that ICD-10 codes accurately identified acute pancreatitis in the inpatient setting, but not in the ED and outpatient settings. Laboratory data substantially improved algorithm performance.

Subjects

Delivery of Health Care, Integrated; Pancreatitis; Adult; Humans; United States/epidemiology; Acute Disease; Pancreatitis/diagnosis; Pancreatitis/epidemiology; International Classification of Diseases; Predictive Value of Tests; Lipase
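The overall positive predictive value in the abstract of entry 11 follows directly from the stated counts (182 validated events among 300 reviewed). The sketch below recomputes it and attaches a Wilson score interval. The abstract does not say which interval method the authors used, so the Wilson interval is an assumption; it happens to reproduce the reported 55 to 66 range after rounding.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


validated, reviewed = 182, 300  # counts stated in the abstract
ppv = validated / reviewed
low, high = wilson_interval(validated, reviewed)
print(f"PPV = {ppv:.1%} (95% CI {low:.1%}, {high:.1%})")
```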

12.

Patient characteristics, pain treatment patterns, and incidence of total joint replacement in a US population with osteoarthritis.

Shinde, Mayura; Rodriguez-Watson, Carla; Zhang, Tancy C; Carrell, David S; Mendelsohn, Aaron B; Nam, Young Hee; Carruth, Amanda; Petronis, Kenneth R; McMahill-Walraven, Cheryl N; Jamal-Allial, Aziza; Nair, Vinit; Pawloski, Pamala A; Hickman, Anne; Brown, Mark T; Francis, Jennie; Hornbuckle, Ken; Brown, Jeffrey S; Mo, Jingping.

BMC Musculoskelet Disord ; 23(1): 883, 2022 Sep 23.

Article in English | MEDLINE | ID: mdl-36151530

ABSTRACT

BACKGROUND: Currently available medications for chronic osteoarthritis pain are only moderately effective, and their use is limited in many patients because of serious adverse effects and contraindications. The primary surgical option for osteoarthritis is total joint replacement (TJR). The objectives of this study were to describe the treatment history of patients with osteoarthritis receiving prescription pain medications and/or intra-articular corticosteroid injections, and to estimate the incidence of TJR in these patients. METHODS: This retrospective, multicenter, cohort study utilized health plan administrative claims data (January 1, 2013, through December 31, 2019) of adult patients with osteoarthritis in the Innovation in Medical Evidence Development and Surveillance Distributed Database, a subset of the US FDA Sentinel Distributed Database. Patients were analyzed in two cohorts: those with prevalent use of "any pain medication" (prescription non-steroidal anti-inflammatory drugs [NSAIDs], opioids, and/or intra-articular corticosteroid injections) using only the first qualifying dispensing (index date); and those with prevalent use of "each specific pain medication class" with all qualifying treatment episodes identified. RESULTS: Among 1 992 670 prevalent users of "any pain medication", pain medications prescribed on the index date were NSAIDs (596 624 [29.9%] patients), opioids (1 161 806 [58.3%]), and intra-articular corticosteroids (323 459 [16.2%]). Further, 92 026 patients received multiple pain medications on the index date, including 71 632 (3.6%) receiving both NSAIDs and opioids. Altogether, 20.6% of patients used an NSAID at any time following an opioid index dispensing and 17.2% used an opioid following an NSAID index dispensing. 
The TJR incidence rates per 100 person-years (95% confidence interval [CI]) were 3.21 (95% CI: 3.20-3.23) in the "any pain medication" user cohort, and among those receiving "each specific pain medication class" were NSAIDs, 4.63 (95% CI: 4.58-4.67); opioids, 7.45 (95% CI: 7.40-7.49); and intra-articular corticosteroids, 8.05 (95% CI: 7.97-8.13). CONCLUSIONS: In patients treated with prescription medications for osteoarthritis pain, opioids were more commonly prescribed at index than NSAIDs and intra-articular corticosteroid injections. Of the pain medication classes examined, the incidence of TJR was highest in patients receiving intra-articular corticosteroids and lowest in patients receiving NSAIDs.

Subjects

Arthroplasty, Replacement; Chronic Pain; Osteoarthritis; Adrenal Cortex Hormones/adverse effects; Adult; Analgesics, Opioid/therapeutic use; Anti-Inflammatory Agents, Non-Steroidal; Arthroplasty, Replacement/adverse effects; Chronic Pain/drug therapy; Chronic Pain/epidemiology; Cohort Studies; Humans; Incidence; Osteoarthritis/drug therapy; Osteoarthritis/epidemiology; Osteoarthritis/surgery; Retrospective Studies
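The index-date percentages in the abstract of entry 12 can be re-derived from the stated counts, and the incidence rates follow the usual events-per-person-time formula. The sketch below does both. The abstract does not give event counts or person-years, so the inputs to the incidence-rate call are hypothetical and serve only to show the formula.

```python
total_users = 1_992_670  # prevalent users of "any pain medication" (from the abstract)

index_counts = {  # patients per medication on the index date (from the abstract)
    "NSAIDs": 596_624,
    "opioids": 1_161_806,
    "intra-articular corticosteroids": 323_459,
    "NSAIDs and opioids": 71_632,
}

shares = {name: round(100 * n / total_users, 1) for name, n in index_counts.items()}
print(shares)


def incidence_rate_per_100_py(events, person_years):
    """Incidence rate expressed per 100 person-years of follow-up."""
    return 100 * events / person_years


# Hypothetical inputs: 321 joint replacements over 10,000 person-years.
print(incidence_rate_per_100_py(321, 10_000))
```

The three single-class shares sum to more than 100% because some patients received more than one medication on the index date.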

13.

Development of a machine learning model to predict mild cognitive impairment using natural language processing in the absence of screening.

Penfold, Robert B; Carrell, David S; Cronkite, David J; Pabiniak, Chester; Dodd, Tammy; Glass, Ashley Mh; Johnson, Eric; Thompson, Ella; Arrighi, H Michael; Stang, Paul E.

BMC Med Inform Decis Mak ; 22(1): 129, 2022 May 12.

Article in English | MEDLINE | ID: mdl-35549702

ABSTRACT

BACKGROUND: Patients and their loved ones often report symptoms or complaints of cognitive decline that clinicians note in free clinical text, but no structured screening or diagnostic data are recorded. These symptoms/complaints may be signals that predict who will go on to be diagnosed with mild cognitive impairment (MCI) and ultimately develop Alzheimer's Disease or related dementias. Our objective was to develop a natural language processing system and prediction model for identification of MCI from clinical text in the absence of screening or other structured diagnostic information. METHODS: There were two populations of patients: 1794 participants in the Adult Changes in Thought (ACT) study and 2391 patients in the general population of Kaiser Permanente Washington. All individuals had standardized cognitive assessment scores. We excluded patients with a diagnosis of Alzheimer's Disease, Dementia or use of donepezil. We manually annotated 10,391 clinic notes to train the NLP model. Standard Python code was used to extract phrases from notes and map each phrase to a cognitive functioning concept. Concepts derived from the NLP system were used to predict future MCI. The prediction model was trained on the ACT cohort and 60% of the general population cohort with 40% withheld for validation. We used a least absolute shrinkage and selection operator logistic regression approach (LASSO) to fit a prediction model with MCI as the prediction target. Using the predicted case status from the LASSO model and known MCI from standardized scores, we constructed receiver operating curves to measure model performance. RESULTS: Chart abstraction identified 42 MCI concepts. Prediction model performance in the validation data set was modest with an area under the curve of 0.67. Setting the cutoff for correct classification at 0.60, the classifier yielded sensitivity of 1.7%, specificity of 99.7%, PPV of 70% and NPV of 70.5% in the validation cohort. 
DISCUSSION AND CONCLUSION: Although the sensitivity of the machine learning model was poor, negative predictive value was high, an important characteristic of models used for population-based screening. While an AUC of 0.67 is generally considered moderate performance, it is also comparable to several tests that are widely used in clinical practice.

Subjects

Alzheimer Disease; Cognitive Dysfunction; Alzheimer Disease/diagnosis; Cognitive Dysfunction/diagnosis; Humans; Machine Learning; Mass Screening; Natural Language Processing
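The abstract of entry 13 reports very low sensitivity alongside very high specificity at the chosen cutoff. The sketch below shows how the four reported metrics relate to a 2x2 confusion table. The counts are hypothetical, chosen only to resemble that pattern in a cohort of 1,000 with 300 true cases; they are not the study's data and do not reproduce its figures exactly.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from 2x2 confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts: a strict cutoff flags only 7 of 1,000 patients.
metrics = screening_metrics(tp=5, fp=2, fn=295, tn=698)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

When almost nobody is flagged, NPV stays close to the share of the cohort without the condition, so it is largely driven by prevalence rather than by the model.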

14.

Comparison of Medical Cannabis Use Reported on a Confidential Survey vs Documented in the Electronic Health Record Among Primary Care Patients.

Lapham, Gwen T; Matson, Theresa E; Carrell, David S; Bobb, Jennifer F; Luce, Casey; Oliver, Malia M; Ghitza, Udi E; Hsu, Clarissa; Browne, Kendall C; Binswanger, Ingrid A; Campbell, Cynthia I; Saxon, Andrew J; Vandrey, Ryan; Schauer, Gillian L; Pacula, Rosalie Liccardo; Horberg, Michael A; Bailey, Steffani R; McClure, Erin A; Bradley, Katharine A.

JAMA Netw Open ; 5(5): e2211677, 2022 May 02.

Article in English | MEDLINE | ID: mdl-35604691

ABSTRACT

Importance: Patients who use cannabis for medical reasons may benefit from discussions with clinicians about health risks of cannabis and evidence-based treatment alternatives. However, little is known about the prevalence of medical cannabis use in primary care and how often it is documented in patient electronic health records (EHR). Objective: To estimate the primary care prevalence of medical cannabis use according to confidential patient survey and to compare the prevalence of medical cannabis use documented in the EHR with patient report. Design, Setting, and Participants: This study is a cross-sectional survey performed in a large health system that conducts routine cannabis screening in Washington state where medical and nonmedical cannabis use are legal. Among 108 950 patients who completed routine cannabis screening (between March 28, 2019, and September 12, 2019), 5000 were randomly selected for a confidential survey about cannabis use, using stratified random sampling for frequency of past-year use and patient race and ethnicity. Data were analyzed from November 2020 to December 2021. Exposures: Survey measures of patient-reported past-year cannabis use, medical cannabis use (ie, explicit medical use), and any health reason(s) for use (ie, implicit medical use). Main Outcomes and Measures: Survey data were linked to EHR data in the year before screening. EHR measures included documentation of explicit and/or implicit medical cannabis use. Analyses estimated the primary care prevalence of cannabis use and compared EHR-documented with patient-reported medical cannabis use, accounting for stratified sampling and nonresponse. Results: Overall, 1688 patients responded to the survey (34% response rate; mean [SD] age, 50.7 [17.5] years; 861 female [56%], 1184 White [74%], 1514 non-Hispanic [97%], and 1059 commercially insured [65%]).
The primary care prevalence of any past-year patient-reported cannabis use on the survey was 38.8% (95% CI, 31.9%-46.1%), whereas the prevalence of explicit and implicit medical use were 26.5% (95% CI, 21.6%-31.3%) and 35.1% (95% CI, 29.3%-40.8%), respectively. The prevalence of EHR-documented medical cannabis use was 4.8% (95% CI, 3.45%-6.2%). Compared with patient-reported explicit medical use, the sensitivity and specificity of EHR-documented medical cannabis use were 10.0% (95% CI, 4.4%-15.6%) and 97.1% (95% CI, 94.4%-99.8%), respectively. Conclusions and Relevance: These findings suggest that medical cannabis use is common among primary care patients in a state with legal use, and most use is not documented in the EHR. Patient report of health reasons for cannabis use identifies more medical use compared with explicit questions about medical use.

Subjects

Electronic Health Records; Health Care Surveys; Medical Marijuana; Self Report; Adult; Aged; Confidentiality; Cross-Sectional Studies; Documentation; Electronic Health Records/standards; Female; Humans; Male; Medical Marijuana/therapeutic use; Middle Aged; Primary Health Care

15.

Clinical documentation of patient-reported medical cannabis use in primary care: Toward scalable extraction using natural language processing methods.

Carrell, David S; Cronkite, David J; Shea, Mary; Oliver, Malia; Luce, Casey; Matson, Theresa E; Bobb, Jennifer F; Hsu, Clarissa; Binswanger, Ingrid A; Browne, Kendall C; Saxon, Andrew J; McCormack, Jennifer; Jelstrom, Eve; Ghitza, Udi E; Campbell, Cynthia I; Bradley, Katharine A; Lapham, Gwen T.

Subst Abus ; 43(1): 917-924, 2022.

Article in English | MEDLINE | ID: mdl-35254218

ABSTRACT

Background: Most states have legalized medical cannabis, yet little is known about how medical cannabis use is documented in patients' electronic health records (EHRs). We used natural language processing (NLP) to calculate the prevalence of clinician-documented medical cannabis use among adults in an integrated health system in Washington State where medical and recreational use are legal. Methods: We analyzed EHRs of patients ≥18 years old screened for past-year cannabis use (November 1, 2017-October 31, 2018), to identify clinician-documented medical cannabis use. We defined medical use as any documentation of cannabis that was recommended by a clinician or described by the clinician or patient as intended to manage health conditions or symptoms. We developed and applied an NLP system that included NLP-assisted manual review to identify such documentation in encounter notes. Results: Medical cannabis use was documented for 16,684 (5.6%) of 299,597 outpatient encounters with routine screening for cannabis use among 203,489 patients seeing 1,274 clinicians. The validated NLP system identified 54% of documentation and NLP-assisted manual review the remainder. Language documenting reasons for cannabis use included 125 terms indicating medical use, 28 terms indicating non-medical use and 41 ambiguous terms. Implicit documentation of medical use (e.g., "edible THC nightly for lumbar pain") was more common than explicit (e.g., "continues medical cannabis use"). Conclusions: Clinicians use diverse and often ambiguous language to document patients' reasons for cannabis use. Automating extraction of documentation about patients' cannabis use could facilitate clinical decision support and epidemiological investigation but will require large amounts of gold standard training data.

Subjects

Medical Marijuana , Natural Language Processing , Adolescent , Adult , Documentation , Humans , Medical Marijuana/therapeutic use , Patient Reported Outcome Measures , Primary Health Care

16.

Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions.

Yu, Jingzhi; Pacheco, Jennifer A; Ghosh, Anika S; Luo, Yuan; Weng, Chunhua; Shang, Ning; Benoit, Barbara; Carrell, David S; Carroll, Robert J; Dikilitas, Ozan; Freimuth, Robert R; Gainer, Vivian S; Hakonarson, Hakon; Hripcsak, George; Kullo, Iftikhar J; Mentch, Frank; Murphy, Shawn N; Peissig, Peggy L; Ramirez, Andrea H; Walton, Nephi; Wei, Wei-Qi; Rasmussen, Luke V.

BMC Med Inform Decis Mak ; 22(1): 23, 2022 01 28.

Article in English | MEDLINE | ID: mdl-35090449

ABSTRACT

INTRODUCTION: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. METHODS: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. RESULTS: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. DISCUSSION AND CONCLUSION: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.

Subjects

Algorithms , Electronic Health Records , Genomics , Humans , Knowledge Bases , Phenotype

17.

The Authors Respond.

Floyd, James S; Gruber, Susan; Carrell, David S; Bann, Maralyssa A.

Epidemiology ; 33(1): e2-e3, 2022 01 01.

Article in English | MEDLINE | ID: mdl-34847087

18.

Arrhythmia Variant Associations and Reclassifications in the eMERGE-III Sequencing Study.

Glazer, Andrew M; Davogustto, Giovanni; Shaffer, Christian M; Vanoye, Carlos G; Desai, Reshma R; Farber-Eger, Eric H; Dikilitas, Ozan; Shang, Ning; Pacheco, Jennifer A; Yang, Tao; Muhammad, Ayesha; Mosley, Jonathan D; Van Driest, Sara L; Wells, Quinn S; Shaffer, Lauren Lee; Kalash, Olivia R; Wada, Yuko; Bland, Harris T; Yoneda, Zachary T; Mitchell, Devyn W; Kroncke, Brett M; Kullo, Iftikhar J; Jarvik, Gail P; Gordon, Adam S; Larson, Eric B; Manolio, Teri A; Mirshahi, Tooraj; Luo, Jonathan Z; Schaid, Daniel; Namjou, Bahram; Alsaied, Tarek; Singh, Rajbir; Singhal, Ashutosh; Liu, Cong; Weng, Chunhua; Hripcsak, George; Ralston, James D; McNally, Elizabeth M; Chung, Wendy K; Carrell, David S; Leppig, Kathleen A; Hakonarson, Hakon; Sleiman, Patrick; Sohn, Sunghwan; Glessner, Joseph; Denny, Joshua; Wei, Wei-Qi; George, Alfred L; Shoemaker, M Benjamin; Roden, Dan M.

Circulation ; 145(12): 877-891, 2022 03 22.

Article in English | MEDLINE | ID: mdl-34930020

ABSTRACT

BACKGROUND: Sequencing Mendelian arrhythmia genes in individuals without an indication for arrhythmia genetic testing can identify carriers of pathogenic or likely pathogenic (P/LP) variants. However, the extent to which these variants are associated with clinically meaningful phenotypes before or after return of variant results is unclear. In addition, the majority of discovered variants are currently classified as variants of uncertain significance, limiting clinical actionability. METHODS: The eMERGE-III study (Electronic Medical Records and Genomics Phase III) is a multicenter prospective cohort that included 21 846 participants without previous indication for cardiac genetic testing. Participants were sequenced for 109 Mendelian disease genes, including 10 linked to arrhythmia syndromes. Variant carriers were assessed with electronic health record-derived phenotypes and follow-up clinical examination. Selected variants of uncertain significance (n=50) were characterized in vitro with automated electrophysiology experiments in HEK293 cells. RESULTS: As previously reported, 3.0% of participants had P/LP variants in the 109 genes. Herein, we report 120 participants (0.6%) with P/LP arrhythmia variants. Compared with noncarriers, arrhythmia P/LP carriers had a significantly higher burden of arrhythmia phenotypes in their electronic health records. Fifty-four participants had variant results returned. Nineteen of these 54 participants had inherited arrhythmia syndrome diagnoses (primarily long-QT syndrome), and 12 of these 19 diagnoses were made only after variant results were returned (0.05%). After in vitro functional evaluation of 50 variants of uncertain significance, we reclassified 11 variants: 3 to likely benign and 8 to P/LP. CONCLUSIONS: Genome sequencing in a large population without indication for arrhythmia genetic testing identified phenotype-positive carriers of variants in congenital arrhythmia syndrome disease genes. As the genomes of large numbers of people are sequenced, the disease risk from rare variants in arrhythmia genes can be assessed by integrating genomic screening, electronic health record phenotypes, and in vitro functional studies. REGISTRATION: URL: https://www.clinicaltrials.gov; Unique identifier: NCT03394859.

Subjects

Arrhythmias, Cardiac , Genetic Testing , Arrhythmias, Cardiac/diagnosis , Arrhythmias, Cardiac/genetics , Genetic Predisposition to Disease , Genetic Testing/methods , Genomics , HEK293 Cells , Humans , Phenotype , Prospective Studies

19.

Letter to the Editor re Beachler, et al, 2021.

Gruber, Susan; Carrell, David S; Floyd, James S; Nelson, Jennifer C; Hazlehurst, Brian L; Heagerty, Patrick J.

Pharmacoepidemiol Drug Saf ; 30(12): 1735-1736, 2021 12.

Article in English | MEDLINE | ID: mdl-34409675

20.

Prevalence of Medical Cannabis Use and Associated Health Conditions Documented in Electronic Health Records Among Primary Care Patients in Washington State.

Matson, Theresa E; Carrell, David S; Bobb, Jennifer F; Cronkite, David J; Oliver, Malia M; Luce, Casey; Ghitza, Udi E; Hsu, Clarissa W; Campbell, Cynthia I; Browne, Kendall C; Binswanger, Ingrid A; Saxon, Andrew J; Bradley, Katharine A; Lapham, Gwen T.

JAMA Netw Open ; 4(5): e219375, 2021 05 03.

Article in English | MEDLINE | ID: mdl-33956129

ABSTRACT

Importance: Many people use cannabis for medical reasons despite limited evidence of therapeutic benefit and potential risks. Little is known about medical practitioners' documentation of medical cannabis use or clinical characteristics of patients with documented medical cannabis use. Objectives: To estimate the prevalence of past-year medical cannabis use documented in electronic health records (EHRs) and to describe patients with EHR-documented medical cannabis use, EHR-documented cannabis use without evidence of medical use (other cannabis use), and no EHR-documented cannabis use. Design, Setting, and Participants: This cross-sectional study assessed adult primary care patients who completed a cannabis screen during a visit between November 1, 2017, and October 31, 2018, at a large health system that conducts routine cannabis screening in a US state with legal medical and recreational cannabis use. Exposures: Three mutually exclusive categories of EHR-documented cannabis use (medical, other, and no use) based on practitioner documentation of medical cannabis use in the EHR and patient report of past-year cannabis use at screening. Main Outcomes and Measures: Health conditions for which cannabis use has potential benefits or risks were defined based on National Academies of Sciences, Engineering, and Medicine's review. The adjusted prevalence of conditions diagnosed in the prior year were estimated across 3 categories of EHR-documented cannabis use with logistic regression. Results: A total of 185 565 patients (mean [SD] age, 52.0 [18.1] years; 59% female, 73% White, 94% non-Hispanic, and 61% commercially insured) were screened for cannabis use in a primary care visit during the study period. Among these patients, 3551 (2%) had EHR-documented medical cannabis use, 36 599 (20%) had EHR-documented other cannabis use, and 145 415 (78%) had no documented cannabis use. Patients with medical cannabis use had a higher prevalence of health conditions for which cannabis has potential benefits (49.8%; 95% CI, 48.3%-51.3%) compared with patients with other cannabis use (39.9%; 95% CI, 39.4%-40.3%) or no cannabis use (40.0%; 95% CI, 39.8%-40.2%). In addition, patients with medical cannabis use had a higher prevalence of health conditions for which cannabis has potential risks (60.7%; 95% CI, 59.0%-62.3%) compared with patients with other cannabis use (50.5%; 95% CI, 50.0%-51.0%) or no cannabis use (42.7%; 95% CI, 42.4%-42.9%). Conclusions and Relevance: In this cross-sectional study, primary care patients with documented medical cannabis use had a high prevalence of health conditions for which cannabis use has potential benefits, yet a higher prevalence of conditions with potential risks from cannabis use. These findings suggest that practitioners should be prepared to discuss potential risks and benefits of cannabis use with patients.

Subjects

Electronic Health Records/statistics & numerical data , Medical Marijuana/therapeutic use , Primary Health Care/statistics & numerical data , Adolescent , Adult , Aged , Cross-Sectional Studies , Female , Humans , Male , Middle Aged , Risk Assessment , Treatment Outcome , Washington/epidemiology , Young Adult
