
1.

Artificial intelligence and machine learning for clinical pharmacology.

Ryan, David K; Maclean, Rory H; Balston, Alfred; Scourfield, Andrew; Shah, Anoop D; Ross, Jack.

Br J Clin Pharmacol ; 90(3): 629-639, 2024 03.

Article En | MEDLINE | ID: mdl-37845024

Artificial intelligence (AI) will impact many aspects of clinical pharmacology, including drug discovery and development, clinical trials, personalized medicine, pharmacogenomics, pharmacovigilance and clinical toxicology. The rapid progress of AI in healthcare means clinical pharmacologists should have an understanding of AI and its implementation in clinical practice. As with any new therapy or health technology, it is imperative that AI tools are subject to robust and stringent evaluation to ensure that they enhance clinical practice in a safe and equitable manner. This review serves as an introduction to AI for the clinical pharmacologist, highlighting current applications, aspects of model development and issues surrounding evaluation and deployment. The aim of this article is to empower clinical pharmacologists to embrace and lead on the safe and effective use of AI within healthcare.

Artificial Intelligence , Pharmacology, Clinical , Humans , Machine Learning , Biomedical Technology , Drug Discovery

2.

The Relationship Between Cardiac Troponin in People Hospitalised for Exacerbation of COPD and Major Adverse Cardiac Events (MACE) and COPD Readmissions.

Kallis, Constantinos; Kaura, Amit; Samuel, Nathan A; Mulla, Abdulrahim; Glampson, Ben; O'Gallagher, Kevin; Davies, Jim; Papadimitriou, Dimitri; Woods, Kerrie J; Shah, Anoop D; Williams, Bryan; Asselbergs, Folkert W; Mayer, Erik K; Lee, Richard W; Herbert, Christopher; Grant, Stuart W; Curzen, Nick; Squire, Iain B; Johnson, Thomas; Shah, Ajay M; Perera, Divaka; Kharbanda, Rajesh K; Patel, Riyaz S; Channon, Keith M; Mayet, Jamil; Quint, Jennifer K.

Int J Chron Obstruct Pulmon Dis ; 18: 2405-2416, 2023.

Article En | MEDLINE | ID: mdl-37955026

Background: No single biomarker currently risk stratifies chronic obstructive pulmonary disease (COPD) patients at the time of an exacerbation, though previous studies have suggested that patients with elevated troponin at exacerbation have worse outcomes. This study evaluated the relationship between peak cardiac troponin and subsequent major adverse cardiac events (MACE) including all-cause mortality and COPD hospital readmission, among patients admitted with COPD exacerbation. Methods: Data from five cross-regional hospitals in England were analysed using the National Institute of Health Research Health Informatics Collaborative (NIHR-HIC) acute coronary syndrome database (2008-2017). People hospitalised with a COPD exacerbation were included, and peak troponin levels were standardised relative to the 99th percentile (upper limit of normal). We used Cox Proportional Hazard models adjusting for age, sex, laboratory results and clinical risk factors, and implemented logarithmic transformation (base-10 logarithm). The primary outcome was risk of MACE within 90 days from peak troponin measurement. Secondary outcome was risk of COPD readmission within 90 days from peak troponin measurement. Results: There were 2487 patients included. Of these, 377 (15.2%) patients had a MACE event and 203 (8.2%) were readmitted within 90 days from peak troponin measurement. A total of 1107 (44.5%) patients had an elevated troponin level. Of 1107 patients with elevated troponin at exacerbation, 256 (22.8%) had a MACE event and 101 (9.0%) a COPD readmission within 90 days from peak troponin measurement. Patients with troponin above the upper limit of normal had a higher risk of MACE (adjusted HR 2.20, 95% CI 1.75-2.77) and COPD hospital readmission (adjusted HR 1.37, 95% CI 1.02-1.83) when compared with patients without elevated troponin. Conclusion: An elevated troponin level at the time of COPD exacerbation may be a useful tool for predicting MACE in COPD patients. 
The relationship between degree of troponin elevation and risk of future events is complex and requires further investigation.

Cardiovascular Diseases , Pulmonary Disease, Chronic Obstructive , Humans , Patient Readmission , Hospitalization , Troponin , Cardiovascular Diseases/etiology

3.

Long Covid symptoms and diagnosis in primary care: A cohort study using structured and unstructured data in The Health Improvement Network primary care database.

Shah, Anoop D; Subramanian, Anuradhaa; Lewis, Jadene; Dhalla, Samir; Ford, Elizabeth; Haroon, Shamil; Kuan, Valerie; Nirantharakumar, Krishnarajah.

PLoS One ; 18(9): e0290583, 2023.

Article En | MEDLINE | ID: mdl-37751444

BACKGROUND: Long Covid is a widely recognised consequence of COVID-19 infection, but little is known about the burden of symptoms that patients present with in primary care, as these are typically recorded only in free text clinical notes. AIMS: To compare symptoms in patients with and without a history of COVID-19, and investigate symptoms associated with a Long Covid diagnosis. METHODS: We used primary care electronic health record data until the end of December 2020 from The Health Improvement Network (THIN), a Cegedim database. We included adults registered with participating practices in England, Scotland or Wales. We extracted information about 89 symptoms and 'Long Covid' diagnoses from free text using natural language processing. We calculated hazard ratios (adjusted for age, sex, baseline medical conditions and prior symptoms) for each symptom from 12 weeks after the COVID-19 diagnosis. RESULTS: We compared 11,015 patients with confirmed COVID-19 and 18,098 unexposed controls. Only 20% of symptom records were coded, with 80% in free text. A wide range of symptoms were associated with COVID-19 at least 12 weeks post-infection, with strongest associations for fatigue (adjusted hazard ratio (aHR) 3.46, 95% confidence interval (CI) 2.87, 4.17), shortness of breath (aHR 2.89, 95% CI 2.48, 3.36), palpitations (aHR 2.59, 95% CI 1.86, 3.60), and phlegm (aHR 2.43, 95% CI 1.65, 3.59). However, a limited subset of symptoms were recorded within 7 days prior to a Long Covid diagnosis in more than 20% of cases: shortness of breath, chest pain, pain, fatigue, cough, and anxiety / depression. CONCLUSIONS: Numerous symptoms are reported to primary care at least 12 weeks after COVID-19 infection, but only a subset are commonly associated with a GP diagnosis of Long Covid.

COVID-19 , Post-Acute COVID-19 Syndrome , Adult , Humans , Chest Pain , Cohort Studies , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19 Testing , Dyspnea/diagnosis , Dyspnea/epidemiology , Fatigue/diagnosis , Fatigue/epidemiology , Primary Health Care , Male , Female

4.

Understanding Views Around the Creation of a Consented, Donated Databank of Clinical Free Text to Develop and Train Natural Language Processing Models for Research: Focus Group Interviews With Stakeholders.

Fitzpatrick, Natalie K; Dobson, Richard; Roberts, Angus; Jones, Kerina; Shah, Anoop D; Nenadic, Goran; Ford, Elizabeth.

JMIR Med Inform ; 11: e45534, 2023 May 03.

Article En | MEDLINE | ID: mdl-37133927

BACKGROUND: Information stored within electronic health records is often recorded as unstructured text. Special computerized natural language processing (NLP) tools are needed to process this text; however, complex governance arrangements make such data in the National Health Service hard to access, and therefore, it is difficult to use for research in improving NLP methods. The creation of a donated databank of clinical free text could provide an important opportunity for researchers to develop NLP methods and tools and may circumvent delays in accessing the data needed to train the models. However, to date, there has been little or no engagement with stakeholders on the acceptability and design considerations of establishing a free-text databank for this purpose. OBJECTIVE: This study aimed to ascertain stakeholder views around the creation of a consented, donated databank of clinical free text to help create, train, and evaluate NLP for clinical research and to inform the potential next steps for adopting a partner-led approach to establish a national, funded databank of free text for use by the research community. METHODS: Web-based in-depth focus group interviews were conducted with 4 stakeholder groups (patients and members of the public, clinicians, information governance leads and research ethics members, and NLP researchers). RESULTS: All stakeholder groups were strongly in favor of the databank and saw great value in creating an environment where NLP tools can be tested and trained to improve their accuracy. Participants highlighted a range of complex issues for consideration as the databank is developed, including communicating the intended purpose, the approach to access and safeguarding the data, who should have access, and how to fund the databank. 
Participants recommended that a small-scale, gradual approach be adopted to start to gather donations and encouraged further engagement with stakeholders to develop a road map and set of standards for the databank. CONCLUSIONS: These findings provide a clear mandate to begin developing the databank and a framework for stakeholder expectations, which we would aim to meet with the databank delivery.

5.

Digital technology and patient and public involvement (PPI) in routine care and clinical research-A pilot study.

Chen, Yang; Hosin, Ali A; George, Marc J; Asselbergs, Folkert W; Shah, Anoop D.

PLoS One ; 18(2): e0278260, 2023.

Article En | MEDLINE | ID: mdl-36735724

BACKGROUND: Patient and public involvement (PPI) has growing impact on the design of clinical care and research studies. There remains underreporting of formal PPI events including views related to using digital tools. This study aimed to assess the feasibility of hosting a hybrid PPI event to gather views on the use of digital tools in clinical care and research. METHODS: A PPI focus day was held following local procedures and published recommendations related to advertisement, communication and delivery. Two exemplar projects were used as the basis for discussions and qualitative and quantitative data was collected. RESULTS: 32 individuals expressed interest in the PPI day and 9 were selected to attend. 3 participated in person and 6 via an online video-calling platform. Selected written and verbal feedback was collected on two digitally themed projects and on the event itself. The overall quality and interactivity for the event was rated as 4/5 for those who attended in person and 4.5/5 and 4.8/5 respectively, for those who attended remotely. CONCLUSIONS: A hybrid PPI event is feasible and offers a flexible format to capture the views of patients. The overall enthusiasm for digital tools amongst patients in routine care and clinical research is high, though further work and standardised, systematic reporting of PPI events is required.

Digital Technology , Patient Participation , Humans , Pilot Projects , Research Design

6.

Translating and evaluating historic phenotyping algorithms using SNOMED CT.

Elkheder, Musaab; Gonzalez-Izquierdo, Arturo; Qummer Ul Arfeen, Muhammad; Kuan, Valerie; Lumbers, R Thomas; Denaxas, Spiros; Shah, Anoop D.

J Am Med Inform Assoc ; 30(2): 222-232, 2023 01 18.

Article En | MEDLINE | ID: mdl-36083213

OBJECTIVE: Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) (a widely employed international reference terminology) enables the use of relationships between concepts, which could facilitate the phenotyping process. We implemented SNOMED CT-based phenotyping approaches and investigated their performance in the CPRD Aurum primary care database. MATERIALS AND METHODS: We developed SNOMED CT phenotype definitions for 3 exemplar diseases: diabetes mellitus, asthma, and heart failure, using 3 methods: "primary" (primary concept and its descendants), "extended" (primary concept, descendants, and additional relations), and "value set" (based on text searches of term descriptions). We also derived SNOMED CT codelists in a semiautomated manner for 276 disease phenotypes used in a study of health across the lifecourse. Cohorts selected using each codelist were compared to "gold standard" manually curated Read codelists in a sample of 500 000 patients from CPRD Aurum. RESULTS: SNOMED CT codelists selected a similar set of patients to Read, with F1 scores exceeding 0.93, and age and sex distributions were similar. The "value set" and "extended" codelists had slightly greater recall but lower precision than "primary" codelists. We were able to represent 257 of the 276 phenotypes by a single concept hierarchy, and for 135 phenotypes, the F1 score was greater than 0.9. CONCLUSIONS: SNOMED CT provides an efficient way to define disease phenotypes, resulting in similar patient populations to manually curated codelists.

Asthma , Systematized Nomenclature of Medicine , Humans , Algorithms , Electronic Health Records , Databases, Factual
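Illustrative note on the abstract above: it compares cohorts selected by SNOMED CT codelists with cohorts selected by manually curated Read codelists, reporting precision, recall and F1. The sketch below shows how those three measures are conventionally computed from two sets of patient identifiers. The function name and the toy identifiers are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(selected, gold):
    """Compare a candidate cohort with a gold-standard cohort.

    selected: set of patient IDs picked by the candidate codelist
    gold:     set of patient IDs picked by the reference codelist
    """
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up patient IDs (not study data).
gold_cohort = {1, 2, 3, 4}
candidate_cohort = {2, 3, 4, 5, 6}
precision, recall, f1 = precision_recall_f1(candidate_cohort, gold_cohort)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

A broader codelist typically adds patients to the candidate set, which can raise recall while lowering precision; this is the trade-off the abstract describes for the "value set" and "extended" codelists.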

7.

Symptoms and risk factors for long COVID in non-hospitalized adults.

Subramanian, Anuradhaa; Nirantharakumar, Krishnarajah; Hughes, Sarah; Myles, Puja; Williams, Tim; Gokhale, Krishna M; Taverner, Tom; Chandan, Joht Singh; Brown, Kirsty; Simms-Williams, Nikita; Shah, Anoop D; Singh, Megha; Kidy, Farah; Okoth, Kelvin; Hotham, Richard; Bashir, Nasir; Cockburn, Neil; Lee, Siang Ing; Turner, Grace M; Gkoutos, Georgios V; Aiyegbusi, Olalekan Lee; McMullan, Christel; Denniston, Alastair K; Sapey, Elizabeth; Lord, Janet M; Wraith, David C; Leggett, Edward; Iles, Clare; Marshall, Tom; Price, Malcolm J; Marwaha, Steven; Davies, Elin Haf; Jackson, Louise J; Matthews, Karen L; Camaradou, Jenny; Calvert, Melanie; Haroon, Shamil.

Nat Med ; 28(8): 1706-1714, 2022 08.

Article En | MEDLINE | ID: mdl-35879616

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection is associated with a range of persistent symptoms impacting everyday functioning, known as post-COVID-19 condition or long COVID. We undertook a retrospective matched cohort study using a UK-based primary care database, Clinical Practice Research Datalink Aurum, to determine symptoms that are associated with confirmed SARS-CoV-2 infection beyond 12 weeks in non-hospitalized adults and the risk factors associated with developing persistent symptoms. We selected 486,149 adults with confirmed SARS-CoV-2 infection and 1,944,580 propensity score-matched adults with no recorded evidence of SARS-CoV-2 infection. Outcomes included 115 individual symptoms, as well as long COVID, defined as a composite outcome of 33 symptoms by the World Health Organization clinical case definition. Cox proportional hazards models were used to estimate adjusted hazard ratios (aHRs) for the outcomes. A total of 62 symptoms were significantly associated with SARS-CoV-2 infection after 12 weeks. The largest aHRs were for anosmia (aHR 6.49, 95% CI 5.02-8.39), hair loss (3.99, 3.63-4.39), sneezing (2.77, 1.40-5.50), ejaculation difficulty (2.63, 1.61-4.28) and reduced libido (2.36, 1.61-3.47). Among the cohort of patients infected with SARS-CoV-2, risk factors for long COVID included female sex, belonging to an ethnic minority, socioeconomic deprivation, smoking, obesity and a wide range of comorbidities. The risk of developing long COVID was also found to be increased along a gradient of decreasing age. SARS-CoV-2 infection is associated with a plethora of symptoms that are associated with a range of sociodemographic and clinical risk factors.

COVID-19 , Adult , COVID-19/complications , COVID-19/epidemiology , Cohort Studies , Ethnicity , Female , Humans , Male , Minority Groups , Retrospective Studies , Risk Factors , SARS-CoV-2 , Post-Acute COVID-19 Syndrome

8.

Reproducible disease phenotyping at scale: Example of coronary artery disease in UK Biobank.

Patel, Riyaz S; Denaxas, Spiros; Howe, Laurence J; Eggo, Rosalind M; Shah, Anoop D; Allen, Naomi E; Danesh, John; Hingorani, Aroon; Sudlow, Cathie; Hemingway, Harry.

PLoS One ; 17(4): e0264828, 2022.

Article En | MEDLINE | ID: mdl-35381005

IMPORTANCE: A lack of internationally agreed standards for combining available data sources at scale risks inconsistent disease phenotyping limiting research reproducibility. OBJECTIVE: To develop and then evaluate if a rules-based algorithm can identify coronary artery disease (CAD) sub-phenotypes using electronic health records (EHR) and questionnaire data from UK Biobank (UKB). DESIGN: Case-control and cohort study. SETTING: Prospective cohort study of 502K individuals aged 40-69 years recruited between 2006-2010 into the UK Biobank with linked hospitalization and mortality data and genotyping. PARTICIPANTS: We included all individuals for phenotyping into 6 predefined CAD phenotypes using hospital admission and procedure codes, mortality records and baseline survey data. Of these, 408,470 unrelated individuals of European descent had a polygenic risk score (PRS) for CAD estimated. EXPOSURE: CAD Phenotypes. MAIN OUTCOMES AND MEASURES: Association with baseline risk factors, mortality (n = 14,419 over 7.8 years median f/u), and a PRS for CAD. RESULTS: The algorithm classified individuals with CAD into prevalent MI (n = 4,900); incident MI (n = 4,621), prevalent CAD without MI (n = 10,910), incident CAD without MI (n = 8,668), prevalent self-reported MI (n = 2,754); prevalent self-reported CAD without MI (n = 5,623), yielding 37,476 individuals with any type of CAD. Risk factors were similar across the six CAD phenotypes, except for fewer men in the self-reported CAD without MI group (46.7% v 70.1% for the overall group). In age- and sex- adjusted survival analyses, mortality was highest following incident MI (HR 6.66, 95% CI 6.07-7.31) and lowest for prevalent self-reported CAD without MI at baseline (HR 1.31, 95% CI 1.15-1.50) compared to disease-free controls. 
There were similar graded associations across the six phenotypes per SD increase in PRS, with the strongest association for prevalent MI (OR 1.50, 95% CI 1.46-1.55) and the weakest for prevalent self-reported CAD without MI (OR 1.08, 95% CI 1.05-1.12). The algorithm is available in the open phenotype HDR UK phenotype library (https://portal.caliberresearch.org/). CONCLUSIONS: An algorithmic, EHR-based approach distinguished six phenotypes of CAD with distinct survival and PRS associations, supporting adoption of open approaches to help standardize CAD phenotyping and its wider potential value for reproducible research in other conditions.

Coronary Artery Disease , Biological Specimen Banks , Cohort Studies , Coronary Artery Disease/epidemiology , Coronary Artery Disease/genetics , Genetic Predisposition to Disease , Humans , Phenotype , Prospective Studies , Reproducibility of Results , Risk Factors , United Kingdom/epidemiology

9.

Prognostic Significance of Ventricular Arrhythmias in 13 444 Patients With Acute Coronary Syndrome: A Retrospective Cohort Study Based on Routine Clinical Data (NIHR Health Informatics Collaborative VA-ACS Study).

Sau, Arunashis; Kaura, Amit; Ahmed, Amar; Patel, Kiran H K; Li, Xinyang; Mulla, Abdulrahim; Glampson, Benjamin; Panoulas, Vasileios; Davies, Jim; Woods, Kerrie; Gautama, Sanjay; Shah, Anoop D; Elliott, Paul; Hemingway, Harry; Williams, Bryan; Asselbergs, Folkert W; Melikian, Narbeh; Peters, Nicholas S; Shah, Ajay M; Perera, Divaka; Kharbanda, Rajesh; Patel, Riyaz S; Channon, Keith M; Mayet, Jamil; Ng, Fu Siong.

J Am Heart Assoc ; 11(6): e024260, 2022 03 15.

Article En | MEDLINE | ID: mdl-35258317

Background A minority of acute coronary syndrome (ACS) cases are associated with ventricular arrhythmias (VA) and/or cardiac arrest (CA). We investigated the effect of VA/CA at the time of ACS on long-term outcomes. Methods and Results We analyzed routine clinical data from 5 National Health Service trusts in the United Kingdom, collected between 2010 and 2017 by the National Institute for Health Research Health Informatics Collaborative. A total of 13 444 patients with ACS, 376 (2.8%) of whom had concurrent VA, survived to hospital discharge and were followed up for a median of 3.42 years. Patients with VA or CA at index presentation had significantly increased risks of subsequent VA during follow-up (VA group: adjusted hazard ratio [HR], 4.15 [95% CI, 2.42-7.09]; CA group: adjusted HR, 2.60 [95% CI, 1.23-5.48]). Patients who suffered a CA in the context of ACS and survived to discharge also had a 36% increase in long-term mortality (adjusted HR, 1.36 [95% CI, 1.04-1.78]), although the concurrent diagnosis of VA alone during ACS did not affect all-cause mortality (adjusted HR, 1.03 [95% CI, 0.80-1.33]). Conclusions Patients who develop VA or CA during ACS who survive to discharge have increased risks of subsequent VA, whereas those who have CA during ACS also have an increase in long-term mortality. These individuals may represent a subgroup at greater risk of subsequent arrhythmic events as a result of intrinsically lower thresholds for developing VA.

Acute Coronary Syndrome , Medical Informatics , Acute Coronary Syndrome/complications , Acute Coronary Syndrome/diagnosis , Acute Coronary Syndrome/epidemiology , Arrhythmias, Cardiac/complications , Arrhythmias, Cardiac/diagnosis , Arrhythmias, Cardiac/epidemiology , Humans , Prognosis , Retrospective Studies , State Medicine

10.

Mortality risk prediction of high-sensitivity C-reactive protein in suspected acute coronary syndrome: A cohort study.

Kaura, Amit; Hartley, Adam; Panoulas, Vasileios; Glampson, Ben; Shah, Anoop S V; Davies, Jim; Mulla, Abdulrahim; Woods, Kerrie; Omigie, Joe; Shah, Anoop D; Thursz, Mark R; Elliott, Paul; Hemmingway, Harry; Williams, Bryan; Asselbergs, Folkert W; O'Sullivan, Michael; Lord, Graham M; Trickey, Adam; Sterne, Jonathan Ac; Haskard, Dorian O; Melikian, Narbeh; Francis, Darrel P; Koenig, Wolfgang; Shah, Ajay M; Kharbanda, Rajesh; Perera, Divaka; Patel, Riyaz S; Channon, Keith M; Mayet, Jamil; Khamis, Ramzi.

PLoS Med ; 19(2): e1003911, 2022 02.

Article En | MEDLINE | ID: mdl-35192610

BACKGROUND: There is limited evidence on the use of high-sensitivity C-reactive protein (hsCRP) as a biomarker for selecting patients for advanced cardiovascular (CV) therapies in the modern era. The prognostic value of mildly elevated hsCRP beyond troponin in a large real-world cohort of unselected patients presenting with suspected acute coronary syndrome (ACS) is unknown. We evaluated whether a mildly elevated hsCRP (up to 15 mg/L) was associated with mortality risk, beyond troponin level, in patients with suspected ACS. METHODS AND FINDINGS: We conducted a retrospective cohort study based on the National Institute for Health Research Health Informatics Collaborative data of 257,948 patients with suspected ACS who had a troponin measured at 5 cardiac centres in the United Kingdom between 2010 and 2017. Patients were divided into 4 hsCRP groups (<2, 2 to 4.9, 5 to 9.9, and 10 to 15 mg/L). The main outcome measure was mortality within 3 years of index presentation. The association between hsCRP levels and all-cause mortality was assessed using multivariable Cox regression analysis adjusted for age, sex, haemoglobin, white cell count (WCC), platelet count, creatinine, and troponin. Following the exclusion criteria, there were 102,337 patients included in the analysis (hsCRP <2 mg/L (n = 38,390), 2 to 4.9 mg/L (n = 27,397), 5 to 9.9 mg/L (n = 26,957), and 10 to 15 mg/L (n = 9,593)). On multivariable Cox regression analysis, there was a positive and graded relationship between hsCRP level and mortality at baseline, which remained at 3 years (hazard ratio (HR) (95% CI) of 1.32 (1.18 to 1.48) for those with hsCRP 2.0 to 4.9 mg/L and 1.40 (1.26 to 1.57) and 2.00 (1.75 to 2.28) for those with hsCRP 5 to 9.9 mg/L and 10 to 15 mg/L, respectively). This relationship was independent of troponin in all suspected ACS patients and was further verified in those who were confirmed to have an ACS diagnosis by clinical coding. 
The main limitation of our study is that we did not have data on underlying cause of death; however, the exclusion of those with abnormal WCC or hsCRP levels >15 mg/L makes it unlikely that sepsis was a major contributor. CONCLUSIONS: These multicentre, real-world data from a large cohort of patients with suspected ACS suggest that mildly elevated hsCRP (up to 15 mg/L) may be a clinically meaningful prognostic marker beyond troponin and point to its potential utility in selecting patients for novel treatments targeting inflammation. TRIAL REGISTRATION: ClinicalTrials.gov - NCT03507309.

Acute Coronary Syndrome/blood , Acute Coronary Syndrome/mortality , C-Reactive Protein/metabolism , Acute Coronary Syndrome/diagnosis , Aged , Aged, 80 and over , Biomarkers/blood , Cohort Studies , Female , Follow-Up Studies , Humans , Longitudinal Studies , Male , Middle Aged , Mortality/trends , Predictive Value of Tests , Retrospective Studies , Risk Factors , United Kingdom/epidemiology

11.

An informatics consult approach for generating clinical evidence for treatment decisions.

Lai, Alvina G; Chang, Wai Hoong; Parisinos, Constantinos A; Katsoulis, Michail; Blackburn, Ruth M; Shah, Anoop D; Nguyen, Vincent; Denaxas, Spiros; Davey Smith, George; Gaunt, Tom R; Nirantharakumar, Krishnarajah; Cox, Murray P; Forde, Donall; Asselbergs, Folkert W; Harris, Steve; Richardson, Sylvia; Sofat, Reecha; Dobson, Richard J B; Hingorani, Aroon; Patel, Riyaz; Sterne, Jonathan; Banerjee, Amitava; Denniston, Alastair K; Ball, Simon; Sebire, Neil J; Shah, Nigam H; Foster, Graham R; Williams, Bryan; Hemingway, Harry.

BMC Med Inform Decis Mak ; 21(1): 281, 2021 10 12.

Article En | MEDLINE | ID: mdl-34641870

BACKGROUND: An Informatics Consult has been proposed in which clinicians request novel evidence from large scale health data resources, tailored to the treatment of a specific patient. However, the availability of such consultations is lacking. We seek to provide an Informatics Consult for a situation where a treatment indication and contraindication coexist in the same patient, i.e., anti-coagulation use for stroke prevention in a patient with both atrial fibrillation (AF) and liver cirrhosis. METHODS: We examined four sources of evidence for the effect of warfarin on stroke risk or all-cause mortality from: (1) randomised controlled trials (RCTs), (2) meta-analysis of prior observational studies, (3) trial emulation (using population electronic health records (N = 3,854,710)) and (4) genetic evidence (Mendelian randomisation). We developed prototype forms to request an Informatics Consult and return of results in electronic health record systems. RESULTS: We found 0 RCT reports and 0 trials recruiting for patients with AF and cirrhosis. We found broad concordance across the three new sources of evidence we generated. Meta-analysis of prior observational studies showed that warfarin use was associated with lower stroke risk (hazard ratio [HR] = 0.71, CI 0.39-1.29). In a target trial emulation, warfarin was associated with lower all-cause mortality (HR = 0.61, CI 0.49-0.76) and ischaemic stroke (HR = 0.27, CI 0.08-0.91). Mendelian randomisation served as a drug target validation where we found that lower levels of vitamin K1 (warfarin is a vitamin K1 antagonist) are associated with lower stroke risk. A pilot survey with an independent sample of 34 clinicians revealed that 85% of clinicians found information on prognosis useful and that 79% thought that they should have access to the Informatics Consult as a service within their healthcare systems. We identified candidate steps for automation to scale evidence generation and to accelerate the return of results. 
CONCLUSION: We performed a proof-of-concept Informatics Consult for evidence generation, which may inform treatment decisions in situations where there is a dearth of randomised trials. Patients are surprised to know that their clinicians are currently not able to learn in clinic from data on 'patients like me'. We identify the key challenges in offering such an Informatics Consult as a service.

Atrial Fibrillation , Stroke , Anticoagulants/therapeutic use , Atrial Fibrillation/drug therapy , Humans , Informatics , Referral and Consultation , Stroke/drug therapy , Treatment Outcome , Warfarin/therapeutic use

12.

Descriptors of Sepsis Using the Sepsis-3 Criteria: A Cohort Study in Critical Care Units Within the U.K. National Institute for Health Research Critical Care Health Informatics Collaborative.

Shah, Anoop D; MacCallum, Niall S; Harris, Steve; Brealey, David A; Palmer, Edward; Hetherington, James; Shi, Sinan; Perez-Suarez, David; Ercole, Ari; Watkinson, Peter J; Jones, Andrew; Ashworth, Simon; Beale, Richard; Brett, Stephen J; Singer, Mervyn.

Crit Care Med ; 49(11): 1883-1894, 2021 11 01.

Article En | MEDLINE | ID: mdl-34259454

OBJECTIVES: To describe the epidemiology of sepsis in critical care by applying the Sepsis-3 criteria to electronic health records. DESIGN: Retrospective cohort study using electronic health records. SETTING: Ten ICUs from four U.K. National Health Service hospital trusts contributing to the National Institute for Health Research Critical Care Health Informatics Collaborative. PATIENTS: A total of 28,456 critical care admissions (14,332 emergency medical, 4,585 emergency surgical, and 9,539 elective surgical). MEASUREMENTS AND MAIN RESULTS: Twenty-nine thousand three hundred forty-three episodes of clinical deterioration were identified with a rise in Sequential Organ Failure Assessment score of at least 2 points, of which 14,869 (50.7%) were associated with antibiotic escalation and thereby met the Sepsis-3 criteria for sepsis. A total of 4,100 episodes of sepsis (27.6%) were associated with vasopressor use and lactate greater than 2.0 mmol/L, and therefore met the Sepsis-3 criteria for septic shock. ICU mortality by source of sepsis was highest for ICU-acquired sepsis (23.7%; 95% CI, 21.9-25.6%), followed by hospital-acquired sepsis (18.6%; 95% CI, 17.5-19.9%), and community-acquired sepsis (12.9%; 95% CI, 12.1-13.6%) (p for comparison less than 0.0001). CONCLUSIONS: We successfully operationalized the Sepsis-3 criteria to an electronic health record dataset to describe the characteristics of critical care patients with sepsis. This may facilitate sepsis research using electronic health record data at scale without relying on human coding.

Critical Care/statistics & numerical data , Cross Infection/mortality , Organ Dysfunction Scores , Sepsis/mortality , Sepsis/therapy , Severity of Illness Index , Adult , Aged , Cohort Studies , Cross Infection/therapy , Female , Humans , Intensive Care Units , Male , Middle Aged , Retrospective Studies , Shock, Septic/mortality , State Medicine
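Illustrative note on the abstract above: it describes a rule cascade in which a rise in Sequential Organ Failure Assessment (SOFA) score of at least 2 points marks clinical deterioration, antibiotic escalation on top of that marks sepsis, and vasopressor use with lactate greater than 2.0 mmol/L marks septic shock. The sketch below encodes only those rules as stated in the abstract. The `Episode` fields and the `classify` function are hypothetical names, and the sketch does not attempt to reproduce how the study derived each input from the electronic health record.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    sofa_rise: int               # rise in SOFA score, in points
    antibiotic_escalation: bool  # antibiotics started or escalated
    vasopressor: bool            # vasopressor in use
    lactate_mmol_l: float        # lactate, mmol/L


def classify(episode: Episode) -> str:
    """Apply the thresholds quoted in the abstract, in order."""
    if episode.sofa_rise < 2:
        return "no deterioration"
    if not episode.antibiotic_escalation:
        return "deterioration without sepsis"
    if episode.vasopressor and episode.lactate_mmol_l > 2.0:
        return "septic shock"
    return "sepsis"


# Made-up episodes for illustration only.
examples = [
    Episode(1, True, False, 1.0),
    Episode(3, False, False, 1.5),
    Episode(2, True, False, 3.1),
    Episode(4, True, True, 2.4),
]
for ep in examples:
    print(classify(ep))
```

In the abstract, septic shock episodes are counted as a subset of sepsis episodes, so a tally of sepsis under this sketch would add the "sepsis" and "septic shock" labels together.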

13.

Multi-domain clinical natural language processing with MedCAT: The Medical Concept Annotation Toolkit.

Kraljevic, Zeljko; Searle, Thomas; Shek, Anthony; Roguski, Lukasz; Noor, Kawsar; Bean, Daniel; Mascio, Aurelie; Zhu, Leilei; Folarin, Amos A; Roberts, Angus; Bendayan, Rebecca; Richardson, Mark P; Stewart, Robert; Shah, Anoop D; Wong, Wai Keong; Ibrahim, Zina; Teo, James T; Dobson, Richard J B.

Artif Intell Med ; 117: 102083, 2021 07.

Article En | MEDLINE | ID: mdl-34127232

Electronic health records (EHR) contain large volumes of unstructured text, requiring the application of information extraction (IE) technologies to enable clinical analysis. We present the open source Medical Concept Annotation Toolkit (MedCAT) that provides: (a) a novel self-supervised machine learning algorithm for extracting concepts using any concept vocabulary including UMLS/SNOMED-CT; (b) a feature-rich annotation interface for customizing and training IE models; and (c) integrations to the broader CogStack ecosystem for vendor-agnostic health system deployment. We show improved performance in extracting UMLS concepts from open datasets (F1:0.448-0.738 vs 0.429-0.650). Further real-world validation demonstrates SNOMED-CT extraction at 3 large London hospitals with self-supervised training over ~8.8B words from ~17M clinical records and further fine-tuning with ~6K clinician annotated examples. We show strong transferability (F1 > 0.94) between hospitals, datasets and concept types indicating cross-domain EHR-agnostic utility for accelerated clinical and research use cases.

Natural Language Processing , Systematized Nomenclature of Medicine , Electronic Health Records , Information Storage and Retrieval , Unified Medical Language System

14.

Data gaps in electronic health record (EHR) systems: An audit of problem list completeness during the COVID-19 pandemic.

Poulos, Jordan; Zhu, Leilei; Shah, Anoop D.

Int J Med Inform ; 150: 104452, 2021 06.

Article En | MEDLINE | ID: mdl-33864979

OBJECTIVE: To evaluate the completeness of diagnosis recording in problem lists in a hospital electronic health record (EHR) system during the COVID-19 pandemic. DESIGN: Retrospective chart review with manual review of free text electronic case notes. SETTING: Major teaching hospital trust in London, one year after the launch of a comprehensive EHR system (Epic), during the first peak of the COVID-19 pandemic in the UK. PARTICIPANTS: 516 patients with suspected or confirmed COVID-19. MAIN OUTCOME MEASURES: Percentage of diagnoses already included in the structured problem list. RESULTS: Prior to review, these patients had a combined total of 2841 diagnoses recorded in their EHR problem lists. 1722 additional diagnoses were identified, increasing the mean number of recorded problems per patient from 5.51 to 8.84. The overall percentage of diagnoses originally included in the problem list was 62.3% (2841 / 4563, 95% confidence interval 60.8%, 63.7%). CONCLUSIONS: Diagnoses and other clinical information stored in a structured way in electronic health records is extremely useful for supporting clinical decisions, improving patient care and enabling better research. However, recording of medical diagnoses on the structured problem list for inpatients is incomplete, with almost 40% of important diagnoses mentioned only in the free text notes.

COVID-19 , Electronic Health Records , Humans , Pandemics , Retrospective Studies , SARS-CoV-2
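
The headline figures in the abstract above can be reproduced from the reported counts. The sketch below recomputes the completeness percentage, its 95% confidence interval and the mean number of problems per patient. The abstract does not say which interval method was used, so the Wilson score interval here is an assumption, and the function name is illustrative.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


patients = 516
recorded = 2841    # diagnoses already on the problem list
additional = 1722  # diagnoses found only in free text
all_diagnoses = recorded + additional

p, lower, upper = wilson_interval(recorded, all_diagnoses)
print(f"completeness: {p:.1%} (95% CI {lower:.1%}, {upper:.1%})")
print(f"mean problems per patient: {recorded / patients:.2f} -> "
      f"{all_diagnoses / patients:.2f}")
```

Under this assumption the interval agrees with the reported one to one decimal place. A simple normal-approximation interval gives a slightly different lower bound.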

15.

Invasive versus non-invasive management of older patients with non-ST elevation myocardial infarction (SENIOR-NSTEMI): a cohort study based on routine clinical data.

Kaura, Amit; Sterne, Jonathan A C; Trickey, Adam; Abbott, Sam; Mulla, Abdulrahim; Glampson, Benjamin; Panoulas, Vasileios; Davies, Jim; Woods, Kerrie; Omigie, Joe; Shah, Anoop D; Channon, Keith M; Weber, Jonathan N; Thursz, Mark R; Elliott, Paul; Hemingway, Harry; Williams, Bryan; Asselbergs, Folkert W; O'Sullivan, Michael; Lord, Graham M; Melikian, Narbeh; Johnson, Thomas; Francis, Darrel P; Shah, Ajay M; Perera, Divaka; Kharbanda, Rajesh; Patel, Riyaz S; Mayet, Jamil.

Lancet ; 396(10251): 623-634, 2020 08 29.

Article En | MEDLINE | ID: mdl-32861307

BACKGROUND: Previous trials suggest lower long-term risk of mortality after invasive rather than non-invasive management of patients with non-ST elevation myocardial infarction (NSTEMI), but the trials excluded very elderly patients. We aimed to estimate the effect of invasive versus non-invasive management within 3 days of peak troponin concentration on the survival of patients aged 80 years or older with NSTEMI. METHODS: Routine clinical data for this study were obtained from five collaborating hospitals hosting NIHR Biomedical Research Centres in the UK (all tertiary centres with emergency departments). Eligible patients were 80 years old or older when they underwent troponin measurements and were diagnosed with NSTEMI between 2010 (2008 for University College Hospital) and 2017. Propensity scores (patients' estimated probability of receiving invasive management) based on pretreatment variables were derived using logistic regression; patients with high probabilities of non-invasive or invasive management were excluded. Patients who died within 3 days of peak troponin concentration without receiving invasive management were assigned to the invasive or non-invasive management groups based on their propensity scores, to mitigate immortal time bias. We estimated mortality hazard ratios comparing invasive with non-invasive management, and compared the rate of hospital admissions for heart failure. FINDINGS: Of the 1976 patients with NSTEMI, 101 died within 3 days of their peak troponin concentration and 375 were excluded because of extreme propensity scores. The remaining 1500 patients had a median age of 86 (IQR 82-89) years, of whom 845 (56%) received non-invasive management. During median follow-up of 3·0 (IQR 1·2-4·8) years, 613 (41%) patients died. The adjusted cumulative 5-year mortality was 36% in the invasive management group and 55% in the non-invasive management group (adjusted hazard ratio 0·68, 95% CI 0·55-0·84). Invasive management was associated with lower incidence of hospital admissions for heart failure (adjusted rate ratio compared with non-invasive management 0·67, 95% CI 0·48-0·93). INTERPRETATION: The survival advantage of invasive compared with non-invasive management appears to extend to patients with NSTEMI who are aged 80 years or older. FUNDING: NIHR Imperial Biomedical Research Centre, as part of the NIHR Health Informatics Collaborative.

Non-ST Elevated Myocardial Infarction/mortality , Non-ST Elevated Myocardial Infarction/therapy , Age Factors , Aged, 80 and over , Cohort Studies , Female , Hospitalization , Humans , Logistic Models , Male , Non-ST Elevated Myocardial Infarction/diagnosis , Propensity Score , Survival Rate , Troponin/blood , United Kingdom

16.

Prognostic significance of troponin level in 3121 patients presenting with atrial fibrillation (The NIHR Health Informatics Collaborative TROP-AF study).

Kaura, Amit; Arnold, Ahran D; Panoulas, Vasileios; Glampson, Benjamin; Davies, Jim; Mulla, Abdulrahim; Woods, Kerrie; Omigie, Joe; Shah, Anoop D; Channon, Keith M; Weber, Jonathan N; Thursz, Mark R; Elliott, Paul; Hemingway, Harry; Williams, Bryan; Asselbergs, Folkert W; O'Sullivan, Michael; Lord, Graham M; Melikian, Narbeh; Lefroy, David C; Francis, Darrel P; Shah, Ajay M; Kharbanda, Rajesh; Perera, Divaka; Patel, Riyaz S; Mayet, Jamil.

J Am Heart Assoc ; 9(7): e013684, 2020 04 07.

Article En | MEDLINE | ID: mdl-32212911

Background Patients presenting with atrial fibrillation (AF) often undergo a blood test to measure troponin, but interpretation of the result is impeded by uncertainty about its clinical importance. We investigated the relationship between troponin level, coronary angiography, and all-cause mortality in real-world patients presenting with AF. Methods and Results We used National Institute of Health Research Health Informatics Collaborative data to identify patients admitted between 2010 and 2017 at 5 tertiary centers in the United Kingdom with a primary diagnosis of AF. Peak troponin results were scaled as multiples of the upper limit of normal. A total of 3121 patients were included in the analysis. Over a median follow-up of 1462 (interquartile range, 929-1975) days, there were 586 deaths (18.8%). The adjusted hazard ratio for mortality associated with a positive troponin (value above upper limit of normal) was 1.20 (95% CI, 1.01-1.43; P<0.05). Higher troponin levels were associated with higher risk of mortality, reaching a maximum hazard ratio of 2.6 (95% CI, 1.9-3.4) at ≈250 multiples of the upper limit of normal. There was an exponential relationship between higher troponin levels and increased odds of coronary angiography. The mortality risk was 36% lower in patients undergoing coronary angiography than in those who did not (adjusted hazard ratio, 0.61; 95% CI, 0.42-0.89; P=0.01). Conclusions Increased troponin was associated with increased risk of mortality in patients presenting with AF. The lower hazard ratio in patients undergoing invasive management raises the possibility that the clinical importance of troponin release in AF may be mediated by coronary artery disease, which may be responsive to revascularization.

Atrial Fibrillation/blood , Coronary Artery Disease/blood , Troponin/blood , Aged , Aged, 80 and over , Atrial Fibrillation/diagnosis , Atrial Fibrillation/mortality , Biomarkers/blood , Coronary Angiography , Coronary Artery Disease/diagnosis , Coronary Artery Disease/mortality , England , Female , Humans , Male , Middle Aged , Predictive Value of Tests , Prognosis , Retrospective Studies , Risk Assessment , Risk Factors , Time Factors , Up-Regulation
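
The scaling step mentioned in the abstract above (peak troponin expressed as multiples of the upper limit of normal, with a result above that limit counted as positive) can be illustrated as follows. This is a minimal sketch. The function names and the example values are hypothetical, and in practice the upper limit of normal depends on the assay used at each centre.

```python
def scale_troponin(peak: float, uln: float) -> float:
    """Express a peak troponin result as a multiple of the assay's ULN."""
    if uln <= 0:
        raise ValueError("upper limit of normal must be positive")
    return peak / uln


def is_positive(peak: float, uln: float) -> bool:
    """A result is 'positive' when it exceeds the upper limit of normal."""
    return scale_troponin(peak, uln) > 1.0


# Hypothetical results: (peak, assay ULN), both in the same units
for peak, uln in [(7.0, 14.0), (28.0, 14.0)]:
    print(scale_troponin(peak, uln), is_positive(peak, uln))
```

Scaling by the assay-specific limit is what allows results from different assays and centres to be pooled on one axis.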

17.

Natural Language Processing for Mimicking Clinical Trial Recruitment in Critical Care: A Semi-Automated Simulation Based on the LeoPARDS Trial.

Tissot, Hegler C; Shah, Anoop D; Brealey, David; Harris, Steve; Agbakoba, Ruth; Folarin, Amos; Romao, Luis; Roguski, Lukasz; Dobson, Richard; Asselbergs, Folkert W.

IEEE J Biomed Health Inform ; 24(10): 2950-2959, 2020 10.

Article En | MEDLINE | ID: mdl-32149659

Clinical trials often fail to recruit an adequate number of appropriate patients. Identifying eligible trial participants is resource-intensive when relying on manual review of clinical notes, particularly in critical care settings where the time window is short. Automated review of electronic health records (EHR) may help, but much of the information is in free text rather than a computable form. We applied natural language processing (NLP) to free text EHR data using the CogStack platform to simulate recruitment into the LeoPARDS study, a clinical trial aiming to reduce organ dysfunction in septic shock. We applied an algorithm to identify eligible patients using a moving 1-hour time window, and compared patients identified by our approach with those actually screened and recruited for the trial, for the time period that data were available. We manually reviewed records of a random sample of patients identified by the algorithm but not screened in the original trial. Our method identified 376 patients, including 34 patients with EHR data available who were actually recruited to LeoPARDS in our centre. The sensitivity of CogStack for identifying patients screened was 90% (95% CI 85%, 93%). Of the 203 patients identified by both manual screening and CogStack, the index date matched in 95 (47%) and CogStack was earlier in 94 (47%). In conclusion, analysis of EHR data using NLP could effectively replicate recruitment in a critical care trial, and identify some eligible patients at an earlier stage, potentially improving trial recruitment if implemented in real time.

Clinical Trials as Topic , Data Mining/methods , Electronic Health Records , Natural Language Processing , Patient Selection , Adult , Computer Simulation , Critical Care , Female , Humans , Male

18.

A semi-supervised approach for rapidly creating clinical biomarker phenotypes in the UK Biobank using different primary care EHR and clinical terminology systems.

Denaxas, Spiros; Shah, Anoop D; Mateen, Bilal A; Kuan, Valerie; Quint, Jennifer K; Fitzpatrick, Natalie; Torralbo, Ana; Fatemifar, Ghazaleh; Hemingway, Harry.

JAMIA Open ; 3(4): 545-556, 2020 Dec.

Article En | MEDLINE | ID: mdl-33619467

OBJECTIVES: The UK Biobank (UKB) is making primary care electronic health records (EHRs) for 500 000 participants available for COVID-19-related research. Data are extracted from four sources, recorded using five clinical terminologies and stored in different schemas. The aims of our research were to: (a) develop a semi-supervised approach for bootstrapping EHR phenotyping algorithms in UKB EHR, and (b) to evaluate our approach by implementing and evaluating phenotypes for 31 common biomarkers. MATERIALS AND METHODS: We describe an algorithmic approach to phenotyping biomarkers in primary care EHR involving (a) bootstrapping definitions using existing phenotypes, (b) excluding generic, rare, or semantically distant terms, (c) forward-mapping terminology terms, (d) expert review, and (e) data extraction. We evaluated the phenotypes by assessing the ability to reproduce known epidemiological associations with all-cause mortality using Cox proportional hazards models. RESULTS: We created and evaluated phenotyping algorithms for 31 biomarkers many of which are directly related to COVID-19 complications, for example diabetes, cardiovascular disease, respiratory disease. Our algorithm identified 1651 Read v2 and Clinical Terms Version 3 terms and automatically excluded 1228 terms. Clinical review excluded 103 terms and included 44 terms, resulting in 364 terms for data extraction (sensitivity 0.89, specificity 0.92). We extracted 38 190 682 events and identified 220 978 participants with at least one biomarker measured. DISCUSSION AND CONCLUSION: Bootstrapping phenotyping algorithms from similar EHR can potentially address pre-existing methodological concerns that undermine the outputs of biomarker discovery pipelines and provide research-quality phenotyping algorithms.

19.

Bleeding in cardiac patients prescribed antithrombotic drugs: electronic health record phenotyping algorithms, incidence, trends and prognosis.

Pasea, Laura; Chung, Sheng-Chia; Pujades-Rodriguez, Mar; Shah, Anoop D; Alvarez-Madrazo, Samantha; Allan, Victoria; Teo, James T; Bean, Daniel; Sofat, Reecha; Dobson, Richard; Banerjee, Amitava; Patel, Riyaz S; Timmis, Adam; Denaxas, Spiros; Hemingway, Harry.

BMC Med ; 17(1): 206, 2019 11 20.

Article En | MEDLINE | ID: mdl-31744503

BACKGROUND: Clinical guidelines and public health authorities lack recommendations on scalable approaches to defining and monitoring the occurrence and severity of bleeding in populations prescribed antithrombotic therapy. METHODS: We examined linked primary care, hospital admission and death registry electronic health records (CALIBER 1998-2010, England) of patients with newly diagnosed atrial fibrillation, acute myocardial infarction, unstable angina or stable angina with the aim to develop algorithms for bleeding events. Using the developed bleeding phenotypes, Kaplan-Meier plots were used to estimate the incidence of bleeding events and we used Cox regression models to assess the prognosis for all-cause mortality, atherothrombotic events and further bleeding. RESULTS: We present electronic health record phenotyping algorithms for bleeding based on bleeding diagnosis in primary or hospital care, symptoms, transfusion, surgical procedures and haemoglobin values. In validation of the phenotype, we estimated a positive predictive value of 0.88 (95% CI 0.64, 0.99) for hospitalised bleeding. Amongst 128,815 patients, 27,259 (21.2%) had at least 1 bleeding event, with 5-year risks of bleeding of 29.1%, 21.9%, 25.3% and 23.4% following diagnoses of atrial fibrillation, acute myocardial infarction, unstable angina and stable angina, respectively. Rates of hospitalised bleeding per 1000 patients more than doubled from 1.02 (95% CI 0.83, 1.22) in January 1998 to 2.68 (95% CI 2.49, 2.88) in December 2009 coinciding with the increased rates of antiplatelet and vitamin K antagonist prescribing. Patients with hospitalised bleeding and primary care bleeding, with or without markers of severity, were at increased risk of all-cause mortality and atherothrombotic events compared to those with no bleeding. 
For example, the hazard ratio for all-cause mortality was 1.98 (95% CI 1.86, 2.11) for primary care bleeding with markers of severity and 1.99 (95% CI 1.92, 2.05) for hospitalised bleeding without markers of severity, compared to patients with no bleeding. CONCLUSIONS: Electronic health record bleeding phenotyping algorithms offer a scalable approach to monitoring bleeding in the population. Bleeding has doubled in incidence since 1998, affects one in four cardiovascular disease patients, and is associated with poor prognosis. Efforts are required to tackle this iatrogenic epidemic.

Anticoagulants/adverse effects , Heart Diseases/drug therapy , Hemorrhage/chemically induced , Aged , Algorithms , Anticoagulants/therapeutic use , Antithrombins/adverse effects , Electronic Health Records , England , Female , Hemorrhage/epidemiology , Humans , Incidence , Male , Prognosis , Risk Factors

20.

Natural language processing for disease phenotyping in UK primary care records for research: a pilot study in myocardial infarction and death.

Shah, Anoop D; Bailey, Emily; Williams, Tim; Denaxas, Spiros; Dobson, Richard; Hemingway, Harry.

J Biomed Semantics ; 10(Suppl 1): 20, 2019 11 12.

Article En | MEDLINE | ID: mdl-31711543

BACKGROUND: Free text in electronic health records (EHR) may contain additional phenotypic information beyond structured (coded) information. For major health events - heart attack and death - there is a lack of studies evaluating the extent to which free text in the primary care record might add information. Our objectives were to describe the contribution of free text in primary care to the recording of information about myocardial infarction (MI), including subtype, left ventricular function, laboratory results and symptoms; and recording of cause of death. We used the CALIBER EHR research platform which contains primary care data from the Clinical Practice Research Datalink (CPRD) linked to hospital admission data, the MINAP registry of acute coronary syndromes and the death registry. In CALIBER we randomly selected 2000 patients with MI and 1800 deaths. We implemented a rule-based natural language engine, the Freetext Matching Algorithm, on site at CPRD to analyse free text in the primary care record without raw data being released to researchers. We analysed text recorded within 90 days before or 90 days after the MI, and on or after the date of death. RESULTS: We extracted 10,927 diagnoses, 3658 test results, 3313 statements of negation, and 850 suspected diagnoses from the myocardial infarction patients. Inclusion of free text increased the recorded proportion of patients with chest pain in the week prior to MI from 19 to 27%, and differentiated between MI subtypes in a quarter more patients than structured data alone. Cause of death was incompletely recorded in primary care; in 36% the cause was in coded data and in 21% it was in free text. Only 47% of patients had exactly the same cause of death in primary care and the death registry, but this did not differ between coded and free text causes of death. 
CONCLUSIONS: Among patients who suffer MI or die, unstructured free text in primary care records contains much information that is potentially useful for research such as symptoms, investigation results and specific diagnoses. Access to large scale unstructured data in electronic health records (millions of patients) might yield important insights.

Electronic Health Records , Myocardial Infarction/mortality , Natural Language Processing , Phenotype , Primary Health Care/statistics & numerical data , Data Mining , Humans , Pilot Projects , United Kingdom