Search | VHL Regional Portal

1.

Artificial intelligence-assisted automated heart failure detection and classification from electronic health records.

Oo, Mon Myat; Gao, Chuang; Cole, Christian; Hummel, Yoran; Guignard-Duff, Magalie; Jefferson, Emily; Hare, James; Voors, Adriaan A; de Boer, Rudolf A; Lam, Carolyn S P; Mordi, Ify R; Tromp, Jasper; Lang, Chim C.

ESC Heart Fail ; 2024 May 03.

Article in English | MEDLINE | ID: mdl-38700133

ABSTRACT

AIMS: Electronic health records (EHR) linked to Digital Imaging and Communications in Medicine (DICOM), biological specimens, and deep learning (DL) algorithms could potentially improve patient care through automated case detection and surveillance. We hypothesized that by applying keyword searches to routinely stored EHR, in conjunction with AI-powered automated reading of DICOM echocardiography images and analysing biomarkers from routinely stored plasma samples, we would be able to identify heart failure (HF) patients. METHODS AND RESULTS: We used EHR data between 1993 and 2021 from Tayside and Fife (~20% of the Scottish population). We applied a keyword search strategy, complemented by filtering based on International Classification of Diseases (ICD) codes and prescription data, to the EHR data set. We then applied DL for the automated interpretation of echocardiographic DICOM images. These methods were then integrated with the analysis of routinely stored plasma samples to identify and categorize patients into HF with reduced ejection fraction (HFrEF), HF with preserved ejection fraction (HFpEF), and controls without HF. The final diagnosis was verified through a manual review of medical records, measured natriuretic peptides in stored blood samples, and by comparing clinical outcomes among groups. In our study, we selected the patient cohort through an algorithmic workflow. This process started with 60 850 EHR data and resulted in a final cohort of 578 patients, divided into 186 controls, 236 with HFpEF, and 156 with HFrEF, after excluding individuals with mismatched data or significant valvular heart disease. The analysis of baseline characteristics revealed that compared with controls, patients with HFrEF and HFpEF were generally older, had higher BMI, and showed a greater prevalence of co-morbidities such as diabetes, COPD, and CKD.
Echocardiographic analysis, enhanced by DL, provided high coverage, and detailed insights into cardiac function, showing significant differences in parameters such as left ventricular diameter, ejection fraction, and myocardial strain among the groups. Clinical outcomes highlighted a higher risk of hospitalization and mortality for HF patients compared with controls, with particularly elevated risk ratios for both HFrEF and HFpEF groups. The concordance between the algorithmic selection of patients and manual validation demonstrated high accuracy, supporting the effectiveness of our approach in identifying and classifying HF subtypes, which could significantly impact future HF diagnosis and management strategies. CONCLUSIONS: Our study highlights the feasibility of combining keyword searches in EHR, DL automated echocardiographic interpretation, and biobank resources to identify HF subtypes.

2.

Association of adrenal steroids with metabolomic profiles in patients with primary and endocrine hypertension.

Knuchel, Robin; Erlic, Zoran; Gruber, Sven; Amar, Laurence; Larsen, Casper K; Gimenez-Roqueplo, Anne-Paule; Mulatero, Paolo; Tetti, Martina; Pecori, Alessio; Pamporaki, Christina; Langton, Katharina; Peitzsch, Mirko; Ceccato, Filippo; Prejbisz, Aleksander; Januszewicz, Andrzej; Adolf, Christian; Remde, Hanna; Lenzini, Livia; Dennedy, Michael; Deinum, Jaap; Jefferson, Emily; Blanchard, Anne; Zennaro, Maria-Christina; Eisenhofer, Graeme; Beuschlein, Felix.

Front Endocrinol (Lausanne) ; 15: 1370525, 2024.

Article in English | MEDLINE | ID: mdl-38596218

ABSTRACT

Introduction: Endocrine hypertension (EHT) due to pheochromocytoma/paraganglioma (PPGL), Cushing's syndrome (CS), or primary aldosteronism (PA) is linked to a variety of metabolic alterations and comorbidities. Accordingly, patients with EHT and primary hypertension (PHT) are characterized by distinct metabolic profiles. However, it remains unclear whether the metabolomic differences relate solely to the disease-defining hormonal parameters. Therefore, our objective was to study the association of disease-defining hormonal excess and concomitant adrenal steroids with metabolomic alterations in patients with EHT. Methods: Retrospective European multicenter study of 263 patients (mean age 49 years, 50% females; 58 PHT, 69 PPGL, 37 CS, 99 PA) in whom targeted metabolomic and adrenal steroid profiling was available. The association of 13 adrenal steroids with differences in 79 metabolites between PPGL, CS, PA and PHT was examined after correction for age, sex, BMI, and presence of diabetes mellitus. Results: After adjustment for BMI and diabetes mellitus, significant associations between adrenal steroids and metabolites (18 in PPGL, 15 in CS, and 23 in PA) were revealed. In PPGL, the majority of metabolite associations were linked to catecholamine excess, whereas in PA, only one metabolite was associated with aldosterone. In contrast, cortisone (16 metabolites), cortisol (6 metabolites), and DHEA (8 metabolites) had the highest number of associated metabolites in PA. In CS, 18-hydroxycortisol significantly influenced 5 metabolites, cortisol affected 4, and cortisone, 11-deoxycortisol, and DHEA each were linked to 3 metabolites. Discussions: Our study indicates cortisol, cortisone, and catecholamine excess are significantly associated with metabolomic variances in EHT versus PHT patients. Notably, catecholamine excess is key to PPGL's metabolomic changes, whereas in PA, other non-defining adrenal steroids mainly account for metabolomic differences.
In CS, cortisol, alongside other non-defining adrenal hormones, contributes to these differences, suggesting that metabolic disorders and cardiovascular morbidity in these conditions could also be affected by various adrenal steroids.

Subject(s)

Adrenal Gland Neoplasms , Cortisone , Cushing Syndrome , Diabetes Mellitus , Hypertension , Paraganglioma , Pheochromocytoma , Female , Humans , Middle Aged , Male , Hydrocortisone/metabolism , Retrospective Studies , Cushing Syndrome/complications , Steroids , Adrenal Gland Neoplasms/complications , Hypertension/complications , Pheochromocytoma/complications , Paraganglioma/complications , Catecholamines , Dehydroepiandrosterone

3.

The Scottish Medical Imaging Archive: 57.3 Million Radiology Studies Linked to Their Medical Records.

Baxter, Rob; Nind, Thomas; Sutherland, James; McAllister, Gordon; Hardy, Douglas; Hume, Ally; MacLeod, Ruairidh; Caldwell, Jacqueline; Krueger, Susan; Tramma, Leandro; Teviotdale, Ross; Gillen, Kenny; Scobbie, Donald; Baillie, Ian; Brooks, Andrew; Prodan, Bianca; Kerr, William; Sloan-Murphy, Dominic; Herrera, Juan F R; van Beek, Edwin J R; Reel, Parminder Singh; Reel, Smarti; Mansouri-Benssassi, Esma; Mudie, Roy; Steele, Douglas; Doney, Alex; Trucco, Emanuele; Morris, Carole; Wallace, Robert; Morris, Andrew; Parsons, Mark; Jefferson, Emily.

Radiol Artif Intell ; 6(1): e220266, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38166330

ABSTRACT

Keywords: MRI, Imaging Sequences, Ultrasound, Mammography, CT, Angiography, Conventional Radiography Published under a CC BY 4.0 license. See also the commentary by Whitman and Vining in this issue.

Subject(s)

Mammography , Radiology , Radiography , Medical Records , Scotland

4.

Impact of data source choice on multimorbidity measurement: a comparison study of 2.3 million individuals in the Welsh National Health Service.

MacRae, Clare; Morales, Daniel; Mercer, Stewart W; Lone, Nazir; Lawson, Andrew; Jefferson, Emily; McAllister, David; van den Akker, Marjan; Marshall, Alan; Seth, Sohan; Rawlings, Anna; Lyons, Jane; Lyons, Ronan A; Mizen, Amy; Abubakar, Eleojo; Dibben, Chris; Guthrie, Bruce.

BMC Med ; 21(1): 309, 2023 08 15.

Article in English | MEDLINE | ID: mdl-37582755

ABSTRACT

BACKGROUND: Measurement of multimorbidity in research is variable, including the choice of the data source used to ascertain conditions. We compared the estimated prevalence of multimorbidity and associations with mortality using different data sources. METHODS: A cross-sectional study of SAIL Databank data including 2,340,027 individuals of all ages living in Wales on 01 January 2019. Comparison of prevalence of multimorbidity and constituent 47 conditions using data from primary care (PC), hospital inpatient (HI), and linked PC-HI data sources and examination of associations between condition count and 12-month mortality. RESULTS: Using linked PC-HI compared with only HI data, multimorbidity was more prevalent (32.2% versus 16.5%), and the population of people identified as having multimorbidity was younger (mean age 62.5 versus 66.8 years) and included more women (54.2% versus 52.6%). Individuals with multimorbidity in both PC and HI data had stronger associations with mortality than those with multimorbidity only in HI data (adjusted odds ratio 8.34 [95% CI 8.02-8.68] versus 6.95 [95% CI 6.79-7.12] in people with ≥ 4 conditions). The prevalence of conditions identified using only PC versus only HI data was significantly higher for 37/47 and significantly lower for 10/47: the highest PC/HI ratio was for depression (14.2 [95% CI 14.1-14.4]) and the lowest for aneurysm (0.51 [95% CI 0.5-0.5]). Agreement in ascertainment of conditions between the two data sources varied considerably, being slight for five (kappa < 0.20), fair for 12 (kappa 0.21-0.40), moderate for 16 (kappa 0.41-0.60), and substantial for 12 (kappa 0.61-0.80) conditions, and by body system was lowest for mental and behavioural disorders. The percentage agreement, individuals with a condition identified in both PC and HI data, was lowest in anxiety (4.6%) and highest in coronary artery disease (62.9%).
CONCLUSIONS: The use of single data sources may underestimate prevalence when measuring multimorbidity and many important conditions (especially mental and behavioural disorders). Caution should be used when interpreting findings of research examining individual and multiple long-term conditions using single data sources. Where available, researchers using electronic health data should link primary care and hospital inpatient data to generate more robust evidence to support evidence-based healthcare planning decisions for people with multimorbidity.
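The agreement statistics reported in this abstract can be illustrated with a minimal sketch. It assumes that, for one condition, each person is counted in a 2x2 table by whether the condition was recorded in PC data, HI data, both, or neither. The function names and the example counts are hypothetical and are not taken from the study. The labels "slight" to "substantial" follow the bands quoted in the abstract; the labels outside that range ("poor", "almost perfect") are assumed from the commonly used Landis and Koch convention.

```python
def cohen_kappa(both, pc_only, hi_only, neither):
    """Cohen's kappa for agreement between two binary ascertainments."""
    n = both + pc_only + hi_only + neither
    observed = (both + neither) / n            # proportion where sources agree
    pc_yes = (both + pc_only) / n              # prevalence according to PC data
    hi_yes = (both + hi_only) / n              # prevalence according to HI data
    # Agreement expected by chance if the two sources were independent.
    expected = pc_yes * hi_yes + (1 - pc_yes) * (1 - hi_yes)
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to a descriptive band (upper bounds treated as inclusive)."""
    if kappa < 0:
        return "poor"            # assumption: not a band used in the abstract
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"      # assumption: not a band used in the abstract


def pc_hi_prevalence_ratio(both, pc_only, hi_only):
    """Ratio of the number of people identified in PC data to those in HI data."""
    return (both + pc_only) / (both + hi_only)


# Hypothetical counts for a single condition in 100 people.
kappa = cohen_kappa(both=20, pc_only=30, hi_only=5, neither=45)
band = agreement_band(kappa)
ratio = pc_hi_prevalence_ratio(both=20, pc_only=30, hi_only=5)
```

With these made-up counts the condition is recorded twice as often in PC as in HI data, which is the kind of pattern the abstract reports for conditions such as depression, although the actual study values differ.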

Subject(s)

Multimorbidity , State Medicine , Humans , Female , Middle Aged , Cross-Sectional Studies , Information Sources , Prevalence , Chronic Disease

5.

Disclosure control of machine learning models from trusted research environments (TRE): New challenges and opportunities.

Mansouri-Benssassi, Esma; Rogers, Simon; Reel, Smarti; Malone, Maeve; Smith, Jim; Ritchie, Felix; Jefferson, Emily.

Heliyon ; 9(4): e15143, 2023 Apr.

Article in English | MEDLINE | ID: mdl-37123891

ABSTRACT

Introduction: Artificial intelligence (AI) applications in healthcare and medicine have increased in recent years. To enable access to personal data, Trusted Research Environments (TREs) (otherwise known as Safe Havens) provide safe and secure environments in which researchers can access sensitive personal data and develop AI (in particular machine learning (ML)) models. However, currently few TREs support the training of ML models in part due to a gap in the practical decision-making guidance for TREs in handling model disclosure. Specifically, the training of ML models creates a need to disclose new types of outputs from TREs. Although TREs have clear policies for the disclosure of statistical outputs, the extent to which trained models can leak personal training data once released is not well understood. Background: We review, for a general audience, different types of ML models and their applicability within healthcare. We explain the outputs from training a ML model and how trained ML models can be vulnerable to external attacks to discover personal data encoded within the model. Risks: We present the challenges for disclosure control of trained ML models in the context of training and exporting models from TREs. We provide insights and analyse methods that could be introduced within TREs to mitigate the risk of privacy breaches when disclosing trained models. Discussion: Although specific guidelines and policies exist for statistical disclosure controls in TREs, they do not satisfactorily address these new types of output requests; i.e., trained ML models. There is significant potential for new interdisciplinary research opportunities in developing and adapting policies and tools for safely disclosing ML outputs from TREs.

6.

The impact of varying the number and selection of conditions on estimated multimorbidity prevalence: A cross-sectional study using a large, primary care population dataset.

MacRae, Clare; McMinn, Megan; Mercer, Stewart W; Henderson, David; McAllister, David A; Ho, Iris; Jefferson, Emily; Morales, Daniel R; Lyons, Jane; Lyons, Ronan A; Dibben, Chris; Guthrie, Bruce.

PLoS Med ; 20(4): e1004208, 2023 04.

Article in English | MEDLINE | ID: mdl-37014910

ABSTRACT

BACKGROUND: Multimorbidity prevalence rates vary considerably depending on the conditions considered in the morbidity count, but there is no standardised approach to the number or selection of conditions to include. METHODS AND FINDINGS: We conducted a cross-sectional study using English primary care data for 1,168,260 participants who were all people alive and permanently registered with 149 included general practices. Outcome measures of the study were prevalence estimates of multimorbidity (defined as ≥2 conditions) when varying the number and selection of conditions considered for 80 conditions. Included conditions featured in ≥1 of the 9 published lists of conditions examined in the study and/or phenotyping algorithms in the Health Data Research UK (HDR-UK) Phenotype Library. First, multimorbidity prevalence was calculated when considering the individually most common 2 conditions, 3 conditions, etc., up to 80 conditions. Second, prevalence was calculated using 9 condition-lists from published studies. Analyses were stratified by dependent variables age, socioeconomic position, and sex. Prevalence when only the 2 commonest conditions were considered was 4.6% (95% CI [4.6, 4.6] p < 0.001), rising to 29.5% (95% CI [29.5, 29.6] p < 0.001) considering the 10 commonest, 35.2% (95% CI [35.1, 35.3] p < 0.001) considering the 20 commonest, and 40.5% (95% CI [40.4, 40.6] p < 0.001) when considering all 80 conditions. The threshold number of conditions at which multimorbidity prevalence was >99% of that measured when considering all 80 conditions was 52 for the whole population but was lower in older people (29 in >80 years) and higher in younger people (71 in 0- to 9-year-olds). Nine published condition-lists were examined; these were either recommended for measuring multimorbidity, used in previous highly cited studies of multimorbidity prevalence, or widely applied measures of "comorbidity." Multimorbidity prevalence using these lists varied from 11.1% to 36.4%. 
A limitation of the study is that conditions were not always replicated using the same ascertainment rules as previous studies to improve comparability across condition-lists, but this highlights further variability in prevalence estimates across studies. CONCLUSIONS: In this study, we observed that varying the number and selection of conditions results in very large differences in multimorbidity prevalence, and different numbers of conditions are needed to reach ceiling rates of multimorbidity prevalence in certain groups of people. These findings imply that there is a need for a standardised approach to defining multimorbidity, and to facilitate this, researchers can use existing condition-lists associated with highest multimorbidity prevalence.
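The "most common k conditions" analysis described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each person is represented as a set of condition names, conditions are ranked by how many people have them (ties broken alphabetically, an arbitrary choice made here), and multimorbidity is defined as at least 2 of the considered conditions. All names and the toy data are hypothetical.

```python
from collections import Counter


def prevalence_by_top_k(people, min_conditions=2):
    """Return (ranked conditions, {k: prevalence}) for the k commonest conditions."""
    counts = Counter(c for conditions in people for c in set(conditions))
    # Commonest first; alphabetical order breaks ties deterministically.
    ranked = [c for c, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
    prevalence = {}
    for k in range(min_conditions, len(ranked) + 1):
        considered = set(ranked[:k])
        n_multimorbid = sum(
            1 for conditions in people
            if len(considered & set(conditions)) >= min_conditions
        )
        prevalence[k] = n_multimorbid / len(people)
    return ranked, prevalence


def threshold_k(prevalence, fraction=0.99):
    """Smallest k whose prevalence exceeds `fraction` of the full-list prevalence."""
    full = prevalence[max(prevalence)]
    for k in sorted(prevalence):
        if prevalence[k] > fraction * full:
            return k
    return max(prevalence)


# Toy population of five people (hypothetical data).
people = [
    {"hypertension", "depression"},
    {"hypertension"},
    {"depression", "asthma"},
    set(),
    {"hypertension", "asthma"},
]
ranked, prevalence = prevalence_by_top_k(people)
k_needed = threshold_k(prevalence)
```

The second function mirrors the abstract's threshold idea: the number of conditions at which prevalence exceeds 99% of the value obtained with the full list.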

Subject(s)

Multimorbidity , Primary Health Care , Humans , Cross-Sectional Studies , Chronic Disease , Comorbidity , Prevalence

7.

Age, sex, and socioeconomic differences in multimorbidity measured in four ways: UK primary care cross-sectional analysis.

MacRae, Clare; Mercer, Stewart W; Henderson, David; McMinn, Megan; Morales, Daniel R; Jefferson, Emily; Lyons, Ronan A; Lyons, Jane; Dibben, Chris; McAllister, David A; Guthrie, Bruce.

Br J Gen Pract ; 73(729): e249-e256, 2023 04.

Article in English | MEDLINE | ID: mdl-36997222

ABSTRACT

BACKGROUND: Multimorbidity poses major challenges to healthcare systems worldwide. Definitions with cut-offs in excess of ≥2 long-term conditions (LTCs) might better capture populations with complexity but are not standardised. AIM: To examine variation in prevalence using different definitions of multimorbidity. DESIGN AND SETTING: Cross-sectional study of 1 168 620 people in England. METHOD: Comparison of multimorbidity (MM) prevalence using four definitions: MM2+ (≥2 LTCs), MM3+ (≥3 LTCs), MM3+ from 3+ (≥3 LTCs from ≥3 International Classification of Diseases, 10th revision chapters), and mental-physical MM (≥2 LTCs where ≥1 mental health LTC and ≥1 physical health LTC are recorded). Logistic regression was used to examine patient characteristics associated with multimorbidity under all four definitions. RESULTS: MM2+ was most common (40.4%) followed by MM3+ (27.5%), MM3+ from 3+ (22.6%), and mental-physical MM (18.9%). MM2+, MM3+, and MM3+ from 3+ were strongly associated with oldest age (adjusted odds ratio [aOR] 58.09, 95% confidence interval [CI] = 56.13 to 60.14; aOR 77.69, 95% CI = 75.33 to 80.12; and aOR 102.06, 95% CI = 98.61 to 105.65; respectively), but mental-physical MM was much less strongly associated (aOR 4.32, 95% CI = 4.21 to 4.43). People in the most deprived decile had equivalent rates of multimorbidity at a younger age than those in the least deprived decile. This was most marked in mental-physical MM at 40-45 years younger, followed by MM2+ at 15-20 years younger, and MM3+ and MM3+ from 3+ at 10-15 years younger. Females had higher prevalence of multimorbidity under all definitions, which was most marked for mental-physical MM. CONCLUSION: Estimated prevalence of multimorbidity depends on the definition used, and associations with age, sex, and socioeconomic position vary between definitions. Applicable multimorbidity research requires consistency of definitions across studies.
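The four definitions compared in this abstract can be written down as a small sketch. It assumes each long-term condition (LTC) carries its ICD-10 chapter, and it treats chapter V (mental and behavioural disorders) as the marker of a mental health LTC. That mapping is an assumption made for illustration; the study's own classification of mental and physical LTCs may differ. The class and function names are hypothetical.

```python
from typing import NamedTuple

MENTAL_CHAPTER = "V"  # ICD-10 chapter V: mental and behavioural disorders


class LTC(NamedTuple):
    name: str
    icd10_chapter: str  # Roman numeral of the ICD-10 chapter


def classify_multimorbidity(ltcs):
    """Apply the four definitions from the abstract to one person's LTC list."""
    n = len(ltcs)
    chapters = {ltc.icd10_chapter for ltc in ltcs}
    has_mental = any(ltc.icd10_chapter == MENTAL_CHAPTER for ltc in ltcs)
    has_physical = any(ltc.icd10_chapter != MENTAL_CHAPTER for ltc in ltcs)
    return {
        "MM2+": n >= 2,
        "MM3+": n >= 3,
        "MM3+ from 3+": n >= 3 and len(chapters) >= 3,
        # One mental plus one physical LTC already implies at least 2 LTCs.
        "mental-physical MM": has_mental and has_physical,
    }


# Hypothetical patients.
cardiometabolic = [
    LTC("hypertension", "IX"),
    LTC("coronary heart disease", "IX"),
    LTC("type 2 diabetes", "IV"),
]
mixed = [LTC("depression", "V"), LTC("asthma", "X")]

cardiometabolic_result = classify_multimorbidity(cardiometabolic)
mixed_result = classify_multimorbidity(mixed)
```

The first patient has three LTCs but only two chapters, so meets MM3+ without meeting "MM3+ from 3+", which shows why the stricter definitions yield lower prevalence.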

Subject(s)

Multimorbidity , Primary Health Care , Female , Humans , Cross-Sectional Studies , Prevalence , Socioeconomic Factors , United Kingdom/epidemiology

8.

Machine learning models in trusted research environments - understanding operational risks.

Ritchie, Felix; Tilbrook, Amy; Cole, Christian; Jefferson, Emily; Krueger, Susan; Mansouri-Benssassi, Esma; Rogers, Simon; Smith, Jim.

Int J Popul Data Sci ; 8(1): 2165, 2023.

Article in English | MEDLINE | ID: mdl-38414545

ABSTRACT

Introduction: Trusted research environments (TREs) provide secure access to very sensitive data for research. All TREs operate manual checks on outputs to ensure there is no residual disclosure risk. Machine learning (ML) models require very large amounts of data; if this data is personal, the TRE is a well-established data management solution. However, ML models present novel disclosure risks, in both type and scale. Objectives: As part of a series on ML disclosure risk in TREs, this article is intended to introduce TRE managers to the conceptual problems and work being done to address them. Methods: We demonstrate how ML models present a qualitatively different type of disclosure risk, compared to traditional statistical outputs. These arise from both the nature and the scale of ML modelling. Results: We show that there are a large number of unresolved issues, although there is progress in many areas. We show where areas of uncertainty remain, as well as remedial responses available to TREs. Conclusions: At this stage, disclosure checking of ML models is very much a specialist activity. However, TRE managers need a basic awareness of the potential risk in ML models to enable them to make sensible decisions on using TREs for ML model development.

Subject(s)

Disclosure , Machine Learning

9.

A Hybrid Architecture (CO-CONNECT) to Facilitate Rapid Discovery and Access to Data Across the United Kingdom in Response to the COVID-19 Pandemic: Development Study.

Jefferson, Emily; Cole, Christian; Mumtaz, Shahzad; Cox, Samuel; Giles, Thomas Charles; Adejumo, Sam; Urwin, Esmond; Lea, Daniel; Macdonald, Calum; Best, Joseph; Masood, Erum; Milligan, Gordon; Johnston, Jenny; Horban, Scott; Birced, Ipek; Hall, Christopher; Jackson, Aaron S; Collins, Clare; Rising, Sam; Dodsley, Charlotte; Hampton, Jill; Hadfield, Andrew; Santos, Roberto; Tarr, Simon; Panagi, Vasiliki; Lavagna, Joseph; Jackson, Tracy; Chuter, Antony; Beggs, Jillian; Martinez-Queipo, Magdalena; Ward, Helen; von Ziegenweidt, Julie; Burns, Frances; Martin, Joanne; Sebire, Neil; Morris, Carole; Bradley, Declan; Baxter, Rob; Ahonen-Bishopp, Anni; Smith, Paul; Shoemark, Amelia; Valdes, Ana M; Ollivere, Benjamin; Manisty, Charlotte; Eyre, David; Gallant, Stephanie; Joy, George; McAuley, Andrew; Connell, David; Northstone, Kate.

J Med Internet Res ; 24(12): e40035, 2022 12 27.

Article in English | MEDLINE | ID: mdl-36322788

ABSTRACT

BACKGROUND: COVID-19 data have been generated across the United Kingdom as a by-product of clinical care and public health provision, as well as numerous bespoke and repurposed research endeavors. Analysis of these data has underpinned the United Kingdom's response to the pandemic, and informed public health policies and clinical guidelines. However, these data are held by different organizations, and this fragmented landscape has presented challenges for public health agencies and researchers as they struggle to find relevant data to access and interrogate the data they need to inform the pandemic response at pace. OBJECTIVE: We aimed to transform UK COVID-19 diagnostic data sets to be findable, accessible, interoperable, and reusable (FAIR). METHODS: A federated infrastructure model (COVID - Curated and Open Analysis and Research Platform [CO-CONNECT]) was rapidly built to enable the automated and reproducible mapping of health data partners' pseudonymized data to the Observational Medical Outcomes Partnership Common Data Model without the need for any data to leave the data controllers' secure environments, and to support federated cohort discovery queries and meta-analysis. RESULTS: A total of 56 data sets from 19 organizations are being connected to the federated network. The data include research cohorts and COVID-19 data collected through routine health care provision linked to longitudinal health care records and demographics. The infrastructure is live, supporting aggregate-level querying of data across the United Kingdom. CONCLUSIONS: CO-CONNECT was developed by a multidisciplinary team. It enables rapid COVID-19 data discovery and instantaneous meta-analysis across data sources, and it is researching streamlined data extraction for use in a Trusted Research Environment for research and public health analysis. 
CO-CONNECT has the potential to make UK health data more interconnected and better able to answer national-level research questions while maintaining patient confidentiality and local governance procedures.

Subject(s)

COVID-19 , Humans , COVID-19/epidemiology , Pandemics , United Kingdom/epidemiology

10.

Whole blood methylome-derived features to discriminate endocrine hypertension.

Armignacco, Roberta; Reel, Parminder S; Reel, Smarti; Jouinot, Anne; Septier, Amandine; Gaspar, Cassandra; Perlemoine, Karine; Larsen, Casper K; Bouys, Lucas; Braun, Leah; Riester, Anna; Kroiss, Matthias; Bonnet-Serrano, Fidéline; Amar, Laurence; Blanchard, Anne; Gimenez-Roqueplo, Anne-Paule; Prejbisz, Aleksander; Januszewicz, Andrzej; Dobrowolski, Piotr; Davies, Eleanor; MacKenzie, Scott M; Rossi, Gian Paolo; Lenzini, Livia; Ceccato, Filippo; Scaroni, Carla; Mulatero, Paolo; Williams, Tracy A; Pecori, Alessio; Monticone, Silvia; Beuschlein, Felix; Reincke, Martin; Zennaro, Maria-Christina; Bertherat, Jérôme; Jefferson, Emily; Assié, Guillaume.

Clin Epigenetics ; 14(1): 142, 2022 11 03.

Article in English | MEDLINE | ID: mdl-36329530

ABSTRACT

BACKGROUND: Arterial hypertension represents a worldwide health burden and a major risk factor for cardiovascular morbidity and mortality. Hypertension can be primary (primary hypertension, PHT), or secondary to endocrine disorders (endocrine hypertension, EHT), such as Cushing's syndrome (CS), primary aldosteronism (PA), and pheochromocytoma/paraganglioma (PPGL). Diagnosis of EHT is currently based on hormone assays. Efficient detection remains challenging, but is crucial to properly orientate patients for diagnostic confirmation and specific treatment. More accurate biomarkers would help in the diagnostic pathway. We hypothesized that each type of endocrine hypertension could be associated with a specific blood DNA methylation signature, which could be used for disease discrimination. To identify such markers, we aimed at exploring the methylome profiles in a cohort of 255 patients with hypertension, either PHT (n = 42) or EHT (n = 213), and at identifying specific discriminating signatures using machine learning approaches. RESULTS: Unsupervised classification of samples showed discrimination of PHT from EHT. CS patients clustered separately from all other patients, whereas PA and PPGL showed an overall overlap. Global methylation was decreased in the CS group compared to PHT. Supervised comparison with PHT identified differentially methylated CpG sites for each type of endocrine hypertension, showing a diffuse genomic location. Among the most differentially methylated genes, FKBP5 was identified in the CS group. Using four different machine learning methods, namely Lasso (Least Absolute Shrinkage and Selection Operator), Logistic Regression, Random Forest, and Support Vector Machine, predictive models for each type of endocrine hypertension were built on training cohorts (80% of samples for each hypertension type) and estimated on validation cohorts (20% of samples for each hypertension type).
Balanced accuracies ranged from 0.55 to 0.74 for predicting EHT, 0.85 to 0.95 for predicting CS, 0.66 to 0.88 for predicting PA, and 0.70 to 0.83 for predicting PPGL. CONCLUSIONS: The blood DNA methylome can discriminate endocrine hypertension, with methylation signatures for each type of endocrine disorder.
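Two ideas from this abstract, the per-type 80%/20% training/validation split and balanced accuracy, can be sketched in plain Python. This is a minimal illustration under assumptions, not the authors' pipeline: balanced accuracy is taken in its usual sense as the mean of per-class recall, the split is done by shuffling sample indices within each class, and the function names, seed, and example labels are hypothetical.

```python
import random
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, predicted in zip(y_true, y_pred):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices so each class contributes ~train_fraction to training."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, validation = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        validation.extend(indices[cut:])
    return sorted(train), sorted(validation)


# Hypothetical labels: 10 CS samples and 5 PA samples.
labels = ["CS"] * 10 + ["PA"] * 5
train_idx, validation_idx = stratified_split(labels)

# Hypothetical predictions: 3 of 4 positives and 1 of 2 negatives are correct.
score = balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

Because balanced accuracy averages recall over classes, it is less flattering than plain accuracy when one group (here, for example, EHT at n = 213 versus PHT at n = 42) dominates the cohort.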

Subject(s)

Adrenal Gland Neoplasms , Hypertension , Pheochromocytoma , Humans , Epigenome , DNA Methylation , Pheochromocytoma/complications , Pheochromocytoma/genetics , Hypertension/diagnosis , Hypertension/genetics , Adrenal Gland Neoplasms/diagnosis , Adrenal Gland Neoplasms/genetics , Adrenal Gland Neoplasms/complications , Biomarkers

11.

Next-Generation Capabilities in Trusted Research Environments: Interview Study.

Kavianpour, Sanaz; Sutherland, James; Mansouri-Benssassi, Esma; Coull, Natalie; Jefferson, Emily.

J Med Internet Res ; 24(9): e33720, 2022 09 20.

Article in English | MEDLINE | ID: mdl-36125859

ABSTRACT

BACKGROUND: A Trusted Research Environment (TRE; also known as a Safe Haven) is an environment supported by trained staff and agreed processes (principles and standards), providing access to data for research while protecting patient confidentiality. Accessing sensitive data without compromising the privacy and security of the data is a complex process. OBJECTIVE: This paper presents the security measures, administrative procedures, and technical approaches adopted by TREs. METHODS: We contacted 73 TRE operators, 22 (30%) of whom, in the United Kingdom and internationally, agreed to be interviewed remotely under a nondisclosure agreement and to complete a questionnaire about their TRE. RESULTS: We observed many similar processes and standards that TREs follow to adhere to the Seven Safes principles. The security processes and TRE capabilities for supporting observational studies using classical statistical methods were mature, and the requirements were well understood. However, we identified limitations in the security measures and capabilities of TREs to support "next-generation" requirements such as wide ranges of data types, ability to develop artificial intelligence algorithms and software within the environment, handling of big data, and timely import and export of data. CONCLUSIONS: We found a lack of software or other automation tools to support the community and limited knowledge of how to meet the next-generation requirements from the research community. Disclosure control for exporting artificial intelligence algorithms and software was found to be particularly challenging, and there is a clear need for additional controls to support this capability within TREs.

Subject(s)

Artificial Intelligence , Computer Security , Confidentiality , Humans , Privacy , Qualitative Research

12.

Machine learning for classification of hypertension subtypes using multi-omics: A multi-centre, retrospective, data-driven study.

Reel, Parminder S; Reel, Smarti; van Kralingen, Josie C; Langton, Katharina; Lang, Katharina; Erlic, Zoran; Larsen, Casper K; Amar, Laurence; Pamporaki, Christina; Mulatero, Paolo; Blanchard, Anne; Kabat, Marek; Robertson, Stacy; MacKenzie, Scott M; Taylor, Angela E; Peitzsch, Mirko; Ceccato, Filippo; Scaroni, Carla; Reincke, Martin; Kroiss, Matthias; Dennedy, Michael C; Pecori, Alessio; Monticone, Silvia; Deinum, Jaap; Rossi, Gian Paolo; Lenzini, Livia; McClure, John D; Nind, Thomas; Riddell, Alexandra; Stell, Anthony; Cole, Christian; Sudano, Isabella; Prehn, Cornelia; Adamski, Jerzy; Gimenez-Roqueplo, Anne-Paule; Assié, Guillaume; Arlt, Wiebke; Beuschlein, Felix; Eisenhofer, Graeme; Davies, Eleanor; Zennaro, Maria-Christina; Jefferson, Emily.

EBioMedicine ; 84: 104276, 2022 Oct.

Article in English | MEDLINE | ID: mdl-36179553

ABSTRACT

BACKGROUND: Arterial hypertension is a major cardiovascular risk factor. Identification of secondary hypertension in its various forms is key to preventing and targeting treatment of cardiovascular complications. Simplified diagnostic tests are urgently required to distinguish primary and secondary hypertension to address the current underdiagnosis of the latter. METHODS: This study uses Machine Learning (ML) to classify subtypes of endocrine hypertension (EHT) in a large cohort of hypertensive patients using multidimensional omics analysis of plasma and urine samples. We measured 409 multi-omics (MOmics) features including plasma miRNAs (PmiRNA: 173), plasma catechol O-methylated metabolites (PMetas: 4), plasma steroids (PSteroids: 16), urinary steroid metabolites (USteroids: 27), and plasma small metabolites (PSmallMB: 189) in primary hypertension (PHT) patients, EHT patients with either primary aldosteronism (PA), pheochromocytoma/functional paraganglioma (PPGL) or Cushing syndrome (CS) and normotensive volunteers (NV). Biomarker discovery involved selection of disease combination, outlier handling, feature reduction, 8 ML classifiers, class balancing and consideration of different age- and sex-based scenarios. Classifications were evaluated using balanced accuracy, sensitivity, specificity, AUC, F1, and Kappa score. FINDINGS: Complete clinical and biological datasets were generated from 307 subjects (PA=113, PPGL=88, CS=41 and PHT=112). The random forest classifier provided ~92% balanced accuracy (~11% improvement on the best mono-omics classifier), with 96% specificity and 0.95 AUC to distinguish one of the four conditions in multi-class ALL-ALL comparisons (PPGL vs PA vs CS vs PHT) on an unseen test set, using 57 MOmics features. For discrimination of EHT (PA + PPGL + CS) vs PHT, the simple logistic classifier achieved 0.96 AUC with 90% sensitivity, and ~86% specificity, using 37 MOmics features.
One PmiRNA (hsa-miR-15a-5p) and two PSmallMB (C9 and PC ae C38:1) features were found to be most discriminating for all disease combinations. Overall, the MOmics-based classifiers were able to provide better classification performance in comparison to mono-omics classifiers. INTERPRETATION: We have developed a ML pipeline to distinguish different EHT subtypes from PHT using multi-omics data. This innovative approach to stratification is an advancement towards the development of a diagnostic tool for EHT patients, significantly increasing testing throughput and accelerating administration of appropriate treatment. FUNDING: European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 633983, Clinical Research Priority Program of the University of Zurich for the CRPP HYRENE (to Z.E. and F.B.), and Deutsche Forschungsgemeinschaft (CRC/Transregio 205/1).

Subject(s)

Hypertension , MicroRNAs , Biomarkers , Catechols , Humans , Hypertension/diagnosis , Machine Learning , Retrospective Studies

13.

Predicting Hypertension Subtypes with Machine Learning Using Targeted Metabolites and Their Ratios.

Reel, Smarti; Reel, Parminder S; Erlic, Zoran; Amar, Laurence; Pecori, Alessio; Larsen, Casper K; Tetti, Martina; Pamporaki, Christina; Prehn, Cornelia; Adamski, Jerzy; Prejbisz, Aleksander; Ceccato, Filippo; Scaroni, Carla; Kroiss, Matthias; Dennedy, Michael C; Deinum, Jaap; Eisenhofer, Graeme; Langton, Katharina; Mulatero, Paolo; Reincke, Martin; Rossi, Gian Paolo; Lenzini, Livia; Davies, Eleanor; Gimenez-Roqueplo, Anne-Paule; Assié, Guillaume; Blanchard, Anne; Zennaro, Maria-Christina; Beuschlein, Felix; Jefferson, Emily R.

Metabolites ; 12(8)2022 Aug 16.

Article in English | MEDLINE | ID: mdl-36005627

ABSTRACT

Hypertension is a major global health problem with high prevalence and complex associated health risks. Primary hypertension (PHT) is most common and the reasons behind primary hypertension are largely unknown. Endocrine hypertension (EHT) is another complex form of hypertension with an estimated prevalence varying from 3 to 20% depending on the population studied. It occurs due to underlying conditions associated with hormonal excess mainly related to adrenal tumours and sub-categorised: primary aldosteronism (PA), Cushing's syndrome (CS), pheochromocytoma or functional paraganglioma (PPGL). Endocrine hypertension is often misdiagnosed as primary hypertension, causing delays in treatment for the underlying condition, reduced quality of life, and costly antihypertensive treatment that is often ineffective. This study systematically used targeted metabolomics and high-throughput machine learning methods to predict the key biomarkers in classifying and distinguishing the various subtypes of endocrine and primary hypertension. The trained models successfully classified CS from PHT and EHT from PHT with 92% specificity on the test set. The most prominent targeted metabolites and metabolite ratios for hypertension identification for different disease comparisons were C18:1, C18:2, and Orn/Arg. Sex was identified as an important feature in CS vs. PHT classification.

14.

Preanalytical Pitfalls in Untargeted Plasma Nuclear Magnetic Resonance Metabolomics of Endocrine Hypertension.

Bliziotis, Nikolaos G; Kluijtmans, Leo A J; Tinnevelt, Gerjen H; Reel, Parminder; Reel, Smarti; Langton, Katharina; Robledo, Mercedes; Pamporaki, Christina; Pecori, Alessio; Van Kralingen, Josie; Tetti, Martina; Engelke, Udo F H; Erlic, Zoran; Engel, Jasper; Deutschbein, Timo; Nölting, Svenja; Prejbisz, Aleksander; Richter, Susan; Adamski, Jerzy; Januszewicz, Andrzej; Ceccato, Filippo; Scaroni, Carla; Dennedy, Michael C; Williams, Tracy A; Lenzini, Livia; Gimenez-Roqueplo, Anne-Paule; Davies, Eleanor; Fassnacht, Martin; Remde, Hanna; Eisenhofer, Graeme; Beuschlein, Felix; Kroiss, Matthias; Jefferson, Emily; Zennaro, Maria-Christina; Wevers, Ron A; Jansen, Jeroen J; Deinum, Jaap; Timmers, Henri J L M.

Metabolites ; 12(8)2022 Jul 24.

Article in English | MEDLINE | ID: mdl-35893246

ABSTRACT

Despite considerable morbidity and mortality, numerous cases of endocrine hypertension (EHT) forms, including primary aldosteronism (PA), pheochromocytoma and functional paraganglioma (PPGL), and Cushing's syndrome (CS), remain undetected. We aimed to establish signatures for the different forms of EHT, investigate potentially confounding effects and establish unbiased disease biomarkers. Plasma samples were obtained from 13 biobanks across seven countries and analyzed using untargeted NMR metabolomics. We compared unstratified samples of 106 PHT patients to 231 EHT patients, including 104 PA, 94 PPGL and 33 CS patients. Spectra were subjected to a multivariate statistical comparison of PHT to EHT forms and the associated signatures were obtained. Three approaches were applied to investigate and correct confounding effects. Though we found signatures that could separate PHT from EHT forms, there were also key similarities with the signatures of sample center of origin and sample age. The study design restricted the applicability of the corrections employed. With the samples that were available, no biomarkers for PHT vs. EHT could be identified. The complexity of the confounding effects, evidenced by their robustness to correction approaches, highlighted the need for a consensus on how to deal with variabilities probably attributed to preanalytical factors in retrospective, multicenter metabolomics studies.

15.

A National Network of Safe Havens: Scottish Perspective.

Gao, Chuang; McGilchrist, Mark; Mumtaz, Shahzad; Hall, Christopher; Anderson, Lesley Ann; Zurowski, John; Gordon, Sharon; Lumsden, Joanne; Munro, Vicky; Wozniak, Artur; Sibley, Michael; Banks, Christopher; Duncan, Chris; Linksted, Pamela; Hume, Alastair; Stables, Catherine L; Mayor, Charlie; Caldwell, Jacqueline; Wilde, Katie; Cole, Christian; Jefferson, Emily.

J Med Internet Res ; 24(3): e31684, 2022 03 09.

Article in English | MEDLINE | ID: mdl-35262495

ABSTRACT

For over a decade, Scotland has implemented and operationalized a system of Safe Havens, which provides secure analytics platforms for researchers to access linked, deidentified electronic health records (EHRs) while managing the risk of unauthorized reidentification. In this paper, a perspective is provided on the state-of-the-art Scottish Safe Haven network, including its evolution, to define the key activities required to scale the Scottish Safe Haven network's capability to facilitate research and health care improvement initiatives. A set of processes related to EHR data and their delivery in Scotland have been discussed. An interview with each Safe Haven was conducted to understand their services in detail, as well as their commonalities. The results show how Safe Havens in Scotland have protected privacy while facilitating the reuse of the EHR data. This study provides a common definition of a Safe Haven and promotes a consistent understanding among the Scottish Safe Haven network and the clinical and academic research community. We conclude by identifying areas where efficiencies across the network can be made to meet the needs of population-level studies at scale.

Subject(s)

Electronic Health Records , Privacy , Humans , Scotland

16.

An overview of the National COVID-19 Chest Imaging Database: data quality and cohort analysis.

Cushnan, Dominic; Bennett, Oscar; Berka, Rosalind; Bertolli, Ottavia; Chopra, Ashwin; Dorgham, Samie; Favaro, Alberto; Ganepola, Tara; Halling-Brown, Mark; Imreh, Gergely; Jacob, Joseph; Jefferson, Emily; Lemarchand, François; Schofield, Daniel; Wyatt, Jeremy C.

Gigascience ; 10(11)2021 11 25.

Article in English | MEDLINE | ID: mdl-34849869

ABSTRACT

BACKGROUND: The National COVID-19 Chest Imaging Database (NCCID) is a centralized database containing mainly chest X-rays and computed tomography scans from patients across the UK. The objective of the initiative is to support a better understanding of the coronavirus SARS-CoV-2 disease (COVID-19) and the development of machine learning technologies that will improve care for patients hospitalized with a severe COVID-19 infection. This article introduces the training dataset, including a snapshot analysis covering the completeness of clinical data, and availability of image data for the various use-cases (diagnosis, prognosis, longitudinal risk). An additional cohort analysis measures how well the NCCID represents the wider COVID-19-affected UK population in terms of geographic, demographic, and temporal coverage. FINDINGS: The NCCID offers high-quality DICOM images acquired across a variety of imaging machinery; multiple time points including historical images are available for a subset of patients. This volume and variety make the database well suited to development of diagnostic/prognostic models for COVID-associated respiratory conditions. Historical images and clinical data may aid long-term risk stratification, particularly as availability of comorbidity data increases through linkage to other resources. The cohort analysis revealed good alignment to general UK COVID-19 statistics for some categories, e.g., sex, whilst identifying areas for improvements to data collection methods, particularly geographic coverage. CONCLUSION: The NCCID is a growing resource that provides researchers with a large, high-quality database that can be leveraged both to support the response to the COVID-19 pandemic and as a test bed for building clinically viable medical imaging models.

Subject(s)

COVID-19 , Cohort Studies , Data Accuracy , Humans , Pandemics , SARS-CoV-2 , Tomography, X-Ray Computed

17.

Erratum to: An overview of the National COVID-19 Chest Imaging Database: data quality and cohort analysis.

Cushnan, Dominic; Bennett, Oscar; Berka, Rosalind; Bertolli, Ottavia; Chopra, Ashwin; Dorgham, Samie; Favaro, Alberto; Ganepola, Tara; Halling-Brown, Mark; Imreh, Gergely; Jacob, Joseph; Jefferson, Emily; Lemarchand, François; Schofield, Daniel; Wyatt, Jeremy C; Collaborative, N C C I D.

Gigascience ; 10(12)2021 12 01.

Article in English | MEDLINE | ID: mdl-34850874

18.

Towards nationally curated data archives for clinical radiology image analysis at scale: Learnings from national data collection in response to a pandemic.

Cushnan, Dominic; Berka, Rosalind; Bertolli, Ottavia; Williams, Peter; Schofield, Daniel; Joshi, Indra; Favaro, Alberto; Halling-Brown, Mark; Imreh, Gergely; Jefferson, Emily; Sebire, Neil J; Reilly, Gerry; Rodrigues, Jonathan C L; Robinson, Graham; Copley, Susan; Malik, Rizwan; Bloomfield, Claire; Gleeson, Fergus; Crotty, Moira; Denton, Erika; Dickson, Jeanette; Leeming, Gary; Hardwick, Hayley E; Baillie, Kenneth; Openshaw, Peter Jm; Semple, Malcolm G; Rubin, Caroline; Howlett, Andy; Rockall, Andrea G; Bhayat, Ayub; Fascia, Daniel; Sudlow, Cathie; Jacob, Joseph.

Digit Health ; 7: 20552076211048654, 2021.

Article in English | MEDLINE | ID: mdl-34868617

ABSTRACT

The prevalence of the coronavirus SARS-CoV-2 disease has resulted in the unprecedented collection of health data to support research. Historically, coordinating the collation of such datasets on a national scale has been challenging to execute for several reasons, including issues with data privacy, the lack of data reporting standards, interoperable technologies, and distribution methods. The coronavirus SARS-CoV-2 disease pandemic has highlighted the importance of collaboration between government bodies, healthcare institutions, academic researchers and commercial companies in overcoming these issues during times of urgency. The National COVID-19 Chest Imaging Database, led by NHSX, British Society of Thoracic Imaging, Royal Surrey NHS Foundation Trust and Faculty, is an example of such a national initiative. Here, we summarise the experiences and challenges of setting up the National COVID-19 Chest Imaging Database, and the implications for future ambitions of national data curation in medical imaging to advance the safe adoption of artificial intelligence in healthcare.

19.

Desiderata for the development of next-generation electronic health record phenotype libraries.

Chapman, Martin; Mumtaz, Shahzad; Rasmussen, Luke V; Karwath, Andreas; Gkoutos, Georgios V; Gao, Chuang; Thayer, Dan; Pacheco, Jennifer A; Parkinson, Helen; Richesson, Rachel L; Jefferson, Emily; Denaxas, Spiros; Curcin, Vasa.

Gigascience ; 10(9)2021 09 11.

Article in English | MEDLINE | ID: mdl-34508578

ABSTRACT

BACKGROUND: High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling. METHODS: A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices. RESULTS: We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing. CONCLUSIONS: There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains.

Subject(s)

Electronic Health Records , Humans , Phenotype , Reproducibility of Results

20.

Using machine learning approaches for multi-omics data analysis: A review.

Reel, Parminder S; Reel, Smarti; Pearson, Ewan; Trucco, Emanuele; Jefferson, Emily.

Biotechnol Adv ; 49: 107739, 2021.

Article in English | MEDLINE | ID: mdl-33794304

ABSTRACT

With the development of modern high-throughput omic measurement platforms, it has become essential for biomedical studies to undertake an integrative (combined) approach to fully utilise these data to gain insights into biological systems. Data from various omics sources such as genetics, proteomics, and metabolomics can be integrated to unravel the intricate working of systems biology using machine learning-based predictive algorithms. Machine learning methods offer novel techniques to integrate and analyse the various omics data enabling the discovery of new biomarkers. These biomarkers have the potential to help in accurate disease prediction, patient stratification and delivery of precision medicine. This review paper explores different integrative machine learning methods which have been used to provide an in-depth understanding of biological systems during normal physiological functioning and in the presence of a disease. It provides insight and recommendations for interdisciplinary professionals who envisage employing machine learning skills in multi-omics studies.

Subject(s)

Machine Learning , Systems Biology , Algorithms , Humans , Metabolomics , Proteomics
