Results 1 - 20 of 227
1.
Commun Med (Lond) ; 4(1): 195, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39379679

ABSTRACT

BACKGROUND: Despite the growing interest in the use of human genomic data for drug target identification and validation, the extent to which the spectrum of human disease has been addressed by genome-wide association studies (GWAS), or by drug development, and the degree to which these efforts overlap remain unclear. METHODS: In this study we harmonize and integrate different data sources to create a sample space of all the human drug targets and diseases and identify points of convergence or divergence of GWAS and drug development efforts. RESULTS: We show that only 612 of 11,158 diseases listed in Human Disease Ontology have an approved drug treatment in at least one region of the world. Of the 1414 diseases that are the subject of preclinical or clinical phase drug development, only 666 have been investigated in GWAS. Conversely, of the 1914 human diseases that have been the subject of GWAS, 1121 have yet to be investigated in drug development. CONCLUSIONS: We produce target-disease indication lists to help the pharmaceutical industry to prioritize future drug development efforts based on genetic evidence, academia to prioritize future GWAS for diseases without effective treatments, and both sectors to harness genetic evidence to expand the indications for licensed drugs or to identify repurposing opportunities for clinical candidates that failed in their originally intended indication.


The pharma industry has shown growing interest in the use of human genomic data to support drug development and reduce the risk of clinical-stage failure. We investigate the extent to which human diseases have been the subject of genetic studies, of pharmaceutical research and development, or both. We show that only a small proportion of all human diseases have an approved drug treatment and that less than half of all the diseases that are the subject of preclinical or clinical phase drug development have been investigated in genetic studies. In addition, approximately two-thirds of the diseases covered in genetic studies have yet to be investigated in drug development. These findings could help prioritize drug development efforts or genetic studies for diseases without effective treatments.

2.
Nat Med ; 30(9): 2489-2498, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39039249

ABSTRACT

For many diseases there are delays in diagnosis due to a lack of objective biomarkers for disease onset. Here, in 41,931 individuals from the United Kingdom Biobank Pharma Proteomics Project, we integrated measurements of ~3,000 plasma proteins with clinical information to derive sparse prediction models for the 10-year incidence of 218 common and rare diseases (81-6,038 cases). We then compared prediction models developed using proteomic data with models developed using either basic clinical information alone or clinical information combined with data from 37 clinical assays. The predictive performance of sparse models including as few as 5 to 20 proteins was superior to the performance of models developed using basic clinical information for 67 pathologically diverse diseases (median delta C-index = 0.07; range = 0.02-0.31). Sparse protein models further outperformed models developed using basic information combined with clinical assay data for 52 diseases, including multiple myeloma, non-Hodgkin lymphoma, motor neuron disease, pulmonary fibrosis and dilated cardiomyopathy. For multiple myeloma, single-cell RNA sequencing from bone marrow in newly diagnosed patients showed that four of the five predictor proteins were expressed specifically in plasma cells, consistent with the strong predictive power of these proteins. External replication of sparse protein models in the EPIC-Norfolk study showed good generalizability for prediction of the six diseases tested. These findings show that sparse plasma protein signatures, including both disease-specific proteins and protein predictors shared across several diseases, offer clinically useful prediction of common and rare diseases.


Subject(s)
Proteomics , Rare Diseases , Humans , Proteomics/methods , Rare Diseases/blood , Rare Diseases/diagnosis , Rare Diseases/genetics , United Kingdom/epidemiology , Female , Male , Biomarkers/blood , Blood Proteins/metabolism , Middle Aged , Aged , Adult , Risk Assessment
3.
Commun Med (Lond) ; 4(1): 94, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38977844

ABSTRACT

BACKGROUND: Early evidence that patients with (multiple) pre-existing diseases are at highest risk for severe COVID-19 has been instrumental in the pandemic to allocate critical care resources and later vaccination schemes. However, systematic studies exploring the breadth of medical diagnoses are scarce but may help to understand severe COVID-19 among patients at supposedly low risk. METHODS: We systematically harmonized >12 million primary care and hospitalisation health records from ~500,000 UK Biobank participants into 1448 collated disease terms to systematically identify diseases predisposing to severe COVID-19 (requiring hospitalisation or death) and its post-acute sequelae, Long COVID. RESULTS: Here we identify 679 diseases associated with an increased risk for severe COVID-19 (n = 672) and/or Long COVID (n = 72) that span almost all clinical specialties and are strongly enriched in clusters of cardio-respiratory and endocrine-renal diseases. For 57 diseases, we establish consistent evidence to predispose to severe COVID-19 based on survival and genetic susceptibility analyses. This includes a possible role of symptoms of malaise and fatigue as a so far largely overlooked risk factor for severe COVID-19. We finally observe partially opposing risk estimates at known risk loci for severe COVID-19 for etiologically related diseases, such as post-inflammatory pulmonary fibrosis or rheumatoid arthritis, possibly indicating a segregation of disease mechanisms. CONCLUSIONS: Our results provide a unique reference that demonstrates 1) how complex co-occurrence of multiple conditions, including non-fatal ones, predisposes to increased COVID-19 severity and 2) how incorporating the whole breadth of medical diagnosis can guide the interpretation of genetic risk loci.


Early in the COVID-19 pandemic it was clear that people with multiple chronic diseases were vulnerable and needed special protection, such as shielding. However, many people without such diseases required hospital care or died from COVID-19. Here, we investigated the importance of underlying diseases, including mild diseases not requiring hospitalization, for COVID-19 outcomes. Using information from electronic health records we find that many severe, but also less severe diseases increase the risk for severe COVID-19 and its impact on health even months after acute infection (Long COVID). This included an almost two-fold higher risk among people that reported poor well-being and fatigue. Our findings show the value of using primary care health records and the need to consider all the medical history of patients to identify those in need of special protection.

4.
BMJ Health Care Inform ; 31(1), 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39074912

ABSTRACT

BACKGROUND: Despite the increasing availability of electronic healthcare record (EHR) data and wide availability of plug-and-play machine learning (ML) Application Programming Interfaces, the adoption of data-driven decision-making within routine hospital workflows has thus far remained limited. Through the lens of deriving clusters of diagnoses by age, this study investigated the type of ML analysis that can be performed using EHR data and how results could be communicated to lay stakeholders. METHODS: Observational EHR data from a tertiary paediatric hospital, containing 61 522 unique patients and 3315 unique ICD-10 diagnosis codes, were used after preprocessing. K-means clustering was applied to identify age distributions of patient diagnoses. The final model was selected using quantitative metrics and expert assessment of the clinical validity of the clusters. Additionally, uncertainty over preprocessing decisions was analysed. FINDINGS: Four age clusters of diseases were identified, broadly aligning to ages between: 0 and 1; 1 and 5; 5 and 13; 13 and 18. Diagnoses within the clusters aligned to existing knowledge regarding the propensity of presentation at different ages, and sequential clusters presented known disease progressions. The results validated similar methodologies within the literature. The impact of uncertainty induced by preprocessing decisions was large at the level of individual diagnoses but not at a population level. Strategies for mitigating, or communicating, this uncertainty were successfully demonstrated. CONCLUSION: Unsupervised ML applied to EHR data identifies clinically relevant age distributions of diagnoses which can augment existing decision making. However, biases within healthcare datasets dramatically impact results if not appropriately mitigated or communicated.


Subject(s)
Electronic Health Records , Unsupervised Machine Learning , Humans , Child , Child, Preschool , Infant , Adolescent , Cluster Analysis , Infant, Newborn , Male , Female , Age Factors
5.
medRxiv ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39006431

ABSTRACT

Early evidence that patients with (multiple) pre-existing diseases are at highest risk for severe COVID-19 has been instrumental in the pandemic to allocate critical care resources and later vaccination schemes. However, systematic studies exploring the breadth of medical diagnoses, including common but non-fatal diseases, are scarce but may help to understand severe COVID-19 among patients at supposedly low risk. Here, we systematically harmonized >12 million primary care and hospitalisation health records from ~500,000 UK Biobank participants into 1448 collated disease terms to systematically identify diseases predisposing to severe COVID-19 (requiring hospitalisation or death) and its post-acute sequelae, Long COVID. We identified a total of 679 diseases associated with an increased risk for severe COVID-19 (n=672) and/or Long COVID (n=72) that spanned almost all clinical specialties and were strongly enriched in clusters of cardio-respiratory and endocrine-renal diseases. For 57 diseases, we established consistent evidence to predispose to severe COVID-19 based on survival and genetic susceptibility analyses. This included a possible role of symptoms of malaise and fatigue as a so far largely overlooked risk factor for severe COVID-19. We finally observed partially opposing risk estimates at known risk loci for severe COVID-19 for etiologically related diseases, such as post-inflammatory pulmonary fibrosis (e.g., MUC5B, NPNT, and PSMD3) or rheumatoid arthritis (e.g., TYK2), possibly indicating a segregation of disease mechanisms. Our results provide a unique reference that demonstrates 1) how complex co-occurrence of multiple conditions, including non-fatal ones, predisposes to increased COVID-19 severity and 2) how incorporating the whole breadth of medical diagnosis can guide the interpretation of genetic risk loci.

6.
JAMIA Open ; 7(2): ooae049, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38895652

ABSTRACT

Objective: To enable reproducible research at scale by creating a platform that enables health data users to find, access, curate, and re-use electronic health record phenotyping algorithms. Materials and Methods: We undertook a structured approach to identifying requirements for a phenotype algorithm platform by engaging with key stakeholders. User experience analysis was used to inform the design, which we implemented as a web application featuring a novel metadata standard for defining phenotyping algorithms, access via Application Programming Interface (API), support for computable data flows, and version control. The application has creation and editing functionality, enabling researchers to submit phenotypes directly. Results: We created and launched the Phenotype Library in October 2021. The platform currently hosts 1049 phenotype definitions defined against 40 health data sources and >200K terms across 16 medical ontologies. We present several case studies demonstrating its utility for supporting and enabling research: the library hosts curated phenotype collections for the BREATHE respiratory health research hub and the Adolescent Mental Health Data Platform, and it is supporting the development of an informatics tool to generate clinical evidence for clinical guideline development groups. Discussion: This platform makes an impact by being open to all health data users and accepting all appropriate content, as well as implementing key features that have not been widely available, including managing structured metadata, access via an API, and support for computable phenotypes. Conclusions: We have created the first openly available, programmatically accessible resource enabling the global health research community to store and manage phenotyping algorithms. Removing barriers to describing, sharing, and computing phenotypes will help unleash the potential benefit of health data for patients and the public.

7.
Respir Res ; 25(1): 249, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898447

ABSTRACT

BACKGROUND: Our study examined whether prevalent and incident comorbidities are increased in idiopathic pulmonary fibrosis (IPF) patients when compared to matched chronic obstructive pulmonary disease (COPD) patients and control subjects without IPF or COPD. METHODS: IPF patients and age, gender and smoking matched COPD patients, diagnosed between 01/01/1997 and 01/01/2019, were identified from the Clinical Practice Research Datalink GOLD database multiple registrations cohort at the first date an ICD-10 or Read code mentioned IPF/COPD. A control cohort comprised age, gender and pack-year smoking matched subjects without IPF or COPD. Prevalent (prior to IPF/COPD diagnosis) and incident (after IPF/COPD diagnosis) comorbidities were examined. Group differences were estimated using a t-test. Mortality relationships were examined using multivariable Cox proportional hazards models adjusted for patient age, gender and smoking status. RESULTS: Across 3055 IPF patients, 38% had 3 or more prevalent comorbidities versus 32% of COPD patients and 21% of matched control subjects. Survival time reduced as the number of comorbidities in an individual increased (p < 0.0001). In IPF, prevalent heart failure (Hazard ratio [HR] = 1.62, 95% Confidence Interval [CI]: 1.43-1.84, p < 0.001), chronic kidney disease (HR = 1.27, 95%CI: 1.10-1.47, p = 0.001), cerebrovascular disease (HR = 1.18, 95%CI: 1.02-1.35, p = 0.02), and abdominal and peripheral vascular disease (HR = 1.29, 95%CI: 1.09-1.50, p = 0.003) were independently associated with reduced survival. Key comorbidities showed increased incidence in IPF (versus COPD) 7-10 years prior to IPF diagnosis. INTERPRETATION: The mortality impact of excessive prevalent comorbidities in IPF versus COPD and smoking matched controls suggests that multiorgan mechanisms of injury need elucidation in patients that develop IPF.


Subject(s)
Comorbidity , Idiopathic Pulmonary Fibrosis , Pulmonary Disease, Chronic Obstructive , Humans , Idiopathic Pulmonary Fibrosis/epidemiology , Idiopathic Pulmonary Fibrosis/mortality , Idiopathic Pulmonary Fibrosis/diagnosis , Male , Female , Aged , Middle Aged , Pulmonary Disease, Chronic Obstructive/epidemiology , Pulmonary Disease, Chronic Obstructive/diagnosis , Pulmonary Disease, Chronic Obstructive/mortality , Prevalence , Aged, 80 and over , Cohort Studies , Incidence
8.
Nat Commun ; 15(1): 4257, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38763986

ABSTRACT

The COVID-19 pandemic exposed a global deficiency of systematic, data-driven guidance to identify high-risk individuals. Here, we illustrate the utility of routinely recorded medical history to predict the risk for 1883 diseases across clinical specialties and support the rapid response to emerging health threats such as COVID-19. We developed a neural network to learn from health records of 502,460 UK Biobank participants. Importantly, we observed discriminative improvements over basic demographic predictors for 1774 (94.3%) endpoints. After transferring the unmodified risk models to the All of Us cohort, we replicated these improvements for 1347 (89.8%) of 1500 investigated endpoints, demonstrating generalizability across healthcare systems and historically underrepresented groups. Ultimately, we showed how this approach could have been used to identify individuals vulnerable to severe COVID-19. Our study demonstrates the potential of medical history to support guidance for emerging pandemics by systematically estimating risk for thousands of diseases at once at minimal cost.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , COVID-19/epidemiology , COVID-19/virology , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , Male , Female , United Kingdom/epidemiology , Pandemics , Medical History Taking , Middle Aged , Neural Networks, Computer , Aged , Adult , Risk Factors , Risk Assessment/methods , United States/epidemiology , Cohort Studies
9.
Lancet Digit Health ; 6(4): e281-e290, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38519155

ABSTRACT

BACKGROUND: An electronic health record (EHR) holds detailed longitudinal information about a patient's health status and general clinical history, a large portion of which is stored as unstructured, free text. Existing approaches to model a patient's trajectory focus mostly on structured data and a subset of single-domain outcomes. This study aims to evaluate the effectiveness of Foresight, a generative transformer in temporal modelling of patient data, integrating both free text and structured formats, to predict a diverse array of future medical outcomes, such as disorders, substances (eg, to do with medicines, allergies, or poisonings), procedures, and findings (eg, relating to observations, judgements, or assessments). METHODS: Foresight is a novel transformer-based pipeline that uses named entity recognition and linking tools to convert EHR document text into structured, coded concepts, followed by providing probabilistic forecasts for future medical events, such as disorders, substances, procedures, and findings. The Foresight pipeline has four main components: (1) CogStack (data retrieval and preprocessing); (2) the Medical Concept Annotation Toolkit (structuring of the free-text information from EHRs); (3) Foresight Core (deep-learning model for biomedical concept modelling); and (4) the Foresight web application. We processed the entire free-text portion from three different hospital datasets (King's College Hospital [KCH], South London and Maudsley [SLaM], and the US Medical Information Mart for Intensive Care III [MIMIC-III]), resulting in information from 811 336 patients and covering both physical and mental health institutions. We measured the performance of models using custom metrics derived from precision and recall. 
FINDINGS: Foresight achieved a precision@10 (ie, of 10 forecasted candidates, at least one is correct) of 0·68 (SD 0·0027) for the KCH dataset, 0·76 (0·0032) for the SLaM dataset, and 0·88 (0·0018) for the MIMIC-III dataset, for forecasting the next new disorder in a patient timeline. Foresight also achieved a precision@10 value of 0·80 (0·0013) for the KCH dataset, 0·81 (0·0026) for the SLaM dataset, and 0·91 (0·0011) for the MIMIC-III dataset, for forecasting the next new biomedical concept. In addition, Foresight was validated on 34 synthetic patient timelines by five clinicians and achieved a relevancy of 33 (97% [95% CI 91-100]) of 34 for the top forecasted candidate disorder. As a generative model, Foresight can forecast follow-on biomedical concepts for as many steps as required. INTERPRETATION: Foresight is a general-purpose model for biomedical concept modelling that can be used for real-world risk forecasting, virtual trials, and clinical research to study the progression of disorders, to simulate interventions and counterfactuals, and for educational purposes. FUNDING: National Health Service Artificial Intelligence Laboratory, National Institute for Health and Care Research Biomedical Research Centre, and Health Data Research UK.
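
The precision@10 definition given in this abstract (of 10 forecasted candidates, at least one is correct) can be illustrated with a short sketch. It assumes each forecast is a ranked list of candidate concept labels and each ground truth is the set of concepts that actually occurred next; the function names `hit_at_k` and `precision_at_k` and the example labels are hypothetical and are not taken from the Foresight code base.

```python
from typing import Sequence, Set


def hit_at_k(ranked_candidates: Sequence[str], true_concepts: Set[str], k: int = 10) -> bool:
    """Return True if any of the top-k ranked candidates is a true concept."""
    return any(candidate in true_concepts for candidate in ranked_candidates[:k])


def precision_at_k(
    forecasts: Sequence[Sequence[str]],
    truths: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Fraction of forecasts with at least one correct candidate in the top k.

    This follows the abstract's wording of precision@10; it is an
    illustrative assumption, not the authors' implementation.
    """
    if len(forecasts) != len(truths):
        raise ValueError("forecasts and truths must have the same length")
    if not forecasts:
        raise ValueError("at least one forecast is required")
    hits = [hit_at_k(ranked, truth, k) for ranked, truth in zip(forecasts, truths)]
    return sum(hits) / len(hits)


# Hypothetical example: three forecast steps with made-up concept labels.
example_forecasts = [
    ["hypertension", "diabetes"],
    ["asthma"],
    ["heart failure", "anaemia"],
]
example_truths = [{"diabetes"}, {"stroke"}, {"heart failure"}]
example_score = precision_at_k(example_forecasts, example_truths, k=10)
```

Under this reading, the metric is a hit rate over forecast steps rather than the classical precision (the share of all returned candidates that are correct), which is why the abstract spells out the definition.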


Subject(s)
Electronic Health Records , State Medicine , Humans , Retrospective Studies , Artificial Intelligence , Mental Health
10.
PLoS Med ; 21(2): e1004343, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38358949

ABSTRACT

BACKGROUND: The occurrence of a range of health outcomes following myocardial infarction (MI) is unknown. Therefore, this study aimed to determine the long-term risk of major health outcomes following MI and generate sociodemographic stratified risk charts in order to inform care recommendations in the post-MI period and underpin shared decision making. METHODS AND FINDINGS: This nationwide cohort study includes all individuals aged ≥18 years admitted to one of 229 National Health Service (NHS) Trusts in England between 1 January 2008 and 31 January 2017 (final follow-up 27 March 2017). We analysed 11 non-fatal health outcomes (subsequent MI and first hospitalisation for heart failure, atrial fibrillation, cerebrovascular disease, peripheral arterial disease, severe bleeding, renal failure, diabetes mellitus, dementia, depression, and cancer) and all-cause mortality. Of the 55,619,430 population of England, 34,116,257 individuals contributing to 145,912,852 hospitalisations were included (mean age 41.7 years [standard deviation (SD) 26.1]; n = 14,747,198 (44.2%) male). There were 433,361 individuals with MI (mean age 67.4 years [SD 14.4]; n = 283,742 (65.5%) male). Following MI, all-cause mortality was the most frequent event (adjusted cumulative incidence at 9 years 37.8%; 95% confidence interval [CI] [37.6,37.9]), followed by heart failure (29.6%; 95% CI [29.4,29.7]), renal failure (27.2%; 95% CI [27.0,27.4]), atrial fibrillation (22.3%; 95% CI [22.2,22.5]), severe bleeding (19.0%; 95% CI [18.8,19.1]), diabetes (17.0%; 95% CI [16.9,17.1]), cancer (13.5%; 95% CI [13.3,13.6]), cerebrovascular disease (12.5%; 95% CI [12.4,12.7]), depression (8.9%; 95% CI [8.7,9.0]), dementia (7.8%; 95% CI [7.7,7.9]), subsequent MI (7.1%; 95% CI [7.0,7.2]), and peripheral arterial disease (6.5%; 95% CI [6.4,6.6]).
Compared with a risk-set matched population of 2,001,310 individuals, first hospitalisation for all non-fatal health outcomes was increased after MI, except for dementia (adjusted hazard ratio [aHR] 1.01; 95% CI [0.99,1.02]; p = 0.468) and cancer (aHR 0.56; 95% CI [0.56,0.57]; p < 0.001). The study includes data from secondary care only; as such, diagnoses made outside of secondary care may have been missed, leading to the potential underestimation of the total burden of disease following MI. CONCLUSIONS: In this study, up to a third of patients with MI developed heart failure or renal failure, 7% had another MI, and 38% died within 9 years (compared with 35% deaths among matched individuals). The incidence of all health outcomes, except dementia and cancer, was higher than expected during the normal life course without MI following adjustment for age, sex, year, and socioeconomic deprivation. Efforts targeted to prevent or limit the accrual of chronic, multisystem disease states following MI are needed and should be guided by the demographic-specific risk charts derived in this study.


Subject(s)
Atrial Fibrillation , Cerebrovascular Disorders , Dementia , Diabetes Mellitus , Heart Failure , Myocardial Infarction , Neoplasms , Renal Insufficiency , Humans , Male , Adolescent , Adult , Aged , Female , Cohort Studies , Atrial Fibrillation/diagnosis , State Medicine , Myocardial Infarction/epidemiology , Heart Failure/complications , Outcome Assessment, Health Care , Renal Insufficiency/complications , Neoplasms/complications
11.
Nat Commun ; 14(1): 6156, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37828025

ABSTRACT

Raynaud's phenomenon (RP) is a common vasospastic disorder that causes severe pain and ulcers, but despite its high reported heritability, no causal genes have been robustly identified. We conducted a genome-wide association study including 5,147 RP cases and 439,294 controls, based on diagnoses from electronic health records, and identified three unreported genomic regions associated with the risk of RP (p < 5 × 10⁻⁸). We prioritized ADRA2A (rs7090046, odds ratio (OR) per allele: 1.26; 95%-CI: 1.20-1.31; p < 9.6 × 10⁻²⁷) and IRX1 (rs12653958, OR: 1.17; 95%-CI: 1.12-1.22, p < 4.8 × 10⁻¹³) as candidate causal genes through integration of gene expression in disease relevant tissues. We further identified a likely causal detrimental effect of low fasting glucose levels on RP risk (rG = -0.21; p-value = 2.3 × 10⁻³), and systematically highlighted drug repurposing opportunities, like the antidepressant mirtazapine. Our results provide the first robust evidence for a strong genetic contribution to RP and highlight a so far underrated role of α2A-adrenoreceptor signalling, encoded at ADRA2A, as a possible mechanism for hypersensitivity to catecholamine-induced vasospasms.


Subject(s)
Genome-Wide Association Study , Raynaud Disease , Humans , Ulcer , Raynaud Disease/genetics , Raynaud Disease/complications , Pain/complications , Transcription Factors/genetics , Homeodomain Proteins , Receptors, Adrenergic, alpha-2/genetics
12.
BMJ Med ; 2(1): e000554, 2023.
Article in English | MEDLINE | ID: mdl-37859783

ABSTRACT

Objective: To clarify the performance of polygenic risk scores in population screening, individual risk prediction, and population risk stratification. Design: Secondary analysis of data in the Polygenic Score Catalog. Setting: Polygenic Score Catalog, April 2022. Secondary analysis of 3915 performance metric estimates for 926 polygenic risk scores for 310 diseases to generate estimates of performance in population screening, individual risk, and population risk stratification. Participants: Individuals contributing to the published studies in the Polygenic Score Catalog. Main outcome measures: Detection rate for a 5% false positive rate (DR5) and the population odds of becoming affected given a positive result; individual odds of becoming affected for a person with a particular polygenic score; and odds of becoming affected for groups of individuals in different portions of a polygenic risk score distribution. Coronary artery disease and breast cancer were used as illustrative examples. Results: For performance in population screening, median DR5 for all polygenic risk scores and all diseases studied was 11% (interquartile range 8-18%). Median DR5 was 12% (9-19%) for polygenic risk scores for coronary artery disease and 10% (9-12%) for breast cancer. The population odds of becoming affected given a positive result were 1:8 for coronary artery disease and 1:21 for breast cancer, with background 10 year odds of 1:19 and 1:41, respectively, which are typical for these diseases at age 50. For individual risk prediction, the corresponding 10 year odds of becoming affected for individuals aged 50 with a polygenic risk score at the 2.5th, 25th, 75th, and 97.5th centiles were 1:54, 1:29, 1:15, and 1:8 for coronary artery disease and 1:91, 1:56, 1:34, and 1:21 for breast cancer.
In terms of population risk stratification, at age 50, the risk of coronary artery disease was divided into five groups, with 10 year odds of 1:41 and 1:11 for the lowest and highest quintile groups, respectively. The 10 year odds was 1:7 for the upper 2.5% of the polygenic risk score distribution for coronary artery disease, a group that contributed 7% of cases. The corresponding estimates for breast cancer were 1:72 and 1:26 for the lowest and highest quintile groups, and 1:19 for the upper 2.5% of the distribution, which contributed 6% of cases. Conclusion: Polygenic risk scores performed poorly in population screening, individual risk prediction, and population risk stratification. Strong claims about the effect of polygenic risk scores on healthcare seem to be disproportionate to their performance.
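
The relationship between the background odds, DR5, and the odds given a positive result reported in this abstract can be checked with a small worked example. The abstract does not state the formula; the sketch below assumes the standard screening arithmetic, in which the odds of being affected among screen-positives equal the background odds multiplied by the ratio of detection rate to false positive rate. The function name `odds_given_positive` is hypothetical.

```python
def odds_given_positive(
    background_unaffected_per_affected: float,
    detection_rate: float,
    false_positive_rate: float = 0.05,
) -> float:
    """Return X for odds of 1:X of becoming affected given a positive result.

    Assumed relationship (not stated in the abstract):
      affected : unaffected among positives
        = (1 * detection_rate) : (background_X * false_positive_rate)
    so X_positive = background_X * false_positive_rate / detection_rate.
    """
    if not 0 < detection_rate <= 1:
        raise ValueError("detection_rate must be in (0, 1]")
    if not 0 < false_positive_rate <= 1:
        raise ValueError("false_positive_rate must be in (0, 1]")
    return background_unaffected_per_affected * false_positive_rate / detection_rate


# Figures taken from the abstract: background 10 year odds and median DR5.
cad_x = odds_given_positive(19, 0.12)     # coronary artery disease, background 1:19
breast_x = odds_given_positive(41, 0.10)  # breast cancer, background 1:41
```

With these inputs the calculation gives roughly 1:7.9 for coronary artery disease and 1:20.5 for breast cancer, which is consistent with the rounded 1:8 and 1:21 quoted in the abstract.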

13.
Eur J Prev Cardiol ; 30(15): 1715-1722, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37294923

ABSTRACT

BACKGROUND: Incident events of cardiovascular diseases (CVDs) are heterogeneous and may result in different mortality risks. Such evidence may help inform patient and physician decisions in CVD prevention and risk factor management. AIMS: This study aimed to determine the extent to which incident events of common CVDs show heterogeneous associations with subsequent mortality risk in the general population. METHODS AND RESULTS: Based on England-wide linked electronic health records, we established a cohort of 1 310 518 people ≥30 years of age initially free of CVD and followed up for non-fatal events of 12 common CVDs and cause-specific mortality. The 12 CVDs were considered as time-varying exposures in Cox's proportional hazards models to estimate hazard rate ratios (HRRs) with 95% confidence intervals (CIs). Over the median follow-up of 4.2 years (2010-16), 81 516 non-fatal CVD events, 10 906 cardiovascular deaths, and 40 843 non-cardiovascular deaths occurred. All 12 CVDs were associated with increased risk of cardiovascular mortality, with HRR (95% CI) ranging from 1.67 (1.47-1.89) for stable angina to 7.85 (6.62-9.31) for haemorrhagic stroke. All 12 CVDs were also associated with increased non-cardiovascular and all-cause mortality risk but to a lesser extent: HRR (95% CI) ranged from 1.10 (1.00-1.22) to 4.55 (4.03-5.13) and from 1.24 (1.13-1.35) to 4.92 (4.44-5.46) for transient ischaemic attack and sudden cardiac arrest, respectively. CONCLUSION: Incident events of 12 common CVDs show significant adverse and markedly differential associations with subsequent cardiovascular, non-cardiovascular, and all-cause mortality risk in the general population.


We linked data available for 1.31 million people seen by English general practitioners in 2010 with data from hospital admissions and death certificates up to 2016 to investigate the risk of death in people who suffered from any of 12 common cardiovascular diseases (CVDs) compared with those who did not. The results show heterogeneously increased risks of death in people who suffered from any of 12 common CVD when compared with people who remained CVD free. The results support efforts of prevention for the entire spectrum of CVD including alleged minor types such as stable angina and transient ischaemic attack.


Subject(s)
Cardiovascular Diseases , Humans , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/etiology , Incidence , Risk Factors , Proportional Hazards Models , England/epidemiology
14.
Lancet Digit Health ; 5(6): e370-e379, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37236697

ABSTRACT

BACKGROUND: Machine learning has been used to analyse heart failure subtypes, but not across large, distinct, population-based datasets, across the whole spectrum of causes and presentations, or with clinical and non-clinical validation by different machine learning methods. Using our published framework, we aimed to discover heart failure subtypes and validate them upon population representative data. METHODS: In this external, prognostic, and genetic validation study we analysed individuals aged 30 years or older with incident heart failure from two population-based databases in the UK (Clinical Practice Research Datalink [CPRD] and The Health Improvement Network [THIN]) from 1998 to 2018. Pre-heart failure and post-heart failure factors (n=645) included demographic information, history, examination, blood laboratory values, and medications. We identified subtypes using four unsupervised machine learning methods (K-means, hierarchical, K-Medoids, and mixture model clustering) with 87 of 645 factors in each dataset. We evaluated subtypes for (1) external validity (across datasets); (2) prognostic validity (predictive accuracy for 1-year mortality); and (3) genetic validity (UK Biobank), association with polygenic risk score (PRS) for heart failure-related traits (n=11), and single nucleotide polymorphisms (n=12). FINDINGS: We included 188 800, 124 262, and 9573 individuals with incident heart failure from CPRD, THIN, and UK Biobank, respectively, between Jan 1, 1998, and Jan 1, 2018. After identifying five clusters, we labelled heart failure subtypes as (1) early onset, (2) late onset, (3) atrial fibrillation related, (4) metabolic, and (5) cardiometabolic. In the external validity analysis, subtypes were similar across datasets (c-statistics: THIN model in CPRD ranged from 0·79 [subtype 3] to 0·94 [subtype 1], and CPRD model in THIN ranged from 0·79 [subtype 1] to 0·92 [subtypes 2 and 5]). 
In the prognostic validity analysis, 1-year all-cause mortality after heart failure diagnosis (subtype 1 0·20 [95% CI 0·14-0·25], subtype 2 0·46 [0·43-0·49], subtype 3 0·61 [0·57-0·64], subtype 4 0·11 [0·07-0·16], and subtype 5 0·37 [0·32-0·41]) differed across subtypes in CPRD and THIN data, as did risk of non-fatal cardiovascular diseases and all-cause hospitalisation. In the genetic validity analysis the atrial fibrillation-related subtype showed associations with the related PRS. Late onset and cardiometabolic subtypes were the most similar and strongly associated with PRS for hypertension, myocardial infarction, and obesity (p<0·0009). We developed a prototype app for routine clinical use, which could enable evaluation of effectiveness and cost-effectiveness. INTERPRETATION: Across four methods and three datasets, including genetic data, in the largest study of incident heart failure to date, we identified five machine learning-informed subtypes, which might inform aetiological research, clinical risk prediction, and the design of heart failure trials. FUNDING: European Union Innovative Medicines Initiative-2.


Subject(s)
Atrial Fibrillation, Heart Failure, Humans, Prognosis, Electronic Health Records, Heart Failure/diagnosis, Heart Failure/epidemiology, Machine Learning
15.
Eur J Prev Cardiol ; 30(11): 1151-1161, 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-36895179

ABSTRACT

AIMS: Most adults presenting in primary care with chest pain symptoms will not receive a diagnosis ('unattributed' chest pain) but are at increased risk of cardiovascular events. We aimed to assess, in patients with unattributed chest pain, risk factors for cardiovascular events and whether those at greatest risk of cardiovascular disease can be identified by an existing general population risk prediction model or by development of a new model. METHODS AND RESULTS: The study used UK primary care electronic health records from the Clinical Practice Research Datalink linked to hospital admission records. The study population was patients aged 18 years or older with recorded unattributed chest pain between 2002 and 2018. Cardiovascular risk prediction models were developed with external validation and comparison of performance to QRISK3, a general population risk prediction model. There were 374 917 patients with unattributed chest pain in the development data set. The strongest risk factors for cardiovascular disease included diabetes, atrial fibrillation, and hypertension. Risk was increased in males, patients of Asian ethnicity, those in more deprived areas, obese patients, and smokers. The final developed model had good predictive performance (external validation c-statistic 0.81, calibration slope 1.02). A model using a subset of key risk factors for cardiovascular disease gave nearly identical performance. QRISK3 underestimated cardiovascular risk. CONCLUSION: Patients presenting with unattributed chest pain are at increased risk of cardiovascular events. It is feasible to accurately estimate individual risk using routinely recorded information in the primary care record, focusing on a small number of risk factors. Patients at highest risk could be targeted for preventative measures.
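For readers unfamiliar with the c-statistic reported above, the following is a minimal sketch of how a concordance statistic can be computed for a binary outcome. It is an illustration only: the function name `c_statistic` and the toy data are hypothetical, and the study itself may have used a time-to-event variant (such as Harrell's C) rather than this simple pairwise form.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Share of (event, non-event) pairs in which the event case has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_risk, non_event_risk in product(events, non_events):
        if event_risk > non_event_risk:
            concordant += 1.0
        elif event_risk == non_event_risk:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical, not from the study): predicted risks and observed outcomes.
toy_risks = [0.9, 0.7, 0.4, 0.2]
toy_outcomes = [1, 0, 1, 0]
print(c_statistic(toy_risks, toy_outcomes))  # 3 of 4 pairs concordant -> 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of events above non-events.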


It is known that patients with chest pain without a recognized cause are at increased risk of future cardiovascular events (for example, heart disease) and so this study aimed to find out whether those patients at greatest risk could be determined using information in their health records. It is possible to accurately estimate a person's risk of future cardiovascular events using the information entered into their health records, and this risk can be estimated using only a small number of factors. Patients at highest risk could now be targeted for management to help prevent future cardiovascular events.


Subject(s)
Cardiovascular Diseases, Adult, Male, Humans, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/epidemiology, Risk Factors, Electronic Health Records, Risk Assessment/methods, Chest Pain/diagnosis, Chest Pain/epidemiology, Chest Pain/etiology, Heart Disease Risk Factors, Primary Health Care, United Kingdom/epidemiology
16.
EBioMedicine ; 89: 104489, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36857859

ABSTRACT

BACKGROUND: Although chronic kidney disease (CKD) is associated with high multimorbidity, polypharmacy, morbidity and mortality, existing classification systems (mild to severe, usually based on estimated glomerular filtration rate, proteinuria or urine albumin-creatinine ratio) and risk prediction models largely ignore the complexity of CKD, its risk factors and its outcomes. Improved subtype definition could improve prediction of outcomes and inform effective interventions. METHODS: We analysed individuals ≥18 years with incident and prevalent CKD (n = 350,067 and 195,422 respectively) from a population-based electronic health record resource (2006-2020; Clinical Practice Research Datalink, CPRD). We included factors (n = 264 with 2670 derived variables), e.g. demography, history, examination, blood laboratory values and medications. Using a published framework, we identified subtypes through seven unsupervised machine learning (ML) methods (K-means, Diana, HC, Fanny, PAM, Clara, Model-based) with 66 (of 2670) variables in each dataset. We evaluated subtypes for: (i) internal validity (within dataset, across methods); (ii) prognostic validity (predictive accuracy for 5-year all-cause mortality and admissions); and (iii) medications (new and existing by British National Formulary chapter). FINDINGS: After identifying five clusters across seven approaches, we labelled CKD subtypes: 1. Early-onset, 2. Late-onset, 3. Cancer, 4. Metabolic, and 5. Cardiometabolic. Internal validity: We trained a high-performing model (using XGBoost) that could predict disease subtypes with 95% accuracy for incident and prevalent CKD (Sensitivity: 0.81-0.98, F1 score: 0.84-0.97). Prognostic validity: 5-year all-cause mortality, hospital admissions, and incidence of new chronic diseases differed across CKD subtypes. 
The 5-year risks of mortality and admissions in the overall incident CKD population were highest in the cardiometabolic subtype: 43.3% (42.3-42.8%) and 29.5% (29.1-30.0%), respectively, and lowest in the early-onset subtype: 5.7% (5.5-5.9%) and 18.7% (18.4-19.1%). MEDICATIONS: Across CKD subtypes, the distribution of prescription medication classes at baseline varied, with the highest medication burden in the cardiometabolic and metabolic subtypes, and a higher burden in prevalent than in incident CKD. INTERPRETATION: In the largest CKD study using ML to date, we identified five distinct subtypes in individuals with incident and prevalent CKD. These subtypes have relevance to the study of aetiology, therapeutics and risk prediction. FUNDING: AstraZeneca UK Ltd, Health Data Research UK.


Subject(s)
Cardiovascular Diseases, Chronic Renal Insufficiency, Humans, Prognosis, Electronic Health Records, Machine Learning
17.
Lancet Digit Health ; 5(1): e16-e27, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36460578

ABSTRACT

BACKGROUND: Globally, there is a paucity of multimorbidity and comorbidity data, especially for minority ethnic groups and younger people. We estimated the frequency of common disease combinations and identified non-random disease associations for all ages in a multiethnic population. METHODS: In this population-based study, we examined multimorbidity and comorbidity patterns stratified by ethnicity or race, sex, and age for 308 health conditions using electronic health records from individuals included on the Clinical Practice Research Datalink linked with the Hospital Episode Statistics admitted patient care dataset in England. We included individuals who were older than 1 year and who had been registered for at least 1 year in a participating general practice during the study period (between April 1, 2010, and March 31, 2015). We identified the most common combinations of conditions and comorbidities for index conditions. We defined comorbidity as the accumulation of additional conditions to an index condition over an individual's lifetime. We used network analysis to identify conditions that co-occurred more often than expected by chance. We developed online interactive tools to explore multimorbidity and comorbidity patterns overall and by subgroup based on ethnicity, sex, and age. FINDINGS: We collected data for 3 872 451 eligible patients, of whom 1 955 700 (50·5%) were women and girls, 1 916 751 (49·5%) were men and boys, 2 666 234 (68·9%) were White, 155 435 (4·0%) were south Asian, and 98 815 (2·6%) were Black. We found that a higher proportion of boys aged 1-9 years (132 506 [47·8%] of 277 158) had two or more diagnosed conditions than did girls in the same age group (106 982 [40·3%] of 265 179), but more women and girls were diagnosed with multimorbidity than were boys aged 10 years and older and men (1 361 232 [80·5%] of 1 690 521 vs 1 161 308 [70·8%] of 1 639 593). 
White individuals (2 097 536 [78·7%] of 2 666 234) were more likely to be diagnosed with two or more conditions than were Black (59 339 [60·1%] of 98 815) or south Asian individuals (93 617 [60·2%] of 155 435). Depression commonly co-occurred with anxiety, migraine, obesity, atopic conditions, deafness, soft-tissue disorders, and gastrointestinal disorders across all subgroups. Heart failure often co-occurred with hypertension, atrial fibrillation, osteoarthritis, stable angina, myocardial infarction, chronic kidney disease, type 2 diabetes, and chronic obstructive pulmonary disease. Spinal fractures were most strongly non-randomly associated with malignancy in Black individuals, but with osteoporosis in White individuals. Hypertension was most strongly associated with kidney disorders in those aged 20-29 years, but with dyslipidaemia, obesity, and type 2 diabetes in individuals aged 40 years and older. Breast cancer was associated with different comorbidities in individuals from different ethnic groups. Asthma was associated with different comorbidities between males and females. Bipolar disorder was associated with different comorbidities in younger age groups compared with older age groups. INTERPRETATION: Our findings and interactive online tools are a resource for: patients and their clinicians, to prevent and detect comorbid conditions; research funders and policy makers, to redesign service provision, training priorities, and guideline development; and biomedical researchers and manufacturers of medicines, to provide leads for research into common or sequential pathways of disease and inform the design of clinical trials. FUNDING: UK Research and Innovation, Medical Research Council, National Institute for Health and Care Research, Department of Health and Social Care, Wellcome Trust, British Heart Foundation, and The Alan Turing Institute.


Subject(s)
Type 2 Diabetes Mellitus, Hypertension, Male, Humans, Female, Adult, Middle Aged, Aged, Multimorbidity, State Medicine, Type 2 Diabetes Mellitus/epidemiology, Comorbidity, Hypertension/epidemiology, Obesity/epidemiology
18.
J R Soc Med ; 116(1): 10-20, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36374585

ABSTRACT

OBJECTIVES: To use national, pre- and post-pandemic electronic health records (EHR) to develop and validate a scenario-based model incorporating baseline mortality risk, infection rate (IR) and relative risk (RR) of death for prediction of excess deaths. DESIGN: An EHR-based, retrospective cohort study. SETTING: Linked EHR in Clinical Practice Research Datalink (CPRD); and linked EHR and COVID-19 data in England provided in NHS Digital Trusted Research Environment (TRE). PARTICIPANTS: In the development (CPRD) and validation (TRE) cohorts, we included 3.8 million and 35.1 million individuals aged ≥30 years, respectively. MAIN OUTCOME MEASURES: One-year all-cause excess deaths related to COVID-19 from March 2020 to March 2021. RESULTS: From 1 March 2020 to 1 March 2021, there were 127,020 observed excess deaths. Observed RR was 4.34% (95% CI, 4.31-4.38) and IR was 6.27% (95% CI, 6.26-6.28). In the validation cohort, predicted one-year excess deaths were 100,338 compared with the observed 127,020 deaths with a ratio of predicted to observed excess deaths of 0.79. CONCLUSIONS: We show that a simple, parsimonious model incorporating baseline mortality risk, one-year IR and RR of the pandemic can be used for scenario-based prediction of excess deaths in the early stages of a pandemic. Our analyses show that EHR could inform pandemic planning and surveillance, despite limited use in emergency preparedness to date. Although infection dynamics are important in the prediction of mortality, future models should take greater account of underlying conditions.
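The abstract names three inputs to the scenario model (baseline mortality risk, IR and RR) but does not give the formula. The sketch below assumes one plausible multiplicative form, in which excess deaths equal the number of people infected, times their baseline one-year risk, times (RR − 1). That form, the function names, and the scenario inputs are assumptions for illustration, not the authors' published specification. The final lines only reproduce the predicted-to-observed ratio from the figures quoted in the abstract.

```python
def excess_deaths(population, baseline_risk, infection_rate, relative_risk):
    """Assumed scenario formula: infected people x baseline 1-year risk x (RR - 1).
    This multiplicative form is an illustration, not taken from the abstract."""
    infected = population * infection_rate
    return infected * baseline_risk * (relative_risk - 1.0)


def predicted_to_observed(predicted, observed):
    """Ratio of predicted to observed excess deaths."""
    return predicted / observed


# Hypothetical scenario: 1,000 people, 5% baseline risk, 10% infected, RR of 2.
print(excess_deaths(1_000, 0.05, 0.10, 2.0))

# Figures quoted in the abstract: 100,338 predicted vs 127,020 observed.
print(round(predicted_to_observed(100_338, 127_020), 2))  # 0.79
```

A ratio below 1 means the model under-predicted the observed excess deaths, which is consistent with the reported value of 0.79.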


Subject(s)
COVID-19, Humans, COVID-19/epidemiology, Retrospective Studies, Pandemics, Electronic Health Records, England/epidemiology
19.
Wellcome Open Res ; 8: 262, 2023.
Article in English | MEDLINE | ID: mdl-39092423

ABSTRACT

Background: Electronic health records (EHRs) have the potential to be used to produce detailed disease burden estimates. In this study we created disease estimates using national EHRs for three high-burden conditions, compared estimates between linked and unlinked datasets, and produced estimates stratified by age, sex, ethnicity, socio-economic deprivation and geographical region. Methods: EHRs containing primary care (Clinical Practice Research Datalink), secondary care (Hospital Episode Statistics) and mortality records (Office for National Statistics) were used. We used existing disease phenotyping algorithms to identify cases of cancer (breast, lung, colorectal and prostate), type 1 and 2 diabetes, and low back pain. We calculated age-standardised incidence of first cancer, point prevalence for diabetes, and primary care consultation prevalence for low back pain. Results: 7.2 million people contributing 45.3 million person-years of active follow-up between 2000 and 2014 were included. CPRD-HES combined and CPRD-HES-ONS combined lung and bowel cancer incidence estimates by sex were similar to cancer registry estimates. Linked CPRD-HES estimates for combined type 1 and type 2 diabetes were consistently higher than those of CPRD alone, with the difference steadily increasing over time from 0.26 percentage points (2.99% for CPRD-HES vs. 2.73% for CPRD) in 2002 to 0.58 percentage points (6.17% vs. 5.59%) in 2013. Low back pain prevalence was highest in the most deprived quintile, and the gap relative to the least deprived quintile widened between 2000 and 2013, with the largest difference of 27% (558.70 per 10,000 people vs 438.20) in 2013. Conclusions: We used national EHRs to produce detailed burden of disease estimates by deprivation, ethnicity and geographical region. National EHRs have the potential to improve disease burden estimates at a local and global level and may serve as more automated, timely and precise inputs for policy making and global burden of disease estimation.
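The differences quoted in the Results mix two kinds of comparison: the diabetes gap is an absolute difference between two prevalences, while the low back pain gap is a difference relative to the least deprived quintile. The short sketch below reproduces both from the figures in the abstract; the helper names are hypothetical and are used only to make the arithmetic explicit.

```python
def absolute_difference(linked, unlinked):
    """Gap between two prevalences, in percentage points."""
    return linked - unlinked


def relative_difference(most_deprived, least_deprived):
    """Gap expressed as a fraction of the least deprived quintile's value."""
    return (most_deprived - least_deprived) / least_deprived


# Diabetes prevalence, CPRD-HES vs CPRD alone (values from the abstract).
print(round(absolute_difference(2.99, 2.73), 2))  # 2002 -> 0.26
print(round(absolute_difference(6.17, 5.59), 2))  # 2013 -> 0.58

# Low back pain consultations per 10,000 people in 2013 (values from the abstract).
print(round(100 * relative_difference(558.70, 438.20)))  # -> 27 (percent)
```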
