Results 1 - 20 of 155
1.
medRxiv ; 2024 May 31.
Article in English | MEDLINE | ID: mdl-38853961

ABSTRACT

Polygenic scores (PGS) have transformed human genetic research and have multiple potential clinical applications, including risk stratification for disease prevention and prediction of treatment response. Here, we present a series of recent enhancements to the PGS Catalog (www.PGSCatalog.org), the largest findable, accessible, interoperable, and reusable (FAIR) repository of PGS. These include expansions in data content and ancestral diversity as well as the addition of new features. We further present the PGS Catalog Calculator (pgsc_calc, https://github.com/PGScatalog/pgsc_calc), an open-source, scalable and portable pipeline to reproducibly calculate PGS that securely democratizes equitable PGS applications by implementing genetic ancestry estimation and score normalization using reference data. With the PGS Catalog and Calculator, users can now quantify an individual's genetic predisposition for hundreds of common diseases and clinically relevant traits. Taken together, these updates and tools facilitate the next generation of PGS, thus lowering barriers to the clinical studies necessary to identify where PGS may be integrated into clinical practice.
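
As a rough illustration of what calculating and normalizing a PGS involves, the sketch below computes a score as a weighted sum of effect-allele dosages and expresses it as a z-score against a reference panel. It is a minimal sketch under stated assumptions, not the pgsc_calc implementation: the function names, variant weights, and genotypes are all hypothetical, and the plain z-score only stands in for the ancestry-aware normalization the abstract describes.

```python
from statistics import mean, stdev


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


def normalize_to_reference(score, reference_scores):
    """Z-score of a raw score relative to scores from a reference panel."""
    return (score - mean(reference_scores)) / stdev(reference_scores)


# Hypothetical per-variant effect weights (assumed values, not from any real score).
weights = [0.12, -0.05, 0.30]

# Hypothetical reference-panel genotypes, one row per reference individual.
reference_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 1]]
reference_scores = [polygenic_score(g, weights) for g in reference_genotypes]

# Hypothetical target individual.
individual = [1, 0, 2]
raw = polygenic_score(individual, weights)
z = normalize_to_reference(raw, reference_scores)
print(f"raw score = {raw:.2f}, normalized score = {z:.2f}")
```

A normalized score is only meaningful relative to the reference panel used, which is why the choice of reference data matters for comparisons across ancestries.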

2.
Am J Hum Genet ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38908374

ABSTRACT

Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.

3.
medRxiv ; 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38699308

ABSTRACT

Blood cell phenotypes are routinely tested in healthcare to inform clinical decisions. Genetic variants influencing mean blood cell phenotypes have been used to understand disease aetiology and improve prediction; however, additional information may be captured by genetic effects on observed variance. Here, we mapped variance quantitative trait loci (vQTL), i.e. genetic loci associated with trait variance, for 29 blood cell phenotypes from the UK Biobank (N~408,111). We discovered 176 independent blood cell vQTLs, of which 147 were not found by additive QTL mapping. vQTLs displayed on average 1.8-fold stronger negative selection than additive QTL, highlighting that selection acts to reduce extreme blood cell phenotypes. Variance polygenic scores (vPGSs) were constructed to stratify individuals in the INTERVAL cohort (N~40,466), where genetically less variable individuals (low vPGS) had higher conventional PGS accuracy (by ~19%) than genetically more variable individuals. Genetic prediction of blood cell traits improved by ~10% on average when combining PGS with vPGS. Using Mendelian randomisation and vPGS association analyses, we found that alcohol consumption significantly increased blood cell trait variances, highlighting the utility of blood cell vQTLs and vPGSs to provide novel insight into phenotype aetiology as well as improve prediction.

4.
J Am Heart Assoc ; 13(11): e034254, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38780153

ABSTRACT

BACKGROUND: Ten-year risk equations for incident heart failure (HF) are available for the general population, but not for patients with established atherosclerotic cardiovascular disease (ASCVD), which is highly prevalent in HF cohorts. This study aimed to develop and validate 10-year risk equations for incident HF in patients with known ASCVD. METHODS AND RESULTS: Ten-year risk equations for incident HF were developed using the United Kingdom Biobank cohort (recruitment 2006-2010) including participants with established ASCVD but free from HF at baseline. Model performance was validated using the Australian Baker Heart and Diabetes Institute Biobank cohort (recruitment 2000-2011) and compared with the performance of general population risk models. Incident HF occurred in 13.7% of the development cohort (n=31 446, median age 63 years, 35% women, follow-up 10.7±2.7 years) and in 21.3% of the validation cohort (n=1659, median age 65 years, 25% women, follow-up 9.4±3.7 years). Predictors of HF included in the sex-specific models were age, body mass index, systolic blood pressure (treated or untreated), glucose (treated or untreated), cholesterol, smoking status, QRS duration, kidney disease, myocardial infarction, and atrial fibrillation. ASCVD-HF equations had good discrimination and calibration in development and validation cohorts, with superior performance to general population risk equations. CONCLUSIONS: ASCVD-specific 10-year risk equations for HF outperform general population risk models in individuals with established ASCVD. The ASCVD-HF equations can be calculated from readily available clinical data and could facilitate screening and preventative treatment decisions in this high-risk group.


Subjects
Atherosclerosis, Heart Failure, Humans, Female, Male, Heart Failure/epidemiology, Heart Failure/diagnosis, Middle Aged, Aged, Risk Assessment/methods, Incidence, Atherosclerosis/epidemiology, Atherosclerosis/diagnosis, United Kingdom/epidemiology, Risk Factors, Time Factors, Australia/epidemiology, Reproducibility of Results
5.
Nat Genet ; 56(5): 752-757, 2024 May.
Article in English | MEDLINE | ID: mdl-38684898

ABSTRACT

Health equity is the state in which everyone has fair and just opportunities to attain their highest level of health. The field of human genomics has fallen short in increasing health equity, largely because the diversity of the human population has been inadequately reflected among participants of genomics research. This lack of diversity leads to disparities that can have scientific and clinical consequences. Achieving health equity related to genomics will require greater effort in addressing inequities within the field. As part of the commitment of the National Human Genome Research Institute (NHGRI) to advancing health equity, it convened experts in genomics and health equity research to make recommendations and performed a review of current literature to identify the landscape of gaps and opportunities at the interface between human genomics and health equity research. This Perspective describes these findings and examines health equity within the context of human genomics and genomic medicine.


Subjects
Genomics, Health Equity, Humans, Genomics/methods, United States, Human Genome, National Human Genome Research Institute (U.S.)
6.
Nat Aging ; 4(4): 584-594, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38528230

ABSTRACT

Multiomics has shown promise in noninvasive risk profiling and early detection of various common diseases. In the present study, in a prospective population-based cohort with ~18 years of e-health record follow-up, we investigated the incremental and combined value of genomic and gut metagenomic risk assessment compared with conventional risk factors for predicting incident coronary artery disease (CAD), type 2 diabetes (T2D), Alzheimer disease and prostate cancer. We found that polygenic risk scores (PRSs) improved prediction over conventional risk factors for all diseases. Gut microbiome scores improved predictive capacity over baseline age for CAD, T2D and prostate cancer. Integrated risk models of PRSs, gut microbiome scores and conventional risk factors achieved the highest predictive performance for all diseases studied compared with models based on conventional risk factors alone. The present study demonstrates that integrated PRSs and gut metagenomic risk models improve the predictive value over conventional risk factors for common chronic diseases.


Subjects
Coronary Artery Disease, Type 2 Diabetes Mellitus, Prostatic Neoplasms, Male, Humans, Type 2 Diabetes Mellitus/diagnosis, Prospective Studies, Risk Factors, Coronary Artery Disease/genetics, Genetic Risk Score
7.
Genome Med ; 16(1): 33, 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38373998

ABSTRACT

Polygenic scores (PGS) can be used for risk stratification by quantifying individuals' genetic predisposition to disease, and many potentially clinically useful applications have been proposed. Here, we review the latest potential benefits of PGS in the clinic and challenges to implementation. PGS could augment risk stratification through combined use with traditional risk factors (demographics, disease-specific risk factors, family history, etc.), to support diagnostic pathways, to predict groups with therapeutic benefits, and to increase the efficiency of clinical trials. However, there exist challenges to maximizing the clinical utility of PGS, including FAIR (Findable, Accessible, Interoperable, and Reusable) use and standardized sharing of the genomic data needed to develop and recalculate PGS, the equitable performance of PGS across populations and ancestries, the generation of robust and reproducible PGS calculations, and the responsible communication and interpretation of results. We outline how these challenges may be overcome analytically and with more diverse data as well as highlight sustained community efforts to achieve equitable, impactful, and responsible use of PGS in healthcare.


Subjects
Communication, Genetic Predisposition to Disease, Humans, Genomics, Multifactorial Inheritance, Risk Factors, Genome-Wide Association Study
9.
Nat Commun ; 15(1): 1540, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38378775

ABSTRACT

Recent advancements in plasma lipidomic profiling methodology have significantly increased specificity and accuracy of lipid measurements. This evolution, driven by improved chromatographic and mass spectrometric resolution of newer platforms, has made it challenging to align datasets created at different times, or on different platforms. Here we present a framework for harmonising such plasma lipidomic datasets with different levels of granularity in their lipid measurements. Our method utilises elastic-net prediction models, constructed from high-resolution lipidomics reference datasets, to predict unmeasured lipid species in lower-resolution studies. The approach involves (1) constructing composite lipid measures in the reference dataset that map to less resolved lipids in the target dataset, (2) addressing discrepancies between aligned lipid species, (3) generating prediction models, (4) assessing their transferability into the target dataset, and (5) evaluating their prediction accuracy. To demonstrate our approach, we used the AusDiab population-based cohort (747 lipid species) as the reference to impute unmeasured lipid species into the LIPID study (342 lipid species). Furthermore, we compared measured and imputed lipids in terms of parameter estimation and predictive performance, and validated imputations in an independent study. Our method for harmonising plasma lipidomic datasets will facilitate model validation and data integration efforts.


Subjects
Lipidomics, Plasma, Humans, Mass Spectrometry, Lipids
11.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38200587

ABSTRACT

MOTIVATION: Protein-protein interactions (PPIs) are essential to understanding biological pathways as well as their roles in development and disease. Computational tools, based on classic machine learning, have been successful at predicting PPIs in silico, but the lack of consistent and reliable frameworks for this task has led to network models that are difficult to compare and discrepancies between algorithms that remain unexplained. RESULTS: To better understand the underlying inference mechanisms that underpin these models, we designed an open-source framework for benchmarking that accounts for a range of biological and statistical pitfalls while facilitating reproducibility. We use it to shed light on the impact of network topology and how different algorithms deal with highly connected proteins. By studying functional genomics-based and sequence-based models on human PPIs, we show their complementarity as the former performs best on lone proteins while the latter specializes in interactions involving hubs. We also show that algorithm design has little impact on performance with functional genomic data. We replicate our results between both human and S. cerevisiae data and demonstrate that models using functional genomics are better suited to PPI prediction across species. With rapidly increasing amounts of sequence and functional genomics data, our study provides a principled foundation for future construction, comparison, and application of PPI networks. AVAILABILITY AND IMPLEMENTATION: The code and data are available on GitHub: https://github.com/Llannelongue/B4PPI.


Subjects
Protein Interaction Maps, Saccharomyces cerevisiae, Humans, Protein Interaction Maps/genetics, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Reproducibility of Results, Proteins/metabolism, Algorithms, Machine Learning, Protein Interaction Mapping/methods
12.
Trends Microbiol ; 32(7): 707-719, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38246848

ABSTRACT

The human microbiome has been increasingly recognized as having potential use for disease prediction. Predicting the risk, progression, and severity of diseases holds promise to transform clinical practice, empower patient decisions, and reduce the burden of various common diseases, as has been demonstrated for cardiovascular disease or breast cancer. Combining multiple modifiable and non-modifiable risk factors, including high-dimensional genomic data, has been traditionally favored, but few studies have incorporated the human microbiome into models for predicting the prospective risk of disease. Here, we review research into the use of the human microbiome for disease prediction with a particular focus on prospective studies as well as the modulation and engineering of the microbiome as a therapeutic strategy.


Subjects
Microbiota, Humans, Risk Factors, Cardiovascular Diseases/microbiology
13.
Arterioscler Thromb Vasc Biol ; 44(2): 477-487, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37970720

ABSTRACT

BACKGROUND: Dyslipidemia is treated effectively with statins, but treatment has the potential to induce new-onset type-2 diabetes. Gut microbiota may contribute to this outcome variability. We assessed the associations of gut microbiota diversity and composition with statins. Bacterial associations with statin-associated new-onset type-2 diabetes (T2D) risk were also prospectively evaluated. METHODS: We examined shallow-shotgun-sequenced fecal samples from 5755 individuals in the FINRISK-2002 population cohort with a 17+-year-long register-based follow-up. Alpha-diversity was quantified using the Shannon index and beta-diversity with Aitchison distance. Species-specific differential abundances were analyzed using general multivariate regression. Prospective associations were assessed with Cox regression. Applicable results were validated using gradient boosting. RESULTS: Statin use was associated with differing taxonomic composition (R², 0.02%; q=0.02) and 13 differentially abundant species in fully adjusted models (MaAsLin; q<0.05). The strongest positive association was with Clostridium sartagoforme (β=0.37; SE=0.13; q=0.02) and the strongest negative association with Bacteroides cellulosilyticus (β=-0.31; SE=0.11; q=0.02). Twenty-five microbial features had significant associations with incident T2D in statin users, of which only Bacteroides vulgatus (HR, 1.286 [1.136-1.457]; q=0.03) was consistent regardless of model adjustment. Finally, higher statin-associated T2D risk was seen with [Ruminococcus] torques (ΔHRstatins, +0.11; q=0.03), Blautia obeum (ΔHRstatins, +0.06; q=0.01), Blautia sp. KLE 1732 (ΔHRstatins, +0.05; q=0.01), and beta-diversity principal component 1 (ΔHRstatins, +0.07; q=0.03) but only when adjusting for demographic covariates. CONCLUSIONS: Statin users have compositionally differing microbiotas from nonusers. The human gut microbiota is associated with incident T2D risk in statin users and possibly has additive effects on statin-associated new-onset T2D risk.
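
The two diversity measures named in the methods, the Shannon index and Aitchison distance, can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's pipeline: the function names and the example counts are hypothetical, and the pseudocount used to handle zero counts before the centered log-ratio transform is an assumed choice that the abstract does not specify.

```python
import math


def shannon_index(counts):
    """Shannon alpha-diversity: -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform; the pseudocount (an assumption) avoids log(0)."""
    logs = [math.log(c + pseudocount) for c in counts]
    centre = sum(logs) / len(logs)
    return [value - centre for value in logs]


def aitchison_distance(counts_a, counts_b, pseudocount=1.0):
    """Euclidean distance between the CLR-transformed compositions of two samples."""
    a = clr(counts_a, pseudocount)
    b = clr(counts_b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical species counts for two fecal samples (same taxa, same order).
sample_1 = [120, 30, 0, 50]
sample_2 = [80, 60, 10, 50]

print("Shannon index, sample 1:", round(shannon_index(sample_1), 3))
print("Aitchison distance:", round(aitchison_distance(sample_1, sample_2), 3))
```

Because the CLR transform works on log-ratios, the Aitchison distance depends on relative rather than absolute abundances, which suits sequencing data where total read depth varies between samples.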


Subjects
Type 2 Diabetes Mellitus, Dyslipidemias, Gastrointestinal Microbiome, Hydroxymethylglutaryl-CoA Reductase Inhibitors, Humans, Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects, Cross-Sectional Studies, Type 2 Diabetes Mellitus/diagnosis, Type 2 Diabetes Mellitus/epidemiology, Dyslipidemias/diagnosis, Dyslipidemias/drug therapy, Dyslipidemias/epidemiology
14.
Article in English | MEDLINE | ID: mdl-38040454

ABSTRACT

Computing tools and machine learning models play an increasingly important role in biology and are now an essential part of discoveries in protein science. The growing energy needs of modern algorithms have raised concerns in the computational science community in light of the climate emergency. In this work, we summarize the different ways in which protein science can negatively impact the environment and we present the carbon footprint of some popular protein algorithms: molecular simulations, inference of protein-protein interactions, and protein structure prediction. We show that large deep learning models such as AlphaFold and ESMFold can have carbon footprints reaching over 100 tonnes of CO2e in some cases. The magnitude of these impacts highlights the importance of monitoring and mitigating them, and we list actions scientists can take to achieve more sustainable protein computational science.


Subjects
Carbon Footprint, Machine Learning, Algorithms, Proteins
15.
Microbiol Spectr ; 11(6): e0256223, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37971428

ABSTRACT

IMPORTANCE: Drug-resistant tuberculosis (TB) infection is a growing and potent concern, and combating it will be necessary to achieve the WHO's goal of a 95% reduction in TB deaths by 2035. While prior studies have explored the evolution and spread of drug resistance, we still lack a clear understanding of the fitness costs (if any) imposed by resistance-conferring mutations and the role that Mtb genetic lineage plays in determining the likelihood of resistance evolution. This study offers insight into these questions by assessing the dynamics of resistance evolution in a high-burden Southeast Asian setting with a diverse lineage composition. It demonstrates that there are clear lineage-specific differences in the dynamics of resistance acquisition and transmission and shows that different lineages evolve resistance via characteristic mutational pathways.


Subjects
Mycobacterium tuberculosis, Multidrug-Resistant Tuberculosis, Humans, Mycobacterium tuberculosis/genetics, Antitubercular Agents/pharmacology, Antitubercular Agents/therapeutic use, Beijing, Vietnam/epidemiology, Genotype, Multidrug-Resistant Tuberculosis/microbiology, Multiple Bacterial Drug Resistance/genetics, Mutation
16.
medRxiv ; 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37873403

ABSTRACT

Heart failure (HF) is a major public health problem. Early identification of at-risk individuals could allow for interventions that reduce morbidity or mortality. The community-based FINRISK Microbiome DREAM challenge (synapse.org/finrisk) evaluated the use of machine learning approaches on shotgun metagenomics data obtained from fecal samples to predict incident HF risk over 15 years in a population cohort of 7231 Finnish adults (FINRISK 2002, n=559 incident HF cases). Challenge participants used synthetic data for model training and testing. Final models submitted by seven teams were evaluated in the real data. The two highest-scoring models were both based on Cox regression but used different feature selection approaches. We aggregated their predictions to create an ensemble model. Additionally, we refined the models after the DREAM challenge by eliminating phylum information. Models were also evaluated at intermediate timepoints and they predicted 10-year incident HF more accurately than models for 5- or 15-year incidence. We found that bacterial species, especially those linked to inflammation, are predictive of incident HF. This highlights the role of the gut microbiome as a potential driver of inflammation in HF pathophysiology. Our results provide insights into potential modeling strategies of microbiome data in prospective cohort studies. Overall, this study provides evidence that incorporating microbiome information into incident risk models can provide important biological insights into the pathogenesis of HF.

17.
Nat Genet ; 55(11): 1854-1865, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37814053

ABSTRACT

The analysis of longitudinal data from electronic health records (EHRs) has the potential to improve clinical diagnoses and enable personalized medicine, motivating efforts to identify disease subtypes from patient comorbidity information. Here we introduce an age-dependent topic modeling (ATM) method that provides a low-rank representation of longitudinal records of hundreds of distinct diseases in large EHR datasets. We applied ATM to 282,957 UK Biobank samples, identifying 52 diseases with heterogeneous comorbidity profiles; analyses of 211,908 All of Us samples produced concordant results. We defined subtypes of the 52 heterogeneous diseases based on their comorbidity profiles and compared genetic risk across disease subtypes using polygenic risk scores (PRSs), identifying 18 disease subtypes whose PRS differed significantly from other subtypes of the same disease. We further identified specific genetic variants with subtype-dependent effects on disease risk. In conclusion, ATM identifies disease subtypes with differential genome-wide and locus-specific genetic risk profiles.


Subjects
Genetic Predisposition to Disease, Population Health, Humans, Biological Specimen Banks, Genome-Wide Association Study/methods, Risk Factors, Comorbidity, Multifactorial Inheritance/genetics, United Kingdom/epidemiology
19.
Commun Biol ; 6(1): 804, 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37532769

ABSTRACT

RNAseq data can be used to infer genetic variants, yet its use for estimating genetic population structure remains underexplored. Here, we construct a freely available computational tool (RGStraP) to estimate RNAseq-based genetic principal components (RG-PCs) and assess whether RG-PCs can be used to control for population structure in gene expression analyses. Using whole blood samples from understudied Nepalese populations and the Geuvadis study, we show that RG-PCs had comparable results to paired array-based genotypes, with high genotype concordance and high correlations of genetic principal components, capturing subpopulations within the dataset. In differential gene expression analysis, we found that inclusion of RG-PCs as covariates reduced test statistic inflation. Our paper demonstrates that genetic population structure can be directly inferred and controlled for using RNAseq data, thus facilitating improved retrospective and future analyses of transcriptomic data.


Subjects
Population Genetics, Humans, Retrospective Studies, Genotype, Base Sequence, RNA Sequence Analysis
20.
J Am Heart Assoc ; 12(15): e029296, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37489768

ABSTRACT

BACKGROUND: The aim of this study was to provide quantitative evidence of the use of polygenic risk scores for systematically identifying individuals for invitation for full formal cardiovascular disease (CVD) risk assessment. METHODS AND RESULTS: A total of 108 685 participants aged 40 to 69 years, with measured biomarkers, linked primary care records, and genetic data in UK Biobank were used for model derivation and population health modeling. Prioritization tools using age, polygenic risk scores for coronary artery disease and stroke, and conventional risk factors for CVD available within longitudinal primary care records were derived using sex-specific Cox models. We modeled the implications of initiating guideline-recommended statin therapy after prioritizing individuals for invitation to a formal CVD risk assessment. If primary care records were used to prioritize individuals for formal risk assessment using age- and sex-specific thresholds corresponding to 5% false-negative rates, then the numbers of men and women needed to be screened to prevent 1 CVD event are 149 and 280, respectively. In contrast, adding polygenic risk scores to both prioritization and formal assessments, and selecting thresholds to capture the same number of events, resulted in a number needed to screen of 116 for men and 180 for women. CONCLUSIONS: Using both polygenic risk scores and primary care records to prioritize individuals at highest risk of a CVD event for a formal CVD risk assessment can prioritize those who most need interventions more efficiently than using primary care records alone. This could lead to better allocation of resources by reducing the number of risk assessments in primary care while still preventing the same number of CVD events.
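
To put the reported numbers needed to screen in relative terms, the short sketch below computes the proportional reduction for each sex from the four figures given in the abstract (149 vs 116 for men, 280 vs 180 for women). The helper name `relative_reduction` is hypothetical, and the percentages are simple arithmetic on the published values rather than results reported by the authors.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction in number needed to screen relative to the baseline."""
    return (baseline - improved) / baseline


# Numbers needed to screen to prevent 1 CVD event, taken from the abstract:
# (primary care records alone, records plus polygenic risk scores).
number_needed_to_screen = {"men": (149, 116), "women": (280, 180)}

for sex, (records_only, with_prs) in number_needed_to_screen.items():
    reduction = relative_reduction(records_only, with_prs)
    print(f"{sex}: {reduction:.1%} fewer people screened per CVD event prevented")
```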


Subjects
Cardiovascular Diseases, Coronary Artery Disease, Stroke, Male, Humans, Female, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/epidemiology, Cardiovascular Diseases/genetics, Risk Factors, Coronary Artery Disease/complications, Risk Assessment/methods, Stroke/epidemiology, Stroke/genetics, Stroke/prevention & control