Evaluation of polygenic scoring methods in five biobanks shows larger variation between biobanks than methods and finds benefits of ensemble learning.
Monti, Remo; Eick, Lisa; Hudjashov, Georgi; Läll, Kristi; Kanoni, Stavroula; Wolford, Brooke N; Wingfield, Benjamin; Pain, Oliver; Wharrie, Sophie; Jermy, Bradley; McMahon, Aoife; Hartonen, Tuomo; Heyne, Henrike; Mars, Nina; Lambert, Samuel; Hveem, Kristian; Inouye, Michael; van Heel, David A; Mägi, Reedik; Marttinen, Pekka; Ripatti, Samuli; Ganna, Andrea; Lippert, Christoph.
Affiliation
  • Monti R; Hasso Plattner Institute, University of Potsdam, Digital Engineering Faculty, Potsdam, Germany; Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Berlin Institute for Medical Systems Biology, Berlin, Germany.
  • Eick L; Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland.
  • Hudjashov G; Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia.
  • Läll K; Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia.
  • Kanoni S; William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK.
  • Wolford BN; K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway.
  • Wingfield B; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
  • Pain O; Maurice Wohl Clinical Neuroscience Institute, Department of Basic and Clinical Neuroscience; Institute of Psychiatry, Psychology and Neuroscience; King's College London, London, UK.
  • Wharrie S; Aalto University, Department of Computer Science, Espoo, Finland.
  • Jermy B; Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland.
  • McMahon A; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK.
  • Hartonen T; Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland.
  • Heyne H; Hasso Plattner Institute, University of Potsdam, Digital Engineering Faculty, Potsdam, Germany.
  • Mars N; Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics,
  • Lambert S; Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway; Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambr
  • Hveem K; K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway; Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway.
  • Inouye M; Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia; British Heart Foundation Cardiovascular Epidemiology Unit, De
  • van Heel DA; Blizard Institute, Queen Mary University of London, London, UK.
  • Mägi R; Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia.
  • Marttinen P; Aalto University, Department of Computer Science, Espoo, Finland.
  • Ripatti S; Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; Department of Public Health, University of Helsinki, Helsinki, Finland.
  • Ganna A; Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; Massachusetts General Hospital and Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Lippert C; Hasso Plattner Institute, University of Potsdam, Digital Engineering Faculty, Potsdam, Germany; Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Me
Am J Hum Genet; 111(7): 1431-1447, 2024-07-11.
Article in English | MEDLINE | ID: mdl-38908374
ABSTRACT
Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.
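The ensemble idea in the abstract (an elastic net that combines per-method PGSs into one score) can be illustrated with a minimal sketch. This is not the prspipe workflow and not the authors' implementation: the data are simulated, the three "methods" are hypothetical noisy proxies of one latent liability, and the function names and hyperparameter values (`alpha`, `l1_ratio`) are assumptions chosen for illustration only.

```python
import random


def standardize(values):
    """Center to mean 0 and scale to unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def soft_threshold(z, t):
    """Soft-thresholding operator used for the L1 part of the penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_elastic_net(columns, y, alpha=0.01, l1_ratio=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    Assumes every column is standardized and y is centered."""
    n = len(y)
    w = [0.0] * len(columns)
    resid = list(y)  # residual y - Xw; equals y while w is all zeros
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Correlation of column j with the residual that excludes column j
            rho = sum(x * r for x, r in zip(xj, resid)) / n + w[j]
            new_w = soft_threshold(rho, alpha * l1_ratio) / (1.0 + alpha * (1.0 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, xj)]
                w[j] = new_w
    return w


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)


def simulate(n, rng, noise_sds=(0.5, 1.0, 1.5)):
    """Hypothetical data: each 'method' score is the liability plus noise."""
    liability = [rng.gauss(0, 1) for _ in range(n)]
    phenotype = [g + rng.gauss(0, 1) for g in liability]
    scores = [[g + rng.gauss(0, sd) for g in liability] for sd in noise_sds]
    return scores, phenotype


rng = random.Random(0)
train_scores, train_y = simulate(2000, rng)  # stands in for the tuning cohort
test_scores, test_y = simulate(2000, rng)    # stands in for a separate target cohort

weights = fit_elastic_net([standardize(c) for c in train_scores], standardize(train_y))

test_cols = [standardize(c) for c in test_scores]
ensemble = [sum(w * col[i] for w, col in zip(weights, test_cols)) for i in range(len(test_y))]
ensemble_corr = pearson(ensemble, test_y)
single_corrs = [pearson(c, test_y) for c in test_scores]

print("ensemble weights:", weights)
print("single-method correlations:", single_corrs)
print("ensemble correlation:", ensemble_corr)
```

In this toy setup the weights are learned in one simulated cohort and applied unchanged to a second one, loosely mirroring the tune-once, transfer-elsewhere design described above. Real analyses would additionally adjust for covariates and choose the penalty by cross-validation.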

Full text: 1 Database: MEDLINE Main subject: Biological Specimen Banks / Multifactorial Inheritance / Genome-Wide Association Study Limits: Humans Language: En Journal: Am J Hum Genet Year: 2024 Document type: Article Country of affiliation: Germany