Search | BVS - MINISTÉRIO DA SAÚDE

Machine learning models for predicting blood pressure phenotypes by combining multiple polygenic risk scores.

Hrytsenko, Yana; Shea, Benjamin; Elgart, Michael; Kurniansyah, Nuzulul; Lyons, Genevieve; Morrison, Alanna C; Carson, April P; Haring, Bernhard; Mitchell, Braxton D; Psaty, Bruce M; Jaeger, Byron C; Gu, C Charles; Kooperberg, Charles; Levy, Daniel; Lloyd-Jones, Donald; Choi, Eunhee; Brody, Jennifer A; Smith, Jennifer A; Rotter, Jerome I; Moll, Matthew; Fornage, Myriam; Simon, Noah; Castaldi, Peter; Casanova, Ramon; Chung, Ren-Hua; Kaplan, Robert; Loos, Ruth J F; Kardia, Sharon L R; Rich, Stephen S; Redline, Susan; Kelly, Tanika; O'Connor, Timothy; Zhao, Wei; Kim, Wonji; Guo, Xiuqing; Ida Chen, Yii-Der; Sofer, Tamar.

Sci Rep ; 14(1): 12436, 2024 May 30.

Article in English | MEDLINE | ID: mdl-38816422

ABSTRACT

We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble consisting of a baseline model, in which prediction is based on demographic and clinical variables only, and a genetic model, in which we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVE from 28.1% to 30.1% (SBP) and from 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and from 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 further PRSs, computed from two independent GWASs, increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs. In summary, non-linear ML models improve BP prediction in models incorporating diverse populations.

Subjects

Blood Pressure, Genome-Wide Association Study, Machine Learning, Multifactorial Inheritance, Phenotype, Humans, Blood Pressure/genetics, Multifactorial Inheritance/genetics, Genome-Wide Association Study/methods, Risk Factors, Male, Female, Genetic Predisposition to Disease, Genetic Models, Hypertension/genetics, Hypertension/physiopathology, Middle Aged, Genetic Risk Score
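
The two-model ensemble and the PVE metric described in the abstract above can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the record: PVE is taken to be 100 * (1 - SSE/SST) on the held-out data, the genetic model is assumed to be fit to the residuals of the baseline model, a one-predictor least-squares line stands in for whichever linear or non-linear learner is actually used at each stage, and the data (age, a single PRS, SBP) are simulated with made-up effect sizes.

```python
import random

def pve(y_true, y_pred):
    """Percentage variance explained, assumed here to be 100 * (1 - SSE / SST)."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 100.0 * (1.0 - sse / sst)

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Simulated (hypothetical) data: one clinical covariate and one PRS.
rng = random.Random(0)
n = 4000
age = [rng.uniform(30, 80) for _ in range(n)]
prs = [rng.gauss(0, 1) for _ in range(n)]
sbp = [100 + 0.5 * a + 4.0 * g + rng.gauss(0, 10) for a, g in zip(age, prs)]

train, test = slice(0, 3000), slice(3000, n)

# Stage 1: baseline model on the clinical covariate only.
b0, b1 = fit_line(age[train], sbp[train])

# Stage 2: genetic model fit to the baseline residuals (an assumption of this sketch).
resid_train = [y - (b0 + b1 * a) for y, a in zip(sbp[train], age[train])]
g0, g1 = fit_line(prs[train], resid_train)

# Ensemble prediction on the held-out set = baseline prediction + genetic prediction.
base_pred = [b0 + b1 * a for a in age[test]]
ens_pred = [p + g0 + g1 * g for p, g in zip(base_pred, prs[test])]

pve_baseline = pve(sbp[test], base_pred)
pve_ensemble = pve(sbp[test], ens_pred)
print(f"baseline PVE: {pve_baseline:.1f}%  ensemble PVE: {pve_ensemble:.1f}%")
```

Swapping `fit_line` for a non-linear learner at either stage leaves the evaluation unchanged, which is what makes the linear versus non-linear comparison in the abstract straightforward to set up.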

A parametric bootstrap approach for computing confidence intervals for genetic correlations with application to genetically determined protein-protein networks.

Tsai, Yi-Ting; Hrytsenko, Yana; Elgart, Michael; Tahir, Usman A; Chen, Zsu-Zsu; Wilson, James G; Gerszten, Robert E; Sofer, Tamar.

HGG Adv ; 5(3): 100304, 2024 Jul 18.

Article in English | MEDLINE | ID: mdl-38720460

ABSTRACT

Genetic correlation refers to the correlation between the genetic determinants of a pair of traits. When using individual-level data, it is typically estimated from a bivariate model specification in which the correlation between the two variables is identifiable and can be estimated from a covariance model that incorporates the genetic relationship between individuals, e.g., using a pre-specified kinship matrix. Inference relying on asymptotic normality of the genetic correlation parameter estimates may be inaccurate when the sample size is small, when the genetic correlation is close to the boundary of the parameter space, and when the heritability of at least one of the traits is low. We address this problem by developing a parametric bootstrap procedure to construct confidence intervals for genetic correlation estimates. The procedure simulates paired traits under a range of heritability and genetic correlation parameters, and it uses the population structure encapsulated by the kinship matrix. Heritabilities and genetic correlations are estimated using the closed-form, method-of-moments Haseman-Elston regression estimators. The proposed parametric bootstrap procedure is especially useful when genetic correlations are computed on pairs of thousands of traits measured on the exact same set of individuals. We demonstrate the parametric bootstrap approach on a proteomics dataset from the Jackson Heart Study.

Subjects

Genetic Models, Humans, Protein Interaction Maps/genetics, Confidence Intervals, Computer Simulation, Algorithms, Phenotype
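
As a rough illustration of the approach summarized in the abstract above, the sketch below pairs Haseman-Elston regression with a simple percentile parametric bootstrap. It is not the authors' procedure: it simulates only at one set of plug-in parameter values instead of over a range, it assumes a toy kinship structure of independent sibling pairs (kinship 0.5 within a pair, 0 otherwise), it assumes uncorrelated environmental noise across the two traits, and the plug-in values passed at the bottom are hypothetical. All function names are made up for this example.

```python
import math
import random

def standardize(x):
    """Center and scale a trait to mean 0 and variance 1."""
    m = sum(x) / len(x)
    s = math.sqrt(sum((v - m) ** 2 for v in x) / len(x))
    return [(v - m) / s for v in x]

def he_weights(kinship_pairs):
    """Turn (i, j, k_ij) triples into least-squares weights for the slope on k_ij."""
    k_bar = sum(k for _, _, k in kinship_pairs) / len(kinship_pairs)
    den = sum((k - k_bar) ** 2 for _, _, k in kinship_pairs)
    return [(i, j, (k - k_bar) / den) for i, j, k in kinship_pairs]

def he_slope(weights, a, b):
    """Haseman-Elston slope of symmetrized trait cross-products on kinship."""
    # The weights sum to zero, so the cross-products need no centering.
    return sum(w * (a[i] * b[j] + a[j] * b[i]) / 2.0 for i, j, w in weights)

def estimate(weights, y1, y2):
    """Return (h2_1, h2_2, rho_g); rho_g is None if a heritability estimate is <= 0."""
    y1, y2 = standardize(y1), standardize(y2)
    h1, h2 = he_slope(weights, y1, y1), he_slope(weights, y2, y2)
    if h1 <= 0 or h2 <= 0:
        return h1, h2, None
    rho = he_slope(weights, y1, y2) / math.sqrt(h1 * h2)
    return h1, h2, max(-1.0, min(1.0, rho))  # clip to the parameter space

def correlated_pair(rho, rng):
    """Two standard normal draws with correlation rho."""
    u, v = rng.gauss(0, 1), rng.gauss(0, 1)
    return u, rho * u + math.sqrt(1 - rho * rho) * v

def simulate_sib_pairs(n_fam, h1, h2, rho, rng):
    """Simulate two unit-variance traits; siblings are stored at indices 2f and 2f + 1."""
    y1, y2 = [], []
    for _ in range(n_fam):
        shared = correlated_pair(rho, rng)       # genetic part shared by both siblings
        for _ in range(2):
            own = correlated_pair(rho, rng)      # sibling-specific genetic part
            g1 = math.sqrt(0.5) * (shared[0] + own[0])
            g2 = math.sqrt(0.5) * (shared[1] + own[1])
            y1.append(math.sqrt(h1) * g1 + math.sqrt(1 - h1) * rng.gauss(0, 1))
            y2.append(math.sqrt(h2) * g2 + math.sqrt(1 - h2) * rng.gauss(0, 1))
    return y1, y2

def bootstrap_ci(n_fam, h1, h2, rho, n_boot=100, level=0.95, seed=1):
    """Percentile interval for rho_g from traits simulated at the plug-in values."""
    rng = random.Random(seed)
    n = 2 * n_fam
    pairs = [(i, j, 0.5 if i // 2 == j // 2 else 0.0)
             for i in range(n) for j in range(i + 1, n)]
    weights = he_weights(pairs)
    draws = []
    for _ in range(n_boot):
        y1, y2 = simulate_sib_pairs(n_fam, h1, h2, rho, rng)
        r = estimate(weights, y1, y2)[2]
        if r is not None:                        # skip replicates with undefined rho_g
            draws.append(r)
    if not draws:
        raise ValueError("no bootstrap replicate gave a defined genetic correlation")
    draws.sort()
    alpha = (1 - level) / 2
    lo = draws[int(alpha * (len(draws) - 1))]
    hi = draws[int(round((1 - alpha) * (len(draws) - 1)))]
    return lo, hi, len(draws)

# Hypothetical plug-in estimates for a pair of traits.
lo, hi, n_valid = bootstrap_ci(n_fam=80, h1=0.6, h2=0.5, rho=0.4)
print(f"95% bootstrap interval for rho_g: [{lo:.2f}, {hi:.2f}] from {n_valid} replicates")
```

Because the kinship weights are computed once and reused, each additional trait pair measured on the same individuals costs only the simulation and three weighted sums, which is the setting the abstract highlights.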

A parametric bootstrap approach for computing confidence intervals for genetic correlations with application to genetically-determined protein-protein networks.

Tsai, Yi-Ting; Hrytsenko, Yana; Elgart, Michael; Tahir, Usman; Chen, Zsu-Zsu; Wilson, James G; Gerszten, Robert; Sofer, Tamar.

medRxiv ; 2023 Oct 25.

Article in English | MEDLINE | ID: mdl-37961678

Machine learning models for blood pressure phenotypes combining multiple polygenic risk scores.

Hrytsenko, Yana; Shea, Benjamin; Elgart, Michael; Kurniansyah, Nuzulul; Lyons, Genevieve; Morrison, Alanna C; Carson, April P; Haring, Bernhard; Mitchell, Braxton D; Psaty, Bruce M; Jaeger, Byron C; Gu, C Charles; Kooperberg, Charles; Levy, Daniel; Lloyd-Jones, Donald; Choi, Eunhee; Brody, Jennifer A; Smith, Jennifer A; Rotter, Jerome I; Moll, Matthew; Fornage, Myriam; Simon, Noah; Castaldi, Peter; Casanova, Ramon; Chung, Ren-Hua; Kaplan, Robert; Loos, Ruth J F; Kardia, Sharon L R; Rich, Stephen S; Redline, Susan; Kelly, Tanika; O'Connor, Timothy; Zhao, Wei; Kim, Wonji; Guo, Xiuqing; Ida Chen, Yii-Der; Sofer, Tamar.

medRxiv ; 2023 Dec 14.

Article in English | MEDLINE | ID: mdl-38168328

ABSTRACT

We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We develop a two-model ensemble consisting of a baseline model, in which prediction is based on demographic and clinical variables only, and a genetic model, in which we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVE from 28.1% to 30.1% (SBP) and from 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs, computed from the largest available GWAS of SBP/DBP, in the genetic model improved the genetic model PVE from 4.8% to 5.1% (SBP) and from 4.7% to 5% (DBP) compared with using a single PRS. Adding 14 further PRSs, computed from two independent GWASs, increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs.
