Scalable mixed model methods for set-based association studies on large-scale categorical data analysis and its application to exome-sequencing data in UK Biobank.

Bi, Wenjian; Zhou, Wei; Zhang, Peipei; Sun, Yaoyao; Yue, Weihua; Lee, Seunggeun

Affiliation

Bi W; Department of Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing, China; Center for Medical Genetics, School of Basic Medical Sciences, Peking University, Beijing, China; Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, C
Zhou W; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Zhang P; Department of Biochemistry and Biophysics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China; Key Laboratory for Neuroscience, Ministry of Education/National Health and Family Planning Commission, Peking University, Beijing, China.
Sun Y; Peking University Sixth Hospital, Peking University Institute of Mental Health, Beijing, China; NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China.
Yue W; Peking University Sixth Hospital, Peking University Institute of Mental Health, Beijing, China; NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China; Henan Key Lab of Biological Psychiatry,
Lee S; Graduate School of Data Science, Seoul National University, Seoul, Korea. Electronic address: lee7801@snu.ac.kr.

Am J Hum Genet; 110(5): 762-773, 2023 May 4.

Article in English | MEDLINE | ID: mdl-37019109

ABSTRACT

The ongoing release of large-scale sequencing data in the UK Biobank allows for the identification of associations between rare variants and complex traits. SAIGE-GENE+ is a valid approach to conducting set-based association tests for quantitative and binary traits. However, for ordinal categorical phenotypes, applying SAIGE-GENE+ by treating the trait as quantitative or by binarizing it can inflate type I error rates or reduce power. In this study, we propose POLMM-GENE, a scalable and accurate method for rare-variant association tests that uses a proportional odds logistic mixed model to characterize ordinal categorical phenotypes while adjusting for sample relatedness. POLMM-GENE fully utilizes the categorical nature of phenotypes and thus controls type I error rates well while remaining powerful. In the analyses of UK Biobank 450k whole-exome-sequencing data for five ordinal categorical traits, POLMM-GENE identified 54 gene-phenotype associations.
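
To make the proportional odds idea concrete, the following is a minimal sketch of how category probabilities arise from a cumulative logit model. It assumes the common parameterization logit P(Y <= j) = cutpoint_j - eta, which is an assumption for illustration and is not taken from the abstract; the function name, the cutpoint values, and the linear predictor `eta` are all hypothetical, and this is not the POLMM-GENE implementation.

```python
import math

def category_probabilities(cutpoints, eta):
    """Return P(Y = j) for each ordinal level under a proportional odds model.

    Assumed parameterization (illustrative, not from the paper):
        logit P(Y <= j) = cutpoints[j] - eta
    where eta is the linear predictor (fixed effects plus, in a mixed
    model, a random effect accounting for sample relatedness).
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(Y <= j); the last level always has P = 1.
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        # P(Y = j) is the difference of adjacent cumulative probabilities.
        probs.append(cum - previous)
        previous = cum
    return probs

# Hypothetical example: four ordinal levels, so three cutpoints.
probs = category_probabilities([-1.0, 0.5, 2.0], eta=0.3)
```

Because a single `eta` shifts every cumulative probability at once, the model uses all ordinal levels together instead of collapsing them into two groups or treating them as evenly spaced numbers.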

Subjects

Exome; Genome-Wide Association Study; Genome-Wide Association Study/methods; Exome/genetics; Biological Specimen Banks; Phenotype; Data Analysis; United Kingdom

Keywords

GWAS; POLMM; UK Biobank; categorical data analysis; family relatedness; gene-based analysis; genome-wide association studies; mixed model; proportional odds logistic mixed model; rare variants; region-based analysis; whole-exome sequencing data analysis

Full text: 1 | Database: MEDLINE | Main subject: Genome-Wide Association Study / Exome | Type of study: Risk_factors_studies | Country/Region as subject: Europe | Language: En | Journal: Am J Hum Genet | Publication year: 2023 | Document type: Article
