Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes.

Bi, Wenjian; Zhou, Wei; Dey, Rounak; Mukherjee, Bhramar; Sampson, Joshua N; Lee, Seunggeun

Affiliations

Bi W; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA. Electronic address: wenjianb@umich.edu.
Zhou W; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142,
Dey R; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
Mukherjee B; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
Sampson JN; Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, DHHS, Bethesda, MD 20892, USA.
Lee S; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Graduate School of Data Science, Seoul National University, Seoul 08826, Republic of Korea. Electronic address: lee7801@snu.ac.kr.

Am J Hum Genet; 108(5): 825-839, 2021 May 6.

Article in English | MEDLINE | ID: mdl-33836139

ABSTRACT

In genome-wide association studies, ordinal categorical phenotypes are widely used to measure human behaviors, satisfaction, and preferences. However, because analysis tools are lacking, methods designed for binary or quantitative traits are commonly and inappropriately applied to categorical phenotypes. To accurately model the dependence of an ordinal categorical phenotype on covariates, we propose an efficient mixed model association test, the proportional odds logistic mixed model (POLMM). POLMM is computationally efficient enough to analyze large datasets with hundreds of thousands of samples, controls type I error rates at a stringent significance level regardless of the phenotypic distribution, and is more powerful than alternative methods. In contrast, standard linear mixed model approaches cannot control type I error rates for rare variants when the phenotypic distribution is unbalanced, although they perform well when testing common variants. We applied POLMM to 258 ordinal categorical phenotypes on array-genotyped and imputed samples from 408,961 individuals in UK Biobank. In total, we identified 5,885 genome-wide significant variants, of which 424 (7.2%) are rare variants with MAF < 0.01.
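The abstract does not spell out the model, so the following is only a minimal sketch of the generic proportional odds (cumulative logit) link that the name POLMM refers to, not the authors' implementation. It assumes the common parameterization logit P(Y <= j) = cutpoint_j - eta, where eta stands for a linear predictor (in a mixed model this would combine covariate, genotype, and random effects). The function name, the sign convention, and the example cutpoints are illustrative assumptions.

```python
import math


def category_probabilities(cutpoints, eta):
    """Return P(Y = j) for each ordinal category under a proportional odds model.

    Assumed (hypothetical) parameterization: logit P(Y <= j) = cutpoints[j] - eta.
    `cutpoints` must be strictly increasing; J cutpoints define J + 1 categories.
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")

    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)

    # Category probabilities are differences of adjacent cumulative values.
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Illustrative values only: four categories, two different linear predictors.
baseline = category_probabilities([-1.0, 0.0, 1.0], eta=0.0)
shifted = category_probabilities([-1.0, 0.0, 1.0], eta=1.0)
```

Under this sign convention a larger eta moves probability mass toward higher categories while every cumulative split shares the same eta, which is the "proportional odds" assumption.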

Subjects

Computer Simulation; Genome-Wide Association Study; Models, Genetic; Phenotype; Biological Specimen Banks; Child; Female; Humans; Male; Research Design; United Kingdom

Keywords

GRM; GWAS; POLMM; PheWAS; UK Biobank; food and other preferences; genetic relationship matrix; genome-wide association studies; mixed model approach; ordinal categorical data; phenome-wide association studies; proportional odds logistic mixed model; saddlepoint approximation; unbalanced phenotypic distribution

Full text: 1 | Database: MEDLINE | Main subject: Phenotype / Computer Simulation / Genome-Wide Association Study / Models, Genetic | Language: En | Publication year: 2021 | Document type: Article