Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale.
Genet Epidemiol; 44(3): 248-260, 2020 Apr.
Article | English | MEDLINE | ID: mdl-31879980
Logistic regression is the primary analysis tool for binary traits in genome-wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard-to-interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for biobank-scale data. Our method is available as a Julia package, OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case-control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.
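To make the idea of an ordered multinomial model concrete, the following is a minimal sketch of one common form, the proportional odds (cumulative logit) model. The abstract does not state which link function the authors use, so the logit link here is an assumption, and the function names (`category_probs`, `log_likelihood`), the cutpoints, and the effect size are hypothetical illustrations, not the OrdinalGWAS.jl API.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(cutpoints, eta):
    """Category probabilities for an ordinal trait with J = len(cutpoints) + 1 levels.

    Assumed model (proportional odds): P(Y <= j) = logistic(theta_j - eta),
    where theta_1 < ... < theta_{J-1} are the cutpoints and eta is the
    linear predictor (here, effect size times genotype dosage).
    """
    cumulative = [logistic(theta - eta) for theta in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        # Each category probability is a difference of adjacent cumulative ones.
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


def log_likelihood(cutpoints, beta, genotypes, traits):
    """Log-likelihood of ordinal traits (coded 0..J-1) given genotype dosages."""
    total = 0.0
    for g, y in zip(genotypes, traits):
        total += math.log(category_probs(cutpoints, beta * g)[y])
    return total


# Hypothetical three-level trait and a single SNP with effect size 0.5.
cutpoints = [-1.0, 1.0]
for dosage in (0, 1, 2):
    print(dosage, [round(p, 3) for p in category_probs(cutpoints, 0.5 * dosage)])
```

Under this assumed model a single coefficient per variant shifts probability mass toward higher or lower categories while respecting their order, whereas unordered multinomial regression needs a separate coefficient for each non-reference category.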
Main subject: Biological Specimen Banks / Genome-Wide Association Study
Study type: Diagnostic studies / Observational studies / Risk factors studies
Limits: Humans
Language: English
Journal: Genet Epidemiol
Journal subject: Epidemiology / Medical Genetics
Year: 2020
Document type: Article