Investigating population stratification and admixture using eigenanalysis of dense genotypes.
Shriner, D.
Affiliation
  • Shriner D; Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, MD 20892-5635, USA. shrinerda@mail.nih.gov
Heredity (Edinb) ; 107(5): 413-20, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21448230
ABSTRACT
Principal components analysis of genetic data is used in two ways: to avoid inflation of type I error rates in association testing due to population stratification, by adjusting for the top eigenvectors as covariates, and to estimate cluster or group membership independently of self-reported or ethnic identities. Eigendecomposition transforms correlated variables into an equal number of uncorrelated variables. Numerous stopping rules have been developed to identify which principal components should be retained. Recent developments in random matrix theory have led to a formal hypothesis test of the top eigenvalue, providing another way to achieve dimension reduction. In this study, I compare Velicer's minimum average partial test to a test based on the Tracy-Widom distribution as implemented in EIGENSOFT, the most widely used implementation of principal components analysis in genome-wide association analysis. In computer simulations of vicariance based on coalescent theory, EIGENSOFT systematically overestimates the number of significant principal components. Furthermore, this overestimation is larger for samples of admixed individuals than for samples of unadmixed individuals. Overestimating the number of significant principal components can lead to a loss of power in association testing through adjustment for unnecessary covariates and may lead to incorrect inferences about group differentiation. Velicer's minimum average partial test is shown to have both smaller bias and smaller variance, often with a mean squared error of 0, in estimating the number of principal components to retain. Velicer's minimum average partial test is implemented in R code and is suitable for genome-wide genotype data with or without population labels.
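As an illustration of the stopping rule named in the abstract, the following is a minimal Python sketch of the textbook form of Velicer's minimum average partial (MAP) test applied to a correlation matrix. It is not the author's R implementation. The function name `velicer_map` and the simulated example are hypothetical, and the abstract does not say how the correlation matrix is built from genotype data, so that step is left to the caller. The idea is to partial out the first m principal components, compute the average squared off-diagonal partial correlation, and retain the m that minimizes it.

```python
import numpy as np


def velicer_map(corr):
    """Textbook Velicer MAP test on a p x p correlation matrix (illustrative sketch).

    Returns (m, avg_sq), where avg_sq[k] is the average squared off-diagonal
    partial correlation after removing the first k principal components and
    m is the index of the minimum, i.e. the number of components to retain.
    """
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]

    # Eigendecomposition of the symmetric matrix, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    loadings = eigvecs[:, order] * np.sqrt(eigvals)

    n_off_diag = p * (p - 1)
    # m = 0: average squared off-diagonal correlation of the input matrix.
    avg_sq = [(np.sum(corr ** 2) - p) / n_off_diag]

    for m in range(1, p - 1):
        a = loadings[:, :m]
        partial_cov = corr - a @ a.T  # covariance left after removing m components
        d = np.diag(partial_cov)
        if np.any(d <= 1e-12):  # residual variance exhausted; stop early
            break
        scale = 1.0 / np.sqrt(d)
        partial_corr = partial_cov * np.outer(scale, scale)
        avg_sq.append((np.sum(partial_corr ** 2) - p) / n_off_diag)

    avg_sq = np.array(avg_sq)
    return int(np.argmin(avg_sq)), avg_sq


if __name__ == "__main__":
    # Hypothetical toy data: 12 variables driven by two latent factors.
    rng = np.random.default_rng(0)
    factors = rng.standard_normal((500, 2))
    pattern = np.zeros((2, 12))
    pattern[0, :8] = 0.8
    pattern[1, 8:] = 0.8
    data = factors @ pattern + 0.6 * rng.standard_normal((500, 12))
    n_keep, curve = velicer_map(np.corrcoef(data, rowvar=False))
    print(n_keep, curve)
```

The returned curve can be inspected directly: a minimum at index 0 means that no component reduces the average squared partial correlation, so none would be retained under this rule.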
Subjects

Full text: 1 Database: MEDLINE Main subject: Genetic Variation / Computer Simulation / Haplotypes / Principal Component Analysis / Models, Genetic Study type: Prognostic_studies Limits: Humans Language: En Journal: Heredity (Edinb) Publication year: 2011 Document type: Article Country of affiliation: United States
