Iterative hard thresholding for model selection in genome-wide association studies.
Keys, Kevin L; Chen, Gary K; Lange, Kenneth.
Affiliation
  • Keys KL; Department of Medicine, University of California, San Francisco, San Francisco, California, United States of America.
  • Chen GK; Division of Biostatistics, University of Southern California, Los Angeles, California, United States of America.
  • Lange K; Departments of Biomathematics, Human Genetics, and Statistics, University of California, Los Angeles, California, United States of America.
Genet Epidemiol; 41(8): 756-768, 2017 Dec.
Article in En | MEDLINE | ID: mdl-28875524
ABSTRACT
A genome-wide association study (GWAS) correlates marker and trait variation in a study sample. Each subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here, we assume that subjects are randomly collected, unrelated individuals and that trait values are normally distributed or can be transformed to normality. Over the past decade, geneticists have been remarkably successful in applying GWAS analysis to hundreds of traits. The massive amount of data produced in these studies presents unique computational challenges. Penalized regression with the ℓ1 penalty (LASSO) or the minimax concave penalty (MCP) is capable of selecting a handful of associated SNPs from millions of potential SNPs. Unfortunately, model selection can be corrupted by false positives and false negatives, obscuring the genetic underpinning of a trait. Here, we compare LASSO and MCP penalized regression to iterative hard thresholding (IHT). On GWAS regression data, IHT is better at model selection than both methods of penalized regression and comparable to them in speed. This conclusion holds for both simulated and real GWAS data. IHT fosters parallelization and scales well in problems with large numbers of causal markers. Our parallel implementation of IHT accommodates SNP genotype compression and exploits multiple CPU cores and graphics processing units (GPUs). This allows statistical geneticists to leverage commodity desktop computers in GWAS analysis and to avoid supercomputing.
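
As an illustration of the general idea behind iterative hard thresholding, the following is a minimal sketch in plain Python. It is not the authors' implementation (their software is the Julia package named under AVAILABILITY). The function names `hard_threshold` and `iht`, the fixed step size, and the toy design matrix with orthogonal ±1 columns are all assumptions made for this example. Each iteration takes a gradient step on the least-squares loss and then keeps only the k coefficients of largest magnitude.

```python
def hard_threshold(v, k):
    """Keep the k entries of v with largest magnitude; set the rest to zero."""
    order = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)
    keep = set(order[:k])
    return [v[j] if j in keep else 0.0 for j in range(len(v))]


def iht(X, y, k, step, iters=50):
    """Hypothetical helper: IHT for least squares with at most k nonzeros.

    X is a list of rows, y a list of responses. The fixed `step` is an
    assumption; it should not exceed 1 / (largest eigenvalue of X^T X).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Residual r = y - X beta
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Negative gradient of 0.5 * ||y - X beta||^2, i.e. X^T r
        neg_grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by projection onto k-sparse vectors
        beta = hard_threshold([beta[j] + step * neg_grad[j] for j in range(p)], k)
    return beta


# Toy data (assumed, not from the paper): 8 "subjects", 4 "markers".
# Columns are mutually orthogonal +/-1 vectors, so X^T X = 8 * I.
X = [[(-1) ** bin(i & (j + 1)).count("1") for j in range(4)] for i in range(8)]
beta_true = [2.0, 0.0, -3.0, 0.0]
y = [sum(X[i][j] * beta_true[j] for j in range(4)) for i in range(8)]

beta_hat = iht(X, y, k=2, step=1.0 / 8.0)
support = [j for j, b in enumerate(beta_hat) if b != 0.0]
print(support, beta_hat)
```

On this noiseless toy problem the selected support matches the two nonzero entries of `beta_true`. Real GWAS data are noisy, correlated, and far larger, so step-size selection, genotype compression, and parallelism (as described in the abstract) would matter in practice.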

AVAILABILITY:

Source code is freely available at https://github.com/klkeys/IHT.jl.

Full text: 1 Database: MEDLINE Main subject: Genome-Wide Association Study / Models, Genetic Language: En Publication year: 2017 Document type: Article Country of affiliation: United States
