Proper conditional analysis in the presence of missing data: Application to large scale meta-analysis of tobacco use phenotypes.

Jiang, Yu; Chen, Sai; McGuire, Daniel; Chen, Fang; Liu, Mengzhen; Iacono, William G; Hewitt, John K; Hokanson, John E; Krauter, Kenneth; Laakso, Markku; Li, Kevin W; Lutz, Sharon M; McGue, Matthew; Pandit, Anita; Zajac, Gregory J M; Boehnke, Michael; Abecasis, Goncalo R; Vrieze, Scott I; Zhan, Xiaowei; Jiang, Bibo; Liu, Dajiang J

Affiliation

Jiang Y; Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania, United States of America.
Chen S; Center of Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
McGuire D; Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania, United States of America.
Chen F; Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania, United States of America.
Liu M; Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America.
Iacono WG; Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America.
Hewitt JK; Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America.
Hokanson JE; Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America.
Krauter K; Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, Colorado, United States of America.
Laakso M; Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland and Kuopio University Hospital, Kuopio, Finland.
Li KW; Center of Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
Lutz SM; Department of Biostatistics and Informatics, University of Colorado, Anschutz Medical Campus, Aurora, Colorado, United States of America.
McGue M; Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America.
Pandit A; Center of Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
Zajac GJM; Center of Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
Boehnke M; Center of Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
Abecasis GR; Center of Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
Vrieze SI; Department of Psychology, University of Minnesota, Minneapolis, Minnesota, United States of America.
Zhan X; Department of Clinical Science, Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America.
Jiang B; Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania, United States of America.
Liu DJ; Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania, United States of America.

PLoS Genet; 14(7): e1007452, 2018 Jul.

Article in En | MEDLINE | ID: mdl-30016313

ABSTRACT

Meta-analysis of genetic association studies increases sample size and the power for mapping complex traits. Existing methods are mostly developed for datasets without missing values, i.e. the summary association statistics are measured for all variants in contributing studies. In practice, genotype imputation is not always effective. This may be the case when targeted genotyping/sequencing assays are used or when the un-typed genetic variant is rare. Therefore, contributed summary statistics often contain missing values. Existing methods for imputing missing summary association statistics and using imputed values in meta-analysis, approximate conditional analysis, or simple strategies such as complete case analysis all have theoretical limitations. Applying these approaches can bias genetic effect estimates and lead to seriously inflated type-I or type-II errors in conditional analysis, which is a critical tool for identifying independently associated variants. To address this challenge and complement imputation methods, we developed a method to combine summary statistics across participating studies and consistently estimate joint effects, even when the contributed summary statistics contain large amounts of missing values. Based on this estimator, we proposed a score statistic called PCBS (partial correlation based score statistic) for conditional analysis of single-variant and gene-level associations. Through extensive analysis of simulated and real data, we showed that the new method produces well-calibrated type-I errors and is substantially more powerful than existing approaches. We applied the proposed approach to one of the largest meta-analyses to date for the cigarettes-per-day phenotype. Using the new method, we identified multiple novel independently associated variants at known loci for tobacco use, which were otherwise missed by alternative methods. 
Together, the phenotypic variance explained by these variants was 1.1%, improving that of previously reported associations by 71%. These findings illustrate the extent of locus allelic heterogeneity and can help pinpoint causal variants.
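The idea of conditional analysis from summary statistics with missing values can be illustrated with a minimal two-variant sketch. This is not the PCBS statistic described in the abstract, only a simplified illustration under stated assumptions: genotypes and phenotype are standardized, a per-study z-score divided by the square root of the sample size approximates the variant-phenotype correlation, and the correlation (LD) between the two variants is supplied from an external reference. The function names `pooled_correlation` and `conditional_effect` and all numbers are hypothetical.

```python
import math


def pooled_correlation(z_scores, sample_sizes):
    """Pool per-study z-scores into one variant-phenotype correlation.

    Studies where the variant is missing are marked with None and skipped,
    so each variant is pooled over its own set of contributing studies.
    Assumption: z / sqrt(n) approximates the correlation in a study, and
    studies are weighted by sample size n.
    Returns (pooled correlation, total sample size actually used).
    """
    numerator = 0.0
    total_n = 0
    for z, n in zip(z_scores, sample_sizes):
        if z is None:
            continue  # variant not measured in this study
        numerator += math.sqrt(n) * z  # equals n * (z / sqrt(n))
        total_n += n
    if total_n == 0:
        raise ValueError("variant is missing in every study")
    return numerator / total_n, total_n


def conditional_effect(r_target, r_cond, ld):
    """Standardized effect of the target variant adjusted for one other variant.

    Closed form of the two-variant joint regression:
    b_target = (r_target - ld * r_cond) / (1 - ld^2).
    """
    if abs(ld) >= 1.0:
        raise ValueError("variants are perfectly correlated; effect not identifiable")
    return (r_target - ld * r_cond) / (1.0 - ld * ld)


if __name__ == "__main__":
    # Toy inputs (made up): three studies; the target variant is missing in study 2.
    sizes = [10000, 10000, 2500]
    z_target = [4.0, None, 2.0]
    z_cond = [5.0, 5.0, 2.5]
    r_t, n_t = pooled_correlation(z_target, sizes)
    r_c, n_c = pooled_correlation(z_cond, sizes)
    print(n_t, n_c, conditional_effect(r_t, r_c, ld=0.5))
```

In this sketch the two variants end up pooled over different sample sizes (12,500 versus 22,500 in the toy input), which is the situation the abstract describes: a variant missing from some studies still contributes through the studies that measured it, rather than forcing a complete-case analysis.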

Subjects

Data Analysis; Tobacco Products/statistics & numerical data; Tobacco Use/genetics; Alleles; Data Interpretation, Statistical; Datasets as Topic; Genetic Loci/genetics; Genome-Wide Association Study; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide

Full text: 1 | Database: MEDLINE | Main subject: Tobacco Products / Tobacco Use / Data Analysis | Study type: Prognostic_studies / Systematic_reviews | Limit: Humans | Language: En | Publication year: 2018 | Document type: Article
