The Weighting is the Hardest Part: On the Behavior of the Likelihood Ratio Test and the Score Test Under a Data-Driven Weighting Scheme in Sequenced Samples.

Minica, Camelia C; Genovese, Giulio; Hultman, Christina M; Pool, René; Vink, Jacqueline M; Neale, Michael C; Dolan, Conor V; Neale, Benjamin M

Affiliations

Minica CC; Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands.
Genovese G; The Stanley Center for Psychiatric Research, Broad Institute of the Massachusetts Institute of Technology and Harvard, Cambridge, MA.
Hultman CM; The Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm.
Pool R; Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands.
Vink JM; Behavioural Science Institute, Radboud University, Nijmegen, The Netherlands.
Neale MC; Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands.
Dolan CV; Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands.
Neale BM; The Stanley Center for Psychiatric Research, Broad Institute of the Massachusetts Institute of Technology and Harvard, Cambridge, MA.

Twin Res Hum Genet; 20(2): 108-118, 2017 Apr.

Article in English | MEDLINE | ID: mdl-28238293

ABSTRACT

Sequence-based association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Because the true weights are generally unknown, and so are subject to misspecification, we examined the efficiency of a data-driven weighting scheme. We propose using a set of theoretically defensible weighting schemes and assume that the scheme yielding the largest test statistic is the one most likely to capture the relationship between allele frequency and functional effect. We show that the use of alternative weights obviates the need to impose arbitrary frequency thresholds. As both the score test and the likelihood ratio test (LRT) may be used in this context, and may differ in power, we characterize the behavior of both tests. The two tests have equal power if the set includes weights resembling the correct ones. However, if the weights are badly specified, the LRT shows superior power (due to its robustness to misspecification). With this data-driven weighting procedure, the LRT detected significant signal in two genes located in regions already confirmed as associated with schizophrenia, PRRC2A (p = 1.020e-06) and VARS2 (p = 2.383e-06), in the Swedish schizophrenia case-control cohort of 11,040 individuals with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful test. However, the LRT has the advantageous properties of being generally more robust and more powerful under weight misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
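
The data-driven weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate Beta-density schemes, the function names (`beta_pdf`, `scheme_weights`, `skat_q`, `best_weighting`), the unit-norm rescaling of the weights (used here so that statistics from different schemes are on a comparable scale), and the intercept-only null model are all assumptions made for illustration. The sketch computes an unstandardized SKAT-style score statistic under each candidate scheme and reports the scheme with the largest value; it does not compute p-values or the LRT.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density at x, valid for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

# Hypothetical candidate schemes (Beta shape parameters); not taken from the paper.
CANDIDATE_SCHEMES = {
    "beta(1,25)": (1.0, 25.0),     # strongly up-weights rare variants
    "beta(0.5,0.5)": (0.5, 0.5),   # milder up-weighting of rare variants
    "beta(1,1)": (1.0, 1.0),       # flat: every variant weighted equally
}

def scheme_weights(freqs, a, b):
    """Per-variant weights from allele frequencies, rescaled to unit norm.

    Monomorphic variants (frequency 0 or 1) get weight 0. The rescaling is
    an assumption of this sketch, made so schemes can be compared directly.
    """
    raw = [beta_pdf(f, a, b) if 0.0 < f < 1.0 else 0.0 for f in freqs]
    norm = math.sqrt(sum(w * w for w in raw))
    return [w / norm for w in raw] if norm > 0 else raw

def skat_q(genotypes, residuals, weights):
    """SKAT-style statistic: Q = sum_j (w_j * sum_i g_ij * r_i)^2."""
    q = 0.0
    for j, w in enumerate(weights):
        score_j = sum(row[j] * r for row, r in zip(genotypes, residuals))
        q += (w * score_j) ** 2
    return q

def best_weighting(genotypes, phenotype):
    """Return the scheme with the largest Q, plus Q for every scheme.

    genotypes: list of rows (one per individual) of allele counts 0/1/2.
    phenotype: list of trait values (e.g. 0/1 case-control status).
    The null model is intercept-only, so residuals are y minus its mean.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]
    n_variants = len(genotypes[0])
    # Frequency of the coded allele, assumed here to be the minor allele.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(n_variants)]
    stats = {
        name: skat_q(genotypes, residuals, scheme_weights(freqs, a, b))
        for name, (a, b) in CANDIDATE_SCHEMES.items()
    }
    best = max(stats, key=stats.get)
    return best, stats
```

Calling `best_weighting(genotypes, phenotype)` on a gene's genotype matrix returns the name of the selected scheme and the statistic under each scheme. Because the maximum is taken over several schemes, its null distribution differs from that of a single pre-specified test, so any significance assessment would need to account for the selection step.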

Subjects

Data Interpretation, Statistical; Genetic Association Studies/methods; Models, Genetic; Case-Control Studies; Computer Simulation; Empirical Research; Female; Gene Frequency; Genetic Variation; Genome-Wide Association Study; HLA Antigens/genetics; Humans; Linkage Disequilibrium; Male; Proteins/genetics; Schizophrenia/genetics; Software; Sweden; Valine-tRNA Ligase/genetics; White People/genetics

Keywords

MAF thresholding; SKAT; power; robustness; schizophrenia; variable weighting

Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Data Interpretation, Statistical / Genetic Association Studies / Models, Genetic
Type of study: Observational_studies / Prognostic_studies / Risk_factors_studies
Limits: Female / Humans / Male
Country/Region as subject: Europe
Language: English
Journal: Twin Res Hum Genet
Journal subject: MEDICAL GENETICS
Year of publication: 2017
Document type: Article
Country of affiliation: Netherlands
