Fast heritability estimation based on MINQUE and batch training.

Tang, Mingsheng; Hou, Tingting; Tong, Xiaoran; Shen, Xiaoxi; Zhang, Xuefen; Wang, Tong; Lu, Qing

Affiliation

Tang M; Division of Health Statistics, School of Public Health, Shanxi Medical University, No.56 Xin jian South Road, 030001 Shanxi, China.
Hou T; Department of Biostatistics, University of Florida, 2004 Mowry Road, 32611 FL, USA.
Tong X; Department of Biostatistics, University of Florida, 2004 Mowry Road, 32611 FL, USA.
Shen X; Department of Biostatistics, University of Florida, 2004 Mowry Road, 32611 FL, USA.
Zhang X; Social Medicine, School of Public Health, Shanxi Medical University, No.56 Xin jian South Road, 030001 Shanxi, China.
Wang T; Division of Health Statistics, School of Public Health, Shanxi Medical University, No.56 Xin jian South Road, 030001 Shanxi, China.
Lu Q; Department of Biostatistics, University of Florida, 2004 Mowry Road, 32611 FL, USA.

Brief Bioinform; 23(3), 2022 May 13.

Article in English | MEDLINE | ID: mdl-35383355

ABSTRACT

Heritability, the proportion of phenotypic variance explained by genome-wide single nucleotide polymorphisms (SNPs) in unrelated individuals, is an important measure of the genetic contribution to human diseases and plays a critical role in studying the genetic architecture of human diseases. The linear mixed model (LMM) has been widely used for SNP heritability estimation, where variance component parameters are commonly estimated by restricted maximum likelihood (REML). REML is an iterative optimization algorithm, which is computationally intensive when applied to large-scale datasets (e.g. UK Biobank). To facilitate the heritability analysis of large-scale genetic datasets, we develop a fast approach, minimum norm quadratic unbiased estimator (MINQUE) with batch training, to estimate variance components in the LMM (LMM.MNQ.BCH). In LMM.MNQ.BCH, the parameters are estimated by MINQUE, which has a closed-form solution for fast computation and has no convergence issues. Batch training has also been adopted in LMM.MNQ.BCH to accelerate the computation for large-scale genetic datasets. Through simulations and real data analysis, we demonstrate that LMM.MNQ.BCH is much faster than two existing approaches, GCTA and BOLT-REML.
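
The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea, not the authors' LMM.MNQ.BCH implementation. It assumes the textbook MINQUE equations for a two-component LMM (a genetic kernel plus a residual identity kernel) with unit prior weights, and it assumes that "batch training" can be illustrated by splitting samples into random disjoint batches and averaging the per-batch estimates. The function names `minque` and `minque_batched`, the toy genotype simulation and all parameter values are hypothetical.

```python
import numpy as np


def minque(y, kernels, X=None, weights=None):
    """Textbook MINQUE for y ~ N(X b, sum_i sigma2_i * K_i); returns the sigma2_i."""
    n = y.shape[0]
    X = np.ones((n, 1)) if X is None else X            # intercept-only fixed effects
    weights = np.ones(len(kernels)) if weights is None else np.asarray(weights)
    V = sum(w * K for w, K in zip(weights, kernels))   # prior-weighted covariance
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    # Projection that removes the fixed effects: P = V^-1 - V^-1 X (X' V^-1 X)^-1 X' V^-1
    P = Vinv - VinvX @ np.linalg.solve(X.T @ VinvX, VinvX.T)
    PK = [P @ K for K in kernels]
    Py = P @ y
    # S[i, j] = tr(P K_i P K_j); q[i] = y' P K_i P y
    S = np.array([[np.sum(A * B.T) for B in PK] for A in PK])
    q = np.array([Py @ K @ Py for K in kernels])
    return np.linalg.solve(S, q)                       # closed form, no iteration


def minque_batched(y, G, batch_size, rng):
    """Assumed batch scheme: average MINQUE estimates over random disjoint batches."""
    n, m = G.shape
    perm = rng.permutation(n)
    estimates = []
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        K = G[idx] @ G[idx].T / m                      # linear (GRM-like) kernel
        estimates.append(minque(y[idx], [K, np.eye(len(idx))]))
    return np.mean(estimates, axis=0)


# Toy data: standard-normal "genotypes" stand in for standardized SNPs (assumption).
rng = np.random.default_rng(0)
n, m, h2_true = 900, 300, 0.5
G = rng.standard_normal((n, m))
beta = rng.normal(0.0, np.sqrt(h2_true / m), size=m)
y = G @ beta + rng.normal(0.0, np.sqrt(1.0 - h2_true), size=n)

est = minque_batched(y, G, batch_size=300, rng=rng)
sigma2_g, sigma2_e = est
h2 = sigma2_g / (sigma2_g + sigma2_e)
print(f"h2 estimate: {h2:.3f}")
```

Each batch needs only one inversion of a batch-sized matrix and one small linear solve, which is why a closed-form estimator of this kind avoids the repeated iterations that REML requires. In this sketch the per-batch cost grows with the cube of the batch size rather than of the full sample size.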

Subject(s)

Genome-Wide Association Study; Models, Genetic; Genome; Genome-Wide Association Study/methods; Humans; Linear Models; Polymorphism, Single Nucleotide

Keywords

MINQUE; SNP heritability; batch training; human genome; kernel

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Genome-Wide Association Study / Models, Genetic | Limits: Humans | Language: English | Journal: Brief Bioinform | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Year: 2022 | Document type: Article | Affiliation country: China
