Your browser doesn't support javascript.
loading
A systematic comparison of normalization methods for eQTL analysis.
Yang, Jiajun; Wang, Dongyang; Yang, Yanbo; Yang, Wenqian; Jin, Weiwei; Niu, Xiaohui; Gong, Jing.
Afiliación
  • Yang J; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
  • Wang D; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
  • Yang Y; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
  • Yang W; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
  • Jin W; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
  • Niu X; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
  • Gong J; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China.
Brief Bioinform ; 22(6)2021 11 05.
Article en En | MEDLINE | ID: mdl-34015824
Expression quantitative trait loci (eQTL) analysis has been widely used in interpreting disease-associated loci through correlating genetic variant loci with the expression of specific genes. RNA-sequencing (RNA-Seq), which can quantify gene expression at the genome-wide level, is often used in eQTL identification. Since different normalization methods of gene expression have substantial impacts on RNA-seq downstream analysis, it is of great necessity to systematically compare the effects of these methods on eQTL identification. Here, by using RNA-seq and genotype data of four different cancers in The Cancer Genome Atlas (TCGA) database, we comprehensively evaluated the effect of eight commonly used normalization methods on eQTL identification. Our results showed that the application of different methods could cause 20-30% differences in the final results of eQTL identification. Among these methods, COUNT, Median of Ratio (MED) and Trimmed Mean of M-values (TMM) generated similar results for identifying eQTLs, while Fragments Per Kilobase Million (FPKM) or RANK produced more differential results compared with other methods. Based on the accuracy and receiver operating characteristic (ROC) curve, the TMM method was found to be the optimal method for normalizing gene expression data in eQTLs analysis. In addition, we also evaluated the performance of different pairwise combinations of these methods. As a result, compared with single normalization methods, the combination of methods can not only identify more cis-eQTLs, but also improve the performance of the ROC curve. Overall, this study provides a comprehensive comparison of normalization methods for identifying eQTLs from RNA-seq data, and proposes some practical recommendations for diverse scenarios.
Asunto(s)
Palabras clave

Texto completo: 1 Banco de datos: MEDLINE Asunto principal: Biología Computacional / Predisposición Genética a la Enfermedad / Sitios de Carácter Cuantitativo / Estudios de Asociación Genética Tipo de estudio: Guideline / Prognostic_studies Límite: Humans Idioma: En Revista: Brief Bioinform Asunto de la revista: BIOLOGIA / INFORMATICA MEDICA Año: 2021 Tipo del documento: Article

Texto completo: 1 Banco de datos: MEDLINE Asunto principal: Biología Computacional / Predisposición Genética a la Enfermedad / Sitios de Carácter Cuantitativo / Estudios de Asociación Genética Tipo de estudio: Guideline / Prognostic_studies Límite: Humans Idioma: En Revista: Brief Bioinform Asunto de la revista: BIOLOGIA / INFORMATICA MEDICA Año: 2021 Tipo del documento: Article