A transformer-based genomic prediction method fused with knowledge-guided module.
Wu, Cuiling; Zhang, Yiyi; Ying, Zhiwen; Li, Ling; Wang, Jun; Yu, Hui; Zhang, Mengchen; Feng, Xianzhong; Wei, Xinghua; Xu, Xiaogang.
Affiliation
  • Wu C; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
  • Zhang Y; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
  • Ying Z; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
  • Li L; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
  • Wang J; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
  • Yu H; Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130012, China.
  • Zhang M; State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou 310006, China.
  • Feng X; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
  • Wei X; Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130012, China.
  • Xu X; Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311121, China.
Brief Bioinform; 25(1), 2023 Nov 22.
Article in En | MEDLINE | ID: mdl-38058185
ABSTRACT
Genomic prediction (GP) uses single nucleotide polymorphisms (SNPs) to establish associations between markers and phenotypes. Early selection of individuals by genomic estimated breeding value shortens the generation interval and speeds up the breeding process. Recently, deep learning (DL) methods have attracted considerable attention in GP. In this study, we explore the application of Transformer-based structures to GP and develop a novel DL model named GPformer. GPformer obtains a global view by gleaning beneficial information from all relevant SNPs regardless of the physical distance between them. Comprehensive experimental results on five different crop datasets show that GPformer outperforms ridge regression-based linear unbiased prediction (RR-BLUP), support vector regression (SVR), light gradient boosting machine (LightGBM) and deep neural network genomic prediction (DNNGP) in terms of mean absolute error, Pearson's correlation coefficient and the proposed consistent index metric. Furthermore, we introduce a knowledge-guided module (KGM) to extract genome-wide association studies-based information, which is fused into GPformer as prior knowledge. KGM is flexible and can be plugged into any DL network. Ablation studies on three datasets illustrate the efficiency of KGM. Moreover, GPformer is robust to hyperparameter choices and generalizes to each phenotype of every dataset, making it suitable for practical application scenarios.
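As an illustration of two of the evaluation metrics named in the abstract, the following minimal sketch computes mean absolute error and Pearson's correlation coefficient between observed and predicted phenotypes. It is not the authors' code: the function names and the toy values are hypothetical, and the proposed consistent index is omitted because the abstract does not define it.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical toy phenotypes, for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]

print("MAE:", mean_absolute_error(observed, predicted))
print("Pearson r:", pearson_correlation(observed, predicted))
```

A lower mean absolute error and a higher correlation both indicate better agreement between predicted and observed phenotypes, which is the sense in which the abstract compares GPformer with the baseline methods.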

Full text: 1 Database: MEDLINE Main subject: Genome-Wide Association Study / Models, Genetic Limits: Humans Language: En Journal: Brief Bioinform Journal subject: BIOLOGY / MEDICAL INFORMATICS Year of publication: 2023 Document type: Article Country of affiliation: China
