Would large dataset sample size unveil the potential of deep neural networks for improved genome-enabled prediction of complex traits? The case for body weight in broilers.
Passafaro, Tiago L; Lopes, Fernando B; Dórea, João R R; Craven, Mark; Breen, Vivian; Hawken, Rachel J; Rosa, Guilherme J M.
Affiliation
  • Passafaro TL; Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI, 53706, USA.
  • Lopes FB; Cobb-Vantress Inc., Siloam Springs, AR, 72761, USA.
  • Dórea JRR; Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI, 53706, USA.
  • Craven M; Department of Biostatistics & Medical Informatics, University of Wisconsin, Madison, WI, 53706, USA.
  • Breen V; Department of Computer Sciences, University of Wisconsin, Madison, WI, 53706, USA.
  • Hawken RJ; Cobb-Vantress Inc., Siloam Springs, AR, 72761, USA.
  • Rosa GJM; Cobb-Vantress Inc., Siloam Springs, AR, 72761, USA.
BMC Genomics ; 21(1): 771, 2020 Nov 09.
Article in English | MEDLINE | ID: mdl-33167865
ABSTRACT

BACKGROUND:

Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet few studies in genome-enabled prediction have assessed the performance of DNN compared to traditional regression models. Strikingly, no clear superiority of DNN has been reported so far, and results seem highly dependent on the species and traits of application. Nevertheless, the relatively small datasets used in previous studies, most with fewer than 5000 observations, may have prevented DNN from reaching their full potential. Therefore, the objective of this study was to investigate the impact of dataset sample size on the performance of DNN compared to Bayesian regression models for genome-enabled prediction of body weight in broilers by sub-sampling the 63,526 observations of the training set.
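
As an illustration of the sub-sampling design described above, the following is a minimal sketch and not the authors' procedure. It draws random subsets of the training indices without replacement at a few fractions. The function name `subsample_indices`, the listed fractions, and the seed are assumptions made for this example; only the training set size of 63,526 comes from the abstract.

```python
import numpy as np


def subsample_indices(n_train, fractions, seed=0):
    """Return {fraction: sorted index array} of random subsets of range(n_train).

    Hypothetical helper for illustration: each subset is drawn independently
    and without replacement from the full set of training indices.
    """
    rng = np.random.default_rng(seed)
    subsets = {}
    for frac in fractions:
        size = max(1, int(round(frac * n_train)))
        subsets[frac] = np.sort(rng.choice(n_train, size=size, replace=False))
    return subsets


# 63,526 is the training set size stated in the abstract;
# the fractions below are illustrative only.
subsets = subsample_indices(63_526, [0.01, 0.03, 0.10, 0.50, 1.00])
for frac, idx in subsets.items():
    print(f"{frac:.0%} of training set: {idx.size} observations")
```

Each model would then be trained on the rows selected by one index array and evaluated on the same held-out set, so that differences in performance reflect the training sample size.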

RESULTS:

Predictive performance of DNN improved as sample size increased, reaching a plateau at a prediction correlation of about 0.32 when 60% of the entire training set was used (i.e., 39,510 observations). Interestingly, compared to Bayesian Ridge Regression (BRR) and Bayes Cπ, DNN showed superior prediction correlation when up to 3% of the training set was used, but poorer prediction correlation beyond that point. Regardless of the amount of data used to train the predictive machines, DNN displayed the lowest mean square error of prediction of all approaches. The predictive bias was lower for DNN than for the Bayesian models across all dataset sizes, with estimates close to one at larger sample sizes.
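
The three criteria reported above can rank models differently. The sketch below shows one way they might be computed, and uses toy numbers that are not data from the study. It assumes that predictive bias is measured as the slope of the regression of observed on predicted values, which is consistent with the abstract's remark that estimates were close to one but is not confirmed by it. The function name `prediction_metrics` is hypothetical.

```python
import numpy as np


def prediction_metrics(y_obs, y_pred):
    """Compute prediction correlation, mean square error, and regression slope.

    The slope of observed on predicted values is used here as the bias
    measure (an assumption); a slope of one indicates no scale bias.
    """
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    corr = np.corrcoef(y_obs, y_pred)[0, 1]
    mse = np.mean((y_obs - y_pred) ** 2)
    slope = np.cov(y_obs, y_pred, ddof=1)[0, 1] / np.var(y_pred, ddof=1)
    return {"correlation": corr, "mse": mse, "slope": slope}


# Toy observed values and two hypothetical sets of predictions.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pred_a = 3.0 * y                                 # perfectly ranked, wrong scale
pred_b = np.array([1.5, 1.5, 3.0, 4.5, 4.5])     # coarser ranking, right scale

metrics_a = prediction_metrics(y, pred_a)
metrics_b = prediction_metrics(y, pred_b)
print(metrics_a)
print(metrics_b)
```

In this toy case the first set of predictions has the higher correlation, while the second has the lower mean square error and a slope of one. Correlation is insensitive to the scale of the predictions, whereas mean square error and slope are not, so a model can lose on one criterion and win on the others.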

CONCLUSIONS:

DNN had worse prediction correlation than BRR and Bayes Cπ, but better mean square error of prediction and bias than both Bayesian models for genome-enabled prediction of body weight in broilers. These findings highlight that the relative advantages and disadvantages of the predictive approaches depend on the criterion used for comparison. Furthermore, including more data per se does not guarantee that DNN will outperform the Bayesian regression methods commonly used for genome-enabled prediction. Further analysis is necessary to identify scenarios where DNN can clearly outperform Bayesian benchmark models.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Chickens / Multifactorial Inheritance Study type: Prognostic_studies / Risk_factors_studies Limits: Animals Language: En Journal: BMC Genomics Journal subject: GENETICS Year: 2020 Document type: Article Affiliation country: United States
