Search | BVS CLAP/SMR-OPS/OMS

1.

Enhancing winter wheat prediction with genomics, phenomics and environmental data.

Montesinos-López, Osval A; Herr, Andrew W; Crossa, José; Montesinos-López, Abelardo; Carter, Arron H.

BMC Genomics ; 25(1): 544, 2024 May 31.

Article in English | MEDLINE | ID: mdl-38822262

ABSTRACT

In the realm of multi-environment prediction, when the goal is to predict a complete environment using the others as a training set, the efficiency of genomic selection (GS) falls short of expectations. Genotype by environment interaction poses a challenge in achieving high prediction accuracies. Consequently, current efforts are focused on enhancing efficiency by integrating various types of inputs, such as phenomics data, environmental information, and other omics data. In this study, we sought to evaluate the impact of incorporating environmental information into the modeling process, in addition to genomic and phenomics information. Our evaluation encompassed five data sets of soft white winter wheat, and the results revealed a significant improvement in prediction accuracy, as measured by the normalized root mean square error (NRMSE), through the integration of environmental information. Notably, there was an average gain in prediction accuracy of 49.19% in terms of NRMSE across the data sets. Moreover, the observed prediction accuracy ranged from 5.68% (data set 3) to 60.36% (data set 4), underscoring the substantial effect of integrating environmental information. By including genomic, phenomic, and environmental data in prediction models, plant breeding programs can improve selection efficiency across locations.

Subject(s)

Genomics; Phenomics; Triticum; Triticum/genetics; Genomics/methods; Gene-Environment Interaction; Phenotype; Genotype; Plant Breeding; Environment; Genome, Plant
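
As a rough illustration of the metric reported in entry 1, the sketch below computes an NRMSE and a percentage gain between two sets of predictions. The abstract does not state how the error is normalized or how the gain is computed, so dividing the RMSE by the mean of the observed values and expressing the gain as (baseline / new - 1) × 100 are assumptions, and the toy numbers are invented for the example.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


def relative_gain_pct(nrmse_baseline, nrmse_new):
    """Percentage gain of the new model over the baseline (lower NRMSE is better).

    Assumed convention: (baseline / new - 1) * 100.
    """
    return (nrmse_baseline / nrmse_new - 1.0) * 100.0


# Hypothetical toy data: one held-out environment, two competing models.
observed = [4.0, 5.0, 6.0, 5.0]
pred_genomic_only = [5.0, 4.0, 5.0, 6.0]       # baseline predictions
pred_with_environment = [4.5, 5.5, 5.5, 4.5]   # predictions with extra inputs

baseline = nrmse(observed, pred_genomic_only)
improved = nrmse(observed, pred_with_environment)
print(baseline, improved, relative_gain_pct(baseline, improved))
```

Other normalizations (by the range or the standard deviation of the observations) are also in use, so values are only comparable when the same convention is applied throughout.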

2.

Bayesian discrete lognormal regression model for genomic prediction.

Montesinos-López, Abelardo; Gutiérrez-Pulido, Humberto; Ramos-Pulido, Sofía; Montesinos-López, José Cricelio; Montesinos-López, Osval A; Crossa, José.

Theor Appl Genet ; 137(1): 21, 2024 Jan 14.

Article in English | MEDLINE | ID: mdl-38221602

ABSTRACT

KEY MESSAGE: Genomic prediction models for quantitative traits assume continuous and normally distributed phenotypes. In this research, we proposed a novel Bayesian discrete lognormal regression model. Genomic selection is a powerful tool in modern breeding programs that uses genomic information to predict the performance of individuals and select those with desirable traits. It has revolutionized animal and plant breeding, as it allows breeders to identify the best candidates without labor-intensive and time-consuming phenotypic evaluations. While several statistical models have been developed, most of them have been for quantitative continuous traits and only a few for count responses. In this paper, we propose a discrete lognormal regression model in the Bayesian context, with a Gibbs sampler to explore the corresponding posterior distribution and make the predictions. Two wheat disease resistance datasets are used to evaluate the model against the traditional Gaussian model and a lognormal model. The results indicate that the proposed model is a competitive and natural model for predicting count genomic traits.

Subject(s)

Models, Genetic; Plant Breeding; Humans; Animals; Bayes Theorem; Genome; Genomics/methods; Phenotype

3.

Two simple methods to improve the accuracy of the genomic selection methodology.

Montesinos-López, Osval A; Montesinos-López, Abelardo.

BMC Genomics ; 24(1): 220, 2023 Apr 26.

Article in English | MEDLINE | ID: mdl-37101112

ABSTRACT

BACKGROUND: Genomic selection (GS) is revolutionizing plant and animal breeding. However, its practical implementation remains challenging, since it is affected by many factors that, when not under control, make the methodology ineffective. Also, because it is generally formulated as a regression problem, it has low sensitivity for selecting the best candidate individuals, since a top percentage is selected according to a ranking of predicted breeding values. RESULTS: For this reason, in this paper we propose two methods to improve the prediction accuracy of this methodology. One method consists of reformulating the GS methodology (nowadays formulated as a regression problem) as a binary classification problem. The other consists only of a postprocessing step that adjusts the threshold used to classify the lines predicted on their original (continuous) scale, to guarantee similar sensitivity and specificity. The postprocessing method is applied to the predictions obtained with the conventional regression model. Both methods assume that a threshold is defined in advance to divide the training data into top lines and non-top lines; this threshold can be set as a quantile (for example 80%, 90%, etc.) or as the average (or maximum) of the performance of the checks. The reformulation method requires labeling as one those lines in the training set that are equal to or larger than the specified threshold, and as zero otherwise. Then we train a binary classification model with the conventional inputs, but using the binary response variable in place of the continuous response variable. The binary classifier should be trained so that sensitivity and specificity are similar, to guarantee a reasonable probability of classifying the top lines correctly. CONCLUSIONS: We evaluated the proposed models in seven data sets and found that the two proposed methods outperformed the conventional regression model by a large margin (by 402.9% in terms of sensitivity, by 110.04% in terms of F1 score and by 70.96% in terms of Kappa coefficient, with the postprocessing method). However, between the two proposed methods, the postprocessing method was better than the reformulation as a binary classification model. The simple postprocessing method avoids the need to reformulate conventional genomic regression models as binary classification models, while offering similar or better performance and significantly improving the selection of the top candidate lines. In general, both proposed methods are simple and can easily be adopted in practical breeding programs, with the guarantee that they will significantly improve the selection of the top candidate lines.

Subject(s)

Plant Breeding; Selection, Genetic; Animals; Genome; Genomics/methods; Phenotype; Models, Genetic
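
The postprocessing idea in entry 3 can be sketched as follows: label lines as top or non-top with a predefined threshold, then search for the cutoff on the predicted values that makes sensitivity and specificity as similar as possible. This is only a minimal illustration under stated assumptions, not the authors' implementation. The function names, the exhaustive search over observed predicted values and the toy data are all hypothetical, and the abstract does not say on which data the cutoff is tuned.

```python
def label_top(values, threshold):
    """Binary labels: True for lines at or above the threshold."""
    return [v >= threshold for v in values]


def sens_spec(is_top, called_top):
    """Sensitivity and specificity of the calls against the true labels."""
    pairs = list(zip(is_top, called_top))
    tp = sum(1 for t, c in pairs if t and c)
    fn = sum(1 for t, c in pairs if t and not c)
    tn = sum(1 for t, c in pairs if not t and not c)
    fp = sum(1 for t, c in pairs if not t and c)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def tune_cutoff(observed, predicted, threshold):
    """Pick the cutoff on predictions that minimizes |sensitivity - specificity|."""
    is_top = label_top(observed, threshold)
    best_cutoff, best_gap = None, float("inf")
    for cutoff in sorted(set(predicted)):
        sens, spec = sens_spec(is_top, label_top(predicted, cutoff))
        gap = abs(sens - spec)
        if gap < best_gap:
            best_cutoff, best_gap = cutoff, gap
    return best_cutoff


# Hypothetical toy data: regression predictions are shrunk toward the mean,
# so applying the original threshold to them selects no line at all.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
predicted = [4.0, 4.2, 4.6, 5.0, 5.2, 5.6, 5.9, 6.1, 6.4, 6.8]
threshold = 9.0  # e.g. the 80% quantile of the training phenotypes

is_top = label_top(observed, threshold)
print(sens_spec(is_top, label_top(predicted, threshold)))  # unadjusted cutoff
cutoff = tune_cutoff(observed, predicted, threshold)
print(cutoff, sens_spec(is_top, label_top(predicted, cutoff)))
```

The toy data show why the adjustment can matter: a ranking may be perfectly good while the predicted scale is compressed, so a cutoff tuned on the predicted scale recovers the top lines that the original threshold misses.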

4.

Multivariate Genomic Hybrid Prediction with Kernels and Parental Information.

Montesinos-López, Osval A; Crossa, José; Saint Pierre, Carolina; Gerard, Guillermo; Valenzo-Jiménez, Marco Alberto; Vitale, Paolo; Valladares-Cellis, Patricia Edwigis; Buenrostro-Mariscal, Raymundo; Montesinos-López, Abelardo; Crespo-Herrera, Leonardo.

Int J Mol Sci ; 24(18)2023 Sep 07.

Article in English | MEDLINE | ID: mdl-37762107

ABSTRACT

Genomic selection (GS) plays a pivotal role in hybrid prediction. It can enhance the selection of parental lines, accurately predict hybrid performance, and harness hybrid vigor. Likewise, it can optimize breeding strategies by reducing field trial requirements, expediting hybrid development, facilitating targeted trait improvement, and enhancing adaptability to diverse environments. Leveraging genomic information empowers breeders to make informed decisions and significantly improve the efficiency and success rate of hybrid breeding programs. In order to improve genomic prediction performance, we explored the incorporation of parental phenotypic information as covariates under a multi-trait framework. Approach 1, referred to as Pmean, directly utilized parental phenotypic information without any preprocessing, while approach 2, denoted as BV, replaced the direct use of phenotypic values of both parents with their respective breeding values. While an improvement in prediction performance was observed in both approaches, with a minimum 4.24% reduction in the normalized root mean square error (NRMSE), the direct incorporation of parental phenotypic information in the Pmean approach slightly outperformed the BV approach. We also compared these two approaches using linear and nonlinear kernels, but no relevant gain was observed. Finally, our results add to the empirical evidence that integrating parental phenotypic information helps increase the prediction performance of hybrids.

Subject(s)

Hybridization, Genetic; Models, Genetic; Genome, Plant; Phenotype; Genomics/methods; Plant Breeding

5.

A review of deep learning applications for genomic selection.

Montesinos-López, Osval Antonio; Montesinos-López, Abelardo; Pérez-Rodríguez, Paulino; Barrón-López, José Alberto; Martini, Johannes W R; Fajardo-Flores, Silvia Berenice; Gaytan-Lugo, Laura S; Santana-Mancilla, Pedro C; Crossa, José.

BMC Genomics ; 22(1): 19, 2021 Jan 06.

Article in English | MEDLINE | ID: mdl-33407114

ABSTRACT

BACKGROUND: Several conventional genomic Bayesian (or non-Bayesian) prediction methods have been proposed, including the standard additive genetic effect model for which the variance components are estimated with mixed model equations. In recent years, deep learning (DL) methods have been considered in the context of genomic prediction. The DL methods are nonparametric models that provide the flexibility to adapt to complicated associations between data and output, including very complex patterns. MAIN BODY: We review the applications of deep learning (DL) methods in genomic selection (GS) to obtain a meta-picture of GS performance and highlight how these tools can help solve challenging plant breeding problems. We also provide general guidance for the effective use of DL methods including the fundamentals of DL and the requirements for its appropriate use. We discuss the pros and cons of this technique compared to traditional genomic prediction approaches as well as the current trends in DL applications. CONCLUSIONS: The main requirement for using DL is high-quality and sufficiently large training data. Based on the current literature on GS in plant and animal breeding, we did not find clear superiority of DL in terms of prediction power compared to conventional genome-based prediction models. Nevertheless, there is clear evidence that DL algorithms capture nonlinear patterns more efficiently than conventional genome-based models. Deep learning algorithms are able to integrate data from different sources, as is usually needed in GS-assisted breeding, and they show the ability to improve prediction accuracy for large plant breeding data. It is important to apply DL to large training-testing data sets.

Subject(s)

Deep Learning; Models, Genetic; Animals; Bayes Theorem; Genome; Genomics; Phenotype; Selection, Genetic

6.

A guide for kernel generalized regression methods for genomic-enabled prediction.

Montesinos-López, Abelardo; Montesinos-López, Osval Antonio; Montesinos-López, José Cricelio; Flores-Cortes, Carlos Alberto; de la Rosa, Roberto; Crossa, José.

Heredity (Edinb) ; 126(4): 577-596, 2021 04.

Article in English | MEDLINE | ID: mdl-33649571

ABSTRACT

The primary objective of this paper is to provide a guide on implementing Bayesian generalized kernel regression methods for genomic prediction in the statistical software R. Such methods are quite efficient for capturing complex non-linear patterns that conventional linear regression models cannot. Furthermore, these methods are also powerful for leveraging environmental covariates, such as genotype × environment (G×E) prediction, among others. In this study we provide the building process of seven kernel methods: linear, polynomial, sigmoid, Gaussian, Exponential, Arc-cosine 1 and Arc-cosine L. Additionally, we highlight illustrative examples for implementing exact kernel methods for genomic prediction under a single-environment, a multi-environment and multi-trait framework, as well as for the implementation of sparse kernel methods under a multi-environment framework. These examples are followed by a discussion on the strengths and limitations of kernel methods and, subsequently by conclusions about the main contributions of this paper.

Subject(s)

Interaction, Gene-Environment; Models, Genetic; Bayes Theorem; Genomics; Triticum
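
To make the kernel families named in entry 6 more concrete, the sketch below builds linear, polynomial and Gaussian kernel matrices from a small marker matrix. The paper itself works in R; this Python version uses generic textbook forms of the kernels, and the scaling by the number of markers, the default parameters (`degree`, `gamma`, `coef0`) and the marker coding are assumptions made for illustration rather than the definitions used by the authors.

```python
import math


def dot(u, v):
    """Inner product of two marker vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, z):
    """Inner product scaled by the number of markers (assumed scaling)."""
    return dot(x, z) / len(x)


def polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=1.0):
    """(gamma * <x, z> / p + coef0) ** degree, with p the number of markers."""
    return (gamma * dot(x, z) / len(x) + coef0) ** degree


def gaussian_kernel(x, z, gamma=1.0):
    """exp(-gamma * ||x - z||^2 / p), with p the number of markers."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist / len(x))


def kernel_matrix(X, kernel):
    """Evaluate a kernel on every pair of lines (rows of X)."""
    return [[kernel(xi, xj) for xj in X] for xi in X]


# Hypothetical marker matrix: 3 lines x 4 markers coded -1, 0, 1.
X = [
    [1, 0, -1, 1],
    [1, 0, -1, 1],
    [-1, 1, 0, 0],
]

K_linear = kernel_matrix(X, linear_kernel)
K_poly = kernel_matrix(X, polynomial_kernel)
K_gauss = kernel_matrix(X, gaussian_kernel)
print(K_gauss)
```

Any of these matrices can stand in for the genomic relationship matrix in a kernel regression; which one works best is an empirical question for a given data set.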

7.

A robust Bayesian genome-based median regression model.

Montesinos-López, Abelardo; Montesinos-López, Osval A; Villa-Diharce, Enrique R; Gianola, Daniel; Crossa, José.

Theor Appl Genet ; 132(5): 1587-1606, 2019 May.

Article in English | MEDLINE | ID: mdl-30747261

ABSTRACT

KEY MESSAGE: Current genome-enabled prediction models assumed errors normally distributed, which are sensitive to outliers. We propose a model with errors assumed to follow a Laplace distribution to deal better with outliers. Current genome-enabled prediction models use regressions that fit the expected value (mean) of a response variable with errors assumed normally distributed, which are often sensitive to outliers, either genetic or environmental. For this reason, we propose a robust Bayesian genome median regression (BGMR) model that fits regressions to the medians of a distribution, with errors assumed to follow a Laplace distribution to deal better with outliers. The BGMR model was evaluated under a Bayesian framework with Markov Chain Monte Carlo sampling using a location-scale mixture representation of the Laplace distribution. The BGMR was implemented with two simulated and two real genomic data sets, and we compared its prediction performance with that of a conventional genomic best linear unbiased prediction (GBLUP) model and the Laplace maximum a posteriori (LMAP) method. The prediction accuracies of BGMR were higher than those of the GBLUP and LMAP methods when there were outliers. The BGMR model could be useful to breeders who need to predict and select genotypes based on data with unknown outliers.

Subject(s)

Breeding; Genome, Plant; Models, Theoretical; Plants/genetics; Bayes Theorem; Computer Simulation; Markov Chains; Monte Carlo Method; Regression Analysis

8.

A singular value decomposition Bayesian multiple-trait and multiple-environment genomic model.

Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Ramírez-Alcaraz, Juan Manuel; Singh, Ravi; Mondal, S; Juliana, P.

Heredity (Edinb) ; 122(4): 381-401, 2019 04.

Article in English | MEDLINE | ID: mdl-30120367

ABSTRACT

Today, breeders perform genomic-assisted breeding to improve more than one trait. However, frequently there are several traits under study at one time, and the implementation of current genomic multiple-trait and multiple-environment models is challenging. Consequently, we propose a four-stage analysis for multiple-trait data in this paper. In the first stage, we perform singular value decomposition (SVD) on the resulting matrix of trait responses; in the second stage, we perform multiple trait analysis on transformed responses. In stages three and four, we collect and transform the traits back to their original state and obtain the parameter estimates and the predictions on these scale variables prior to transformation. The results of the proposed method are compared, in terms of parameter estimation and prediction accuracy, with the results of the Bayesian multiple-trait and multiple-environment model (BMTME) previously described in the literature. We found that the proposed method based on SVD produced similar results, in terms of parameter estimation and prediction accuracy, to those obtained with the BMTME model. Moreover, the proposed multiple-trait method is attractive because it can be implemented using current single-trait genomic prediction software, which yields a more efficient algorithm in terms of computation.

Subject(s)

Gene-Environment Interaction; Genomics/methods; Models, Genetic; Quantitative Trait, Heritable; Algorithms; Bayes Theorem; Breeding; Genome/genetics; Genotype; Phenotype; Selection, Genetic

9.

An extended multiplicative error model of allometry: Incorporating systematic components, non-normal distributions, and piecewise heteroscedasticity.

Echavarría-Heras, Héctor; Villa-Diharce, Enrique; Montesinos-López, Abelardo; Leal-Ramírez, Cecilia.

Biol Methods Protoc ; 9(1): bpae024, 2024.

Article in English | MEDLINE | ID: mdl-38765636

ABSTRACT

Allometry refers to the relationship between the size of a trait and that of the whole body of an organism. Pioneering observations by Otto Snell and further elucidation by D'Arcy Thompson set the stage for its integration into Huxley's explanation of constant relative growth that epitomizes through the formula of simple allometry. The traditional method to identify such a model conforms to a regression protocol fitted in the direct scales of data. It involves Huxley's formula-systematic part and a lognormally distributed multiplicative error term. In many instances of allometric examination, the predictive strength of this paradigm is unsuitable. Established approaches to improve fit enhance the complexity of the systematic relationship while keeping the go-along normality-borne error. These extensions followed Huxley's idea that considering a biphasic allometric pattern could be necessary. However, for present data composing 10 410 pairs of measurements of individual eelgrass leaf dry weight and area, a fit relying on a biphasic systematic term and multiplicative lognormal errors barely improved correspondence measure values while maintaining a heavy tails problem. Moreover, the biphasic form and multiplicative-lognormal-mixture errors did not provide complete fit dependability either. However, updating the outline of such an error term to allow heteroscedasticity to occur in a piecewise-like mode finally produced overall fit consistency. Our results demonstrate that when attempting to achieve fit quality improvement in a Huxley's model-based multiplicative error scheme, allowing for a complex allometry form for the systematic part, a non-normal distribution-driven error term and a composite of uneven patterns to describe the heteroscedastic outline could be essential.

10.

Feature engineering of environmental covariates improves plant genomic-enabled prediction.

Montesinos-López, Osval A; Crespo-Herrera, Leonardo; Pierre, Carolina Saint; Cano-Paez, Bernabe; Huerta-Prado, Gloria Isabel; Mosqueda-González, Brandon Alejandro; Ramos-Pulido, Sofia; Gerard, Guillermo; Alnowibet, Khalid; Fritsche-Neto, Roberto; Montesinos-López, Abelardo; Crossa, José.

Front Plant Sci ; 15: 1349569, 2024.

Article in English | MEDLINE | ID: mdl-38812738

ABSTRACT

Introduction: Because genomic selection (GS) is a predictive methodology, it needs to guarantee high prediction accuracies for practical implementations. However, since many factors affect the prediction performance of this methodology, its practical implementation still needs to be improved in many breeding programs. For this reason, many strategies have been explored to improve the prediction performance of this methodology. Methods: When environmental covariates are incorporated as inputs in the genomic prediction models, this information only sometimes helps increase prediction performance. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models. Results and discussion: We found that across data sets, feature engineering helps reduce prediction error, relative to including the environmental covariates without feature engineering, by 761.625% across predictors. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to guarantee a robust feature engineering strategy to incorporate the environmental covariates.

11.

Deep learning methods improve genomic prediction of wheat breeding.

Montesinos-López, Abelardo; Crespo-Herrera, Leonardo; Dreisigacker, Susanna; Gerard, Guillermo; Vitale, Paolo; Saint Pierre, Carolina; Govindan, Velu; Tarekegn, Zerihun Tadesse; Flores, Moisés Chavira; Pérez-Rodríguez, Paulino; Ramos-Pulido, Sofía; Lillemo, Morten; Li, Huihui; Montesinos-López, Osval A; Crossa, Jose.

Front Plant Sci ; 15: 1324090, 2024.

Article in English | MEDLINE | ID: mdl-38504889

ABSTRACT

In the field of plant breeding, various machine learning models have been developed and studied to evaluate the genomic prediction (GP) accuracy of unseen phenotypes. Deep learning has shown promise. However, most studies on deep learning in plant breeding have been limited to small datasets, and only a few have explored its application in moderate-sized datasets. In this study, we aimed to address this limitation by utilizing a moderately large dataset. We examined the performance of a deep learning (DL) model and compared it with the widely used and powerful genomic best linear unbiased prediction (GBLUP) model. The goal was to assess the GP accuracy in the context of a five-fold cross-validation strategy and when predicting complete environments using the DL model. The results revealed that the DL model outperformed the GBLUP model in terms of GP accuracy for two out of the five included traits in the five-fold cross-validation strategy, with similar results in the other traits. This indicates the superiority of the DL model in predicting these specific traits. Furthermore, when predicting complete environments using the leave-one-environment-out (LOEO) approach, the DL model demonstrated competitive performance. It is worth noting that the DL model employed in this study extends a previously proposed multi-modal DL model, which had been primarily applied to image data but with small datasets. By utilizing a moderately large dataset, we were able to evaluate the performance and potential of the DL model in a more information-rich and challenging plant breeding scenario.

12.

Data Augmentation Enhances Plant-Genomic-Enabled Predictions.

Montesinos-López, Osval A; Solis-Camacho, Mario Alberto; Crespo-Herrera, Leonardo; Saint Pierre, Carolina; Huerta Prado, Gloria Isabel; Ramos-Pulido, Sofia; Al-Nowibet, Khalid; Fritsche-Neto, Roberto; Gerard, Guillermo; Montesinos-López, Abelardo; Crossa, José.

Genes (Basel) ; 15(3)2024 02 24.

Article in English | MEDLINE | ID: mdl-38540344

ABSTRACT

Genomic selection (GS) is revolutionizing plant breeding. However, its practical implementation is still challenging, since there are many factors that affect its accuracy. For this reason, this research explores data augmentation with the goal of improving its accuracy. Deep neural networks with data augmentation (DA) generate synthetic data from the original training set to increase the training set and to improve the prediction performance of any statistical or machine learning algorithm. There is much empirical evidence of their success in many computer vision applications. Due to this, DA was explored in the context of GS using 14 real datasets. We found empirical evidence that DA is a powerful tool to improve the prediction accuracy, since we improved the prediction accuracy of the top lines in the 14 datasets under study. On average, across datasets and traits, the gain in prediction performance of the DA approach relative to the conventional method in the top 20% of lines in the testing set was 108.4% in terms of the NRMSE and 107.4% in terms of the MAAPE, but a worse performance was observed on the whole testing set. We encourage more empirical evaluations to support our findings.

Subject(s)

Genome, Plant; Genomics; Phenotype; Machine Learning; Neural Networks, Computer
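
Entry 12 reports gains on the top 20% of lines in the testing set in terms of MAAPE. A minimal sketch of that kind of evaluation is given below. It assumes the usual definition of the mean arctangent absolute percentage error, the mean of arctan(|(observed - predicted) / observed|), and it assumes that the top lines are chosen by their observed values. Neither detail is stated in the abstract, and the helper names and data are hypothetical.

```python
import math


def maape(observed, predicted):
    """Mean arctangent absolute percentage error (observed values must be nonzero)."""
    terms = [math.atan(abs((o - p) / o)) for o, p in zip(observed, predicted)]
    return sum(terms) / len(terms)


def top_fraction(observed, predicted, fraction=0.2):
    """Keep the lines with the highest observed values (assumed selection rule)."""
    k = max(1, round(len(observed) * fraction))
    order = sorted(range(len(observed)), key=lambda i: observed[i], reverse=True)[:k]
    return [observed[i] for i in order], [predicted[i] for i in order]


# Hypothetical testing set of 10 lines.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
predicted = [2.0, 2.5, 3.5, 4.0, 5.0, 5.5, 6.0, 7.0, 7.5, 8.0]

top_obs, top_pred = top_fraction(observed, predicted, fraction=0.2)
print(maape(observed, predicted), maape(top_obs, top_pred))
```

Because the arctangent bounds each term below pi / 2, the metric stays finite even when an observed value is close to zero, which is the usual argument for preferring it over the plain mean absolute percentage error.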

13.

A marker weighting approach for enhancing within-family accuracy in genomic prediction.

Montesinos-López, Osval A; Crespo-Herrera, Leonardo; Xavier, Alencar; Godwa, Manje; Beyene, Yoseph; Pierre, Carolina Saint; de la Rosa-Santamaria, Roberto; Salinas-Ruiz, Josafhat; Gerard, Guillermo; Vitale, Paolo; Dreisigacker, Susanne; Lillemo, Morten; Grignola, Fernando; Sarinelli, Martin; Pozzo, Ezequiel; Quiroga, Marco; Montesinos-López, Abelardo; Crossa, José.

G3 (Bethesda) ; 14(2)2024 Feb 07.

Article in English | MEDLINE | ID: mdl-38079160

ABSTRACT

Genomic selection is revolutionizing plant breeding. However, its practical implementation is still very challenging, since predicted values do not necessarily have high correspondence to the observed phenotypic values. When the goal is to predict within-family, it is not always possible to obtain reasonable accuracies, which is of paramount importance to improve the selection process. For this reason, in this research, we propose the Adversaria-Boruta (AB) method, which combines the virtues of the adversarial validation (AV) method and the Boruta feature selection method. The AB method operates primarily by minimizing the disparity between training and testing distributions. This is accomplished by reducing the weight assigned to markers that display the most significant differences between the training and testing sets. Therefore, the AB method built a weighted genomic relationship matrix that is implemented with the genomic best linear unbiased predictor (GBLUP) model. The proposed AB method is compared using 12 real data sets with the GBLUP model that uses a nonweighted genomic relationship matrix. Our results show that the proposed AB method outperforms the GBLUP by 8.6, 19.7, and 9.8% in terms of Pearson's correlation, mean square error, and normalized root mean square error, respectively. Our results support that the proposed AB method is a useful tool to improve the prediction accuracy of a complete family, however, we encourage other investigators to evaluate the AB method to increase the empirical evidence of its potential.

Subject(s)

Models, Genetic; Polymorphism, Single Nucleotide; Genome; Genomics/methods; Linear Models; Phenotype; Genotype

14.

Sparse multi-trait genomic prediction under balanced incomplete block design.

Montesinos-López, Osval A; Mosqueda-González, Brandon A; Salinas-Ruiz, Josafat; Montesinos-López, Abelardo; Crossa, José.

Plant Genome ; 16(2): e20305, 2023 06.

Article in English | MEDLINE | ID: mdl-36815225

ABSTRACT

Sparse testing is essential to increase the efficiency of the genomic selection methodology, as the same efficiency (in this case prediction power) can be obtained while using fewer genotypes evaluated in the fields. For this reason, it is important to evaluate the existing methods for performing the allocation of lines to environments. With this goal, four methods (M1-M4) to allocate lines to environments were evaluated under the context of a multi-trait genomic prediction problem: M1 denotes the allocation of a fraction (subset) of lines in all locations, M2 denotes the allocation of a fraction of lines with some shared lines in locations but not arranged based on the balanced incomplete block design (BIBD) principle, M3 denotes the random allocation of a subset of lines to locations, and M4 denotes the allocation of a subset of lines to locations using the BIBD principle. The evaluation was done using seven real multi-environment data sets common in plant breeding programs. We found that the best method was M4 and the worst was M1, while no important differences were found between M3 and M4. We concluded that M4 and M3 are efficient in the context of sparse testing for multi-trait prediction.

Subject(s)

Genome, Plant; Plant Breeding; Phenotype; Genotype; Genomics

15.

Multimodal deep learning methods enhance genomic prediction of wheat breeding.

Montesinos-López, Abelardo; Rivera, Carolina; Pinto, Francisco; Piñera, Francisco; Gonzalez, David; Reynolds, Mathew; Pérez-Rodríguez, Paulino; Li, Huihui; Montesinos-López, Osval A; Crossa, Jose.

G3 (Bethesda) ; 13(5)2023 05 02.

Article in English | MEDLINE | ID: mdl-36869747

ABSTRACT

While several statistical machine learning methods have been developed and studied for assessing the genomic prediction (GP) accuracy of unobserved phenotypes in plant breeding research, few methods have linked genomics and phenomics (imaging). Deep learning (DL) neural networks have been developed to increase the GP accuracy of unobserved phenotypes while simultaneously accounting for the complexity of genotype-environment interaction (GE); however, unlike conventional GP models, DL has not been investigated for when genomics is linked with phenomics. In this study we used 2 wheat data sets (DS1 and DS2) to compare a novel DL method with conventional GP models. Models fitted for DS1 were GBLUP, gradient boosting machine (GBM), support vector regression (SVR) and the DL method. Results indicated that for 1 year, DL provided better GP accuracy than results obtained by the other models. However, GP accuracy obtained for other years indicated that the GBLUP model was slightly superior to the DL. DS2 is comprised only of genomic data from wheat lines tested for 3 years, 2 environments (drought and irrigated) and 2-4 traits. DS2 results showed that when predicting the irrigated environment with the drought environment, DL had higher accuracy than the GBLUP model in all analyzed traits and years. When predicting drought environment with information on the irrigated environment, the DL model and GBLUP model had similar accuracy. The DL method used in this study is novel and presents a strong degree of generalization as several modules can potentially be incorporated and concatenated to produce an output for a multi-input data structure.

Subject(s)

Deep Learning; Triticum; Triticum/genetics; Plant Breeding/methods; Models, Genetic; Phenotype; Genomics/methods; Genotype

16.

Partial least squares enhance multi-trait genomic prediction of potato cultivars in new environments.

Ortiz, Rodomiro; Reslow, Fredrik; Montesinos-López, Abelardo; Huicho, José; Pérez-Rodríguez, Paulino; Montesinos-López, Osval A; Crossa, José.

Sci Rep ; 13(1): 9947, 2023 06 19.

Article in English | MEDLINE | ID: mdl-37336933

ABSTRACT

It is of paramount importance in plant breeding to have methods dealing with large numbers of predictor variables and few sample observations, as well as efficient methods for dealing with high correlation in predictors and measured traits. This paper explores in terms of prediction performance the partial least squares (PLS) method under single-trait (ST) and multi-trait (MT) prediction of potato traits. The first prediction was for tested lines in tested environments under a five-fold cross-validation (5FCV) strategy and the second prediction was for tested lines in untested environments (herein denoted as leave one environment out cross validation, LOEO). Predictive performance was good (with accuracy mostly > 0.5 for Pearson's correlation), and the accuracy of 5FCV was better than that of LOEO. Hence, we have empirical evidence that the ST and MT PLS framework is a very valuable tool for prediction in the context of potato breeding data.

Subject(s)

Solanum tuberosum; Solanum tuberosum/genetics; Least-Squares Analysis; Models, Genetic; Plant Breeding; Phenotype; Genomics/methods; Genotype
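
The two validation schemes contrasted in entry 16 differ only in how records are split, which the following sketch tries to show. It is an illustrative outline rather than the partitioning code used in the study: the function names, the fixed random seed and the environment labels are hypothetical.

```python
import random


def k_fold(n_records, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split (5FCV when k=5)."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


def leave_one_environment_out(environments):
    """Yield (held_out_env, train_idx, test_idx), holding out one whole environment."""
    for env in sorted(set(environments)):
        test_idx = [i for i, e in enumerate(environments) if e == env]
        train_idx = [i for i, e in enumerate(environments) if e != env]
        yield env, train_idx, test_idx


# Hypothetical records: each entry is the environment in which a line was observed.
environments = ["E1", "E1", "E2", "E3", "E2", "E3"]

for env, train_idx, test_idx in leave_one_environment_out(environments):
    print(env, train_idx, test_idx)

for train_idx, test_idx in k_fold(10, k=5):
    print(train_idx, test_idx)
```

In the k-fold split every environment contributes records to the training set, whereas LOEO removes all records of the target environment, which is consistent with the lower accuracy reported for LOEO.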

17.

Integrating Parental Phenotypic Data Enhances Prediction Accuracy of Hybrids in Wheat Traits.

Montesinos-López, Osval A; Bentley, Alison R; Saint Pierre, Carolina; Crespo-Herrera, Leonardo; Salinas Ruiz, Josafhat; Valladares-Celis, Patricia Edwigis; Montesinos-López, Abelardo; Crossa, José.

Genes (Basel) ; 14(2)2023 02 02.

Article in English | MEDLINE | ID: mdl-36833322

ABSTRACT

Genomic selection (GS) is a methodology that is revolutionizing plant breeding because it can select candidate genotypes without phenotypic evaluation in the field. However, its practical implementation in hybrid prediction remains challenging since many factors affect its accuracy. The main objective of this study was to research the genomic prediction accuracy of wheat hybrids by adding covariates with the hybrid parental phenotypic information to the model. Four types of different models (MA, MB, MC, and MD) with one covariate (same trait to be predicted) (MA_C, MB_C, MC_C, and MD_C) or several covariates (of the same trait and other correlated traits) (MA_AC, MB_AC, MC_AC, and MD_AC) were studied. We found that the four models with parental information outperformed models without parental information in terms of mean square error by at least 14.1% (MA vs. MA_C), 5.5% (MB vs. MB_C), 51.4% (MC vs. MC_C), and 6.4% (MD vs. MD_C) when parental information of the same trait was used and by at least 13.7% (MA vs. MA_AC), 5.3% (MB vs. MB_AC), 55.1% (MC vs. MC_AC), and 6.0% (MD vs. MD_AC) when parental information of the same trait and other correlated traits were used. Our results also show a large gain in prediction accuracy when covariates were considered using the parental phenotypic information, as opposed to marker information. Finally, our results empirically demonstrate that a significant improvement in prediction accuracy was gained by adding parental phenotypic information as covariates; however, this is expensive since, in many breeding programs, the parental phenotypic information is unavailable.

Subject(s)

Genetic Models, Triticum, Triticum/genetics, Single Nucleotide Polymorphism, Plant Breeding, Phenotype

18.

A novel method for genomic-enabled prediction of cultivars in new environments.

Montesinos-López, Osval A; Ramos-Pulido, Sofia; Hernández-Suárez, Carlos Moisés; Mosqueda González, Brandon Alejandro; Valladares-Anguiano, Felícitas Alejandra; Vitale, Paolo; Montesinos-López, Abelardo; Crossa, José.

Front Plant Sci ; 14: 1218151, 2023.

Article in English | MEDLINE | ID: mdl-37564390

ABSTRACT

Introduction: Genomic selection (GS) has gained global importance due to its potential to accelerate genetic progress and improve the efficiency of breeding programs. Objectives of the research: In this research we proposed a method to improve the prediction accuracy of tested lines in new (untested) environments. Method-1: The new method trains the model with a modified response variable (a difference of response variables) that reduces the non-stationarity of the distribution between the training and testing sets and thereby improves the prediction accuracy. Comparing new and conventional method: We compared the prediction accuracy of the conventional genomic best linear unbiased prediction (GBLUP) model (M1), with or without genotype × environment interaction (GE) (M1_GE; M1_NO_GE), versus the proposed method (M2) on several data sets. Results and discussion: The gain in prediction accuracy of M2 versus M1_GE and M1_NO_GE was at least 4.3% in terms of Pearson's correlation, at least 19.5% in terms of the percentage of top-yielding lines captured when the top 10% (Best10) and 20% (Best20) of lines were selected, and at least 42.29% in terms of normalized root mean squared error (NRMSE).
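The three metrics named in the abstract above can be illustrated with a minimal Python sketch. The abstract does not define them precisely, so the choices here are assumptions: NRMSE is taken as the RMSE divided by the mean of the observed values, and Best10/Best20 as the percentage of the truly top-ranked lines that also appear among the top-ranked predictions. The function names `pearson`, `nrmse`, and `best_capture` are hypothetical and do not come from the paper.

```python
import math

def pearson(obs, pred):
    """Pearson's correlation between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)

def nrmse(obs, pred):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return rmse / (sum(obs) / len(obs))

def best_capture(obs, pred, fraction):
    """Percentage of the truly top-ranked lines that are also top-ranked by prediction."""
    k = max(1, round(len(obs) * fraction))
    idx = range(len(obs))
    top_obs = set(sorted(idx, key=lambda i: obs[i], reverse=True)[:k])
    top_pred = set(sorted(idx, key=lambda i: pred[i], reverse=True)[:k])
    return 100.0 * len(top_obs & top_pred) / k

# Made-up yields for ten lines, for illustration only.
observed = [5.0, 6.0, 7.0, 8.0, 9.0, 4.0, 3.0, 10.0, 2.0, 1.0]
predicted = [5.5, 6.5, 6.0, 9.5, 8.0, 4.5, 2.0, 10.5, 2.5, 1.5]

print("Pearson:", pearson(observed, predicted))
print("NRMSE:", nrmse(observed, predicted))
print("Best10:", best_capture(observed, predicted, 0.10))
print("Best20:", best_capture(observed, predicted, 0.20))
```

Higher is better for Pearson's correlation and for Best10/Best20, while lower is better for NRMSE, which is why gains are reported separately for each metric.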

19.

Optimizing Sparse Testing for Genomic Prediction of Plant Breeding Crops.

Montesinos-López, Osval A; Saint Pierre, Carolina; Gezan, Salvador A; Bentley, Alison R; Mosqueda-González, Brandon A; Montesinos-López, Abelardo; van Eeuwijk, Fred; Beyene, Yoseph; Gowda, Manje; Gardner, Keith; Gerard, Guillermo S; Crespo-Herrera, Leonardo; Crossa, José.

Genes (Basel) ; 14(4), 2023 Apr 17.

Article in English | MEDLINE | ID: mdl-37107685

ABSTRACT

While sparse testing methods have been proposed by researchers to improve the efficiency of genomic selection (GS) in breeding programs, there are several factors that can hinder this. In this research, we evaluated four methods (M1-M4) for sparse testing allocation of lines to environments under multi-environment trials for genomic prediction of unobserved lines. The sparse testing methods described in this study are applied in a two-stage analysis to build the genomic training and testing sets in a strategy that allows each location or environment to evaluate only a subset of all genotypes rather than all of them. To ensure a valid implementation, the sparse testing methods presented here require BLUEs (or BLUPs) of the lines to be computed at the first stage using an appropriate experimental design and statistical analyses in each location (or environment). In the second stage, the four methods for allocating cultivars to environments were evaluated with four data sets (two large and two small) under a multi-trait and uni-trait framework. We found that the multi-trait model produced better genomic prediction (GP) accuracy than the uni-trait model and that methods M3 and M4 were slightly better than methods M1 and M2 for the allocation of lines to environments. One of the most important findings, however, was that even under a scenario where we used a training-testing ratio of 15-85%, the prediction accuracy of the four methods barely decreased. This indicates that genomic sparse testing methods for data sets under these scenarios can save considerable operational and financial resources with only a small loss in precision, as shown in our cost-benefit analysis.
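The general idea of sparse testing, in which each environment evaluates only a subset of the lines and the rest are predicted, can be sketched as follows. This is a plain random allocation written for illustration under stated assumptions; it is not any of the methods M1-M4, which the abstract does not describe. The function name `allocate_sparse` and the line and environment labels are hypothetical.

```python
import random

def allocate_sparse(lines, environments, train_fraction, seed=0):
    """Randomly choose, per environment, which lines are field-tested (training)
    and which are left unobserved and must be predicted (testing)."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    n_train = max(1, round(len(lines) * train_fraction))
    allocation = {}
    for env in environments:
        tested = set(rng.sample(lines, n_train))
        allocation[env] = {
            "train": sorted(tested),
            "test": sorted(line for line in lines if line not in tested),
        }
    return allocation

# Hypothetical example: 100 lines, 3 environments, 15-85% training-testing split.
lines = [f"L{i:03d}" for i in range(100)]
environments = ["E1", "E2", "E3"]
allocation = allocate_sparse(lines, environments, train_fraction=0.15)

for env in environments:
    print(env, len(allocation[env]["train"]), "tested,",
          len(allocation[env]["test"]), "to predict")
```

In the two-stage setting described above, the "train" entries would correspond to the BLUEs (or BLUPs) available from the first stage in each environment, and the "test" entries to the line-environment combinations predicted by the genomic model.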

Subject(s)

Genetic Models, Plant Breeding, Plant Breeding/methods, Plant Genome/genetics, Phenotype, Genomics, Agricultural Crops/genetics

20.

Efficacy of plant breeding using genomic information.

Montesinos-López, Osval A; Bentley, Alison R; Saint Pierre, Carolina; Crespo-Herrera, Leonardo; Rebollar-Ruellas, Leonardo; Valladares-Celis, Patricia Edwigis; Lillemo, Morten; Montesinos-López, Abelardo; Crossa, José.

Plant Genome ; 16(2): e20346, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37139645

ABSTRACT

Genomic selection (GS), proposed by Meuwissen et al. more than 20 years ago, is revolutionizing plant and animal breeding. Although GS has been widely accepted and applied to plant and animal breeding, there are many factors affecting its efficacy. We studied 14 real datasets to answer the practical question of whether the accuracy of genomic prediction increases when genomic information is used compared with when it is not. We found that, across traits, environments, datasets, and metrics, the average gain in prediction accuracy when genomic information is considered was 26.31%; in terms of Pearson's correlation alone the gain was 46.1%, and in terms of normalized root mean squared error alone the gain was 6.6%. If the quality of the markers and the relatedness of the individuals increase, major gains in prediction accuracy can be obtained, but if these two factors decrease, a lower increase is possible. Finally, our findings reinforce that genomic information is vital for improving the prediction accuracy and, therefore, the realized genetic gain in genomic-assisted plant breeding programs.

Subject(s)

Plant Breeding, Genetic Selection, Animals, Genetic Models, Genome, Genomics
