Results 1 - 8 of 8
1.
Theor Appl Genet ; 136(8): 176, 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37532821

ABSTRACT

KEY MESSAGE: Training sets produced by maximizing the number of parent lines, each involved in one cross, had the highest prediction accuracy for H0 hybrids, but the lowest for H1 and H2 hybrids. Genomic prediction holds great promise for hybrid breeding, but the optimum composition of the training set (TS) as determined by the number of parents (nTS) and crosses per parent (c) has received little attention. Our objective was to examine prediction accuracy ([Formula: see text]) of GCA for lines used as parents of the TS (I1 lines) or not (I0 lines), and H0, H1 and H2 hybrids, comprising crosses of type I0 × I0, I1 × I0 and I1 × I1, respectively, as a function of nTS and c. In the theory part, we developed estimates for [Formula: see text] of GBLUPs for hybrids: (i) [Formula: see text] based on the expected prediction accuracy, and (ii) [Formula: see text] based on [Formula: see text] of GBLUPs of GCA and SCA effects. In the simulation part, hybrid populations were generated using molecular data from two experimental maize data sets. Additive and dominance effects of QTL borrowed from the literature were used to simulate six scenarios of traits differing in the proportion (τSCA = 1%, 6%, 22%) of SCA variance in σ²_G and heritability (h² = 0.4, 0.8). Values of [Formula: see text] and [Formula: see text] closely agreed with [Formula: see text] for hybrids. For given size NTS = nTS × c of TS, [Formula: see text] of H0 hybrids and GCA of I0 lines was highest for c = 1. Conversely, for GCA of I1 lines and H1 and H2 hybrids, c = 1 yielded the lowest [Formula: see text] with concordant results across all scenarios for both data sets. In view of these opposite trends, the optimum choice of c for maximizing selection response across all types of hybrids depends on the size and resources of the breeding program.


Subjects
Genomics , Plant Breeding , Phenotype , Plant Genome , Computer Simulation , Genetic Models
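
The size relation NTS = nTS × c stated in the abstract above can be illustrated with a short sketch. The function name `training_set_designs` and the example size of 12 are assumptions made for illustration, not values from the study. The sketch only enumerates the combinations of parents and crosses per parent that give a fixed training-set size; it does not compute prediction accuracy.

```python
def training_set_designs(n_ts_total):
    """Return every (n_parents, crosses_per_parent) pair whose product
    equals the requested training-set size NTS = nTS * c."""
    if n_ts_total < 1:
        raise ValueError("training-set size must be a positive integer")
    designs = []
    for c in range(1, n_ts_total + 1):
        # Keep only values of c that divide the total size exactly.
        if n_ts_total % c == 0:
            designs.append((n_ts_total // c, c))
    return designs


# Hypothetical training-set size of 12 (not taken from the study).
for n_parents, c in training_set_designs(12):
    print(f"nTS = {n_parents:2d} parents, c = {c:2d} crosses per parent")
```

The first pair returned, with c = 1, corresponds to the design that maximizes the number of parent lines for the given size.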
2.
Theor Appl Genet ; 134(9): 3069-3081, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34117908

ABSTRACT

KEY MESSAGE: Model training on data from all selection cycles yielded the highest prediction accuracy by attenuating specific effects of individual cycles. Expected reliability was a robust predictor of accuracies obtained with different calibration sets. The transition from phenotypic to genome-based selection requires a profound understanding of factors that determine genomic prediction accuracy. We analysed experimental data from a commercial maize breeding programme to investigate if genomic measures can assist in identifying optimal calibration sets for model training. The data set consisted of six contiguous selection cycles comprising testcrosses of 5968 doubled haploid lines genotyped with a minimum of 12,000 SNP markers. We evaluated genomic prediction accuracies in two independent prediction sets in combination with calibration sets differing in sample size and genomic measures (effective sample size, average maximum kinship, expected reliability, number of common polymorphic SNPs and linkage phase similarity). Our results indicate that across selection cycles prediction accuracies were as high as 0.57 for grain dry matter yield and 0.76 for grain dry matter content. Including data from all selection cycles in model training yielded the best results because interactions between calibration and prediction sets as well as the effects of different testers and specific years were attenuated. Among genomic measures, the expected reliability of genomic breeding values was the best predictor of empirical accuracies obtained with different calibration sets. For grain yield, a large difference between expected and empirical reliability was observed in one prediction set. We propose to use this difference as guidance for determining the weight phenotypic data of a given selection cycle should receive in model retraining and for selection when both genomic breeding values and phenotypes are available.


Subjects
Plant Chromosomes/genetics , Plant Genome , Phenotype , Plant Breeding/methods , Single Nucleotide Polymorphism , Zea mays/growth & development , Zea mays/genetics , Chromosome Mapping/methods , Quantitative Trait Loci
3.
Theor Appl Genet ; 129(11): 2043-2053, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27480157

ABSTRACT

KEY MESSAGE: Genomic prediction accuracy can be significantly increased by model calibration across multiple breeding cycles as long as selection cycles are connected by common ancestors. In hybrid rye breeding, application of genome-based prediction is expected to increase selection gain because of long selection cycles in population improvement and development of hybrid components. Essentially two prediction scenarios arise: (1) prediction of the genetic value of lines from the same breeding cycle in which model training is performed and (2) prediction of lines from subsequent cycles. It is the latter from which a reduction in cycle length and consequently the strongest impact on selection gain is expected. We empirically investigated genome-based prediction of grain yield, plant height and thousand kernel weight within and across four selection cycles of a hybrid rye breeding program. Prediction performance was assessed using genomic and pedigree-based best linear unbiased prediction (GBLUP and PBLUP). A total of 1040 S2 lines were genotyped with 16k SNPs and each year testcrosses of 260 S2 lines were phenotyped in seven or eight locations. The performance gap between GBLUP and PBLUP increased significantly for all traits when model calibration was performed on aggregated data from several cycles. Prediction accuracies obtained from cross-validation were in the order of 0.70 for all traits when data from all cycles (N_CS = 832) were used for model training and exceeded within-cycle accuracies in all cases. As long as selection cycles are connected by a sufficient number of common ancestors and prediction accuracy has not reached a plateau when increasing sample size, aggregating data from several preceding cycles is recommended for predicting genetic values in subsequent cycles despite decreasing relatedness over time.


Subjects
Plant Genome , Genetic Models , Plant Breeding , Secale/genetics , Genetic Crosses , Genomics , Genotype , Pedigree , Phenotype , Single Nucleotide Polymorphism
4.
Theor Appl Genet ; 127(6): 1375-86, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24723140

ABSTRACT

KEY MESSAGE: The calibration data for genomic prediction should represent the full genetic spectrum of a breeding program. Data heterogeneity is minimized by connecting data sources through highly related test units. One of the major challenges of genome-enabled prediction in plant breeding lies in the optimum design of the population employed in model training. With highly interconnected breeding cycles staggered in time the choice of data for model training is not straightforward. We used cross-validation and independent validation to assess the performance of genome-based prediction within and across genetic groups, testers, locations, and years. The study comprised data for 1,073 and 857 doubled haploid lines evaluated as testcrosses in 2 years. Testcrosses were phenotyped for grain dry matter yield and content and genotyped with 56,110 single nucleotide polymorphism markers. Predictive abilities strongly depended on the relatedness of the doubled haploid lines from the estimation set with those on which prediction accuracy was assessed. For scenarios with strong population heterogeneity it was advantageous to perform predictions within a priori defined genetic groups until higher connectivity through related test units was achieved. Differences between group means had a strong effect on predictive abilities obtained with both cross-validation and independent validation. Predictive abilities across subsequent cycles of selection and years were only slightly reduced compared to predictive abilities obtained with cross-validation within the same year. We conclude that the optimum data set for model training in genome-enabled prediction should represent the full genetic and environmental spectrum of the respective breeding program. Data heterogeneity can be reduced by experimental designs that maximize the connectivity between data sources by common or highly related test units.


Subjects
Plant Genome , Genetic Hybridization , Zea mays/genetics , Breeding , Genotype , Genetic Models , Phenotype , Single Nucleotide Polymorphism , Zea mays/physiology
5.
Genetics ; 195(2): 573-87, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23934883

ABSTRACT

In genome-based prediction there is considerable uncertainty about the statistical model and method required to maximize prediction accuracy. For traits influenced by a small number of quantitative trait loci (QTL), predictions are expected to benefit from methods performing variable selection [e.g., BayesB or the least absolute shrinkage and selection operator (LASSO)] compared to methods distributing effects across the genome [ridge regression best linear unbiased prediction (RR-BLUP)]. We investigate the assumptions underlying successful variable selection by combining computer simulations with large-scale experimental data sets from rice (Oryza sativa L.), wheat (Triticum aestivum L.), and Arabidopsis thaliana (L.). We demonstrate that variable selection can be successful when the number of phenotyped individuals is much larger than the number of causal mutations contributing to the trait. We show that the sample size required for efficient variable selection increases dramatically with decreasing trait heritabilities and increasing extent of linkage disequilibrium (LD). We contrast and discuss contradictory results from simulation and experimental studies with respect to superiority of variable selection methods over RR-BLUP. Our results demonstrate that due to long-range LD, medium heritabilities, and small sample sizes, superiority of variable selection methods cannot be expected in plant breeding populations even for traits like FRIGIDA gene expression in Arabidopsis and flowering time in rice, assumed to be influenced by a few major QTL. We extend our conclusions to the analysis of whole-genome sequence data and infer upper bounds for the number of causal mutations which can be identified by LASSO. Our results have major impact on the choice of statistical method needed to make credible inferences about genetic architecture and prediction accuracy of complex traits.


Subjects
Breeding , Plant Genome , Genetic Models , Quantitative Trait Loci/genetics , Arabidopsis/genetics , Computer Simulation , Linear Models , Linkage Disequilibrium , Oryza/genetics , Phenotype , Genetic Selection , Triticum/genetics
6.
Stat Appl Genet Mol Biol ; 12(3): 375-91, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23629460

ABSTRACT

Different statistical models have been proposed for maximizing prediction accuracy in genome-based prediction of breeding values in plant and animal breeding. However, little is known about the sensitivity of these models with respect to prior and hyperparameter specification, because comparisons of prediction performance are mainly based on a single set of hyperparameters. In this study, we focused on Bayesian prediction methods using a standard linear regression model with marker covariates coding additive effects at a large number of marker loci. By comparing different hyperparameter settings, we investigated the sensitivity of four methods frequently used in genome-based prediction (Bayesian Ridge, Bayesian Lasso, BayesA and BayesB) to specification of the prior distribution of marker effects. We used datasets simulated according to a typical maize breeding program differing in the number of markers and the number of simulated quantitative trait loci affecting the trait. Furthermore, we used an experimental maize dataset, comprising 698 doubled haploid lines, each genotyped with 56,110 single nucleotide polymorphism markers and phenotyped as testcrosses for the two quantitative traits grain dry matter yield and grain dry matter content. The predictive ability of the different models was assessed by five-fold cross-validation. The extent of Bayesian learning was quantified by calculation of the Hellinger distance between the prior and posterior densities of marker effects. Our results indicate that similar predictive abilities can be achieved with all methods, but with BayesA and BayesB, hyperparameter settings had a stronger effect on prediction performance than with the other two methods. Prediction performance of BayesA and BayesB suffered substantially from a non-optimal choice of hyperparameters.


Subjects
Plant Genome , Genetic Models , Algorithms , Bayes Theorem , Breeding , Computer Simulation , Genetic Association Studies , Genetic Markers , Linear Models , Linkage Disequilibrium , Phenotype , Single Nucleotide Polymorphism , Quantitative Trait Loci , Sensitivity and Specificity , Zea mays/genetics
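
The abstract above quantifies Bayesian learning by the Hellinger distance between prior and posterior densities of marker effects. As a rough illustration of that measure, the sketch below uses the closed-form Hellinger distance between two univariate normal densities. The normality of both densities, the function name `hellinger_normal`, and the example parameter values are assumptions made here; the abstract does not state how the distance was computed, and the priors of methods such as BayesA and BayesB are not normal.

```python
import math


def hellinger_normal(mu_p, sd_p, mu_q, sd_q):
    """Hellinger distance between N(mu_p, sd_p^2) and N(mu_q, sd_q^2).

    Returns a value in [0, 1]: 0 for identical densities, values near 1
    for densities with almost no overlap.
    """
    if sd_p <= 0 or sd_q <= 0:
        raise ValueError("standard deviations must be positive")
    var_sum = sd_p ** 2 + sd_q ** 2
    # Bhattacharyya coefficient of the two normal densities.
    overlap = math.sqrt(2.0 * sd_p * sd_q / var_sum) * math.exp(
        -((mu_p - mu_q) ** 2) / (4.0 * var_sum)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - overlap))


# Hypothetical prior and posterior for one marker effect (illustrative values).
prior = (0.0, 1.0)      # mean, standard deviation
posterior = (0.0, 0.1)  # much narrower after seeing the data
print(round(hellinger_normal(*prior, *posterior), 3))
```

Under these assumptions, a distance close to 0 would indicate that the posterior barely moved away from the prior, while a larger distance would indicate that the data were informative for that marker effect.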
7.
Bioinformatics ; 28(15): 2086-7, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22689388

ABSTRACT

SUMMARY: We present a novel R package named synbreed to derive genome-based predictions from high-throughput genotyping and large-scale phenotyping data. The package contains a comprehensive collection of functions required to fit and cross-validate genomic prediction models. All functions are embedded within the framework of a single, unified data object. Thereby a versatile genomic prediction analysis pipeline covering data processing, visualization and analysis is established within one software package. The implementation is flexible with respect to a wide range of data formats and models. The package fills an existing gap in the availability of user-friendly software for next-generation genetics research and education. AVAILABILITY: synbreed is open-source and available through CRAN http://cran.r-project.org/web/packages/synbreed. The latest development version is available from R-Forge. The package synbreed is released with a vignette, a manual and three large-scale example datasets (from package synbreedData). CONTACT: chris.schoen@wzw.tum.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Genomics/methods , Software , Algorithms , Computer Simulation , Electronic Data Processing , Genotyping Techniques , Linear Models
8.
Theor Appl Genet ; 123(2): 339-50, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21505832

ABSTRACT

This is the first large-scale experimental study on genome-based prediction of testcross values in an advanced cycle breeding population of maize. The study comprised testcross progenies of 1,380 doubled haploid lines of maize derived from 36 crosses and phenotyped for grain yield and grain dry matter content in seven locations. The lines were genotyped with 1,152 single nucleotide polymorphism markers. Pedigree data were available for three generations. We used best linear unbiased prediction and stratified cross-validation to evaluate the performance of prediction models differing in the modeling of relatedness between inbred lines and in the calculation of genome-based coefficients of similarity. The choice of similarity coefficient did not affect prediction accuracies. Models including genomic information yielded significantly higher prediction accuracies than the model based on pedigree information alone. Average prediction accuracies based on genomic data were high even for a complex trait like grain yield (0.72-0.74) when the cross-validation scheme allowed for a high degree of relatedness between the estimation and the test set. When predictions were performed across distantly related families, prediction accuracies decreased significantly (0.47-0.48). Prediction accuracies decreased with decreasing sample size but were still high when the population size was halved (0.67-0.69). The results from this study are encouraging with respect to genome-based prediction of the genetic value of untested lines in advanced cycle breeding populations and the implementation of genomic selection in the breeding process.


Subjects
Plant Genome , Genetic Models , Zea mays/genetics , Breeding , Genetic Crosses , Genetic Variation , Genotype , Phenotype , Genetic Polymorphism
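
The abstract above contrasts cross-validation with closely related estimation and test sets against prediction across distantly related families. One possible way to set up the across-family case is sketched below: every line from the same family is placed in the same fold, so no family is split between estimation and test sets. The function name `family_folds`, the round-robin assignment, and the toy family labels are assumptions for illustration; the abstract does not describe the study's exact stratification scheme.

```python
def family_folds(families, k):
    """Assign each line to one of k folds so that all lines from the same
    family share a fold (across-family validation).

    `families` holds one family label per line; the result holds one fold
    index (0 .. k-1) per line, in the same order.
    """
    unique = sorted(set(families))
    if k < 1 or k > len(unique):
        raise ValueError("k must be between 1 and the number of families")
    # Distribute whole families over the folds in round-robin order.
    fold_of_family = {fam: i % k for i, fam in enumerate(unique)}
    return [fold_of_family[fam] for fam in families]


# Toy example: six lines from three hypothetical families.
labels = ["A", "A", "B", "B", "C", "C"]
print(family_folds(labels, 3))
```

Holding out one fold at a time then tests prediction only on families that did not contribute to the estimation set.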