Results 1 - 18 of 18
1.
Theor Appl Genet ; 135(8): 2891-2905, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35831462

ABSTRACT

KEY MESSAGE: We propose a simulation approach to compute response to genomic selection on a multi-environment framework to provide breeders the number of entries that need to be selected from the population to have a defined probability of selecting the truly best entry from the population and the probability of obtaining the truly best entries when some top-ranked entries are selected. The goal of any plant breeding program is to maximize genetic gain for traits of interest. In classical quantitative genetics, the genetic gain can be obtained from what is known as "Breeder's equation". In the past, only phenotypic data were used to compute the genetic gain. The advent of genomic prediction (GP) has opened the door to the utilization of dense markers for estimating genomic breeding values or GBV. The salient feature of GP is the possibility to carry out genomic selection with the assistance of the kinship matrix, hence improving the prediction accuracy and accelerating the breeding cycle. However, estimates of GBV as such do not provide the full information on the number of entries to be selected as in the classical response to selection. In this paper, we use simulation, based on a fitted mixed model for GP in a multi-environmental framework, to answer two typical questions of a plant breeder: (1) How many entries need to be selected to have a defined probability of selecting the truly best entry from the population; (2) what is the probability of obtaining the truly best entries when some top-ranked entries are selected.
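
The two breeder's questions above can be illustrated with a small Monte Carlo sketch. It is not the paper's procedure: instead of simulating from a fitted multi-environment mixed model, it assumes that true and predicted values are standard bivariate normal with a given correlation (`accuracy`). The function names and parameters are hypothetical.

```python
import random


def prob_best_in_top_k(n_entries, k, accuracy, n_sim=2000, seed=1):
    """Estimate P(the truly best entry is among the k top-ranked predictions).

    Assumption: true values and predictions are standard bivariate normal
    with correlation `accuracy` (a stand-in for a fitted mixed model).
    """
    rng = random.Random(seed)
    noise_sd = (1.0 - accuracy ** 2) ** 0.5
    hits = 0
    for _ in range(n_sim):
        true = [rng.gauss(0.0, 1.0) for _ in range(n_entries)]
        pred = [accuracy * g + noise_sd * rng.gauss(0.0, 1.0) for g in true]
        best = max(range(n_entries), key=true.__getitem__)
        ranked = sorted(range(n_entries), key=pred.__getitem__, reverse=True)
        hits += best in ranked[:k]
    return hits / n_sim


def entries_needed(n_entries, accuracy, target, **kwargs):
    """Smallest k whose estimated probability reaches `target` (question 1)."""
    for k in range(1, n_entries + 1):
        if prob_best_in_top_k(n_entries, k, accuracy, **kwargs) >= target:
            return k
    return n_entries


if __name__ == "__main__":
    # Question 2: chance that the best of 50 entries is in the top 5.
    print(prob_best_in_top_k(50, 5, accuracy=0.7))
    # Question 1: entries to select for a 90% chance of keeping the best.
    print(entries_needed(50, accuracy=0.7, target=0.9))
```

Because the same seed is reused for every `k`, the estimated probability never decreases as `k` grows, which keeps the search in `entries_needed` well behaved.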


Subject(s)
Models, Genetic , Plant Breeding , Genome , Genomics , Phenotype , Plants/genetics , Selection, Genetic
4.
Crop Sci ; 61(4): 2243-2253, 2021.
Article in English | MEDLINE | ID: mdl-34413534

ABSTRACT

This paper presents an extension to a heuristic method for phasing and imputation of genotypes of descendants in biparental populations so that it can phase and impute genotypes of parents that are ungenotyped or partially genotyped. The imputed genotypes of the parent are used to impute low-density (Ld) genotyped descendants to high density (Hd). The extension was implemented as part of the AlphaPlantImpute software and works in three steps. First, it identifies whether a parent has no or Ld genotypes and identifies its relatives that have Hd genotypes. Second, using the Hd genotypes of relatives, it determines whether the parent is homozygous or heterozygous for a given locus. Third, it phases heterozygous positions of the parent by matching haplotypes to its relatives. We measured the accuracy (correlation between true and imputed genotypes) of imputing parent genotypes in simulated biparental populations from different scenarios. We tested the imputation accuracy of the missing parent's descendants using the true genotype of the parent and compared this with using the imputed genotypes of the parent. Across all scenarios, the imputation accuracy of a parent was >0.98 and did not drop below ∼0.96. The imputation accuracy of a parent was always higher when it was inbred than outbred. Including ancestors of the parent at Hd, increasing the number of crosses and the number of Hd descendants increased the imputation accuracy. The high imputation accuracy achieved for the parent translated to little or no impact on the imputation accuracy of its descendants.

5.
BMC Genom Data ; 22(1): 4, 2021 02 03.
Article in English | MEDLINE | ID: mdl-33568071

ABSTRACT

BACKGROUND: Multi-parent populations (MPPs) are important resources for studying plant genetic architecture and detecting quantitative trait loci (QTLs). In MPPs, the QTL effects can show various levels of allelic diversity, which can be an important factor influencing the detection of QTLs. In MPPs, the allelic effects can be more or less specific. They can depend on an ancestor, a parent or the combination of parents in a cross. In this paper, we evaluated the effect of QTL allelic diversity on the QTL detection power in MPPs. RESULTS: We simulated: a) cross-specific QTLs; b) parental and ancestral QTLs; and c) bi-allelic QTLs. Inspired by a real application in sugar beet, we tested different MPP designs (diallel, chessboard, factorial, and NAM) derived from five or nine parents to explore the ability to sample genetic diversity and detect QTLs. Using a fixed total population size, the QTL detection power was larger in MPPs with fewer but larger crosses derived from a reduced number of parents. The use of a larger set of parents was useful to detect rare alleles with a large phenotypic effect. The benefit of using a larger set of parents was, however, conditional on an increase in the total population size. We also determined empirical confidence intervals for QTL location to compare the resolution of different designs. For QTLs representing 6% of the phenotypic variation, using 1600 F2 offspring individuals, we found average 95% confidence intervals over different designs of 49 and 25 cM for cross-specific and bi-allelic QTLs, respectively. CONCLUSIONS: MPPs derived from fewer parents with few but large crosses generally increased the QTL detection power. Using a larger set of parents to cover a wider genetic diversity can be useful to detect QTLs with a reduced minor allele frequency when the QTL effect is large and when the total population size is increased.


Subject(s)
Alleles , Beta vulgaris/genetics , Quantitative Trait Loci/genetics
6.
G3 (Bethesda) ; 10(8): 2725-2739, 2020 08 05.
Article in English | MEDLINE | ID: mdl-32527748

ABSTRACT

"Sparse testing" refers to reduced multi-environment breeding trials in which not all genotypes of interest are grown in each environment. Using genomic-enabled prediction and a model embracing genotype × environment interaction (GE), the non-observed genotype-in-environment combinations can be predicted. Consequently, the overall costs can be reduced and the testing capacities can be increased. The accuracy of predicting the unobserved data depends on different factors including (1) how many genotypes overlap between environments, (2) in how many environments each genotype is grown, and (3) which prediction method is used. In this research, we studied the predictive ability obtained when using a fixed number of plots and different sparse testing designs. The considered designs included the extreme cases of (1) no overlap of genotypes between environments, and (2) complete overlap of the genotypes between environments. In the latter case, the prediction set fully consists of genotypes that have not been tested at all. Moreover, we gradually go from one extreme to the other considering (3) intermediates between the two previous cases with varying numbers of different or non-overlapping (NO)/overlapping (O) genotypes. The empirical study is built upon two different maize hybrid data sets consisting of different genotypes crossed to two different testers (T1 and T2) and each data set was analyzed separately. For each set, phenotypic records on yield from three different environments are available. Three different prediction models were implemented, two main effects models (M1 and M2), and a model (M3) including GE. The results showed that the genome-based model including GE (M3) captured more phenotypic variation than the models that did not include this component. Also, M3 provided higher prediction accuracy than models M1 and M2 for the different allocation scenarios. 
Reducing the size of the calibration sets decreased the prediction accuracy under all allocation designs with M3 being the less affected model; however, using the genome-enabled models (i.e., M2 and M3) the predictive ability is recovered when more genotypes are tested across environments. Our results indicate that a substantial part of the testing resources can be saved when using genome-based models including GE for optimizing sparse testing designs.
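
The allocation designs described in this abstract, ranging from no overlap to complete overlap with a fixed number of plots, can be sketched as below. This is an illustrative construction only; the function name, the integer genotype ids and the rule for filling the non-overlapping plots are assumptions, not the layout used in the study.

```python
def sparse_allocation(n_env, plots_per_env, n_overlap):
    """Return one list of genotype ids per environment.

    The first `n_overlap` genotypes are grown in every environment (O);
    the remaining plots of each environment receive genotypes that are
    grown nowhere else (NO). The number of plots stays fixed.
    """
    if not 0 <= n_overlap <= plots_per_env:
        raise ValueError("n_overlap must lie between 0 and plots_per_env")
    common = list(range(n_overlap))
    next_id = n_overlap
    design = []
    for _ in range(n_env):
        unique = list(range(next_id, next_id + plots_per_env - n_overlap))
        next_id += len(unique)
        design.append(common + unique)
    return design


if __name__ == "__main__":
    for overlap in (0, 50, 100):
        design = sparse_allocation(n_env=3, plots_per_env=100, n_overlap=overlap)
        tested = set().union(*design)
        print(overlap, "overlapping ->", len(tested), "genotypes tested on 300 plots")
```

With the plot budget fixed, moving from complete overlap to no overlap increases the number of genotypes tested but reduces the number of environments in which each one is observed, which is the trade-off the prediction models have to bridge.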


Subject(s)
Gene-Environment Interaction , Plant Breeding , Genomics , Genotype , Models, Genetic , Phenotype
7.
G3 (Bethesda) ; 9(4): 1117-1129, 2019 04 09.
Article in English | MEDLINE | ID: mdl-30760541

ABSTRACT

Mixed models can be considered as a type of penalized regression and are everyday tools in statistical genetics. The standard mixed model for whole genome regression (WGR) is ridge regression best linear unbiased prediction (RRBLUP) which is based on an additive marker effect model. Many publications have extended the additive WGR approach by incorporating interactions between loci or between genes and environment. In this context of penalized regressions with interactions, it has been reported that translating the coding of single nucleotide polymorphisms -for instance from -1,0,1 to 0,1,2- has an impact on the prediction of genetic values and interaction effects. In this work, we identify the reason for the relevance of variable coding in the general context of penalized polynomial regression. We show that in many cases, predictions of the genetic values are not invariant to translations of the variable coding, with an exception when only the sizes of the coefficients of monomials of highest total degree are penalized. The invariance of RRBLUP can be considered as a special case of this setting, with a polynomial of total degree 1, penalizing additive effects (total degree 1) but not the fixed effect (total degree 0). The extended RRBLUP (eRRBLUP), which includes interactions, is not invariant to translations because it does not only penalize interactions (total degree 2), but also additive effects (total degree 1). This observation implies that translation-invariance can be maintained in a pair-wise epistatic WGR if only interaction effects are penalized, but not the additive effects. In this regard, approaches of pre-selecting loci may not only reduce computation time, but can also help to avoid the variable coding issue. To illustrate the practical relevance, we compare different regressions on a publicly available wheat data set. 
We show that for an eRRBLUP, the relevance of the marker coding for interaction effect estimates increases with the number of variables included in the model. A biological interpretation of estimated interaction effects may therefore become more difficult. Consequently, comparing reproducing kernel Hilbert space (RKHS) approaches to WGR approaches modeling effects explicitly, the supposed advantage of an increased interpretability of the latter may not be real. Our theoretical results are generally valid for penalized regressions, for instance also for the least absolute shrinkage and selection operator (LASSO). Moreover, they apply to any type of interaction modeled by products of predictor variables in a penalized regression approach or by Hadamard products of covariance matrices in a mixed model.
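
The coding issue described in this abstract can be checked numerically on a toy example. The sketch below fits a ridge regression with an unpenalized intercept, once with additive effects only and once with an additional penalized product term, under the codings -1,0,1 and 0,1,2. The data, the penalty value and all function names are made up for illustration; the paper's own analysis uses a wheat data set and a mixed-model formulation.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit_predict(features, y, lam):
    """Fitted values of a ridge regression; the intercept is not penalized."""
    design = [[1.0] + list(row) for row in features]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) + (lam if i == j and i > 0 else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return [sum(b * v for b, v in zip(beta, r)) for r in design]


def with_interaction(markers):
    """Append the product of the two marker codes as a third predictor."""
    return [[a, b, a * b] for a, b in markers]


def max_gap(u, v):
    return max(abs(x - y) for x, y in zip(u, v))


# Two markers coded -1,0,1 and a made-up phenotype.
markers = [(-1, -1), (-1, 0), (0, 1), (1, 1), (1, -1), (0, 0), (1, 0), (-1, 1)]
y = [1.0, 0.5, 2.0, 4.0, 0.0, 1.5, 2.5, 1.0]
shifted = [(a + 1, b + 1) for a, b in markers]  # same markers coded 0,1,2

additive_gap = max_gap(ridge_fit_predict(markers, y, 1.0),
                       ridge_fit_predict(shifted, y, 1.0))
epistatic_gap = max_gap(ridge_fit_predict(with_interaction(markers), y, 1.0),
                        ridge_fit_predict(with_interaction(shifted), y, 1.0))

if __name__ == "__main__":
    print("additive model, change in fitted values:", additive_gap)
    print("model with interaction, change in fitted values:", epistatic_gap)
```

In the additive model the shift is absorbed by the unpenalized intercept, so the fitted values agree up to rounding. With the product term, the shift mixes the interaction coefficient into the penalized additive coefficients, so the fitted values change.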


Subject(s)
Genomics/methods , Regression Analysis , Polymorphism, Single Nucleotide , Triticum/genetics , Triticum/growth & development
8.
Theor Appl Genet ; 131(11): 2345-2357, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30078163

ABSTRACT

KEY MESSAGE: New fast and accurate method for phasing and imputation of SNP chip genotypes within diploid bi-parental plant populations. This paper presents a new heuristic method for phasing and imputation of genomic data in diploid plant species. Our method, called AlphaPlantImpute, explicitly leverages features of plant breeding programmes to maximise the accuracy of imputation. The features are a small number of parents, which can be inbred and usually have high-density genomic data, and few recombinations separating parents and focal individuals genotyped at low density (i.e. descendants that are the imputation targets). AlphaPlantImpute works roughly in three steps. First, it identifies informative low-density genotype markers in parents. Second, it tracks the inheritance of parental alleles and haplotypes to focal individuals at informative markers. Finally, it uses this low-density information as anchor points to impute focal individuals to high density. We tested the imputation accuracy of AlphaPlantImpute in simulated bi-parental populations across different scenarios. We also compared its accuracy to existing software called PlantImpute. In general, the imputation accuracy of AlphaPlantImpute was better than or equal to that of PlantImpute. The computational time and memory requirements of AlphaPlantImpute were tiny compared to those of PlantImpute. For example, accuracy of imputation was 0.96 for a scenario where both parents were inbred and genotyped at 25,000 markers per chromosome and a focal F2 individual was genotyped with 50 markers per chromosome. This scenario required at most 0.08 GB of memory and took 37 s to complete.
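
The first two steps listed in this abstract can be illustrated with a deliberately simplified sketch. It is not the AlphaPlantImpute algorithm: it assumes genotypes coded as allele counts 0/1/2, two inbred parents, and it treats a marker as informative only when the parents are homozygous for different alleles. All names are hypothetical.

```python
def informative_markers(parent1, parent2):
    """Indices where the parents are homozygous (0 or 2) for different alleles."""
    return [i for i, (a, b) in enumerate(zip(parent1, parent2)) if {a, b} == {0, 2}]


def parental_origin(focal, parent1, parent2):
    """Label each informative marker of a focal individual by parental origin.

    Returns a dict {marker index: "P1" | "P2" | "both"}; missing focal
    genotypes (None) are skipped.
    """
    origin = {}
    for i in informative_markers(parent1, parent2):
        g = focal[i]
        if g is None:
            continue
        if g == 1:
            origin[i] = "both"  # heterozygous: one allele from each parent
        elif g == parent1[i]:
            origin[i] = "P1"
        else:
            origin[i] = "P2"
    return origin


if __name__ == "__main__":
    p1 = [0, 2, 2, 0, 1]
    p2 = [2, 2, 0, 0, 0]
    f2_individual = [0, 2, 1, 0, None]
    print(informative_markers(p1, p2))
    print(parental_origin(f2_individual, p1, p2))
```

Markers labelled this way would act as the anchor points between which parental haplotypes are filled in; the actual method additionally handles phasing, recombination and genotyping errors.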


Subject(s)
Computer Heuristics , Plants/genetics , Polymorphism, Single Nucleotide , Software , Alleles , Computer Simulation , Genetic Markers , Genomics , Genotype , Haplotypes , Plant Breeding
9.
Genet Sel Evol ; 50(1): 16, 2018 04 13.
Article in English | MEDLINE | ID: mdl-29653506

ABSTRACT

BACKGROUND: The single-step covariance matrix H combines the pedigree-based relationship matrix [Formula: see text] with the more accurate information on realized relatedness of genotyped individuals represented by the genomic relationship matrix [Formula: see text]. In particular, to improve convergence behavior of iterative approaches and to reduce inflation, two weights [Formula: see text] and [Formula: see text] have been introduced in the definition of [Formula: see text], which blend the inverse of a part of [Formula: see text] with the inverse of [Formula: see text]. Since the definition of this blending is based on the equation describing [Formula: see text], its impact on the structure of [Formula: see text] is not obvious. In a joint discussion, we considered the question of the shape of [Formula: see text] for non-trivial [Formula: see text] and [Formula: see text]. RESULTS: Here, we present the general matrix [Formula: see text] as a function of these parameters and discuss its structure and properties. Moreover, we screen for optimal values of [Formula: see text] and [Formula: see text] with respect to predictive ability, inflation and iterations up to convergence on a well investigated, publicly available wheat data set. CONCLUSION: Our results may help the reader to develop a better understanding for the effects of changes of [Formula: see text] and [Formula: see text] on the covariance model. In particular, we give theoretical arguments that as a general tendency, inflation will be reduced by increasing [Formula: see text] or by decreasing [Formula: see text].
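
The formulas are elided in this listing. For orientation only, the weighted blending is commonly written in the single-step literature as shown below. The symbols are assumptions here and are not taken from the record itself: $\mathbf{A}$ is the pedigree-based relationship matrix, $\mathbf{A}_{22}$ its block for genotyped individuals, $\mathbf{G}$ the genomic relationship matrix, and $\tau$, $\omega$ the two weights.

```latex
\mathbf{H}^{-1} \;=\; \mathbf{A}^{-1} +
\begin{pmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \tau\,\mathbf{G}^{-1} - \omega\,\mathbf{A}_{22}^{-1}
\end{pmatrix}
```

Under this notation, $\tau = \omega = 1$ corresponds to the unweighted single-step matrix. Because the weights enter through the inverse, their effect on $\mathbf{H}$ itself is not immediately visible, which is the question the abstract raises.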


Subject(s)
Genomics/methods , Triticum/genetics , Algorithms , Genome, Plant , Genotype , Triticum/classification
10.
Theor Appl Genet ; 130(8): 1753-1764, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28547012

ABSTRACT

KEY MESSAGE: In the QTL analysis of multi-parent populations, the inclusion of QTLs with various types of effects can lead to a better description of the phenotypic variation and increased power. For the type of QTL effect in QTL models for multi-parent populations (MPPs), various options exist to define them with respect to their origin. They can be modelled as referring to close parental lines or to further away ancestral founder lines. QTL models for MPPs can also be characterized by the homo- or heterogeneity of variance for polygenic effects. The most suitable model for the origin of the QTL effect and the homo- or heterogeneity of polygenic effects may be a function of the genetic distance distribution between the parents of MPPs. We investigated the statistical properties of various QTL detection models for MPPs taking into account the genetic distances between the parents of the MPP. We evaluated models with different assumptions about the QTL effect and the form of the residual term using cross validation. For the EU-NAM data, we showed that it can be useful to mix in the same model QTLs with different types of effects (parental, ancestral, or bi-allelic). The benefit of using cross-specific residual terms to handle the heterogeneity of variance was less obvious for this particular data set.


Subject(s)
Models, Genetic , Quantitative Trait Loci , Zea mays/genetics , Alleles , Crosses, Genetic , Genotype , Models, Statistical , Phenotype
11.
BMC Bioinformatics ; 18(1): 3, 2017 Jan 03.
Article in English | MEDLINE | ID: mdl-28049412

ABSTRACT

BACKGROUND: Epistasis marker effect models incorporating products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact has been recognized in literature, the nature of the problem has not been thoroughly investigated so far. RESULTS: We illustrate how the choice of marker coding implicitly specifies the model of how effects of certain allele combinations at different loci contribute to the phenotype, and investigate coding-dependent properties of EGBLUP. Moreover, we discuss an alternative categorical epistasis model (CE) eliminating undesired properties of EGBLUP and show that the CE model can improve predictive ability. Finally, we demonstrate that the coding-dependent performance of EGBLUP offers the possibility to incorporate prior experimental information into the prediction method by adapting the coding to already available phenotypic records on other traits. CONCLUSION: Based on our results, for EGBLUP, a symmetric coding {-1,1} or {-1,0,1} should be preferred, whereas a standardization using allele frequencies should be avoided. Moreover, CE can be a valuable alternative since it does not possess the undesired theoretical properties of EGBLUP. However, which model performs best will depend on characteristics of the data and available prior information. Data from previous experiments can for instance be incorporated into the marker coding of EGBLUP.


Subject(s)
Epistasis, Genetic , Models, Genetic , Alleles , Animals , Mice , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Triticum/genetics
12.
Theor Appl Genet ; 129(5): 963-76, 2016 May.
Article in English | MEDLINE | ID: mdl-26883048

ABSTRACT

KEY MESSAGE: Models based on additive marker effects and on epistatic interactions can be translated into genomic relationship models. This equivalence allows to perform predictions based on complex gene interaction models and reduces computational effort significantly. In the theory of genome-assisted prediction, the equivalence of a linear model based on independent and identically normally distributed marker effects and a model based on multivariate Gaussian distributed breeding values with genomic relationship as covariance matrix is well known. In this work, we demonstrate equivalences of marker effect models incorporating epistatic interactions and corresponding mixed models based on relationship matrices and show how to exploit these equivalences computationally for genome-assisted prediction. In particular, we show how models with epistatic interactions of higher order (e.g., three-factor interactions) translate into linear models with certain covariance matrices and demonstrate how to construct epistatic relationship matrices for the linear mixed model, if we restrict the model to interactions defined a priori. We illustrate the practical relevance of our results with a publicly available data set on grain yield of wheat lines growing in four different environments. For this purpose, we select important interactions in one environment and use this knowledge on the network of interactions to increase predictive ability of grain yield under other environmental conditions. Our results provide a guide for building relationship matrices based on knowledge on the structure of trait-related gene networks.
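
The computational shortcut behind such equivalences can be illustrated for pairwise interactions. If every product of two distinct markers receives an independent effect with the same variance, the covariance of the resulting genetic values is proportional to E E', where E holds all pairwise products. The sketch below checks that E E' can be obtained from the marker matrix M without building E, using the identity E E' = ((M M') ∘ (M M') − (M ∘ M)(M ∘ M)') / 2, where ∘ is the element-wise product. The toy matrix and function names are assumptions, and scaling constants used in practice are omitted.

```python
from itertools import combinations


def gram(rows):
    """Return the matrix of inner products between all pairs of rows."""
    return [[sum(x * y for x, y in zip(r, s)) for s in rows] for r in rows]


def pairwise_products(rows):
    """Explicit interaction design: products of all marker pairs j < k."""
    return [[r[j] * r[k] for j, k in combinations(range(len(r)), 2)]
            for r in rows]


def epistatic_kernel(rows):
    """Same matrix as gram(pairwise_products(rows)), computed from M M' only."""
    g = gram(rows)
    g_sq = gram([[x * x for x in r] for r in rows])
    n = len(rows)
    # g**2 - g_sq is twice a sum over pairs j < k, so the division is exact.
    return [[(g[i][l] ** 2 - g_sq[i][l]) // 2 for l in range(n)]
            for i in range(n)]


# Toy marker matrix: 4 lines, 4 markers, coded -1, 0, 1.
M = [[-1, 0, 1, 1],
     [1, 1, 0, -1],
     [0, -1, -1, 1],
     [1, 1, 1, 0]]

explicit = gram(pairwise_products(M))
shortcut = epistatic_kernel(M)

if __name__ == "__main__":
    print(explicit == shortcut)
```

The explicit design has p(p-1)/2 columns for p markers, whereas the shortcut only needs two n-by-n products, which is where the reduction in computational effort comes from.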


Subject(s)
Epistasis, Genetic , Genome, Plant , Models, Genetic , Triticum/genetics , Environment , Linear Models , Plant Breeding , Selection, Genetic
13.
Theor Appl Genet ; 128(5): 875-91, 2015 May.
Article in English | MEDLINE | ID: mdl-25758357

ABSTRACT

KEY MESSAGE: The efficiency of marker-assisted selection for native resistance to European corn borer stalk damage can be increased when progressing from a QTL-based towards a genome-wide approach. Marker-assisted selection (MAS) has been shown to be effective in improving resistance to the European corn borer (ECB) in maize. In this study, we investigated the performance of whole-genome-based selection, relative to selection based on individual quantitative trait loci (QTL), for resistance to ECB stalk damage in European elite maize. Three connected biparental populations, comprising 590 doubled haploid (DH) lines, were genotyped with high-density single nucleotide polymorphism markers and phenotyped under artificial and natural infestation in 2011. A subset of 195 DH lines was evaluated in the following year as lines per se and as testcrosses. Resistance was evaluated based on stalk damage ratings, the number of feeding tunnels in the stalk and tunnel length. We performed individual- and joint-population QTL analyses and compared the cross-validated predictive abilities of the QTL models with genomic best linear unbiased prediction (GBLUP). For all traits, the GBLUP model consistently outperformed the QTL model despite the detection of QTL with sizeable effects. For stalk damage rating, GBLUP's predictive ability exceeded at times 0.70. Model training based on DH line per se performance was efficient in predicting stalk breakage in testcrosses. We conclude that the efficiency of MAS for ECB stalk damage resistance can be increased considerably when progressing from a QTL-based towards a genome-wide approach. With the availability of native ECB resistance in elite European maize germplasm, our results open up avenues for the implementation of an integrated genome-based selection approach for the simultaneous improvement of yield, maturity and ECB resistance.


Subject(s)
Chromosome Mapping , Quantitative Trait Loci , Zea mays/genetics , Alleles , Animals , Breeding , Crosses, Genetic , Genetic Linkage , Genetic Markers , Genotype , Herbivory , Models, Genetic , Moths , Phenotype , Polymorphism, Single Nucleotide
14.
Theor Appl Genet ; 127(6): 1375-86, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24723140

ABSTRACT

KEY MESSAGE: The calibration data for genomic prediction should represent the full genetic spectrum of a breeding program. Data heterogeneity is minimized by connecting data sources through highly related test units. One of the major challenges of genome-enabled prediction in plant breeding lies in the optimum design of the population employed in model training. With highly interconnected breeding cycles staggered in time the choice of data for model training is not straightforward. We used cross-validation and independent validation to assess the performance of genome-based prediction within and across genetic groups, testers, locations, and years. The study comprised data for 1,073 and 857 doubled haploid lines evaluated as testcrosses in 2 years. Testcrosses were phenotyped for grain dry matter yield and content and genotyped with 56,110 single nucleotide polymorphism markers. Predictive abilities strongly depended on the relatedness of the doubled haploid lines from the estimation set with those on which prediction accuracy was assessed. For scenarios with strong population heterogeneity it was advantageous to perform predictions within a priori defined genetic groups until higher connectivity through related test units was achieved. Differences between group means had a strong effect on predictive abilities obtained with both cross-validation and independent validation. Predictive abilities across subsequent cycles of selection and years were only slightly reduced compared to predictive abilities obtained with cross-validation within the same year. We conclude that the optimum data set for model training in genome-enabled prediction should represent the full genetic and environmental spectrum of the respective breeding program. Data heterogeneity can be reduced by experimental designs that maximize the connectivity between data sources by common or highly related test units.


Subject(s)
Genome, Plant , Hybridization, Genetic , Zea mays/genetics , Breeding , Genotype , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Zea mays/physiology
15.
Genetics ; 195(2): 573-87, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23934883

ABSTRACT

In genome-based prediction there is considerable uncertainty about the statistical model and method required to maximize prediction accuracy. For traits influenced by a small number of quantitative trait loci (QTL), predictions are expected to benefit from methods performing variable selection [e.g., BayesB or the least absolute shrinkage and selection operator (LASSO)] compared to methods distributing effects across the genome [ridge regression best linear unbiased prediction (RR-BLUP)]. We investigate the assumptions underlying successful variable selection by combining computer simulations with large-scale experimental data sets from rice (Oryza sativa L.), wheat (Triticum aestivum L.), and Arabidopsis thaliana (L.). We demonstrate that variable selection can be successful when the number of phenotyped individuals is much larger than the number of causal mutations contributing to the trait. We show that the sample size required for efficient variable selection increases dramatically with decreasing trait heritabilities and increasing extent of linkage disequilibrium (LD). We contrast and discuss contradictory results from simulation and experimental studies with respect to superiority of variable selection methods over RR-BLUP. Our results demonstrate that due to long-range LD, medium heritabilities, and small sample sizes, superiority of variable selection methods cannot be expected in plant breeding populations even for traits like FRIGIDA gene expression in Arabidopsis and flowering time in rice, assumed to be influenced by a few major QTL. We extend our conclusions to the analysis of whole-genome sequence data and infer upper bounds for the number of causal mutations which can be identified by LASSO. Our results have major impact on the choice of statistical method needed to make credible inferences about genetic architecture and prediction accuracy of complex traits.


Subject(s)
Breeding , Genome, Plant , Models, Genetic , Quantitative Trait Loci/genetics , Arabidopsis/genetics , Computer Simulation , Linear Models , Linkage Disequilibrium , Oryza/genetics , Phenotype , Selection, Genetic , Triticum/genetics
16.
Stat Appl Genet Mol Biol ; 12(3): 375-91, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23629460

ABSTRACT

Different statistical models have been proposed for maximizing prediction accuracy in genome-based prediction of breeding values in plant and animal breeding. However, little is known about the sensitivity of these models with respect to prior and hyperparameter specification, because comparisons of prediction performance are mainly based on a single set of hyperparameters. In this study, we focused on Bayesian prediction methods using a standard linear regression model with marker covariates coding additive effects at a large number of marker loci. By comparing different hyperparameter settings, we investigated the sensitivity of four methods frequently used in genome-based prediction (Bayesian Ridge, Bayesian Lasso, BayesA and BayesB) to specification of the prior distribution of marker effects. We used datasets simulated according to a typical maize breeding program differing in the number of markers and the number of simulated quantitative trait loci affecting the trait. Furthermore, we used an experimental maize dataset, comprising 698 doubled haploid lines, each genotyped with 56110 single nucleotide polymorphism markers and phenotyped as testcrosses for the two quantitative traits grain dry matter yield and grain dry matter content. The predictive ability of the different models was assessed by five-fold cross-validation. The extent of Bayesian learning was quantified by calculation of the Hellinger distance between the prior and posterior densities of marker effects. Our results indicate that similar predictive abilities can be achieved with all methods, but with BayesA and BayesB hyperparameter settings had a stronger effect on prediction performance than with the other two methods. Prediction performance of BayesA and BayesB suffered substantially from a non-optimal choice of hyperparameters.
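
The Hellinger distance mentioned in this abstract can be computed numerically for any pair of densities. The sketch below uses the common definition H = sqrt(1 − ∫ sqrt(p(x) q(x)) dx) and two normal densities as stand-ins for a prior and a posterior of a marker effect. The densities, the integration range and the function names are illustrative assumptions and do not reproduce the priors used in the study.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def hellinger(p, q, lo, hi, n=20000):
    """Hellinger distance between densities p and q, by the midpoint rule.

    Returns a value in [0, 1]: 0 for identical densities, values near 1
    when the densities barely overlap.
    """
    step = (hi - lo) / n
    overlap = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        overlap += math.sqrt(p(x) * q(x))
    overlap *= step
    return math.sqrt(max(0.0, 1.0 - overlap))


def prior(x):
    return normal_pdf(x, 0.0, 1.0)


def posterior(x):
    return normal_pdf(x, 0.3, 0.5)


if __name__ == "__main__":
    print(hellinger(prior, posterior, -10.0, 10.0))
```

A distance close to zero would indicate that the data barely moved the posterior away from the prior, that is, little Bayesian learning for that effect.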


Subject(s)
Genome, Plant , Models, Genetic , Algorithms , Bayes Theorem , Breeding , Computer Simulation , Genetic Association Studies , Genetic Markers , Linear Models , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Sensitivity and Specificity , Zea mays/genetics
17.
Bioinformatics ; 28(15): 2086-7, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22689388

ABSTRACT

SUMMARY: We present a novel R package named synbreed to derive genome-based predictions from high-throughput genotyping and large-scale phenotyping data. The package contains a comprehensive collection of functions required to fit and cross-validate genomic prediction models. All functions are embedded within the framework of a single, unified data object. Thereby a versatile genomic prediction analysis pipeline covering data processing, visualization and analysis is established within one software package. The implementation is flexible with respect to a wide range of data formats and models. The package fills an existing gap in the availability of user-friendly software for next-generation genetics research and education. AVAILABILITY: synbreed is open-source and available through CRAN http://cran.r-project.org/web/packages/synbreed. The latest development version is available from R-Forge. The package synbreed is released with a vignette, a manual and three large-scale example datasets (from package synbreedData). CONTACT: chris.schoen@wzw.tum.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Genomics/methods , Software , Algorithms , Computer Simulation , Electronic Data Processing , Genotyping Techniques , Linear Models
18.
Theor Appl Genet ; 123(2): 339-50, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21505832

ABSTRACT

This is the first large-scale experimental study on genome-based prediction of testcross values in an advanced cycle breeding population of maize. The study comprised testcross progenies of 1,380 doubled haploid lines of maize derived from 36 crosses and phenotyped for grain yield and grain dry matter content in seven locations. The lines were genotyped with 1,152 single nucleotide polymorphism markers. Pedigree data were available for three generations. We used best linear unbiased prediction and stratified cross-validation to evaluate the performance of prediction models differing in the modeling of relatedness between inbred lines and in the calculation of genome-based coefficients of similarity. The choice of similarity coefficient did not affect prediction accuracies. Models including genomic information yielded significantly higher prediction accuracies than the model based on pedigree information alone. Average prediction accuracies based on genomic data were high even for a complex trait like grain yield (0.72-0.74) when the cross-validation scheme allowed for a high degree of relatedness between the estimation and the test set. When predictions were performed across distantly related families, prediction accuracies decreased significantly (0.47-0.48). Prediction accuracies decreased with decreasing sample size but were still high when the population size was halved (0.67-0.69). The results from this study are encouraging with respect to genome-based prediction of the genetic value of untested lines in advanced cycle breeding populations and the implementation of genomic selection in the breeding process.


Subject(s)
Genome, Plant , Models, Genetic , Zea mays/genetics , Breeding , Crosses, Genetic , Genetic Variation , Genotype , Phenotype , Polymorphism, Genetic