Results 1 - 20 of 56
1.
Theor Appl Genet ; 137(1): 16, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38189816

ABSTRACT

KEY MESSAGE: Simulation planned pre-breeding can increase the efficiency of starting a hybrid breeding program. Starting a hybrid breeding program commonly comprises a grouping of the initial germplasm in two pools and subsequent selection on general combining ability. Investigations on pre-breeding steps before starting the selection on general combining ability are not available. Our goals were (1) to use computer simulations on the basis of DNA markers and testcross data to plan crosses that separate genetically two initial germplasm pools of rapeseed, (2) to carry out the planned crosses, and (3) to verify experimentally the pool separation as well as the increase in testcross performance. We designed a crossing program consisting of four cycles of recombination. In each cycle, the experimentally generated material was used to plan the subsequent crossing cycle with computer simulations. After finishing the crossing program, the initially overlapping pools were clearly separated in principal coordinate plots. Doubled haploid lines derived from the material of crossing cycles 1 and 2 showed an increase in relative testcross performance for yield of about 5% per cycle. We conclude that simulation-designed pre-breeding crossing schemes, that were carried out before the general combining ability-based selection of a newly started hybrid breeding program, can save time and resources, and in addition conserve more of the initial genetic variation than a direct start of a hybrid breeding program with general combining ability-based selection.


Subject(s)
Brassica napus , Brassica rapa , Brassica napus/genetics , Plant Breeding , Brassica rapa/genetics , Computer Simulation , Haploidy
2.
Theor Appl Genet ; 136(9): 203, 2023 Aug 31.
Article in English | MEDLINE | ID: mdl-37653062

ABSTRACT

KEY MESSAGE: Genomic prediction of GCA effects based on model training with full-sib rather than half-sib families yields higher short- and long-term selection gain in reciprocal recurrent genomic selection for hybrid breeding, if SCA effects are important. Reciprocal recurrent genomic selection (RRGS) is a powerful tool for ensuring sustainable selection progress in hybrid breeding. For training the statistical model, one can use half-sib (HS) or full-sib (FS) families produced by inter-population crosses of candidates from the two parent populations. Our objective was to compare HS-RRGS and FS-RRGS for the cumulative selection gain ([Formula: see text]), the genetic, GCA and SCA variances ([Formula: see text],[Formula: see text], [Formula: see text]) of the hybrid population, and prediction accuracy ([Formula: see text]) for GCA effects across cycles. Using SNP data from maize and wheat, we simulated RRGS programs over 10 cycles, each consisting of four sub-cycles with genomic selection of [Formula: see text] out of 950 candidates in each parent population. Scenarios differed for heritability [Formula: see text] and the proportion [Formula: see text] of traits, training set (TS) size ([Formula: see text]), and maize vs. wheat. Curves of [Formula: see text] over selection cycles showed no crossing of both methods. If [Formula: see text] was high, [Formula: see text] was generally higher for FS-RRGS than HS-RRGS due to higher [Formula: see text]. In contrast, HS-RRGS was superior or on par with FS-RRGS, if [Formula: see text] or [Formula: see text] and [Formula: see text] were low. [Formula: see text] showed a steeper increase and higher selection limit for scenarios with low [Formula: see text], high [Formula: see text] and large [Formula: see text]. [Formula: see text] and even more so [Formula: see text] decreased rapidly over cycles for both methods due to the high selection intensity and the role of the Bulmer effect for reducing [Formula: see text]. Since the TS for FS-RRGS can additionally be used for hybrid prediction, we recommend this method for achieving simultaneously the two major goals in hybrid breeding: population improvement and cultivar development.


Subject(s)
Genomics , Plant Breeding , Humans , Models, Statistical , Phenotype , Triticum , Zea mays/genetics
3.
Theor Appl Genet ; 134(5): 1493-1511, 2021 May.
Article in English | MEDLINE | ID: mdl-33587151

ABSTRACT

KEY MESSAGE: Simulations highlight the potential of genomic selection to substantially increase genetic gain for complex traits in sugarcane. The success rate depends on the trait genetic architecture and the implementation strategy. Genomic selection (GS) has the potential to increase the rate of genetic gain in sugarcane beyond the levels achieved by conventional phenotypic selection (PS). To assess different implementation strategies, we simulated two different GS-based breeding strategies and compared genetic gain and genetic variance over five breeding cycles to standard PS. GS scheme 1 followed routines similar to conventional PS but included three rapid recurrent genomic selection (RRGS) steps. GS scheme 2 also included three RRGS steps but did not include a progeny assessment stage and therefore differed more fundamentally from PS. Under an additive trait model, both simulated GS schemes achieved annual genetic gains of 2.6-2.7%, which were 1.9 times higher than those of standard phenotypic selection (1.4%). For a complex non-additive trait model, the expected annual rates of genetic gain were lower for all breeding schemes; however, the rates for the GS schemes (1.5-1.6%) were still greater than for PS (1.1%). Investigating cost-benefit ratios with regard to numbers of genotyped clones showed that substantial benefits could be achieved when only 1500 clones were genotyped per 10-year breeding cycle for the additive genetic model. Our results show that under a complex non-additive genetic model, the success rate of GS depends on the implementation strategy, the number of genotyped clones and the stage of the breeding program, likely reflecting how changes in QTL allele frequencies change additive genetic variance and therefore the efficiency of selection. These results are encouraging and motivate further work to facilitate the adoption of GS in sugarcane breeding.


Subject(s)
Genome, Plant , Genomics/methods , Plant Breeding/methods , Quantitative Trait Loci , Saccharum/genetics , Selection, Genetic , Chromosome Mapping/methods , Chromosomes, Plant/genetics , Genetics, Population , Models, Genetic , Phenotype , Saccharum/growth & development , Saccharum/metabolism
4.
J Exp Bot ; 70(6): 1969-1986, 2019 03 27.
Article in English | MEDLINE | ID: mdl-30753580

ABSTRACT

Oilseed rape is one of the most important dicotyledonous field crops in the world, where it plays a key role in productive cereal crop rotations. However, its production requires high nitrogen fertilization and its nitrogen footprint exceeds that of most other globally important crops. Hence, increased nitrogen use efficiency (NUE) in this crop is of high priority for sustainable agriculture. We report a comprehensive study of macrophysiological characteristics associated with breeding progress, conducted under contrasting nitrogen fertilization levels in a large panel of elite oilseed rape varieties representing breeding progress over the past 20 years. The results indicate that increased plant biomass at flowering, along with increases in primary yield components, have increased NUE in modern varieties. Nitrogen uptake efficiency has improved through breeding, particularly at high nitrogen. Despite low heritability, the number of seeds per silique is associated positively with increased yield in modern varieties. Seed weight remains unaffected by breeding progress; however, recent selection for high seed oil content and for high seed yields appears to have promoted a negative correlation (r= -0.39 at high and r= -0.49 at low nitrogen) between seed weight and seed oil concentration. Overall, our results reveal valuable breeding targets to improve NUE in oilseed rape.


Subject(s)
Brassica napus/metabolism , Life History Traits , Nitrogen/metabolism , Biomass , Brassica napus/genetics , Plant Breeding , Seeds
5.
BMC Genomics ; 19(1): 371, 2018 May 21.
Article in English | MEDLINE | ID: mdl-29783940

ABSTRACT

BACKGROUND: Small RNA (sRNA) sequences are known to have a broad impact on gene regulation by various mechanisms. Their performance for the prediction of hybrid traits has not yet been analyzed. Our objective was to analyze the relation of parental sRNA expression with the performance of their hybrids, to develop an sRNA-based prediction approach, and to compare it to more common SNP- and mRNA transcript-based predictions using a factorial mating scheme of a maize hybrid breeding program. RESULTS: Correlation of genomic differences and messenger RNA (mRNA) or sRNA expression differences between parental lines with the performance of their hybrids revealed that sRNAs showed an inverse relationship in contrast to the other two data types. We associated differences for SNPs, mRNA and sRNA expression between parental inbred lines with the performance of their hybrid combinations and developed two prediction approaches using distance measures based on associated markers. Cross-validations revealed parental differences in sRNA expression to be strong predictors for hybrid performance for grain yield in maize, comparable to genomic and mRNA data. The integration of both positively and negatively associated markers in the prediction approaches enhanced the prediction accuracy. The associated sRNAs belong predominantly to the canonical size classes of 22- and 24-nt that show specific genomic mapping characteristics. CONCLUSION: Expression profiles of sRNA are a promising alternative to SNPs or mRNA expression profiles for hybrid prediction, especially for plant species without reference genome or transcriptome information. The characteristics of the sRNAs we identified suggest that association studies based on breeding populations facilitate the identification of sRNAs involved in hybrid performance.


Subject(s)
Hybridization, Genetic , RNA, Small Untranslated/genetics , Zea mays/genetics , Breeding , Gene Expression Profiling , Genomics , Polymorphism, Single Nucleotide , RNA, Messenger/genetics , Zea mays/growth & development
6.
Theor Appl Genet ; 131(2): 299-317, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29080901

ABSTRACT

KEY MESSAGE: Genomic prediction using the Brassica 60 k genotyping array is efficient in oilseed rape hybrids. Prediction accuracy is more dependent on trait complexity than on the prediction model. In oilseed rape breeding programs, performance prediction of parental combinations is of fundamental importance. Due to the phenomenon of heterosis, per se performance is not a reliable indicator for F1-hybrid performance, and selection of well-paired parents requires the testing of large quantities of hybrid combinations in extensive field trials. However, the number of potential hybrids, in general, dramatically exceeds breeding capacity and budget. Integration of genomic selection (GS) could substantially increase the number of potential combinations that can be evaluated. GS models can be used to predict the performance of untested individuals based only on their genotypic profiles, using marker effects previously predicted in a training population. This allows for a preselection of promising genotypes, enabling a more efficient allocation of resources. In this study, we evaluated the usefulness of the Illumina Brassica 60 k SNP array for genomic prediction and compared three alternative approaches based on a homoscedastic ridge regression BLUP and three Bayesian prediction models that considered general and specific combining ability (GCA and SCA, respectively). A total of 448 hybrids were produced in a commercial breeding program from unbalanced crosses between 220 paternal doubled haploid lines and five male-sterile testers. Predictive ability was evaluated for seven agronomic traits. We demonstrate that the Brassica 60 k genotyping array is an adequate and highly valuable platform to implement genomic prediction of hybrid performance in oilseed rape. Furthermore, we present first insights into the application of established statistical models for prediction of important agronomical traits with contrasting patterns of polygenic control.


Subject(s)
Brassica napus/genetics , Hybrid Vigor , Models, Genetic , Plant Breeding , Crosses, Genetic , Genotype , Phenotype , Polymorphism, Single Nucleotide
7.
Plant Cell Environ ; 40(5): 717-725, 2017 May.
Article in English | MEDLINE | ID: mdl-28036107

ABSTRACT

Roots, the hidden half of crop plants, are essential for resource acquisition. However, knowledge about the genetic control of below-ground plant development in wheat, one of the most important small-grain crops in the world, is very limited. The molecular interactions connecting root and shoot development and growth, and thus modulating the plant's demand for water and nutrients along with its ability to access them, are largely unexplored. Here, we demonstrate that linkage drag in European bread wheat, driven by strong selection for a haplotype variant controlling heading date, has eliminated a specific combination of two flanking, highly conserved haplotype variants whose interaction confers increased root biomass. Reversing this inadvertent consequence of selection could recover root diversity that may prove essential for future food production in fluctuating environments. Highly conserved synteny to rice across this chromosome segment suggests that adaptive selection has shaped the diversity landscape of this locus across different, globally important cereal crops. By mining wheat gene expression data, we identified root-expressed genes within the region of interest that could help breeders to select positive variants adapted to specific target soil environments.


Subject(s)
Genetic Linkage , Plant Roots/genetics , Triticum/genetics , Biomass , Chromosomes, Plant/genetics , Ecosystem , Epistasis, Genetic , Genes, Plant , Genome-Wide Association Study , Haplotypes/genetics , Plant Development/genetics , Plant Roots/growth & development , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , Reproducibility of Results , Seedlings/genetics
9.
BMC Genomics ; 17: 262, 2016 Mar 29.
Article in English | MEDLINE | ID: mdl-27025377

ABSTRACT

BACKGROUND: Ridge regression models can be used for predicting heterosis and hybrid performance. Their application to mRNA transcription profiles has not yet been investigated. Our objective was to compare the prediction accuracy of models employing mRNA transcription profiles with that of models employing genome-wide markers using a data set of 98 maize hybrids from a breeding program. RESULTS: We predicted hybrid performance and mid-parent heterosis for grain yield and grain dry matter content and employed cross validation to assess the prediction accuracy. Prediction with a ridge regression model using random effects for mRNA transcription profiles resulted in prediction accuracies similar to those obtained when applying the model to DNA markers. For hybrids of which none of the parental inbred lines was part of the training set, the ridge regression model did not reach the prediction accuracy that was obtained with a model using transcriptome-based distances. CONCLUSION: We conclude that mRNA transcription profiles are a promising alternative to DNA markers for hybrid prediction, but further studies with larger data sets are required to investigate the superiority of alternative prediction models.


Subject(s)
Genetic Markers , Hybrid Vigor , Transcriptome , Zea mays/genetics , Amplified Fragment Length Polymorphism Analysis , Models, Genetic , Plant Breeding , RNA, Messenger/genetics , Regression Analysis
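
The ridge regression prediction named in the abstract of entry 9 can be illustrated with a minimal sketch. It is not the authors' model: the function names `ridge_fit` and `ridge_predict`, the penalty `lam`, and the simulated predictors are assumptions made for illustration, and the penalty is set by hand here rather than derived from estimated variance components.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Fit y ~ intercept + X @ effects with an L2 penalty `lam` on the effects.

    X holds one row per hybrid and one column per predictor
    (e.g. marker or transcript profile); the intercept is not penalized.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                      # center so the intercept stays unpenalized
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    effects = np.linalg.solve(A, Xc.T @ (y - y_mean))
    intercept = y_mean - x_mean @ effects
    return intercept, effects

def ridge_predict(X, intercept, effects):
    """Predict performance for new rows of predictors."""
    return intercept + np.asarray(X, dtype=float) @ effects

if __name__ == "__main__":
    # Hypothetical toy data: 40 hybrids, 200 predictors (more predictors than hybrids).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 200))
    y = X[:, :10].sum(axis=1) + rng.normal(scale=0.5, size=40)
    intercept, effects = ridge_fit(X[:30], y[:30], lam=50.0)
    predicted = ridge_predict(X[30:], intercept, effects)
    print(np.corrcoef(predicted, y[30:])[0, 1])  # correlation on the held-out hybrids
```

The penalty is what makes the system solvable when predictors outnumber hybrids; in a cross validation like the one described, the hold-out split would be repeated over many partitions.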
10.
BMC Genomics ; 15: 782, 2014 Sep 11.
Article in English | MEDLINE | ID: mdl-25213628

ABSTRACT

BACKGROUND: Introgression populations are used to make the genetic variation of unadapted germplasm or wild relatives of crops available for plant breeding. They consist of introgression lines that carry small chromosome segments from an exotic donor in the genetic background of an elite line. The goal of our study was to investigate the detection of favorable donor chromosome segments in introgression lines with statistical methods developed for genome-wide prediction. RESULTS: Computer simulations showed that genome-wide prediction employing heteroscedastic marker variances had a greater power and a lower false positive rate compared with homoscedastic marker variances when the phenotypic difference between the donor and recipient lines was controlled by few genes. The simulations helped to interpret the analyses of glucosinolate and linolenic acid content in a rapeseed introgression population and plant height in a rye introgression population. These analyses support the superiority of genome-wide prediction approaches that use heteroscedastic marker variances. CONCLUSIONS: We conclude that genome-wide prediction methods in combination with permutation tests can be employed for analysis of introgression populations. They are particularly useful when introgression lines carry several donor segments or when the donor segments of different introgression lines are overlapping.


Subject(s)
Breeding , Chromosomes , Crops, Agricultural , Genome-Wide Association Study , Humans , Models, Genetic
11.
BMC Plant Biol ; 14: 88, 2014 Apr 02.
Article in English | MEDLINE | ID: mdl-24693880

ABSTRACT

BACKGROUND: The identification of QTL involved in heterosis formation is one approach to unravel the not yet fully understood genetic basis of heterosis - the improved agronomic performance of hybrid F1 plants compared to their inbred parents. The identification of candidate genes underlying a QTL is important both for developing markers and determining the molecular genetic basis of a trait, but remains difficult owing to the large number of genes often contained within individual QTL. To address this problem in heterosis analysis, we applied a meta-analysis strategy for grain yield (GY) of Zea mays L. as example, incorporating QTL-, hybrid field-, and parental gene expression data. RESULTS: For the identification of genes underlying known heterotic QTL, we made use of tight associations between gene expression pattern and the trait of interest, identified by correlation analyses. Using this approach, genes strongly associated with heterosis for GY were discovered to be clustered in pericentromeric regions of the complex maize genome. This suggests that expression differences of sequences in recombination-suppressed regions are important in the establishment of heterosis for GY in F1 hybrids and also in the conservation of heterosis for GY across genotypes. Importantly, functional analysis of heterosis-associated genes from these genomic regions revealed over-representation of a number of functional classes, identifying key processes contributing to heterosis for GY. Based on the finding that the majority of the analyzed heterosis-associated genes were additively expressed, we propose a model referring to the influence of cis-regulatory variation on heterosis for GY by the compensation of fixed detrimental expression levels in parents. CONCLUSIONS: The study highlights the utility of a meta-analysis approach that integrates phenotypic and multi-level molecular data to unravel complex traits in plants. It provides prospects for the identification of genes relevant for QTL, and also suggests a model for the potential role of additive expression in the formation and conservation of heterosis for GY via dominant, multigenic quantitative trait loci. Our findings contribute to a deeper understanding of the multifactorial phenomenon of heterosis, and thus to the breeding of new high yielding varieties.


Subject(s)
Centromere/genetics , Gene Expression Regulation, Plant , Genome, Plant/genetics , Hybrid Vigor/genetics , Zea mays/genetics , Analysis of Variance , Chromosome Mapping , Chromosomes, Plant/genetics , Computer Simulation , Genes, Plant , Hybridization, Genetic , Inbreeding , Molecular Sequence Annotation , Oligonucleotide Array Sequence Analysis , Quantitative Trait Loci/genetics , Seeds/growth & development
12.
Front Zool ; 11: 43, 2014.
Article in English | MEDLINE | ID: mdl-25018774

ABSTRACT

INTRODUCTION: In insects, the pumping of the dorsal heart causes circulation of hemolymph throughout the central body cavity, but not within the interior of long body appendages. Hemolymph exchange in these dead-end structures is accomplished by special flow-guiding structures and/or autonomous pulsatile organs ("auxiliary hearts"). In this paper accessory pulsatile organs for an insect ovipositor are described for the first time. We studied these organs in females of the cricket Acheta domesticus by analyzing their functional morphology, neuroanatomy and physiological control. RESULTS: The lumen of the four long ovipositor valves is subdivided by longitudinal septa of connective tissue into efferent and afferent hemolymph sinuses which are confluent distally. The countercurrent flow in these sinuses is effected by pulsatile organs which are located at the bases of the ovipositor valves. Each of the four organs consists of a pumping chamber which is compressed by rhythmically contracting muscles. The morphology of the paired organs is laterally mirrored, and there are differences in some details between the dorsal and ventral organs. The compression of the pumping chambers of each valve pair occurs with a left-right alternating rhythm with a frequency of 0.2 to 0.5 Hz and is synchronized between the dorsal and ventral organs. The more anteriorly located genital chamber shows rhythmical lateral movements simultaneous with those of the ovipositor pulsatile organs and probably supports the hemolymph exchange in the abdominal apex region. The left-right alternating rhythm is produced by a central pattern generator located in the terminal ganglion. It requires no sensory feedback for its output since it persists in the completely isolated ganglion. Rhythm-modulating and rhythm-resetting interneurons are identified in the terminal ganglion. CONCLUSION: The circulatory organs of the cricket ovipositor have a unique functional morphology. The pumping apparatus at the base of each ovipositor valve operates like a bellows. It forces hemolymph via sinuses delimited by thin septa of connective tissue in a countercurrent flow through the valve lumen. The pumping activity is based on neurogenic control by a central pattern generator in the terminal ganglion.

13.
Theor Appl Genet ; 126(1): 49-58, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22926309

ABSTRACT

Introgression libraries can be used to make favorable genetic variation of exotic donor genotypes available in the genetic background of elite breeding material. Our objective was to employ a combination of the Dunnett test and a linear model analysis to identify favorable donor alleles in introgression lines (ILs) that carry long or multiple donor chromosome segments (DCS). We reanalyzed a dataset of two rye introgression libraries that consisted of ILs carrying on average about four donor segments. After identifying ILs that had a significantly better per se or testcross performance than the recipient line with the Dunnett test, the linear model analysis was in most instances able to clearly identify the donor regions that were responsible for the superior performance. The precise localization of the favorable DCS allowed a detailed analysis of pleiotropic effects and the study of the consistency of effects for per se and testcross performance. We conclude that in many cases the linear model analysis allows the assignment of donor effects to individual DCS even for ILs with long or multiple donor segments. This may considerably increase the efficiency of producing sub-ILs, because only such segments need to be isolated that are known to have a significant effect on the phenotype.


Subject(s)
Chromosomes/ultrastructure , Quantitative Trait Loci , Secale/genetics , Alleles , Chromosome Mapping/methods , Chromosomes/genetics , Chromosomes, Plant , Crosses, Genetic , DNA, Plant/genetics , Gene Library , Genes, Plant , Genetic Markers , Genetic Variation , Genome, Plant , Genotype , Linear Models , Models, Genetic , Models, Statistical
14.
Front Plant Sci ; 14: 1080087, 2023.
Article in English | MEDLINE | ID: mdl-36950349

ABSTRACT

Unreplicated field trials and genomic prediction are both used to enhance the efficiency in early selection stages of a hybrid maize breeding program. No results are available on the optimal experimental design when combining both approaches. Our objective was to investigate the effect of the training set design on the accuracy of genomic prediction in unreplicated maize test crosses. We carried out a cross validation study on the basis of an experimental data set consisting of 1436 hybrids evaluated for yield and moisture, for which genotyping information on 461 SNP markers was available. Training set designs of different size, implementing within environment prediction, within year prediction, across year prediction, and combinations of data sources across years and environments were compared with respect to their prediction accuracy. Across year prediction did not reach prediction accuracies that are useful for genomic selection. Within year prediction across environments provided useful correlations between observed and predicted breeding values. The prediction accuracies did not improve when data from previous years were added to the training set. We conclude that using all data available from unreplicated tests of the current breeding cycle provides a good accuracy of predicting test crosses, whereas adding data from previous breeding cycles, in which the genotypes are less related to the tested material, has only limited value for increasing the prediction accuracy.

15.
Front Plant Sci ; 14: 1217589, 2023.
Article in English | MEDLINE | ID: mdl-37731980

ABSTRACT

In modern plant breeding, genomic selection is becoming the gold standard for selection of superior genotypes. The basis for genomic prediction models is a set of phenotyped lines along with their genotypic profile. With high marker density and linkage disequilibrium (LD) between markers, genotype data in breeding populations tend to exhibit considerable redundancy. Therefore, interest is growing in the use of haplotype blocks to overcome redundancy by summarizing co-inherited features. Moreover, haplotype blocks can help to capture local epistasis caused by interacting loci. Here, we compared genomic prediction methods that either used single SNPs or haplotype blocks with regard to their prediction accuracy for important traits in crop datasets. We used four published datasets from canola, maize, wheat and soybean. Different approaches to construct haplotype blocks were compared, including blocks based on LD, physical distance, number of adjacent markers and the algorithms implemented in the software "Haploview" and "HaploBlocker". The tested prediction methods included Genomic Best Linear Unbiased Prediction (GBLUP), Extended GBLUP to account for additive by additive epistasis (EGBLUP), Bayesian LASSO and Reproducing Kernel Hilbert Space (RKHS) regression. We found improved prediction accuracy in some traits when using haplotype blocks compared to SNP-based predictions; however, the magnitude of improvement was very trait- and model-specific. Especially in settings with low marker density, haplotype blocks can improve genomic prediction accuracy. In most cases, physically large haplotype blocks yielded a strong decrease in prediction accuracy. Especially when prediction accuracy varies greatly across different prediction models, prediction based on haplotype blocks can improve prediction accuracy of underperforming models. However, there is no "best" method to build haplotype blocks, since prediction accuracy varied considerably across methods and traits. Hence, criteria used to define haplotype blocks should not be viewed as fixed biological parameters, but rather as hyperparameters that need to be adjusted for every dataset.

16.
Front Plant Sci ; 14: 1178902, 2023.
Article in English | MEDLINE | ID: mdl-37546247

ABSTRACT

Testcross factorials in newly established hybrid breeding programs are often highly unbalanced, incomplete, and characterized by predominance of specific combining ability (SCA) over general combining ability (GCA). This results in a low efficiency of GCA-based selection. Machine learning algorithms might improve prediction of hybrid performance in such testcross factorials, as they have been successfully applied to find complex underlying patterns in sparse data. Our objective was to compare the prediction accuracy of machine learning algorithms to that of GCA-based prediction and genomic best linear unbiased prediction (GBLUP) in six unbalanced incomplete factorials from hybrid breeding programs of rapeseed, wheat, and corn. We investigated a range of machine learning algorithms with three different types of predictor variables: (a) information on parentage of hybrids, (b) in addition hybrid performance of crosses of the parental lines with other crossing partners, and (c) genotypic marker data. In two highly incomplete and unbalanced factorials from rapeseed, in which the SCA variance contributed considerably to the genetic variance, stacked ensembles of gradient boosting machines based on parentage information outperformed GCA prediction. The stacked ensembles increased prediction accuracy from 0.39 to 0.45, and from 0.48 to 0.54 compared to GCA prediction. The prediction accuracy of stacked ensembles without marker data reached values comparable to those of GBLUP, which requires marker data. We conclude that hybrid prediction with stacked ensembles of gradient boosting machines based on parentage information is a promising approach that is worth further investigations with other data sets in which SCA variance is high.
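
The GCA-based prediction that serves as the baseline in the abstract of entry 16 can be sketched as an additive model: a hybrid is predicted as the overall mean plus the GCA effects of its two parents, estimated from whichever crosses were tested. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fit_gca`, the parent labels, and the yields are hypothetical, and SCA is ignored entirely.

```python
import numpy as np

def fit_gca(females, males, y):
    """Least-squares fit of y = mu + gca_female + gca_male from tested crosses.

    Returns a function predict(female, male) for any combination of parents
    that appear in the data. The design matrix is rank deficient, so
    np.linalg.lstsq returns the minimum-norm solution; predictions are
    unique as long as the tested crosses connect all parents.
    """
    f_ids = sorted(set(females))
    m_ids = sorted(set(males))
    f_col = {f: 1 + i for i, f in enumerate(f_ids)}
    m_col = {m: 1 + len(f_ids) + j for j, m in enumerate(m_ids)}

    X = np.zeros((len(y), 1 + len(f_ids) + len(m_ids)))
    X[:, 0] = 1.0                                   # overall mean
    for row, (f, m) in enumerate(zip(females, males)):
        X[row, f_col[f]] = 1.0                      # female GCA indicator
        X[row, m_col[m]] = 1.0                      # male GCA indicator

    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)

    def predict(female, male):
        return float(coef[0] + coef[f_col[female]] + coef[m_col[male]])

    return predict

if __name__ == "__main__":
    # Hypothetical incomplete factorial: 4 of 6 possible crosses were tested.
    females = ["A", "A", "B", "C"]
    males = ["X", "Y", "X", "Y"]
    yields = [13.0, 9.0, 11.0, 8.5]
    predict = fit_gca(females, males, yields)
    print(predict("B", "Y"), predict("C", "X"))     # the two untested crosses
```

Because the model is purely additive, any SCA contribution to an untested cross goes unpredicted, which is the limitation the abstract attributes to GCA-based selection when SCA variance is large.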

17.
Front Plant Sci ; 14: 1221750, 2023.
Article in English | MEDLINE | ID: mdl-37936929

ABSTRACT

In modern plant breeding, genomic selection is becoming the gold standard to select superior genotypes in large breeding populations that are only partially phenotyped. Many breeding programs commonly rely on single-nucleotide polymorphism (SNP) markers to capture genome-wide data for selection candidates. For this purpose, SNP arrays with moderate to high marker density represent a robust and cost-effective tool to generate reproducible, easy-to-handle, high-throughput genotype data from large-scale breeding populations. However, SNP arrays are prone to technical errors that lead to failed allele calls. To overcome this problem, failed calls are often imputed, based on the assumption that failed SNP calls are purely technical. However, this ignores the biological causes for failed calls (for example, deletions), and there is increasing evidence that gene presence-absence and other kinds of genome structural variants can play a role in phenotypic expression. Because deletions are frequently not in linkage disequilibrium with their flanking SNPs, imputation of missing SNP calls can potentially obscure valuable marker-trait associations. In this study, we analyze published datasets for canola and maize using four parametric and two machine learning models and demonstrate that failed allele calls in genomic prediction are highly predictive for important agronomic traits. We present two statistical pipelines, based on population structure and linkage disequilibrium, that enable the filtering of failed SNP calls that are likely caused by biological reasons. For the population and trait examined, prediction accuracy based on these filtered failed allele calls was competitive with standard SNP-based prediction, underlining the potential value of missing data in genomic prediction approaches. The combination of SNPs with all failed allele calls or the filtered allele calls did not outperform prediction based on SNPs alone, due to redundancy in genomic relationship estimates.
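
The idea of treating failed allele calls as predictors instead of imputing them can be sketched as follows. This is not one of the two pipelines described above, which rely on population structure and linkage disequilibrium. As a stand-in, the sketch keeps markers whose failure rate lies inside a hypothetical window, on the assumption that very rare failures are more likely to be technical. The function name `failed_call_markers` and the thresholds `min_rate` and `max_rate` are illustrative assumptions.

```python
def failed_call_markers(genotypes, min_rate=0.05, max_rate=0.95):
    """Turn failed SNP calls into binary presence/absence style markers.

    genotypes: one list per line, allele codes 0/1/2, None for a failed call.
    Returns (kept, matrix): kept holds the retained marker indices and
    matrix holds 1 where the call failed and 0 where it succeeded.
    Assumption: a simple failure-rate window replaces the structure- and
    LD-based filtering used in the study.
    """
    n_lines = len(genotypes)
    n_markers = len(genotypes[0])
    kept = []
    for j in range(n_markers):
        rate = sum(row[j] is None for row in genotypes) / n_lines
        if min_rate <= rate <= max_rate:
            kept.append(j)
    matrix = [[1 if row[j] is None else 0 for j in kept] for row in genotypes]
    return kept, matrix

# Four lines, three markers; marker 0 never fails and is dropped.
genotypes = [
    [0, None, 2],
    [1, None, None],
    [2, 0, None],
    [0, 1, 2],
]
kept, matrix = failed_call_markers(genotypes)
```

The resulting 0/1 matrix can be passed to any genomic prediction model in place of, or alongside, the usual SNP matrix.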

18.
Front Plant Sci ; 14: 1168547, 2023.
Article in English | MEDLINE | ID: mdl-37229104

ABSTRACT

Haplotype blocks might carry additional information compared to single SNPs and have therefore been suggested for use as independent variables in genomic prediction. Studies in different species resulted in more accurate predictions than with single SNPs for some traits but not for others. In addition, it remains unclear how the blocks should be built to obtain the greatest prediction accuracies. Our objective was to compare the results of genomic prediction with different types of haplotype blocks to prediction with single SNPs in 11 traits in winter wheat. We built haplotype blocks from marker data from 361 winter wheat lines based on linkage disequilibrium, fixed SNP numbers, fixed lengths in cM and with the R package HaploBlocker. We used these blocks together with data from single-year field trials in a cross-validation study for predictions with RR-BLUP, an alternative method (RMLA) that allows for heterogeneous marker variances, and GBLUP performed with the software GVCHAP. The greatest prediction accuracies for resistance scores for B. graminis, P. triticina, and F. graminearum were obtained with LD-based haplotype blocks, while blocks with fixed marker numbers and fixed lengths in cM resulted in the greatest prediction accuracies for plant height. Prediction accuracies of haplotype blocks built with HaploBlocker were greater than those of the other methods for protein concentration and resistance scores for S. tritici, B. graminis, and P. striiformis. We hypothesize that the trait-dependence is caused by properties of the haplotype blocks that have overlapping and contrasting effects on the prediction accuracy. While they might be able to capture local epistatic effects and to detect ancestral relationships better than single SNPs, prediction accuracy might be reduced by unfavorable characteristics of the design matrices in the models that are due to their multi-allelic nature.
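
Of the block-building strategies mentioned above, blocks with a fixed number of SNPs are the simplest to illustrate. The sketch below assumes fully homozygous lines coded with one 0/1 allele per SNP and builds an indicator design matrix with one column per haplotype allele observed in each block. The function name `haplotype_design` and the toy genotypes are hypothetical.

```python
def haplotype_design(genotypes, block_size):
    """Build a multi-allelic design matrix from fixed-size SNP blocks.

    genotypes: one list of 0/1 alleles per homozygous line (assumption).
    Returns (columns, design): columns lists (block_index, haplotype) pairs,
    design has one 0/1 row per line marking the haplotype carried per block.
    """
    n_snps = len(genotypes[0])
    columns = []
    for block, start in enumerate(range(0, n_snps, block_size)):
        # Distinct haplotype alleles observed in this block, in sorted order.
        alleles = sorted({tuple(line[start:start + block_size]) for line in genotypes})
        columns.extend((block, hap) for hap in alleles)
    design = []
    for line in genotypes:
        row = []
        for block, hap in columns:
            start = block * block_size
            row.append(1 if tuple(line[start:start + block_size]) == hap else 0)
        design.append(row)
    return columns, design

# Three lines, four SNPs, blocks of two SNPs each.
lines = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 0]]
columns, design = haplotype_design(lines, block_size=2)
```

Each block contributes as many columns as it has observed haplotypes, so the column count grows with block size and diversity. This is one way to see the multi-allelic design matrices that the abstract names as a possible drawback.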

19.
Theor Appl Genet ; 125(8): 1639-45, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22814724

ABSTRACT

Genome-based prediction of genetic values is expected to overcome shortcomings that limit the application of QTL mapping and marker-assisted selection in plant breeding. Our goal was to study the genome-based prediction of test cross performance with genetic effects that were estimated using genotypes from the preceding breeding cycle. In particular, our objectives were to employ a ridge regression approach that approximates best linear unbiased prediction of genetic effects, compare cross validation with validation using genetic material of the subsequent breeding cycle, and investigate the prospects of genome-based prediction in sugar beet breeding. We focused on the traits sugar content and standard molasses loss (ML) and used a set of 310 sugar beet lines to estimate genetic effects at 384 SNP markers. In cross validation, correlations >0.8 between observed and predicted test cross performance were observed for both traits. However, in validation with 56 lines from the next breeding cycle, a correlation of 0.8 could only be observed for sugar content; for standard ML, the correlation dropped to 0.4. We found that ridge regression based on preliminary estimates of the heritability provided a very good approximation of best linear unbiased prediction and was not accompanied by a loss in prediction accuracy. We conclude that prediction accuracy assessed with cross validation within one cycle of a breeding program cannot be used as an indicator for the accuracy of predicting lines of the next cycle. Prediction of lines of the next cycle seems promising for traits with high heritabilities.
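
A ridge regression whose penalty is derived from a preliminary heritability estimate can be sketched as follows. The abstract does not give the formula, so the sketch assumes the common approximation lambda = m(1 - h2)/h2 for m markers, which follows if the genetic variance is spread equally over all markers. The function names `solve`, `ridge_marker_effects`, and `predict`, and the toy data, are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def ridge_marker_effects(z, y, h2):
    """Estimate marker effects as (Z'Z + lambda I)^-1 Z'(y - mean).

    z: marker scores per line, y: phenotypes, h2: preliminary heritability.
    Assumption: lambda = m * (1 - h2) / h2 with centered marker columns.
    """
    n, m = len(z), len(z[0])
    lam = m * (1.0 - h2) / h2
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in z) / n for j in range(m)]
    zc = [[row[j] - col_means[j] for j in range(m)] for row in z]
    lhs = [[sum(zc[i][j] * zc[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(m)] for j in range(m)]
    rhs = [sum(zc[i][j] * (y[i] - y_mean) for i in range(n)) for j in range(m)]
    return y_mean, col_means, solve(lhs, rhs)

def predict(z_new, y_mean, col_means, effects):
    """Predict performance of new lines from previously estimated effects."""
    return [y_mean + sum((g - c) * e for g, c, e in zip(row, col_means, effects))
            for row in z_new]

# One marker, two lines: the least-squares effect of 1.0 is shrunk to 2/3.
y_mean, col_means, effects = ridge_marker_effects([[0], [2]], [1.0, 3.0], h2=0.5)
predicted = predict([[2]], y_mean, col_means, effects)
```

To mimic the validation described above, the effects would be estimated on lines of one breeding cycle and `predict` applied to the genotypes of the next cycle, with the correlation between predicted and observed values taken as prediction accuracy.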


Subject(s)
Breeding/methods , Crosses, Genetic , Genome, Plant/genetics , Saccharum/genetics , Carbohydrate Metabolism , Genetic Markers , Inbreeding , Linear Models , Linkage Disequilibrium/genetics , Molasses , Polymorphism, Single Nucleotide/genetics
20.
Theor Appl Genet ; 124(5): 825-33, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22101908

ABSTRACT

The performance of hybrids can be predicted with gene expression data from their parental inbred lines. Implementing such prediction approaches in breeding programs promises to increase the efficiency of hybrid breeding. The objective of our study was to compare the accuracy of prediction models employing multiple linear regression (MLR), partial least squares regression (PLS), support vector machine regression (SVM), and transcriptome-based distances (D(B)). For a factorial of 7 flint and 14 dent maize lines, the grain yield of the hybrids was assessed and the gene expression of the parental lines was profiled with a 56k microarray. The accuracy of the prediction models was measured by the correlation between predicted and observed yield employing two cross-validation schemes. The first modeled the prediction of hybrids when testcross data are available for both parental lines (type 2 hybrids), and the second modeled the prediction of hybrids when no testcross data are available for either parental line (type 0 hybrids). MLR, SVM, and PLS resulted in a high correlation between predicted and observed yield for type 2 hybrids, whereas for type 0 hybrids D(B) had greater prediction accuracy. The regression methods were robust to the choice of the set of profiled genes and required only a few hundred genes. In contrast, for an accurate hybrid prediction with D(B), 1,000-1,500 genes were required, and the prediction accuracy depended strongly on the set of profiled genes. We conclude that for prediction within one set of genetic material MLR is a promising approach, and for transferring prediction models from one set of genetic material to a related one, the transcriptome-based distance D(B) is most promising.
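
The two cross-validation schemes can be sketched as different ways of splitting a factorial into training and test hybrids. The sketch assumes a complete 7 x 14 factorial as described above. The numbers of held-out parents and test hybrids, the line labels, and the function names `split_type0` and `split_type2` are hypothetical, and the sampling details of the original study may differ.

```python
import random

def split_type0(flint, dent, n_test_flint, n_test_dent, seed=0):
    """Type 0 scheme: no parent of a test hybrid occurs in the training set.

    Hybrids with exactly one held-out parent are discarded from both sets.
    """
    rng = random.Random(seed)
    test_flint = set(rng.sample(flint, n_test_flint))
    test_dent = set(rng.sample(dent, n_test_dent))
    train, test = [], []
    for f in flint:
        for d in dent:
            if f in test_flint and d in test_dent:
                test.append((f, d))
            elif f not in test_flint and d not in test_dent:
                train.append((f, d))
    return train, test

def split_type2(hybrids, n_test, seed=0):
    """Type 2 scheme: both parents of each test hybrid stay in the training set."""
    rng = random.Random(seed)
    shuffled = hybrids[:]
    rng.shuffle(shuffled)
    train, test = shuffled[:], []
    for hybrid in shuffled:
        if len(test) == n_test:
            break
        rest = [t for t in train if t != hybrid]
        # Move the hybrid to the test set only if both parents remain tested.
        if any(t[0] == hybrid[0] for t in rest) and any(t[1] == hybrid[1] for t in rest):
            test.append(hybrid)
            train = rest
    return train, test

flint = [f"F{i}" for i in range(1, 8)]
dent = [f"D{i}" for i in range(1, 15)]
train0, test0 = split_type0(flint, dent, n_test_flint=2, n_test_dent=4)
all_hybrids = [(f, d) for f in flint for d in dent]
train2, test2 = split_type2(all_hybrids, n_test=20)
```

In the type 0 split, a model can use only information on the parental lines themselves, such as expression profiles, whereas in the type 2 split the testcross performance of both parents with other partners is available for training.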


Subject(s)
Breeding/methods , Gene Expression Profiling , Hybridization, Genetic/genetics , Models, Genetic , Zea mays/genetics , Least-Squares Analysis , Support Vector Machine , Zea mays/metabolism