Results 1 - 20 of 110
1.
BMC Bioinformatics ; 25(1): 202, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816801

ABSTRACT

INTRODUCTION: In systems biology, an organism is viewed as a system of interconnected molecular entities. To understand the functioning of organisms it is essential to integrate information about the variations in the concentrations of those molecular entities. This information can be structured as a set of networks with interconnections and with some hierarchical relations between them. Few methods exist for the reconstruction of integrative networks. OBJECTIVE: In this work, we propose an integrative network reconstruction method in which the network organization for a particular type of omics data is guided by the network structure of a related type of omics data upstream in the omic cascade. The structure of these guiding data can be either already known or be estimated from the guiding data themselves. METHODS: The method consists of three steps. First, a network structure for the guiding data should be provided. Next, responses in the target set are regressed on the full set of predictors in the guiding data with a Lasso penalty to reduce the number of predictors and an L2 penalty on the differences between coefficients for predictors that share edges in the network for the guiding data. Finally, a network is reconstructed on the fitted target responses as functions of the predictors in the guiding data. This way we condition the target network on the network of the guiding data. CONCLUSIONS: We illustrate our approach on two examples in Arabidopsis. The method detects groups of metabolites that have a similar genetic or transcriptomic basis.


Subject(s)
Arabidopsis , Arabidopsis/genetics , Arabidopsis/metabolism , Systems Biology/methods , Gene Regulatory Networks , Algorithms , Computational Biology/methods , Multiomics
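The regression step described in the abstract above (a Lasso penalty plus an L2 penalty on differences between coefficients of predictors that share an edge) can be sketched as a penalized objective for a single target response. This is only an illustration under stated assumptions: the function name `penalized_objective`, the tuning parameters `lam1` and `lam2`, and the scaling of the residual sum of squares by 2n are hypothetical choices, not taken from the paper.

```python
def penalized_objective(X, y, beta, edges, lam1, lam2):
    """Least-squares loss + lasso penalty + network-fusion penalty.

    X     : list of rows, one row of guiding-data predictors per sample
    y     : list of values of one target response
    beta  : list of regression coefficients, one per predictor
    edges : list of (j, k) index pairs, the edges of the guiding network
    lam1, lam2 : hypothetical tuning parameters (assumed names)
    """
    n = len(y)
    # Residual sum of squares for the single target response.
    rss = 0.0
    for row, yi in zip(X, y):
        fitted = sum(b * x for b, x in zip(beta, row))
        rss += (yi - fitted) ** 2
    # Lasso (L1) penalty: encourages a sparse set of predictors.
    lasso = sum(abs(b) for b in beta)
    # L2 penalty on coefficient differences for predictors sharing an edge.
    fusion = sum((beta[j] - beta[k]) ** 2 for j, k in edges)
    return rss / (2 * n) + lam1 * lasso + lam2 * fusion


X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 1.0]
edges = [(0, 1)]
# Equal coefficients on connected predictors incur no fusion penalty.
smooth = penalized_objective(X, y, [1.0, 1.0], edges, lam1=0.1, lam2=1.0)
# Unequal coefficients on connected predictors are penalized.
rough = penalized_objective(X, y, [1.0, 0.0], edges, lam1=0.1, lam2=1.0)
```

In this toy case the fusion term favours the coefficient vector that is smooth over the guiding network, which is the mechanism by which the target network is conditioned on the guiding one.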
2.
Sci Rep ; 14(1): 12433, 2024 05 30.
Article in English | MEDLINE | ID: mdl-38816496

ABSTRACT

Comparing the abundance of microbial communities between different groups or obtained under different experimental conditions using count sequence data is a challenging task due to various issues such as inflated zero counts, overdispersion, and non-normality. Several methods and procedures based on counts, their transformation and compositionality have been proposed in the literature to detect differentially abundant species in datasets containing hundreds to thousands of microbial species. Despite efforts to address the large numbers of zeros present in microbiome datasets, even after careful data preprocessing, the performance of existing methods is impaired by the presence of inflated zero counts and group-wise structured zeros (i.e. all zero counts in a group). We propose and validate using extensive simulations an approach combining two differential abundance testing methods, namely DESeq2-ZINBWaVE and DESeq2, to address the issues of zero-inflation and group-wise structured zeros, respectively. This combined approach was subsequently successfully applied to two plant microbiome datasets that revealed a number of taxa as interesting candidates for further experimental validation.


Subject(s)
Microbiota , Computational Biology/methods , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Plants/microbiology , Algorithms
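The abstract above defines group-wise structured zeros as all-zero counts within a group and assigns them to DESeq2, with zero-inflation handled by DESeq2-ZINBWaVE. A minimal sketch of how taxa might be routed between the two tests follows. The function names and the routing rule are assumptions made for illustration; the sketch does not call either tool.

```python
def has_groupwise_structured_zeros(counts, groups):
    """Return True if all samples of at least one group have a zero count."""
    by_group = {}
    for count, group in zip(counts, groups):
        by_group.setdefault(group, []).append(count)
    return any(all(c == 0 for c in values) for values in by_group.values())


def route_taxa(count_table, groups):
    """Assign each taxon to one of the two testing methods (assumed rule)."""
    routed = {"DESeq2": [], "DESeq2-ZINBWaVE": []}
    for taxon, counts in count_table.items():
        if has_groupwise_structured_zeros(counts, groups):
            # All zeros in one group: handled by plain DESeq2 in this sketch.
            routed["DESeq2"].append(taxon)
        else:
            # Otherwise treat zeros as possible zero-inflation.
            routed["DESeq2-ZINBWaVE"].append(taxon)
    return routed


groups = ["control", "control", "treated", "treated"]
count_table = {
    "taxon_1": [0, 0, 5, 7],  # absent from every control sample
    "taxon_2": [0, 3, 0, 9],  # scattered zeros in both groups
}
routed = route_taxa(count_table, groups)
```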
3.
G3 (Bethesda) ; 14(4)2024 04 03.
Article in English | MEDLINE | ID: mdl-38243613

ABSTRACT

Multienvironment genomic prediction was applied to tetraploid potato using 147 potato varieties, tested for 2 years, in 3 locations representative of 3 distinct regions in Europe. Different prediction scenarios were investigated to help breeders predict genotypic performance in the regions from one year to the next, for genotypes that were tested this year (scenario 1), as well as new genotypes (scenario 3). In scenario 2, we predicted new genotypes for any one of the 6 trials, using all the information that is available. The choice of prediction model required assessment of the variance-covariance matrix in a mixed model that takes into account heterogeneity of genetic variances and correlations. This was done for each analyzed trait (tuber weight, tuber length, and dry matter), where examples of both limited and higher degrees of heterogeneity were observed. This explains why dry matter did not need complex multienvironment modeling to combine environments and increase prediction ability, while prediction in tuber weight improved only when models were flexible enough to capture the heterogeneous variances and covariances between environments. We also found that the prediction abilities in a target trial condition decreased if trials with a low genetic correlation to the target were included when training the model. Genomic prediction in tetraploid potato can work once there is clarity about the prediction scenario, a suitable training set is created, and a multienvironment prediction model is chosen based on the patterns of G×E indicated by the genetic variances and covariances.


Subject(s)
Solanum tuberosum , Solanum tuberosum/genetics , Tetraploidy , Phenotype , Genotype , Genomics
4.
Plant Genome ; 17(1): e20333, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37122200

ABSTRACT

Terminal drought is one of the major constraints to crop production in chickpea (Cicer arietinum L.). In order to map drought tolerance related traits at high resolution, we sequenced a multi-parent advanced generation intercross (MAGIC) population using a whole-genome resequencing approach and phenotyped it under drought stress environments for two consecutive years (2013-14 and 2014-15). A total of 52.02 billion clean reads containing 4.67 TB clean data were generated on the 1136 MAGIC lines and eight parental lines. Alignment of clean data on to the reference genome enabled identification of a total of 932,172 SNPs, 35,973 insertions, and 35,726 deletions among the parental lines. A high-density genetic map was constructed using 57,180 SNPs spanning a map distance of 1606.69 cM. Using a compressed mixed linear model, a genome-wide association study (GWAS) enabled us to identify 737 markers significantly associated with days to 50% flowering, days to maturity, plant height, 100 seed weight, biomass, and harvest index. In addition to the GWAS approach, an identity-by-descent (IBD)-based mixed model approach was used to map quantitative trait loci (QTLs). The IBD-based mixed model approach detected major QTLs that were comparable to those from the GWAS analysis as well as some exclusive QTLs with smaller effects. Candidate genes such as FRIGIDA and CaTIFY4b can be used for enhancing drought tolerance in chickpea. The genomic resources, genetic map, marker-trait associations, and QTLs identified in the study are valuable resources for the chickpea community for developing climate resilient chickpeas.


Subject(s)
Cicer , Chromosome Mapping , Cicer/genetics , Genome, Plant , Genome-Wide Association Study , Drought Resistance
5.
J Exp Bot ; 75(7): 2084-2099, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38134290

ABSTRACT

Crop growth and phenology are driven by seasonal changes in environmental variables, with temperature as one important factor. However, knowledge about genotype-specific temperature response and its influence on phenology is limited. Such information is fundamental to improve crop models and adapt selection strategies. We measured the increase in height of 352 European winter wheat varieties in 4 years to quantify phenology, and fitted an asymptotic temperature response model. The model used hourly fluctuations in temperature to parameterize the base temperature (Tmin), the temperature optimum (rmax), and the steepness (lrc) of growth responses. Our results show that higher Tmin and lrc relate to an earlier start and end of stem elongation. A higher rmax relates to an increased final height. Both final height and rmax decreased for varieties originating from the continental east of Europe towards the maritime west. A genome-wide association study (GWAS) indicated a quantitative inheritance and a large degree of independence among loci. Nevertheless, genomic prediction accuracies (GBLUPs) for Tmin and lrc were low (r≤0.32) compared with other traits (r≥0.59). In addition to known major genes related to vernalization, photoperiod, or dwarfing, the GWAS indicated additional, as yet unknown loci that dominate the temperature response.


Subject(s)
Genome-Wide Association Study , Triticum , Triticum/genetics , Temperature , Quantitative Trait Loci , Plant Breeding , Phenotype
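The abstract above names three parameters of an asymptotic temperature response (Tmin, rmax, lrc) without giving the functional form. The sketch below assumes a common asymptotic-with-offset parameterization in which lrc is the natural log of a rate constant, so growth is zero at or below Tmin and approaches rmax at high temperatures. The form and the function name `growth_rate` are assumptions for illustration, not the paper's stated model.

```python
import math


def growth_rate(temp, t_min, r_max, lrc):
    """Assumed asymptotic response: zero below t_min, saturating at r_max.

    temp  : temperature (for example, an hourly reading)
    t_min : base temperature below which no growth occurs
    r_max : asymptote of the growth rate in this assumed form
    lrc   : natural log of the rate constant (steepness of the response)
    """
    if temp <= t_min:
        return 0.0
    return r_max * (1.0 - math.exp(-math.exp(lrc) * (temp - t_min)))


def accumulated_growth(hourly_temps, t_min, r_max, lrc):
    """Sum the hourly growth rates over a series of temperature readings."""
    return sum(growth_rate(t, t_min, r_max, lrc) for t in hourly_temps)


# A larger lrc gives a steeper rise towards the asymptote.
shallow = growth_rate(5.0, t_min=2.0, r_max=3.0, lrc=-1.0)
steep = growth_rate(5.0, t_min=2.0, r_max=3.0, lrc=0.0)
```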
6.
Plants (Basel) ; 12(14)2023 Jul 11.
Article in English | MEDLINE | ID: mdl-37514232

ABSTRACT

There is an ongoing endeavor within the potato breeding sector to rapidly adapt potato from a clonal polyploid crop to a diploid hybrid potato crop. While hybrid breeding allows for the efficient generation and selection of parental lines, it also increases breeding program complexity and results in longer breeding cycles. Over the past two decades, genomic prediction has revolutionized hybrid crop breeding through shorter breeding cycles, lower phenotyping costs, and better population improvement, resulting in increased genetic gains for genetically complex traits. In order to accelerate the genetic gains in hybrid potato, the proper implementation of genomic prediction is a crucial milestone in the rapid improvement of this crop. The authors of this paper set out to test genomic prediction in hybrid potato using current genotyped material with two alternative models: one model that predicts the general combining ability effects (GCA) and another which predicts both the general and specific combining ability effects (GCA+SCA). Using a training set comprising 769 hybrids and 456 genotyped parental lines, we found that a reasonable prediction accuracy could be achieved for most phenotypes with both zero (ρ=0.36-0.61) and one (ρ=0.50-0.68) common parent between the training and test sets. There was no benefit with the inclusion of non-additive genetic effects in the GCA+SCA model despite SCA variance contributing between 9% and 19% of the total genetic variance. Genotype-by-environment interactions, while present, did not appear to affect the prediction accuracy, though prediction errors did vary across the trial's targets. These results suggest that genomically estimated breeding values on parental lines are sufficient for hybrid yield prediction.
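The two models compared in the abstract above differ only in whether a cross-specific term is added to the parental effects. A minimal sketch of that difference follows, assuming the textbook decomposition of a hybrid value into an overall mean, the two parental GCA effects, and an optional SCA effect. The function name, the dictionary layout, and all numbers are hypothetical. The sketch does not estimate the effects from marker data as the paper does.

```python
def predict_hybrid(mu, gca, parent1, parent2, sca=None):
    """Predict a hybrid's value from combining-ability effects.

    mu  : overall mean
    gca : dict mapping parent name -> general combining ability effect
    sca : optional dict mapping frozenset({p1, p2}) -> specific combining
          ability effect of that cross (unknown crosses default to 0)
    """
    value = mu + gca[parent1] + gca[parent2]
    if sca is not None:
        # The GCA+SCA model adds a cross-specific deviation.
        value += sca.get(frozenset((parent1, parent2)), 0.0)
    return value


gca = {"P1": 1.5, "P2": -0.5, "P3": 0.25}          # illustrative values
sca = {frozenset(("P1", "P2")): 0.25}              # illustrative value
gca_only = predict_hybrid(10.0, gca, "P1", "P2")            # 11.0
gca_sca = predict_hybrid(10.0, gca, "P1", "P2", sca=sca)    # 11.25
```

For a cross absent from the training set, such as P1 by P3 here, the SCA term defaults to zero and both models give the same prediction.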

7.
Front Plant Sci ; 14: 1172359, 2023.
Article in English | MEDLINE | ID: mdl-37389290

ABSTRACT

Introduction: Dynamic crop growth models are an important tool to predict complex traits, like crop yield, for modern and future genotypes in their current and evolving environments, as those occurring under climate change. Phenotypic traits are the result of interactions between genetic, environmental, and management factors, and dynamic models are designed to generate the interactions producing phenotypic changes over the growing season. Crop phenotype data are becoming increasingly available at various levels of granularity, both spatially (landscape) and temporally (longitudinal, time-series) from proximal and remote sensing technologies. Methods: Here we propose four phenomenological process models of limited complexity based on differential equations for a coarse description of focal crop traits and environmental conditions during the growing season. Each of these models defines interactions between environmental drivers and crop growth (logistic growth, with implicit growth restriction, or explicit restriction by irradiance, temperature, or water availability) as a minimal set of constraints without resorting to strongly mechanistic interpretations of the parameters. Differences between individual genotypes are conceptualized as differences in crop growth parameter values. Results: We demonstrate the utility of such low-complexity models with few parameters by fitting them to longitudinal datasets from the simulation platform APSIM-Wheat involving in silico biomass development of 199 genotypes and data of environmental variables over the course of the growing season at four Australian locations over 31 years. While each of the four models fits well to particular combinations of genotype and trial, none of them provides the best fit across the full set of genotypes by trials because different environmental drivers will limit crop growth in different trials and genotypes in any specific trial will not necessarily experience the same environmental limitation. 
Discussion: A combination of low-complexity phenomenological models covering a small set of major limiting environmental factors may be a useful forecasting tool for crop growth under genotypic and environmental variation.

8.
Genes (Basel) ; 14(6)2023 05 26.
Article in English | MEDLINE | ID: mdl-37372341

ABSTRACT

Plants can express different phenotypic responses following polyploidization, but ploidy-dependent phenotypic variation has so far not been assigned to specific genetic factors. To map such effects, segregating populations at different ploidy levels are required. The availability of an efficient haploid inducer line in Arabidopsis thaliana allows for the rapid development of large populations of segregating haploid offspring. Because Arabidopsis haploids can be self-fertilised to give rise to homozygous doubled haploids, the same genotypes can be phenotyped at both the haploid and diploid ploidy level. Here, we compared the phenotypes of recombinant haploid and diploid offspring derived from a cross between two late flowering accessions to map genotype × ploidy (G × P) interactions. Ploidy-specific quantitative trait loci (QTLs) were detected at both ploidy levels. This implies that mapping power will increase when phenotypic measurements of monoploids are included in QTL analyses. A multi-trait analysis further revealed pleiotropic effects for a number of the ploidy-specific QTLs as well as opposite effects at different ploidy levels for general QTLs. Taken together, we provide evidence of genetic variation between different Arabidopsis accessions being causal for dissimilarities in phenotypic responses to altered ploidy levels, revealing a G × P effect. Additionally, by investigating a population derived from late flowering accessions, we revealed a major vernalisation-specific QTL for variation in flowering time, countering the historical bias of research in early flowering accessions.


Subject(s)
Arabidopsis , Chromosome Mapping , Genotype , Quantitative Trait Loci/genetics , Haploidy
9.
Genes (Basel) ; 14(4)2023 04 17.
Article in English | MEDLINE | ID: mdl-37107685

ABSTRACT

While sparse testing methods have been proposed by researchers to improve the efficiency of genomic selection (GS) in breeding programs, there are several factors that can hinder this. In this research, we evaluated four methods (M1-M4) for sparse testing allocation of lines to environments under multi-environmental trials for genomic prediction of unobserved lines. The sparse testing methods described in this study are applied in a two-stage analysis to build the genomic training and testing sets in a strategy that allows each location or environment to evaluate only a subset of all genotypes rather than all of them. To ensure a valid implementation, the sparse testing methods presented here require BLUEs (or BLUPs) of the lines to be computed at the first stage using an appropriate experimental design and statistical analyses in each location (or environment). The evaluation of the four cultivar allocation methods to environments of the second stage was done with four data sets (two large and two small) under a multi-trait and uni-trait framework. We found that the multi-trait model produced better genomic prediction (GP) accuracy than the uni-trait model and that methods M3 and M4 were slightly better than methods M1 and M2 for the allocation of lines to environments. Some of the most important findings, however, were that even under a scenario where we used a training-testing relation of 15-85%, the prediction accuracy of the four methods barely decreased. This indicates that genomic sparse testing methods for data sets under these scenarios can save considerable operational and financial resources with only a small loss in precision, as shown in our cost-benefit analysis.


Subject(s)
Models, Genetic , Plant Breeding , Plant Breeding/methods , Genome, Plant/genetics , Phenotype , Genomics , Crops, Agricultural/genetics
10.
NAR Genom Bioinform ; 5(1): lqad001, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36685726

ABSTRACT

Differential abundance analysis of infant 16S microbial sequencing data is complicated by challenging data properties, including high sparsity, extreme dispersion and the relative nature of the information contained within the data. In this study, we propose a pairwise ratio analysis that uses the compositional data analysis principle of subcompositional coherence and merges it with a beta-binomial regression model. The resulting method provides a flexible and easily interpretable approach to infant 16S sequencing data differential abundance analysis that does not require zero imputation. We evaluate the proposed method using infant 16S data from clinical trials and demonstrate that the proposed method has the power to detect differences, and demonstrate how its results can be used to gain insights. We further evaluate the method using data-inspired simulations and compare its power against related methods. Our results indicate that power is high for pairwise differential abundance analysis of taxon pairs that have a large abundance. In contrast, results for sparse taxon pairs show a decrease in power and substantial variability in method performance. While our method shows promising performance on well-measured subcompositions, we advise strong filtering steps in order to avoid excessive numbers of underpowered comparisons in practical applications.
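The pairwise ratio idea in the abstract above can be illustrated by treating the count of one taxon, given the combined count of a taxon pair in a sample, as beta-binomial. The sketch below evaluates that log-probability with the standard library only. The mean/overdispersion parameterization and the function names are assumptions for illustration. The paper's regression on covariates is not reproduced.

```python
import math


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, mean, rho):
    """Log-probability of k counts for taxon A out of n counts for the pair.

    mean : expected proportion of taxon A within the pair, in (0, 1)
    rho  : overdispersion in (0, 1); assumed parameterization with
           a + b = (1 - rho) / rho, so rho near 0 approaches the binomial
    """
    scale = (1.0 - rho) / rho
    a = mean * scale
    b = (1.0 - mean) * scale
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def pair_loglik(counts_a, counts_b, mean, rho):
    """Sum the log-probabilities over samples for one taxon pair."""
    total = 0.0
    for ca, cb in zip(counts_a, counts_b):
        total += beta_binomial_logpmf(ca, ca + cb, mean, rho)
    return total
```

Because only the two taxa in the pair enter the calculation, the result for a pair does not depend on which other taxa are present in the data, which is the property the abstract calls subcompositional coherence. A sample in which one taxon is zero still contributes without imputation.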

11.
Bioinformatics ; 38(22): 5134-5136, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36193999

ABSTRACT

MOTIVATION: Multi-parent populations (MPPs) are popular for QTL mapping because they combine wide genetic diversity in parents with easy control of population structure, but a limited number of software tools for QTL mapping are specifically developed for general MPP designs. RESULTS: We developed an R package called statgenMPP, adopting a unified identity-by-descent (IBD)-based mixed model approach for QTL analysis in MPPs. The package offers easy-to-use functionalities of IBD calculations, mixed model solutions and visualizations for QTL mapping in a wide range of MPP designs, including diallel, nested-association mapping populations, multi-parent advanced generation inter-cross populations and other complicated MPPs with known crossing schemes. AVAILABILITY AND IMPLEMENTATION: The R package statgenMPP is open-source and freely available on CRAN at https://CRAN.R-project.org/package=statgenMPP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Chromosome Mapping
12.
Nat Commun ; 13(1): 3225, 2022 06 09.
Article in English | MEDLINE | ID: mdl-35680899

ABSTRACT

Combined phenomic and genomic approaches are required to evaluate the margin of progress of breeding strategies. Here, we analyze 65 years of genetic progress in maize yield, which was similar (101 kg ha⁻¹ year⁻¹) across most frequent environmental scenarios in the European growing area. Yield gains were linked to physiologically simple traits (plant phenology and architecture) which indirectly affected reproductive development and light interception in all studied environments, marked by significant genomic signatures of selection. Conversely, studied physiological processes involved in stress adaptation remained phenotypically unchanged (e.g. stomatal conductance and growth sensitivity to drought) and showed no signatures of selection. By selecting for yield, breeders indirectly selected traits with stable effects on yield, but not physiological traits whose effects on yield can be positive or negative depending on environmental conditions. Because yield stability under climate change is desirable, novel breeding strategies may be needed for exploiting alleles governing physiological adaptive traits.


Subject(s)
Plant Breeding , Zea mays , Alleles , Droughts , Phenotype , Zea mays/genetics
13.
Theor Appl Genet ; 135(6): 2059-2082, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35524815

ABSTRACT

KEY MESSAGE: We evaluate self-organizing maps (SOM) to identify adaptation zones and visualize multi-environment genotypic responses. We apply SOM to multiple traits and crop growth model output of large-scale European sunflower data. Genotype-by-environment interactions (G × E) complicate the selection of well-adapted varieties. A possible solution is to group trial locations into adaptation zones with G × E occurring mainly between zones. By selecting for good performance inside those zones, response to selection is increased. In this paper, we present a two-step procedure to identify adaptation zones that starts from a self-organizing map (SOM). In the SOM, trials across locations and years are assigned to groups, called units, that are organized on a two-dimensional grid. Units that are further apart contain more distinct trials. In an iterative process of reweighting trial contributions to units, the grid configuration is learnt simultaneously with the trial assignment to units. An aggregation of the units in the SOM by hierarchical clustering then produces environment types, i.e. trials with similar growing conditions. Adaptation zones can subsequently be identified by grouping trial locations with similar distributions of environment types across years. For the construction of SOMs, multiple data types can be combined. We compared environment types and adaptation zones obtained for European sunflower from quantitative traits like yield, oil content, phenology and disease scores with those obtained from environmental indices calculated with the crop growth model Sunflo. We also show how results are affected by input data organization and user-defined weights for genotypes and traits. Adaptation zones for European sunflower as identified by our SOM-based strategy captured substantial genotype-by-location interaction and pointed to trials in Spain, Turkey and South Bulgaria as inducing different genotypic responses.


Subject(s)
Helianthus , Adaptation, Physiological , Algorithms , Cluster Analysis , Genotype , Helianthus/genetics
14.
G3 (Bethesda) ; 12(6)2022 05 30.
Article in English | MEDLINE | ID: mdl-35460241

ABSTRACT

Hybrid potato breeding has become a novel alternative to conventional potato breeding allowing breeders to overcome intractable barriers (e.g. tetrasomic inheritance, masked deleterious alleles, obligate clonal propagation) with the benefit of seed-based propagule, flexible population design, and the potential of hybrid vigor. Until now, however, no formal inquiry has adequately examined the relevant genetic components for complex traits in hybrid potato populations. In this present study, we use a 2-step multivariate modeling approach to estimate the variance components to assess the magnitude of the general and specific combining abilities in diploid hybrid potato. Specific combining ability effects were identified for all yield components studied here warranting evidence of nonadditive genetic effects in hybrid potato yield. However, the estimated general combining ability effects were on average 2 times larger than their respective specific combining ability quantile across all yield phenotypes. Tuber number general combining abilities and specific combining abilities were found to be highly correlated with total yield's genetic components. Tuber volume was shown to have the largest proportion of additive and nonadditive genetic variation suggesting under-selection of this phenotype in this population. The prominence of additive effects found for all traits presents evidence that the mid-parent value alone is useful for hybrid potato evaluation. Heterotic vigor stands to be useful in bolstering simpler traits but this will be dependent on target phenotypes and market requirements. This study represents the first diallel analysis of its kind in diploid potato using material derived from a commercial hybrid breeding program.


Subject(s)
Hybrid Vigor , Solanum tuberosum , Alleles , Diploidy , Hybrid Vigor/genetics , Plant Breeding , Solanum tuberosum/genetics
15.
Mol Breed ; 42(12): 76, 2022 Dec.
Article in English | MEDLINE | ID: mdl-37313326

ABSTRACT

Genome-wide association studies (GWAS) are a useful tool to unravel the genetic architecture of complex traits, but the results can be difficult to interpret. Population structure, genetic heterogeneity, and rare alleles easily result in false positive or false negative associations. This paper describes the analysis of a GWAS panel combined with three bi-parental mapping populations to validate GWAS results, using phenotypic data for steroidal glycoalkaloid (SGA) accumulation and the ratio (SGR) between the two major glycoalkaloids α-solanine and α-chaconine in potato tubers. SGAs are secondary metabolites in the Solanaceae family, functional as a defence against various pests and pathogens and in high quantities toxic for humans. With GWAS, we identified five quantitative trait loci (QTL) of which Sga1.1, Sgr8.1, and Sga11.1 were validated, but not Sga3.1 and Sgr7.1. In the bi-parental populations, Sga5.1 and Sga7.1 were mapped, but these were not identified with GWAS. The QTLs Sga1.1, Sga7.1, Sgr7.1, and Sgr8.1 co-localize with genes GAME9, GAME 6/GAME 11, SGT1, and SGT2, respectively. For other genes involved in SGA synthesis, no QTLs were identified. The results of this study illustrate a number of pitfalls in GWAS of which population structure seems the most important. We also show that introgression breeding for disease resistance has introduced new haplotypes to the gene pool involved in higher SGA levels in certain pedigrees. Finally, we show that high SGA levels remain unpredictable in potato but that α-solanine/α-chaconine ratio has a predictable outcome with specific SGT1 and SGT2 haplotypes. Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-022-01344-2.

16.
Front Plant Sci ; 12: 771075, 2021.
Article in English | MEDLINE | ID: mdl-34899794

ABSTRACT

Training set construction is an important prerequisite to Genomic Prediction (GP), and while this has been studied in diploids, polyploids have not received the same attention. Polyploidy is a common feature in many crop plants, for example banana and blueberry, but also potato, which is the third most important crop in the world in terms of food consumption, after rice and wheat. The aim of this study was to investigate the impact of different training set construction methods using a publicly available diversity panel of tetraploid potatoes. Four methods of training set construction were compared: simple random sampling, stratified random sampling, genetic distance sampling and sampling based on the coefficient of determination (CDmean). For stratified random sampling, population structure analyses were carried out in order to define sub-populations, but since sub-populations accounted for only 16.6% of genetic variation, there were negligible differences between stratified and simple random sampling. For genetic distance sampling, four genetic distance measures were compared and though they performed similarly, Euclidean distance was the most consistent. In the majority of cases the CDmean method was the best sampling method, and compared to simple random sampling gave improvements of 4-14% in cross-validation scenarios, and 2-8% in scenarios with an independent test set, while genetic distance sampling gave improvements of 5.5-10.5% and 0.4-4.5%. No interaction was found between sampling method and the statistical model for the traits analyzed.

17.
Plant Genome ; 14(3): e20154, 2021 11.
Article in English | MEDLINE | ID: mdl-34617677

ABSTRACT

Grass pea (Lathyrus sativus L.) is an annual legume species, phylogenetically close to pea (Pisum sativum L.), that may be infected by Fusarium oxysporum f. sp. pisi (Fop), the causal agent of fusarium wilt in peas with vast worldwide yield losses. A range of responses varying from high resistance to susceptibility to this pathogen has been reported in grass pea germplasm. Nevertheless, the genetic basis of that diversity of responses is still unknown, hampering its breeding exploitation. To identify genomic regions controlling grass pea resistance to fusarium wilt, a genome-wide association study approach was applied on a grass pea worldwide collection of accessions inoculated with Fop race 2. Disease responses were scored in this collection that was also subjected to high-throughput based single nucleotide polymorphisms (SNP) screening through genotyping-by-sequencing. A total of 5,651 high-quality SNPs were considered for association mapping analysis, performed using mixed linear models accounting for population structure. Because of the absence of a fully assembled grass pea reference genome, SNP markers' genomic positions were retrieved from the pea's reference genome v1a. In total, 17 genomic regions were associated with three fusarium wilt response traits in grass pea, anticipating an oligogenic control. Seven of these regions were located on pea chromosomes 1, 6, and 7. The candidate genes underlying these regions were putatively involved in secondary and amino acid metabolism, RNA (regulation of transcription), transport, and development. This study revealed important fusarium wilt resistance favorable grass pea SNP alleles, allowing the development of molecular tools for precision disease resistance breeding.


Subject(s)
Fusarium , Genome-Wide Association Study , Pisum sativum/genetics , Plant Breeding , Plant Diseases/genetics
18.
Theor Appl Genet ; 134(11): 3643-3660, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34342658

ABSTRACT

KEY MESSAGE: The identity-by-descent (IBD)-based mixed model approach introduced in this study can detect quantitative trait loci (QTLs) referring to the parental origin and simultaneously account for multilevel relatedness of individuals within and across families. This unified approach is proved to be a powerful approach for all kinds of multiparental population (MPP) designs. Multiparental populations (MPPs) have become popular for quantitative trait loci (QTL) detection. Tools for QTL mapping in MPPs are mostly developed for specific MPPs and do not generalize well to other MPPs. We present an IBD-based mixed model approach for QTL mapping in all kinds of MPP designs, e.g., diallel, Nested Association Mapping (NAM), and Multiparental Advanced Generation Intercross (MAGIC) designs. The first step is to compute identity-by-descent (IBD) probabilities using a general Hidden Markov model framework, called reconstructing ancestry blocks bit by bit (RABBIT). Next, functions of IBD information are used as design matrices, or genetic predictors, in a mixed model approach to estimate variance components for multiallelic genetic effects associated with parents. Family-specific residual genetic effects are added, and a polygenic effect is structured by kinship relations between individuals. Case studies of simulated diallel, NAM, and MAGIC designs proved that the advanced IBD-based multi-QTL mixed model approach incorporating both kinship relations and family-specific residual variances (IBD.MQMkin_F) is robust across a variety of MPP designs and allele segregation patterns in comparison to a widely used benchmark association mapping method, and in most cases, outperformed or behaved at least as well as other tools developed for specific MPP designs in terms of mapping power and resolution. Successful analyses of real data cases confirmed the wide applicability of our IBD-based mixed model methodology.


Asunto(s)
Mapeo Cromosómico , Modelos Genéticos , Sitios de Carácter Cuantitativo , Alelos , Simulación por Computador , Modelos Lineales , Cadenas de Markov , Plantas/genética
19.
Front Plant Sci ; 12: 672417, 2021.
Artículo en Inglés | MEDLINE | ID: mdl-34434201

RESUMEN

Use of genomic prediction (GP) in tetraploids is becoming more common, which makes this the right time to compare GP models for tetraploid potato. The GP models compared contrasted shrinkage with variable selection, parametric with non-parametric models, and different ways of accounting for non-additive genetic effects. As a complement to GP, association studies were carried out in an attempt to understand the differences in prediction accuracy. We compared our GP models on a data set consisting of 147 cultivars, representing worldwide diversity, with over 39k GBS markers and measurements on four tuber traits collected in six trials at three locations during 2 years. GP accuracies ranged from 0.32 for tuber count to 0.77 for dry matter content. For all traits, differences between GP models that utilised shrinkage penalties and those that performed variable selection were negligible. This was surprising for dry matter, as only a few additive markers explained over 50% of phenotypic variation. Accuracy for tuber count increased from 0.35 to 0.41 when dominance was included in the model. This result is supported by a genome-wide association study (GWAS), which found that additive and dominance effects accounted for 37% of phenotypic variation, while significant additive effects alone accounted for 14%. For tuber weight, the Reproducing Kernel Hilbert Space (RKHS) model gave a larger improvement in prediction accuracy than explicitly modelling epistatic effects. This indicates that the between-locus epistatic effects of tuber weight can be captured more effectively using the semi-parametric RKHS model. Our results show good opportunities for GP in 4x potato.
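
As an illustration of a shrinkage-type GP model, the following minimal sketch fits ridge regression to simulated tetraploid marker dosages and scores it on held-out individuals. Everything here is an assumption for illustration only: the data are simulated, the helper names `ridge_fit` and `prediction_accuracy`, the marker count, the penalty, and the train/test split are hypothetical, and prediction accuracy is taken to be the Pearson correlation between predicted and observed values, which the abstract does not define.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge (shrinkage) estimates of marker effects; the intercept is unpenalised."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    # Solve (Xc'Xc + lam * I) beta = Xc'yc for the shrunken marker effects.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return intercept, beta

def prediction_accuracy(y_obs, y_pred):
    """Pearson correlation between observed and predicted phenotypes."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])

rng = np.random.default_rng(0)
n, p = 147, 500                       # 147 individuals as in the abstract; 500 markers is arbitrary
X = rng.integers(0, 5, size=(n, p)).astype(float)  # simulated allele dosages 0..4
true_beta = np.zeros(p)
true_beta[:10] = rng.normal(size=10)  # a few simulated additive effects
y = X @ true_beta + rng.normal(scale=1.0, size=n)

train = np.arange(n) < 110            # simple hold-out split, not the study's validation scheme
intercept, beta = ridge_fit(X[train], y[train], lam=50.0)
acc = prediction_accuracy(y[~train], intercept + X[~train] @ beta)
```

A variable-selection model would instead set most marker effects exactly to zero; comparing `acc` between the two kinds of model on the same split mirrors the type of comparison the abstract reports.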

20.
Front Genet ; 12: 667358, 2021.
Artículo en Inglés | MEDLINE | ID: mdl-34108993

RESUMEN

In the past decades, genomic prediction has had a large impact on plant breeding. Given the current advances in high-throughput phenotyping and sequencing technologies, it is increasingly common to observe a large number of traits in addition to the target trait of interest. This raises the important question of whether these additional or "secondary" traits can be used to improve genomic prediction for the target trait. With only a small number of secondary traits, this is known to be the case, given sufficiently high heritabilities and genetic correlations. Here we focus on the more challenging situation with a large number of secondary traits, which is increasingly common since the arrival of high-throughput phenotyping. In this case, secondary traits are usually incorporated through additional relatedness matrices. This approach is, however, infeasible when secondary traits are not measured on the test set, and it cannot distinguish between genetic and non-genetic correlations. An alternative direction is to extend the classical selection indices using penalized regression. So far, penalized selection indices have not been applied in a genomic prediction setting, and they require plot-level data in order to reliably estimate genetic correlations. Here we aim to overcome these limitations using two novel approaches. Our first approach relies on a dimension reduction of the secondary traits, using either penalized regression or random forests (LS-BLUP/RF-BLUP). We then compute the bivariate GBLUP with the dimension reduction as secondary trait. For simulated data (with available plot-level data), we also use bivariate GBLUP with the penalized selection index as secondary trait (SI-BLUP). In our second approach (GM-BLUP), we follow existing multi-kernel methods but replace secondary traits by their genomic predictions, with the advantage that genomic prediction is also possible when secondary traits are only measured on the training set.
For most of our simulated data, SI-BLUP was most accurate, often closely followed by RF-BLUP or LS-BLUP. In real datasets, involving metabolites in Arabidopsis and transcriptomics in maize, no method could substantially improve over univariate prediction when secondary traits were only available on the training set. LS-BLUP and RF-BLUP were most accurate when secondary traits were available also for the test set.

...