Results 1 - 18 of 18
1.
Genetics ; 222(1)2022 08 30.
Article in English | MEDLINE | ID: mdl-35924977

ABSTRACT

The BGLR-R package implements various types of single-trait shrinkage/variable selection Bayesian regressions. The package was first released in 2014, and since then it has become widely used software in genomic studies. We recently developed functionality for multitrait models. The implementation allows users to include an arbitrary number of random-effects terms. For each set of predictors, users can choose diffuse, Gaussian, and Gaussian-spike-slab multivariate priors. Unlike other software packages for multitrait genomic regressions, BGLR offers many specifications for (co)variance parameters (unstructured, diagonal, factor analytic, and recursive). Samples from the posterior distribution of the models implemented in the multitrait function are generated using a Gibbs sampler, which is implemented by combining code written in the R and C programming languages. In this article, we provide an overview of the models and methods implemented in BGLR's multitrait function, present examples that illustrate the use of the package, and benchmark the performance of the software.


Subjects
Algorithms , Genome , Bayes Theorem , Genomics/methods , Genotype , Genetic Models
2.
G3 (Bethesda) ; 12(2)2022 02 04.
Article in English | MEDLINE | ID: mdl-34849802

ABSTRACT

When multitrait data are available, the preferred models are those that account for correlations between phenotypic traits because, when the degree of correlation is moderate or large, accounting for it increases the genomic prediction accuracy. For this reason, in this article, we explore Bayesian multitrait kernel methods for genomic prediction and we illustrate the power of these models with three real datasets. The kernels under study were the linear, Gaussian, polynomial, and sigmoid kernels; they were compared with the conventional Ridge regression and GBLUP multitrait models. The results show that, in general, the Gaussian kernel method outperformed conventional Bayesian Ridge and GBLUP multitrait linear models by 2.2-17.45% (datasets 1-3) in terms of prediction performance based on the mean square error of prediction. This improvement in prediction performance of the Bayesian multitrait kernel method can be attributed to the fact that the proposed model is able to capture nonlinear patterns more efficiently than linear multitrait models. However, not all kernels performed well in the datasets used for evaluation, which is why more than one kernel should be evaluated in order to choose the best one.
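
As an illustration of the four kernels named in this abstract, the following Python sketch builds each one from a marker matrix. It is not the authors' implementation: the toy marker matrix, the function names, the default `gamma` and `coef0` values, and the median-distance scaling of the Gaussian kernel are all assumptions made for the example.

```python
import numpy as np

def linear_kernel(X):
    """Linear kernel: marker cross-product divided by the number of markers."""
    return X @ X.T / X.shape[1]

def gaussian_kernel(X, bandwidth=1.0):
    """Gaussian kernel on squared Euclidean distances.

    Distances are divided by their median (an assumed, commonly used scaling).
    """
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negative values from rounding
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-bandwidth * d2 / scale)

def polynomial_kernel(X, degree=2, gamma=None, coef0=1.0):
    """Polynomial kernel: (gamma * <x_i, x_j> + coef0) ** degree."""
    gamma = 1.0 / X.shape[1] if gamma is None else gamma
    return (gamma * (X @ X.T) + coef0) ** degree

def sigmoid_kernel(X, gamma=None, coef0=1.0):
    """Sigmoid kernel: tanh(gamma * <x_i, x_j> + coef0)."""
    gamma = 1.0 / X.shape[1] if gamma is None else gamma
    return np.tanh(gamma * (X @ X.T) + coef0)

# Hypothetical toy data: 6 lines, 50 markers coded 0/1/2, then column-centered.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(6, 50)).astype(float)
X -= X.mean(axis=0)

kernels = {
    "linear": linear_kernel(X),
    "gaussian": gaussian_kernel(X),
    "polynomial": polynomial_kernel(X),
    "sigmoid": sigmoid_kernel(X),
}
```

Each entry of `kernels` is a lines-by-lines similarity matrix that could stand in for the genomic relationship matrix of a mixed model; which kernel predicts best would have to be checked by cross-validation, as the abstract recommends.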


Subjects
Genome , Genetic Models , Bayes Theorem , Genotype , Phenotype
3.
G3 (Bethesda) ; 11(10)2021 09 27.
Article in English | MEDLINE | ID: mdl-34568924

ABSTRACT

Implementing genomic-based prediction models in genomic selection requires an understanding of the measures for evaluating prediction accuracy from different models and methods using multi-trait data. In this study, we compared prediction accuracy using six large multi-trait wheat data sets (quality and grain yield). The data were used to predict 1 year (testing) from the previous year (training) to assess prediction accuracy using four different prediction models. The results indicated that the conventional Pearson's correlation between observed and predicted values underestimated the true correlation value, whereas the corrected Pearson's correlation calculated by fitting a bivariate model was higher than the division of the Pearson's correlation by the square root of the heritability across traits, by 2.53-11.46%. Across the datasets, the corrected Pearson's correlation was higher than the uncorrected by 5.80-14.01%. Overall, we found that for grain yield the prediction performance was highest using a multi-trait compared to a single-trait model. The higher the absolute genetic correlation between traits, the greater the benefits of multi-trait models for increasing the genomic-enabled prediction accuracy of traits.
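
One of the two corrections mentioned in this abstract, dividing Pearson's correlation by the square root of the heritability, can be sketched in a few lines of Python. The simulated data, the function names, and the heritability value are assumptions for illustration only; the bivariate-model correction that the abstract also describes is not reproduced here.

```python
import numpy as np

def pearson(observed, predicted):
    """Conventional Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])

def corrected_by_heritability(r, h2):
    """Divide a correlation by the square root of the trait heritability."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    return r / np.sqrt(h2)

# Hypothetical simulation: phenotype = genetic value + unit-variance noise,
# so the heritability is about 0.5 by construction.
rng = np.random.default_rng(1)
genetic = rng.normal(size=200)
observed = genetic + rng.normal(scale=1.0, size=200)
predicted = genetic + rng.normal(scale=0.5, size=200)

r = pearson(observed, predicted)
r_corrected = corrected_by_heritability(r, h2=0.5)
```

Because the heritability is at most 1, the corrected value is never smaller in magnitude than the raw correlation, which is consistent with the abstract's statement that the uncorrected correlation underestimates the true one.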


Subjects
Plant Breeding , Triticum , Genomics , Genotype , Genetic Models , Phenotype , Genetic Selection , Triticum/genetics
4.
G3 (Bethesda) ; 10(9): 3137-3145, 2020 09 02.
Article in English | MEDLINE | ID: mdl-32709618

ABSTRACT

Genomic selection uses whole-genome marker models to predict phenotypes or genetic values for complex traits. Some of these models fit interaction terms between markers, and are therefore called epistatic. The biological interpretation of the corresponding fitted effects is not straightforward and there is the threat of overinterpreting their functional meaning. Here we show that the predictive ability of epistatic models relative to additive models can change with the density of the marker panel. In more detail, we show that for publicly available Arabidopsis and rice datasets, an initial superiority of epistatic models over additive models, which can be observed at a lower marker density, vanishes when the number of markers increases. We relate these observations to earlier results reported in the context of association studies which showed that detecting statistical epistatic effects may not only be related to interactions in the underlying genetic architecture, but also to incomplete linkage disequilibrium at low marker density ("Phantom Epistasis"). Finally, we illustrate in a simulation study that due to phantom epistasis, epistatic models may also predict the genetic value of an underlying purely additive genetic architecture better than additive models, when the marker density is low. Our observations can encourage the use of genomic epistatic models with low density panels, and discourage their biological over-interpretation.


Subjects
Genetic Epistasis , Genetic Models , Genome , Genomics , Linkage Disequilibrium
5.
G3 (Bethesda) ; 9(12): 3981-3994, 2019 12 03.
Article in English | MEDLINE | ID: mdl-31570501

ABSTRACT

The constrained linear genomic selection index (CLGSI) is a linear combination of genomic estimated breeding values useful for predicting the net genetic merit, which in turn is a linear combination of true unobservable breeding values of the traits weighted by their respective economic values. The CLGSI is the most general genomic index and allows imposing constraints on the expected genetic gain per trait to make some traits change their mean values based on a predetermined level, while the rest of them remain without restrictions. In addition, it includes the unconstrained linear genomic index as a particular case. Using two real datasets and simulated data for seven selection cycles, we compared the theoretical results of the CLGSI with the theoretical results of the constrained linear phenotypic selection index (CLPSI). The criteria used to compare CLGSI vs. CLPSI efficiency were the estimated expected genetic gain per trait values, the selection response, and the interval between selection cycles. The results indicated that because the interval between selection cycles is shorter for the CLGSI than for the CLPSI, CLGSI is more efficient than CLPSI per unit of time, but its efficiency could be lower per selection cycle. Thus, CLGSI is a good option for performing genomic selection when there are genotyped candidates for selection.


Subjects
Genomics , Genetic Selection , Zea mays/genetics , Computer Simulation , Genetic Crosses , Genetic Databases , Plant Genome , Phenotype , Plant Breeding , Heritable Quantitative Trait
6.
G3 (Bethesda) ; 9(8): 2739-2748, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31263059

ABSTRACT

The genetic merit of individuals can be estimated using models with dense markers and pedigree information. Early genomic models accounted only for additive effects. However, the prediction of non-additive effects is important for different forest breeding systems where the whole genotypic value can be captured through clonal propagation. In this study, we evaluated the integration of marker data with pedigree information, in models that included or ignored non-additive effects. We tested the models Reproducing Kernel Hilbert Spaces (RKHS) and BayesA, with additive and additive-dominance frameworks. Model performance was assessed for the traits tree height (HT), diameter at breast height (DBH) and rust resistance, measured in 923 pine individuals from a structured population of 71 full-sib families. We also simulated a population with similar genetic properties and evaluated the performance of models for six simulated traits with distinct genetic architectures. Different cross-validation strategies were evaluated, and the highest accuracies were achieved using within-family cross-validation. The inclusion of pedigree information in genomic prediction models did not yield higher accuracies. The different RKHS models resulted in similar prediction accuracies, and RKHS and BayesA generated substantially better predictions than pedigree-only models. The additive BayesA model resulted in higher accuracies than RKHS for rust incidence and in simulated additive-oligogenic traits. For DBH, HT and additive-dominance polygenic traits, the RKHS-based models showed slightly higher accuracies than BayesA. Our results indicate that BayesA performs best for traits with few genes with major effects, while RKHS-based models can best predict genotypic effects for clonal selection of complex traits.


Subjects
Genetic Markers , Genome , Genomics , Genetic Models , Pedigree , Algorithms , Breeding , Population Genetics , Genomics/methods , Genotype , Phenotype , Plant Breeding , Reproducibility of Results
7.
G3 (Bethesda) ; 9(8): 2463-2475, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31171567

ABSTRACT

Genomic selection is an efficient approach to get shorter breeding cycles in recurrent selection programs and greater genetic gains with selection of superior individuals. Despite advances in genotyping techniques, genetic studies for polyploid species have been limited to a rough approximation of studies in diploid species. The major challenge is to distinguish the different types of heterozygotes present in polyploid populations. In this work, we evaluated different genomic prediction models applied to a recurrent selection population of 530 genotypes of Panicum maximum, an autotetraploid forage grass. We also investigated the effect of the allele dosage in the prediction, i.e., considering tetraploid (GS-TD) or diploid (GS-DD) allele dosage. A longitudinal linear mixed model was fitted for each one of the six phenotypic traits, considering different covariance matrices for genetic and residual effects. A total of 41,424 genotyping-by-sequencing markers were obtained using 96-plex and Pst1 restriction enzyme, and quantitative genotype calling was performed. Six predictive models were generalized to tetraploid species and predictive ability was estimated by a replicated fivefold cross-validation process. GS-TD and GS-DD models were fitted considering 1,223 informative markers. Overall, GS-TD data yielded higher predictive abilities than GS-DD data. However, different predictive models had similar predictive ability. In this work, we provide bioinformatic and modeling guidelines to consider tetraploid dosage and observe that genomic selection may lead to additional gains in recurrent selection programs of P. maximum.


Subjects
Alleles , Gene Dosage , Plant Genome , Genomics , Panicum/genetics , Algorithms , Genomics/methods , Phenotype , Plant Breeding , Polyploidy , Genetic Selection
8.
G3 (Bethesda) ; 9(4): 1231-1247, 2019 04 09.
Article in English | MEDLINE | ID: mdl-30796086

ABSTRACT

Hyperspectral reflectance phenotyping and genomic selection are two emerging technologies that have the potential to increase plant breeding efficiency by improving prediction accuracy for grain yield. Hyperspectral cameras quantify canopy reflectance across a wide range of wavelengths that are associated with numerous biophysical and biochemical processes in plants. Genomic selection models utilize genome-wide marker or pedigree information to predict the genetic values of breeding lines. In this study, we propose a multi-kernel GBLUP approach to genomic selection that uses genomic marker-, pedigree-, and hyperspectral reflectance-derived relationship matrices to model the genetic main effects and genotype × environment (G × E) interactions across environments within a bread wheat (Triticum aestivum L.) breeding program. We utilized an airplane equipped with a hyperspectral camera to phenotype five differentially managed treatments of the yield trials conducted by the Bread Wheat Improvement Program of the International Maize and Wheat Improvement Center (CIMMYT) at Ciudad Obregón, México over four breeding cycles. We observed that single-kernel models using hyperspectral reflectance-derived relationship matrices performed similarly or superior to marker- and pedigree-based genomic selection models when predicting within and across environments. Multi-kernel models combining marker/pedigree information with hyperspectral reflectance phenotypes had the highest prediction accuracies; however, improvements in accuracy over marker- and pedigree-based models were marginal when correcting for days to heading. Our results demonstrate the potential of using hyperspectral imaging to predict grain yield within a multi-environment context and also support further studies on the integration of hyperspectral reflectance phenotyping into breeding programs.


Subjects
Plant Breeding/methods , Triticum/genetics , Gene-Environment Interaction , Genetic Markers , Plant Genome , Genotype , Mexico , Phenotype , Genetic Selection , Triticum/growth & development
9.
G3 (Bethesda) ; 8(9): 3039-3047, 2018 08 30.
Article in English | MEDLINE | ID: mdl-30049744

ABSTRACT

One of the major issues in plant breeding is the occurrence of genotype × environment (GE) interaction. Several models have been created to understand this phenomenon and explore it. In the genomic era, several models were employed to improve selection by using markers and account for GE interaction simultaneously. Some of these models use special genetic covariance matrices. In addition, the scale of multi-environment trials is getting larger, and this increases the computational challenges. In this context, we propose an R package that, in general, allows building GE genomic covariance matrices and fitting linear mixed models, in particular, to a few genomic GE models. Here we propose two functions: one to prepare the genomic kernels accounting for the genomic GE and another to perform genomic prediction using a Bayesian linear mixed model. A specific treatment is given for sparse covariance matrices, in particular, to block diagonal matrices that are present in some GE models in order to decrease the computational demand. In empirical comparisons with Bayesian Genomic Linear Regression (BGLR), accuracies and the mean squared error were similar; however, the computational time was up to five times lower than when using the classic approach. Bayesian Genomic Genotype × Environment Interaction (BGGE) is a fast, efficient option for creating genomic GE kernels and making genomic predictions.
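
To illustrate why block-diagonal covariance matrices arise in genotype × environment models, the Python sketch below builds a main-effect kernel and an environment-specific interaction kernel with Kronecker products. This is one common construction and is not taken from the package described here: the function name `ge_kernels`, the toy relationship matrix, and the assumption of balanced data sorted by environment are all hypothetical.

```python
import numpy as np

def ge_kernels(K, n_envs):
    """Build two kernels for balanced data sorted by environment.

    K is a lines-by-lines genomic relationship matrix (assumed given).
    The main-effect kernel repeats K in every environment pair; the
    interaction kernel keeps K only within each environment, which makes
    it block diagonal.
    """
    main = np.kron(np.ones((n_envs, n_envs)), K)
    interaction = np.kron(np.eye(n_envs), K)
    return main, interaction

# Hypothetical relationship matrix for 3 lines evaluated in 2 environments.
K = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
main, interaction = ge_kernels(K, n_envs=2)
```

The zero off-diagonal blocks of `interaction` are what a sparse or block-wise treatment can exploit: each environment block can be decomposed separately instead of factorizing the full matrix.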


Subjects
Gene-Environment Interaction , Genotype , Genetic Models , Bayes Theorem , Predictive Value of Tests
10.
G3 (Bethesda) ; 8(4): 1183-1194, 2018 03 28.
Article in English | MEDLINE | ID: mdl-29440129

ABSTRACT

Piscirickettsia salmonis is one of the main infectious diseases affecting coho salmon (Oncorhynchus kisutch) farming, and current treatments have been ineffective for the control of this disease. Genetic improvement for P. salmonis resistance has been proposed as a feasible alternative for the control of this infectious disease in farmed fish. Genotyping by sequencing (GBS) strategies allow genotyping of hundreds of individuals with thousands of single nucleotide polymorphisms (SNPs), which can be used to perform genome wide association studies (GWAS) and predict genetic values using genome-wide information. We used double-digest restriction-site associated DNA (ddRAD) sequencing to dissect the genetic architecture of resistance against P. salmonis in a farmed coho salmon population and to identify molecular markers associated with the trait. We also evaluated genomic selection (GS) models in order to determine the potential to accelerate the genetic improvement of this trait by means of using genome-wide molecular information. A total of 764 individuals from 33 full-sib families (17 highly resistant and 16 highly susceptible) were experimentally challenged against P. salmonis and their genotypes were assayed using ddRAD sequencing. A total of 9,389 SNP markers were identified in the population. These markers were used to test genomic selection models and compare different GWAS methodologies for resistance measured as day of death (DD) and binary survival (BIN). Genomic selection models showed higher accuracies than the traditional pedigree-based best linear unbiased prediction (PBLUP) method, for both DD and BIN. The models showed an improvement of up to 95% and 155%, respectively, over PBLUP. One SNP related to B-cell development was identified as a potential functional candidate associated with resistance to P. salmonis defined as DD.


Subjects
DNA/genetics , Disease Resistance/genetics , Genome-Wide Association Study , Genomics , Oncorhynchus kisutch/genetics , Oncorhynchus kisutch/microbiology , Piscirickettsia/physiology , Restriction Mapping/methods , Animals , Breeding , Female , Fish Diseases/genetics , Fish Diseases/microbiology , Genetic Markers , Kaplan-Meier Estimate , Male , Pedigree
11.
G3 (Bethesda) ; 8(2): 719-726, 2018 02 02.
Article in English | MEDLINE | ID: mdl-29255117

ABSTRACT

Salmonid rickettsial syndrome (SRS), caused by the intracellular bacterium Piscirickettsia salmonis, is one of the main diseases affecting rainbow trout (Oncorhynchus mykiss) farming. To accelerate genetic progress, genomic selection methods can be used as an effective approach to control the disease. The aims of this study were: (i) to compare the accuracy of estimated breeding values using pedigree-based best linear unbiased prediction (PBLUP) with genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP), Bayes C, and Bayesian Lasso (LASSO); and (ii) to test the accuracy of genomic prediction and PBLUP using different marker densities (0.5, 3, 10, 20, and 27 K) for resistance against P. salmonis in rainbow trout. Phenotypes were recorded as number of days to death (DD) and binary survival (BS) from 2416 fish challenged with P. salmonis. A total of 1934 fish were genotyped using a 57 K single-nucleotide polymorphism (SNP) array. All genomic prediction methods achieved higher accuracies than PBLUP. The relative increase in accuracy for different genomic models ranged from 28 to 41% for both DD and BS at 27 K SNP. Between different genomic models, the highest relative increase in accuracy was obtained with Bayes C (∼40%), where 3 K SNP was enough to achieve a similar accuracy to that of the 27 K SNP for both traits. For resistance against P. salmonis in rainbow trout, we showed that genomic predictions using GBLUP, ssGBLUP, Bayes C, and LASSO can increase accuracy compared with PBLUP. Moreover, it is possible to use relatively low-density SNP panels for genomic prediction without compromising prediction accuracy for resistance against P. salmonis in rainbow trout.


Subjects
Disease Resistance/genetics , Fish Diseases/genetics , Genomics/methods , Oncorhynchus mykiss/genetics , Piscirickettsiaceae Infections/genetics , Animals , Bayes Theorem , Fish Diseases/microbiology , Genome-Wide Association Study , Genotype , Oncorhynchus mykiss/microbiology , Phenotype , Piscirickettsia/physiology , Piscirickettsiaceae Infections/microbiology , Single Nucleotide Polymorphism
12.
G3 (Bethesda) ; 7(6): 1855-1859, 2017 06 07.
Article in English | MEDLINE | ID: mdl-28391242

ABSTRACT

Nelore is the most economically important cattle breed in Brazil, and the use of genetically improved animals has contributed to increased beef production efficiency. The Brazilian beef feedlot industry has grown considerably in the last decade, so the selection of animals with higher growth rates on feedlot has become quite important. Genomic selection (GS) could be used to reduce generation intervals and improve the rate of genetic gains. The aim of this study was to evaluate the prediction of genomic-estimated breeding values (GEBV) for average daily weight gain (ADG) in 718 feedlot-finished Nelore steers. Analyses of three Bayesian model specifications [Bayesian GBLUP (BGBLUP), BayesA, and BayesCπ] were performed with four genotype panels [Illumina BovineHD BeadChip, TagSNPs, and GeneSeek High- and Low-density indicus (HDi and LDi, respectively)]. Estimates of Pearson correlations, regression coefficients, and mean squared errors were used to assess accuracy and bias of predictions. Overall, the BayesCπ model resulted in less biased predictions. Accuracies ranged from 0.18 to 0.27, which are reasonable values given the heritability estimates (from 0.40 to 0.44) and sample size (568 animals in the training population). Furthermore, results from Bos taurus indicus panels were as informative as those from Illumina BovineHD, indicating that they could be used to implement GS at lower costs.


Subjects
Breeding , Genome-Wide Association Study , Genome , Genomics/methods , Weight Gain/genetics , Animals , Brazil , Cattle , Genotype , Genetic Models , Phenotype , Reproducibility of Results
13.
G3 (Bethesda) ; 7(2): 481-495, 2017 02 09.
Article in English | MEDLINE | ID: mdl-27903632

ABSTRACT

Developing genomic selection (GS) models is an important step in applying GS to accelerate the rate of genetic gain in grain yield in plant breeding. In this study, seven genomic prediction models under two cross-validation (CV) scenarios were tested on 287 advanced elite spring wheat lines phenotyped for grain yield (GY), thousand-grain weight (GW), grain number (GN), and thermal time for flowering (TTF) in 18 international environments (year-location combinations) in major wheat-producing countries in 2010 and 2011. Prediction models with genomic and pedigree information included main effects and interaction with environments. Two random CV schemes were applied to predict a subset of lines that were not observed in any of the 18 environments (CV1), and a subset of lines that were not observed in a set of the environments, but were observed in other environments (CV2). Genomic prediction models, including genotype × environment (G×E) interaction, had the highest average prediction ability under the CV1 scenario for GY (0.31), GN (0.32), GW (0.45), and TTF (0.27). For CV2, the average prediction ability of the model including the interaction terms was generally high for GY (0.38), GN (0.43), GW (0.63), and TTF (0.53). Wheat lines in site-year combinations in Mexico and India had relatively high prediction ability for GY and GW. Results indicated that prediction ability of lines not observed in certain environments could be relatively high for genomic selection when predicting G×E interaction in multi-environment trials.
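
The two cross-validation schemes described in this abstract can be sketched as boolean masks over a lines-by-environments grid, where `True` marks a record held out for testing. This is a simplified illustration: the function names, the 20% test fraction, and the random cell-wise masking used for CV2 are assumptions, not details given in the abstract.

```python
import numpy as np

def cv1_mask(n_lines, n_envs, test_fraction, rng):
    """CV1: the chosen test lines are held out in every environment."""
    n_test = int(round(test_fraction * n_lines))
    test_lines = rng.choice(n_lines, size=n_test, replace=False)
    mask = np.zeros((n_lines, n_envs), dtype=bool)
    mask[test_lines, :] = True
    return mask

def cv2_mask(n_lines, n_envs, test_fraction, rng):
    """CV2: lines are held out in some environments but observed in others."""
    mask = rng.random((n_lines, n_envs)) < test_fraction
    # Make sure every line remains observed in at least one environment.
    fully_masked = np.flatnonzero(mask.all(axis=1))
    keep = rng.integers(0, n_envs, size=fully_masked.size)
    mask[fully_masked, keep] = False
    return mask

# Dimensions taken from the abstract: 287 lines in 18 environments.
rng = np.random.default_rng(2)
m1 = cv1_mask(287, 18, test_fraction=0.2, rng=rng)
m2 = cv2_mask(287, 18, test_fraction=0.2, rng=rng)
```

Under CV1 a test line contributes no phenotypes at all to training, so prediction relies on relatives only; under CV2 the same line's records from other environments remain available, which is consistent with the higher prediction abilities reported for CV2.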


Subjects
Gene-Environment Interaction , Genomics , Genetic Selection , Triticum/genetics , Northern Africa , Asia , Breeding , Plant Genome , Genotype , Mexico , Pedigree , Phenotype , Heritable Quantitative Trait , Seasons , Triticum/growth & development
14.
G3 (Bethesda) ; 6(7): 1819-34, 2016 07 07.
Article in English | MEDLINE | ID: mdl-27172218

ABSTRACT

This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH), and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G × E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, "diversity" and "prediction", including 10% and 20%, respectively, of the total collections. Accounting for population structure decreased prediction accuracy by 15-20% as compared to prediction accuracy obtained when not accounting for population structure. Accounting for population structure gave prediction accuracies for traits evaluated in one environment for TRN20-TST80 that ranged from 0.407 to 0.677 for Mexican landraces, and from 0.166 to 0.662 for Iranian landraces. Prediction accuracy of the 20% diversity core set was similar to accuracies obtained for TRN20-TST80, ranging from 0.412 to 0.654 for Mexican landraces, and from 0.182 to 0.647 for Iranian landraces. The predictive core set gave similar prediction accuracy as the diversity core set for Mexican collections, but slightly lower for Iranian collections. Prediction accuracy when incorporating G × E for DTH and DTM for Mexican landraces for TRN20-TST80 was around 0.60, which is greater than without the G × E term. For Iranian landraces, accuracies were 0.55 for the G × E model with TRN20-TST80. Results show promising prediction accuracies for potential use in germplasm enhancement and rapid introgression of exotic germplasm into elite materials.


Subjects
Plant Genome , Statistical Models , Heritable Quantitative Trait , Triticum/genetics , Physiological Adaptation/genetics , Droughts , Gene-Environment Interaction , Genotype , Hot Temperature , Iran , Mexico , Genetic Models , Phenotype , Genetic Selection , Physiological Stress , Triticum/classification
15.
G3 (Bethesda) ; 6(5): 1165-77, 2016 05 03.
Article in English | MEDLINE | ID: mdl-26921298

ABSTRACT

Genomic tools allow the study of the whole genome, and facilitate the study of genotype-environment combinations and their relationship with phenotype. However, most genomic prediction models developed so far are appropriate for Gaussian phenotypes. For this reason, appropriate genomic prediction models are needed for count data, since the conventional regression models used on count data with a large sample size (n) and a small number of parameters (p) cannot be used for genomic-enabled prediction where the number of parameters (p) is larger than the sample size (n). Here, we propose a Bayesian mixed-negative binomial (BMNB) genomic regression model for counts that takes into account genotype by environment (G × E) interaction. We also provide all the full conditional distributions to implement a Gibbs sampler. We evaluated the proposed model using a simulated data set, and a real wheat data set from the International Maize and Wheat Improvement Center (CIMMYT) and collaborators. Results indicate that our BMNB model provides a viable option for analyzing count data.


Subjects
Bayes Theorem , Environment , Gene-Environment Interaction , Genomics , Genotype , Genetic Models , Algorithms , Genetic Association Studies , Genomics/methods , Statistical Models , Phenotype , Triticum/genetics
16.
Genetics ; 199(3): 675-81, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25567991

ABSTRACT

Quality control filtering of single-nucleotide polymorphisms (SNPs) is a key step when analyzing genomic data. Here we present a practical method to identify low-quality SNPs, meaning markers whose genotypes are wrongly assigned for a large proportion of individuals, by estimating the heritability of gene content at each marker, where gene content is the number of copies of a particular reference allele in a genotype of an animal (0, 1, or 2). If there is no mutation at the marker, gene content has an additive heritability of 1 by construction. The method uses restricted maximum likelihood (REML) to estimate heritability of gene content at each SNP and also builds a likelihood-ratio test statistic to test for zero error variance in genotyping. As a by-product, estimates of the allele frequencies of markers at the base population are obtained. Using simulated data with 10% permutation error (4% actual error) in genotyping, the method had a specificity of 0.96 (4% of correct markers are rejected) and a sensitivity of 0.99 (1% of wrong markers are accepted) if markers with heritability lower than 0.975 are discarded. Checking of Mendelian errors resulted in a lower sensitivity (0.84) for the same simulation. The proposed method is further illustrated with a real data set with genotypes from 3534 animals genotyped for 50,433 markers from the Illumina PorcineSNP60 chip and a pedigree of 6473 individuals; those markers underwent very little quality control. A total of 4099 markers with P-values lower than 0.01 were discarded based on our method, with associated estimates of heritability as low as 0.12. Contrary to other techniques, our method uses all information in the population simultaneously, can be used in any population with markers and pedigree recordings, and is simple to implement using standard software for REML estimation. Scripts for its use are provided.


Subjects
Genomics/standards , Genotyping Techniques/standards , Genetic Models , Pedigree , Single Nucleotide Polymorphism , Animals , Genetic Markers , Genotyping Techniques/methods , Genotyping Techniques/statistics & numerical data , Humans , Likelihood Functions , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/standards , Quality Control , Sensitivity and Specificity , Sus scrofa
17.
G3 (Bethesda) ; 3(12): 2105-14, 2013 Dec 09.
Article in English | MEDLINE | ID: mdl-24082033

ABSTRACT

In crop breeding, interest in predicting the performance of candidate cultivars in the field has increased due to recent advances in molecular breeding technologies. However, the complexity of the wheat genome presents some challenges for applying new technologies in molecular marker identification with next-generation sequencing. We applied genotyping-by-sequencing, a recently developed method to identify single-nucleotide polymorphisms, in the genomes of 384 wheat (Triticum aestivum) genotypes that were field tested under three different water regimes in Mediterranean climatic conditions: rain-fed only, mild water stress, and fully irrigated. We identified 102,324 single-nucleotide polymorphisms in these genotypes, and the phenotypic data were used to train and test genomic selection models intended to predict yield, thousand-kernel weight, number of kernels per spike, and heading date. Phenotypic data showed marked spatial variation. Therefore, different models were tested to correct the trends observed in the field. A mixed model using moving means as a covariate was found to best fit the data. When we applied the genomic selection models, the accuracy of predicted traits increased with spatial adjustment. Multiple genomic selection models were tested, and a Gaussian kernel model was determined to give the highest accuracy. The best predictions between environments were obtained when data from different years were used to train the model. Our results confirm that genotyping-by-sequencing is an effective tool to obtain genome-wide information for crops with complex genomes, that these data are efficient for predicting traits, and that correction of spatial variation is a crucial ingredient to increase prediction accuracy in genomic selection models.


Subjects
Breeding/methods , Genetic Models , Triticum/genetics , Phenotype , Phylogeny , Single Nucleotide Polymorphism , Quantitative Trait Loci , Genetic Selection , Sequence Alignment
18.
G3 (Bethesda) ; 3(11): 1903-26, 2013 Nov 06.
Article in English | MEDLINE | ID: mdl-24022750

ABSTRACT

Genotyping-by-sequencing (GBS) technologies have proven capacity for delivering large numbers of marker genotypes with potentially less ascertainment bias than standard single nucleotide polymorphism (SNP) arrays. Therefore, GBS has become an attractive alternative technology for genomic selection. However, the use of GBS data poses important challenges, and the accuracy of genomic prediction using GBS is currently undergoing investigation in several crops, including maize, wheat, and cassava. The main objective of this study was to evaluate various methods for incorporating GBS information and compare them with pedigree models for predicting genetic values of lines from two maize populations evaluated for different traits measured in different environments (experiments 1 and 2). Given that GBS data come with a large percentage of uncalled genotypes, we evaluated methods using nonimputed, imputed, and GBS-inferred haplotypes of different lengths (short or long). GBS and pedigree data were incorporated into statistical models using either the genomic best linear unbiased predictors (GBLUP) or the reproducing kernel Hilbert spaces (RKHS) regressions, and prediction accuracy was quantified using cross-validation methods. 
The following results were found: relative to pedigree or marker-only models, there were consistent gains in prediction accuracy by combining pedigree and GBS data; there was increased predictive ability when using imputed or nonimputed GBS data over inferred haplotype in experiment 1, or nonimputed GBS and information-based imputed short and long haplotypes, as compared to the other methods in experiment 2; the level of prediction accuracy achieved using GBS data in experiment 2 is comparable to those reported by previous authors who analyzed this data set using SNP arrays; and GBLUP and RKHS models with pedigree with nonimputed and imputed GBS data provided the best prediction correlations for the three traits in experiment 1, whereas for experiment 2 RKHS provided slightly better prediction than GBLUP for drought-stressed environments, and both models provided similar predictions in well-watered environments.


Subjects
Plant Genome , Zea mays/genetics , Breeding , Chromosomes/chemistry , Chromosomes/metabolism , Genotype , Haplotypes , Genetic Models , Phenotype , Single Nucleotide Polymorphism , DNA Sequence Analysis