Results 1 - 20 of 60
1.
Genet Sel Evol ; 54(1): 78, 2022 Dec 02.
Article in English | MEDLINE | ID: mdl-36460973

ABSTRACT

BACKGROUND: Selection schemes distort inference when estimating differences between treatments or genetic associations between traits, and may degrade prediction of outcomes, e.g., the expected performance of the progeny of an individual with a certain genotype. If input and output measurements are not collected on random samples, inferences and predictions must be biased to some degree. Our paper revisits inference in quantitative genetics when using samples stemming from some selection process. The approach used integrates the classical notion of fitness with that of missing data. Treatment is fully Bayesian, with inference and prediction dealt with in a unified manner. While focus is on animal and plant breeding, concepts apply to natural selection as well. Examples based on real data and stylized models illustrate how selection can be accounted for in four different situations, and sometimes without success. RESULTS: Our flexible "soft selection" setting helps to diagnose the extent to which selection can be ignored. The clear connection between probability of missingness and the concept of fitness in stylized selection scenarios is highlighted. It is not realistic to assume that a fixed selection threshold t holds in conceptual replication, as the chance of selection depends on observed and unobserved data, and on unequal amounts of information over individuals, aspects that a "soft" selection representation addresses explicitly. There does not seem to be a general prescription to accommodate potential distortions due to selection. In structures that combine cross-sectional, longitudinal and multi-trait data such as in animal breeding, balance is the exception rather than the rule. The Bayesian approach provides an integrated answer to inference, prediction and model choice under selection that goes beyond the likelihood-based approach, where breeding values are inferred indirectly. 
CONCLUSIONS: The approach used here for inference and prediction under selection may or may not yield the best possible answers. One may believe that selection has been accounted for diligently, but the central problem of whether statistical inferences are good or bad does not have an unambiguous solution. On the other hand, the quality of predictions can be gauged empirically via appropriate training-testing of competing methods.


Subjects
Genomics , Animals , Bayes Theorem , Cross-Sectional Studies , Likelihood Functions , Phenotype
2.
Genet Sel Evol ; 54(1): 72, 2022 Oct 31.
Article in English | MEDLINE | ID: mdl-36316629

ABSTRACT

BACKGROUND: Single-step genomic best linear unbiased prediction (GBLUP) involves a joint analysis of individuals with genotype information, and their ancestors, descendants, or contemporaries, without recorded genotypes. Livestock applications typically represent populations with fewer individuals with genotypes relative to the number not genotyped. Most breeding programmes are structured, consisting of a nucleus tier in which selection drives genetic gains that are propagated through descendants that represent parents in multiplier and commercial tiers. In some cases, the genotypes in the nucleus tier are proprietary to a breeding company, and not publicly available for a whole industry analysis. Bayesian inference involves combining a defined description of prior information with new information to generate a posterior distribution that contains all available information on parameters of interest. A natural extension of Bayesian analysis would be to use information from the posterior distribution to define the prior distribution in a subsequent analysis. METHODS: We derive the mixed model equations for inference on breeding values for non-genotyped individuals in that subset of the population that is of current interest, using only data on the performance of current individuals and their immediate pedigree, along with prior information defined from the posterior distribution of an external BLUP or single-step GBLUP analysis of the ancestors of the current population. DISCUSSION: Estimates of breeding values and their prediction error covariances for current animals of interest in the multiplier or commercial tier, identical to those from a single analysis using pedigree, performance, and genomic information from all tiers, can be obtained without requiring either the genomic relationship matrix or the genotypes of any of their ancestors in the nucleus tier. The Bayesian analysis of the current population does not require explicit information on unselected genotyped animals in the external population.
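
The idea of reusing a posterior as the prior of a subsequent analysis can be illustrated with a toy case. The sketch below is not the mixed model equations derived in the paper; it assumes a single unknown mean with known residual variance, and all names and numbers (`normal_update`, the prior mean 0 and variance 10, the two simulated data sets) are hypothetical. It only shows that a two-stage analysis can reproduce the joint analysis exactly.

```python
import numpy as np

def normal_update(prior_mean, prior_var, y, noise_var):
    """Posterior mean and variance of a normal mean, given data y with known noise variance."""
    post_precision = 1.0 / prior_var + len(y) / noise_var
    post_mean = (prior_mean / prior_var + np.sum(y) / noise_var) / post_precision
    return post_mean, 1.0 / post_precision

rng = np.random.default_rng(1)
external = rng.normal(2.0, 1.0, size=30)   # hypothetical "external" (e.g. nucleus) records
current = rng.normal(2.0, 1.0, size=10)    # hypothetical records on the current population

# Stage 1: analyse the external data; Stage 2: use that posterior as the prior.
m1, v1 = normal_update(0.0, 10.0, external, 1.0)
m2, v2 = normal_update(m1, v1, current, 1.0)

# Reference: one joint analysis of all records.
m_all, v_all = normal_update(0.0, 10.0, np.concatenate([external, current]), 1.0)

print(m2, m_all)   # the two posterior means agree
print(v2, v_all)   # and so do the posterior variances
```

In this conjugate setting the second stage needs only the posterior mean and variance from the first stage, not the external records themselves, which mirrors the motivation stated in the abstract.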


Subjects
Genome , Genomics , Animals , Bayes Theorem , Genotype , Genomics/methods , Pedigree , Genetic Models , Phenotype
3.
Genet Sel Evol ; 54(1): 12, 2022 Feb 08.
Article in English | MEDLINE | ID: mdl-35135468

ABSTRACT

BACKGROUND: Linkage disequilibrium (LD) is commonly measured based on the squared coefficient of correlation (r²) between the alleles at two loci that are carried by haplotypes. LD can also be estimated as the r² between unphased genotype dosages at two loci when the allele frequencies and inbreeding coefficients at both loci are identical for the parental lines. Here, we investigated whether r² for a crossbred population (F1) can be estimated using genotype data. The parental lines of the crossbred (F1) can be purebred or crossbred. METHODS: We approached this by first showing that inbreeding coefficients for an F1 crossbred population are negative, and typically differ in size between loci. Then, we proved that the r² computed from unphased genotype data is expected to be identical to the r² computed from haplotype data for an F1 crossbred population, regardless of the inbreeding coefficients at the two loci. Finally, we investigated the bias and precision of the r² estimated using unphased genotype versus haplotype data in stochastic simulation. RESULTS: Our findings show that estimates of r² based on haplotype and unphased genotype data are both unbiased for different combinations of allele frequencies, sample sizes (900, 1800, and 2700), and levels of LD. In general, for any allele frequency combination and r² value scenarios considered, and for both methods to estimate r², the precision of the estimates increased, and the bias of the estimates decreased as sample size increased, indicating that both estimators are consistent. For a given scenario, the r² estimates were more precise and less biased using haplotype data than using unphased genotype data. As sample size increased, the difference in precision and bias between the r² estimates using haplotype data and unphased genotype data decreased. CONCLUSIONS: Our theoretical derivations showed that estimates of LD between loci based on unphased genotypes and haplotypes in F1 crossbreds have identical expectations. Based on our simulation results, we conclude that the LD for an F1 crossbred population can be accurately estimated from unphased genotype data. The results also apply for other crosses (F2, F3, Fn, BC1, BC2, and BCn), as long as (selected) individuals from the two parental lines mate randomly.
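
As a rough illustration of the two estimators compared above, the following sketch computes r² once from phased haplotypes and once from unphased genotype dosages in a simulated F1. It is not the simulation design of the paper: it assumes, for simplicity, that both parental lines share the same hypothetical haplotype frequencies, and the function names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-locus haplotype frequencies (AB, Ab, aB, ab), shared by both lines.
hap_freq = np.array([0.4, 0.1, 0.1, 0.4])
hap_alleles = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])

def sample_haplotypes(n):
    """Draw n haplotypes; each row holds the allele (0/1) at locus 1 and locus 2."""
    return hap_alleles[rng.choice(4, size=n, p=hap_freq)]

def r_squared(x, y):
    """Squared Pearson correlation between two vectors."""
    return np.corrcoef(x, y)[0, 1] ** 2

n = 20000
from_line_a = sample_haplotypes(n)   # haplotype inherited from parental line A
from_line_b = sample_haplotypes(n)   # haplotype inherited from parental line B

# Haplotype-based estimate: pool the 2n phased haplotypes of the F1 animals.
haplotypes = np.vstack([from_line_a, from_line_b])
r2_hap = r_squared(haplotypes[:, 0], haplotypes[:, 1])

# Genotype-based estimate: only unphased dosages (0, 1, 2) are used.
genotypes = from_line_a + from_line_b
r2_geno = r_squared(genotypes[:, 0], genotypes[:, 1])

print(r2_hap, r2_geno)   # both should be near the population value D^2 / (p1 q1 p2 q2) = 0.36
```

With these frequencies D = 0.4 - 0.5 × 0.5 = 0.15, so the population r² is 0.15² / 0.0625 = 0.36; both estimates fluctuate around that value.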


Subjects
Genetic Models , Single Nucleotide Polymorphism , Gene Frequency , Genotype , Haplotypes , Humans , Linkage Disequilibrium
4.
Genet Sel Evol ; 53(1): 91, 2021 Dec 07.
Article in English | MEDLINE | ID: mdl-34875996

ABSTRACT

BACKGROUND: The possibility of using antibody response (S/P ratio) to PRRSV vaccination measured in crossbred commercial gilts as a genetic indicator for reproductive performance in vaccinated crossbred sows has motivated further studies of the genomic basis of this trait. In this study, we investigated the association of haplotypes and runs of homozygosity (ROH) and heterozygosity (ROHet) with S/P ratio and their impact on reproductive performance. RESULTS: There was no association (P-value ≥ 0.18) of S/P ratio with the percentage of ROH or ROHet, or with the percentage of heterozygosity across the whole genome or in the major histocompatibility complex (MHC) region. However, specific ROH and ROHet regions were significantly associated (P-value ≤ 0.01) with S/P ratio on chromosomes 1, 4, 5, 7, 10, 11, 13, and 17 but not (P-value ≥ 0.10) with reproductive performance. With the haplotype-based genome-wide association study (GWAS), additional genomic regions associated with S/P ratio were identified on chromosomes 4, 7, and 9. These regions harbor immune-related genes, such as SLA-DOB, TAP2, TAPBP, TMIGD3, and ADORA. Four haplotypes at the identified region on chromosome 7 were also associated with multiple reproductive traits. A haplotype significantly associated with S/P ratio that is located in the MHC region may be in stronger linkage disequilibrium (LD) with the quantitative trait loci (QTL) than the previously identified single nucleotide polymorphism (SNP) (H3GA0020505) given the larger estimate of genetic variance explained by the haplotype than by the SNP. CONCLUSIONS: Specific ROH and ROHet regions were significantly associated with S/P ratio. The haplotype-based GWAS identified novel QTL for S/P ratio on chromosomes 4, 7, and 9 and confirmed the presence of at least one QTL in the MHC region. The chromosome 7 region was also associated with reproductive performance. 
These results narrow the search for causal genes in this region and suggest SLA-DOB and TAP2 as potential candidate genes associated with S/P ratio on chromosome 7. These results provide additional opportunities for marker-assisted selection and genomic selection for S/P ratio as a genetic indicator for litter size in commercial pig populations.


Subjects
Porcine Respiratory and Reproductive Syndrome Virus , Animals , Antibody Formation , Female , Genome-Wide Association Study , Genomics , Haplotypes , Quantitative Trait Loci , Sus scrofa/genetics , Swine/genetics , Vaccination
5.
J Anim Breed Genet ; 138(5): 519-527, 2021 Sep.
Article in English | MEDLINE | ID: mdl-33729622

ABSTRACT

Empirical estimates of the accuracy of estimates of breeding values (EBV) can be obtained by cross-validation. Leave-one-out cross-validation (LOOCV) is an extreme case of k-fold cross-validation. Efficient strategies for LOOCV of predictions of phenotypes have been developed for a simple model with an overall mean and random marker or animal genetic effects. The objective here was to develop and evaluate an efficient LOOCV method for prediction of breeding values and other random effects under a general mixed linear model with multiple random effects. Conventional LOOCV of EBV requires inverting an (n-1)×(n-1) covariance matrix for each of n (= number of observations) data sets. Our efficient LOOCV obtains the required inverses from the inverse of the covariance matrix for all n observations. The efficient method can be applied to complex models with multiple fixed and random effects, but requires fixed effects to be treated as random, with large variances. An alternative is to precorrect observations using estimates of fixed effects obtained from the complete data, but this can lead to biases. The efficient LOOCV method was compared to conventional LOOCV of predictions of breeding values in terms of computational demands and accuracy. For a data set with 3,205 observations and a model with multiple random and fixed effects, the efficient LOOCV method was 962 times faster than the conventional LOOCV with precorrection for fixed effects based on each training data set but resulted in identical EBV. A computationally efficient LOOCV for prediction of breeding values for single- and multiple-trait mixed models with multiple fixed and random effects was successfully developed. The method enables cross-validation of predictions of breeding values and of any linear combination of random and/or fixed effects, along with leave-one-out precorrection of validation phenotypes.
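
The contrast between conventional and efficient LOOCV can be sketched for a deliberately simplified case. The code below assumes a zero-mean model y ~ N(0, V) with no fixed effects and predicts the left-out phenotype rather than an EBV, so it is not the general method of the paper; it uses the standard identity that the leave-one-out prediction equals y_i - [V⁻¹y]_i / [V⁻¹]_ii. The genotype matrix, variance components, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 40, 200

# Hypothetical marker genotypes and a genomic relationship matrix K.
M = rng.binomial(2, 0.3, size=(n, m)).astype(float)
M -= M.mean(axis=0)
K = M @ M.T / m

# Assumed variance components: 0.5 genetic, 0.5 residual.
V = 0.5 * K + 0.5 * np.eye(n)
y = rng.multivariate_normal(np.zeros(n), V)

def loocv_conventional(V, y):
    """Solve an (n-1) x (n-1) system for each left-out observation."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        preds[i] = V[i, keep] @ np.linalg.solve(V[np.ix_(keep, keep)], y[keep])
    return preds

def loocv_efficient(V, y):
    """Obtain all n leave-one-out predictions from one inverse of the full V."""
    V_inv = np.linalg.inv(V)
    return y - (V_inv @ y) / np.diag(V_inv)

print(np.max(np.abs(loocv_conventional(V, y) - loocv_efficient(V, y))))
```

The conventional loop costs n solves of size n-1, whereas the shortcut needs a single n × n inverse, which is the source of the speed-up in this simplified setting.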


Subjects
Breeding , Genetic Models , Animals , Genotype , Linear Models , Phenotype
6.
Theor Popul Biol ; 132: 47-59, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31830483

ABSTRACT

Modeling covariance structure based on genetic similarity between pairs of relatives plays an important role in evolutionary, quantitative and statistical genetics. Historically, genetic similarity between individuals has been quantified from pedigrees via the probability that randomly chosen homologous alleles between individuals are identical by descent (IBD). At present, however, many genetic analyses rely on molecular markers, with realized measures of genomic similarity replacing IBD-based expected similarities. Animal and plant breeders, for example, now employ marker-based genomic relationship matrices between individuals in prediction models and in estimation of genome-based heritability coefficients. Phenotypes convey information about genetic similarity as well. For instance, if phenotypic values are at least partially the result of the action of quantitative trait loci, one would expect the former to inform about the latter, as in genome-wide association studies. Statistically, a non-trivial conditional distribution of unknown genetic similarities, given phenotypes, is to be expected. A Bayesian formalism is presented here that applies to whole-genome regression methods where some genetic similarity matrix, e.g., a genomic relationship matrix, can be defined. Our Bayesian approach, based on phenotypes and markers, converts prior (markers only) expected similarity into trait-specific posterior similarity. A simulation illustrates situations under which effective Bayesian learning from phenotypes occurs. Pinus and wheat data sets were used to demonstrate applicability of the concept in practice. The methodology applies to a wide class of Bayesian linear regression models, it extends to the multiple-trait domain, and can also be used to develop phenotype-guided similarity kernels in prediction problems.


Subjects
Genome-Wide Association Study , Genetic Models , Quantitative Trait Loci , Bayes Theorem , Genotype , Phenotype , Pinus/genetics , Single Nucleotide Polymorphism , Triticum/genetics
7.
J Dairy Sci ; 102(11): 10039-10055, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31477308

ABSTRACT

Vitamin A is essential for human health, but current intake levels in many developing countries such as India are too low due to malnutrition. According to the World Health Organization, an estimated 250 million preschool children are vitamin A deficient globally. This number excludes pregnant women and nursing mothers, who are particularly vulnerable. Efforts to improve access to vitamin A are key because supplementation can reduce mortality rates in young children in developing countries by around 23%. Three key genes, BCMO1, BCO2, and SCARB1, have been shown to be associated with the amount of β-carotene (BC) in milk. Whole-genome sequencing reads from the coordinates of these 3 genes in 202 non-Indian cattle (141 Bos taurus, 61 Bos indicus) and 35 non-Indian buffalo (Bubalus bubalis) animals from several breeds were collected from data repositories. The number of SNP detected in the coding regions of these 3 genes ranged from 16 to 26 in the 3 species, with 5 overlapping SNP between B. taurus and B. indicus. All these SNP together with 2 SNP in the upstream part of the gene but already present in dbSNP (https://www.ncbi.nlm.nih.gov/projects/SNP/) were used to build a custom Sequenom array. Blood for DNA and milk samples for BC were obtained from 2,291 Indian cows of 5 different breeds (Gir, Holstein cross, Jersey cross, Tharparkar, and Sahiwal) and 2,242 Indian buffaloes (Jafarabadi, Murrah, Pandharpuri, and Surti breeds). The DNA was extracted and genotyped with the Sequenom array. For each individual breed and the combined breeds, SNP with an association that had a P-value < 0.3 in the first round of linear analysis were included in a second step of regression analyses to determine allele substitution effects to increase the content of BC in milk. Additionally, an F-test for all SNP within each gene was performed with the objective of determining whether the gene overall had a significant effect on the content of BC in milk. 
The analyses were repeated using a Bayesian approach to compare and validate the previous frequentist results. Multiple significant SNP were found using both methodologies with allele substitution effects ranging from 6.21 (3.13) to 9.10 (5.43) µg of BC per 100 mL of milk. Total gene effects exceeded the mean BC value for all breeds with both analysis approaches. The custom panel designed for genes related to BC production demonstrated applicability in genotyping of cattle and buffalo in India and may be used for cattle or buffalo from other developing countries. Moreover, the recommendation of selection for significant specific alleles of some gene markers provides a route to effectively increase the BC content in milk in the Indian cattle and buffalo populations.


Subjects
Buffaloes/genetics , Cattle/genetics , Genetic Markers , Milk/chemistry , beta Carotene/analysis , Alleles , Animals , Female , Genotype , India , Single Nucleotide Polymorphism , Pregnancy , Species Specificity , beta Carotene/genetics
8.
J Anim Breed Genet ; 136(2): 113-117, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30614572

ABSTRACT

A curious result from mixed linear models applied to genome-wide association studies was expanded. In particular, a model in which one or more markers are considered as fixed but are allowed to contribute to the covariance structure by treating such markers as random as well was examined. The best linear unbiased estimator of marker effects is invariant with respect to whether those markers are employed in constructing a genomic relationship matrix or are ignored, provided marker effects are uncorrelated with those not being tested. Also, the implications of regarding some marker effects as fixed when, in fact, these possess a non-trivial covariance structure with those declared as random were examined.
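
The invariance described above can be checked numerically. The sketch below is an illustration under assumed values, not the derivation in the paper: it computes the generalized least squares (GLS) estimate of a tested marker's effect twice, once with that marker left out of the covariance structure and once with it added as a rank-one term. The genotypes, the variance ratio 0.02, and the helper `gls` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 60, 30

Z = rng.binomial(2, 0.4, size=(n, m)).astype(float)   # markers fitted as random
x = rng.binomial(2, 0.4, size=n).astype(float)        # tested marker, fitted as fixed
X = np.column_stack([np.ones(n), x])                  # intercept + tested marker
y = 0.5 * x + rng.normal(size=n)                      # simulated phenotypes

def gls(X, V, y):
    """Generalized least squares estimate (X' V^-1 X)^-1 X' V^-1 y for symmetric V."""
    Vi_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vi_X, Vi_X.T @ y)

# Covariance structure built without the tested marker ...
V_without = 0.02 * (Z @ Z.T) + np.eye(n)
# ... and with the tested marker also contributing as a random effect.
V_with = V_without + 0.02 * np.outer(x, x)

beta_without = gls(X, V_without, y)
beta_with = gls(X, V_with, y)
print(beta_without, beta_with)   # the two estimates coincide up to rounding
```

The agreement follows because adding a term of the form c·xx', with x a column of X, rescales X'V⁻¹ by an invertible matrix that cancels in the GLS formula. The example keeps the tested marker's random effect uncorrelated with the other marker effects, which is the condition stated in the abstract.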


Subjects
Genome-Wide Association Study/statistics & numerical data , Linear Models , Genetic Models , Statistical Models , Animals , Breeding , Genome/genetics , Genomics , Single Nucleotide Polymorphism
9.
Genet Sel Evol ; 50(1): 32, 2018 Jun 19.
Article in English | MEDLINE | ID: mdl-29914353

ABSTRACT

BACKGROUND: Population stratification and cryptic relationships have been the main sources of excessive false-positives and false-negatives in population-based association studies. Many methods have been developed to model these confounding factors and minimize their impact on the results of genome-wide association studies. In most of these methods, a two-stage approach is applied where: (1) methods are used to determine if there is a population structure in the sample dataset and (2) the effects of population structure are corrected either by modeling it or by running a separate analysis within each sub-population. The objective of this study was to evaluate the impact of population structure on the accuracy and power of genome-wide association studies using a Bayesian multiple regression method. METHODS: We conducted a genome-wide association study in a stochastically simulated admixed population. The genome was composed of six chromosomes, each with 1000 markers. Fifteen segregating quantitative trait loci contributed to the genetic variation of a quantitative trait with heritability of 0.30. The impact of genetic relationships and breed composition (BC) on three analysis methods was evaluated: single marker simple regression (SMR), single marker mixed linear model (MLM) and Bayesian multiple-regression analysis (BMR). Each method was fitted with and without BC. Accuracy, power, false-positive rate and the positive predictive value of each method were calculated and used for comparison. RESULTS: SMR and BMR, both without BC, were ranked as the worst and the best performing approaches, respectively. Our results showed that, while explicit modeling of genetic relationships and BC is essential for models SMR and MLM, BMR can disregard them and yet result in a higher power without compromising its false-positive rate. 
CONCLUSIONS: This study showed that the Bayesian multiple-regression analysis is robust to population structure and to relationships among study subjects and performs better than a single marker mixed linear model approach.
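
The comparison criteria named in the abstract (power, false-positive rate, positive predictive value) can be computed from a simple confusion table. The sketch below assumes one true/false label and one significant/not-significant call per marker; how the study itself defined a detected QTL is not stated in the abstract, and the function name and toy inputs are hypothetical.

```python
def gwas_summary(is_causal, is_significant):
    """Power, false-positive rate and positive predictive value from per-marker calls."""
    pairs = list(zip(is_causal, is_significant))
    tp = sum(1 for c, s in pairs if c and s)          # causal and declared significant
    fn = sum(1 for c, s in pairs if c and not s)      # causal but missed
    fp = sum(1 for c, s in pairs if not c and s)      # non-causal but declared significant
    tn = sum(1 for c, s in pairs if not c and not s)  # non-causal and not declared
    return {
        "power": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "ppv": tp / (tp + fp) if (tp + fp) > 0 else float("nan"),
    }

# Toy example: 3 causal markers, 5 non-causal markers.
is_causal      = [True, True, True, False, False, False, False, False]
is_significant = [True, True, False, True, False, False, False, False]

summary = gwas_summary(is_causal, is_significant)
print(summary)   # power 2/3, false-positive rate 1/5, PPV 2/3
```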


Subjects
Chromosome Mapping/veterinary , Genetic Variation , Genome-Wide Association Study/methods , Heritable Quantitative Trait , Animals , Bayes Theorem , Breeding , Population Genetics , Genome Size , Linear Models , Genetic Models , Population Density
10.
Genet Sel Evol ; 48(1): 80, 2016 Oct 27.
Article in English | MEDLINE | ID: mdl-27788669

ABSTRACT

BACKGROUND: The mixed linear model employed for genomic best linear unbiased prediction (GBLUP) includes the breeding value for each animal as a random effect that has a mean of zero and a covariance matrix proportional to the genomic relationship matrix (G), where the inverse of G is required to set up the usual mixed model equations (MME). When only some animals have genomic information, genomic predictions can be obtained by an extension known as single-step GBLUP, where the covariance matrix of breeding values is constructed by combining the pedigree-based additive relationship matrix with G. The inverse of the combined relationship matrix can be obtained efficiently, provided G can be inverted. In some livestock species, however, the number n of animals with genomic information exceeds the number of marker covariates used to compute G, and this results in a singular G. For such a case, an efficient and exact method to obtain GBLUP and single-step GBLUP is presented here. RESULTS: Exact methods are already available to obtain GBLUP when G is singular, but these require working with large dense matrices. Another approach is to modify G to make it nonsingular by adding a small value to all its diagonals or regressing it towards the pedigree-based relationship matrix. This, however, results in the inverse of G being dense and difficult to compute as n grows. The approach presented here recognizes that the number r of linearly independent genomic breeding values cannot exceed the number of marker covariates, and the mixed linear model used here for genomic prediction only fits these r linearly independent breeding values as random effects. 
CONCLUSIONS: The exact method presented here was compared to Apy-GBLUP and to Apy single-step GBLUP, both of which are approximate methods that use a modified G that has a sparse inverse which can be computed efficiently. In a small numerical example, predictions from the exact approach and Apy were almost identical, but the MME from Apy had a condition number about 1000 times larger than that from the exact approach, indicating ill-conditioning of the MME from Apy. The practical application of exact SSGBLUP is not more difficult than implementation of Apy.
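
The rank argument in the abstract is easy to verify numerically. The sketch below is a generic low-rank illustration, not the model of the paper: it builds a genomic relationship matrix from more animals than markers, confirms that its rank cannot exceed the number of markers, and uses a thin SVD (one possible choice, assumed here) to obtain r linearly independent columns that span all genomic breeding values. Sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_animals, n_markers = 50, 10   # more genotyped animals than marker covariates

M = rng.binomial(2, 0.5, size=(n_animals, n_markers)).astype(float)
M -= M.mean(axis=0)             # centre marker covariates
G = M @ M.T / n_markers         # genomic relationship matrix (n x n)

rank_G = np.linalg.matrix_rank(G, tol=1e-8)
print(rank_G, n_markers, n_animals)   # the rank is bounded by the number of markers, so G is singular

# Thin SVD of M: M = U diag(s) Vt. Breeding values M @ a lie in the span of U[:, :r].
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = int(np.sum(s > 1e-10 * s[0]))
T = U[:, :r] * s[:r]            # n x r basis; any breeding value vector equals T @ u

# The same covariance structure is recovered from only r random effects.
print(np.allclose(T @ T.T / n_markers, G))
```

Fitting the r-dimensional vector u with covariance proportional to the identity gives the same covariance for T @ u as G gives for the n breeding values, without ever inverting the singular G.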


Subjects
Genomics/methods , Linear Models , Genetic Models , Animals , Computer Simulation , Genome , Livestock/genetics , Pedigree , Selective Breeding/genetics
11.
Genet Sel Evol ; 48(1): 96, 2016 Dec 08.
Article in English | MEDLINE | ID: mdl-27931187

ABSTRACT

BACKGROUND: Two types of models have been used for single-step genomic prediction and genome-wide association studies that include phenotypes from both genotyped animals and their non-genotyped relatives. The two types are breeding value models (BVM) that fit breeding values explicitly and marker effects models (MEM) that express the breeding values in terms of the effects of observed or imputed genotypes. MEM can accommodate a wider class of analyses, including variable selection or mixture model analyses. The order of the equations that need to be solved and the inverses required in their construction vary widely, and thus the computational effort required depends upon the size of the pedigree, the number of genotyped animals and the number of loci. THEORY: We present computational strategies to avoid storing large, dense blocks of the MME that involve imputed genotypes. Furthermore, we present a hybrid model that fits a MEM for animals with observed genotypes and a BVM for those without genotypes. The hybrid model is computationally attractive for pedigree files containing millions of animals with a large proportion of those being genotyped. APPLICATION: We demonstrate the practicality of both the original MEM and the hybrid model using real data with 6,179,960 animals in the pedigree with 4,934,101 phenotypes and 31,453 animals genotyped at 40,214 informative loci. To complete a single-trait analysis on a desktop computer with four graphics cards required about 3 h using the hybrid model to obtain both preconditioned conjugate gradient solutions and 42,000 Markov chain Monte-Carlo (MCMC) samples of breeding values, which allowed making inferences from posterior means, variances and covariances. The MCMC sampling required one quarter of the effort when the hybrid model was used compared to the published MEM. CONCLUSIONS: We present a hybrid model that fits a MEM for animals with genotypes and a BVM for those without genotypes. Its practicality and considerable reduction in computing effort were demonstrated. This model can readily be extended to accommodate multiple traits, multiple breeds, maternal effects, and additional random effects such as polygenic residual effects.


Subjects
Bayes Theorem , Computational Biology , Genetic Models , Regression Analysis , Algorithms , Animals , Computer Simulation
12.
Genet Sel Evol ; 48: 22, 2016 Mar 19.
Article in English | MEDLINE | ID: mdl-26992471

ABSTRACT

BACKGROUND: Genomic estimated breeding values (GEBV) based on single nucleotide polymorphism (SNP) genotypes are widely used in animal improvement programs. It is typically assumed that the larger the number of animals is in the training set, the higher is the prediction accuracy of GEBV. The aim of this study was to quantify genomic prediction accuracy depending on the number of ancestral generations included in the training set, and to determine the optimal number of training generations for different traits in an elite layer breeding line. METHODS: Phenotypic records for 16 traits on 17,793 birds were used. All parents and some selection candidates from nine non-overlapping generations were genotyped for 23,098 segregating SNPs. An animal model with pedigree relationships (PBLUP) and the BayesB genomic prediction model were applied to predict EBV or GEBV at each validation generation (progeny of the most recent training generation) based on varying numbers of immediately preceding ancestral generations. Prediction accuracy of EBV or GEBV was assessed as the correlation between EBV and phenotypes adjusted for fixed effects, divided by the square root of trait heritability. The optimal number of training generations that resulted in the greatest prediction accuracy of GEBV was determined for each trait. The relationship between optimal number of training generations and heritability was investigated. RESULTS: On average, accuracies were higher with the BayesB model than with PBLUP. Prediction accuracies of GEBV increased as the number of closely-related ancestral generations included in the training set increased, but reached an asymptote or slightly decreased when distant ancestral generations were used in the training set. The optimal number of training generations was 4 or more for high heritability traits but less than that for low heritability traits. 
For less heritable traits, limiting the training datasets to individuals closely related to the validation population resulted in the best predictions. CONCLUSIONS: The effect of adding distant ancestral generations in the training set on prediction accuracy differed between traits and the optimal number of necessary training generations is associated with the heritability of traits.
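
The accuracy measure described in the METHODS (correlation between predictions and adjusted phenotypes, divided by the square root of heritability) can be written as a short function. The simulated data below are purely hypothetical and serve only as a sanity check: when the "prediction" is the true breeding value itself, the measure should be close to 1.

```python
import numpy as np

def prediction_accuracy(ebv, adjusted_phenotype, heritability):
    """Correlation of (G)EBV with adjusted phenotypes, scaled by sqrt(h2)."""
    r = np.corrcoef(ebv, adjusted_phenotype)[0, 1]
    return r / np.sqrt(heritability)

rng = np.random.default_rng(0)
h2 = 0.4                                    # assumed heritability
n = 50000
true_bv = rng.normal(0.0, np.sqrt(h2), n)   # simulated true breeding values
phenotype = true_bv + rng.normal(0.0, np.sqrt(1.0 - h2), n)

# Perfect predictions: scaled accuracy is approximately 1.
acc_perfect = prediction_accuracy(true_bv, phenotype, h2)

# Noisy predictions: accuracy drops below 1.
noisy_ebv = true_bv + rng.normal(0.0, np.sqrt(h2), n)
acc_noisy = prediction_accuracy(noisy_ebv, phenotype, h2)

print(acc_perfect, acc_noisy)
```

Dividing by the square root of heritability rescales the correlation with a noisy phenotype into an estimate of the correlation with the unobserved breeding value, which is why traits with different heritabilities can be compared on the same scale.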


Subjects
Chickens/genetics , Genomics/methods , Pedigree , Single Nucleotide Polymorphism , Quantitative Trait Loci , Animals , Bayes Theorem , Breeding , Eggs/standards , Female , Genome , Genotype , Animal Models , Genetic Models , Phenotype , Genetic Selection
13.
Genet Sel Evol ; 47: 99, 2015 Dec 23.
Article in English | MEDLINE | ID: mdl-26698091

ABSTRACT

BACKGROUND: More accurate genomic predictions are expected when the effects of QTL (quantitative trait loci) are predicted from markers in close physical proximity to the QTL. The objective of this study was to quantify to what extent whole-genome methods using 50 K or imputed 770 K SNPs (single nucleotide polymorphisms) could predict single or multiple QTL genotypes based on SNPs in close proximity to those QTL. METHODS: Phenotypes with a heritability of 1 were simulated for 2677 Hereford animals genotyped with the BovineSNP50 BeadChip. Genotypes for the high-density 770 K SNP panel were imputed using Beagle software. Various Bayesian regression methods were used to predict single QTL or a trait influenced by 42 such QTL. We quantified to what extent these predictions were based on SNPs in close proximity to the QTL by comparing whole-genome predictions to local predictions based on estimates of the effects of variable numbers of SNPs i.e. ±1, ±2, ±5, ±10, ±50 or ±100 that flanked the QTL. RESULTS: Prediction accuracies based on local SNPs using whole-genome training for single QTL with the 50 K SNP panel and BayesC0 ranged from 0.49 (±1 SNP) to 0.75 (±100 SNPs). The minimum number of local SNPs for an accurate prediction is ±10 SNPs. Prediction accuracies that were based on local SNPs only were higher than those based on whole-genome SNPs for both 50 K and 770 K SNP panels. For the 770 K SNP panel, prediction accuracies were higher than 0.70 and varied little i.e. between 0.73 (±1 SNP) and 0.77 (±5 SNPs). For the summed 42 QTL, prediction accuracies were generally higher than for single QTL regardless of the number of local SNPs. For QTL with low minor allele frequency (MAF) compared to QTL with high MAF, prediction accuracies increased as the number of SNPs around the QTL increased. CONCLUSIONS: These results suggest that with both 50 K and imputed 770 K SNP genotypes the level of linkage disequilibrium is sufficient to predict single and multiple QTL. 
However, prediction accuracies are eroded through spuriously estimated effects of SNPs that are distant from the QTL. Prediction accuracies were higher with the 770 K than with the 50 K SNP panel.


Subjects
Genome , Genomics/methods , Genotype , Genetic Models , Multifactorial Inheritance , Phenotype , Quantitative Trait Loci , Algorithms , Animals , Bayes Theorem , Breeding , Cattle , Gene Frequency , Genetic Markers , Markov Chains , Statistical Models , Single Nucleotide Polymorphism , Heritable Quantitative Trait , Reproducibility of Results , Genetic Selection
14.
Genet Sel Evol ; 47: 80, 2015 Oct 14.
Article in English | MEDLINE | ID: mdl-26467850

ABSTRACT

BACKGROUND: In whole-genome analyses, the number p of marker covariates is often much larger than the number n of observations. Bayesian multiple regression models are widely used in genomic selection to address this problem of p ≫ n. The primary difference between these models is the prior assumed for the effects of the covariates. Usually in the BayesB method, a Metropolis-Hastings (MH) algorithm is used to jointly sample the marker effect and the locus-specific variance, which may make BayesB computationally intensive. In this paper, we show how the Gibbs sampler without the MH algorithm can be used for the BayesB method. RESULTS: We consider three different versions of the Gibbs sampler to sample the marker effect and locus-specific variance for each locus. Among the Gibbs samplers that were considered, the most efficient sampler is about 2.1 times as efficient as the MH algorithm proposed by Meuwissen et al. and 1.7 times as efficient as that proposed by Habier et al. CONCLUSIONS: The three Gibbs samplers presented here were twice as efficient as Metropolis-Hastings samplers and gave virtually the same results.


Subjects
Algorithms , Genome , Bayes Theorem , Computer Simulation , Genetic Variation , Genetic Models
15.
Genet Sel Evol ; 47: 59, 2015 Jul 07.
Article in English | MEDLINE | ID: mdl-26149977

ABSTRACT

BACKGROUND: Genomic selection (GS) using estimated breeding values (GS-EBV) based on dense marker data is a promising approach for genetic improvement. A simulation study was undertaken to illustrate the opportunities offered by GS for designing breeding programs. It consisted of a selection program for a sex-limited trait in layer chickens, which was developed by deterministic predictions under different scenarios. Later, one of the possible schemes was implemented in a real population of layer chickens. METHODS: In the simulation, the aim was to double the response to selection per year by reducing the generation interval by 50 %, while maintaining the same rate of inbreeding per year. We found that GS with retraining could achieve the set objectives while requiring 75 % fewer reared birds and 82 % fewer phenotyped birds per year. A multi-trait GS scenario was subsequently implemented in a real population of brown egg laying hens. The population was split into two sub-lines: one was submitted to conventional phenotypic selection, and one was selected based on genomic prediction. At the end of the 3-year experiment, the two sub-lines were compared for multiple performance traits that are relevant for commercial egg production. RESULTS: Birds that were selected based on genomic prediction outperformed those that were submitted to conventional selection for most of the 16 traits that were included in the index used for selection. However, although the two programs were designed to achieve the same rate of inbreeding per year, the realized inbreeding per year assessed from pedigree was higher in the genomic selected line than in the conventionally selected line. CONCLUSIONS: The results demonstrate that GS is a promising alternative to conventional breeding for genetic improvement of layer chickens.


Subjects
Chickens/genetics , Genetic Selection , Artificial Selection/genetics , Animals , Chickens/physiology , Genetic Models , Pedigree , Phenotype , Quantitative Trait Loci
16.
Genet Sel Evol ; 46: 50, 2014 Sep 22.
Article in English | MEDLINE | ID: mdl-25253441

ABSTRACT

BACKGROUND: To obtain predictions that are not biased by selection, the conditional mean of the breeding values must be computed given the data that were used for selection. When single nucleotide polymorphism (SNP) effects have a normal distribution, it can be argued that single-step best linear unbiased prediction (SS-BLUP) yields a conditional mean of the breeding values. Obtaining SS-BLUP, however, requires computing the inverse of the dense matrix G of genomic relationships, which will become infeasible as the number of genotyped animals increases. Also, computing G requires the frequencies of SNP alleles in the founders, which are not available in most situations. Furthermore, SS-BLUP is expected to perform poorly relative to variable selection models such as BayesB and BayesC as marker densities increase. METHODS: A strategy is presented for Bayesian regression models (SSBR) that combines all available data from genotyped and non-genotyped animals, as in SS-BLUP, but accommodates a wider class of models. Our strategy uses imputed marker covariates for animals that are not genotyped, together with an appropriate residual genetic effect to accommodate deviations between true and imputed genotypes. Under normality, one formulation of SSBR yields results identical to SS-BLUP, but does not require computing G or its inverse and provides richer inferences. At present, Bayesian regression analyses are used with a few thousand genotyped individuals. However, when SSBR is applied to all animals in a breeding program, there will be a 100 to 200-fold increase in the number of animals and an associated 100 to 200-fold increase in computing time. Parallel computing strategies can be used to reduce computing time. In one such strategy, a 58-fold speedup was achieved using 120 cores. DISCUSSION: In SSBR and SS-BLUP, phenotype, genotype and pedigree information are combined in a single step. Unlike SS-BLUP, SSBR is not limited to normally distributed marker effects; it can be used when marker effects have a t distribution, as in BayesA, or mixture distributions, as in BayesB or BayesC π. Furthermore, it has the advantage that matrix inversion is not required. We have investigated parallel computing to speed up SSBR analyses so they can be used for routine applications.


Subjects
Genetic Association Studies/veterinary , Genotype , Alleles , Animals , Bayes Theorem , Breeding , Gene Frequency , Genomics/methods , Genotyping Techniques/veterinary , Genetic Models , Pedigree , Phenotype , Single Nucleotide Polymorphism
17.
Genet Sel Evol ; 46: 37, 2014 Jun 09.
Article in English | MEDLINE | ID: mdl-24912924

ABSTRACT

BACKGROUND: Accuracy of genomic prediction depends on number of records in the training population, heritability, effective population size, genetic architecture, and relatedness of training and validation populations. Many traits have ordered categories including reproductive performance and susceptibility or resistance to disease. Categorical scores are often recorded because they are easier to obtain than continuous observations. Bayesian linear regression has been extended to the threshold model for genomic prediction. The objective of this study was to quantify reductions in accuracy for ordinal categorical traits relative to continuous traits. METHODS: Efficiency of genomic prediction was evaluated for heritabilities of 0.10, 0.25 or 0.50. Phenotypes were simulated for 2250 purebred animals using 50 QTL selected from actual 50k SNP (single nucleotide polymorphism) genotypes giving a proportion of causal to total loci of 0.0001. A Bayes C π threshold model simultaneously fitted all 50k markers except those that represented QTL. Estimated SNP effects were utilized to predict genomic breeding values in purebred (n = 239) or multibreed (n = 924) validation populations. Correlations between true and predicted genomic merit in validation populations were used to assess predictive ability. RESULTS: Accuracies of genomic estimated breeding values ranged from 0.12 to 0.66 for purebred and from 0.04 to 0.53 for multibreed validation populations based on Bayes C π linear model analysis of the simulated underlying variable. Accuracies for ordinal categorical scores analyzed by the Bayes C π threshold model were 20% to 50% lower and ranged from 0.04 to 0.55 for purebred and from 0.01 to 0.44 for multibreed validation populations. Analysis of ordinal categorical scores using a linear model resulted in further reductions in accuracy. CONCLUSIONS: Threshold traits result in markedly lower accuracy than a linear model on the underlying variable. To achieve an accuracy equal to or greater than that for continuous phenotypes with a training population of 1000 animals, a 2.25-fold increase in training population size was required for categorical scores fitted with the threshold model. The threshold model resulted in higher accuracies than the linear model and its advantage was greatest when training populations were smallest.
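
Why categorical scores carry less information than the underlying continuous variable can be seen in a small simulation. The sketch below is not the study's genomic analysis: it works at the phenotype level only, draws a genetic value and a residual for each animal, cuts the resulting underlying variable at fixed thresholds, and compares correlations with the true genetic value. The function names, the sample size and the threshold values are hypothetical choices for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def simulate_scores(n, h2, thresholds, seed=0):
    """Simulate genetic values, an underlying variable and ordinal scores."""
    rng = random.Random(seed)
    # Genetic value with variance h2 and residual with variance 1 - h2,
    # so the underlying variable has unit variance and heritability h2.
    genetic = [rng.gauss(0.0, math.sqrt(h2)) for _ in range(n)]
    underlying = [g + rng.gauss(0.0, math.sqrt(1.0 - h2)) for g in genetic]
    # Ordinal score = number of thresholds exceeded (0, 1, ..., len(thresholds)).
    scores = [sum(u > t for t in thresholds) for u in underlying]
    return genetic, underlying, scores


if __name__ == "__main__":
    for cuts in ([0.0], [-0.5, 0.5], [-1.0, -0.3, 0.3, 1.0]):
        g, u, s = simulate_scores(5000, 0.25, cuts)
        print(len(cuts) + 1, "categories:",
              round(pearson(g, u), 3), "continuous vs",
              round(pearson(g, s), 3), "categorical")
```

In this toy setting the correlation between genetic value and score is lower than the correlation between genetic value and the underlying variable, which is the kind of loss that a larger training population has to compensate for.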


Subjects
Cattle/genetics , Genomics/methods , Animals , Bayes Theorem , Breeding , Computer Simulation , Genetic Markers , Genotype , Linear Models , Genetic Models , Phenotype , Single Nucleotide Polymorphism , Population Density , Quantitative Trait Loci , Heritable Quantitative Trait , Reproducibility of Results , Genetic Selection
18.
Front Genet ; 15: 1380643, 2024.
Article in English | MEDLINE | ID: mdl-38894723

ABSTRACT

Background: To address the limitations of commonly used cross-validation methods, the linear regression method (LR) was proposed to estimate population accuracy of predictions based on the implicit assumption that the fitted model is correct. This method also provides two statistics to determine the adequacy of the fitted model. The validity and behavior of the LR method have been provided and studied for linear predictions but not for non-linear predictions. The objectives of this study were to 1) provide a mathematical proof for the validity of the LR method when predictions are based on conditional means, regardless of whether the predictions are linear or non-linear, 2) investigate the ability of the LR method to detect whether the fitted model is adequate or inadequate, and 3) provide guidelines on how to appropriately partition the data into training and validation such that the LR method can identify an inadequate model. Results: We present a mathematical proof for the validity of the LR method to estimate population accuracy and to determine whether the fitted model is adequate or inadequate when the predictor is the conditional mean, which may be a non-linear function of the phenotype. Using three partitioning scenarios of simulated data, we show that one of the LR statistics can detect an inadequate model only when the data are partitioned such that the values of relevant predictor variables differ between the training and validation sets. In contrast, we observed that the other LR statistic was able to detect an inadequate model for all three scenarios. Conclusion: The LR method has been proposed to address some limitations of the traditional approach of cross-validation in genetic evaluation. In this paper, we showed that the LR method is valid when the model is adequate and the conditional mean is the predictor, even when it is a non-linear function of the phenotype. We found that one of the two LR statistics is superior because it was able to detect an inadequate model for all three partitioning scenarios (i.e., between animals, by age within animals, and between animals and by age) that were studied.

19.
BMC Genet ; 14: 23, 2013 Mar 26.
Article in English | MEDLINE | ID: mdl-23530766

ABSTRACT

BACKGROUND: Infectious Bovine Keratoconjunctivitis (IBK) in beef cattle, commonly known as pinkeye, is a bacterial disease caused by Moraxella bovis. IBK is characterized by excessive tearing and ulceration of the cornea. Perforation of the cornea may also occur in severe cases. IBK is considered the most important ocular disease in cattle production, due to the decreased growth performance of infected individuals and its subsequent economic effects. IBK is an economically important, lowly heritable categorical disease trait. Mass selection of unaffected animals has not been successful at reducing disease incidence. Genome-wide studies can determine chromosomal regions associated with IBK susceptibility. The objective of the study was to detect single-nucleotide polymorphism (SNP) markers in linkage disequilibrium (LD) with genetic variants associated with IBK in American Angus cattle. RESULTS: The proportion of phenotypic variance explained by markers was 0.06 in the whole-genome analysis of IBK incidence classified as two, three or nine categories. Whole-genome analysis using any categorisation (two, three or nine) of IBK scores showed that locations on chromosomes 2, 12, 13 and 21 were associated with IBK disease. The genomic locations on chromosomes 13 and 21 overlap with QTLs associated with bovine spongiform encephalopathy, clinical mastitis or somatic cell count. CONCLUSIONS: Results of these genome-wide analyses indicated that if the underlying genetic factors confer not only IBK susceptibility but also IBK severity, treating IBK phenotypes as a two-category trait can cause information loss in the genome-wide analysis. These results help our overall understanding of the genetics of IBK and have the potential to provide information for future use in breeding schemes.


Subjects
Cattle Diseases/genetics , Cattle/genetics , Genome-Wide Association Study , Infectious Keratoconjunctivitis/genetics , Animals , Bayes Theorem , Chromosome Mapping/veterinary , Single Nucleotide Polymorphism
20.
Genet Sel Evol ; 45: 5, 2013 Mar 06.
Article in English | MEDLINE | ID: mdl-23496971

ABSTRACT

BACKGROUND: Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. METHODS: A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. RESULTS: About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. CONCLUSIONS: Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.


Subjects
Genomics , Haplotypes , Animals , Bayes Theorem , Cattle , Computational Biology/methods , Genealogy and Heraldry , Genotype , Humans , Linkage Disequilibrium , Single Nucleotide Polymorphism , Quantitative Trait Loci , Heritable Quantitative Trait