Results 1 - 20 of 21
1.
Genet Sel Evol ; 55(1): 45, 2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37407936

ABSTRACT

BACKGROUND: The breeding value of a crossbred individual can be expressed as the sum of the contributions from each of the contributing pure breeds. In theory, the breeding value should account for segregation between breeds, which results from the difference in the mean contribution of loci between breeds, which in turn is caused by differences in allele frequencies between breeds. However, with multiple generations of crossbreeding, how to account for breed segregation in genomic models that split the breeding value of crossbreds based on breed origin of alleles (BOA) is not known. Furthermore, local breed proportions (LBP), a concept related to breed segregation, have been modelled based on BOA. The objectives of this study were to explore the theoretical background of the effect of LBP and how it relates to breed segregation, and to investigate how to incorporate breed segregation (co)variance in genomic BOA models. RESULTS: We showed that LBP effects result from the difference in the mean contribution of loci between breeds in an additive genetic model, i.e. breed segregation effects. We found that the (co)variance structure for breed segregation (BS) effects in genomic BOA models does not lead to relationship matrices that are positive semi-definite in all cases. However, by setting one breed as a reference breed, a valid (co)variance structure can be constructed by including LBP effects for all other breeds and assuming them to be correlated. We successfully estimated variance components for a genomic BOA model with LBP effects in a simulated example. CONCLUSIONS: Breed segregation effects and LBP effects are two alternative ways to account for the contribution of differences in the mean effects of loci between breeds. When the covariance between LBP effects across breeds is included in the model, a valid (co)variance structure for LBP effects can be constructed by setting one breed as reference breed and fitting an LBP effect for each of the other breeds.


Subject(s)
Genomics; Models, Genetic; Genomics/methods; Hybridization, Genetic; Gene Frequency; Alleles
2.
Genet Sel Evol ; 55(1): 37, 2023 Jun 08.
Article in English | MEDLINE | ID: mdl-37291510

ABSTRACT

BACKGROUND: Single-step genomic best linear unbiased prediction (ssGBLUP) models allow the combination of genomic, pedigree, and phenotypic data into a single model, which is computationally challenging for large genotyped populations. In practice, genotypes of animals without their own phenotype and progeny, so-called genotyped selection candidates, can become available after genomic breeding values have been estimated by ssGBLUP. In some breeding programmes, genomic estimated breeding values (GEBV) for these animals should be known shortly after obtaining genotype information but recomputing GEBV using the full ssGBLUP takes too much time. In this study, first we compare two equivalent formulations of ssGBLUP models, i.e. one that is based on the Woodbury matrix identity applied to the inverse of the genomic relationship matrix, and one that is based on marker equations. Second, we present computationally-fast approaches to indirectly compute GEBV for genotyped selection candidates, without the need to do the full ssGBLUP evaluation. RESULTS: The indirect approaches use information from the latest ssGBLUP evaluation and rely on the decomposition of GEBV into its components. The two equivalent ssGBLUP models and indirect approaches were tested on a six-trait calving difficulty model using Irish dairy and beef cattle data that include 2.6 million genotyped animals of which about 500,000 were considered as genotyped selection candidates. When using the same computational approaches, the solving phase of the two equivalent ssGBLUP models showed similar requirements for memory and time per iteration. The computational differences between them were due to the preprocessing phase of the genomic information. Regarding the indirect approaches, compared to GEBV obtained from single-step evaluations including all genotypes, indirect GEBV had correlations higher than 0.99 for all traits while showing little dispersion and level bias. 
CONCLUSIONS: ssGBLUP predictions for the genotyped selection candidates were accurately approximated using the presented indirect approaches, which are more memory-efficient and computationally faster than solving a full ssGBLUP evaluation. Thus, indirect approaches can be used even on a weekly basis to estimate GEBV for newly genotyped animals, while the full single-step evaluation is done only a few times within a year.
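
For reference, the Woodbury matrix identity mentioned in the abstract can be written out in its general form. The decomposition of the genomic relationship matrix as G = C + Z B Z' (with C and B invertible) is an assumption made here for illustration; the abstract does not state which matrices the authors use.

```latex
\[
G^{-1} = \left( C + Z B Z^{\top} \right)^{-1}
       = C^{-1} - C^{-1} Z \left( B^{-1} + Z^{\top} C^{-1} Z \right)^{-1} Z^{\top} C^{-1}
\]
```

Under this assumed form, the matrix that has to be inverted on the right-hand side has the dimension of B (the number of columns of Z) rather than the dimension of G, which is the usual reason for applying the identity when the number of genotyped animals is large.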


Subject(s)
Genome; Models, Genetic; Animals; Cattle/genetics; Genotype; Genomics; Phenotype; Pedigree
3.
Genet Sel Evol ; 55(1): 1, 2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36604633

ABSTRACT

BACKGROUND: In this study, computationally efficient methods to approximate the reliabilities of genomic estimated breeding values (GEBV) in a single-step genomic prediction model including a residual polygenic (RPG) effect are described. To calculate the reliabilities of the genotyped animals, a single nucleotide polymorphism best linear unbiased prediction (SNPBLUP) or a genomic BLUP (GBLUP) model was used, and two alternatives to account for the RPG effect were tested. In the direct approach, the genomic model included the RPG effect; in the blended method, it did not, and an index was used instead to weight the genomic and pedigree-based BLUP (PBLUP) reliabilities. To calculate the single-step GBLUP reliabilities of the breeding values for the non-genotyped animals, a simplified weighted-PBLUP model that included a general mean and additive genetic effects, with weights accounting for the non-genomic and genomic information, was used. We compared five schemes for the weights. Two datasets were used: a small one (Data 1) and a large one (Data 2). RESULTS: For the genotyped animals in Data 1, correlations between approximate reliabilities using the blended method and exact reliabilities ranged from 0.993 to 0.996 across three lactations. The slopes observed by regressing the reliabilities of GEBV from the exact method on those from the blended method were 1.0 for all three lactations. For Data 2, the correlations and slopes ranged, respectively, from 0.980 to 0.986 and from 0.91 to 0.96, and for the non-genotyped animals in Data 1, they ranged, respectively, from 0.987 to 0.994 and from 0.987 to 1, which indicates that the approximations were in line with the exact results. The best approach achieved correlations of 0.992 to 0.994 across lactations. CONCLUSIONS: Our results demonstrate that the approximated reliabilities calculated using our proposed approach are in good agreement with the exact reliabilities. The blended method for the genotyped animals is computationally more feasible than the direct method when RPG effects are included, particularly for large-scale datasets. The approach can serve as an effective strategy to estimate the reliabilities of GEBV in large-scale single-step genomic predictions.


Subject(s)
Genome; Genomics; Animals; Female; Reproducibility of Results; Genomics/methods; Genotype; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Pedigree; Phenotype; Models, Genetic
4.
J Dairy Sci ; 106(3): 1518-1532, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36567247

ABSTRACT

The calculation of exact reliabilities involving the inversion of mixed model equations poses a heavy computational challenge when the system of equations is large. This has prompted the development of different approximation methods. We give an overview of the various methods and computational approaches in calculating reliability from the era before the animal model to the era of single-step genomic models. The different methods are discussed in terms of modeling, development, and applicability in large dairy cattle populations. The paper also describes the problems faced in reliability computation. Many details dispersed throughout the literature are presented in this paper. It is clear that a universal solution applicable to every model and input data may not be possible, but we point out several efficient and accurate algorithms developed recently for a variety of very large genomic evaluations.


Subject(s)
Genome; Genomics; Cattle; Animals; Reproducibility of Results; Genomics/methods; Models, Animal; Algorithms; Genotype; Models, Genetic; Phenotype
5.
Front Genet ; 13: 1012205, 2022.
Article in English | MEDLINE | ID: mdl-36479243

ABSTRACT

The single-step genomic BLUP (ssGBLUP) model for routine genomic prediction of breeding values is being developed intensively for many dairy cattle populations. Compatibility between the genomic (G) and the pedigree (A) relationship matrices remains an important challenge in ssGBLUP. The compatibility relates to the amount of missing pedigree information. There are two prevailing approaches to account for incomplete pedigree information: unknown parent groups (UPG) and metafounders (MF). UPG have been used routinely in pedigree-based evaluations to account for the differences in genetic level between groups of animals with missing parents. The MF approach is an extension of the UPG approach: it defines MF, which are related pseudo-individuals, and needs a Γ matrix, whose size equals the number of MF, to describe relationships between MF. The UPG and MF can be the same. However, the challenge in the MF approach is the estimation of Γ when there are many MF, as is typically needed in dairy cattle. In our study, we present an approach to fit the same number of MF as UPG in ssGBLUP with the Woodbury matrix identity (ssGTBLUP). We used 305-day milk, protein, and fat yield data from the DFS (Denmark, Finland, Sweden) Red Dairy cattle population. The pedigree had more than 6 million animals, of which 207,475 were genotyped. We constructed the preliminary gamma matrix (Γ_pre) with 29 MF, which was expanded to 148 MF by a covariance function (Γ_148). The quality of the extrapolation of the Γ_pre matrix was studied by comparing average off-diagonal elements between breed groups. On average, relationships among MF in Γ_148 were 1.8% higher than in Γ_pre. The use of Γ_148 increased the correlation between the G and A matrices by 0.13 and 0.11 for the diagonal and off-diagonal elements, respectively. [G]EBV were predicted using the ssGTBLUP and Pedigree-BLUP models with the MF and UPG. The prediction reliabilities were slightly higher for the ssGTBLUP model using MF than for the one using UPG. The ssGBLUP MF model showed less overprediction compared to other models.

6.
J Dairy Sci ; 105(12): 9822-9836, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36307242

ABSTRACT

For genomic prediction of crossbred animals, models that account for the breed origin of alleles (BOA) in marker genotypes can allow the effects of marker alleles to differ depending on their ancestral breed. Previous studies have shown that genomic estimated breeding values for crossbred cows can be calculated using the marker effects that are estimated in the contributing pure breeds and combined based on estimated BOA in the genotypes of the crossbred cows. In the presented study, we further exploit the BOA information for improving the prediction of genomic breeding values of crossbred dairy cows. We investigated 2 types of BOA-derived breed proportions: global breed proportions, defined as the proportion of marker alleles assigned to each breed across the whole genome; and local breed proportions (LBP), defined as the proportions of alleles on chromosome segments which were assigned to each breed. Further, we investigated 2 BOA-derived measures of heterozygosity for the prediction of total genetic value. First, global breed heterozygosity, defined as the proportion of marker loci that have alleles originating in 2 different breeds over the whole genome. Second, local breed heterozygosity (LBH), defined as proportions of marker loci on chromosome segments that had alleles originating in 2 different breeds. We estimated variance related to LBP and LBH on the remaining variation after accounting for prediction with solutions from the genomic evaluations of the pure breeds and validated alternative models for production traits in 5,214 Danish crossbred dairy cows. The estimated LBP variances were 0.9, 1.2, and 1.0% of phenotypic variance for milk, fat, and protein yield, respectively. We observed no clear LBH effect. Cross-validation showed that models with LBP effects had a numerically small but statistically significantly higher predictive ability than models only including global breed proportions. 
We observed similar improvement in accuracy by the model having an across crossbred residual additive genetic effect, accounting for the additive genetic variation that was not accounted for by the solutions from purebred. For genomic predictions of crossbred animals, estimated BOA can give useful information on breed proportions, both globally in the genome and locally in genome regions, and on breed heterozygosity.
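
The definitions of global and local breed proportions and breed heterozygosity given in the abstract can be illustrated with a short sketch. It assumes that BOA is available as one pair of breed labels per marker locus and that chromosome segments are fixed-size runs of consecutive loci; the function names, the breed codes and the toy data are hypothetical and not taken from the paper.

```python
from collections import Counter

def global_breed_proportions(boa):
    """Proportion of marker alleles assigned to each breed.

    boa: list of (allele1_breed, allele2_breed) pairs, one pair per locus.
    """
    counts = Counter(breed for locus in boa for breed in locus)
    total = 2 * len(boa)
    return {breed: n / total for breed, n in counts.items()}

def global_breed_heterozygosity(boa):
    """Proportion of loci whose two alleles originate in different breeds."""
    return sum(a != b for a, b in boa) / len(boa)

def split_segments(boa, segment_size):
    """Cut the locus list into consecutive segments (assumed segment definition)."""
    return [boa[i:i + segment_size] for i in range(0, len(boa), segment_size)]

def local_breed_proportions(boa, segment_size):
    """Breed proportions computed separately for each segment."""
    return [global_breed_proportions(s) for s in split_segments(boa, segment_size)]

def local_breed_heterozygosity(boa, segment_size):
    """Breed heterozygosity computed separately for each segment."""
    return [global_breed_heterozygosity(s) for s in split_segments(boa, segment_size)]

# Toy crossbred cow with six loci; H, J and R are made-up breed codes.
boa = [("H", "H"), ("H", "J"), ("J", "J"), ("H", "R"), ("R", "R"), ("H", "H")]
gbp = global_breed_proportions(boa)
gbh = global_breed_heterozygosity(boa)
lbp = local_breed_proportions(boa, segment_size=3)
lbh = local_breed_heterozygosity(boa, segment_size=3)
print(gbp, gbh)
print(lbp, lbh)
```

In this toy example the two segments carry different breeds even though each has the same heterozygosity, which is the kind of within-genome variation that local measures capture and global ones average away.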


Subject(s)
Models, Genetic; Polymorphism, Single Nucleotide; Female; Cattle/genetics; Animals; Genomics; Alleles; Genotype; Phenotype
7.
Genet Sel Evol ; 54(1): 38, 2022 Jun 02.
Article in English | MEDLINE | ID: mdl-35655157

ABSTRACT

BACKGROUND: Genomic estimated breeding values (GEBV) by single-step genomic BLUP (ssGBLUP) are affected by the centering of marker information used. The use of a fixed effect called J factor will lead to GEBV that are unaffected by the centering used. We extended the use of a single J factor to a group of J factors. RESULTS: J factor(s) are usually included in mixed model equations (MME) as regression effects but a transformation similar to that regularly used for genetic groups can be applied to obtain a simpler MME, which is sparser than the original MME and does not need computation of the J factors. When the J factor is based on the same structure as the genetic groups, then MME can be transformed such that coefficients for the genetic groups no longer include information from the genomic relationship matrix. We illustrate the use of J factors in the analysis of a Red dairy cattle data set for fertility. CONCLUSIONS: The GEBV from these analyses confirmed the theoretical derivations that show that the resulting GEBV are allele coding independent when a J factor is used. Transformed MME led to faster computing time than the original regression-based MME.


Subject(s)
Genomics; Models, Genetic; Alleles; Animals; Cattle/genetics; Fertility; Genomics/methods; Genotype
8.
J Dairy Sci ; 105(6): 5221-5237, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35400498

ABSTRACT

Approximate multistep methods to calculate reliabilities for estimated breeding values in large genetic evaluations were developed for single-trait (ST-R2A) and multitrait (MT-R2A) single-step genomic BLUP (ssGBLUP) models. First, a traditional animal model was used to estimate the amount of nongenomic information for the genotyped animals. Second, this information was used with genomic data in a genomic BLUP model (genomic BLUP/SNP-BLUP) to approximate the total amount of information and ssGBLUP reliabilities for the genotyped animals. Finally, reliabilities for the nongenotyped animals were calculated using a traditional animal model where the increased information due to genomic data for the genotyped animals is accounted for by including pseudo-record counts for the genotyped animals. The approaches were tested using a multiple-trait ssGBLUP model on 2 data sets. The first data set (data 1) was small enough such that exact ssGBLUP model reliabilities could be computed by inversion and compared with the approximation method reliabilities. Data 1 had 46,535 first-, 35,290 second-, and 23,780 third-lactation 305-d milk yield records from 47,124 Finnish Red dairy cows. The pedigree comprised 64,808 animals, of which 19,757 were genotyped. We examined the efficiency of the MT-R2A approximation on a large data set (data 2) derived from the joint Nordic (Danish, Finnish, and Swedish) Holstein dairy cattle data. Data 2 had 17.8 million 305-d milk records from 8.3 million cows and first 3 lactations. The pedigree had 11 million animals of which 274,145 were genotyped on 46,342 SNP markers. For data 1, correlations between the exact ssGBLUP model and the ST-R2A for the genotyped (nongenotyped) animals were 0.995 (0.987), 0.965 (0.984), and 0.950 (0.983) for first, second, and third lactation, respectively. 
Correspondingly, correlations between exact ssGBLUP reliabilities and MT-R2A for the genotyped (nongenotyped) animals were 0.995 (0.993), 0.992 (0.991), and 0.990 (0.990) for first, second, and third lactation, respectively. The regression coefficients (b1) of ssGBLUP reliability on ST-R2A for the genotyped (nongenotyped) animals ranged from 0.87 (0.94) for first lactation to 0.68 (0.93) for third lactation, whereas for MT-R2A they were between 0.91 (0.99) for first lactation to 0.89 (0.99) for third lactation. Correspondingly, the intercepts varied from 0.11 (0.05) to 0.3 (0.06) for ST-R2A and from 0.06 (0.01) to 0.07 (0.02) for MT-R2A. The computing time for the approximation method was approximately 12% of that required by the direct exact approach. In conclusion, the developed approximate approach allows calculating estimated breeding value reliabilities in the ssGBLUP model even for large data sets.


Subject(s)
Genome; Models, Genetic; Animals; Cattle/genetics; Female; Genomics/methods; Genotype; Pedigree; Phenotype; Reproducibility of Results
9.
J Anim Breed Genet ; 139(3): 259-270, 2022 May.
Article in English | MEDLINE | ID: mdl-34841597

ABSTRACT

Genomic data are widely used in predicting the breeding values of dairy cattle. The accuracy of genomic prediction depends on the size of the reference population and how related the candidate animals are to it. For populations with limited numbers of progeny-tested bulls, the reference populations must include cows and data from external populations. The aim of this study was to implement state-of-the-art single-step genomic evaluations for milk and fat yield in Holstein and Russian Black & White cattle in the Leningrad region (LR, Russia), using only a limited number of genotyped animals. We complemented internal information with external pseudo-phenotypic and genotypic data of bulls from the neighbouring Danish, Finnish and Swedish Holstein (DFS) population. Three data scenarios were used to perform single-step GBLUP predictions in the LR dairy cattle population. The first scenario was based on the original LR reference population, which constituted 1,080 genotyped cows and 427 genotyped bulls. In the second scenario, the genotypes of 414 bulls related to the LR from the DFS population were added to the reference population. In the third scenario, LR data were further augmented with pseudo-phenotypic data from the DFS population. The inclusion of foreign information increased the validation reliability of the milk yield by up to 30%. Suboptimal data recording practices hindered the improvement of fat yield. We confirmed that the single-step model is suitable for populations with a low number of genotyped animals, especially when external information is integrated into the evaluations. Genomic prediction in populations with a low number of progeny-tested bulls can be based on data from genotyped cows and on the inclusion of genotypes and pseudo-phenotypes from the external population. 
This approach increased the validation reliability of the implemented single-step model in the milk yield, but shortcomings in the LR data recording scheme prevented improvements in fat yield.


Subject(s)
Genome; Genomics; Animals; Cattle/genetics; Female; Genome/genetics; Genotype; Male; Milk; Models, Genetic; Phenotype; Reproducibility of Results
10.
J Anim Breed Genet ; 136(4): 252-261, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31247679

ABSTRACT

Monte Carlo (MC) methods have been found useful in estimation of variance parameters for large data and complex models with many variance components (VC), with respect to both computer memory and computing time. A disadvantage has been a fluctuation in round-to-round values of estimates that makes the assessment of convergence challenging. Furthermore, with Newton-type algorithms, the approximate Hessian matrix might have sufficient accuracy, but the inaccuracy in the gradient vector exaggerates the round-to-round fluctuation to an intolerable level. In this study, the reuse of the same random numbers within each MC sample was used to remove the MC fluctuation. Simulated data with six VC parameters were analysed by four different MC REML methods: expectation-maximization (EM), Newton-Raphson (NR), average information (AI) and Broyden's method (BM). In addition, field data with 96 VC parameters were analysed by MC EM REML. In all the analyses with reused samples, the MC fluctuations disappeared, but the final estimates by the MC REML methods differed from the analytically calculated values more than expected, especially when the number of MC samples was small. The difference depended on the random numbers generated, and based on repeated MC AI REML analyses, the VC estimates were on average unbiased. The advantage of reusing MC samples is more apparent in the NR-type algorithms. Smooth convergence opens the possibility of using the fast-converging Newton-type algorithms. However, a disadvantage of reusing MC samples is a possible "bias" in the estimates. To attain acceptable accuracy, a sufficient number of MC samples needs to be generated.
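
The effect of reusing the same random numbers can be shown with a deliberately simple toy estimator. This is not MC REML: the sketch estimates E[(theta * z)^2] = theta^2 for standard normal z, and all names and values in it are assumptions chosen only to show why reused draws remove round-to-round fluctuation while leaving a draw-dependent offset from the true value.

```python
import random

def mc_estimate(theta, draws):
    """Monte Carlo estimate of E[(theta * z)^2], whose true value is theta^2."""
    return sum((theta * z) ** 2 for z in draws) / len(draws)

def new_draws(n, rng):
    """Generate n standard normal random numbers from the given generator."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

rng = random.Random(1)   # fixed seed so the sketch is reproducible
theta = 2.0              # parameter value held constant over three "rounds"
n_samples = 50           # deliberately small number of MC samples

# Reused samples: the same random numbers are used in every round.
reused = new_draws(n_samples, rng)
est_reused = [mc_estimate(theta, reused) for _ in range(3)]

# Fresh samples: new random numbers are generated in every round.
est_fresh = [mc_estimate(theta, new_draws(n_samples, rng)) for _ in range(3)]

print("reused:", est_reused)  # identical values across rounds
print("fresh: ", est_fresh)   # values fluctuate from round to round
```

With reused draws the estimate becomes a deterministic function of theta, so an iterative algorithm sees a smooth objective. The price is that the whole sequence inherits the error of that one set of draws, which corresponds to the possible "bias" the abstract mentions for small numbers of MC samples.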


Subject(s)
Algorithms; Computational Biology/methods; Genetic Variation; Models, Genetic; Monte Carlo Method; Animals; Cattle; Likelihood Functions; Multivariate Analysis; Phenotype
11.
Heredity (Edinb) ; 123(3): 307-317, 2019 Sep.
Article in English | MEDLINE | ID: mdl-30886391

ABSTRACT

Livestock production both contributes to and is affected by global climate change, and substantial modifications will be required to increase its climate resilience. In this context, reliance on dominant commercial livestock breeds, featuring small effective population sizes, makes current production strategies vulnerable if their production is restricted to environments, which may be too costly to support under future climate scenarios. The adaptability of animal populations to future environments will therefore become important. To help evaluate the role of genetics in climate adaptation, we compared selection strategies in dairy cattle using breeding simulations, where genomic selection was used on two negatively correlated traits for production (assumed to be moderately heritable) and adaptation (assumed to have low heritability). Compared with within-population breeding, genomic introgression produced a more positive genetic change for both production and adaptation traits. Genomic introgression from highly adapted but low production value populations into highly productive but low adaptation populations was most successful when the adaptation trait was given a lower selection weight than the production trait. Genomic introgression from highly productive population to highly adapted population was most successful when the adaptation trait was given a higher selection weight than the production trait. Both these genomic introgression schemes had the lowest risk of inbreeding. Our results suggest that both adaptation and production can potentially be improved simultaneously by genomic introgression.


Subject(s)
Adaptation, Physiological/genetics; Breeding/statistics & numerical data; Dairying; Models, Genetic; Quantitative Trait, Heritable; Selection, Genetic; Animals; Cattle; Climate Change; Computer Simulation; Female; Genetic Introgression; Male; Polymorphism, Single Nucleotide; Quantitative Trait Loci
12.
J Anim Breed Genet ; 135(6): 472-484, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30411415

ABSTRACT

We developed a multiple-trait animal model for blue fox fertility evaluation and estimated genetic parameters simultaneously for seven traits: first three litter sizes (LS), pregnancy rate (PREG), whelping success (WHELP), grading size (gSI) and fur quality (gQU). Grading size and quality were included in the new multiple-trait model as correlated traits. Litter size of the first parity had the highest unfavourable genetic correlations with gSI (-0.57) and gQU (-0.56). Thus, selection for higher gSI and gQU slows down the genetic gain in fertility traits. WHELP had moderate to fairly high negative genetic correlations with gSI and gQU (-0.44 and -0.36, respectively), indicating that larger animals are more likely to lose their pups during pregnancy or immediately after birth. Our new model corrected the slight overestimation of estimated breeding values (EBVs), especially for first litter size. The accuracy of LS, PREG and WHELP estimation is likely to benefit from the new multiple-trait animal model. PREG and WHELP improved steadily from 1998 to 2014, and LS traits have shown a moderate genetic trend since 2007, whereas the positive genetic trend for gSI has levelled off, at least temporarily. The fairly high effective population size (150) allows selection intensity to be increased, and the new fertility evaluation enables further improvement of fertility traits.


Subject(s)
Breeding; Fertility; Foxes/physiology; Animals; Female; Foxes/genetics; Male; Models, Statistical; Phenotype; Population Density; Reproduction
13.
J Anim Breed Genet ; 135(5): 337-348, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30112802

ABSTRACT

Joint Nordic (Denmark, Finland, Sweden) genetic evaluation of female fertility is currently based on the multiple-trait multilactation animal model (BLUP). Here, a single-step genomic model (ssGBLUP) was applied for the Nordic Red dairy cattle fertility evaluation. The 11 traits comprised nonreturn rate and days from first to last insemination in heifers and first three parities, and days from calving to first insemination in the first three parities. Traits had low heritabilities (0.015-0.04), but moderately high genetic correlations between the parities (0.60-0.88). Phenotypic data included 4,226,715 animals with records, and the pedigree included 5,445,392 animals. Unknown parents were assigned to 332 phantom parent groups (PPG). In the mixed model equations, animals were associated with PPG effects through the pedigree or through both the pedigree and genomic information. Genotype information of 46,914 SNPs was available for 33,969 animals in the pedigree. When PPG used pedigree information only, BLUP converged after 2,420 iterations whereas the ssGBLUP evaluation needed over ten thousand iterations. When the PPG effects were solved accounting for both the pedigree and the genomic information, the ssGBLUP model converged after 2,406 iterations. Also, with the latter model, breeding values by ssGBLUP and BLUP became more consistent and genetic trends followed each other well. Models were validated using forward prediction of the young bulls. Reliabilities and variance inflation of predicted genomic breeding values (values for parent averages in brackets) for the 11 traits ranged from 0.22 to 0.31 (0.10-0.27) and from 0.81 to 0.95 (0.83-1.06), respectively. The ssGBLUP model always gave higher validation reliabilities than BLUP, but the largest increases were for the cow fertility traits.


Subject(s)
Cattle/genetics; Cattle/physiology; Fertility/genetics; Genomics/methods; Models, Genetic; Animals; Breeding; Denmark; Female; Finland; Genome; Genotype; Male; Phenotype; Sweden
14.
Genet Sel Evol ; 50(1): 6, 2018 Feb 28.
Article in English | MEDLINE | ID: mdl-29490611

ABSTRACT

BACKGROUND: For marker effect models and genomic animal models, computational requirements increase with the number of loci and the number of genotyped individuals, respectively. In the latter case, the inverse genomic relationship matrix (GRM) is typically needed, which is computationally demanding to compute for large datasets. Thus, there is a great need for dimensionality-reduction methods that can analyze massive genomic data. For this purpose, we developed reduced-dimension singular value decomposition (SVD) based models for genomic prediction. METHODS: Fast SVD is performed by analyzing different chromosomes/genome segments in parallel and/or by restricting SVD to a limited core of genotyped individuals, producing chromosome- or segment-specific principal components (PC). Given a limited effective population size, nearly all the genetic variation can be effectively captured by a limited number of PC. Genomic prediction can then be performed either by PC ridge regression (PCRR) or by genomic animal models using an inverse GRM computed from the chosen PC (PCIG). In the latter case, computation of the inverse GRM will be feasible for any number of genotyped individuals and can be readily produced row- or element-wise. RESULTS: Using simulated data, we show that PCRR and PCIG models, using chromosome-wise SVD of a core sample of individuals, are appropriate for genomic prediction in a larger population, and results in virtually identical predicted breeding values as the original full-dimension genomic model (r = 1.000). Compared with other algorithms (e.g. algorithm for proven and young animals, APY), the (chromosome-wise SVD-based) PCRR and PCIG models were more robust to size of the core sample, giving nearly identical results even down to 500 core individuals. The method was also successfully tested on a large multi-breed dataset. CONCLUSIONS: SVD can be used for dimensionality reduction of large genomic datasets. 
After SVD, genomic prediction using dense genomic data and many genotyped individuals can be done in a computationally efficient manner. Using this method, the resulting genomic estimated breeding values were virtually identical to those computed from a full-dimension genomic model.
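
The statement that a limited number of principal components can capture nearly all the genetic variation can be made concrete with a small helper that picks the number of components from a set of singular values. The function name, the variance threshold and the singular values below are hypothetical; the sketch only shows the cumulative-variance rule and does not reproduce the authors' chromosome-wise SVD.

```python
def n_components_for(singular_values, fraction):
    """Smallest number of leading PCs whose squared singular values
    account for at least `fraction` of the total variance."""
    variances = sorted((s * s for s in singular_values), reverse=True)
    total = sum(variances)
    cumulative = 0.0
    for k, v in enumerate(variances, start=1):
        cumulative += v
        if cumulative / total >= fraction:
            return k
    return len(variances)

# Made-up singular values of a centred genotype matrix for one segment.
singular_values = [10.0, 5.0, 2.0, 1.0, 0.5]

k95 = n_components_for(singular_values, 0.95)
k99 = n_components_for(singular_values, 0.99)
print(k95, k99)
```

Because variance is proportional to the squared singular value, a fast-decaying spectrum lets a few components reach a high threshold; how fast the spectrum decays in real data depends on the effective population size, as the abstract notes.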


Subject(s)
Computational Biology/methods; Genotype; Models, Genetic; Algorithms; Animals; Breeding; Computer Simulation; Genome; Population Density; Principal Component Analysis
15.
Genet Sel Evol ; 49(1): 36, 2017 Mar 30.
Article in English | MEDLINE | ID: mdl-28359261

ABSTRACT

BACKGROUND: Single-step genomic best linear unbiased prediction (BLUP) evaluation combines relationship information from pedigree and genomic marker data. The inclusion of the genomic information into mixed model equations requires the inverse of the combined relationship matrix [Formula: see text], which has a dense matrix block for genotyped animals. METHODS: To avoid inversion of dense matrices, single-step genomic BLUP can be transformed to single-step single nucleotide polymorphism BLUP (SNP-BLUP) which have observed and imputed marker coefficients. Simple block LDL type decompositions of the single-step relationship matrix [Formula: see text] were derived to obtain different types of linearly equivalent single-step genomic mixed model equations with different sets of reparametrized random effects. For non-genotyped animals, the imputed marker coefficient terms in the single-step SNP-BLUP were calculated on-the-fly during the iterative solution using sparse matrix decompositions without storing the imputed genotypes. Residual polygenic effects were added to genotyped animals and transmitted to non-genotyped animals using relationship coefficients that are similar to imputed genotypes. The relationships were further orthogonalized to improve convergence of iterative methods. RESULTS: All presented single-step SNP-BLUP models can be solved efficiently using iterative methods that rely on iteration on data and sparse matrix approaches. The efficiency, accuracy and iteration convergence of the derived mixed model equations were tested with a small dataset that included 73,579 animals of which 2885 were genotyped with 37,526 SNPs. CONCLUSIONS: Inversion of the large and dense genomic relationship matrix was avoided in single-step evaluation by using fully orthogonalized single-step SNP-BLUP formulations. 
The number of iterations until convergence was smaller in single-step SNP-BLUP formulations than in the original single-step GBLUP when heritability was low, but increased above that of the original single-step when heritability was high.
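For orientation, the dense block mentioned in the background can be seen in the form of the inverse that is commonly given in the single-step literature. The abstract itself does not reproduce the matrix, so the symbols below are assumptions: A is the pedigree relationship matrix for all animals, A_22 its block for genotyped animals, and G the genomic relationship matrix.

```latex
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{bmatrix}
```

Only the lower-right block is dense, and its size grows with the number of genotyped animals, which is the cost that the SNP-BLUP reformulations described above aim to avoid.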


Subject(s)
Algorithms , Genome-Wide Association Study/methods , Genotype , Multifactorial Inheritance , Single Nucleotide Polymorphism , Animals
16.
J Dairy Sci ; 98(2): 1296-309, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25434332

ABSTRACT

Three random regression models were developed for routine genetic evaluation of Danish, Finnish, and Swedish dairy cattle. Data included over 169 million test-day records with milk, protein, and fat yield observations from over 8.7 million dairy cows of all breeds. Variance component analyses showed significant differences in estimates between Holstein, Nordic Red Cattle, and Jersey, but only small to moderate differences within a breed across countries. The obtained variance component estimates were used to build a separate set of covariance functions for each breed. The covariance functions describe the animal effects on milk, protein, and fat yields of the first 3 lactations as 9 different traits, assuming the same heritabilities and a genetic correlation of unity across countries. Only 15, 27, and 7 eigenfunctions with the largest eigenvalues were used to describe additive genetic animal effects and nonhereditary animal effects across lactations and within later lactations, respectively. These reduced-rank covariance functions explained 99.0 to 99.9% of the original variances but reduced the number of animal equations to be solved by 44%. Moderate rank reduction for nonhereditary animal effects and use of one-third-smaller measurement error correlations than obtained from variance component estimation made the models more robust against extreme observations. Estimation of the genetic levels of the countries' subpopulations within a breed was found to be sensitive to the way the breed effects were modeled, especially for the genetically heterogeneous Nordic Red Cattle. To ensure that only additive genetic effects entered the estimated breeding values, the crossbreeding effects were described by fixed and random cofactors, the calving age effect was described by an age × breed proportion interaction, and phantom parent groups were modeled as random effects. 
To ensure that genetic variances were the same across the 3 countries in breeding value estimation, as suggested by the variance component estimates, the applied multiplicative heterogeneous variance adjustment method had to be tailored using country-specific reference measurement error variances. Results showed the feasibility of across-country genetic evaluation of cows and sires based on original test-day phenotypes. Nevertheless, applying a thorough model validation procedure is essential throughout the model building process to obtain reliable breeding values.
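The rank reduction described above can be illustrated with a minimal sketch that keeps only the eigenpairs with the largest eigenvalues of a covariance matrix and reports the share of variance retained. The 9 x 9 matrix here is simulated and purely hypothetical, and the function name `reduced_rank` is an assumption; the covariance functions and ranks used in the study are not reproduced.

```python
import numpy as np

def reduced_rank(cov, k):
    """Keep the k eigenpairs with the largest eigenvalues of a symmetric matrix.

    Returns the rank-k approximation and the proportion of total variance
    (sum of eigenvalues) that it retains.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    lam = eigvals[order]
    vec = eigvecs[:, order]
    approx = (vec * lam) @ vec.T                  # V diag(lam) V'
    explained = lam.sum() / eigvals.sum()
    return approx, explained

# Hypothetical covariance matrix for 9 traits, dominated by 3 underlying factors.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(9, 3))
cov = loadings @ loadings.T + 0.01 * np.eye(9)

approx, explained = reduced_rank(cov, k=3)
print(f"rank-3 approximation retains {explained:.1%} of the variance")
```

In a mixed model, each animal then needs k equations instead of 9 for that effect, which is the source of the reduction in equations reported above.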


Subject(s)
Cattle/genetics , Lactation/genetics , Milk/chemistry , Statistical Models , Algorithms , Analysis of Variance , Animals , Breeding , Fats/analysis , Female , Genetic Heterogeneity , Genetic Variation , Hybrid Vigor , Genetic Hybridization , Milk Proteins/analysis , Milk Proteins/genetics , Phenotype , Regression Analysis , Research , Species Specificity
17.
Genet Sel Evol ; 46: 47, 2014 Jul 30.
Article in English | MEDLINE | ID: mdl-25080199

ABSTRACT

BACKGROUND: Although the X chromosome is the second largest bovine chromosome, markers on the X chromosome are not used for genomic prediction in some countries and populations. In this study, we presented a method for computing genomic relationships using X chromosome markers, investigated the accuracy of imputation from a low density (7K) to the 54K SNP (single nucleotide polymorphism) panel, and compared the accuracy of genomic prediction with and without using X chromosome markers. METHODS: The impact of considering X chromosome markers on prediction accuracy was assessed using data from Nordic Holstein bulls and different sets of SNPs: (a) the 54K SNPs for reference and test animals, (b) SNPs imputed from the 7K to the 54K SNP panel for test animals, (c) SNPs imputed from the 7K to the 54K panel for half of the reference animals, and (d) the 7K SNP panel for all animals. Beagle and Findhap were used for imputation. GBLUP (genomic best linear unbiased prediction) models with or without X chromosome markers and with or without a residual polygenic effect were used to predict genomic breeding values for 15 traits. RESULTS: Averaged over the two imputation datasets, correlation coefficients between imputed and true genotypes for autosomal markers, pseudo-autosomal markers, and X-specific markers were 0.971, 0.831 and 0.935 when using Findhap, and 0.983, 0.856 and 0.937 when using Beagle. Estimated reliabilities of genomic predictions based on the imputed datasets using Findhap or Beagle were very close to those using the real 54K data. Genomic prediction using all markers gave slightly higher reliabilities than predictions without X chromosome markers. Based on our data which included only bulls, using a G matrix that accounted for sex-linked relationships did not improve prediction, compared with a G matrix that did not account for sex-linked relationships. 
A model that included a polygenic effect did not recover the loss of prediction accuracy from exclusion of X chromosome markers. CONCLUSIONS: The results from this study suggest that markers on the X chromosome contribute to accuracy of genomic predictions and should be used for routine genomic evaluation.
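The imputation accuracies reported above are correlations between imputed and true genotypes. A minimal sketch of such a calculation follows, using simulated 0/1/2 genotypes with a hypothetical 5% error rate. Whether the study pooled all genotypes or averaged per-marker or per-animal correlations is not stated in the abstract, so the pooled version here is an assumption, as is the function name `imputation_correlation`.

```python
import numpy as np

def imputation_correlation(true_geno, imputed_geno):
    """Pearson correlation between true and imputed genotypes, pooled over all entries."""
    return float(np.corrcoef(true_geno.ravel(), imputed_geno.ravel())[0, 1])

rng = np.random.default_rng(2)
true_geno = rng.integers(0, 3, size=(100, 20))     # 100 animals x 20 markers, coded 0/1/2

# Hypothetical imputation result: about 5% of the genotypes are set to a wrong value.
imputed = true_geno.copy()
mask = rng.random(true_geno.shape) < 0.05
imputed[mask] = (imputed[mask] + 1) % 3

r = imputation_correlation(true_geno, imputed)
print(f"correlation between imputed and true genotypes: {r:.3f}")
```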


Subject(s)
Cattle/genetics , Genetic Markers , Genomics/methods , X Chromosome/genetics , Animals , Breeding , Chromosomes/genetics , Female , Genotype , Linear Models , Male , Genetic Models , Oligonucleotide Array Sequence Analysis/veterinary , Phenotype , Single Nucleotide Polymorphism , Heritable Quantitative Trait
18.
PLoS One ; 8(12): e80821, 2013.
Article in English | MEDLINE | ID: mdl-24339886

ABSTRACT

Estimation of variance components by Monte Carlo (MC) expectation maximization (EM) restricted maximum likelihood (REML) is computationally efficient for large data sets and complex linear mixed effects models. However, efficiency may be lost due to the need for a large number of iterations of the EM algorithm. To decrease the computing time we explored the use of faster converging Newton-type algorithms within MC REML implementations. The implemented algorithms were: MC Newton-Raphson (NR), where the information matrix was generated via sampling; MC average information (AI), where the information was computed as an average of observed and expected information; and MC Broyden's method, where the zero of the gradient was searched using a quasi-Newton-type algorithm. Performance of these algorithms was evaluated using simulated data. The final estimates were in good agreement with corresponding analytical ones. MC NR REML and MC AI REML enhanced convergence compared to MC EM REML and gave standard errors for the estimates as a by-product. MC NR REML required a larger number of MC samples, while each MC AI REML iteration required solving the mixed model equations one additional time for each parameter to be estimated. MC Broyden's method required the largest number of MC samples with our small data and did not give standard errors for the parameters directly. We studied the performance of three different convergence criteria for the MC AI REML algorithm. Our results indicate the importance of defining a suitable convergence criterion and critical value in order to obtain an efficient Newton-type method utilizing a MC algorithm. Overall, use of a MC algorithm with Newton-type methods proved feasible and the results encourage testing of these methods with different kinds of large-scale problem settings.


Subject(s)
Algorithms , Computational Biology/methods , Monte Carlo Method , Analysis of Variance , Animals , Breeding , Cattle , Dairying , Likelihood Functions , Linear Models
19.
PLoS One ; 6(11): e26256, 2011.
Article in English | MEDLINE | ID: mdl-22114661

ABSTRACT

Genome-wide SNP data provide a powerful tool to estimate pairwise relatedness among individuals and individual inbreeding coefficients. The aim of this study was to compare methods for estimating the two parameters in a Finnsheep population based on genome-wide SNPs and genealogies, separately. This study included 99 Finnsheep in Finland that differed in coat colours (white, black, brown, grey, and black/white spotted) and were from a large pedigree comprising 319,119 animals. All the individuals were genotyped with the Illumina Ovine SNP50K BeadChip by the International Sheep Genomics Consortium. We identified three genetic subpopulations that corresponded approximately with the coat colours (grey, white, and black and brown) of the sheep. We detected a significant subdivision among the colour types (F(ST) = 5.4%, P < 0.05). We applied robust algorithms for the genomic estimation of individual inbreeding (F(SNP)) and pairwise relatedness (Φ(SNP)) as implemented in the programs KING and PLINK, respectively. Estimates of the two parameters from pedigrees (F(PED) and Φ(PED)) were computed using the RelaX2 program. Values of the two parameters estimated from genomic and genealogical data were mostly consistent, in particular for the highly inbred animals (e.g. inbreeding coefficient F > 0.0625) and pairs of closely related animals (e.g. full- or half-sibs). Nevertheless, we also detected differences in the two parameters between the approaches, particularly with respect to the grey Finnsheep. This could be due to the smaller sample size and relative incompleteness of the pedigree for them. We conclude that genome-wide genomic data will provide useful information on a per-sample or pairwise-samples basis in cases of complex genealogies or in the absence of genealogical data.
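To make the pedigree-based quantities concrete, the sketch below computes additive relationships with the standard tabular method and derives inbreeding coefficients from the diagonal. It is an illustration only: the pedigree is hypothetical, the function name is an assumption, and it is not the RelaX2 implementation. The example mating of first cousins gives F = 0.0625, the threshold quoted above. Treating Φ as the kinship coefficient (half the additive relationship) is also an assumption.

```python
import numpy as np

def additive_relationships(pedigree):
    """Tabular method for the additive (numerator) relationship matrix.

    `pedigree` is a list of (sire, dam) index pairs, with parents listed before
    their offspring and None for an unknown parent.
    """
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (s, d) in enumerate(pedigree):
        for j in range(i):
            a_js = A[j, s] if s is not None else 0.0
            a_jd = A[j, d] if d is not None else 0.0
            A[i, j] = A[j, i] = 0.5 * (a_js + a_jd)
        a_sd = A[s, d] if s is not None and d is not None else 0.0
        A[i, i] = 1.0 + 0.5 * a_sd                 # 1 + inbreeding coefficient
    return A

# Hypothetical pedigree: animals 2 and 3 are full sibs, 6 and 7 are first cousins,
# and animal 8 is the offspring of the two cousins.
pedigree = [
    (None, None), (None, None),    # 0, 1: founders
    (0, 1), (0, 1),                # 2, 3: full sibs
    (None, None), (None, None),    # 4, 5: unrelated mates
    (2, 4), (5, 3),                # 6, 7: first cousins
    (6, 7),                        # 8: offspring of first cousins
]

A = additive_relationships(pedigree)
F = np.diag(A) - 1.0               # pedigree inbreeding coefficients
kinship = 0.5 * A                  # assumed meaning of pairwise relatedness (kinship)
print(F[8], kinship[2, 3])
```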


Subject(s)
Computational Biology , Genomics , Hair Color/genetics , Inbreeding , Single Nucleotide Polymorphism/genetics , Sheep/genetics , Animals , Pedigree
20.
Genet Sel Evol ; 43: 25, 2011 Jun 26.
Article in English | MEDLINE | ID: mdl-21703021

ABSTRACT

BACKGROUND: Genomic data are used in animal breeding to assist genetic evaluation. Several models to estimate genomic breeding values have been studied. In general, two approaches have been used. One approach estimates the marker effects first, and genomic breeding values are then obtained by summing marker effects. In the second approach, genomic breeding values are estimated directly using an equivalent model with a genomic relationship matrix. Allele coding is the method chosen to assign values to the regression coefficients in the statistical model. A common allele coding is zero for the homozygous genotype of the first allele, one for the heterozygote, and two for the homozygous genotype of the other allele. Another common allele coding changes these regression coefficients by subtracting a value from each marker such that the mean of regression coefficients is zero within each marker. We call this centered allele coding. This study considered effects of different allele coding methods on inference. Both marker-based and equivalent models were considered, and restricted maximum likelihood and Bayesian methods were used in inference. RESULTS: Theoretical derivations showed that parameter estimates and estimated marker effects in marker-based models are the same irrespective of the allele coding, provided that the model has a fixed general mean. For the equivalent models, the same results hold, even though different allele coding methods lead to different genomic relationship matrices. Calculated genomic breeding values are independent of allele coding when the estimate of the general mean is included in the values. Reliabilities of estimated genomic breeding values calculated using elements of the inverse of the coefficient matrix depend on the allele coding because different allele coding methods imply different models. Finally, allele coding affects the mixing of Markov chain Monte Carlo algorithms, with the centered coding being the best. 
CONCLUSIONS: Different allele coding methods lead to the same inference in the marker-based and equivalent models when a fixed general mean is included in the model. However, reliabilities of genomic breeding values are affected by the allele coding method used. The centered coding has some numerical advantages when Markov chain Monte Carlo methods are used.
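The invariance result for marker-based models can be checked numerically with a small sketch. It fits a ridge-type SNP-BLUP with an unpenalized general mean under the 0/1/2 coding and under the centered coding. The data are simulated, the function name `snp_blup` is an assumption, and the variance ratio `lam` is fixed at a hypothetical value rather than estimated by REML or Bayesian methods as in the study.

```python
import numpy as np

def snp_blup(y, Z, lam):
    """Ridge-type SNP-BLUP with a fixed (unpenalized) general mean.

    Returns the estimate of the general mean and the estimated marker effects.
    `lam` is the assumed ratio of residual to marker-effect variance.
    """
    n, m = Z.shape
    X = np.column_stack([np.ones(n), Z])
    penalty = lam * np.eye(m + 1)
    penalty[0, 0] = 0.0                            # no shrinkage on the general mean
    sol = np.linalg.solve(X.T @ X + penalty, X.T @ y)
    return sol[0], sol[1:]

rng = np.random.default_rng(1)
n, m = 50, 10
Z012 = rng.integers(0, 3, size=(n, m)).astype(float)   # 0/1/2 allele coding
Zc = Z012 - Z012.mean(axis=0)                            # centered allele coding
y = Z012 @ rng.normal(size=m) + rng.normal(size=n)      # simulated phenotypes

mu_012, a_012 = snp_blup(y, Z012, lam=5.0)
mu_c, a_c = snp_blup(y, Zc, lam=5.0)

# Genomic breeding values including the estimate of the general mean.
gebv_012 = mu_012 + Z012 @ a_012
gebv_c = mu_c + Zc @ a_c

print(np.allclose(a_012, a_c), np.allclose(gebv_012, gebv_c))
```

The marker effects agree because centering only moves a constant from each marker column into the general mean, while the implied relationship matrices `Z012 @ Z012.T` and `Zc @ Zc.T` differ, in line with the statement above about the equivalent models.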


Subject(s)
Alleles , Genomics/methods , Genetic Models , Animals , Bayes Theorem , Breeding , Codon , Female , Genetic Markers , Genome , Male
...