ABSTRACT
Although it comes at a significant cost, genotyping an entire population offers many benefits, several of which can reduce the workload and effort of on-farm decision-making. As well as providing more accurate predictions of the genetic merit of individuals (and by extension their expected performance), national genotyping strategies enable complete traceability from the cradle to the grave as well as parentage discovery. The information available per animal supports better-informed breeding and management decisions, including mating advice and determining the optimal role and eventual fate of each animal.
Subjects
Breeding, Animals, Genetic Testing/veterinary, Genomics, Genotype, Livestock/genetics
ABSTRACT
U.S. Holstein cattle provide unprecedentedly large samples for genomic evaluation, with single nucleotide polymorphism (SNP) genotypes and phenotypic observations of dairy quantitative traits. Such large samples offer unprecedented opportunities for discovering genetic variants and mechanisms affecting quantitative traits in Holstein cattle. Recent studies that used these large Holstein samples to find genetic variants affecting quantitative traits include a fat percentage study and two studies on reproductive traits. The fat percentage study confirmed that a chromosome region interacted with all chromosomes, and the reproductive studies detected sharply negative homozygous recessive genotypes that were recommended for heifer culling. These novel findings illustrate the power of large-sample genomic mining for quantitative traits.
ABSTRACT
The exact accuracy of estimated breeding values can be calculated based on the prediction error variances obtained from the diagonal of the inverse of the left-hand side (LHS) of the mixed model equations (MME). However, inverting the LHS is not computationally feasible for large datasets, especially if genomic information is available. Thus, different algorithms have been proposed to approximate accuracies. This study aimed to: 1) compare the approximated accuracies from 2 algorithms implemented in the BLUPF90 suite of programs, 2) compare the approximated accuracies from the 2 algorithms against the exact accuracy based on the inversion of the LHS of MME, and 3) evaluate the impact of adding genotyped animals with and without phenotypes on the exact and approximated accuracies. Algorithm 1 approximates accuracies based on the diagonal of the genomic relationship matrix (G). In turn, algorithm 2 combines accuracies with and without genomic information through effective record contributions. The data were provided by the American Angus Association and included 3 datasets of growth, carcass, and marbling traits. The genotype file contained 1,235,930 animals, and the pedigree file contained 12,492,581 animals. For the genomic evaluation, a multi-trait model was applied to the datasets. To ensure the feasibility of inverting the LHS of the MME, a subset of data under single-trait models was used to compare approximated and exact accuracies. The correlations between exact and approximated accuracies from algorithms 1 and 2 of genotyped animals ranged from 0.87 to 0.90 and 0.98 to 0.99, respectively. The intercept and slope of the regression of exact on approximated accuracies from algorithm 2 ranged from 0.00 to 0.01 and 0.82 to 0.87, respectively. However, the intercept and the slope for algorithm 1 ranged from -0.10 to 0.05 and 0.98 to 1.10, respectively. In more than 80% of the traits, algorithm 2 exhibited a smaller mean square error than algorithm 1. 
The correlation between the approximated accuracies obtained from algorithms 1 and 2 ranged from 0.56 to 0.74, 0.38 to 0.71, and 0.71 to 0.97 in the groups of genotyped animals, genotyped animals without phenotype, and proven genotyped sires, respectively. The approximated accuracy from algorithm 2 showed a closer behavior to the exact accuracy when including genotyped animals in the analysis. According to the results, algorithm 2 is recommended for genetic evaluations since it proved more precise.
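As an illustration of the "exact" accuracy that the approximations are compared against, the following is a minimal sketch of deriving accuracies from the inverse of the LHS of the MME. It assumes a toy single-trait animal model with one fixed effect (an overall mean), three unrelated, non-inbred animals with one record each, and variance components chosen only for illustration. The function names are hypothetical and are not part of the BLUPF90 suite.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def exact_accuracies(lhs, n_fixed, var_a, var_e):
    """Accuracy per animal from the diagonal of the inverse LHS.

    PEV_i = C_ii * var_e, reliability_i = 1 - PEV_i / var_a (non-inbred animals),
    accuracy_i = sqrt(reliability_i).
    """
    inv = invert(lhs)
    accuracies = []
    for i in range(n_fixed, len(lhs)):
        pev = inv[i][i] * var_e
        reliability = 1.0 - pev / var_a
        accuracies.append(math.sqrt(max(reliability, 0.0)))
    return accuracies

# Toy variance components (assumed): heritability = 1 / (1 + 3) = 0.25.
var_a, var_e = 1.0, 3.0
lam = var_e / var_a

# LHS = [[X'X, X'Z], [Z'X, Z'Z + lam * A^-1]] with A = I (unrelated animals).
lhs = [
    [3.0, 1.0, 1.0, 1.0],
    [1.0, 1.0 + lam, 0.0, 0.0],
    [1.0, 0.0, 1.0 + lam, 0.0],
    [1.0, 0.0, 0.0, 1.0 + lam],
]
accuracies = exact_accuracies(lhs, n_fixed=1, var_a=var_a, var_e=var_e)
print(accuracies)
```

The full inversion is what becomes infeasible at the scale reported here (millions of animals), which is the motivation for the approximation algorithms.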
The genomic estimated breeding value (GEBV) represents an animal's genetic merit calculated using a combination of phenotypes, pedigree, and genomic information through a procedure known as single-step genomic best linear unbiased prediction (ssGBLUP). The accuracy of a GEBV reflects how closely it correlates with the true breeding value. However, calculating accuracies is not computationally feasible for large datasets with genomic information. In this context, methods for approximating accuracies have been proposed and implemented into genetic evaluations. This study aimed to compare 2 algorithms to approximate accuracies for ssGBLUP. In algorithm 1, genomic contributions are based on the diagonal of the genomic relationship matrix (G), combined with contributions from animal records and pedigrees. In turn, algorithm 2 combines accuracies with and without genomic information through effective record contributions. The data for this study were provided by the American Angus Association and included datasets of growth, carcass, and marbling traits. Genotypes were available for 1,235,930 animals, and the pedigree had 12,492,581 animals. We showed that algorithm 2 is better suited for approximating accuracies, as its approximations closely matched the exact accuracy values obtained from the inverse of the mixed model equations.
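The idea of combining information "through effective record contributions" can be sketched as follows. This is a simplified illustration, not the algorithm evaluated in the study: it assumes the common conversion between reliability and effective record contributions (ERC), ERC = λ · rel / (1 − rel) with λ = (1 − h²) / h², and it assumes the two sources of information are simply additive on the ERC scale. All function names and input values are hypothetical.

```python
import math

def rel_to_erc(rel, h2):
    """Convert a reliability to effective record contributions (assumed conversion)."""
    lam = (1.0 - h2) / h2
    return lam * rel / (1.0 - rel)

def erc_to_rel(erc, h2):
    """Convert effective record contributions back to a reliability."""
    lam = (1.0 - h2) / h2
    return erc / (erc + lam)

def combine_reliabilities(rel_without_genomic, rel_genomic, h2):
    """Add the two information sources on the ERC scale (additivity is an assumption)."""
    total_erc = rel_to_erc(rel_without_genomic, h2) + rel_to_erc(rel_genomic, h2)
    return erc_to_rel(total_erc, h2)

h2 = 0.25  # assumed heritability
combined_rel = combine_reliabilities(0.30, 0.50, h2)
combined_acc = math.sqrt(combined_rel)  # accuracy is the square root of reliability
print(combined_rel, combined_acc)
```

Working on the ERC scale keeps the combined reliability below 1 however much information is added, which a direct sum of reliabilities would not guarantee.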
Subjects
Algorithms, Breeding, Genotype, Models, Genetic, Animals, Genomics, Cattle/genetics, Male, Female, Phenotype, Pedigree
ABSTRACT
The objectives of this study were to investigate the computational performance and the predictive ability and bias of a single-step SNP BLUP model (ssSNPBLUP) in genotyped young animals with unknown-parent groups (UPG) for type traits, using national genetic evaluation data from the Japanese Holstein population. The phenotype, genotype, and pedigree data were the same as those used in a national genetic evaluation of linear type traits classified between April 1984 and December 2020. In the current study, 2 data sets were prepared: the full data set containing all entries up to December 2020 and a truncated data set ending with December 2016. Genotyped animals were classified into 3 types: sires with classified daughters (S), cows with records (C), and young animals (Y). The computing performance and prediction accuracy of ssSNPBLUP were compared for the following 3 groups of genotyped animals: sires with classified daughters and young animals (SY); cows with records and young animals (CY); and sires with classified daughters, cows with records, and young animals (SCY). In addition, we tested 3 parameters of residual polygenic variance in ssSNPBLUP (0.1, 0.2, or 0.3). Daughter yield deviations (DYD) for the validation bulls and phenotypes adjusted for all fixed effects and random effects other than animal and residual (Yadj) for the validation cows were obtained using the full data set from the pedigree-based BLUP model. The regression coefficients of DYD for bulls (or Yadj for cows) on the genomic estimated breeding value (GEBV) using the truncated data set were used to measure the inflation of the predictions of young animals. The coefficient of determination of DYD on GEBV was used to measure the predictive ability of the predictions for the validation bulls. The reliability of the predictions for the validation cows was calculated as the square of the correlation between Yadj and GEBV divided by heritability. 
The predictive ability was highest in the SCY group and lowest in the CY group. However, minimal difference was found in predictive abilities with or without UPG models using different parameters of residual polygenic variance. The regression coefficients approached 1.0 as the parameter of residual polygenic variance increased, but regression coefficients were mostly similar regardless of the use of UPG across the groups of genotyped animals. The ssSNPBLUP model, including UPG, was demonstrated as feasible for implementation in the national evaluation of type traits in Japanese Holsteins.
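The validation statistics described above can be sketched in a few lines. The example below assumes simple unweighted linear regression and uses made-up toy values. The function name, the data, and the heritability are hypothetical and are not taken from the study.

```python
def regress(y, x):
    """Simple linear regression of y on x: returns slope, intercept, and R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2

# Validation bulls: DYD from the full data regressed on GEBV from the truncated data.
gebv_bulls = [1.0, 2.0, 3.0, 4.0]
dyd_bulls = [0.8, 1.7, 2.6, 3.5]
slope, intercept, r2_bulls = regress(dyd_bulls, gebv_bulls)
# A slope below 1 suggests inflated (overdispersed) predictions; R^2 measures
# the predictive ability for the validation bulls.

# Validation cows: reliability = squared correlation(Yadj, GEBV) / heritability.
gebv_cows = [1.0, 2.0, 3.0, 4.0]
yadj_cows = [2.0, 1.0, 4.0, 3.0]
h2 = 0.4  # assumed heritability of the trait
_, _, r2_cows = regress(yadj_cows, gebv_cows)
reliability_cows = r2_cows / h2
print(slope, r2_bulls, reliability_cows)
```

Dividing by heritability accounts for the fact that an adjusted phenotype is itself only a noisy measure of the true breeding value.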
Subjects
Cattle, Polymorphism, Single Nucleotide, Animals, Cattle/genetics, Female, Male, Genotype, Models, Genetic, Pedigree, Phenotype, Reproducibility of Results
ABSTRACT
Transmission ratio distortion (TRD), which is a deviation from Mendelian expectations, has been associated with basic mechanisms of life such as sperm and ova fertility and viability at developmental stages of the reproductive cycle. In this study, different models including TRD regions were tested for different reproductive traits [days from first service to conception (FSTC), number of services, first service nonreturn rate (NRR), and stillbirth (SB)]. Thus, in addition to a basic model with systematic and random effects, including genetic effects modeled through a genomic relationship matrix, we developed 2 additional models: one including a second genomic relationship matrix based on TRD regions, and one including TRD regions as a random effect assuming heterogeneous variances. The analyses were performed with 10,623 cows and 1,520 bulls genotyped for 47,910 SNPs, 590 TRD regions, and a number of records ranging from 9,587 (FSTC) to 19,667 (SB). The results of this study showed the ability of TRD regions to capture some additional genetic variance for some traits; however, this did not translate into higher accuracy for genomic prediction. This could be explained by the nature of TRD itself, which may arise in different stages of the reproductive cycle. Nevertheless, important effects of TRD regions were found on SB (31 regions) and NRR (18 regions) when comparing at-risk versus control matings, especially for regions with an allelic TRD pattern. In particular, for specific TRD regions the probability of observing a nonpregnant cow (NRR) increased by up to 27%, and the probability of observing a stillbirth increased by up to 254%. These results support the relevance of several TRD regions for some reproductive traits, especially those with allelic patterns, which have not received as much attention as recessive TRD patterns.
ABSTRACT
There is growing interest in improving feed efficiency traits in dairy cattle. The objectives of this study were to estimate the genetic parameters of residual feed intake (RFI) and its component traits [dry matter intake (DMI), metabolic body weight (MBW), and average daily gain (ADG)] in Holstein heifers, and to develop a system for genomic evaluation for RFI in Holstein dairy calves. The RFI data were collected from 6,563 growing Holstein heifers (initial body weight = 261 ± 52 kg; initial age = 266 ± 42 d) for 70 d, across 182 trials conducted between 2014 and 2022 at the STgenetics Ohio Heifer Center (South Charleston, OH) as part of the EcoFeed program, which aims to improve feed efficiency by genetic selection. The RFI was estimated as the difference between a heifer's actual feed intake and expected feed intake, which was determined by regression of DMI against midpoint MBW, age, and ADG across each trial. A total of 61,283 SNPs were used in genomic analyses. Animals with phenotypes and genotypes were used as training population, and 4 groups of prediction population, each with 2,000 animals, were selected from a pool of Holstein animals with genotypes, based on their relationship with the training population. All traits were analyzed using univariate animal model in DMU version 6 software. Pedigree information and genomic information were used to specify genetic relationships to estimate the variance components and genomic estimated breeding values (GEBV), respectively. Breeding values of the prediction population were estimated by using the 2-step approach: deriving the prediction equation of GEBV from the training population for estimation of GEBV of prediction population with only genotypes. Reliability of breeding values was obtained by approximation based on partitioning a function of the accuracy of training population GEBV and magnitudes of genomic relationships between individuals in the training and prediction population. 
Heifers had DMI (mean ± SD) of 8.11 ± 1.59 kg over the trial period, with growth rate of 1.08 ± 0.25 kg/d. The heritability estimates (mean ± SE) of RFI, MBW, DMI, and growth rate were 0.24 ± 0.02, 0.23 ± 0.02, 0.27 ± 0.02, and 0.19 ± 0.02, respectively. The range of genomic predicted transmitted abilities (gPTA) of the training population (-0.94 to 0.75) was higher compared with the range of gPTA (-0.82 to 0.73) of different groups of prediction population. Average reliability of breeding values from the training population was 58%, and that of prediction population was 39%. The genomic prediction of RFI provides new tools to select for feed efficiency of heifers. Future research should be directed to find the relationship between RFI of heifers and cows, to select individuals based on their lifetime production efficiencies.
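The definition of RFI used above (actual intake minus the intake expected from a regression of DMI on midpoint MBW, age, and ADG within a trial) can be sketched as an ordinary least-squares residual. The example assumes a single trial, an intercept in the regression, and made-up values for six heifers. The function names are hypothetical and this is not the software used in the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def rfi_within_trial(dmi, covariates):
    """RFI = observed DMI minus DMI predicted by OLS on the covariates (one trial)."""
    n = len(dmi)
    # Centering covariates improves numerical conditioning and leaves residuals unchanged.
    centered = [[v - sum(c) / n for v in c] for c in covariates]
    rows = [[1.0] + [c[i] for c in centered] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, dmi)) for a in range(k)]
    beta = solve(xtx, xty)
    return [y - sum(bv * xv for bv, xv in zip(beta, r)) for r, y in zip(rows, dmi)]

# Toy data for one trial (values are illustrative only).
mbw = [62.0, 65.0, 68.0, 70.0, 73.0, 66.0]     # midpoint metabolic body weight
age = [250.0, 270.0, 260.0, 290.0, 300.0, 240.0]  # age, d
adg = [0.90, 1.10, 1.00, 1.20, 1.30, 1.05]     # average daily gain, kg/d
dmi = [7.2, 8.0, 8.1, 8.9, 9.4, 7.9]           # dry matter intake, kg/d

rfi = rfi_within_trial(dmi, [mbw, age, adg])
print(rfi)  # negative values: the heifer ate less than expected (more efficient)
```

Because RFI is a regression residual, it averages zero within a trial and is uncorrelated with the covariates used to predict intake.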
Subjects
Eating, Genome, Humans, Cattle/genetics, Animals, Female, Reproducibility of Results, Eating/genetics, Genomics, Body Weight/genetics, Animal Feed
ABSTRACT
BACKGROUND: To reduce the cost of genomic selection, a low-density (LD) single nucleotide polymorphism (SNP) chip can be used in combination with imputation for genotyping selection candidates instead of using a high-density (HD) SNP chip. Next-generation sequencing (NGS) techniques have been increasingly used in livestock species but remain expensive for routine use for genomic selection. An alternative and cost-efficient solution is to use restriction site-associated DNA sequencing (RADseq) techniques to sequence only a fraction of the genome using restriction enzymes. From this perspective, use of RADseq techniques followed by an imputation step on HD chip as alternatives to LD chips for genomic selection was studied in a pure layer line. RESULTS: Genome reduction and sequencing fragments were identified on reference genome using four restriction enzymes (EcoRI, TaqI, AvaII and PstI) and a double-digest RADseq (ddRADseq) method (TaqI-PstI). The SNPs contained in these fragments were detected from the 20X sequence data of the individuals in our population. Imputation accuracy on HD chip with these genotypes was assessed as the mean correlation between true and imputed genotypes. Several production traits were evaluated using single-step GBLUP methodology. The impact of imputation errors on the ranking of the selection candidates was assessed by comparing a genomic evaluation based on ancestry using true HD or imputed HD genotyping. The relative accuracy of genomic estimated breeding values (GEBVs) was investigated by considering the GEBVs estimated on offspring as a reference. With AvaII or PstI and ddRADseq with TaqI and PstI, more than 10 K SNPs were detected in common with the HD SNP chip, resulting in an imputation accuracy greater than 0.97. The impact of imputation errors on genomic evaluation of the breeders was reduced, with a Spearman correlation greater than 0.99. Finally, the relative accuracy of GEBVs was equivalent. 
CONCLUSIONS: RADseq approaches can be interesting alternatives to low-density SNP chips for genomic selection. With more than 10 K SNPs in common with the SNPs of the HD SNP chip, good imputation and genomic evaluation results can be obtained. However, with real data, heterogeneity between individuals with missing data must be considered.
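The imputation accuracy measure named above (mean correlation between true and imputed genotypes) can be sketched as follows. The abstract does not say whether correlations were computed per individual or per SNP, so this example assumes per-individual Pearson correlations over genotypes coded 0/1/2, averaged across individuals. The function names and genotypes are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns None when either vector has no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Mean per-individual correlation between true and imputed genotype vectors."""
    cors = [pearson(t, i) for t, i in zip(true_genotypes, imputed_genotypes)]
    cors = [c for c in cors if c is not None]  # skip individuals with no variation
    return sum(cors) / len(cors)

# Two individuals, five SNPs each; one genotype is imputed wrongly for individual 2.
true_g = [[0, 1, 2, 1, 0], [2, 2, 1, 0, 1]]
imputed_g = [[0, 1, 2, 1, 0], [2, 1, 1, 0, 1]]
accuracy = imputation_accuracy(true_g, imputed_g)
print(accuracy)
```

A correlation penalizes errors at rare alleles more than a simple concordance rate does, which is one reason it is often preferred for this purpose.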
Subjects
Chickens, Polymorphism, Single Nucleotide, Animals, Chickens/genetics, Genome, Genomics/methods, Genotype, Sequence Analysis, DNA
ABSTRACT
Hanwoo beef cattle are well known for the flavor and tenderness of their meat. Genetic improvement programs have been extremely successful over the last 40 yr. Recently, genomic selection was initiated in Hanwoo to enhance genetic progress. Routine genomic evaluation based on the single-step breeding value model was implemented in 2020 for all economically important traits. In this study, we tested a single-step marker effect model for the genomic evaluation of four carcass traits, namely, carcass weight (CW), eye muscle area, backfat thickness, and marbling score. In total, 8,023,666 animals with carcass records were jointly evaluated, including 29,965 genotyped animals. To assess the prediction stability of the single-step model, carcass data from the last 4 yr were removed in a forward validation study. The estimated genomic breeding values (GEBV) of the validation animals and other animals were compared between the truncated and full evaluations. A parallel conventional best linear unbiased prediction (BLUP) evaluation with either the full or the truncated dataset was also conducted for comparison with the single-step model. The estimates of the marker effect from the truncated evaluation were highly correlated with those from the full evaluation, ranging from 0.88 to 0.92. The regression coefficients of the estimates of the marker effect for the full and truncated evaluations were close to their expected value of 1, indicating unbiased estimates for all carcass traits. Estimates of the marker effect revealed three chromosomal regions (chromosomes 4, 6, and 14) harboring the major genes for CW in Hanwoo. For validation of cows or steers, the single-step model had a much higher R2 value for the linear regression model than the conventional BLUP model. Based on the regression intercept and slope of the validation, the single-step evaluation was neither inflated nor deflated. 
For genotyped animals, the estimated GEBV from the full and truncated evaluations were more correlated than the estimated breeding values from the two conventional BLUP evaluations. The single-step model provided a more accurate and stable evaluation over time.
Hanwoo beef cattle are well known for the flavor and tenderness of their meat. Genetic improvement programs have been successful over the last 40 yr. Recently, genomic selection was initiated in Hanwoo to enhance genetic progress. A routine genomic evaluation based on the single-step breeding value model was implemented in 2020 for all economically important traits. In this study, we tested a single-step marker effect model for the genomic evaluation of four carcass traits. In total, 8,023,666 cows or steers with carcass records were jointly evaluated, including 29,965 genotyped animals. To assess the prediction accuracy of the single-step model, carcass data from the last 4 yr were removed in a forward validation study. Estimated genomic breeding values (GEBV) of validation animals were compared between truncated and full evaluations. A parallel conventional best linear unbiased prediction (BLUP) evaluation with either the full or truncated dataset was conducted for comparison with the single-step model. Plots of the estimates of the marker effect showed three chromosomal regions harboring the major genes for carcass weight in Hanwoo. The single-step model yielded a more accurate and stable evaluation over time than the conventional BLUP model.
Subjects
Models, Genetic, Quantitative Trait, Heritable, Female, Cattle/genetics, Animals, Genome, Genomics, Phenotype, Genotype, Republic of Korea
ABSTRACT
The implementation of genomic selection for six German beef cattle populations was evaluated. Although the multiple-step implementation of genomic selection is the status quo in most national dairy cattle evaluations, the breeding structure of German beef cattle, coupled with the shortcoming and complexity of the multiple-step method, makes single step a more attractive option to implement genomic selection in German beef cattle populations. Our objective was to develop a national beef cattle single-step genomic evaluation in five economically important traits in six German beef cattle populations and investigate its impact on the accuracy and bias of genomic evaluations relative to the current pedigree-based evaluation. Across the six breeds in our study, 461,929 phenotyped and 14,321 genotyped animals were evaluated with a multi-trait single-step model. To validate the single-step model, phenotype data in the last 2 years were removed in a forward validation study. For the conventional and single-step approaches, the genomic estimated breeding values of validation animals and other animals were compared between the truncated and the full evaluations. The correlation of the GEBVs between the full and truncated evaluations in the validation animals was slightly higher in the single-step evaluation. The regression of the full GEBVs on truncated GEBVs was close to the optimal value of 1 for both the pedigree-based and the single-step evaluations. The SNP effect estimates from the truncated evaluation were highly correlated with those from the full evaluation, with values ranging from 0.79 to 0.94. The correlation of the SNP effect was influenced by the number of genotyped animals shared between the full and truncated evaluations. The regression coefficients of the SNP effect of the full evaluation on the truncated evaluation were all close to the expected value of 1, indicating unbiased estimates of the SNP markers for the production traits. 
The Manhattan plot of the SNP effect estimates identified chromosomal regions harbouring major genes for muscling and body weight in breeds of French origin. Based on the regression intercept and slope of the GEBVs of validation animals, the single-step evaluation was neither inflated nor deflated across the six breeds. Overall, the single-step model resulted in a more accurate and stable evaluation. However, due to the small number of genotyped individuals, the single-step method only provided slightly better results when compared to the pedigree-based method.
Subjects
Genomics, Nonoxynol, Animals, Cattle/genetics, Genotype, Body Weight, Pedigree
ABSTRACT
In dairy cattle, identifying polymorphisms that contribute to complex economic traits such as residual feed intake (RFI) is challenging and demands accurate genotyping. In this study, we compared imputed genotypes (n = 192 cows) to those obtained using the TaqMan and high-resolution melting (HRM) methods (n = 114 cows), for mutations in the FABP4 gene that had been suggested to have a large effect on RFI. Combining the whole genome sequence (n = 19 bulls) and the cows' BovineHD BeadChip allowed imputing genotypes for these mutations that were verified by Sanger sequencing, whereas error rates of 11.6% and 10.7% were encountered for HRM and TaqMan, respectively. We show that these error rates seriously affected the linkage-disequilibrium analysis that supported this gene candidacy over other BTA14 gene candidates. Thus, imputation produced superior genotypes and should also be regarded as a method of choice to validate the reliability of the genotypes obtained by other methodologies that are prone to genotyping errors due to technical conditions. These results support the view that RFI is a complex trait and that searching for the causative sequence variation underlying cattle RFI should await the development of statistical methods suitable to handle additive and epistatic interactions.
Subjects
Genome, Female, Cattle/genetics, Animals, Male, Genotype, Reproducibility of Results, Linkage Disequilibrium
ABSTRACT
The single-step genomic BLUP (ssGBLUP) model for routine genomic prediction of breeding values is being developed intensively for many dairy cattle populations. Compatibility between the genomic (G) and the pedigree (A) relationship matrices is required in ssGBLUP and remains an important challenge. The compatibility relates to the amount of missing pedigree information. There are two prevailing approaches to account for incomplete pedigree information: unknown parent groups (UPG) and metafounders (MF). Unknown parent groups have been used routinely in pedigree-based evaluations to account for the differences in genetic level between groups of animals with missing parents. The MF approach is an extension of the UPG approach. The MF approach defines MF, which are related pseudo-individuals. The MF approach needs a Γ matrix, with dimension equal to the number of MF, to describe relationships between MF. The UPG and MF can be the same. However, the challenge in the MF approach is the estimation of Γ when there are many MF, as is typically needed in dairy cattle. In our study, we present an approach to fit the same number of MF as UPG in ssGBLUP with the Woodbury matrix identity (ssGTBLUP). We used 305-day milk, protein, and fat yield data from the DFS (Denmark, Finland, Sweden) Red Dairy cattle population. The pedigree had more than 6 million animals, of which 207,475 were genotyped. We constructed the preliminary gamma matrix (Γ_pre) with 29 MF, which was expanded to 148 MF by a covariance function (Γ_148). The quality of the extrapolation of the Γ_pre matrix was studied by comparing average off-diagonal elements between breed groups. On average, relationships among MF in Γ_148 were 1.8% higher than in Γ_pre. The use of Γ_148 increased the correlation between the G and A matrices by 0.13 and 0.11 for the diagonal and off-diagonal elements, respectively. [G]EBV were predicted using the ssGTBLUP and Pedigree-BLUP models with the MF and UPG. 
The prediction reliabilities were slightly higher for the ssGTBLUP model using MF than UPG. The ssGBLUP MF model showed less overprediction compared to other models.
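The abstract names the Woodbury matrix identity without stating it. For reference, its general form is given below with generic symbols: M is an invertible n × n matrix, K an invertible k × k matrix, and U and V conformable n × k and k × n matrices. These symbols are chosen for illustration only, and how ssGTBLUP maps the genomic relationship matrix onto them is not described in the abstract.

```latex
\[
  (M + U K V)^{-1}
  = M^{-1} - M^{-1} U \left( K^{-1} + V M^{-1} U \right)^{-1} V M^{-1}
\]
```

The practical appeal is that when M is cheap to invert and k is much smaller than n, only a k × k matrix has to be inverted explicitly instead of the full n × n matrix.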
ABSTRACT
Changes in the accuracy of the genomic estimates obtained by the ssGBLUP and wssGBLUP methods were evaluated using different reference groups. The reasonableness of applying the weighting procedure was considered as a way to improve the accuracy of genomic predictions for meat, fattening and reproduction traits in pigs. Six reference groups (groups of genotyped animals) were formed to assess the impact of the quantity of genomic data on the accuracy of predicted values. The datasets included 62,927 records of meat and fattening productivity (fat thickness over ribs 6-7 (BF1, mm), muscle depth (MD, mm) and precocity up to 100 kg (age, days)) and 16,070 observations of reproductive qualities (the number of all born piglets (TNB) and the number of live-born piglets (NBA), according to the results of the first farrowing). The wssGBLUP method has an advantage over ssGBLUP in terms of estimation reliability. When using a small reference group, the difference in the accuracy of ssGBLUP over BLUP AM is from -1.9 to +7.3 percentage points, while for wssGBLUP, the change in accuracy varies from +18.2 to +87.3 percentage points. Furthermore, the superiority of wssGBLUP is also maintained for the largest group of genotyped animals: from +4.7 to +15.9 percentage points for ssGBLUP and from +21.1 to +90.5 percentage points for wssGBLUP. However, for all analyzed traits, the number of markers explaining 5% of genetic variability varied from 71 to 108, and the number of such SNPs varied depending on the size of the reference group (79-88 for BF1, 72-81 for MD, 71-108 for age). The results of the genetic variation distribution have the greatest similarity between groups of about 1000 and about 1500 individuals. Thus, a reference group of more than 1000 individuals gives more stable results for estimation based on the wssGBLUP method, while using a reference group of 500 individuals can lead to distorted GEBV results.
ABSTRACT
Approximate multistep methods to calculate reliabilities for estimated breeding values in large genetic evaluations were developed for single-trait (ST-R2A) and multitrait (MT-R2A) single-step genomic BLUP (ssGBLUP) models. First, a traditional animal model was used to estimate the amount of nongenomic information for the genotyped animals. Second, this information was used with genomic data in a genomic BLUP model (genomic BLUP/SNP-BLUP) to approximate the total amount of information and ssGBLUP reliabilities for the genotyped animals. Finally, reliabilities for the nongenotyped animals were calculated using a traditional animal model where the increased information due to genomic data for the genotyped animals is accounted for by including pseudo-record counts for the genotyped animals. The approaches were tested using a multiple-trait ssGBLUP model on 2 data sets. The first data set (data 1) was small enough such that exact ssGBLUP model reliabilities could be computed by inversion and compared with the approximation method reliabilities. Data 1 had 46,535 first-, 35,290 second-, and 23,780 third-lactation 305-d milk yield records from 47,124 Finnish Red dairy cows. The pedigree comprised 64,808 animals, of which 19,757 were genotyped. We examined the efficiency of the MT-R2A approximation on a large data set (data 2) derived from the joint Nordic (Danish, Finnish, and Swedish) Holstein dairy cattle data. Data 2 had 17.8 million 305-d milk records from 8.3 million cows and first 3 lactations. The pedigree had 11 million animals of which 274,145 were genotyped on 46,342 SNP markers. For data 1, correlations between the exact ssGBLUP model and the ST-R2A for the genotyped (nongenotyped) animals were 0.995 (0.987), 0.965 (0.984), and 0.950 (0.983) for first, second, and third lactation, respectively. 
Correspondingly, correlations between exact ssGBLUP reliabilities and MT-R2A for the genotyped (nongenotyped) animals were 0.995 (0.993), 0.992 (0.991), and 0.990 (0.990) for first, second, and third lactation, respectively. The regression coefficients (b1) of ssGBLUP reliability on ST-R2A for the genotyped (nongenotyped) animals ranged from 0.87 (0.94) for first lactation to 0.68 (0.93) for third lactation, whereas for MT-R2A they were between 0.91 (0.99) for first lactation to 0.89 (0.99) for third lactation. Correspondingly, the intercepts varied from 0.11 (0.05) to 0.3 (0.06) for ST-R2A and from 0.06 (0.01) to 0.07 (0.02) for MT-R2A. The computing time for the approximation method was approximately 12% of that required by the direct exact approach. In conclusion, the developed approximate approach allows calculating estimated breeding value reliabilities in the ssGBLUP model even for large data sets.
Subjects
Genome, Models, Genetic, Animals, Cattle/genetics, Female, Genomics/methods, Genotype, Pedigree, Phenotype, Reproducibility of Results
ABSTRACT
Microarray-based genomic selection is a central tool to increase the genetic gain of economically significant traits in dairy cattle. Yet, the effectiveness of this tool is slightly limited, as estimates based on genotype data only partially explain the observed heritability. In the analysis of the genomes of 17 Israeli Holstein bulls, we compared genotyping accuracy between whole-genome sequencing (WGS) and microarray-based techniques. Using the standard GATK pipeline, the short-variant discovery within sequence reads mapped to the reference genome (ARS-UCD1.2) was compared to the genotypes from Illumina BovineSNP50 BeadChip and to an alternative method, which computationally mimics the hybridization procedure by mapping reads to 50 bp spanning the BeadChip source sequences. The number of mismatches between the BeadChip and WGS genotypes was low (0.2%). However, 17,197 SNPs (40% of the informative SNPs) had extra variation within 50 bp of the targeted SNP site, which might interfere with hybridization-based genotyping. Consequently, with respect to genotyping errors, BeadChip varied significantly and systematically from WGS genotyping, introducing null allele-like effects and Mendelian errors (<0.5%), whereas the GATK algorithm of local de novo assembly of haplotypes successfully resolved the genotypes in the extra-variable regions. These findings suggest that the microarray design should avoid polymorphic genomic regions that are prone to extra variation and that WGS data may be used to resolve erroneous genotyping, which may partially explain missing heritability.
Subjects
Genome, Polymorphism, Single Nucleotide, Animals, Cattle/genetics, Genomics, Genotype, Haplotypes/genetics, Male, Polymorphism, Single Nucleotide/genetics
ABSTRACT
OBJECTIVE: Thoroughbred horses have been bred exclusively for racing in England for a long time. Additionally, because horse racing is a global sport, a healthy leisure activity for ordinary citizens, and a high-value business, systematic racehorse breeding at the population level is a requirement for continuous industrial development. Therefore, we established a genomic evaluation system (using prize money as the horse racing trait) to produce a spirited, agile, and strong racehorse population. METHODS: We used phenotypic data from 25,061 Thoroughbred horses (all registered individuals in Korea) that competed in races between 1994 and 2019 at the Korea Racing Authority and constructed pedigree structures. We quantified the improvement in racehorse breeding output by year in Korea, and this aided in the establishment of a high-level horse-fill industry. RESULTS: We found that the pedigree-based best linear unbiased prediction method improved the racing performance of the Thoroughbred population with high accuracy, making it possible to construct an excellent Thoroughbred racehorse population in Korea. CONCLUSION: This study could be used to develop an efficient breeding program at the population level for Korean Thoroughbred racehorse populations as well as others.
RESUMO
The objectives of this study were to develop an efficient algorithm for calculating prediction error variances (PEVs) for genomic best linear unbiased prediction (GBLUP) models using the Algorithm for Proven and Young (APY), extend it to single-step GBLUP (ssGBLUP), and apply this algorithm for approximating the theoretical reliabilities for single- and multiple-trait models in ssGBLUP. The PEV with APY was calculated by block sparse inversion, efficiently exploiting the sparse structure of the inverse of the genomic relationship matrix with APY. Single-step GBLUP reliabilities were approximated by combining reliabilities with and without genomic information in terms of effective record contributions. Multi-trait reliabilities relied on single-trait results adjusted using the genetic and residual covariance matrices among traits. Tests involved two datasets provided by the American Angus Association. A small dataset (Data1) was used for comparing the approximated reliabilities with the reliabilities obtained by the inversion of the left-hand side of the mixed model equations. A large dataset (Data2) was used for evaluating the computational performance of the algorithm. Analyses with both datasets used single-trait and three-trait models. The number of animals in the pedigree ranged from 167,951 in Data1 to 10,213,401 in Data2, with 50,000 and 20,000 genotyped animals for single-trait and multiple-trait analysis, respectively, in Data1 and 335,325 in Data2. Correlations between estimated and exact reliabilities obtained by inversion ranged from 0.97 to 0.99, whereas the intercept and slope of the regression of the exact on the approximated reliabilities ranged from 0.00 to 0.04 and from 0.93 to 1.05, respectively. For the three-trait model with the largest dataset (Data2), the elapsed time for the reliability estimation was 11 min. 
The computational complexity of the proposed algorithm increased linearly with the number of genotyped animals and with the number of traits in the model. This algorithm can efficiently approximate the theoretical reliability of genomic estimated breeding values in ssGBLUP with APY for large numbers of genotyped animals at a low cost.
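The sparse structure that the abstract refers to can be illustrated with the commonly cited APY form of the inverse, in which the non-core block is diagonal. The sketch below is a small dense illustration under that assumption and is not the block sparse inversion algorithm of the study; the function name, the toy marker matrix, and the choice of core animals are hypothetical.

```python
import numpy as np

def apy_inverse(G, core):
    """APY inverse of G for a given set of core animals (dense toy version).

    G_apy^{-1} = [[Gcc^{-1}, 0], [0, 0]]
                 + [[-Gcc^{-1} Gcn], [I]] Mnn^{-1} [[-Gnc Gcc^{-1}, I]],
    where Mnn is diagonal with elements g_ii - g_ic Gcc^{-1} g_ci.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # core x noncore
    # Diagonal of Mnn, one value per non-core animal
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    Ginv[np.ix_(core, noncore)] = -P / m
    Ginv[np.ix_(noncore, core)] = (-P / m).T
    Ginv[noncore, noncore] = 1.0 / m  # diagonal non-core block
    return Ginv

# Hypothetical toy genomic relationship matrix for 6 animals and 40 markers.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 40))
G = M @ M.T / 40 + 0.01 * np.eye(6)
Ginv_apy = apy_inverse(G, core=[0, 1, 2])
print(np.round(Ginv_apy[3:, 3:], 3))  # non-core block is diagonal
```

Only the core block requires a dense inversion, so the cost of building the inverse grows linearly with the number of non-core animals.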
The estimated breeding value (EBV) of an animal measures its genetic merit. For calculating EBVs, pedigree and genomic information are jointly used in a procedure called single-step genomic best linear unbiased prediction (ssGBLUP). Genetic evaluations report each EBV with its reliability, which measures how accurate the breeding value estimation was. Calculating EBVs with ssGBLUP for large datasets is computationally expensive; therefore, the Algorithm for Proven and Young (APY) was developed to reduce its computational cost. However, the procedure for obtaining the reliabilities of EBVs is still computationally infeasible. Thus, this study aimed to develop a new method for approximating reliabilities for ssGBLUP with APY for large datasets. We required this new method to be accurate and to have lower computational requirements than the estimation of breeding values itself. The method that we developed consists of accumulating pedigree and genomic information in successive steps, allowing for computational efficiency. Using a dataset with more than 300,000 genotypes in a pedigree of 10,000,000 animals provided by the American Angus Association, we showed that our proposed method is accurate and computationally efficient, with a correlation of 0.98 between the approximated and target values, running in less than 12 min.
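For reference, the exact reliability that such approximations target can be obtained on a very small example by inverting the left-hand side of the mixed model equations. The sketch below assumes a single-trait, pedigree-only animal model with known variance components; the pedigree (sire, dam, and two full-sib progeny), the variances, and the function name are hypothetical. Reliability is computed as 1 - PEV / (a_ii * var_a), with PEV taken from the diagonal of the inverse.

```python
import numpy as np

def exact_reliabilities(X, Z, A, var_a, var_e):
    """Exact EBV reliabilities from the inverse of the MME left-hand side."""
    lam = var_e / var_a
    A_inv = np.linalg.inv(A)
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * A_inv]])
    C = np.linalg.inv(lhs)  # feasible only for small problems
    n_fixed = X.shape[1]
    pev = np.diag(C)[n_fixed:] * var_e  # prediction error variances
    return 1.0 - pev / (np.diag(A) * var_a)

# Hypothetical pedigree: animals 1 and 2 are unrelated parents of full sibs 3 and 4.
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])
X = np.ones((4, 1))  # overall mean as the only fixed effect
Z = np.eye(4)        # one record per animal
rel = exact_reliabilities(X, Z, A, var_a=0.3, var_e=0.7)
print(np.round(rel, 3))
```

The dense inversion in this sketch is the step that becomes infeasible for large datasets, which is what motivates approximation methods.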
Assuntos
Genoma , Modelos Genéticos , Algoritmos , Animais , Genômica , Genótipo , Linhagem , Fenótipo , Reprodutibilidade dos Testes
RESUMO
Genomic data are widely used in predicting the breeding values of dairy cattle. The accuracy of genomic prediction depends on the size of the reference population and how related the candidate animals are to it. For populations with limited numbers of progeny-tested bulls, the reference populations must include cows and data from external populations. The aim of this study was to implement state-of-the-art single-step genomic evaluations for milk and fat yield in Holstein and Russian Black & White cattle in the Leningrad region (LR, Russia), using only a limited number of genotyped animals. We complemented internal information with external pseudo-phenotypic and genotypic data of bulls from the neighbouring Danish, Finnish and Swedish Holstein (DFS) population. Three data scenarios were used to perform single-step GBLUP predictions in the LR dairy cattle population. The first scenario was based on the original LR reference population, which constituted 1,080 genotyped cows and 427 genotyped bulls. In the second scenario, the genotypes of 414 bulls related to the LR from the DFS population were added to the reference population. In the third scenario, LR data were further augmented with pseudo-phenotypic data from the DFS population. The inclusion of foreign information increased the validation reliability of the milk yield by up to 30%. Suboptimal data recording practices hindered the improvement of fat yield. We confirmed that the single-step model is suitable for populations with a low number of genotyped animals, especially when external information is integrated into the evaluations. Genomic prediction in populations with a low number of progeny-tested bulls can be based on data from genotyped cows and on the inclusion of genotypes and pseudo-phenotypes from the external population. 
This approach increased the validation reliability of the implemented single-step model in the milk yield, but shortcomings in the LR data recording scheme prevented improvements in fat yield.
Assuntos
Genoma , Genômica , Animais , Bovinos/genética , Feminino , Genoma/genética , Genótipo , Masculino , Leite , Modelos Genéticos , Fenótipo , Reprodutibilidade dos Testes
RESUMO
Milk fatty acids (FA) have been suggested as biomarkers for early-lactation metabolic diseases and for female fertility status. The aim of the present study was to infer associations between FA, the metabolic disorder ketosis (KET), and the interval from calving to first insemination (ICF) genetically and genomically. In this regard, we focused on a single-step genomic BLUP approach, allowing consideration of genotyped and ungenotyped cows simultaneously. The phenotypic data set considered 38,375 first-lactation Holstein cows, kept in 45 large-scale co-operator herds from 2 federal states in Germany. The calving years for these cows were from 2014 to 2017. Concentrations in milk from the first official milk recording test-day for saturated, unsaturated (UFA), monounsaturated (MUFA), polyunsaturated, palmitic, and stearic (C18:0) FA were determined via Fourier-transform infrared spectroscopy. Ketosis was defined as a binary trait according to a veterinarian diagnosis key, considering diagnoses within a 6-wk interval after calving. A subset of 9,786 cows was genotyped for 40,989 SNP markers. Variance components and heritabilities for all Gaussian distributed FA and for ICF, and for binary KET were estimated by applying single-step genomic BLUP single-trait linear and threshold models, respectively. Genetic correlations were estimated in series of bivariate runs. Genomic breeding values for the single-step genomic BLUP estimations were dependent traits in single-step GWAS. Heritabilities for FA were moderate in the range from 0.09 to 0.20 (standard error = 0.02-0.03), but quite small for ICF (0.08, standard error = 0.01) and for KET (0.05 on the underlying liability scale, posterior standard deviation = 0.02). Genetic correlations between KET and UFA, MUFA, and C18:0 were large (0.74 to 0.85, posterior standard deviation = 0.14-0.19), and low positive between KET and ICF (0.17, posterior standard deviation = 0.22). 
Genetic correlations between UFA, MUFA, and C18:0 with ICF ranged from 0.34 to 0.46 (standard error = 0.12). In single-step GWAS, we identified a large proportion of overlapping genomic regions for the different FA, especially for UFA and MUFA, and for saturated and palmitic FA. One identical significantly associated SNP was identified for C18:0 and KET on BTA 15. However, there was no genomic segment simultaneously significantly affecting all trait categories ICF, FA, and KET. Nevertheless, some of the annotated potential candidate genes DGKA, IGFBP4, and CXCL8 play a role in lipid metabolism and fertility mechanisms, and influence production diseases in early lactation. Genetic and genomic associations indicate that Fourier-transform infrared spectroscopy FA concentrations in milk from the first official test-day are valuable predictors for KET and for ICF.
Assuntos
Cetose , Leite , Animais , Bovinos/genética , Ácidos Graxos , Feminino , Estudo de Associação Genômica Ampla/veterinária , Genômica , Inseminação , Cetose/genética , Cetose/veterinária , Lactação/genética , Fenótipo
RESUMO
The multiple sire system (MSS) is a common mating scheme in extensive beef production systems. However, MSS does not allow paternity identification and leads to inaccurate genetic predictions. The objective of this study was to investigate the implementation of single-step genomic BLUP (ssGBLUP) in different scenarios of uncertain paternity in the evaluation for 450-day adjusted liveweight (W450) and age at first calving (AFC) in a Nellore cattle population. To estimate the variance components using BLUP and ssGBLUP, the relationship matrix (A) with different proportions of animals with missing sires (MS) (scenarios 0, 25, 50, 75, and 100% of MS) was created. The genotyped animals with MS were randomly chosen, and ten replicates were performed for each scenario and trait. Five groups of animals were evaluated in each scenario: PHE, all animals with phenotypic records in the population; SIR, proven sires; GEN, genotyped animals; YNG, young animals without phenotypes and progeny; and YNGEN, young genotyped animals. The additive genetic variance decreased for both traits as the proportion of MS increased in the population when using the regular REML. When using the ssGBLUP, accuracies ranged from 0.13 to 0.47 for W450 and from 0.10 to 0.25 for AFC. For both traits, the prediction ability of the direct genomic value (DGV) decreased as the percentage of MS increased. These results emphasize that indirect prediction via DGV of young animals is more accurate when the SNP effects are derived from ssGBLUP with a reference population with known sires. The ssGBLUP could be applied in situations of uncertain paternity, especially when selecting young animals. This methodology is shown to be accurate, mainly in scenarios with a high percentage of MS.
Assuntos
Genoma , Modelos Genéticos , Animais , Bovinos/genética , Genômica , Genótipo , Linhagem , Fenótipo
RESUMO
The growing amount of genomic information in dairy cattle has increased computational and modeling challenges in single-step evaluations. The computational challenges are due to the dense inverses of the genomic (G) and pedigree (A₂₂) relationship matrices of genotyped animals in the single-step mixed model equations. Equivalent mixed model equations are given by single-step genomic BLUP based on the T matrix (ssGTBLUP), where these inverses are avoided by expressing G⁻¹ through a product of 2 rectangular matrices, and (A₂₂)⁻¹ through sparse matrix blocks of the inverse of the full relationship matrix A⁻¹. A proper way to account for genetic groups through unknown parent groups (UPG) after the Quaas-Pollak (QP) transformation is one key factor in a single-step model. When the UPG effects are incompletely accounted for, the iterative solving method may have convergence problems. In this study, we investigated the computational and predictive performance of ssGTBLUP with a residual polygenic (RPG) effect and UPG. The QP transformation used A⁻¹ and, in the complete form, the T and (A₂₂)⁻¹ matrices as well. The models were tested with the official Nordic Holstein milk production test-day data and model. The results show that UPG can be easily implemented in ssGTBLUP having RPG. The complete QP transformation was computationally feasible when preconditioned conjugate gradient iteration and iteration on data without explicitly setting up the G or A₂₂ matrices were used. Furthermore, for good convergence of the preconditioned conjugate gradient method, a complete QP transformation was necessary.
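The idea of expressing G⁻¹ through a product of rectangular matrices can be sketched with the Woodbury identity. The example below assumes a simplified G = ZZ'/m + eps*I with a diagonal regularization term, which may differ from the construction used in the study (for example, a residual polygenic part based on A₂₂); the function names and toy dimensions are hypothetical. With K = I + Z'Z/(m*eps) = LL' and T = L⁻¹Z'/(sqrt(m)*eps), the identity gives G⁻¹ = I/eps - T'T, so a product G⁻¹v needs only T.

```python
import numpy as np

def t_matrix(Z, eps):
    """Rectangular T (markers x animals) such that G^{-1} = I/eps - T'T,
    for G = Z Z'/m + eps*I (simplifying assumption)."""
    n, m = Z.shape
    U = Z / np.sqrt(m)
    K = np.eye(m) + (U.T @ U) / eps      # m x m, small when markers < animals
    L = np.linalg.cholesky(K)            # K = L L'
    return np.linalg.solve(L, U.T) / eps

def ginv_times(T, eps, v):
    """Compute G^{-1} v without forming G or its inverse."""
    return v / eps - T.T @ (T @ v)

# Hypothetical toy data: 8 animals, 3 markers.
rng = np.random.default_rng(1)
Z = rng.standard_normal((8, 3))
eps = 0.05
T = t_matrix(Z, eps)
v = rng.standard_normal(8)
G = Z @ Z.T / Z.shape[1] + eps * np.eye(8)
print(np.allclose(ginv_times(T, eps, v), np.linalg.solve(G, v)))
```

A matrix-free product of this kind is what an iterative solver such as preconditioned conjugate gradient requires, since it only ever multiplies the coefficient matrix by a vector.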