ABSTRACT
BACKGROUND: A cost-effective strategy to explore the complete DNA sequence in animals for genetic evaluation purposes is to sequence key ancestors of a population and then use imputation to infer marker genotypes that were not originally reported in a target population of animals genotyped with single nucleotide polymorphism (SNP) panels. The feasibility of this process relies on the accuracy of the genotype imputation in that population, particularly for potential causal mutations, which may be at low frequency and located either within genes or in regulatory regions. The objective of the present study was to investigate the accuracy of imputation to the sequence level in a Nellore beef cattle population, including that for variants in annotation classes that are more likely to be functional. METHODS: Information from 151 key sequenced Nellore sires was used to assess the imputation accuracy from the bovine HD BeadChip SNP panel (~777 k) to whole-genome sequence. The sires were chosen to optimize the imputation accuracy of a genotypic database comprising about 10,000 genotyped Nellore animals. Genotype imputation was performed using two computational approaches: FImpute3 and Minimac4 (after using Eagle for phasing). The accuracy of the imputation was evaluated using a fivefold cross-validation scheme and measured by the squared correlation between observed and imputed genotypes, calculated by individual and by SNP. SNPs were classified into a range of annotations, and the accuracy of imputation within each annotation class was also evaluated. RESULTS: High average imputation accuracies per animal were achieved using both FImpute3 (0.94) and Minimac4 (0.95). On average, common variants (minor allele frequency (MAF) > 0.03) were more accurately imputed by Minimac4 and low-frequency variants (MAF ≤ 0.03) were more accurately imputed by FImpute3. The inherent Minimac4 Rsq imputation quality statistic appears to be a good indicator of the empirical Minimac4 imputation accuracy. Both programs provided high average SNP-wise imputation accuracy for all classes of biological annotations. CONCLUSIONS: Our results indicate that imputation to whole-genome sequence is feasible in Nellore beef cattle since high imputation accuracies per individual are expected. SNP-wise imputation accuracy is software-dependent, especially for rare variants. The accuracy of imputation appears to be relatively independent of annotation classification.
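The accuracy measure described in the abstract above (squared correlation between observed and imputed genotypes, by individual and by SNP) can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts in a matrix with one row per animal and one column per SNP; the function names and the toy data are hypothetical and are not taken from the study.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Returns None when either sequence has zero variance (for example a
    monomorphic SNP), because the correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def accuracy_by_animal(observed, imputed):
    """One squared correlation per row (animal), taken across SNPs."""
    return [squared_correlation(o, i) for o, i in zip(observed, imputed)]


def accuracy_by_snp(observed, imputed):
    """One squared correlation per column (SNP), taken across animals."""
    obs_cols = list(zip(*observed))
    imp_cols = list(zip(*imputed))
    return [squared_correlation(o, i) for o, i in zip(obs_cols, imp_cols)]


# Toy data (hypothetical): 3 animals x 4 SNPs, genotypes coded 0/1/2.
observed = [
    [0, 1, 2, 1],
    [2, 1, 0, 1],
    [1, 0, 2, 2],
]
imputed = [
    [0, 1, 2, 1],
    [2, 1, 0, 0],  # one imputation error at the last SNP
    [1, 0, 2, 2],
]

per_animal = accuracy_by_animal(observed, imputed)
per_snp = accuracy_by_snp(observed, imputed)
```

In a cross-validation setting, the same two functions would be applied to the masked animals of each fold and the results averaged; how the folds were formed in the study is not described here.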
Subjects
Cattle/genetics, Genome-Wide Association Study/methods, Whole Genome Sequencing/methods, Animals, Genome-Wide Association Study/veterinary, Single Nucleotide Polymorphism, Reproducibility of Results, Software/standards, Whole Genome Sequencing/veterinary
ABSTRACT
The objective of this study was to investigate the impact of accounting for parent average (PA) and genotyped daughters' average (GDA) on the estimation of deregressed estimated breeding values (dEBVs) used as pseudo-phenotypes in multiple-step genomic evaluations. Genomic estimated breeding values (GEBVs) were predicted, in eight different simulated scenarios, using dEBVs calculated based on four methods. These methods included PA and GDA in the dEBV (VR) or only GDA (VRpa) and excluded both PA and GDA from the dEBV with either all information or only information from PA and GDA (JA and NEW, respectively). In general, VR and NEW showed the lowest and highest GEBV reliabilities across scenarios, respectively. Among all deregression methods, VRpa and NEW provided the most consistent bias estimates across the majority of scenarios, and they significantly yielded the least biased GEBVs. Our results indicate that removing PA and GDA information from dEBVs used in multiple-step genomic evaluations can increase the reliability of GEBVs, when both bulls and their daughters are included in the training population.
Subjects
Cattle/genetics, Dairying, Genomics/methods, Genetic Models, Animals, Female, Genotype, Male, Phenotype, Regression Analysis
ABSTRACT
BACKGROUND: Genomic selection (GS) has played an important role in cattle breeding programs. However, genotyping costs are still a challenge for the implementation of GS in beef cattle, and there is still a lack of information about the use of low-density single nucleotide polymorphism (SNP) chip panels for genomic predictions in breeds such as Brazilian Braford and Hereford. Therefore, this study investigated the effect of using imputed genotypes on the accuracy of genomic predictions for twenty economically important traits in Brazilian Braford and Hereford beef cattle. Various scenarios composed of different percentages of animals with imputed genotypes and different sizes of the training population were compared. De-regressed estimated breeding values (EBVs) were used as pseudo-phenotypes in a Genomic Best Linear Unbiased Prediction (GBLUP) model using two different mimicked panels derived from the 50 K panel (8 K and 15 K SNP panels), which were subsequently imputed to the 50 K panel. In addition, genomic prediction accuracies generated from a 777 K SNP panel (imputed from the 50 K SNP panel) were presented as an alternative scenario. RESULTS: The accuracy of genomic breeding values averaged over the twenty traits ranged from 0.38 to 0.40 across the different scenarios. The average losses in expected genomic estimated breeding value (GEBV) accuracy (accuracy obtained from the inverse of the mixed model equations) relative to the true 50 K genotypes ranged from -0.0007 to -0.0012 and from -0.0002 to -0.0005 when using the 50 K imputed from the 8 K or 15 K, respectively. When using the imputed 777 K panel, the average loss in expected GEBV accuracy was -0.0021. The average gain in expected EBV accuracy from including genomic information, compared to simple BLUP, was between 0.02 and 0.03 across scenarios and traits. CONCLUSIONS: The percentage of animals with imputed genotypes in the training population did not significantly influence the validation accuracy. However, the size of the training population played a major role in the accuracies of genomic predictions in this population. The losses in the expected accuracies of GEBV due to imputation of genotypes were lower when using the 50 K SNP chip panel imputed from the 15 K compared to the one imputed from the 8 K SNP chip panel.
Subjects
Cattle/genetics, Genomics/methods, Genotype, Animals, Breeding, Machine Learning, Phenotype, Species Specificity
ABSTRACT
Apolipoprotein B (APOB) and Adiponectin Receptor 1 (ADIPOR1) are related to the regulation of feed intake, fat metabolism and protein deposition and are candidate genes for genomic studies in birds. In this study, associations of two single nucleotide polymorphisms (SNPs), g.102A>T (APOB) and g.729C>T (ADIPOR1), with carcass, bone integrity and performance traits in broilers were investigated. Genotyping was performed on 1,454 broilers from a paternal line. SNP detection was carried out by the PCR-RFLP technique using the restriction enzymes HhaI for the SNP g.729C>T and MslI for the SNP g.102A>T. The association analyses of the two SNPs with 85 traits were performed using the restricted maximum likelihood (REML) and Generalized Quasi-Likelihood Score (GQLS) methods. For REML, the model included the random additive genetic effect of animal and the fixed effects of sex, hatch and SNP genotype. In the GQLS method, a logistic regression was used to associate the genotypes with phenotypes adjusted for the fixed effects of sex and hatch. The SNP g.729C>T in the ADIPOR1 gene was associated with thickness of the femur and breast skin yield. Thus, the ADIPOR1 gene seems to be implicated in fat metabolism and/or deposition and in bone integrity in broilers.
Subjects
Adipose Tissue/anatomy & histology, Apolipoproteins B/genetics, Body Fat Distribution, Body Weight/genetics, Chickens/anatomy & histology, Chickens/genetics, Quantitative Trait Loci, Adiponectin Receptors/genetics, Animals, Chickens/metabolism, Femur/anatomy & histology, Gene Frequency/genetics, Genetic Markers/genetics, Single Nucleotide Polymorphism/genetics
ABSTRACT
Studies are being conducted on the applicability of genomic data to improve the accuracy of the selection process in livestock, and genome-wide association studies (GWAS) provide valuable information to enhance the understanding of the genetics of complex traits. The aim of this study was to identify genomic regions and genes that play roles in birth weight (BW), weaning weight adjusted for 210 days of age (WW), and long-yearling weight adjusted for 420 days of age (LYW) in Canchim cattle. GWAS were performed by means of the Generalized Quasi-Likelihood Score (GQLS) method using genotypes from the BovineHD BeadChip and estimated breeding values for BW, WW, and LYW. Data consisted of 285 animals from the Canchim breed and 114 from the MA genetic group (derived from crossings between Charolais sires and ½ Canchim + ½ Zebu dams). After applying a false discovery rate correction at a 10% significance level, a total of 4, 12, and 10 SNPs were significantly associated with BW, WW, and LYW, respectively. These SNPs were mapped to their corresponding genes or to surrounding genes within a distance of 250 kb. The genes DPP6 (dipeptidyl-peptidase 6) and CLEC3B (C-type lectin domain family 3 member B) were highlighted, considering their functions in the development of the brain and skeletal system, respectively. The GQLS method identified chromosomal regions associated with birth weight, weaning weight, and long-yearling weight in Canchim and MA animals. New candidate regions for body weight traits were detected, and some of them have interesting biological functions, most of which have not been previously reported. Existing QTL reports for body weight traits that cover the areas surrounding the genes (SNPs) identified herein provide more evidence for these associations. Future studies targeting these areas could provide further knowledge to uncover the genetic architecture underlying growth traits in Canchim cattle.
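The abstract above mentions a false discovery rate correction at the 10% level without naming the procedure. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, which is an assumption and may differ from what the authors applied; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the same order as p_values, that is True
    where the corresponding test is declared significant at the given FDR.
    """
    m = len(p_values)
    # Indices of the tests sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values for five SNPs.
snp_p_values = [0.001, 0.04, 0.03, 0.5, 0.2]
flags = benjamini_hochberg(snp_p_values, fdr=0.10)
```

Because the procedure is step-up, a SNP whose own p-value misses its rank threshold can still be declared significant when a larger p-value further down the ordering meets its threshold.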
Subjects
Cattle/growth & development, Cattle/genetics, Genome-Wide Association Study, Heritable Quantitative Trait, Animals, Birth Weight/genetics, Brazil, Mammalian Chromosomes/genetics, Genotype, Likelihood Functions, Single Nucleotide Polymorphism/genetics, Weaning
ABSTRACT
BACKGROUND: Genotype imputation from low-density (LD) to high-density single nucleotide polymorphism (SNP) chips is an important step before applying genomic selection, since denser chips tend to provide more reliable genomic predictions. Imputation methods rely partially on linkage disequilibrium between markers to infer unobserved genotypes. Bos indicus cattle (e.g. Nelore breed) are characterized, in general, by lower levels of linkage disequilibrium between genetic markers at short distances, compared to taurine breeds. Thus, it is important to evaluate the accuracy of imputation to better define which imputation method and chip are most appropriate for genomic applications in indicine breeds. METHODS: Accuracy of genotype imputation in Nelore cattle was evaluated using different LD chips, imputation software and sets of animals. Twelve commercial and customized LD chips with densities ranging from 7 K to 75 K were tested. Customized LD chips were virtually designed taking into account minor allele frequency, linkage disequilibrium and distance between markers. Software programs FImpute and BEAGLE were applied to impute genotypes. From 995 bulls and 1247 cows that were genotyped with the Illumina® BovineHD chip (HD), 793 sires composed the reference set, and the remaining 202 younger sires and all the cows composed two separate validation sets for which genotypes were masked except for the SNPs of the LD chip that were to be tested. RESULTS: Imputation accuracy increased with the SNP density of the LD chip. However, the gain in accuracy with LD chips with more than 15 K SNPs was relatively small because accuracy was already high at this density. Commercial and customized LD chips with equivalent densities presented similar results. FImpute outperformed BEAGLE for all LD chips and validation sets. Regardless of the imputation software used, accuracy tended to increase as the relatedness between imputed and reference animals increased, especially for the 7 K chip. CONCLUSIONS: If the Illumina® BovineHD is considered as the target chip for genomic applications in the Nelore breed, cost-effectiveness can be improved by genotyping part of the animals with a chip containing around 15 K useful SNPs and imputing their high-density missing genotypes with FImpute.
Subjects
Cattle/genetics, Single Nucleotide Polymorphism, Animals, Female, Genomics/methods, Genotype, Genotyping Techniques, Male, Oligonucleotide Array Sequence Analysis, Pedigree, Quality Control, Reproducibility of Results, Software
ABSTRACT
BACKGROUND: Knowledge of the linkage disequilibrium (LD) between markers is important to establish the number of markers necessary for association studies and genomic selection. The objective of this study was to evaluate the extent of LD in Nellore cattle using a high density SNP panel and 795 genotyped steers. RESULTS: After data editing, 446,986 SNPs were used for the estimation of LD, comprising 2508.4 Mb of the genome. The mean distance between adjacent markers was 4.90 ± 2.89 kb. The minor allele frequency (MAF) was less than 0.20 in a considerable proportion of SNPs. The overall mean LD between marker pairs measured by r² and |D'| was 0.17 and 0.52, respectively. The LD (r²) decreased with increasing physical distance between markers from 0.34 (1 kb) to 0.11 (100 kb). In contrast to this clear decrease of LD measured by r², the changes in |D'| indicated a less pronounced decline of LD. Chromosomes BTA1, BTA27, BTA28 and BTA29 showed lower levels of LD at any distance between markers. Except for these four chromosomes, the level of LD (r²) was higher than 0.20 for markers separated by less than 20 kb. At distances < 3 kb, the level of LD was higher than 0.30. The LD (r²) between markers was higher when the MAF threshold was high (0.15), especially when the distance between markers was short. CONCLUSIONS: The level of LD estimated for markers separated by less than 30 kb indicates that the High Density Bovine SNP BeadChip will likely be a suitable tool for prediction of genomic breeding values in Nellore cattle.
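The two LD measures reported in the abstract above, r² and |D'|, can be illustrated for a single pair of biallelic markers. The sketch below assumes phased haplotypes with alleles coded 0/1, which is a simplification; the abstract does not state how haplotype frequencies were obtained, and the function name and toy data are hypothetical.

```python
def ld_statistics(haplotypes):
    """Compute (r2, |D'|) for two biallelic loci from phased haplotypes.

    haplotypes: list of (a, b) tuples with alleles coded 0/1.
    Returns None if either locus is monomorphic (LD is undefined).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1, locus A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1, locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        return None
    # D: deviation of the observed 1-1 haplotype frequency from independence.
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' scales D by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


# Toy haplotypes (hypothetical): only three of the four possible haplotypes
# occur, so |D'| is 1 while r2 is below 1.
toy_haplotypes = [(1, 1)] * 3 + [(1, 0)] * 1 + [(0, 0)] * 4
toy_r2, toy_abs_d_prime = ld_statistics(toy_haplotypes)
```

The toy case shows one reason the two measures can behave differently: |D'| stays at its maximum whenever one haplotype class is absent, while r² also depends on how closely the allele frequencies at the two loci match.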
Subjects
Cattle/genetics, Genomics, Linkage Disequilibrium/genetics, Animals, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci/genetics
ABSTRACT
The aim of this study was to compare iterative and direct solvers for estimation of marker effects in genomic selection. One iterative and two direct methods were used: Gauss-Seidel with Residual Update, Cholesky Decomposition and Gentleman-Givens rotations. To mimic different scenarios with respect to the number of markers and of genotyped animals, a simulated data set divided into 25 subsets was used. The number of markers ranged from 1,200 to 5,925 and the number of animals ranged from 1,200 to 5,865. The methods were also applied to real data comprising 3,081 individuals genotyped for 45,181 SNPs. Results from simulated data showed that the iterative solver was substantially faster than the direct methods for larger numbers of markers. Use of a direct solver may allow for computing (co)variances of SNP effects. When applied to real data, performance of the iterative method varied substantially, depending on the level of ill-conditioning of the coefficient matrix. From results with real data, Gentleman-Givens rotations would be the method of choice in this particular application as it provided an exact solution within a fairly reasonable time frame (less than two hours). It would indeed be the preferred method whenever computer resources allow its use.
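The iterative solver named in the abstract above, Gauss-Seidel with Residual Update, can be sketched for a simple marker-effects model. The sketch assumes a model with an overall mean and SNP effects shrunk by a single ratio `lam` (residual variance over SNP-effect variance); the model, the function name `gsru`, and the toy data are illustrative assumptions and are not taken from the study.

```python
def gsru(Z, y, lam, n_iter=10000, tol=1e-12):
    """Gauss-Seidel with Residual Update for y = mu + Z g + e.

    Z: list of rows (animals) of genotype codes; y: phenotypes;
    lam: assumed shrinkage ratio added to the diagonal of each SNP equation.
    Returns (mu, g).
    """
    n = len(y)
    m = len(Z[0])
    mu = 0.0
    g = [0.0] * m
    e = list(y)  # residuals when mu = 0 and g = 0
    zz = [sum(Z[i][j] ** 2 for i in range(n)) for j in range(m)]
    for _ in range(n_iter):
        # Update the mean using the current residuals, then fix the residuals.
        new_mu = (sum(e) + n * mu) / n
        delta = new_mu - mu
        for i in range(n):
            e[i] -= delta
        mu = new_mu
        max_change = abs(delta)
        # Update one SNP at a time; the residual vector avoids rebuilding
        # the right-hand side from the full coefficient matrix.
        for j in range(m):
            rhs = sum(Z[i][j] * e[i] for i in range(n)) + zz[j] * g[j]
            new_g = rhs / (zz[j] + lam)
            delta = new_g - g[j]
            for i in range(n):
                e[i] -= Z[i][j] * delta
            g[j] = new_g
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return mu, g


# Toy data (hypothetical): 4 animals, 3 SNPs coded 0/1/2.
Z_toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
y_toy = [1.0, 2.0, 3.0, 2.5]
mu_hat, g_hat = gsru(Z_toy, y_toy, lam=1.0)
```

Each pass costs on the order of the number of animals times the number of markers and never forms the coefficient matrix, which is consistent with the speed advantage reported for larger marker sets. The number of passes needed depends on the conditioning of the system, in line with the variable performance noted for the real data.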