Search | BVS España Search Portal

1.

Using prior information from humans to prioritize genes and gene-associated variants for complex traits in livestock.

Raymond, Biaty; Yengo, Loic; Costilla, Roy; Schrooten, Chris; Bouwman, Aniek C; Hayes, Ben J; Veerkamp, Roel F; Visscher, Peter M.

PLoS Genet ; 16(9): e1008780, 2020 09.

Article in English | MEDLINE | ID: mdl-32925905

ABSTRACT

Genome-Wide Association Studies (GWAS) in large human cohorts have identified thousands of loci associated with complex traits and diseases. For identifying the genes and gene-associated variants that underlie complex traits in livestock, especially where sample sizes are limiting, it may help to integrate the results of GWAS for equivalent traits in humans as prior information. In this study, we sought to investigate the usefulness of results from a GWAS on human height as prior information for identifying the genes and gene-associated variants that affect stature in cattle, using GWAS summary data on sample sizes of 700,000 and 58,265 for humans and cattle, respectively. Using Fisher's exact test, we observed a significant proportion of cattle stature-associated genes (30/77) that are also associated with human height (odds ratio = 5.1, p = 3.1e-10). Results of randomized sampling tests showed that cattle orthologs of human height-associated genes, hereafter referred to as candidate genes (C-genes), were more enriched for cattle stature GWAS signals than random samples of genes in the cattle genome (p = 0.01). Randomly sampled SNPs within the C-genes also tended to explain more genetic variance for cattle stature (up to 13.2%) than randomly sampled SNPs within random cattle genes (p = 0.09). The most significant SNPs from a cattle GWAS for stature within the C-genes did not explain more genetic variance for cattle stature than the most significant SNPs within random cattle genes (p = 0.87). Altogether, our findings support previous studies that suggest a similarity in the genetic regulation of height across mammalian species. However, with the availability of a powerful GWAS for stature that combined data from 8 cattle breeds, prior information from human-height GWAS does not seem to provide any additional benefit with respect to the identification of genes and gene-associated variants that affect stature in cattle.
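The gene-overlap test in this abstract can be illustrated with a minimal sketch of a one-sided Fisher's exact test on a 2x2 table. Only the counts 30 and 77 come from the abstract; the second row of the table (1500 and 15000) is a made-up placeholder, so the printed values are not expected to reproduce the reported odds ratio or p-value. The function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in the table [[a, b], [c, d]].

    Returns the sample odds ratio and P(top-left cell >= a) under the
    hypergeometric null with all margins fixed.
    """
    row1 = a + b          # e.g. cattle stature-associated genes
    col1 = a + c          # e.g. genes that are human-height orthologs
    n = a + b + c + d     # all genes considered
    denom = comb(n, col1)
    upper = min(row1, col1)
    # Sum the probabilities of every table at least as extreme as the observed one.
    numer = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, numer / denom


# 30 of 77 stature genes overlap (from the abstract); the second row is HYPOTHETICAL.
odds_ratio, p_value = fisher_exact_greater(30, 77 - 30, 1500, 15000)
print(f"odds ratio = {odds_ratio:.2f}, one-sided p = {p_value:.2e}")
```

Exact integer binomial coefficients are used so that very small p-values are not lost to floating-point underflow before the final division.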

Subject(s)

Body Height/genetics; Cattle/genetics; Genome-Wide Association Study/methods; Animals; Breeding/methods; Databases, Genetic; Genetic Variation/genetics; Humans; Livestock/genetics; Multifactorial Inheritance/genetics; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics

2.

A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices.

Raymond, Biaty; Wientjes, Yvonne C J; Bouwman, Aniek C; Schrooten, Chris; Veerkamp, Roel F.

Genet Sel Evol ; 52(1): 21, 2020 Apr 28.

Article in English | MEDLINE | ID: mdl-32345213

ABSTRACT

BACKGROUND: A multi-population genomic prediction (GP) model in which important pre-selected single nucleotide polymorphisms (SNPs) are differentially weighted (MPMG) has been shown to result in better prediction accuracy than a multi-population, single genomic relationship matrix ([Formula: see text]) GP model (MPSG) in which all SNPs are weighted equally. Our objective was to underpin theoretically the advantages and limits of the MPMG model over the MPSG model, by deriving and validating a deterministic prediction equation for its accuracy. METHODS: Using selection index theory, we derived an equation to predict the accuracy of estimated total genomic values of selection candidates from population [Formula: see text] ([Formula: see text]), when individuals from two populations, [Formula: see text] and [Formula: see text], are combined in the training population and two [Formula: see text], made respectively from pre-selected and remaining SNPs, are fitted simultaneously in MPMG. We used simulations to validate the prediction equation in scenarios that differed in the level of genetic correlation between populations, heritability, and proportion of genetic variance explained by the pre-selected SNPs. Empirical accuracy of the MPMG model in each scenario was calculated and compared to the predicted accuracy from the equation. RESULTS: In general, the derived prediction equation resulted in accurate predictions of [Formula: see text] for the scenarios evaluated. Using the prediction equation, we showed that an important advantage of the MPMG model over the MPSG model is its ability to benefit from the small number of independent chromosome segments ([Formula: see text]) due to the pre-selected SNPs, both within and across populations, whereas for the MPSG model, there is only a single value for [Formula: see text], calculated based on all SNPs, which is very large. 
However, this advantage is dependent on the pre-selected SNPs that explain some proportion of the total genetic variance for the trait. CONCLUSIONS: We developed an equation that gives insight into why, and under which conditions the MPMG outperforms the MPSG model for GP. The equation can be used as a deterministic tool to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for prediction of disease risk scores.
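The role of the number of independent chromosome segments can be illustrated with a much simpler formula than the one derived in the paper. The sketch below uses the widely used single-population approximation (often attributed to Daetwyler and colleagues), accuracy = sqrt(N h² / (N h² + Me)). It is not the multi-population, multi-GRM equation of this study, and the training size, heritability and Me values are hypothetical.

```python
from math import sqrt


def expected_accuracy(n_train, h2, me):
    """Approximate accuracy of genomic prediction in a single population.

    n_train: number of training individuals
    h2:      heritability of the trait
    me:      number of independent chromosome segments
    """
    return sqrt(n_train * h2 / (n_train * h2 + me))


# HYPOTHETICAL values: a small Me (few pre-selected SNPs) versus a large Me (all SNPs).
for me in (50, 500, 5000):
    print(me, round(expected_accuracy(n_train=1000, h2=0.5, me=me), 3))
```

With everything else fixed, a smaller Me gives a higher expected accuracy, which is the intuition behind fitting pre-selected SNPs in their own relationship matrix.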

Subject(s)

Breeding; Metagenomics; Models, Genetic; Animals; Phenotype; Polymorphism, Single Nucleotide

3.

Meta-analysis for milk fat and protein percentage using imputed sequence variant genotypes in 94,321 cattle from eight cattle breeds.

van den Berg, Irene; Xiang, Ruidong; Jenko, Janez; Pausch, Hubert; Boussaha, Mekki; Schrooten, Chris; Tribout, Thierry; Gjuvsland, Arne B; Boichard, Didier; Nordbø, Øyvind; Sanchez, Marie-Pierre; Goddard, Mike E.

Genet Sel Evol ; 52(1): 37, 2020 Jul 07.

Article in English | MEDLINE | ID: mdl-32635893

ABSTRACT

BACKGROUND: Sequence-based genome-wide association studies (GWAS) provide high statistical power to identify candidate causal mutations when a large number of individuals with both sequence variant genotypes and phenotypes is available. A meta-analysis combines summary statistics from multiple GWAS and increases the power to detect trait-associated variants without requiring access to data at the individual level of the GWAS mapping cohorts. Because linkage disequilibrium between adjacent markers is conserved only over short distances across breeds, a multi-breed meta-analysis can improve mapping precision. RESULTS: To maximise the power to identify quantitative trait loci (QTL), we combined the results of nine within-population GWAS that used imputed sequence variant genotypes of 94,321 cattle from eight breeds, to perform a large-scale meta-analysis for fat and protein percentage in cattle. The meta-analysis detected (p ≤ 10^-8) 138 QTL for fat percentage and 176 QTL for protein percentage. This was more than the number of QTL detected in all within-population GWAS together (124 QTL for fat percentage and 104 QTL for protein percentage). Among all the lead variants, 100 QTL for fat percentage and 114 QTL for protein percentage had the same direction of effect in all within-population GWAS. This indicates either persistence of the linkage phase between the causal variant and the lead variant across breeds or that some of the lead variants might indeed be causal or tightly linked with causal variants. The percentage of intergenic variants was substantially lower for significant variants than for non-significant variants, and significant variants had mostly moderate to high minor allele frequencies. Significant variants were also clustered in genes that are known to be relevant for fat and protein percentages in milk. CONCLUSIONS: Our study identified a large number of QTL associated with fat and protein percentage in dairy cattle.
We demonstrated that large-scale multi-breed meta-analysis reveals more QTL at the nucleotide resolution than within-population GWAS. Significant variants were more often located in genic regions than non-significant variants and a large part of them was located in potentially regulatory regions.
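Combining summary statistics for one variant across several GWAS can be sketched as follows. The abstract does not say which meta-analysis method was used, so this is a generic inverse-variance weighted fixed-effect model offered as an assumption. The effect sizes and standard errors are hypothetical, and `same_direction` is an illustrative helper for the direction-of-effect check mentioned in the results.

```python
from math import erfc, sqrt


def ivw_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one variant."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided p-value from the normal distribution
    return beta, se, z, p


def same_direction(betas):
    """True when all within-population effects share the same sign."""
    return all(b > 0 for b in betas) or all(b < 0 for b in betas)


# HYPOTHETICAL per-population effects and standard errors for a single variant.
betas = [0.05, 0.07, 0.04]
ses = [0.015, 0.020, 0.012]
beta, se, z, p = ivw_meta(betas, ses)
print(f"beta = {beta:.4f}, se = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
print("same direction in all populations:", same_direction(betas))
```

Only per-variant effects and standard errors are needed, which matches the point that a meta-analysis does not require individual-level data.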

Subject(s)

Cattle/genetics; Genotype; Linkage Disequilibrium; Lipids/genetics; Milk Proteins/genetics; Milk/standards; Animals; Gene Frequency; Milk/metabolism; Polymorphism, Genetic; Quantitative Trait Loci

4.

Utility of whole-genome sequence data for across-breed genomic prediction.

Raymond, Biaty; Bouwman, Aniek C; Schrooten, Chris; Houwing-Duistermaat, Jeanine; Veerkamp, Roel F.

Genet Sel Evol ; 50(1): 27, 2018 05 18.

Article in English | MEDLINE | ID: mdl-29776327

ABSTRACT

BACKGROUND: Genomic prediction (GP) across breeds has so far resulted in low accuracies of the predicted genomic breeding values. Our objective was to evaluate whether using whole-genome sequence (WGS) instead of low-density markers can improve GP across breeds, especially when markers are pre-selected from a genome-wide association study (GWAS), and to test our hypothesis that many non-causal markers in WGS data have a diluting effect on accuracy of across-breed prediction. METHODS: Estimated breeding values for stature and bovine high-density (HD) genotypes were available for 595 Jersey bulls from New Zealand, 957 Holstein bulls from New Zealand and 5553 Holstein bulls from the Netherlands. BovineHD genotypes for all bulls were imputed to WGS using Beagle4 and Minimac2. Genomic prediction across the three populations was performed with ASReml4, with each population used as single reference and as single validation sets. In addition to the 50k, HD and WGS, markers that were significantly associated with stature in a large meta-GWAS analysis were selected and used for prediction, resulting in 10 prediction scenarios. Furthermore, we estimated the proportion of genetic variance captured by markers in each scenario. RESULTS: Across breeds, 50k, HD and WGS markers resulted in very low accuracies of prediction ranging from -0.04 to 0.13. Accuracies were higher in scenarios with pre-selected markers from a meta-GWAS. For example, using only the 133 most significant markers in 133 QTL regions from the meta-GWAS yielded accuracies ranging from 0.08 to 0.23, while 23,125 markers with a -log10(p) higher than 7 resulted in accuracies of up to 0.35. Using WGS data did not significantly improve the proportion of genetic variance captured across breeds compared to scenarios with few but pre-selected markers.
CONCLUSIONS: Our results demonstrated that the accuracy of across-breed GP can be improved by using markers that are pre-selected from WGS based on their potential causal effect. We also showed that simply increasing the number of markers up to the WGS level does not increase the accuracy of across-breed prediction, even when markers that are expected to have a causal effect are included.
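The two marker pre-selection rules mentioned in the results (a -log10(p) threshold of 7, and the most significant marker per QTL region) can be sketched as below. The record layout, marker names, region labels and p-values are all hypothetical; the abstract does not describe how regions were defined.

```python
from math import log10


def preselect_by_threshold(records, threshold=7.0):
    """Keep markers whose -log10(p) exceeds the threshold."""
    return [marker for marker, _region, p in records if p > 0 and -log10(p) > threshold]


def lead_marker_per_region(records):
    """Pick the most significant marker within each region."""
    best = {}
    for marker, region, p in records:
        if region not in best or p < best[region][1]:
            best[region] = (marker, p)
    return {region: marker for region, (marker, _p) in best.items()}


# HYPOTHETICAL meta-GWAS records: (marker, QTL region, p-value).
records = [
    ("snp_a", "region_1", 3e-12),
    ("snp_b", "region_1", 2e-9),
    ("snp_c", "region_2", 4e-8),
    ("snp_d", "region_2", 1e-5),
]
print(preselect_by_threshold(records))
print(lead_marker_per_region(records))
```

The threshold rule keeps many linked markers per region, while the lead-marker rule keeps exactly one, which mirrors the contrast between the 23,125-marker and 133-marker scenarios.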

Subject(s)

Breeding; Cattle/anatomy & histology; Cattle/classification; Genome-Wide Association Study/veterinary; Quantitative Trait Loci; Animals; Biometry; Cattle/genetics; Computational Biology; Genetic Variation; Male; Models, Genetic; Pedigree; Polymorphism, Single Nucleotide

5.

Genomic prediction for numerically small breeds, using models with pre-selected and differentially weighted markers.

Raymond, Biaty; Bouwman, Aniek C; Wientjes, Yvonne C J; Schrooten, Chris; Houwing-Duistermaat, Jeanine; Veerkamp, Roel F.

Genet Sel Evol ; 50(1): 49, 2018 Oct 10.

Article in English | MEDLINE | ID: mdl-30314431

ABSTRACT

BACKGROUND: Genomic prediction (GP) accuracy in numerically small breeds is limited by the small size of the reference population. Our objective was to test a multi-breed multiple genomic relationship matrices (GRM) GP model (MBMG) that weighs pre-selected markers separately, uses the remaining markers to explain the remaining genetic variance that can be explained by markers, and weighs information of breeds in the reference population by their genetic correlation with the validation breed. METHODS: Genotype and phenotype data were used on 595 Jersey bulls from New Zealand and 5503 Holstein bulls from the Netherlands, all with deregressed proofs for stature. Different sets of markers were used, containing either pre-selected markers from a meta-genome-wide association analysis on stature, remaining markers or both. We implemented a multi-breed bivariate GREML model in which we fitted either a single multi-breed GRM (MBSG), or two distinct multi-breed GRM (MBMG), one made with pre-selected markers and the other with remaining markers. Accuracies of predicting stature for Jersey individuals using the multi-breed models (Holstein and Jersey combined reference population) were compared to those obtained using either the Jersey (within-breed) or Holstein (across-breed) reference population. All the models were subsequently fitted in the analysis of simulated phenotypes, with a simulated genetic correlation between breeds of 1, 0.5, and 0.25. RESULTS: The MBMG model always gave better prediction accuracies for stature compared to MBSG, within-, and across-breed GP models. For example, with MBSG, accuracies obtained by fitting 48,912 unselected markers (0.43), 357 pre-selected markers (0.38) or a combination of both (0.43), were lower than accuracies obtained by fitting pre-selected and unselected markers in separate GRM in MBMG (0.49).
This improvement was further confirmed by results from a simulation study, with MBMG performing on average 23% better than MBSG with all markers fitted. CONCLUSIONS: With the MBMG model, it is possible to use information from numerically large breeds to improve prediction accuracy of numerically small breeds. The superiority of MBMG is mainly due to its ability to use information on pre-selected markers, explain the remaining genetic variance and weigh information from a different breed by the genetic correlation between breeds.

Subject(s)

Breeding/methods; Models, Genetic; Polymorphism, Genetic; Animals; Breeding/standards; Cattle/genetics; Genetic Markers; Sample Size; Selection, Genetic

6.

Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein-Friesian cattle.

Veerkamp, Roel F; Bouwman, Aniek C; Schrooten, Chris; Calus, Mario P L.

Genet Sel Evol ; 48(1): 95, 2016 12 01.

Article in English | MEDLINE | ID: mdl-27905878

ABSTRACT

BACKGROUND: Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. METHODS: Phenotypes were available for 5503 Holstein-Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. RESULTS: The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. 
CONCLUSIONS: Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.

Subject(s)

Genetic Variation; Genome-Wide Association Study; Genome; Genomics; Animals; Breeding; Cattle; Genomics/methods; Genotype; High-Throughput Nucleotide Sequencing; Linkage Disequilibrium; Phenotype; Polymorphism, Single Nucleotide; Selection, Genetic

7.

Efficient genomic prediction based on whole-genome sequence data using split-and-merge Bayesian variable selection.

Calus, Mario P L; Bouwman, Aniek C; Schrooten, Chris; Veerkamp, Roel F.

Genet Sel Evol ; 48(1): 49, 2016 06 29.

Article in English | MEDLINE | ID: mdl-27357580

ABSTRACT

BACKGROUND: Use of whole-genome sequence data is expected to increase persistency of genomic prediction across generations and breeds but affects model performance and requires increased computing time. In this study, we investigated whether the split-and-merge Bayesian stochastic search variable selection (BSSVS) model could overcome these issues. BSSVS is performed first on subsets of sequence-based variants and then on a merged dataset containing variants selected in the first step. RESULTS: We used a dataset that included 4,154,064 variants after editing and de-regressed proofs for 3415 reference and 2138 validation bulls for somatic cell score, protein yield and interval first to last insemination. In the first step, BSSVS was performed on 106 subsets each containing ~39,189 variants. In the second step, 1060 up to 472,492 variants, selected from the first step, were included to estimate the accuracy of genomic prediction. Accuracies were at best equal to those achieved with the commonly used Bovine 50k-SNP chip, although the number of variants within a few well-known quantitative trait loci regions was considerably enriched. When variant selection and the final genomic prediction were performed on the same data, predictions were biased. Predictions computed as the average of the predictions computed for each subset achieved the highest accuracies, i.e. 0.5 to 1.1 % higher than the accuracies obtained with the 50k-SNP chip, and yielded the least biased predictions. Finally, the accuracy of genomic predictions obtained when all sequence-based variants were included was similar or up to 1.4 % lower compared to that based on the average predictions across the subsets. By applying parallelization, the split-and-merge procedure was completed in 5 days, while the standard analysis including all sequence-based variants took more than three months. 
CONCLUSIONS: The split-and-merge approach splits one large computational task into many much smaller ones, which allows the use of parallel processing and thus efficient genomic prediction based on whole-genome sequence data. The split-and-merge approach did not improve prediction accuracy, probably because we used data on a single breed for which relationships between individuals were high. Nevertheless, the split-and-merge approach may have potential for applications on data from multiple breeds.
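The bookkeeping of the split step, and the averaging of per-subset predictions that gave the highest accuracies, can be sketched as below. The variant and subset counts come from the abstract; the even split and the example predictions are assumptions, since the abstract does not say how variants were assigned to subsets.

```python
def split_sizes(n_variants, n_subsets):
    """Sizes of near-equal subsets that together cover all variants."""
    base, extra = divmod(n_variants, n_subsets)
    return [base + 1] * extra + [base] * (n_subsets - extra)


def average_predictions(per_subset_predictions):
    """Average each animal's predicted value across the subset-specific analyses."""
    n_subsets = len(per_subset_predictions)
    return [sum(values) / n_subsets for values in zip(*per_subset_predictions)]


sizes = split_sizes(4_154_064, 106)
print(len(sizes), min(sizes), max(sizes), sum(sizes))

# HYPOTHETICAL predictions for three animals from two subset analyses.
print(average_predictions([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]))
```

Because each subset is analysed independently, the subset analyses can be run in parallel, which is where the reported reduction in computing time comes from.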

Subject(s)

Cattle/genetics; Computational Biology; Genomics/methods; Models, Genetic; Animals; Bayes Theorem; Genotype; Male; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide

8.

Empirical and deterministic accuracies of across-population genomic prediction.

Wientjes, Yvonne C J; Veerkamp, Roel F; Bijma, Piter; Bovenhuis, Henk; Schrooten, Chris; Calus, Mario P L.

Genet Sel Evol ; 47: 5, 2015 Feb 06.

Article in English | MEDLINE | ID: mdl-25885467

ABSTRACT

BACKGROUND: Differences in linkage disequilibrium and in allele substitution effects of QTL (quantitative trait loci) may hinder genomic prediction across populations. Our objective was to develop a deterministic formula to estimate the accuracy of across-population genomic prediction, for which reference individuals and selection candidates are from different populations, and to investigate the impact of differences in allele substitution effects across populations and of the number of QTL underlying a trait on the accuracy. METHODS: A deterministic formula to estimate the accuracy of across-population genomic prediction was derived based on selection index theory. Moreover, accuracies were deterministically predicted using a formula based on population parameters and empirically calculated using simulated phenotypes and a GBLUP (genomic best linear unbiased prediction) model. Phenotypes of 1033 Holstein-Friesian, 105 Groninger White Headed and 147 Meuse-Rhine-Yssel cows were simulated by sampling 3000, 300, 30 or 3 QTL from the available high-density SNP (single nucleotide polymorphism) information of three chromosomes, assuming a correlation of 1.0, 0.8, 0.6, 0.4, or 0.2 between allele substitution effects across breeds. The simulated heritability was set to 0.95 to resemble the heritability of deregressed proofs of bulls. RESULTS: Accuracies estimated with the deterministic formula based on selection index theory were similar to empirical accuracies for all scenarios, while accuracies predicted with the formula based on population parameters overestimated empirical accuracies by ~25 to 30%. When the between-breed genetic correlation differed from 1, i.e. allele substitution effects differed across breeds, empirical and deterministic accuracies decreased in proportion to the genetic correlation. Using a multi-trait model, it was possible to accurately estimate the genetic correlation between the breeds based on phenotypes and high-density genotypes. 
The number of QTL underlying the simulated trait did not affect the accuracy. CONCLUSIONS: The deterministic formula based on selection index theory estimated the accuracy of across-population genomic predictions well. The deterministic formula using population parameters overestimated the across-population genomic accuracy, but may still be useful because of its simplicity. Both formulas could accommodate genetic correlations between populations lower than 1. The number of QTL underlying a trait did not affect the accuracy of across-population genomic prediction using a GBLUP method.
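The reported proportional relationship between accuracy and the between-breed genetic correlation can be written as a one-line scaling. The sketch below applies it to the five simulated correlations listed in the methods; the baseline accuracy of 0.30 at a correlation of 1.0 is a hypothetical value, not a result from the study.

```python
def scaled_accuracy(accuracy_at_rg_one, genetic_correlation):
    """Across-population accuracy assumed proportional to the genetic correlation."""
    return genetic_correlation * accuracy_at_rg_one


baseline = 0.30  # HYPOTHETICAL accuracy when allele substitution effects are identical
for r_g in (1.0, 0.8, 0.6, 0.4, 0.2):
    print(r_g, round(scaled_accuracy(baseline, r_g), 3))
```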

Subject(s)

Genetics, Population/methods; Genome; Linkage Disequilibrium/genetics; Metagenomics/statistics & numerical data; Models, Genetic; Quantitative Trait Loci; Algorithms; Alleles; Animals; Breeding/methods; Cattle/genetics; Female; Genomics; Genotype; Male; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable

9.

Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle.

van Binsbergen, Rianne; Calus, Mario P L; Bink, Marco C A M; van Eeuwijk, Fred A; Schrooten, Chris; Veerkamp, Roel F.

Genet Sel Evol ; 47: 71, 2015 Sep 17.

Article in English | MEDLINE | ID: mdl-26381777

ABSTRACT

BACKGROUND: In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data. METHODS: Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training. RESULTS: Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed. 
CONCLUSIONS: Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.

Subject(s)

Genome; Genomics/methods; Models, Genetic; Sequence Analysis, DNA; Algorithms; Animals; Cattle; Chromosomes, Mammalian; Gene Frequency; Genotype; High-Throughput Nucleotide Sequencing; Pedigree; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Reproducibility of Results

10.

Genomic prediction of breeding values using previously estimated SNP variances.

Calus, Mario P L; Schrooten, Chris; Veerkamp, Roel F.

Genet Sel Evol ; 46: 52, 2014 Sep 25.

Article in English | MEDLINE | ID: mdl-25928875

ABSTRACT

BACKGROUND: Genomic prediction requires estimation of variances of effects of single nucleotide polymorphisms (SNPs), which is computationally demanding, and uses these variances for prediction. We have developed models with separate estimation of SNP variances, which can be applied infrequently, and genomic prediction, which can be applied routinely. METHODS: SNP variances were estimated with Bayes Stochastic Search Variable Selection (BSSVS) and BayesC. Genome-enhanced breeding values (GEBV) were estimated with RR-BLUP (ridge regression best linear unbiased prediction), using either variances obtained from BSSVS (BLUP-SSVS) or BayesC (BLUP-C), or assuming equal variances for each SNP. Datasets used to estimate SNP variances comprised (1) all animals, (2) 50% random animals (RAN50), (3) 50% best animals (TOP50), or (4) 50% worst animals (BOT50). Traits analysed were protein yield, udder depth, somatic cell score, interval between first and last insemination, direct longevity, and longevity including information from predictors. RESULTS: BLUP-SSVS and BLUP-C yielded GEBV similar to those from the equivalent Bayesian models that simultaneously estimated SNP variances. Reliabilities of these GEBV were consistently higher than from RR-BLUP, although only significantly for direct longevity. Across scenarios that used data subsets to estimate GEBV, observed reliabilities were generally higher for TOP50 than for RAN50, and much higher than for BOT50. Reliabilities of TOP50 were higher because the training data contained more ancestors of selection candidates. Using estimated SNP variances based on random or non-random subsets of the data, while using all data to estimate GEBV, did not affect reliabilities of the BLUP models. A convergence criterion of 10^-8 instead of 10^-10 for BLUP models yielded similar GEBV, while the required number of iterations decreased by 71 to 90%.
Including a separate polygenic effect consistently improved reliabilities of the GEBV, but also substantially increased the required number of iterations to reach convergence with RR-BLUP. SNP variances converged faster for BayesC than for BSSVS. CONCLUSIONS: Combining Bayesian variable selection models to re-estimate SNP variances and BLUP models that use those SNP variances, yields GEBV that are similar to those from full Bayesian models. Moreover, these combined models yield predictions with higher reliability and less bias than the commonly used RR-BLUP model.

Subject(s)

Breeding; Cattle/genetics; Genomics/methods; Models, Genetic; Polymorphism, Single Nucleotide; Animals; Bayes Theorem; Genome; Male; Quantitative Trait Loci; Regression Analysis; Stochastic Processes

11.

Error rate for imputation from the Illumina BovineSNP50 chip to the Illumina BovineHD chip.

Schrooten, Chris; Dassonneville, Romain; Ducrocq, Vincent; Brøndum, Rasmus F; Lund, Mogens S; Chen, Jun; Liu, Zengting; González-Recio, Oscar; Pena, Juan; Druet, Tom.

Genet Sel Evol ; 46: 10, 2014 Feb 04.

Article in English | MEDLINE | ID: mdl-24495554

ABSTRACT

BACKGROUND: Imputation of genotypes from low-density to higher density chips is a cost-effective method to obtain high-density genotypes for many animals, based on genotypes of only a relatively small subset of animals (reference population) on the high-density chip. Several factors influence the accuracy of imputation and our objective was to investigate the effects of the size of the reference population used for imputation and of the imputation method used and its parameters. Imputation of genotypes was carried out from 50,000 (moderate-density) to 777,000 (high-density) SNPs (single nucleotide polymorphisms). METHODS: The effect of reference population size was studied in two datasets: one with 548 and one with 1289 Holstein animals, genotyped with the Illumina BovineHD chip (777 k SNPs). A third dataset included the 548 animals genotyped with the 777 k SNP chip and 2200 animals genotyped with the Illumina BovineSNP50 chip. In each dataset, 60 animals were chosen as validation animals, for which all high-density genotypes were masked, except for the Illumina BovineSNP50 markers. Imputation was studied in a subset of six chromosomes, using the imputation software programs Beagle and DAGPHASE. RESULTS: Imputation with DAGPHASE and Beagle resulted in 1.91% and 0.87% allelic imputation error rates in the dataset with 548 high-density genotypes, when scale and shift parameters were 2.0 and 0.1, and 1.0 and 0.0, respectively. When Beagle was used alone, the imputation error rate was 0.67%. If the information obtained by Beagle was subsequently used in DAGPHASE, imputation error rates were slightly higher (0.71%). When 2200 moderate-density genotypes were added and Beagle was used alone, imputation error rates were slightly lower (0.64%). The fewest imputation errors were obtained with Beagle in the reference set with 1289 high-density genotypes (0.41%).
CONCLUSIONS: For imputation of genotypes from the 50 k to the 777 k SNP chip, Beagle gave the lowest allelic imputation error rates. Imputation error rates decreased with increasing size of the reference population. For applications for which computing time is limiting, DAGPHASE using information from Beagle can be considered as an alternative, since it reduces computation time and increases imputation error rates only slightly.
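An allelic imputation error rate for masked validation genotypes can be computed as in the sketch below. The abstract does not give the exact definition used, so this assumes one common convention: genotypes are coded as 0, 1 or 2 copies of an allele, and the rate is the number of wrongly imputed alleles divided by the total number of alleles. The example genotypes are hypothetical.

```python
def allelic_error_rate(true_genotypes, imputed_genotypes):
    """Share of alleles imputed incorrectly, with genotypes coded 0, 1 or 2."""
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")
    if not true_genotypes:
        raise ValueError("at least one genotype is required")
    # A heterozygote imputed as a homozygote counts as one wrong allele,
    # opposite homozygotes count as two.
    wrong_alleles = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return wrong_alleles / (2 * len(true_genotypes))


# HYPOTHETICAL masked genotypes for one validation animal at four markers.
true_genotypes = [0, 1, 2, 1]
imputed_genotypes = [0, 1, 1, 1]
print(f"{100 * allelic_error_rate(true_genotypes, imputed_genotypes):.2f}%")
```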

Subject(s)

Cattle/genetics, Genotyping Techniques/instrumentation, Oligonucleotide Array Sequence Analysis/instrumentation, Single Nucleotide Polymorphism, Alleles, Animals, Female, Gene Frequency, Genotype, Male

12.

A common reference population from four European Holstein populations increases reliability of genomic predictions.

Lund, Mogens S; Roos, Adrianus P W de; Vries, Alfred G de; Druet, Tom; Ducrocq, Vincent; Fritz, Sébastien; Guillaume, François; Guldbrandtsen, Bernt; Liu, Zenting; Reents, Reinhard; Schrooten, Chris; Seefried, Franz; Su, Guosheng.

Genet Sel Evol ; 43: 43, 2011 Dec 12.

Article in English | MEDLINE | ID: mdl-22152008

ABSTRACT

BACKGROUND: Size of the reference population and reliability of phenotypes are crucial factors influencing the reliability of genomic predictions. It is therefore useful to combine closely related populations. Increased accuracies of genomic predictions depend on the number of individuals added to the reference population, the reliability of their phenotypes, and the relatedness of the populations that are combined. METHODS: This paper assesses the increase in reliability achieved when combining four Holstein reference populations of 4000 bulls each, from European breeding organizations, i.e. UNCEIA (France), VikingGenetics (Denmark, Sweden, Finland), DHV-VIT (Germany) and CRV (The Netherlands, Flanders). Each partner validated its own bulls using their national reference data and the combined data, respectively. RESULTS: Combining the data significantly increased the reliability of genomic predictions for bulls in all four populations. Reliabilities increased by 10%, compared to reliabilities obtained with national reference populations alone, when they were averaged over countries and the traits evaluated. For different traits and countries, the increase in reliability ranged from 2% to 19%. CONCLUSIONS: Genomic selection programs benefit greatly from combining data from several closely related populations into a single large reference population.

Subject(s)

Breeding, Genetic Models, Animals, Cattle, Europe, Female, Genetic Variation, Male, Phenotype, Single Nucleotide Polymorphism, Quantitative Trait Loci, Reference Values, Genetic Selection

13.

Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations.

Xiang, Ruidong; MacLeod, Iona M; Daetwyler, Hans D; de Jong, Gerben; O'Connor, Erin; Schrooten, Chris; Chamberlain, Amanda J; Goddard, Michael E.

Nat Commun ; 12(1): 860, 2021 02 08.

Article in English | MEDLINE | ID: mdl-33558518

ABSTRACT

The difficulty in finding causative mutations has hampered their use in genomic prediction. Here, we present a methodology to fine-map potentially causal variants genome-wide by integrating the functional, evolutionary and pleiotropic information of variants using GWAS, variant clustering and Bayesian mixture models. Our analysis of 17 million sequence variants in 44,000+ Australian dairy cattle for 34 traits suggests, on average, one pleiotropic QTL existing in each 50 kb chromosome-segment. We selected a set of 80k variants representing potentially causal variants within each chromosome segment to develop a bovine XT-50K genotyping array. The custom array contains many pleiotropic variants with biological functions, including splicing QTLs and variants at conserved sites across 100 vertebrate species. This biology-informed custom array outperformed the standard array in predicting genetic value of multiple traits across populations in independent datasets of 90,000+ dairy cattle from the USA, Australia and New Zealand.

Subject(s)

Cattle/genetics, Chromosome Mapping, Genetic Pleiotropy, Internationality, Heritable Quantitative Trait, Animals, Bayes Theorem, Mammalian Chromosomes/genetics, Cluster Analysis, Female, Genetic Markers, Genetic Variation, Genome, Male, Quantitative Trait Loci/genetics, Reproducibility of Results

14.

Improving Genomic Prediction of Crossbred and Purebred Dairy Cattle.

Khansefid, Majid; Goddard, Michael E; Haile-Mariam, Mekonnen; Konstantinov, Kon V; Schrooten, Chris; de Jong, Gerben; Jewell, Erica G; O'Connor, Erin; Pryce, Jennie E; Daetwyler, Hans D; MacLeod, Iona M.

Front Genet ; 11: 598580, 2020.

Article in English | MEDLINE | ID: mdl-33381150

ABSTRACT

This study assessed the accuracy and bias of genomic prediction (GP) in purebred Holstein (H) and Jersey (J) as well as crossbred (H and J) validation cows using different reference sets and prediction strategies. The reference sets were made up of different combinations of 36,695 H and J purebreds and crossbreds. Additionally, the effect of using different sets of marker genotypes on GP was studied (conventional panel: 50k, custom panel enriched with, or close to, causal mutations: XT_50k, and conventional high-density with a limited custom set: pruned HDnGBS). We also compared the use of genomic best linear unbiased prediction (GBLUP) and Bayesian (emBayesR) models, and the traits tested were milk, fat, and protein yields. On average, by including crossbred cows in the reference population, the prediction accuracies increased by 0.01-0.08 and were less biased (regression coefficient closer to 1 by 0.02-0.16), and the benefit was greater for crossbreds compared to purebreds. The accuracy of prediction increased by 0.02 using XT_50k compared to 50k genotypes without affecting the bias. Although using pruned HDnGBS instead of 50k also increased the prediction accuracy by about 0.02, it increased the bias for purebred predictions in emBayesR models. Generally, emBayesR outperformed GBLUP for prediction accuracy when using 50k or pruned HDnGBS genotypes, but the benefits diminished with XT_50k genotypes. Crossbred predictions derived from a joint pure H and J reference were similar in accuracy to crossbred predictions derived from the two separate purebred reference sets and combined proportional to breed composition. However, the latter approach was less biased by 0.13. Most interestingly, using an equalized breed reference instead of an H-dominated reference, on average, reduced the bias of prediction by 0.16-0.19 and increased the accuracy by 0.04 for crossbred and J cows, with little change in H accuracy.
In conclusion, we observed improved genomic predictions for both crossbreds and purebreds by equalizing breed contributions in a mixed breed reference that included crossbred cows. Furthermore, we demonstrate that, compared to the conventional 50k or high-density panels, our customized set of 50k sequence markers improved or matched the prediction accuracy and reduced bias with both GBLUP and Bayesian models.
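
To make the two validation statistics in this abstract concrete, here is a minimal sketch of one common way to compute them. It assumes accuracy is the Pearson correlation between predicted and observed values in the validation cows, and bias is judged from the slope of the regression of observed on predicted values (a slope of 1 meaning no over- or under-dispersion). The abstract only states that a regression coefficient closer to 1 is less biased, so the direction of the regression, the function name and the toy numbers are assumptions.

```python
from math import sqrt


def accuracy_and_slope(predicted, observed):
    """Return (Pearson correlation, slope of observed regressed on predicted)."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of squares and cross-products around the means.
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    accuracy = s_po / sqrt(s_pp * s_oo)
    slope = s_po / s_pp
    return accuracy, slope


# Toy data: predictions rank the cows perfectly but are twice as spread out
# as the observed values, so the slope is well below 1 (over-dispersion).
predicted = [1.0, 2.0, 3.0, 4.0]
observed = [0.5, 1.0, 1.5, 2.0]
acc, slope = accuracy_and_slope(predicted, observed)
print(f"accuracy = {acc:.2f}, regression slope = {slope:.2f}")
```

The toy case shows why both statistics are reported: a prediction can have a high correlation with the observed values and still be badly scaled.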

15.

Mapping QTL influencing gastrointestinal nematode burden in Dutch Holstein-Friesian dairy cattle.

Coppieters, Wouter; Mes, Ted H M; Druet, Tom; Farnir, Frédéric; Tamma, Nico; Schrooten, Chris; Cornelissen, Albert W C A; Georges, Michel; Ploeger, Harm W.

BMC Genomics ; 10: 96, 2009 Mar 02.

Article in English | MEDLINE | ID: mdl-19254385

ABSTRACT

BACKGROUND: Parasitic gastroenteritis caused by nematodes is second only to mastitis in terms of health costs to dairy farmers in developed countries. Sustainable control strategies complementing anthelmintics are desired, including selective breeding for enhanced resistance. RESULTS AND CONCLUSION: To quantify and characterize the genetic contribution to variation in resistance to gastro-intestinal parasites, we measured the heritability of faecal egg and larval counts in the Dutch Holstein-Friesian dairy cattle population. The heritability of faecal egg counts ranged from 7 to 21% and was generally higher than for larval counts. We performed a whole genome scan in 12 paternal half-daughter groups for a total of 768 cows, corresponding to the approximately 10% most and least infected daughters within each family (selective genotyping). Two genome-wide significant QTL were identified in an across-family analysis, on chromosomes 9 and 19, coinciding with previous findings in orthologous chromosomal regions in sheep. We identified six more suggestive QTL by within-family analysis. An additional 73 informative SNPs were genotyped on chromosome 19 and the ensuing high density map used in a variance component approach to simultaneously exploit linkage and linkage disequilibrium in an initial inconclusive attempt to refine the QTL map position.
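
The selective genotyping step described above (keeping roughly the 10% most and 10% least infected daughters within each family) can be sketched as follows. This is only an illustration under simple assumptions: each family is a list of (cow, phenotype) pairs, a higher phenotype means heavier infection, and the tail size is the rounded fraction with a minimum of one animal. The function name, data layout and rounding rule are hypothetical and not taken from the paper.

```python
def select_extremes(records_by_family, fraction=0.10):
    """Pick the lowest and highest phenotype tails within each family.

    records_by_family maps a family label to a list of (cow_id, phenotype).
    In very small families the two tails can overlap.
    """
    selected = {}
    for family, records in records_by_family.items():
        ranked = sorted(records, key=lambda rec: rec[1])  # ascending phenotype
        k = max(1, round(len(ranked) * fraction))  # animals per tail
        selected[family] = {
            "least": [cow for cow, _ in ranked[:k]],
            "most": [cow for cow, _ in ranked[-k:]],
        }
    return selected


# Toy family of 10 daughters with egg counts 1.0 .. 10.0.
demo = {"sire_A": [(f"cow{i}", float(i)) for i in range(1, 11)]}
picked = select_extremes(demo)
print(picked["sire_A"])
```

Genotyping only the tails concentrates the genotyping budget on the animals that carry most of the linkage information for the trait.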

Subject(s)

Cattle/genetics, Cattle/parasitology, Chromosome Mapping/veterinary, Quantitative Trait Loci, Animals, Cattle Diseases/genetics, Cattle Diseases/parasitology, Dairying, Feces/parasitology, Female, Genome, Genotype, Parasitic Intestinal Diseases/genetics, Parasitic Intestinal Diseases/parasitology, Linkage Disequilibrium, Male, Nematoda/physiology, Nematode Infections/genetics, Nematode Infections/parasitology, Ovum/parasitology, Single Nucleotide Polymorphism

16.

Effects of the number of markers per haplotype and clustering of haplotypes on the accuracy of QTL mapping and prediction of genomic breeding values.

Calus, Mario P L; Meuwissen, Theo H E; Windig, Jack J; Knol, Egbert F; Schrooten, Chris; Vereijken, Addie L J; Veerkamp, Roel F.

Genet Sel Evol ; 41: 11, 2009 Jan 15.

Article in English | MEDLINE | ID: mdl-19284677

ABSTRACT

The aim of this paper was to compare the effect of haplotype definition on the precision of QTL-mapping and on the accuracy of predicted genomic breeding values. In a multiple QTL model using identity-by-descent (IBD) probabilities between haplotypes, various haplotype definitions were tested, i.e. including 2, 6, 12 or 20 marker alleles and clustering base haplotypes related with an IBD probability of > 0.55, 0.75 or 0.95. Simulated data contained 1100 animals with known genotypes and phenotypes and 1000 animals with known genotypes and unknown phenotypes. Genomes comprising 3 Morgan were simulated and contained 74 polymorphic QTL and 383 polymorphic SNP markers with an average r² value of 0.14 between adjacent markers. The total number of haplotypes decreased by up to 50% when the window size was increased from two to 20 markers and decreased by at least 50% when haplotypes related with an IBD probability of > 0.55 instead of > 0.95 were clustered. An intermediate window size led to more precise QTL mapping. Window size and clustering had a limited effect on the accuracy of predicted total breeding values, ranging from 0.79 to 0.81. Our conclusion is that different optimal window sizes should be used in QTL-mapping versus genome-wide breeding value prediction.

Subject(s)

Domestic Animals/genetics, Breeding, Chromosome Mapping, Genetic Markers, Quantitative Trait Loci, Animals, Female, Genome, Haplotypes, Male, Genetic Models, Genetic Polymorphism
