2.
Genet Sel Evol ; 55(1): 70, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37828440

ABSTRACT

BACKGROUND: Combining the results of within-population genome-wide association studies (GWAS) based on whole-genome sequences into a single meta-analysis (MA) is an accurate and powerful method for identifying variants associated with complex traits. As part of the H2020 BovReg project, we performed sequence-level MA for beef production traits. Five partners from France, Switzerland, Germany, and Canada contributed summary statistics from sequence-based GWAS conducted with 54,782 animals from 15 purebred or crossbred populations. We combined the summary statistics for four growth, nine morphology, and 15 carcass traits into 16 MA, using both fixed-effects and z-score methods. RESULTS: The fixed-effects method was generally more informative in pointing to potentially causal variants, although we combined substantially different traits in each MA. In comparison with within-population GWAS, this approach highlighted (i) a larger number of quantitative trait loci (QTL), (ii) QTL more frequently located in genomic regions known for their effects on growth and meat/carcass traits, (iii) a smaller number of genomic variants within the QTL, and (iv) candidate variants that were more frequently located in genes. MA pinpointed variants in genes, including MSTN, LCORL, and PLAG1, that have been previously associated with morphology and carcass traits. We also identified dozens of other variants located in genes associated with growth and carcass traits, or with a function that may be related to meat production (e.g., HS6ST1, HERC2, WDR75, COL3A1, SLIT2, MED28, and ANKAR). Some of these variants overlapped with expression or splicing QTL reported in the cattle Genotype-Tissue Expression atlas (CattleGTEx) and could therefore regulate gene expression.
CONCLUSIONS: By identifying candidate genes and potential causal variants associated with beef production traits in cattle, MA demonstrates great potential for investigating the biological mechanisms underlying these traits. As a complement to within-population GWAS, this approach can provide deeper insights into the genetic architecture of complex traits in beef cattle.
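
The two combination methods named in this abstract can be illustrated for a single variant. The sketch below is not taken from the paper: it assumes the textbook inverse-variance weighted fixed-effects estimator and a sample-size weighted (Stouffer) z-score, and all function names are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted estimate for one variant (assumed textbook form)."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se, beta / se  # pooled effect, its standard error, z statistic

def zscore_meta(zs, ns):
    """Stouffer combination of per-population z-scores, weighted by sqrt(sample size)."""
    weights = [math.sqrt(n) for n in ns]
    return sum(w * z for w, z in zip(weights, zs)) / math.sqrt(sum(w * w for w in weights))

def two_sided_p(z):
    """Two-sided p-value of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

The fixed-effects form needs effect sizes on a comparable scale across populations, whereas the z-score form only needs the direction and strength of each signal, which is one reason both are often reported side by side.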


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Cattle/genetics , Animals , Phenotype , Meat/analysis , Genomics , Polymorphism, Single Nucleotide
3.
BMC Bioinformatics ; 23(1): 365, 2022 Sep 06.
Article in English | MEDLINE | ID: mdl-36068513

ABSTRACT

BACKGROUND: It is now widespread in livestock and plant breeding to use genotyping data to predict phenotypes with genomic prediction models. In parallel, genomic annotations related to a variety of traits are increasing in number and granularity, providing valuable insight into potentially important positions in the genome. The BayesRC model integrates this prior biological information by factorizing the genome according to disjoint annotation categories, in some cases enabling improved prediction of heritable traits. However, BayesRC is not adapted to cases where markers may have multiple annotations. RESULTS: We propose two novel Bayesian approaches to account for multi-annotated markers through a cumulative (BayesRC+) or preferential (BayesRCπ) model of the contribution of multiple annotation categories. We illustrate their performance on simulated data with various genetic architectures and types of annotations. We also explore their use on data from a backcross population of growing pigs in conjunction with annotations constructed using the PigQTLdb. In both simulated and real data, we observed a modest improvement in prediction quality with our models when used with informative annotations. In addition, our results show that BayesRC+ successfully prioritizes multi-annotated markers according to their posterior variance, while BayesRCπ provides a useful interpretation of informative annotations for multi-annotated markers. Finally, we explore several strategies for constructing annotations from a public database, highlighting the importance of careful consideration of this step. CONCLUSION: When used with annotations that are relevant to the trait under study, BayesRCπ and BayesRC+ allow for improved prediction and prioritization of multi-annotated markers, and can provide useful biological insight into the genetic architecture of traits.


Subject(s)
Models, Genetic , Multifactorial Inheritance , Bayes Theorem , Genomics/methods , Genotype , Phenotype , Polymorphism, Single Nucleotide
4.
G3 (Bethesda) ; 11(11), 2021 Oct 19.
Article in English | MEDLINE | ID: mdl-34849780

ABSTRACT

Technological advances and decreasing costs have led to the rise of increasingly dense genotyping data, making feasible the identification of potential causal markers. Custom genotyping chips, which combine medium-density genotypes with a custom genotype panel, can capitalize on these candidates to potentially yield improved accuracy and interpretability in genomic prediction. A particularly promising model to this end is BayesR, which divides markers into four effect size classes. BayesR has been shown to yield accurate predictions and promise for quantitative trait loci (QTL) mapping in real data applications, but an extensive benchmarking in simulated data is currently lacking. Based on a set of real genotypes, we generated simulated data under a variety of genetic architectures and phenotype heritabilities, and we evaluated the impact of excluding or including causal markers among the genotypes. We define several statistical criteria for QTL mapping, including several based on sliding windows to account for linkage disequilibrium (LD). We compare and contrast these statistics and their ability to accurately prioritize known causal markers. Overall, we confirm the strong predictive performance for BayesR in moderately to highly heritable traits, particularly for 50k custom data. In cases of low heritability or weak LD with the causal marker in 50k genotypes, QTL mapping is a challenge, regardless of the criterion used. BayesR is a promising approach to simultaneously obtain accurate predictions and interpretable classifications of SNPs into effect size classes. We illustrated the performance of BayesR in a variety of simulation scenarios, and compared the advantages and limitations of each.
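
The idea of four effect size classes can be made concrete by simulating marker effects from a four-component mixture. This is only a sketch: the abstract does not give the class variances, so the fractions 0, 1e-4, 1e-3 and 1e-2 of the genetic variance used here are an assumption based on the usual BayesR parameterization, and the function name and mixing proportions are hypothetical.

```python
import random

# Assumed per-class variance, as a fraction of the total genetic variance.
CLASS_VARIANCE_FRACTIONS = (0.0, 1e-4, 1e-3, 1e-2)

def sample_marker_effects(n_markers, mixing_props, genetic_variance, seed=0):
    """Draw a class label and an effect for each marker from the mixture prior."""
    rng = random.Random(seed)
    classes = rng.choices(range(4), weights=mixing_props, k=n_markers)
    effects = []
    for c in classes:
        sd = (CLASS_VARIANCE_FRACTIONS[c] * genetic_variance) ** 0.5
        # Class 0 is the "null" class: its effect is exactly zero.
        effects.append(rng.gauss(0.0, sd) if sd > 0 else 0.0)
    return classes, effects

classes, effects = sample_marker_effects(1000, (0.95, 0.03, 0.015, 0.005), 1.0)
```

In a fitted model the class of each marker is unknown and is sampled along with its effect; the posterior class memberships are what give the interpretable classification of SNPs mentioned in the abstract.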


Subject(s)
Genomics , Quantitative Trait Loci , Genotype , Linkage Disequilibrium , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide
5.
Genet Sel Evol ; 52(1): 55, 2020 Oct 01.
Article in English | MEDLINE | ID: mdl-32998688

ABSTRACT

BACKGROUND: Over the last years, genome-wide association studies (GWAS) based on imputed whole-genome sequences (WGS) have been used to detect quantitative trait loci (QTL) and highlight candidate genes for important traits. However, in general this approach does not allow to validate the effects of candidate mutations or determine if they are truly causative for the trait(s) in question. To address these questions, we applied a two-step, within-breed GWAS approach on 15 traits (5 linked with milk production, 2 with udder health, and 8 with udder morphology) in Montbéliarde (MON), Normande (NOR), and Holstein (HOL) cattle. We detected the most-promising candidate variants (CV) using imputed WGS of 2515 MON, 2203 NOR, and 6321 HOL bulls, and validated their effects in three younger populations of 23,926 MON, 9400 NOR, and 51,977 HOL cows. RESULTS: Bull sequence-based GWAS detected 84 QTL: 13, 10, and 30 for milk production traits; 3, 0, and 2 for somatic cell score (SCS); and 8, 2 and 16 for udder morphology traits, in MON, NOR, and HOL respectively. Five genomic regions with effects on milk production traits were shared among the three breeds whereas six (2 for production and 4 for udder morphology and health traits) had effects in two breeds. In 80 of these QTL, 855 CV were highlighted based on the significance of their effects and functional annotation. The subsequent GWAS on MON, NOR, and HOL cows validated 8, 9, and 23 QTL for production traits; 0, 0, and 1 for SCS; and 4, 1, and 8 for udder morphology traits, respectively. In 47 of the 54 confirmed QTL, the CV identified in bulls had more significant effects than single nucleotide polymorphisms (SNPs) from the standard 50K chip. The best CV for each validated QTL was located in a gene that was functionally related to production (36 QTL) or udder (9 QTL) traits. 
CONCLUSIONS: Using this two-step GWAS approach, we identified and validated 54 QTL that included CV mostly located within functional candidate genes and explained up to 6.3% (udder traits) and 37% (production traits) of the genetic variance of economically important dairy traits. These CV are now included in the chip used to evaluate French dairy cattle and can be integrated into routine genomic evaluation.


Subject(s)
Cattle/genetics , Lactation/genetics , Mammary Glands, Animal/physiology , Quantitative Trait Loci , Animals , Cattle/physiology , Female , Mammary Glands, Animal/anatomy & histology , Milk/metabolism , Polymorphism, Genetic , Quantitative Trait, Heritable
6.
Animals (Basel) ; 10(10), 2020 Oct 17.
Article in English | MEDLINE | ID: mdl-33080801

ABSTRACT

In the management of dairy cattle breeds, two recent trends have arisen that pose potential threats to genetic diversity: the use of reproductive technologies (RT) and a reduction in the number of bulls in breeding schemes. The expected outcome of these changes, in terms of both genetic gain and genetic diversity, is not trivial to predict. Here, we simulated 15 breeding schemes similar to those carried out in large French dairy cattle breeds; breeding schemes differed with respect to their dimensions, the intensity of RT use, and the type of RT involved. We found that intensive use of RT resulted in improved genetic gain, but deteriorated genetic diversity. Specifically, a reduction in the interval between generations through the use of ovum pick-up and in vitro fertilization (OPU-IVF) resulted in a large increase in the inbreeding rate both per year and per generation, suggesting that OPU-IVF could have severe adverse effects on genetic diversity. To achieve a given level of genetic gain, the scenarios that best maintained genetic diversity were those with a higher number of sires/bulls and a medium intensity of RT use or those with a higher number of female donors to compensate for the increased intensity of RT.

7.
Genet Sel Evol ; 51(1): 52, 2019 Sep 23.
Article in English | MEDLINE | ID: mdl-31547802

ABSTRACT

BACKGROUND: In France, implementation of genomic evaluations in dairy cattle breeds started in 2009 and this has modified the breeding schemes drastically. In this context, the goal of our study was to understand the impact of genomic selection on the genetic diversity of bulls from three French dairy cattle breeds born between 2005 and 2015 (Montbéliarde, Normande and Holstein) and the factors that are involved. METHODS: We compared annual genetic gains, inbreeding rates based on runs of homozygosity (ROH) and pedigree data, and mean ROH length within breeds, before and after the implementation of genomic selection. RESULTS: Genomic selection induced an increase in mean annual genetic gains of 50, 71 and 33% for Montbéliarde, Normande and Holstein bulls, respectively, and in parallel, the generation intervals were reduced by a factor of 1.7, 1.9 and 2, respectively. We found no significant change in inbreeding rate for the two national breeds, Montbéliarde and Normande, and a significant increase in inbreeding rate for the Holstein international breed, which is now as high as 0.55% per year based on ROH and 0.49% per year based on pedigree data (equivalent to a rate of 1.36 and 1.39% per generation, respectively). The mean ROH length was longer for bulls from the Holstein breed than for those from the other two breeds. CONCLUSIONS: With the implementation of genomic selection, the annual genetic gain increased for bulls from the three major French dairy cattle breeds. At the same time, the annual loss of genetic diversity increased for Holstein bulls, possibly because of the massive use of a few elite bulls in this breed, but not for Montbéliarde and Normande bulls. The increase in mean ROH length in Holstein may reflect the occurrence of recent inbreeding. 
New strategies in breeding schemes, such as female donor stations and embryo transfer, and recent implementation of genomic evaluations in small regional breeds should be studied carefully in order to ensure the sustainability of breeding schemes in the future.
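
The quantities compared in this abstract can be sketched numerically. The code below is an illustration, not the authors' procedure: it assumes that the ROH-based inbreeding coefficient is the fraction of the genome covered by ROH, and that the annual inbreeding rate is obtained from the slope of ln(1 - F) regressed on birth year. The function names are hypothetical.

```python
import math

def f_roh(roh_lengths_bp, genome_length_bp):
    """Inbreeding coefficient as the fraction of the genome covered by ROH."""
    return sum(roh_lengths_bp) / genome_length_bp

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def annual_inbreeding_rate(birth_years, f_values):
    """Rate dF such that 1 - F_t = (1 - F_0) * (1 - dF)**t, via ln(1 - F) on year."""
    slope = ols_slope(birth_years, [math.log(1.0 - f) for f in f_values])
    return 1.0 - math.exp(slope)

def per_generation_rate(annual_rate, generation_interval_years):
    """Convert an annual rate to a per-generation rate for a given interval."""
    return 1.0 - (1.0 - annual_rate) ** generation_interval_years
```

Under these assumptions, a shorter generation interval lowers the per-generation rate for the same annual rate, which is why the abstract reports both.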


Subject(s)
Breeding , Cattle/genetics , Genetic Variation , Selection, Genetic , Animals , Datasets as Topic , Female , France , Homozygote , Inbreeding , Male , Pedigree
8.
Nat Genet ; 50(3): 362-367, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29459679

ABSTRACT

Stature is affected by many polymorphisms of small effect in humans [1]. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes [2,3]. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (P < 5 × 10⁻⁸) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP-seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals.


Subject(s)
Body Size/genetics , Cattle/genetics , Conserved Sequence , Genome-Wide Association Study , Mammals/genetics , Animals , Body Height/genetics , Cattle/classification , Genetic Association Studies/veterinary , Genetic Variation , Genome-Wide Association Study/statistics & numerical data , Genome-Wide Association Study/veterinary , Humans , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics
9.
G3 (Bethesda) ; 8(1): 113-121, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29133511

ABSTRACT

Genomic selection (GS) is commonly used in livestock and increasingly in plant breeding. Relying on phenotypes and genotypes of a reference population, GS allows performance prediction for young individuals having only genotypes. This is expected to achieve fast high genetic gain but with a potential loss of genetic diversity. Existing methods to conserve genetic diversity depend mostly on the choice of the breeding individuals. In this study, we propose a modification of the reference population composition to mitigate diversity loss. Since the high cost of phenotyping is the limiting factor for GS, our findings are of major economic interest. This study aims to answer the following questions: how would decisions on the reference population affect the breeding population, and how to best select individuals to update the reference population and balance maximizing genetic gain and minimizing loss of genetic diversity? We investigated three updating strategies for the reference population: random, truncation, and optimal contribution (OC) strategies. OC maximizes genetic merit for a fixed loss of genetic diversity. A French Montbéliarde dairy cattle population with 50K SNP chip genotypes and simulations over 10 generations were used to compare these different strategies using milk production as the trait of interest. Candidates were selected to update the reference population. Prediction bias and both genetic merit and diversity were measured. Changes in the reference population composition slightly affected the breeding population. Optimal contribution strategy appeared to be an acceptable compromise to maintain both genetic gain and diversity in the reference and the breeding populations.


Subject(s)
Genome , Lactation/genetics , Models, Genetic , Quantitative Trait, Heritable , Selection, Genetic , Animals , Breeding/methods , Cattle , Dairying , Female , Genetic Variation , Genotype , Male , Phenotype
10.
Genet Sel Evol ; 49(1): 68, 2017 Sep 18.
Article in English | MEDLINE | ID: mdl-28923017

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) were performed at the sequence level to identify candidate mutations that affect the expression of six major milk proteins in Montbéliarde (MON), Normande (NOR), and Holstein (HOL) dairy cattle. Whey protein (α-lactalbumin and β-lactoglobulin) and casein (αs1, αs2, β, and κ) contents were estimated by mid-infrared (MIR) spectrometry, with medium to high accuracy (0.59 ≤ R² ≤ 0.92), for 848,068 test-day milk samples from 156,660 cows in the first three lactations. Milk composition was evaluated as average test-day measurements adjusted for environmental effects. Next, we genotyped a subset of 8080 cows (2967 MON, 2737 NOR, and 2306 HOL) with the BovineSNP50 Beadchip. For each breed, genotypes were first imputed to high-density (HD) using HD single nucleotide polymorphisms (SNPs) genotypes of 522 MON, 546 NOR, and 776 HOL bulls. The resulting HD SNP genotypes were subsequently imputed to the sequence level using 27 million high-quality sequence variants selected from Run4 of the 1000 Bull Genomes consortium (1147 bulls). Within-breed, multi-breed, and conditional GWAS were performed. RESULTS: Thirty-four distinct genomic regions were identified. Three regions on chromosomes 6, 11, and 20 had very significant effects on milk composition and were shared across the three breeds. Other significant effects, which partially overlapped across breeds, were found on almost all the autosomes. Multi-breed analyses provided a larger number of significant genomic regions with smaller confidence intervals than within-breed analyses.
Combinations of within-breed, multi-breed, and conditional analyses led to the identification of putative causative variants in several candidate genes that presented significant protein-protein interactions enrichment, including those with previously described effects on milk composition (SLC37A1, MGST1, ABCG2, CSN1S1, CSN2, CSN1S2, CSN3, PAEP, DGAT1, AGPAT6) and those with effects reported for the first time here (ALPL, ANKH, PICALM). CONCLUSIONS: GWAS applied to fine-scale phenotypes, multiple breeds, and whole-genome sequences seems to be effective to identify candidate gene variants. However, although we identified functional links between some candidate genes and milk phenotypes, the causality between candidate variants and milk protein composition remains to be demonstrated. Nevertheless, the identification of potential causative mutations that underlie milk protein composition may have immediate applications for improvements in cheese-making.


Subject(s)
Breeding , Cattle/genetics , Genome-Wide Association Study , Lactation/genetics , Milk Proteins/genetics , Mutation/genetics , Animals , Female , Genetic Variation/genetics , Genome/genetics , Male , Milk/chemistry
11.
J Dairy Sci ; 100(4): 2905-2908, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28161173

ABSTRACT

The construction and use of haploblocks [adjacent single nucleotide polymorphisms (SNP) in strong linkage disequilibrium] for genomic evaluation is advantageous, because the number of effects to be estimated can be reduced without discarding relevant genomic information. Furthermore, haplotypes (the combination of 2 or more SNP) can increase the probability of capturing the quantitative trait loci effect compared with individual SNP markers. With regards to haplotypes, the allele frequency parameter is also of interest, because as a selection criterion, it allows the number of rare alleles to be reduced, and the effects of those alleles are usually difficult to estimate. We have proposed a simple pipeline that simultaneously incorporates linkage disequilibrium and allele frequency information in genomic evaluation, and here we present the first results obtained with this procedure. We used a population of 2,235 progeny-tested bulls from the Montbéliarde breed for the tests. Phenotype data were available in the form of daughter yield deviations on 5 production traits, and genotype data were available from the 50K SNP chip. We conducted a classical validation study by splitting the population into training (80% oldest animals) and validation (20% youngest animals) sets to emulate a real-life scenario in which the selection candidates had no available phenotype data. We measured all reported parameters for the validation set. Our results proved that the proposed method was indeed advantageous, and that the accuracy of genomic evaluation could be improved. Compared with results from a genomic BLUP analysis, correlations between daughter yield deviations (a proxy for true) and genomic estimated breeding values increased by an average of 2.7 percentage points for the 5 traits. Inflation of the genomic evaluation of the selection candidates was also significantly reduced. 
The proposed method outperformed the other SNP and haplotype-based tests we had evaluated in a previous study. The combination of linkage disequilibrium-based haploblocks and allele frequency-based haplotype selection methods is a promising way to improve the efficiency of genomic evaluation. Further work is needed to optimize each step in the proposed analysis pipeline.
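
The validation design described in this abstract (training on the oldest 80% of animals, validating on the youngest 20%) and its two summary statistics can be sketched as follows. This is a minimal illustration with hypothetical names and record layout; using the slope of the regression of daughter yield deviations on genomic estimated breeding values as the inflation measure is an assumption, since the abstract does not define it.

```python
def split_by_age(animals, train_fraction=0.8):
    """Oldest animals go to training, youngest to validation."""
    ordered = sorted(animals, key=lambda a: a["birth_year"])
    cut = int(round(len(ordered) * train_fraction))
    return ordered[:cut], ordered[cut:]

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def regression_slope(ys, xs):
    """Slope of ys regressed on xs; values below 1 suggest inflated predictions."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sum((x - mx) ** 2 for x in xs)
```

In use, the prediction model would be fitted on the training set only, and `pearson` and `regression_slope` would then be computed between the observed deviations and the predicted values of the validation animals.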


Subject(s)
Haplotypes , Linkage Disequilibrium , Animals , Breeding , Cattle , Gene Frequency , Genome , Genomics , Genotype , Male , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide
12.
C R Biol ; 339(7-8): 274-7, 2016.
Article in English | MEDLINE | ID: mdl-27185591

ABSTRACT

The principles of genomic selection are described, along with the main factors affecting its efficiency and the assumptions underlying the different models proposed. The reasons for its rapid adoption in dairy cattle are explained, and the conditions for its application to other species are discussed. Prospects for development include selection for new traits and new breeding objectives, adoption of more robust approaches based on information on causal variants, and prediction of genotype × environment interactions.


Subject(s)
Animals, Domestic/genetics , Cattle/genetics , Genomics , Animals , Breeding , Dairying , Genetic Variation , Selection, Genetic
13.
J Dairy Sci ; 99(6): 4537-4546, 2016 Jun.
Article in English | MEDLINE | ID: mdl-26995132

ABSTRACT

Genomic evaluation methods today use single nucleotide polymorphism (SNP) as genomic markers to trace quantitative trait loci (QTL). Today most genomic prediction procedures use biallelic SNP markers. However, SNP can be combined into short, multiallelic haplotypes that can improve genomic prediction due to higher linkage disequilibrium between the haplotypes and the linked QTL. The aim of this study was to develop a method to identify the haplotypes, which can be expected to be superior in genomic evaluation, as compared with either SNP or other haplotypes of the same size. We first identified the SNP (termed as QTL-SNP) from the bovine 50K SNP chip that had the largest effect on the analyzed trait. It was assumed that these SNP were not the causative mutations and they merely indicated the approximate location of the QTL. Haplotypes of 3, 4, or 5 SNP were selected from short genomic windows surrounding these markers to capture the effect of the QTL. Two methods described in this paper aim at selecting the most optimal haplotype for genomic evaluation. They assumed that if an allele has a high frequency, its allele effect can be accurately predicted. These methods were tested in a classical validation study using a dairy cattle population of 2,235 bulls with genotypes from the bovine 50K SNP chip and daughter yield deviations (DYD) on 5 dairy cattle production traits. Combining the SNP into haplotypes was beneficial with all tested haplotypes, leading to an average increase of 2% in terms of correlations between DYD and genomic breeding value estimates compared with the analysis when the same SNP were used individually. Compared with haplotypes built by merging the QTL-SNP with its flanking SNP, the haplotypes selected with the proposed criteria carried less under- and over-represented alleles: the proportion of alleles with frequencies <1 or >40% decreased, on average, by 17.4 and 43.4%, respectively. 
The correlations between DYD and genomic breeding value estimates increased by 0.7 to 0.9 percentage points when the haplotypes were selected using any of the proposed methods compared with using the haplotypes built from the QTL-SNP and its flanking markers. We showed that the efficiency of genomic prediction could be improved at no extra costs, only by selecting the proper markers or combinations of markers for genomic prediction. One of the presented approaches was implemented in the new genomic evaluation procedure applied in dairy cattle in France in April 2015.


Subject(s)
Cattle/genetics , Genomics/methods , Haplotypes , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Alleles , Animals , Breeding , Dairying , Male , Quantitative Trait Loci
14.
Genet Sel Evol ; 47: 6, 2015 Feb 12.
Article in English | MEDLINE | ID: mdl-25885597

ABSTRACT

BACKGROUND: With dense genotyping, many choices exist for methods to detect quantitative trait loci (QTL) in livestock populations. However, no across-species study has been conducted on the performance of different methods using real data. We compared three methods that correct for relatedness either implicitly or explicitly: linkage and linkage disequilibrium haplotype-based analysis (LDLA), efficient mixed-model association (EMMA) analysis, and Bayesian whole-genome regression (BayesC). We analyzed one chromosome in each of five datasets (dairy cattle, beef cattle, sheep, horses, and pigs) using real genotypes based on dense single nucleotide polymorphisms and phenotypes. The P values corrected for multiple testing or Bayes factors greater than 150 were considered to be significant. To complete the real data study, we also simulated quantitative trait loci (QTL) for the same datasets based on the real genotypes. Several scenarios were chosen, with different QTL effects and linkage disequilibrium patterns. A pseudo-null statistical distribution was chosen to make the significance thresholds comparable across methods. RESULTS: For the real data, the three methods generally agreed within 1 or 2 cM for the locations of QTL regions and disagreed when no signals were significant (e.g. in pigs). For certain datasets, LDLA had more significant signals than EMMA or BayesC, but they were concentrated around the same peaks. Therefore, the three methods detected approximately the same number of QTL regions. For the simulated data, LDLA was slightly less powerful and accurate than either EMMA or BayesC but this depended strongly on how thresholds were set in the simulations. CONCLUSIONS: All three methods performed similarly for real and simulated data. No method was clearly superior across all datasets or for any particular dataset. For computational efficiency and ease of interpretation, EMMA is recommended, but using more than one method is suggested.


Subject(s)
Chromosome Mapping/methods , Genetic Markers , Genome , Livestock/genetics , Quantitative Trait Loci/genetics , Animals , Bayes Theorem , Cattle/genetics , Genetic Linkage , Genotype , Haplotypes/genetics , Horses/genetics , Linkage Disequilibrium , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Sheep/genetics , Sus scrofa/genetics
15.
Genet Sel Evol ; 45: 33, 2013 Sep 03.
Article in English | MEDLINE | ID: mdl-24004563

ABSTRACT

BACKGROUND: Genotyping with the medium-density Bovine SNP50 BeadChip® (50K) is now standard in cattle. The high-density BovineHD BeadChip®, which contains 777,609 single nucleotide polymorphisms (SNPs), was developed in 2010. Increasing marker density increases the level of linkage disequilibrium between quantitative trait loci (QTL) and SNPs and the accuracy of QTL localization and genomic selection. However, re-genotyping all animals with the high-density chip is not economically feasible. An alternative strategy is to genotype part of the animals with the high-density chip and to impute high-density genotypes for animals already genotyped with the 50K chip. Thus, it is necessary to investigate the error rate when imputing from the 50K to the high-density chip. METHODS: Five thousand one hundred and fifty three animals from 16 breeds (89 to 788 per breed) were genotyped with the high-density chip. Imputation error rates from the 50K to the high-density chip were computed for each breed with a validation set that included the 20% youngest animals. Marker genotypes were masked for animals in the validation population in order to mimic 50K genotypes. Imputation was carried out using the Beagle 3.3.0 software. RESULTS: Mean allele imputation error rates ranged from 0.31% to 2.41% depending on the breed. In total, 1980 SNPs had high imputation error rates in several breeds, which is probably due to genome assembly errors, and we recommend to discard these in future studies. Differences in imputation accuracy between breeds were related to the high-density-genotyped sample size and to the genetic relationship between reference and validation populations, whereas differences in effective population size and level of linkage disequilibrium showed limited effects. Accordingly, imputation accuracy was higher in breeds with large populations and in dairy breeds than in beef breeds. 
More than 99% of the alleles were correctly imputed if more than 300 animals were genotyped at high-density. No improvement was observed when multi-breed imputation was performed. CONCLUSION: In all breeds, imputation accuracy was higher than 97%, which indicates that imputation to the high-density chip was accurate. Imputation accuracy depends mainly on the size of the reference population and the relationship between reference and target populations.
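
An allele imputation error rate of the kind reported here can be computed by comparing true and imputed genotypes at the masked markers. The sketch below assumes genotypes coded as allele counts (0, 1 or 2) and counts each differing allele as one error out of two alleles per genotype; the abstract does not spell out its exact definition, so this is one plausible reading, and the function name is hypothetical.

```python
def allele_error_rate(true_genotypes, imputed_genotypes):
    """Share of wrongly imputed alleles, with genotypes coded as 0/1/2 allele counts."""
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype vectors must have the same length")
    wrong_alleles = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return wrong_alleles / (2 * len(true_genotypes))

# Four masked genotypes: one heterozygote/homozygote mix-up (1 allele wrong)
# and one opposite homozygote (2 alleles wrong) give 3 errors out of 8 alleles.
rate = allele_error_rate([0, 1, 2, 2], [0, 1, 1, 0])
```

The corresponding imputation accuracy is then simply one minus this rate.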


Subject(s)
Alleles , Cattle/genetics , Genetic Markers , Genetic Variation , Animals , Breeding , France , Genome , Genotype , Linear Models , Linkage Disequilibrium , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable
16.
Genet Res (Camb) ; 93(6): 409-17, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22189606

ABSTRACT

For genomic selection methods, the statistical challenge is to estimate the effect of each of the available single-nucleotide polymorphism (SNP). In a context where the number of SNPs (p) is much higher than the number of bulls (n), this task may lead to a poor estimation of these SNP effects if, as for genomic BLUP (gBLUP), all SNPs have a non-null effect. An alternative is to use approaches that have been developed specifically to solve the 'p >> n' problem. This is the case of variable selection methods and among them, we focus on the Elastic-Net (EN) algorithm that is a penalized regression approach. Performances of EN, gBLUP and pedigree-based BLUP were compared with data from three French dairy cattle breeds, giving very encouraging results for EN. We tried to push further the idea of improving SNP effect estimates by considering fewer of them. This variable selection strategy was considered both in the case of gBLUP and EN by adding an SNP pre-selection step based on quantitative trait locus (QTL) detection. Similar results were observed with or without a pre-selection step, in terms of correlations between direct genomic value (DGV) and observed daughter yield deviation in a validation data set. However, when applied to the EN algorithm, this strategy led to a substantial reduction of the number of SNPs included in the prediction equation. In a context where the number of genotyped animals and the number of SNPs gets larger and larger, SNP pre-selection strongly alleviates computing requirements and ensures that national evaluations can be completed within a reasonable time frame.
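
The variable selection behaviour of the Elastic-Net can be seen in a small coordinate descent sketch. This is not the implementation used in the study: it assumes the common objective (1/2n)·||y - Xb||² + lam·(alpha·||b||₁ + (1 - alpha)/2·||b||²), and the function names and parameters `lam` and `alpha` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning exactly 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net(X, y, lam, alpha, n_sweeps=200):
    """Cyclic coordinate descent for the assumed elastic-net objective."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # y - X @ beta, kept up to date after each update
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            # Covariance of SNP j with the residual that excludes its own effect.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            denom = sum(c * c for c in col) / n + lam * (1.0 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new - beta[j]
            if delta:
                residual = [r - c * delta for c, r in zip(col, residual)]
                beta[j] = new
    return beta
```

The L1 part (weight `alpha`) sets the effects of weakly associated SNPs exactly to zero, which is what reduces the number of SNPs in the prediction equation, while the L2 part shrinks the remaining effects as a ridge-type model would.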


Subject(s)
Algorithms , Cattle/genetics , Genome/genetics , Polymorphism, Single Nucleotide , Animals , Breeding/methods , Cattle/metabolism , Computational Biology/methods , Dairying , Female , Genomics/methods , Male , Milk/metabolism , Models, Genetic , Pedigree , Quantitative Trait Loci/genetics , Regression Analysis , Reproducibility of Results , Selection, Genetic
17.
Genet Res (Camb) ; 93(1): 77-87, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21144129

ABSTRACT

Empirical experience with genomic selection in dairy cattle suggests that the distribution of the effects of single nucleotide polymorphisms (SNPs) might be far from normality for some traits. An alternative, avoiding the use of arbitrary prior information, is the Bayesian Lasso (BL). Regular BL uses a common variance parameter for residual and SNP effects (BL1Var). We propose here a BL with different residual and SNP effect variances (BL2Var), equivalent to the original Lasso formulation. The λ parameter in Lasso is related to genetic variation in the population. We also suggest precomputing individual variances of SNP effects by BL2Var, to be later used in a linear mixed model (HetVar-GBLUP). Models were tested in a cross-validation design including 1756 Holstein and 678 Montbéliarde French bulls, with 1216 and 451 bulls used as training data; 51 325 and 49 625 polymorphic SNPs were used, respectively. Milk production traits were tested. Other methods tested included linear mixed models using variances inferred from pedigree estimates or integrated out from the data. Estimates of genetic variation in the population were close to pedigree estimates in BL2Var but not in BL1Var. BL1Var shrank breeding values too little because of the common variance. BL2Var was the most accurate method for prediction and accommodated major genes well, in particular for fat percentage. BL1Var was the least accurate. HetVar-GBLUP was almost as accurate as BL2Var and allows for simple computations and extensions.
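The idea of reusing precomputed, SNP-specific variances in a linear mixed model can be illustrated with a small SNP-effect sketch. It assumes the model y = Xb + e with b_j ~ N(0, σ²_j) and e ~ N(0, σ²_e), so the estimates solve (X'X + diag(σ²_e/σ²_j)) b = X'y. This is a simplified stand-in (no mean or other fixed effects, SNP-effect rather than breeding-value form), and all names are hypothetical rather than taken from the paper.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def snp_blup_het_var(X, y, snp_variances, residual_variance):
    """Estimate SNP effects with one variance per SNP (heterogeneous shrinkage)."""
    n, p = len(X), len(X[0])
    lhs = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    for j in range(p):
        # A small SNP variance means a large ratio, hence strong shrinkage.
        lhs[j][j] += residual_variance / snp_variances[j]
    rhs = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    return solve_linear_system(lhs, rhs)


# Toy data: the second SNP is given a larger variance and is shrunk less.
effects = snp_blup_het_var([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0], [1.0, 3.0], 1.0)
```

A SNP carrying a major gene would receive a large variance and keep most of its estimated effect, while the many small-effect SNPs are shrunk strongly. That is the intuition behind letting variances differ across SNPs.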


Subject(s)
Genome , Polymorphism, Single Nucleotide , Selection, Genetic/genetics , Animals , Bayes Theorem , Breeding , Cattle , Genotype , Pedigree
18.
BMC Proc ; 3 Suppl 7: S61, 2009 Dec 15.
Article in English | MEDLINE | ID: mdl-20018055

ABSTRACT

We applied a penalized regression approach to single-nucleotide polymorphisms in regions on chromosomes 1, 6, and 9 of the North American Rheumatoid Arthritis Consortium data. Results were compared with a standard single-locus association test. Overall, the penalized regression approach did not appear to offer any advantage with respect to either detection or localization of disease-associated polymorphisms, compared with the single-locus approach.

19.
Hum Hered ; 63(3-4): 229-38, 2007.
Article in English | MEDLINE | ID: mdl-17347570

ABSTRACT

To test for association between a disease and a set of linked markers, or to estimate relative risks of disease, several different methods have been developed. Many methods for family data require that individuals be genotyped at the full set of markers and that phase can be reconstructed. Individuals with missing data are excluded from the analysis. This can result in a substantial decrease in sample size and a loss of information. A possible solution to this problem is to use missing-data likelihood methods. We propose an alternative approach, namely the use of multiple imputation. Briefly, this method consists of estimating from the available data all possible phased genotypes and their respective posterior probabilities. These posterior probabilities are then used to generate replicate imputed data sets via a data augmentation algorithm. We performed simulations to test the efficiency of this approach for case/parent trio data and we found that the multiple imputation procedure generally gave unbiased parameter estimates with correct type 1 error and confidence interval coverage. Multiple imputation had some advantages over missing-data likelihood methods with regard to ease of use and model flexibility. Multiple imputation methods represent promising tools in the search for disease susceptibility variants.
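Once replicate imputed data sets have been analysed, their results must be combined into a single estimate. The abstract does not spell out this step. The sketch below shows the standard pooling rules for multiple imputation (Rubin's rules), with hypothetical function and variable names: the pooled estimate is the mean across data sets, and its variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m).

```python
def pool_imputed_estimates(estimates, variances):
    """Combine per-data-set estimates and their variances (Rubin's rules).

    estimates[k] and variances[k] come from analysing the k-th imputed
    data set; at least two data sets are needed for a between-variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total_variance = within + (1.0 + 1.0 / m) * between
    return pooled, total_variance


# Toy example: a log relative risk estimated on three imputed data sets.
estimate, variance = pool_imputed_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to missing genotypes and unknown phase. Analysing only the single most probable configuration would omit it.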


Subject(s)
Genetic Predisposition to Disease , Genotype , Case-Control Studies , Data Interpretation, Statistical , Family , Haplotypes , Humans , Models, Statistical , Odds Ratio , Regression Analysis
20.
BMC Proc ; 1 Suppl 1: S22, 2007.
Article in English | MEDLINE | ID: mdl-18466519

ABSTRACT

We recently described a new method to identify disease susceptibility loci, based on the analysis of the evolutionary relationships between haplotypes of cases and controls. However, haplotypes are often unknown and the problem of phase inference is even more crucial when there are missing data. In this work, we suggest using a multiple imputation algorithm to deal with missing phase and missing data, prior to a phylogeny-based analysis. We used the simulated data of Genetic Analysis Workshop 15 (Problem 3, answer known) to assess the power of the phylogeny-based analysis to detect disease susceptibility loci after reconstruction of haplotypes by a multiple-imputation method. We compare, for various rates of missing data, the performance of the multiple imputation method with the performance achieved when considering only the most probable haplotypic configurations or the true phase. When only the phase is unknown, all methods perform approximately equally well in identifying disease susceptibility sites. In the presence of missing data, however, the detection of disease susceptibility sites is significantly better when reconstructing haplotypes by multiple imputation than when considering only the best haplotype configurations.
