ABSTRACT
Chronic kidney disease (CKD) affects 10% of the human population, with only a small fraction genetically defined. CKD is also common in dogs and has been diagnosed in nearly all breeds, but its genetic basis remains unclear. Here, we performed a Bayesian mixed model genome-wide association analysis for canine CKD in a boxer population of 117 cases and 137 controls, and identified 21 genetic regions associated with the disease. At the top markers from each CKD region, the cases carried an average of 20.2 risk alleles, significantly more than controls (15.6 risk alleles). An ANOVA test showed that the 21 CKD regions together explained 57% of CKD phenotypic variation in the population. Based on whole genome sequencing data of 20 boxers, we identified 5,206 variants in LD with the top 50 BayesR markers. Following comparative analysis with human regulatory data, 17 putative regulatory variants were identified and tested with electrophoretic mobility shift assays. In total, four variants (three intronic variants from the MAGI2 and GALNT18 genes and one variant in an intergenic region on chr28) showed different binding ability for the risk and protective alleles in kidney cell lines. Many genes from the 21 CKD regions, including RELN, MAGI2 and FGFR2, have been implicated in human kidney development or disease. The results from this study provide new information that may shed light on the etiology of CKD in both dogs and humans.
Subject(s)
Genome-Wide Association Study , Chronic Renal Insufficiency , Dogs , Humans , Animals , Bayes Theorem , Chronic Renal Insufficiency/genetics , Chronic Renal Insufficiency/veterinary , Chronic Renal Insufficiency/epidemiology , Kidney , Alleles , Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Since the very beginning of genomic selection, researchers investigated methods that improved upon SNP-BLUP (single nucleotide polymorphism best linear unbiased prediction). SNP-BLUP gives equal weight to all SNPs, whereas it is expected that many SNPs are not near causal variants and thus do not have substantial effects. A recent approach to remedy this is to use genome-wide association study (GWAS) findings and increase the weights of GWAS-top-SNPs in genomic predictions. Here, we employ a genome-wide approach to integrate GWAS results into genomic prediction, called GWABLUP. RESULTS: GWABLUP consists of the following steps: (1) performing a GWAS in the training data which results in likelihood ratios; (2) smoothing the likelihood ratios over the SNPs; (3) combining the smoothed likelihood ratio with the prior probability of SNPs having non-zero effects, which yields the posterior probability of the SNPs; (4) calculating a weighted genomic relationship matrix using the posterior probabilities as weights; and (5) performing genomic prediction using the weighted genomic relationship matrix. Using high-density genotypes and milk, fat, protein and somatic cell count phenotypes on dairy cows, GWABLUP was compared to GBLUP, GBLUP (topSNPs) with extra weights for GWAS top-SNPs, and BayesGC, i.e. a Bayesian variable selection model. The GWAS resulted in six, five, four, and three genome-wide significant peaks for milk, fat and protein yield and somatic cell count, respectively. GWABLUP genomic predictions were 10, 6, 7 and 1% more reliable than those of GBLUP for milk, fat and protein yield and somatic cell count, respectively. It was also more reliable than GBLUP (topSNPs) for all four traits, and more reliable than BayesGC for three of the traits. Although GWABLUP showed a tendency towards inflation bias for three of the traits, this was not statistically significant. In a multitrait analysis, GWABLUP yielded the highest accuracy for two of the traits. 
However, for SCC, which was relatively unrelated to the yield traits, including yield trait GWAS-results reduced the reliability compared to a single trait analysis. CONCLUSIONS: GWABLUP uses GWAS results to differentially weigh all the SNPs in a weighted GBLUP genomic prediction analysis. GWABLUP yielded up to 10% and 13% more reliable genomic predictions than GBLUP for single and multitrait analyses, respectively. Extension of GWABLUP to single-step analyses is straightforward.
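Steps (2) to (4) of GWABLUP can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the moving-average smoother, the `window` parameter, the function names and the VanRaden-style scaling of the weighted relationship matrix are all assumptions, since the abstract does not specify them. Only the Bayes-rule step, which combines a likelihood ratio with a prior probability, follows directly from the description.

```python
def posterior_probs(likelihood_ratios, prior, window=1):
    """Smooth likelihood ratios over neighbouring SNPs, then apply Bayes' rule."""
    n = len(likelihood_ratios)
    posteriors = []
    for j in range(n):
        lo, hi = max(0, j - window), min(n, j + window + 1)
        # Assumption: a plain moving average stands in for the smoother.
        lr = sum(likelihood_ratios[lo:hi]) / (hi - lo)
        # Posterior probability that the SNP has a non-zero effect.
        posteriors.append(lr * prior / (lr * prior + (1.0 - prior)))
    return posteriors


def weighted_grm(genotypes, weights):
    """Weighted genomic relationship matrix from 0/1/2 genotype codes.

    Assumption: genotypes are centred by twice the allele frequency and the
    matrix is scaled by the weighted sum of 2p(1-p).
    """
    n_ind, n_snp = len(genotypes), len(weights)
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n_ind) for j in range(n_snp)]
    z = [[row[j] - 2.0 * freqs[j] for j in range(n_snp)] for row in genotypes]
    scale = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    return [
        [sum(w * a * b for w, a, b in zip(weights, z[i], z[k])) / scale
         for k in range(n_ind)]
        for i in range(n_ind)
    ]


# Hypothetical toy data: four SNPs, three animals, one SNP with a strong GWAS signal.
toy_weights = posterior_probs([1.0, 1.0, 50.0, 1.0], prior=0.01, window=0)
toy_grm = weighted_grm([[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 0]], toy_weights)
```

With a likelihood ratio of 1 the posterior equals the prior, so uninformative SNPs keep a small weight, while SNPs near GWAS peaks contribute more to the relationship matrix used in step (5).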
Subject(s)
Genome-Wide Association Study , Genome , Animals , Cattle/genetics , Female , Genome-Wide Association Study/methods , Bayes Theorem , Reproducibility of Results , Genotype , Phenotype , Single Nucleotide Polymorphism , Genetic Models
ABSTRACT
The aim of this study was to investigate the reference population size required to obtain substantial prediction accuracy within- and across-lines and the effect of using a multi-line reference population for genomic predictions of maternal traits in pigs. The data consisted of two nucleus pig populations, one pure-bred Landrace (L) and one Synthetic (S) Yorkshire/Large White line. All animals were genotyped with up to 30 K animals in each line, and all had records on maternal traits. Prediction accuracy was tested with three different marker data sets: High-density SNP (HD), whole genome sequence (WGS), and markers derived from WGS based on pig combined annotation dependent depletion-score (pCADD). Also, two different genomic prediction methods (GBLUP and Bayes GC) were compared for four maternal traits; total number piglets born (TNB), total number of stillborn piglets (STB), Shoulder Lesion Score and Body Condition Score. The main results from this study showed that a reference population of 3 K-6 K animals for within-line prediction generally was sufficient to achieve high prediction accuracy. However, when the number of animals in the reference population was increased to 30 K, the prediction accuracy significantly increased for the traits TNB and STB. For multi-line prediction accuracy, the accuracy was most dependent on the number of within-line animals in the reference data. The S-line provided a generally higher prediction accuracy compared to the L-line. Using pCADD scores to reduce the number of markers from WGS data in combination with the GBLUP method generally reduced prediction accuracies relative to GBLUP using HD genotypes. The BayesGC method benefited from a large reference population and was less dependent on the different genotype marker datasets to achieve a high prediction accuracy.
Subject(s)
Genotype , Whole Genome Sequencing , Animals , Female , Whole Genome Sequencing/veterinary , Swine/genetics , Genomics/methods , Population Density , Single Nucleotide Polymorphism , Breeding , Genome/genetics , Phenotype
ABSTRACT
Inflammatory bowel diseases are chronic gastrointestinal inflammatory disorders that affect millions of people worldwide. Genome-wide association studies have identified 200 inflammatory bowel disease-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 inflammatory bowel disease loci using high-density genotyping in 67,852 individuals. We pinpoint 18 associations to a single causal variant with greater than 95% certainty, and an additional 27 associations to a single variant with greater than 50% certainty. These 45 variants are significantly enriched for protein-coding changes (n = 13), direct disruption of transcription-factor binding sites (n = 3), and tissue-specific epigenetic marks (n = 10), with the last category showing enrichment in specific immune cells among associations stronger in Crohn's disease and in gut mucosa among associations stronger in ulcerative colitis. The results of this study suggest that high-resolution fine-mapping in large samples can convert many discoveries from genome-wide association studies into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.
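The idea of resolving an association to a single variant with a stated certainty can be illustrated with a small Python sketch. It assumes exactly one causal variant per locus and equal prior probability for every variant, so each variant's posterior is its Bayes factor divided by the locus total. The abstract does not describe the study's actual fine-mapping procedure, and the function names and Bayes factors below are hypothetical.

```python
def posterior_inclusion(bayes_factors):
    """Posterior probability of each variant being causal within one locus.

    Assumption: one causal variant per locus and equal priors.
    """
    total = sum(bayes_factors)
    return [bf / total for bf in bayes_factors]


def credible_set(posteriors, coverage=0.95):
    """Smallest set of variant indices whose posteriors sum to at least `coverage`."""
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += posteriors[i]
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical locus in which one variant dominates the evidence.
resolved = credible_set(posterior_inclusion([1.0, 200.0, 2.0, 1.0]))
# Hypothetical locus in which two variants in high LD share the evidence.
unresolved = credible_set(posterior_inclusion([10.0, 10.0, 1.0, 1.0]))
```

A locus counts as resolved to a single variant at 95% certainty when its credible set has one member, as in the first toy locus. The second locus needs three variants to reach the same coverage.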
Subject(s)
Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Inflammatory Bowel Diseases/genetics , Quantitative Trait Loci/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Binding Sites , Chromatin/genetics , Ulcerative Colitis/genetics , Crohn Disease/genetics , Genetic Epigenesis/genetics , Female , Genome-Wide Association Study , Genotype , Humans , Linkage Disequilibrium/genetics , Male , Middle Aged , Smad3 Protein/genetics , Transcription Factors/metabolism , Young Adult
ABSTRACT
Many quantitative traits measured in breeding programs are genetically correlated. The genetic correlations between the traits indicate that the measurement of one trait carries information on others. To benefit from this information, multi-trait genomic prediction (MTGP) is preferable. However, MTGP is more difficult to implement than single-trait genomic prediction (STGP), and even more challenging when the goal is to exploit not only the information on other traits but also the information on ungenotyped animals. This can be accomplished using either single-step or multistep methods. The single-step method was achieved by implementing a single-step genomic best linear unbiased prediction (ssGBLUP) approach using a multi-trait model. Here, we examined a multistep analysis based on an approach called "Absorption" to achieve this goal. The Absorption approach absorbed all available information, including the phenotypic information on ungenotyped animals as well as the information on other traits if applicable, into the mixed model equations of genotyped animals. The multistep analysis consisted of (1) applying the Absorption approach, which exploits all available information, and (2) implementing genomic BLUP (GBLUP) prediction on the absorbed dataset. In this study, the ssGBLUP and multistep analyses were applied to 5 traits in Duroc pigs: slaughter percentage, feed consumption from 40 to 120 kg (FC40_120), days of growth from 40 to 120 kg (D40_120), age at 40 kg (A40) and lean meat percentage. The results showed that MTGP yielded higher accuracy than STGP, on average 0.057 higher for the multistep method and 0.045 higher for ssGBLUP. The multistep method achieved prediction accuracy similar to that of ssGBLUP. However, the prediction bias of the multistep method was in general lower than that of ssGBLUP.
Subject(s)
Genomics , Meat , Animals , Swine , Phenotype , Genotype
ABSTRACT
The aim of this study was to compare three methods of genomic prediction: GBLUP, BayesC and BayesGC for genomic prediction of six maternal traits in Landrace sows using a panel of 660 K SNPs. The effects of different priors for the Bayesian methods were also investigated. GBLUP does not take the genetic architecture into account as all SNPs are assumed to have equally sized effects and relies heavily on the relationships between the animals for accurate predictions. Bayesian approaches rely on both fitting SNPs that describe relationships between animals in addition to fitting single SNP effects directly. Both the relationship between the animals and single SNP effects are important for accurate predictions. Maternal traits in sows are often more difficult to record and have lower heritabilities. BayesGC was generally the method with the higher accuracy, although its accuracy was for some traits matched by that of GBLUP and for others by that of BayesC. For piglet mortality within 3 weeks, BayesGC achieved up to 9.2% higher accuracy. For many of the traits, however, the methods did not show significant differences in accuracies.
Subject(s)
Genome , Genomics , Animals , Bayes Theorem , Female , Genomics/methods , Genotype , Genetic Models , Phenotype , Single Nucleotide Polymorphism , Swine/genetics
ABSTRACT
The goal of this study was to assess the feasibility of across-country genomic predictions in Norwegian White Sheep (NWS) and New Zealand Composite (NZC) sheep populations with similar development history. Different training populations were evaluated (i.e., including only NWS or NZC, or combining both populations). Predictions were performed using the actual phenotypes (normalized) and the single-step GBLUP via Bayesian inference. Genotyped NWS animals born in 2016 (N = 267) were used to assess the accuracy and bias of genomic estimated breeding values (GEBVs) predicted for birth weight (BW), weaning weight (WW), carcass weight (CW), EUROP carcass classification (EUC), and EUROP fat grading (EUF). The accuracy and bias of GEBVs differed across traits and training population used. For instance, the GEBV accuracies ranged from 0.13 (BW) to 0.44 (EUC) for GEBVs predicted including only NWS, from 0.06 (BW) to 0.15 (CW) when including only NZC, and from 0.10 (BW) to 0.41 (EUC) when including both NWS and NZC animals in the training population. The regression coefficients used to assess the spread of GEBVs (bias) ranged from 0.26 (BW) to 0.64 (EUF) for only NWS, 0.10 (EUC) to 0.52 (CW) for only NZC, and from 0.42 (WW) to 2.23 (EUC) for both NWS and NZC in the training population. Our findings suggest that across-country genomic predictions based on ssGBLUP might be possible for NWS and NZC, especially for novel traits.
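The two validation statistics reported above can be computed as in the following Python sketch. It assumes that accuracy is the Pearson correlation between GEBVs and (normalized) phenotypes in the validation animals, and that bias is the slope of the regression of phenotype on GEBV, where a slope of 1 means no over- or under-dispersion. The abstract does not give the exact definitions, and the function name and numbers are hypothetical.

```python
def accuracy_and_slope(gebv, phenotype):
    """Return (Pearson correlation, regression slope of phenotype on GEBV)."""
    n = len(gebv)
    mean_g = sum(gebv) / n
    mean_p = sum(phenotype) / n
    # Sums of squares and cross-products around the means.
    s_gg = sum((g - mean_g) ** 2 for g in gebv)
    s_pp = sum((p - mean_p) ** 2 for p in phenotype)
    s_gp = sum((g - mean_g) * (p - mean_p) for g, p in zip(gebv, phenotype))
    correlation = s_gp / (s_gg * s_pp) ** 0.5
    slope = s_gp / s_gg
    return correlation, slope


# Hypothetical validation set in which the GEBVs are spread twice too widely.
toy_accuracy, toy_slope = accuracy_and_slope([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0])
```

Under this definition, slopes below 1 indicate GEBVs that vary more than the phenotypes support (inflation), and slopes above 1 indicate the opposite.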
Subject(s)
Genome , Genomics , Animals , Bayes Theorem , Genotype , Genetic Models , New Zealand , Phenotype , Single Nucleotide Polymorphism , Sheep/genetics
ABSTRACT
BACKGROUND: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method, that could handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions and assess its quantitative trait locus (QTL) mapping precision. METHODS: The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis-Hastings sampling that directs computations to the SNPs, which are, a priori, most likely to be included into the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. RESULTS: The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. 
Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. CONCLUSIONS: Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
Subject(s)
Breeding/methods , Cattle/genetics , Genome-Wide Association Study/methods , Quantitative Trait Loci , Whole Genome Sequencing/methods , Animals , Female , Male , Meat Products/standards , Heritable Quantitative Trait
ABSTRACT
BACKGROUND: Polyploidy is widespread in animals and especially in plants. Different kinds of ploidies exist, for example, hexaploidy in wheat, octaploidy in strawberries, and diploidy, triploidy, tetraploidy, and pseudo-tetraploidy (partly tetraploid) in fish. Triploid offspring from diploid parents occur frequently in the wild in Atlantic salmon (Salmo salar) and, as with triploidy in general, the triploid individuals are sterile. Induced triploidy in Atlantic salmon is common practice to produce sterile fish. In Norwegian aquaculture, production of sterile triploid fish is an attempt by government and industry to limit genetic introgression between wild and farmed fish. However, triploid fish may have traits and properties that differ from those of diploids. Investigating the genetics behind traits in triploids has proved challenging because genotype calling of genetic markers in triploids is not supported by standard software. Our aim was to develop a method that can be used for genotype calling of genetic markers in triploid individuals. RESULTS: Allele signals were produced for 381 triploid Atlantic salmon offspring using a 56 K Thermo Fisher GeneTitan genotyping platform. Genotypes were successfully called by applying finite normal mixture models to the (transformed) allele signals. Subsets of markers were filtered by quality control statistics for use with downstream analyses. The quality of the called genotypes was sufficient to allow for assignment of diploid parents to the triploid offspring and to discriminate between maternal and paternal parents from autosomal inheritance patterns. In addition, as the maternal inheritance in triploid offspring is identical to gynogenetic inheritance, the maternal recombination pattern for each chromosome could be mapped by using a similar approach as that used in gene-centromere mapping. CONCLUSIONS: We show that calling of dense marker genotypes for triploid individuals is feasible. 
The resulting genotypes can be used in parentage assignment of triploid offspring to diploid parents, to discriminate between maternal and paternal parents using autosomal inheritance patterns, and to map the maternal recombination pattern using an approach similar to gene-centromere mapping. Genotyping of triploid individuals is important both for selective breeding programs and unravelling the underlying genetics of phenotypes recorded in triploids. In principle, the developed method can be used for genotype calling of other polyploid organisms.
Subject(s)
Diploidy , Genetic Markers , Genotype , Salmo salar/genetics , Triploidy , Alleles , Animals , Breeding , Fisheries
ABSTRACT
BACKGROUND: The availability of both pedigree and genomic sources of information for animal breeding and genetics has created new challenges in understanding how they can be best used and interpreted. This study estimated genetic variance components based on genomic information and compared these to the variance components estimated from pedigree alone in a population generated to estimate non-additive genetic variance. Furthermore, the study examined the impact of the assumptions of Hardy-Weinberg equilibrium (HWE) on estimates of genetic variance components. For the first time, the magnitude of inbreeding depression for important commercial traits in Nile tilapia was estimated by using genomic data. RESULTS: The study estimated the non-additive genetic variance in a Nile tilapia population of full-sib families and, when present, it was almost entirely represented by additive-by-additive epistatic variance, although in pedigree studies this non-additive variance is commonly assumed to arise from dominance. For body depth (BD) and body weight at harvest (BWH), the proportion of additive-by-additive epistatic to phenotypic variance was estimated to be 0.15 and 0.17 using genomic data (P < 0.05). In addition, with genomic data, the maternal variance (P < 0.05) for BD, BWH, body length (BL) and fillet weight (FW) explained approximately 10% of the phenotypic variances, which was comparable to pedigree-based estimates. The study also showed the detrimental effects of inbreeding on commercial traits of tilapia, which was estimated to reduce trait values by 1.1, 0.9, 0.4 and 0.3% per 1% increase in the individual homozygosity for FW, BWH, BD and BL, respectively. The presence of inbreeding depression but lack of dominance variance was consistent with an infinitesimal dominance model for the traits. 
CONCLUSIONS: The benefit of including non-additive genetic effects for genetic evaluations in tilapia breeding schemes is not evident from these findings, but the observed inbreeding depression points to a role for reciprocal recurrent selection. Commercially, this conclusion will depend on the scheme's operational costs and resources. The creation of maternal lines in Tilapia breeding schemes may be a possibility if the variation associated with maternal effects is heritable.
Subject(s)
Cichlids/genetics , Genome , Meat/analysis , Animals , Body Weight , Cichlids/growth & development , Cichlids/physiology , Female , Inbreeding , Inbreeding Depression , Male , Maternal Inheritance , Genetic Models , Skeletal Muscle/chemistry , Pedigree , Phenotype , Heritable Quantitative Trait
ABSTRACT
This study tested and compared different implementation strategies for genomic selection for Norwegian White Sheep, aiming to increase genetic gain for maternal traits. These strategies were evaluated through a full-scale stochastic simulation for their genetic gain in growth, carcass and maternal traits, total genetic gain (a weighted sum of the gain in each trait) and rates of inbreeding. Results showed that genomic selection schemes increased genetic gain for maternal traits but reduced genetic gain for other traits. This could also be obtained by selecting rams for artificial selection at a higher age. Implementation of genomic selection in the current breeding structure increased genetic gain for maternal traits by up to 57%, but was outcompeted by reducing the generation interval for artificial insemination rams from the current 3 to 2 years. Then, total genetic gain for maternal traits increased by 65%-77% and total genetic gain by 18%-20%, but at increased rates of inbreeding.
Subject(s)
Breeding/methods , Genomics , Genetic Selection , Domestic Sheep/genetics , Animals , Computer Simulation , Female , Genome , Inbreeding , Male , Genetic Models , Phenotype , Domestic Sheep/growth & development
ABSTRACT
BACKGROUND: The main aim of single-step genomic predictions was to facilitate optimal selection in populations consisting of both genotyped and non-genotyped individuals. However, in spite of intensive research, biases still occur, which make it difficult to perform optimal selection across groups of animals. The objective of this study was to investigate whether incomplete genotype datasets with errors could be a potential source of level-bias between genotyped and non-genotyped animals and between animals genotyped on different single nucleotide polymorphism (SNP) panels in single-step genomic predictions. RESULTS: Incomplete and erroneous genotypes of young animals caused biases in breeding values between groups of animals. Systematic noise or missing data for less than 1% of the SNPs in the genotype data had substantial effects on the differences in breeding values between genotyped and non-genotyped animals, and between animals genotyped on different chips. The breeding values of young genotyped individuals were biased upward, and the magnitude was up to 0.8 genetic standard deviations, compared with breeding values of non-genotyped individuals. Similarly, the magnitude of a small value added to the diagonal of the genomic relationship matrix affected the level of average breeding values between groups of genotyped and non-genotyped animals. Cross-validation accuracies and regression coefficients were not sensitive to these factors. CONCLUSIONS: Because, historically, different SNP chips have been used for genotyping different parts of a population, fine-tuning of imputation within and across SNP chips and handling of missing genotypes are crucial for reducing bias. Although all the SNPs used for estimating breeding values are present on the chip used for genotyping young animals, incompleteness and some genotype errors might lead to level-biases in breeding values.
Subject(s)
Breeding/methods , Cattle/genetics , Genomics/methods , Single Nucleotide Polymorphism , Animals , Bias , Female , Genotype , Phenotype
ABSTRACT
BACKGROUND: In pigs, crossbreeding aims at exploiting heterosis, but heterosis is difficult to quantify. Heterozygosity at genetic markers is easier to measure and could potentially be used as an indicator of heterosis. The objective of this study was to investigate the effect of heterozygosity on various maternal and production traits in purebred and crossbred pigs. The proportion of heterozygosity at genetic markers across the genome for each individual was included in the prediction model as a fixed regression across or within breeds. RESULTS: Estimates of regression coefficients of heterozygosity showed large effects for some traits. For maternal traits, regression coefficient estimates were always in a favourable direction, while for production, meat and slaughter quality traits, they were both favourable and unfavourable. Traits with the largest estimated effects of heterozygosity were total number born, litter weight at 3 weeks, weight at 150 days, and age at 40 kg. Estimates of regression coefficients on heterozygosity differed between breeds. Traits with the largest effect of heterozygosity also showed a significant (P < 0.05) increase in prediction accuracy when heterozygosity was included in the model compared to the model without heterozygosity. CONCLUSIONS: For traits with the largest estimates of regression coefficients on heterozygosity, the inclusion of heterozygosity in the model improved prediction accuracy. Using models that include heterozygosity would result in selecting different animals for breeding, which has the potential to improve genetic gain for these traits. This is most beneficial when crossbreds or several breeds are included in the estimation of breeding values and is relevant to all species, not only pigs. Thus, our results show that including heterozygosity in the model is beneficial for some traits, likely due to dominant gene action.
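The covariate used in this study, the proportion of heterozygous markers per individual, is simple to compute. The Python sketch below assumes genotypes are coded 0/1/2 as counts of one allele, with `None` for missing calls. The coding, the function name and the toy data are illustrative assumptions, not details taken from the study.

```python
def heterozygosity(genotypes):
    """Proportion of called markers that are heterozygous (coded 1)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this individual")
    return sum(1 for g in called if g == 1) / len(called)


# Hypothetical animals: a purebred with few heterozygous loci and a crossbred with many.
purebred = heterozygosity([0, 0, 2, 1, 2, None, 0, 2])
crossbred = heterozygosity([1, 1, 0, 1, 2, 1, 1, None])
```

The resulting value per animal would then enter the prediction model as a fixed regression covariate, either with one coefficient across breeds or one coefficient per breed.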
Subject(s)
Heterozygote , Genetic Hybridization , Inbreeding , Heritable Quantitative Trait , Swine/genetics , Animals , Female , Hybrid Vigor , Male
ABSTRACT
BACKGROUND: Two distinct populations have been extensively studied in Atlantic cod (Gadus morhua L.): the Northeast Arctic cod (NEAC) population and the coastal cod (CC) population. The objectives of the current study were to identify genomic islands of divergence and to propose an approach to quantify the strength of selection pressures using whole-genome single nucleotide polymorphism (SNP) data. After applying filtering criteria, information on 93 animals (9 CC individuals, 50 NEAC animals and 34 CC × NEAC crossbred individuals) and 3,123,434 autosomal SNPs were used. RESULTS: Four genomic islands of divergence were identified on chromosomes 1, 2, 7 and 12, which were mapped accurately based on SNP data and which extended in size from 11 to 18 Mb. These regions differed considerably between the two populations although the differences in the rest of the genome were small due to considerable gene flow between the populations. The estimates of selection pressures showed that natural selection was substantially more important than genetic drift in shaping these genomic islands. Our data confirmed results from earlier publications that suggested that genomic islands are due to chromosomal rearrangements that are under strong selection and reduce recombination between rearranged and non-rearranged segments. CONCLUSIONS: Our findings further support the hypothesis that selection and reduced recombination in genomic islands may promote speciation between these two populations although their habitats overlap considerably and migrations occur between them.
Subject(s)
Gadus morhua/genetics , Genomic Islands , Single Nucleotide Polymorphism , Genetic Selection , Animals , Chromosomes/genetics , Gene Flow , Genetic Drift , Genetic Recombination
ABSTRACT
BACKGROUND: Photobacteriosis is an infectious disease caused by the Gram-negative bacterium Photobacterium damselae subsp. piscicida (Phdp), which may cause high mortalities (90-100%) in sea bream. Selection and breeding for resistance against infectious diseases is a highly valuable tool to help prevent or diminish disease outbreaks, and currently available advanced selection methods that apply genomic information could improve the response to selection. An experimental group of sea bream juveniles was derived from a Ferme Marine de Douhet (FMD, Oléron Island, France) selected line using ~109 parents (~25 females and 84 males). This group of 1187 individuals represented 177 full-sib families with 1-49 sibs per family, which were challenged with virulent Phdp for 18 days, and mortalities were recorded during this period. Tissue samples were collected from the parents and the recorded offspring for DNA extraction, library preparation using 2b-RAD and genotyping by sequencing. Genotypic data were used to develop a linkage map, to perform a genome-wide association analysis and to estimate breeding values. RESULTS: The analysis of genetic variation for resistance against Phdp revealed moderate genomic heritability, with estimates of ~0.32. A genome-wide association analysis revealed a quantitative trait locus (QTL) including 11 SNPs on linkage group 17 that were significantly associated with the trait, with p-values crossing the genome-wide Bonferroni-corrected threshold P ≤ 2.22e-06. The proportion of total genetic variance explained by the single most significant SNP ranged from 13.28 to 16.14%, depending on the method used to compute the variance. The accuracy of predicted breeding values was 19-24% higher when using genomic rather than pedigree information. 
CONCLUSION: The current study demonstrates that SNP-based genotyping of a sea bream population with the 2b-RAD approach is effective at capturing the genetic variation for resistance against Phdp. Prediction accuracies obtained using genomic information were significantly higher than those obtained using pedigree information, which highlights the importance and potential of genomic selection in commercial breeding programs.
Subject(s)
Fish Diseases/genetics , Fish Diseases/microbiology , Gram-Negative Bacterial Infections/veterinary , Photobacterium/pathogenicity , Sea Bream/genetics , Sea Bream/microbiology , Animals , Chromosome Mapping , Disease Resistance/genetics , Fisheries , France , Genetic Linkage , Genome-Wide Association Study , Gram-Negative Bacterial Infections/genetics , Pedigree , Single Nucleotide Polymorphism , Quantitative Trait Loci
ABSTRACT
BACKGROUND: Parentage assignment is usually based on a limited number of unlinked, independent genomic markers (microsatellites, low-density single nucleotide polymorphisms (SNPs), etc.). Classical methods for parentage assignment are exclusion-based (i.e. based on loci that violate Mendelian inheritance) or likelihood-based, assuming independent inheritance of loci. For true parent-offspring relations, genotyping errors cause apparent violations of Mendelian inheritance. Thus, the maximum proportion of such violations must be determined, which is complicated by variable call- and genotype error rates among loci and individuals. Recently, genotyping using high-density SNP chips has become available at lower cost and is increasingly used in genetics research and breeding programs. However, dense SNPs are not independently inherited, violating the assumptions of the likelihood-based methods. Hence, parentage assignment usually assumes a maximum proportion of exclusions, or applies likelihood-based methods on a smaller subset of independent markers. Our aim was to develop a fast and accurate trio parentage assignment method for dense SNP data without prior genotyping error- or call rate knowledge among loci and individuals. This genomic relationship likelihood (GRL) method infers parentage by using genomic relationships, which are typically used in genomic prediction models. RESULTS: Using 50 simulated datasets with 53,427 to 55,517 SNPs, genotyping error rates of 1-3% and call rates of ~ 80 to 98%, GRL was found to be fast and highly (~ 99%) accurate for parentage assignment. An iterative approach was developed for training using the evaluation data, giving similar accuracy. For comparison, we used the Colony2 software that assigns parentage and sibship simultaneously to increase the power of the likelihood-based method and found that it has considerably lower accuracy than GRL. 
We also compared GRL with an exclusion-based method in which one of the parameters was estimated using GRL assignments. This method was slightly more accurate than GRL. CONCLUSIONS: We show that GRL is a fast and accurate method of parentage assignment that can use dense, non-independent SNPs, with variable call rates and unknown genotyping error rates. By offering an alternative way of assigning parents, GRL is also suitable for estimating the expected proportion of inconsistent parent-offspring genotypes for exclusion-based models.
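The exclusion-based idea described in this abstract can be illustrated with a minimal sketch. It is not the GRL method and not the authors' code: it is a simplified single-parent check that counts opposing homozygotes, and the function names, the 0/1/2 genotype coding, the toy genotypes and the 5% threshold are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def opposing_homozygote_rate(parent: Sequence[Genotype],
                             child: Sequence[Genotype]) -> float:
    """Share of jointly called loci where parent and child are opposite homozygotes."""
    called = [(p, c) for p, c in zip(parent, child)
              if p is not None and c is not None]
    if not called:
        raise ValueError("no locus is called in both animals")
    # A 0 vs 2 pair cannot occur for a true parent-offspring pair without error.
    conflicts = sum(1 for p, c in called if abs(p - c) == 2)
    return conflicts / len(called)


def assign_parent(child: Sequence[Genotype],
                  candidates: Dict[str, Sequence[Genotype]],
                  max_rate: float) -> Optional[str]:
    """Return the candidate with the lowest conflict rate, or None if all exceed max_rate."""
    rates = {name: opposing_homozygote_rate(g, child)
             for name, g in candidates.items()}
    best = min(rates, key=rates.get)
    return best if rates[best] <= max_rate else None


# Hypothetical toy data: eight loci, one missing call in the child.
child = [0, 2, 1, None, 2, 0, 1, 2]
candidates = {
    "sire_A": [0, 1, 1, 2, 2, 1, 0, 2],
    "sire_B": [2, 0, 1, 1, 0, 2, 2, 0],
}
print(assign_parent(child, candidates, max_rate=0.05))
```

The `max_rate` argument is the "maximum proportion of violations" that the abstract says is hard to set when error and call rates vary; here it is simply passed in by hand.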
Subject(s)
Computational Biology/methods; Genotyping Techniques/veterinary; Polymorphism, Single Nucleotide; Animals; Breeding; Computer Simulation; Databases, Genetic; Likelihood Functions; Software
ABSTRACT
BACKGROUND: For marker effect models and genomic animal models, computational requirements increase with the number of loci and the number of genotyped individuals, respectively. In the latter case, the inverse genomic relationship matrix (GRM) is typically needed, which is computationally demanding to obtain for large datasets. Thus, there is a great need for dimensionality-reduction methods that can analyze massive genomic data. For this purpose, we developed reduced-dimension singular value decomposition (SVD) based models for genomic prediction. METHODS: Fast SVD is performed by analyzing different chromosomes/genome segments in parallel and/or by restricting SVD to a limited core of genotyped individuals, producing chromosome- or segment-specific principal components (PC). Given a limited effective population size, nearly all the genetic variation can be effectively captured by a limited number of PC. Genomic prediction can then be performed either by PC ridge regression (PCRR) or by genomic animal models using an inverse GRM computed from the chosen PC (PCIG). In the latter case, computation of the inverse GRM will be feasible for any number of genotyped individuals and can be readily produced row- or element-wise. RESULTS: Using simulated data, we show that PCRR and PCIG models, using chromosome-wise SVD of a core sample of individuals, are appropriate for genomic prediction in a larger population, and result in predicted breeding values that are virtually identical to those of the original full-dimension genomic model (r = 1.000). Compared with other algorithms (e.g. algorithm for proven and young animals, APY), the (chromosome-wise SVD-based) PCRR and PCIG models were more robust to the size of the core sample, giving nearly identical results even down to 500 core individuals. The method was also successfully tested on a large multi-breed dataset. CONCLUSIONS: SVD can be used for dimensionality reduction of large genomic datasets.
After SVD, genomic prediction using dense genomic data and many genotyped individuals can be done in a computationally efficient manner. Using this method, the resulting genomic estimated breeding values were virtually identical to those computed from a full-dimension genomic model.
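The near-identity between PC ridge regression and the full-dimension model can be seen from a short derivation. The abstract gives no equations, so the notation below is assumed: Z is the centred genotype matrix, y the phenotype vector and λ the ridge parameter; fixed effects and the chromosome-wise and core-sample refinements are left out.

```latex
% Assumed notation (not from the abstract): Z is the n x m centred genotype
% matrix, y the phenotype vector, \lambda the ridge (variance-ratio) parameter.
\begin{align*}
Z &= U S V^{\top}, \qquad T = Z V = U S
  && \text{(thin SVD; $T$ holds the PC scores)} \\
\hat{a} &= (Z^{\top} Z + \lambda I)^{-1} Z^{\top} y
  && \text{(SNP-BLUP marker effects)} \\
\hat{s} &= (T^{\top} T + \lambda I)^{-1} T^{\top} y
         = (S^{2} + \lambda I)^{-1} S U^{\top} y
  && \text{(ridge regression on the PCs)} \\
Z \hat{a} &= U S (S^{2} + \lambda I)^{-1} S U^{\top} y = T \hat{s}
  && \text{(identical predicted values)}
\end{align*}
```

The last line uses the identity $(V S^{2} V^{\top} + \lambda I)^{-1} V = V (S^{2} + \lambda I)^{-1}$, which holds because $V^{\top} V = I$. When all PCs with non-zero singular values are kept the two predictions coincide exactly; dropping PCs with very small singular values makes the match approximate, which is consistent with the "virtually identical" results reported.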
Subject(s)
Computational Biology/methods; Genotype; Models, Genetic; Algorithms; Animals; Breeding; Computer Simulation; Genome; Population Density; Principal Component Analysis
ABSTRACT
BACKGROUND: The replacement of fish oil (FO) and fishmeal with plant ingredients in the diet of farmed Atlantic salmon has resulted in reduced levels of the health-promoting long-chain polyunsaturated omega-3 fatty acids (n-3 LC-PUFA) eicosapentaenoic acid (EPA; 20:5n-3) and docosahexaenoic acid (DHA; 22:6n-3) in their filets. Previous studies showed the potential of selective breeding to increase n-3 LC-PUFA levels in salmon tissues, but knowledge on the genetic parameters for individual muscle fatty acids (FA) and their relationships with other traits is still lacking. Thus, we estimated genetic parameters for muscle content of individual FA, and their relationships with lipid deposition traits, muscle pigmentation, sea lice and pancreas disease in slaughter-sized Atlantic salmon. Our aim was to evaluate the selection potential for increased n-3 LC-PUFA content and provide insight into FA metabolism in Atlantic salmon muscle. RESULTS: Among the n-3 PUFA, proportional contents of alpha-linolenic acid (ALA; 18:3n-3) and DHA had the highest heritability (0.26) and EPA the lowest (0.09). Genetic correlations of EPA and DHA proportions with muscle fat differed considerably (0.60 and 0.01, respectively). The genetic correlation of DHA proportion with visceral fat was positive and high (0.61), whereas that of EPA proportion with lice density was negative. FA that are in close proximity along the bioconversion pathway showed positive correlations with each other, whereas the start (ALA) and end-point (DHA) of the pathway were negatively correlated (-0.28), indicating active bioconversion of ALA to DHA in the muscle of fish fed a high-FO diet. CONCLUSIONS: Since contents of individual FA in salmon muscle show additive genetic variation, changing FA composition by selective breeding is possible.
Taken together, our results show that the heritabilities of individual n-3 LC-PUFA and their genetic correlations with other traits vary, which indicates that they play different roles in muscle lipid metabolism, and that proportional muscle contents of EPA and DHA are linked to body fat deposition. Thus, different selection strategies can be applied in order to increase the content of healthy omega-3 FA in the salmon muscle. We recommend selection for the proportion of EPA + DHA in the muscle because they are both essential FA and because such selection has no clear detrimental effects on other traits.
Subject(s)
Fatty Acids, Omega-3/analysis; Muscles/chemistry; Quantitative Trait, Heritable; Salmo salar/genetics; Adipose Tissue; Algorithms; Animal Feed/analysis; Animals; Breeding; Intra-Abdominal Fat; Lipid Metabolism
ABSTRACT
BACKGROUND: Non-linear Bayesian genomic prediction models such as BayesA/B/C/R involve iteration and mostly Markov chain Monte Carlo (MCMC) algorithms, which are computationally expensive, especially when whole-genome sequence (WGS) data are analyzed. Singular value decomposition (SVD) of the genotype matrix can facilitate genomic prediction in large datasets, and can be used to estimate marker effects and their prediction error variances (PEV) in a computationally efficient manner. Here, we developed, implemented, and evaluated a direct, non-iterative method for the estimation of marker effects for the BayesC genomic prediction model. METHODS: The BayesC model assumes a priori that markers have normally distributed effects with probability π and no effect with probability (1 - π). Marker effects and their PEV are estimated by using SVD and the posterior probability of the marker having a non-zero effect is calculated. These posterior probabilities are used to obtain marker-specific effect variances, which are subsequently used to approximate BayesC estimates of marker effects in a linear model. A computer simulation study was conducted to compare alternative genomic prediction methods, where a single reference generation was used to estimate marker effects, which were subsequently used for 10 generations of forward prediction, for which accuracies were evaluated. RESULTS: SVD-based posterior probabilities of markers having non-zero effects were generally lower than MCMC-based posterior probabilities, but for some regions the opposite occurred, resulting in clear signals for QTL-rich regions. The accuracies of breeding values estimated using SVD- and MCMC-based BayesC analyses were similar across the 10 generations of forward prediction.
For an intermediate number of generations (2 to 5) of forward prediction, accuracies obtained with the BayesC model tended to be slightly higher than accuracies obtained using the best linear unbiased prediction of SNP effects (SNP-BLUP model). When reducing marker density from WGS data to 30 K, SNP-BLUP tended to yield the highest accuracies, at least in the short term. CONCLUSIONS: Based on SVD of the genotype matrix, we developed a direct method for the calculation of BayesC estimates of marker effects. Although SVD- and MCMC-based marker effects differed slightly, their prediction accuracies were similar. Assuming that the SVD of the marker genotype matrix is already performed for other reasons (e.g. for SNP-BLUP), computation times for the BayesC predictions were comparable to those of SNP-BLUP.
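The step of turning a marker effect estimate into a posterior probability of a non-zero effect can be sketched with a textbook two-component mixture. This is not the paper's formula: it assumes an unshrunk estimate with a known sampling variance (a stand-in for the PEV mentioned above), and every name and number below is hypothetical.

```python
import math


def normal_pdf(x: float, variance: float) -> float:
    """Density at x of a zero-mean normal distribution with the given variance."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2.0 * math.pi * variance)


def posterior_inclusion_probability(estimate: float, sampling_var: float,
                                    effect_var: float, pi: float) -> float:
    """Posterior probability that a marker has a non-zero effect.

    Assumed model (illustration only): the true effect is N(0, effect_var)
    with prior probability pi and exactly zero otherwise, and the estimate
    equals the true effect plus N(0, sampling_var) noise.
    """
    like_nonzero = normal_pdf(estimate, effect_var + sampling_var)
    like_zero = normal_pdf(estimate, sampling_var)
    numerator = pi * like_nonzero
    return numerator / (numerator + (1.0 - pi) * like_zero)


# Hypothetical values: prior inclusion probability 0.1, effect variance 3,
# sampling variance 1.
for estimate in (0.0, 2.0, 5.0):
    p = posterior_inclusion_probability(estimate, 1.0, 3.0, 0.1)
    # One simple (assumed) way to get a marker-specific variance: scale the
    # common effect variance by the posterior probability.
    print(estimate, round(p, 4), round(p * 3.0, 4))
```

A small estimate pulls the probability below the prior π, while an estimate that is large relative to its sampling variance pushes it towards 1, which is the behaviour that lets QTL-rich regions stand out.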
Subject(s)
Genomics/methods; Models, Genetic; Whole Genome Sequencing/methods; Animals; Bayes Theorem; Breeding; Computer Simulation; Genome; Polymorphism, Single Nucleotide/genetics; Selection, Genetic
ABSTRACT
BACKGROUND: Multi-marker methods, which fit all markers simultaneously, were originally tailored for genomic selection purposes, but have also proven useful in association analyses, especially the Bayesian method known as BayesC. In a recent study, BayesD extended BayesC to account for dominance effects and improved prediction accuracy and persistence in genomic selection. The current study investigated the power and precision of BayesC and BayesD in genome-wide association studies by means of stochastic simulations and applied these methods to a dairy cattle dataset. METHODS: The simulation protocol was designed to mimic the genetic architecture of quantitative traits as realistically as possible. Special emphasis was put on the joint distribution of the additive and dominance effects of causative mutations. Additive marker effects were estimated by BayesC and additive and dominance effects by BayesD. The dependencies between additive and dominance effects were modelled in BayesD by choosing appropriate priors. A sliding-window approach was used. For each window, the R. Fernando window posterior probability of association was calculated and used for inference. The power to map segregating causal effects and the mapping precision were assessed for various marker densities up to full sequence information and various window sizes. RESULTS: Power to map a QTL increased with higher marker densities and larger window sizes. This held true for both methods. BayesD had higher power than BayesC. The increase in power was between -2 and 8% for causative genes that explained more than 2.5% of the genetic variance. In addition, inspection of the estimates of genomic window dominance variance allowed for inference about the magnitude of dominance at significant associations, which remains hidden in BayesC analysis. Mapping precision was not substantially improved by BayesD.
CONCLUSIONS: BayesD improved power, but precision only slightly. Application of BayesD needs large datasets with genotypes and own performance records as phenotypes. Given the current efforts to establish cow reference populations in dairy cattle genomic selection schemes, such datasets are expected to be soon available, which will enable the application of BayesD for association mapping and genomic prediction purposes.
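The sliding-window posterior probability mentioned in the METHODS can be sketched as follows. This uses one simple reading of the statistic, namely the share of MCMC samples in which at least one marker in the window is in the model; the study may define it through window variance instead, and the function name and toy inclusion indicators are assumptions for illustration.

```python
from typing import List, Sequence


def window_posterior_probabilities(samples: Sequence[Sequence[int]],
                                   window_size: int) -> List[float]:
    """For each sliding window, the share of MCMC samples with >= 1 marker included.

    samples[i][j] is 1 if marker j has a non-zero effect in MCMC sample i, else 0.
    """
    n_markers = len(samples[0])
    if window_size < 1 or window_size > n_markers:
        raise ValueError("window_size must be between 1 and the number of markers")
    probabilities = []
    for start in range(n_markers - window_size + 1):
        hits = sum(1 for sample in samples
                   if any(sample[start:start + window_size]))
        probabilities.append(hits / len(samples))
    return probabilities


# Hypothetical inclusion indicators: 4 MCMC samples x 5 markers.
samples = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(window_posterior_probabilities(samples, window_size=2))
```

In this toy chain the second and third markers are included in 50% and 25% of the samples on their own, but the window covering both reaches 75%. Pooling linked markers in this way is why window-level inference is used, and it fits the reported gain in power with larger windows.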