ABSTRACT
Expression and splicing quantitative trait loci (e/sQTL) are large contributors to phenotypic variability. Achieving sufficient statistical power for e/sQTL mapping requires large cohorts with both genotypes and molecular phenotypes, so genomic variation is often called from short-read alignments, which are unable to comprehensively resolve structural variation. Here we build a pangenome from 16 HiFi haplotype-resolved cattle assemblies to identify small and structural variation and genotype them with PanGenie in 307 short-read samples. We find high (>90%) concordance of PanGenie-genotyped and DeepVariant-called small variation and confidently genotype close to 21 million small and 43,000 structural variants in the larger population. We validate 85% of these structural variants (with MAF > 0.1) directly with a subset of 25 short-read samples that also have medium coverage HiFi reads. We then conduct e/sQTL mapping with this comprehensive variant set in a subset of 117 cattle that have testis transcriptome data, and find 92 structural variants as causal candidates for eQTL and 73 for sQTL. We find that roughly half of the top associated structural variants affecting expression or splicing are transposable elements, such as SV-eQTL for STN1 and MYH7 and SV-sQTL for CEP89 and ASAH2. Extensive linkage disequilibrium between small and structural variation results in only 28 additional eQTL and 17 sQTL discovered when including SVs, although many top associated SVs are compelling candidates.
Subject(s)
Quantitative Trait Loci, RNA Splicing, Male, Cattle/genetics, Animals, Genotype, Phenotype, Linkage Disequilibrium, Genomic Structural Variation
ABSTRACT
Despite passing routine laboratory tests for semen quality, bulls used in artificial insemination exhibit significant variation in fertility. Routine analysis of fertility data identified a dairy bull with extreme subfertility (10% pregnancy rate). To characterize the subfertility phenotype, a range of in vitro, in vivo, and molecular assays were carried out. Sperm from the subfertile bull exhibited reduced motility and severely reduced caffeine-induced hyperactivation compared to controls. Ability to penetrate the zona pellucida, cleavage rate, cleavage kinetics, and blastocyst yield after IVF or AI were significantly lower than in control bulls. Whole-genome sequencing from semen and RNA sequencing of testis tissue revealed a critical mutation in adenylate kinase 9 (AK9) that impaired splicing, leading to a premature termination codon and a severely truncated protein. Mice deficient in AK9 were generated to further investigate the function of the gene; knockout males were phenotypically indistinguishable from their wild-type littermates but produced immotile sperm that were incapable of normal fertilization. These sperm exhibited numerous abnormalities, including a low ATP concentration and reduced motility. RNA-seq analysis of their testis revealed differential gene expression of components of the axoneme and sperm flagellum as well as steroid metabolic processes. Sperm ultrastructural analysis showed a high percentage of sperm with abnormal flagella. Combined bovine and murine data indicate the essential metabolic role of AK9 in sperm motility and/or hyperactivation, which in turn affects sperm binding and penetration of the zona pellucida. Thus, AK9 has been found to be directly implicated in impaired male fertility in mammals.
Subject(s)
Adenylate Kinase, Infertility, Semen, Animals, Cattle, Female, Male, Mice, Pregnancy, Adenylate Kinase/genetics, Adenylate Kinase/metabolism, Fertility, Mammals, Semen/metabolism, Semen Analysis, Sperm Motility, Spermatozoa/metabolism
ABSTRACT
The branch point sequence is a degenerate intronic heptamer required for the assembly of the spliceosome during pre-mRNA splicing. Disruption of this motif may promote alternative splicing and eventually cause phenotype variation. Despite its functional relevance, the branch point sequence is not included in most genome annotations. Here, we predict branch point sequences in 30 plant and animal species and attempt to quantify their evolutionary constraints using public variant databases. We find an implausible variant distribution in the databases from 16 of 30 examined species. Comparative analysis of variants from whole-genome sequencing shows that variants submitted from exome sequencing or false positive variants are widespread in public databases and cause these irregularities. We then investigate evolutionary constraint with largely unbiased public variant databases in 14 species and find that the fourth and sixth position of the branch point sequence are more constrained than coding nucleotides. Our findings show that public variant databases should be scrutinized for possible biases before they qualify to analyze evolutionary constraint.
Subject(s)
Biological Evolution, Plants, RNA Splicing, Animals, Genomics, Introns/genetics, Plants/genetics, Spliceosomes, Genetic Databases
ABSTRACT
BACKGROUND: Association testing between molecular phenotypes and genomic variants can help to understand how genotype affects phenotype. RNA sequencing provides access to molecular phenotypes such as gene expression and alternative splicing while DNA sequencing or microarray genotyping are the prevailing options to obtain genomic variants. RESULTS: We genotype variants for 74 male Braunvieh cattle from both DNA (~13-fold coverage) and deep total RNA sequencing from testis, vas deferens, and epididymis tissue (~250 million reads per tissue). We show that RNA sequencing can be used to identify approximately 40% of variants (7-10 million) called from DNA sequencing, with over 80% precision. Within highly expressed coding regions, over 92% of expected variants were called with nearly 98% precision. Allele-specific expression and putative post-transcriptional modifications negatively impact variant genotyping accuracy from RNA sequencing and contribute to RNA-DNA differences. Variants called from RNA sequencing detect roughly 75% of eGenes identified using variants called from DNA sequencing, demonstrating a nearly 2-fold enrichment of eQTL variants. We observe a moderate-to-strong correlation in nominal association p-values (Spearman ρ² ~0.6), although only 9% of eGenes have the same top associated variant. CONCLUSIONS: We find hundreds of thousands of RNA-DNA differences in variants called from RNA and DNA sequencing on the same individuals. We identify several highly significant eQTL when using RNA sequencing variant genotypes which are not found with DNA sequencing variant genotypes, suggesting that using RNA sequencing variant genotypes for association testing results in an increased number of false positives. Our findings demonstrate that caution must be exercised beyond filtering for variant quality or imputation accuracy when analysing or imputing variants called from RNA sequencing.
Subject(s)
Quantitative Trait Loci, Animals, Cattle/genetics, Male, DNA/genetics, Genotype, RNA Sequence Analysis, Testis/metabolism, Genetic Variation, Single Nucleotide Polymorphism, RNA/genetics, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Mastitis is a disease that incurs significant costs in the dairy industry. A promising approach to mitigate its negative effects is to genetically improve the resistance of dairy cattle to mastitis. A meta-analysis of genome-wide association studies (GWAS) across multiple breeds for clinical mastitis (CM) and its indicator trait, somatic cell score (SCS), is a powerful method to identify functional genetic variants that impact mastitis resistance. RESULTS: We conducted meta-analyses of eight and fourteen GWAS on CM and SCS, respectively, using 30,689 and 119,438 animals from six dairy cattle breeds. Methods for the meta-analyses were selected to properly account for the multi-breed structure of the GWAS data. Our study revealed 58 lead markers that were associated with mastitis incidence, including 16 loci that did not overlap with previously identified quantitative trait loci (QTL), as curated at the Animal QTLdb. Post-GWAS analysis techniques such as gene-based analysis and genomic feature enrichment analysis enabled prioritization of 31 candidate genes and 14 credible candidate causal variants that affect mastitis. CONCLUSIONS: Our list of candidate genes can help to elucidate the genetic architecture underlying mastitis resistance and provide better tools for the prevention or treatment of mastitis, ultimately contributing to more sustainable animal production.
Subject(s)
Disease Resistance, Genome-Wide Association Study, Bovine Mastitis, Quantitative Trait Loci, Animals, Cattle/genetics, Bovine Mastitis/genetics, Female, Genome-Wide Association Study/methods, Genome-Wide Association Study/veterinary, Disease Resistance/genetics, Single Nucleotide Polymorphism, Breeding/methods
ABSTRACT
Many genomic analyses start by aligning sequencing reads to a linear reference genome. However, linear reference genomes are imperfect, lacking millions of bases of unknown relevance and are unable to reflect the genetic diversity of populations. This makes reference-guided methods susceptible to reference-allele bias. To overcome such limitations, we build a pangenome from six reference-quality assemblies from taurine and indicine cattle as well as yak. The pangenome contains an additional 70,329,827 bases compared to the Bos taurus reference genome. Our multiassembly approach reveals 30 and 10.1 million bases private to yak and indicine cattle, respectively, and between 3.3 and 4.4 million bases unique to each taurine assembly. Utilizing transcriptomes from 56 cattle, we show that these nonreference sequences encode transcripts that hitherto remained undetected from the B. taurus reference genome. We uncover genes, primarily encoding proteins contributing to immune response and pathogen-mediated immunomodulation, differentially expressed between Mycobacterium bovis-infected and noninfected cattle that are also undetectable in the B. taurus reference genome. Using whole-genome sequencing data of cattle from five breeds, we show that reads which were previously misaligned against the Bos taurus reference genome now align accurately to the pangenome sequences. This enables us to discover 83,250 polymorphic sites that segregate within and between breeds of cattle and capture genetic differentiation across breeds. Our work makes a so-far unused source of variation amenable to genetic investigations and provides methods and a framework for establishing and exploiting a more diverse reference genome.
Subject(s)
Cattle/genetics, Animals, Female, Male, Whole Genome Sequencing
ABSTRACT
Understanding the genetic mechanism of how animals adapt to extreme conditions is fundamental to determine the relationship between molecular evolution and changing environments. The goat is one of the first domesticated species and has evolved rapidly to adapt to diverse environments, including harsh high-altitude conditions with low temperature and poor oxygen supply but strong ultraviolet radiation. Here, we analyzed 331 genomes of domestic goats and wild caprid species living at varying altitudes (high > 3000 m above sea level and low < 1200 m), along with a reference-guided chromosome-scale assembly (contig-N50: 90.4 Mb) of a female Tibetan goat genome based on PacBio HiFi long reads, to dissect the genetic determinants underlying their adaptation to harsh conditions on the Qinghai-Tibetan Plateau (QTP). Population genomic analyses combined with genome-wide association studies (GWAS) revealed a genomic region harboring the 3'-phosphoadenosine 5'-phosphosulfate synthase 2 (PAPSS2) gene showing strong association with high-altitude adaptability (P_GWAS = 3.62 × 10⁻²⁵) in Tibetan goats. Transcriptomic data from 13 tissues revealed that PAPSS2 was implicated in hypoxia-related pathways in Tibetan goats. We further verified a potential functional role of PAPSS2 in response to hypoxia in PAPSS2-deficient cells. Introgression analyses suggested that the PAPSS2 haplotype conferring the high-altitude adaptability in Tibetan goats originated from a recent hybridization between goats and a wild caprid species, the markhor (Capra falconeri). In conclusion, our results uncover a hitherto unknown contribution of PAPSS2 to high-altitude adaptability in Tibetan goats on the QTP, following interspecific introgression and natural selection.
Subject(s)
Genome-Wide Association Study, Goats, Animals, Goats/genetics, Ultraviolet Rays, Genomics
ABSTRACT
BACKGROUND: Low-pass sequencing followed by sequence variant genotype imputation is an alternative to the routine microarray-based genotyping in cattle. However, the impact of haplotype reference panels and their interplay with the coverage of low-pass whole-genome sequencing data have not been sufficiently explored in typical livestock settings where only a small number of reference samples is available. METHODS: Sequence variant genotyping accuracy was compared between two variant callers, GATK and DeepVariant, in 50 Brown Swiss cattle with sequencing coverages ranging from 4- to 63-fold. Haplotype reference panels of varying sizes and composition were built with DeepVariant based on 501 individuals from nine breeds. High-coverage sequence data for 24 Brown Swiss cattle were downsampled to between 0.01- and 4-fold to mimic low-pass sequencing. GLIMPSE was used to infer sequence variant genotypes from the low-pass sequencing data using different haplotype reference panels. The accuracy of the sequence variant genotypes that were inferred from low-pass sequencing data was compared with sequence variant genotypes called from high-coverage data. RESULTS: DeepVariant was used to establish bovine haplotype reference panels because it outperformed GATK in all evaluations. Within-breed haplotype reference panels were more accurate and efficient to impute sequence variant genotypes from low-pass sequencing than equally-sized multibreed haplotype reference panels for all target sample coverages and allele frequencies. F1 scores greater than 0.9, which indicate high harmonic means of recall and precision of called genotypes, were achieved with 0.25-fold sequencing coverage when large breed-specific haplotype reference panels (n = 150) were used. 
In the absence of such large within-breed haplotype panels, variant genotyping accuracy from low-pass sequencing could be increased either by adding non-related samples to the haplotype reference panel or by increasing the coverage of the low-pass sequencing data. Sequence variant genotyping from low-pass sequencing was substantially less accurate when the reference panel lacked individuals from the target breed. CONCLUSIONS: Variant genotyping is more accurate with DeepVariant than GATK. DeepVariant is therefore suitable to establish bovine haplotype reference panels. Medium-sized breed-specific haplotype reference panels and large multibreed haplotype reference panels enable accurate imputation of low-pass sequencing data in a typical cattle breed.
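The F1 score referred to in this abstract is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation, assuming counts of true-positive, false-positive and false-negative genotype calls are available; the function name and the example counts are hypothetical and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the harmonic mean of precision and recall.

    tp, fp, fn are counts of true-positive, false-positive and
    false-negative calls (hypothetical inputs for illustration).
    """
    if tp == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only; not data from the abstract.
example = f1_score(tp=90, fp=10, fn=10)
print(round(example, 3))
```

Because the harmonic mean is dominated by the smaller of the two terms, an F1 above 0.9 requires both precision and recall to be high.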
Subject(s)
Haplotypes, Animals, Cattle, Genotype, Genetic Variation
ABSTRACT
BACKGROUND: Combining the results of within-population genome-wide association studies (GWAS) based on whole-genome sequences into a single meta-analysis (MA) is an accurate and powerful method for identifying variants associated with complex traits. As part of the H2020 BovReg project, we performed sequence-level MA for beef production traits. Five partners from France, Switzerland, Germany, and Canada contributed summary statistics from sequence-based GWAS conducted with 54,782 animals from 15 purebred or crossbred populations. We combined the summary statistics for four growth, nine morphology, and 15 carcass traits into 16 MA, using both fixed effects and z-score methods. RESULTS: The fixed-effects method was generally more informative to provide indication on potentially causal variants, although we combined substantially different traits in each MA. In comparison with within-population GWAS, this approach highlighted (i) a larger number of quantitative trait loci (QTL), (ii) QTL more frequently located in genomic regions known for their effects on growth and meat/carcass traits, (iii) a smaller number of genomic variants within the QTL, and (iv) candidate variants that were more frequently located in genes. MA pinpointed variants in genes, including MSTN, LCORL, and PLAG1 that have been previously associated with morphology and carcass traits. We also identified dozens of other variants located in genes associated with growth and carcass traits, or with a function that may be related to meat production (e.g., HS6ST1, HERC2, WDR75, COL3A1, SLIT2, MED28, and ANKAR). Some of these variants overlapped with expression or splicing QTL reported in the cattle Genotype-Tissue Expression atlas (CattleGTEx) and could therefore regulate gene expression. 
CONCLUSIONS: By identifying candidate genes and potential causal variants associated with beef production traits in cattle, MA demonstrates great potential for investigating the biological mechanisms underlying these traits. As a complement to within-population GWAS, this approach can provide deeper insights into the genetic architecture of complex traits in beef cattle.
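The abstract names two ways of combining per-population GWAS summary statistics, a fixed-effects method and a z-score method, without giving formulas. The sketch below shows one common form of each, assuming inverse-variance weights for the fixed-effects estimate and square-root-of-sample-size weights for the combined z-score. These weighting choices and the function names are assumptions for illustration and may differ from what the authors used.

```python
import math
from typing import Sequence, Tuple


def fixed_effects(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Inverse-variance weighted fixed-effects meta-analysis.

    Returns the pooled effect, its standard error and the resulting z-score.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


def weighted_z(zs: Sequence[float], ns: Sequence[int]) -> float:
    """Sample-size weighted combination of per-study z-scores."""
    numerator = sum(math.sqrt(n) * z for z, n in zip(zs, ns))
    return numerator / math.sqrt(sum(ns))


# Hypothetical summary statistics for one variant in two populations.
print(fixed_effects([0.20, 0.30], [0.05, 0.10]))
print(weighted_z([2.0, 2.0], [100, 100]))
```

The fixed-effects form needs effect sizes on a comparable scale across studies, whereas the z-score form only needs the direction and strength of association, which is one reason both are often reported together.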
Subject(s)
Genome-Wide Association Study, Quantitative Trait Loci, Cattle/genetics, Animals, Phenotype, Meat/analysis, Genomics, Single Nucleotide Polymorphism
ABSTRACT
Cattle are ideally suited to investigate the genetics of male reproduction, because semen quality and fertility are recorded for all ejaculates of artificial insemination bulls. We analysed 26,090 ejaculates of 794 Brown Swiss bulls to assess ejaculate volume, sperm concentration, sperm motility, sperm head and tail anomalies and insemination success. The heritability of the six semen traits was between 0 and 0.26. Genome-wide association testing on 607,511 SNPs revealed a QTL on bovine chromosome 6 that was associated with sperm motility (P = 2.5 × 10⁻²⁷), head (P = 2.0 × 10⁻⁴⁴) and tail anomalies (P = 7.2 × 10⁻⁴⁹) and insemination success (P = 9.9 × 10⁻¹³). The QTL harbors a recessive allele that compromises semen quality and male fertility. We replicated the effect of the QTL on fertility (P = 7.1 × 10⁻³²) in an independent cohort of 2481 Brown Swiss bulls. The analysis of whole-genome sequencing data revealed that a synonymous variant (BTA6:58373887C>T, rs474302732) in WDR19 encoding WD repeat-containing protein 19 was in linkage disequilibrium with the fertility-associated haplotype. WD repeat-containing protein 19 is a constituent of the intraflagellar transport complex that is essential for the physiological function of motile cilia and flagella. Bioinformatic and transcription analyses revealed that the BTA6:58373887 T-allele activates a cryptic exonic splice site that eliminates three evolutionarily conserved amino acids from WDR19. Western blot analysis demonstrated that the BTA6:58373887 T-allele decreases protein expression. We make the remarkable observation that, in spite of negative effects on semen quality and bull fertility, the BTA6:58373887 T-allele has a frequency of 24% in the Brown Swiss population. Our findings are the first to uncover a variant that is associated with quantitative variation in semen quality and male fertility in cattle.
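To put the reported 24% allele frequency of a recessive allele into perspective, the expected genotype frequencies can be worked out under Hardy-Weinberg proportions. The sketch below assumes random mating and no selection, which the abstract does not state, so the numbers are only a rough illustration; the function name is hypothetical.

```python
from typing import Tuple


def hwe_genotype_frequencies(q: float) -> Tuple[float, float, float]:
    """Expected genotype frequencies under Hardy-Weinberg proportions.

    q is the frequency of the allele of interest. Returns the frequencies
    of non-carriers, heterozygous carriers and homozygotes for that allele.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


# Allele frequency of 24% as reported in the abstract.
non_carrier, heterozygous, homozygous = hwe_genotype_frequencies(0.24)
print(f"homozygous: {homozygous:.4f}, heterozygous: {heterozygous:.4f}")
```

Under these assumptions, close to 6% of bulls would be homozygous for the T-allele and more than a third would be heterozygous carriers.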
Subject(s)
Alternative Splicing, Cytoskeletal Proteins/genetics, Male Infertility/genetics, Single Nucleotide Polymorphism, Semen/physiology, Animals, Cattle, Mammalian Chromosomes/genetics, Genome-Wide Association Study, Artificial Insemination/veterinary, Male, Heritable Quantitative Trait, Semen Analysis/veterinary, Sperm Motility, Whole Genome Sequencing
ABSTRACT
BACKGROUND: Semen quality and insemination success are monitored in artificial insemination bulls to ensure high male fertility rates. Only ejaculates that fulfill minimum quality requirements are processed and eventually used for artificial inseminations. We examined 70,990 ejaculates from 1343 Brown Swiss bulls to identify bulls from which all ejaculates were rejected due to low semen quality. This procedure identified a bull that produced 12 ejaculates with an aberrantly small number of sperm (0.2 ± 0.2 × 10⁹ sperm per mL) which were mostly immotile due to multiple morphological abnormalities. RESULTS: The genome of this bull was sequenced at 12× coverage to investigate a possible genetic cause. Comparing the sequence variant genotypes of this bull with those from 397 fertile bulls revealed a 1-bp deletion in the coding sequence of the QRICH2 gene, which encodes the glutamine rich 2 protein, as a compelling candidate causal variant. This 1-bp deletion causes a frameshift in translation and a premature termination codon (ENSBTAP00000018337.1:p.Cys1644AlafsTer52). The analysis of testis transcriptomes from 76 bulls showed that the transcript with the premature termination codon is subject to nonsense-mediated mRNA decay. The 1-bp deletion resides in a 675-kb haplotype that includes 181 single nucleotide polymorphisms (SNPs) from the Illumina BovineHD Bead chip. This haplotype segregates at a frequency of 5% in the Brown Swiss cattle population. Our analysis also identified another bull that carried the 1-bp deletion in the homozygous state. Semen analyses from the second bull confirmed low sperm concentration and immotile sperm with multiple morphological abnormalities that primarily affect the sperm flagellum and, to a lesser extent, the sperm head. CONCLUSIONS: A recessive loss-of-function allele of the bovine QRICH2 gene likely causes low sperm concentration and immotile sperm with multiple morphological abnormalities. 
Routine sperm analyses unambiguously identify homozygous bulls for this allele. A direct gene test can be implemented to monitor the frequency of the undesired allele in cattle populations.
Subject(s)
Oligospermia, Semen Analysis, Animals, Cattle/genetics, Fertility/genetics, Artificial Insemination/veterinary, Male, Semen Analysis/veterinary, Spermatozoa
ABSTRACT
Classical Swine Fever (CSF) is a contagious viral disease of pigs which is endemic in several parts of the world, including India. Prophylactic vaccination using live attenuated vaccine is the preferred method of control. However, there is significant inter-individual variation in the antibody response to vaccination. In this study, we measured the E2 antibody blocking percentage after 21 days of CSF vaccination in a mixed pig population consisting of Landrace, indigenous Ghurrah pigs, and their crossbreds. A Genome Wide Association Study (GWAS) carried out using single-SNP and haplotype based methods detected a 1.6 Mb region on SSC2 (28.92-30.52 Mb) as significantly associated with antibody response to CSF vaccination. The significant region and 1 Mb flanking sequences encompass 3 genes - EIF3M, DNAJC24 and ARL14EP, which code for proteins involved in Pestivirus replication and host immune response system. Our results combined with previous studies on immune response of pigs present this region as a suitable candidate for future functional investigations.
Subject(s)
Classical Swine Fever Virus, Classical Swine Fever, Swine Diseases, Viral Vaccines, Swine, Animals, Classical Swine Fever/prevention & control, Classical Swine Fever Virus/genetics, Antibody Formation, Genome-Wide Association Study, Vaccination, Attenuated Vaccines
ABSTRACT
BACKGROUND: Reference-guided read alignment and variant genotyping are prone to reference allele bias, particularly for samples that are greatly divergent from the reference genome. A Hereford-based assembly is the widely accepted bovine reference genome. Haplotype-resolved genomes that exceed the current bovine reference genome in quality and continuity have been assembled for different breeds of cattle. Using whole genome sequencing data of 161 Brown Swiss cattle, we compared the accuracy of read mapping and sequence variant genotyping as well as downstream genomic analyses between the bovine reference genome (ARS-UCD1.2) and a highly continuous Angus-based assembly (UOA_Angus_1). RESULTS: Read mapping accuracy did not differ notably between the ARS-UCD1.2 and UOA_Angus_1 assemblies. We discovered 22,744,517 and 22,559,675 high-quality variants from ARS-UCD1.2 and UOA_Angus_1, respectively. The concordance between sequence- and array-called genotypes was high and the number of variants deviating from Hardy-Weinberg proportions was low at segregating sites for both assemblies. More artefactual INDELs were genotyped from UOA_Angus_1 than ARS-UCD1.2 alignments. Using the composite likelihood ratio test, we detected 40 and 33 signatures of selection from ARS-UCD1.2 and UOA_Angus_1, respectively, but the overlap between both assemblies was low. Using the 161 sequenced Brown Swiss cattle as a reference panel, we imputed sequence variant genotypes into a mapping cohort of 30,499 cattle that had microarray-derived genotypes using a two-step imputation approach. The accuracy of imputation (Beagle R²) was very high (0.87) for both assemblies. Genome-wide association studies between imputed sequence variant genotypes and six dairy traits as well as stature produced almost identical results from both assemblies. CONCLUSIONS: The ARS-UCD1.2 and UOA_Angus_1 assemblies are suitable for reference-guided genome analyses in Brown Swiss cattle. 
Although differences in read mapping and genotyping accuracy between both assemblies are negligible, the choice of the reference genome has a large impact on detecting signatures of selection that already reached fixation using the composite likelihood ratio test. We developed a workflow that can be adapted and reused to compare the impact of reference genomes on genome analyses in various breeds, populations and species.
Subject(s)
Genome-Wide Association Study, Genome, Animals, Cattle/genetics, Dogs, Genomics, Genotype, Phenotype, Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Cattle are ideally suited to investigate the genetics of male fertility. Semen from individual bulls is used for thousands of artificial inseminations for which the fertilization success is monitored. Results from the breeding soundness examination and repeated observations of semen quality complement the fertility evaluation for each bull. RESULTS: In a cohort of 3881 Brown Swiss bulls that had genotypes at 683,609 SNPs, we reveal four novel recessive QTL for male fertility on BTA1, 18, 25, and 26 using haplotype-based association testing. A QTL for bull fertility on BTA1 is also associated with sperm head shape anomalies. All other QTL are not associated with any of the semen quality traits investigated. We perform complementary fine-mapping approaches using publicly available transcriptomes as well as whole-genome sequencing data of 125 Brown Swiss bulls to reveal candidate causal variants. We show that missense or nonsense variants in SPATA16, VWA3A, ENSBTAG00000006717 and ENSBTAG00000019919 are in linkage disequilibrium with the QTL. Using whole-genome sequence data, we detect strong association (P = 4.83 × 10⁻¹²) of a missense variant (p.Ile193Met) in SPATA16 with male fertility. However, non-coding variants exhibit stronger association at all QTL, suggesting that variants in regulatory regions contribute to variation in bull fertility. CONCLUSION: Our findings in a dairy cattle population provide evidence that recessive variants may contribute substantially to quantitative variation in male fertility in mammals. Detecting causal variants that underpin variation in male fertility remains difficult because the most strongly associated variants reside in poorly annotated non-coding regions.
Subject(s)
Genome-Wide Association Study, Quantitative Trait Loci, Animals, Cattle/genetics, Fertility/genetics, Humans, Artificial Insemination, Male, Single Nucleotide Polymorphism, Semen Analysis
ABSTRACT
BACKGROUND: The key-ancestor approach has been frequently applied to prioritize individuals for whole-genome sequencing based on their marginal genetic contribution to current populations. Using this approach, we selected 70 key ancestors from two lines of the Swiss Large White breed that have been selected divergently for fertility and fattening traits and sequenced their genomes with short paired-end reads. RESULTS: Using pedigree records, we estimated the effective population size of the dam and sire line to 72 and 44, respectively. In order to assess sequence variation in both lines, we sequenced the genomes of 70 boars at an average coverage of 16.69-fold. The boars explained 87.95 and 95.35% of the genetic diversity of the breeding populations of the dam and sire line, respectively. Reference-guided variant discovery using the GATK revealed 26,862,369 polymorphic sites. Principal component, admixture and fixation index (FST) analyses indicated considerable genetic differentiation between the lines. Genomic inbreeding quantified using runs of homozygosity was higher in the sire than dam line (0.28 vs 0.26). Using two complementary approaches, we detected 51 signatures of selection. However, only six signatures of selection overlapped between both lines. We used the sequenced haplotypes of the 70 key ancestors as a reference panel to call 22,618,811 genotypes in 175 pigs that had been sequenced at very low coverage (1.11-fold) using the GLIMPSE software. The genotype concordance, non-reference sensitivity and non-reference discrepancy between thus inferred and Illumina PorcineSNP60 BeadChip-called genotypes was 97.60, 98.73 and 3.24%, respectively. The low-pass sequencing-derived genomic relationship coefficients were highly correlated (r > 0.99) with those obtained from microarray genotyping. CONCLUSIONS: We assessed genetic diversity within and between two lines of the Swiss Large White pig breed. 
Our analyses revealed considerable differentiation, even though the split into two populations occurred only a few generations ago. The sequenced haplotypes of the key ancestor animals enabled us to implement genotyping by low-pass sequencing, which offers an intriguing cost-effective approach to increase the variant density over current array-based genotyping by more than 350-fold.
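The abstract reports genotype concordance, non-reference sensitivity and non-reference discrepancy without defining them. The sketch below follows definitions that are commonly used for these metrics when comparing called genotypes against a truth set; whether the authors used exactly these definitions is an assumption. Genotypes are coded as alternate-allele counts (0, 1, 2), and the function name and example data are hypothetical.

```python
from typing import Dict, Sequence


def concordance_metrics(truth: Sequence[int], called: Sequence[int]) -> Dict[str, float]:
    """Compare called genotypes with truth genotypes at the same sites.

    Genotypes are alternate-allele counts: 0 = hom-ref, 1 = het, 2 = hom-alt.
    """
    if len(truth) != len(called) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(truth, called))
    n = len(pairs)
    matches = sum(t == c for t, c in pairs)
    both_ref = sum(t == 0 and c == 0 for t, c in pairs)
    truth_nonref = sum(t > 0 for t, _ in pairs)
    both_nonref = sum(t > 0 and c > 0 for t, c in pairs)
    return {
        # Share of sites with identical genotypes.
        "concordance": matches / n,
        # Share of truth non-reference sites also called non-reference.
        "non_ref_sensitivity": both_nonref / truth_nonref if truth_nonref else float("nan"),
        # Share of mismatches after excluding sites that are hom-ref in both sets.
        "non_ref_discrepancy": (n - matches) / (n - both_ref) if n > both_ref else float("nan"),
    }


# Hypothetical genotypes at six sites, for illustration only.
print(concordance_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 0, 2, 1]))
```

Excluding sites that are homozygous reference in both sets keeps the discrepancy from being diluted by the many sites at which agreement is trivial.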
Subject(s)
Genome, Single Nucleotide Polymorphism, Animals, Genotype, Haplotypes, Male, Swine/genetics, Switzerland
ABSTRACT
BACKGROUND: Atypical external genitalia are often a sign of reproductive organ pathologies and infertility with either environmental or genetic causes, including karyotypic abnormalities. Genome-wide association studies (GWAS) provide a means for identifying chromosomal regions harboring deleterious DNA variants causing such phenotypes. We performed a GWAS to unravel the causes of incidental cases of atypically small vulvae in German Landrace gilts. RESULTS: A case-control GWAS involving Illumina porcine SNP60 BeadChip-called genotypes of 17 gilts with atypically small vulvae and 1818 control animals (fertile German Landrace sows) identified a significantly associated region on the X-chromosome (P = 8.81 × 10⁻⁴³). Inspection of whole-genome sequencing data in the critical area allowed us to pinpoint a likely causal variant in the form of a nonsense mutation of bone morphogenetic protein-15 (BMP15; Sscrofa11.1_X:g.44618787C>T, BMP15:p.R212X). The mutant allele occurs at a frequency of 6.2% in the German Landrace breeding population. Homozygous gilts exhibit underdeveloped, most likely non-functional ovaries and are not fertile. Male carriers do not seem to manifest defects. Heterozygous sows produce 0.41 ± 0.02 (P = 4.5 × 10⁻⁸³) piglets more than wildtype animals. However, the mutant allele's positive effect on litter size accompanies a negative impact on lean meat growth. CONCLUSION: Our results provide an example of the power of GWAS in identifying the genetic causes of a fuzzy phenotype and add to the list of natural deleterious BMP15 mutations that affect fertility in a dosage-dependent manner, for the first time in a poly-ovulatory species. We advise eradicating the mutant allele from the German Landrace breeding population since the adverse effects on lean meat growth outweigh the larger litter size in heterozygous sows.
Subject(s)
Bone Morphogenetic Protein 15, Infertility, Animals, Bone Morphogenetic Protein 15/genetics, Nonsense Codon, Female, Genome-Wide Association Study, Litter Size/genetics, Male, Pregnancy, Swine
ABSTRACT
BACKGROUND: Autochthonous cattle breeds are an important source of genetic variation because they might carry alleles that enable them to adapt to local environment and food conditions. Original Braunvieh (OB) is a local cattle breed of Switzerland used for beef and milk production in alpine areas. Using whole-genome sequencing (WGS) data of 49 key ancestors, we characterize genomic diversity, genomic inbreeding, and signatures of selection in Swiss OB cattle at nucleotide resolution. RESULTS: We annotated 15,722,811 SNPs and 1,580,878 Indels including 10,738 and 2763 missense deleterious and high impact variants, respectively, that were discovered in 49 OB key ancestors. Six Mendelian trait-associated variants that were previously detected in breeds other than OB segregated in the sequenced key ancestors, including variants causal for recessive xanthinuria and albinism. The average nucleotide diversity (1.6 × 10⁻³) was higher in OB than in many mainstream European cattle breeds. Accordingly, the average genomic inbreeding derived from runs of homozygosity (ROH) was relatively low (F_ROH = 0.14) in the 49 OB key ancestor animals. However, genomic inbreeding was higher in OB cattle of more recent generations (F_ROH = 0.16) due to a higher number of long (> 1 Mb) runs of homozygosity. Using two complementary approaches, composite likelihood ratio test and integrated haplotype score, we identified 95 and 162 genomic regions encompassing 136 and 157 protein-coding genes, respectively, that showed evidence (P < 0.005) of past and ongoing selection. These selection signals were enriched for quantitative trait loci related to beef traits including meat quality, feed efficiency and body weight and pathways related to blood coagulation, nervous and sensory stimulus. CONCLUSIONS: We provide a comprehensive overview of sequence variation in Swiss OB cattle genomes. 
With WGS data, we observe higher genomic diversity and less inbreeding in OB than in many European mainstream cattle breeds. Footprints of selection were detected in genomic regions that are possibly relevant for meat quality and adaptation to local environmental conditions. Considering that the population size is low and genomic inbreeding increased in the past generations, the implementation of optimal mating strategies seems warranted to maintain genetic diversity in the Swiss OB cattle population.
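To make the ROH-based inbreeding coefficient concrete, here is a minimal sketch. It assumes the common definition of F_ROH as the summed length of all runs of homozygosity divided by the length of the genome covered; the abstract does not state the exact formula used. The segment lengths and genome length below are hypothetical and are not taken from the study.

```python
def f_roh(roh_lengths_bp, genome_length_bp):
    """Return the fraction of the genome covered by runs of homozygosity.

    Assumed definition: sum of ROH lengths / genome length covered.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    return sum(roh_lengths_bp) / genome_length_bp


def count_long_roh(roh_lengths_bp, min_length_bp=1_000_000):
    """Count ROH longer than a threshold (1 Mb, as in the abstract)."""
    return sum(1 for length in roh_lengths_bp if length > min_length_bp)


# Hypothetical animal: three ROH segments on a 2.5 Gb genome.
segments = [50_000_000, 150_000_000, 150_000_000]
example_f_roh = f_roh(segments, 2_500_000_000)
example_long = count_long_roh(segments)
```

With these made-up inputs, 350 Mb of the 2.5 Gb genome lies in ROH, so the sketch yields F_ROH = 0.14.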
Subject(s)
Genomics/methods, Whole Genome Sequencing/methods, Alleles, Animals, Cattle, Population Genetics, Genotype, Phenotype, Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND: Imputation accuracy depends, among other things, on the size of the reference panel, the marker's minor allele frequency (MAF), and the correct placement of single nucleotide polymorphisms (SNPs) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy. RESULTS: The overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R²dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R²dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants. CONCLUSIONS: Accurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. 
The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.
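The empirical accuracy metric described above, the squared correlation between true and imputed allele dosages, can be sketched as follows. This is an illustrative per-marker computation, not the authors' pipeline; the function name and the example dosages are hypothetical.

```python
def r2_dose(true_dosages, imputed_dosages):
    """Squared Pearson correlation between true and imputed allele dosages.

    Dosages are expected in [0, 2]. Returns NaN when either vector has no
    variance (e.g. a monomorphic marker), where the correlation is undefined.
    """
    n = len(true_dosages)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 values")
    mean_t = sum(true_dosages) / n
    mean_i = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_dosages, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_dosages)
    var_i = sum((i - mean_i) ** 2 for i in imputed_dosages)
    if var_t == 0 or var_i == 0:
        return float("nan")
    return cov * cov / (var_t * var_i)


# Hypothetical marker: four animals, true genotypes vs. imputed dosages.
example_r2 = r2_dose([0, 0, 2, 2], [0, 1, 1, 2])
```

For the made-up marker above, two of four animals are imputed imprecisely and the squared correlation is 0.5. Averaging such per-marker values over all imputed markers would give an overall figure comparable in spirit to the reported values.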
Subject(s)
Cattle/genetics, Single Nucleotide Polymorphism, Animals, Brazil, Gene Frequency, Genotype, Reproducibility of Results
ABSTRACT
BACKGROUND: Sequence-based genome-wide association studies (GWAS) provide high statistical power to identify candidate causal mutations when a large number of individuals with both sequence variant genotypes and phenotypes is available. A meta-analysis combines summary statistics from multiple GWAS and increases the power to detect trait-associated variants without requiring access to data at the individual level of the GWAS mapping cohorts. Because linkage disequilibrium between adjacent markers is conserved only over short distances across breeds, a multi-breed meta-analysis can improve mapping precision. RESULTS: To maximise the power to identify quantitative trait loci (QTL), we combined the results of nine within-population GWAS that used imputed sequence variant genotypes of 94,321 cattle from eight breeds, to perform a large-scale meta-analysis for fat and protein percentage in cattle. The meta-analysis detected (p ≤ 10⁻⁸) 138 QTL for fat percentage and 176 QTL for protein percentage. This was more than the number of QTL detected in all within-population GWAS together (124 QTL for fat percentage and 104 QTL for protein percentage). Among all the lead variants, 100 QTL for fat percentage and 114 QTL for protein percentage had the same direction of effect in all within-population GWAS. This indicates either persistence of the linkage phase between the causal variant and the lead variant across breeds or that some of the lead variants might indeed be causal or tightly linked with causal variants. The percentage of intergenic variants was substantially lower for significant variants than for non-significant variants, and significant variants had mostly moderate to high minor allele frequencies. Significant variants were also clustered in genes that are known to be relevant for fat and protein percentages in milk. CONCLUSIONS: Our study identified a large number of QTL associated with fat and protein percentage in dairy cattle. 
We demonstrated that large-scale multi-breed meta-analysis reveals more QTL at the nucleotide resolution than within-population GWAS. Significant variants were more often located in genic regions than non-significant variants and a large part of them was located in potentially regulatory regions.
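As an illustration of how summary statistics from several GWAS can be combined without individual-level data, the sketch below uses a fixed-effect inverse-variance weighted meta-analysis for a single variant. This is one standard scheme and is an assumption here; the abstract does not say which weighting the authors applied. The effect sizes and standard errors are hypothetical.

```python
from math import sqrt


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance weighted meta-analysis for one variant.

    Each study contributes its effect estimate weighted by 1 / SE^2.
    Returns (pooled_beta, pooled_se, z_score).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical variant tested in two within-population GWAS.
meta_beta, meta_se, meta_z = inverse_variance_meta([0.10, 0.20], [0.05, 0.05])
```

In this made-up case the single-study z-scores are 2 and 4, while the pooled z-score is about 4.24, which shows how combining studies can raise the evidence for a variant above what either study provides alone.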
Subject(s)
Cattle/genetics, Genotype, Linkage Disequilibrium, Lipids/genetics, Milk Proteins/genetics, Milk/standards, Animals, Gene Frequency, Milk/metabolism, Genetic Polymorphism, Quantitative Trait Loci
ABSTRACT
BACKGROUND: Little is known about the genetic architecture of economically important traits in Brown Swiss cattle because only a few genome-wide association studies (GWAS) have been carried out in this breed. Moreover, most GWAS have been performed for single traits, thus not providing detailed insights into potentially existing pleiotropic effects of trait-associated loci. RESULTS: To compile a comprehensive catalogue of large-effect quantitative trait loci (QTL) segregating in Brown Swiss cattle, we carried out association tests between partially imputed genotypes at 598,016 SNPs and daughter-derived phenotypes for more than 50 economically important traits, including milk production, growth and carcass quality, body conformation, reproduction and calving traits in 4578 artificial insemination bulls from two cohorts of Brown Swiss cattle (Austrian-German and Swiss populations). Across-cohort multi-trait meta-analyses of the results from the single-trait GWAS revealed 25 QTL (P < 8.36 × 10⁻⁸) for economically relevant traits on 17 Bos taurus autosomes (BTA). Evidence of pleiotropy was detected at five QTL located on BTA5, 6, 17, 21 and 25. Of these, two QTL at BTA6:90,486,780 and BTA25:1,455,150 affect a diverse range of economically important traits, including traits related to body conformation, calving, longevity and milking speed. Furthermore, the QTL at BTA6:90,486,780 seems to be a target of ongoing selection as evidenced by an integrated haplotype score of 2.49 and significant changes in allele frequency over the past 25 years, whereas either no or only weak evidence of selection was detected at all other QTL. CONCLUSIONS: Our findings provide a comprehensive overview of QTL segregating in Brown Swiss cattle. Detected QTL explain between 2 and 10% of the variation in the estimated breeding values and thus may be considered as the most important QTL segregating in the Brown Swiss cattle breed. 
Multi-trait association testing boosts the power to detect pleiotropic QTL and assesses the full spectrum of phenotypes that are affected by trait-associated variants.
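The significance threshold quoted in this abstract (P < 8.36 × 10⁻⁸) is numerically consistent with a Bonferroni correction of a 5% family-wise error rate over the 598,016 tested SNPs. That reading is an inference from the two numbers; the abstract does not name the correction. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 0.05 family-wise error rate (assumed) over the 598,016 SNPs from the abstract.
threshold = bonferroni_threshold(0.05, 598_016)
threshold_text = f"{threshold:.2e}"
```

Formatted to three significant digits, the result matches the reported threshold of 8.36 × 10⁻⁸.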