ABSTRACT
BACKGROUND: Accurate imputation plays a major role in genomic studies of livestock industries, where the number of genotyped or sequenced animals is limited by costs. This study explored methods to create an ideal reference population for imputation to Next Generation Sequencing data in cattle. METHODS: Methods for clustering of animals for imputation were explored, using 1000 Bull Genomes Project sequence data on 1146 animals from a variety of beef and dairy breeds. Imputation from 50 K to 777 K was first carried out to choose an ideal clustering method, using ADMIXTURE or PLINK clustering algorithms with either genotypes or reconstructed haplotypes. RESULTS: Due to efficiency, accuracy and ease of use, clustering with PLINK using haplotypes as quasi-genotypes was chosen as the most advantageous grouping method. It was found that using a clustered population slightly decreased computing time, while maintaining accuracy across the population. Although overall accuracy remained the same, a slight increase in accuracy was observed for groups of animals in some breeds (primarily purebred beef cattle from breeds with fewer sequenced animals), and a slight decrease in accuracy was observed for other groups, primarily crossbred animals. However, it was noted that some animals in each breed were poorly imputed across all methods. When imputed sequences were included in the reference population to aid imputation of poorly imputed animals, a small increase in overall accuracy was observed for nearly every individual in the population. Two models were created to predict imputation accuracy: a complete model using all information available, including Euclidean distances from genotypes and haplotypes, pedigree information, and clustering groups; and a simple model using only breed and a Euclidean distance matrix as predictors. Both models were successful in predicting imputation accuracy, with correlations between predicted and true imputation accuracy, as measured by concordance rate, of 0.87 and 0.83, respectively. CONCLUSIONS: A clustering methodology can be very useful to subgroup cattle for efficient genotype imputation. In addition, accuracy of genotype imputation from medium to high-density Single Nucleotide Polymorphism (SNP) chip panels to whole-genome sequence can be predicted well using a simple linear model defined in this study.
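The idea behind the simple model (predicting imputation accuracy from a Euclidean distance predictor) can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation: the function names, the choice of the mean distance to the reference animals as the predictor, and the omission of the breed effect are all assumptions made to keep the example short. Genotypes are assumed to be coded as allele counts (0, 1, 2).

```python
from math import sqrt


def euclidean_distance(a, b):
    """Euclidean distance between two genotype vectors coded 0/1/2."""
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance_to_reference(target, reference):
    """Average distance from one target animal to every reference animal.

    Using the mean as the summary is an assumption of this sketch.
    """
    return sum(euclidean_distance(target, r) for r in reference) / len(reference)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Toy values, invented purely for illustration (not data from the study).
reference = [[0, 1, 2, 1], [0, 1, 2, 2]]
target = [0, 1, 2, 1]
distance = mean_distance_to_reference(target, reference)

# Hypothetical (distance, observed concordance) pairs used to fit the line.
toy_distances = [1.0, 2.0, 3.0]
toy_accuracy = [0.9, 0.8, 0.7]
intercept, slope = fit_line(toy_distances, toy_accuracy)
predicted = intercept + slope * distance
```

A fuller version would add breed as a categorical predictor, which turns the fit into a multiple regression.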
Subjects
Cattle/genetics, Genetic Models, Whole Genome Sequencing/veterinary, Algorithms, Animals, Linear Models, Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Strategies for imputing genotypes from the Illumina-Bovine3K, Illumina-BovineLD (6K), BeefLD-GGP (8K), a non-commercial-15K and IndicusLD-GGP (20K) to either Illumina-BovineSNP50 (50K) or to Illumina-BovineHD (777K) SNP panel, as well as for imputing from 50K, GGP-IndicusHD (90iK) and GGP-BeefHD (90tK) to 777K were investigated. Imputation of low density (<50K) genotypes to 777K was carried out in either one or two steps. Imputation of ungenotyped parents (n = 37 sires) with four or more offspring to the 50K panel was also assessed. There were 2,946 Braford, 664 Hereford and 88 Nellore animals, of which 71, 59 and 88 were genotyped with the 777K panel, while all others had 50K genotypes. The reference population consisted of 2,735 animals and 175 bulls for 50K and 777K, respectively. The low density panels were simulated by masking genotypes in the 50K or 777K panel for animals born in 2011. Analyses were performed using both Beagle and FImpute software. Genotype imputation accuracy was measured by concordance rate and allelic R² between true and imputed genotypes. RESULTS: The average concordance rate using FImpute was 0.943 and 0.921 averaged across all simulated low density panels to 50K or to 777K, respectively, in comparison with 0.927 and 0.895 using Beagle. The allelic R² was 0.912 and 0.866 for imputation to 50K or to 777K using FImpute, respectively, and 0.890 and 0.826 using Beagle. One-step and two-step imputation to 777K produced average concordance rates of 0.806 and 0.892 and allelic R² of 0.674 and 0.819, respectively. Imputation of low density panels to 50K, with the exception of 3K, had overall concordance rates greater than 0.940 and allelic R² greater than 0.919. Ungenotyped animals were imputed to the 50K panel with an average concordance rate of 0.950 by FImpute. CONCLUSION: FImpute was more accurate than Beagle for imputation both to 50K and to 777K. Two-step outperformed one-step imputation for imputing to 777K. Ungenotyped animals that have four or more offspring can have their 50K genotypes accurately inferred using FImpute. All low density panels, except the 3K, can be used to impute to the 50K using FImpute or Beagle with high concordance rate and allelic R².
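The two accuracy measures named in this abstract can be illustrated with a short sketch. It assumes genotypes coded as allele counts (0, 1, 2) and takes allelic R² to be the squared Pearson correlation between true and imputed allele counts. That is one common reading; the abstract does not give the exact definition used in the study. The function names and toy genotypes are invented for illustration.

```python
def concordance_rate(true_genotypes, imputed_genotypes):
    """Fraction of genotype calls (coded 0/1/2) that match exactly."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def allelic_r2(true_genotypes, imputed_genotypes):
    """Squared Pearson correlation between true and imputed allele counts.

    Returns NaN when either vector has no variation (e.g. a monomorphic SNP).
    """
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        return float("nan")
    return cov * cov / (var_t * var_i)


# Toy genotypes for one animal across eight masked SNPs (illustrative only).
true_calls = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_calls = [0, 1, 2, 1, 0, 1, 1, 2]

print(concordance_rate(true_calls, imputed_calls))  # 0.75
print(allelic_r2(true_calls, imputed_calls))        # 0.5625
```

Concordance counts only exact matches, while the correlation-based measure also reflects how informative the calls are relative to allele frequency. This is why the two can rank methods differently for rare variants.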
Subjects
Cattle/genetics, Single Nucleotide Polymorphism, Animals, Breeding, Female, Gene Frequency, Genome, Genotype, Male, Genetic Models, Pedigree, DNA Sequence Analysis
ABSTRACT
PIWI-interacting RNAs (piRNAs) are 24-32 nucleotide RNA sequences primarily expressed in germ cells and developing embryos that suppress transposable element expression to protect genomic integrity during epigenetic reprogramming events. We characterized the expression of piRNA sequences and their encoding clusters in sperm samples from an idiopathic fertility model of Holstein bulls with high and low Sire Conception Rates. The piRNA populations were determined to be mostly similar between fertility conditions when investigated by principal component and differential expression analysis, suggesting that a high degree of conservation in the piRNA system is likely necessary for the production of viable sperm. Both fertility conditions demonstrated evidence of 'ping-pong' activity - a secondary biogenesis pathway associated with active transposable element targeting and suppression. Most sperm-borne piRNAs were 29-30 nucleotides in length and originated from 226 clusters across the genome, with the exception of chromosome 20. Mapping analysis revealed abundant targeting of several transposable element families, suggesting a suppressive function of sperm piRNAs consistent with their established roles. Expression of genes targeted by sperm-borne piRNAs is significantly reduced throughout early embryogenesis compared to the mRNA population. Limited transposable element expression is known to be essential for spermatogenesis; thus, epigenetic regulation of this pathway is likely to influence sperm quality and fertilizing capacity.
Subjects
Fertility, Small Interfering RNA, Spermatozoa, Male, Animals, Cattle, Small Interfering RNA/genetics, Spermatozoa/metabolism, Fertility/genetics, DNA Transposable Elements, Piwi-Interacting RNA
ABSTRACT
Small non-coding RNAs have been linked to different phenotypes in bovine sperm; however, attempts to identify sperm-borne molecular biomarkers of male fertility have thus far failed to identify a robust profile of expressed miRNAs related to fertility. We hypothesized that some differences in bull fertility may be reflected in the levels of different miRNAs in sperm. To explore such differences in fertility that are not due to differences in visible metrics of sperm quality, we employed Next Generation Sequencing to compare the miRNA populations in Bos taurus sperm from bulls with comparable motility and morphology but varying Sire Conception Rates. We identified the most abundant miRNAs in both populations (miRs -34b-3p; -100-5p; -191-5p; -30d-4p; -21-5p) and evaluated differences in the overall levels and specific patterns of isomiR expression. We also explored correlations between specific pairs of miRNAs in each population and identified 10 distinct pairs of miRNAs that were positively correlated in bulls with higher fertility and negatively correlated in comparatively less fertile individuals. Furthermore, 8 additional miRNA pairs demonstrated the opposite trend: negatively correlated in high fertility animals and positively correlated in less fertile bulls. Finally, we performed pathway analysis to identify potential roles of miRNAs present in bull sperm in the regulation of specific genes that impact spermatogenesis and embryo development. Together, these results present a comprehensive picture of the bovine sperm miRNAome that suggests multiple potential roles in fertility.
Subjects
MicroRNAs, Animals, Cattle, Embryonic Development, Fertility/genetics, Male, MicroRNAs/genetics, MicroRNAs/metabolism, Spermatogenesis, Spermatozoa/metabolism
ABSTRACT
Approximately one million in vitro produced (IVP) cattle embryos are transferred worldwide each year as a way to improve the rates of genetic gain. The most advanced programmes also apply genomic selection at the embryonic stage by SNP genotyping and the calculation of genomic estimated breeding values (GEBVs). However, a high proportion of cattle embryos fail to establish a pregnancy. Here, we demonstrate that further interrogation of the SNP data collected for GEBVs can effectively remove aneuploid embryos from the pool, improving live births per embryo transfer (ET). Using three preimplantation genetic testing for aneuploidy (PGT-A) approaches, we assessed 1713 cattle blastocysts in a blind, retrospective analysis. Our findings indicate aneuploid embryos have a 5.8% chance of establishing a pregnancy and a 5.0% chance of giving rise to a live birth. This compares to 59.6% and 46.7% for euploid embryos (p < 0.0001). PGT-A improved overall pregnancy and live birth rates by 7.5% and 5.8%, respectively (p < 0.0001). More detailed analyses revealed donor, chromosome, stage, grade, and sex-specific rates of error. Notably, we discovered a significantly higher incidence of aneuploidy in XY embryos and, as in humans, detected a preponderance of maternal meiosis I errors. Our data strongly support the use of PGT-A in cattle IVP programmes.
Subjects
Aneuploidy, Birth Rate/trends, Genetic Testing/methods, Live Birth, Preimplantation Diagnosis/methods, Animals, Blastocyst/cytology, Blastocyst/metabolism, Cattle, Female, Fertilization in Vitro/methods, Pregnancy, Retrospective Studies
ABSTRACT
The identification of genomic regions and candidate genes associated with milk fatty acids contributes to a better understanding of the underlying biology of these traits and enables breeders to modify milk fat composition through genetic selection. The main objectives of this study were: (1) to perform genome-wide association analyses for five groups of milk fatty acids in Holstein cattle using a high-density (777K) SNP panel; and (2) to compare the results of GWAS with and without the DGAT1 gene effect fitted as a covariate in the statistical model. The five groups of milk fatty acids analyzed were: (1) saturated (SFA); (2) unsaturated (UFA); (3) short-chain (SCFA); (4) medium-chain (MCFA); and (5) long-chain (LCFA) fatty acids. When DGAT1 was not fitted as a covariate in the model, significant SNPs and candidate genes were identified on BTA5, BTA6, BTA14, BTA16, and BTA19. When fitting the DGAT1 gene in the model, only the MGST1 and PLBD1 genes were identified. Thus, this study suggests that the DGAT1 gene accounts for most of the variability in milk fatty acid composition and that the PLBD1 and MGST1 genes are important additional candidate genes in Holstein cattle.
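A toy sketch can show why fitting a major gene as a covariate changes which SNPs appear significant. It is not the statistical model used in the study, since real GWAS typically use mixed models with relationship matrices. Here, a single SNP effect is estimated by ordinary least squares, with and without adjusting for a covariate. The adjustment regresses both the phenotype and the SNP on the covariate and then regresses the two sets of residuals on each other (the Frisch-Waugh-Lovell approach). All names and data are hypothetical.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Part of y that is not explained by a straight-line fit on x."""
    intercept, slope = ols_fit(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def snp_effect(genotypes, phenotypes, covariate=None):
    """Estimated SNP effect, optionally adjusted for one covariate."""
    if covariate is None:
        return ols_fit(genotypes, phenotypes)[1]
    g_res = residuals(covariate, genotypes)
    y_res = residuals(covariate, phenotypes)
    return ols_fit(g_res, y_res)[1]


# Hypothetical data: the phenotype is driven entirely by the covariate
# (standing in for a major-gene genotype), and the SNP is merely correlated
# with that covariate.
major_gene = [0, 0, 1, 1, 2, 2]
snp = [0, 1, 1, 1, 1, 2]
phenotype = [0.0, 0.0, 2.0, 2.0, 4.0, 4.0]

unadjusted = snp_effect(snp, phenotype)            # apparent SNP effect
adjusted = snp_effect(snp, phenotype, major_gene)  # effect after adjustment
```

In this toy case the apparent SNP effect disappears once the covariate is fitted. The abstract reports the same qualitative pattern: most signals vanish when DGAT1 is in the model.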