Search | VHL Regional Portal

1.

The rate of de novo structural variation is increased in in vitro-produced offspring and preferentially affects the paternal genome.

Lee, Young-Lim; Bouwman, Aniek C; Harland, Chad; Bosse, Mirte; Costa Monteiro Moreira, Gabriel; Veerkamp, Roel F; Mullaart, Erik; Cambisano, Nadine; Groenen, Martien A M; Karim, Latifa; Coppieters, Wouter; Georges, Michel; Charlier, Carole.

Genome Res ; 33(9): 1455-1464, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37793781

ABSTRACT

Assisted reproductive technologies (ARTs), including in vitro maturation and fertilization (IVF), are increasingly used in human and animal reproduction. Whether these technologies directly affect the rate of de novo mutation (DNM), and to what extent, has been a matter of debate. Here we take advantage of domestic cattle, characterized by complex pedigrees that are ideally suited to detect DNMs and by the systematic use of ART, to study the rate of de novo structural variation (dnSV) in this species and how it is impacted by IVF. By exploiting features of associated de novo point mutations (dnPMs) and dnSVs in clustered DNMs, we provide strong evidence that (1) IVF increases the rate of dnSV approximately fivefold, and (2) the corresponding mutations occur during the very early stages of embryonic development (one- and two-cell stage), yet primarily affect the paternal genome.

Subjects

Embryonic Development; Family; Pregnancy; Female; Animals; Cattle; Humans; Mutation; Pedigree; Genome, Human

2.

Automated pose estimation reveals walking characteristics associated with lameness in broilers.

Fodor, István; van der Sluis, Malou; Jacobs, Marc; de Klerk, Britt; Bouwman, Aniek C; Ellen, Esther D.

Poult Sci ; 102(8): 102787, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37302328

ABSTRACT

Walking ability of broilers can be improved by selective breeding, but large-scale phenotypic records are required. Currently, gait of individual broilers is scored by trained experts; however, precision phenotyping tools could offer a more objective and high-throughput alternative. We studied whether specific walking characteristics determined through pose estimation are linked to gait in broilers. We filmed male broilers from behind, walking through a 3 m × 0.4 m (length × width) corridor one by one, at 3 time points during their lifetime (at 14, 21, and 33 d of age). We used a deep learning model, developed in DeepLabCut, to detect and track 8 keypoints (head, neck, left and right knees, hocks, and feet) of broilers in the recorded videos. Using the keypoints of the legs, 6 pose features were quantified during the double support phase of walking, and 1 pose feature was quantified during steps, at maximum leg lift. Gait was scored on a scale from 0 to 5 by 4 experts, using the videos recorded on d 33, and the broilers were further classified as having either good gait (mean gait score ≤2) or suboptimal gait (mean gait score >2). The relationship of pose features on d 33 with gait was analyzed using the data of 84 broilers (good gait: 57.1%, suboptimal gait: 42.9%). Birds with suboptimal gait had sharper hock joint lateral angles and lower hock-feet distance ratios during double support on d 33, on average. During steps, relative step height was lower in birds with suboptimal gait. Step height and hock-feet distance ratio showed the largest mean deviations in broilers with suboptimal gait compared to those with good gait. We demonstrate that pose estimation can be used to assess walking characteristics during a large part of the productive life of broilers, and to phenotype and monitor broiler gait. These insights can be used to understand differences in the walking patterns of lame broilers, and to build more sophisticated gait prediction models.

Subjects

Chickens; Lameness, Animal; Male; Animals; Lameness, Animal/diagnosis; Walking; Gait; Foot
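The pose features named in the abstract above (hock joint angles, hock-feet distance ratios) can be illustrated with a small sketch. The abstract does not give the exact feature definitions, so both functions below are assumptions for illustration: the angle is taken at the hock between the knee and foot keypoints, and the ratio is taken as hock-to-hock distance over foot-to-foot distance. Keypoints are assumed to be (x, y) pixel coordinates as a pose-estimation tool would report them.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle in degrees at vertex b formed by the segments b->a and b->c.

    a, b, c are (x, y) keypoints, e.g. knee, hock, foot (assumed layout).
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("degenerate keypoints: two points coincide")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_t = max(-1.0, min(1.0, cos_t))
    return math.degrees(math.acos(cos_t))


def hock_feet_distance_ratio(left_hock, right_hock, left_foot, right_foot):
    """Hock-to-hock distance divided by foot-to-foot distance.

    Hypothetical definition; the abstract does not spell out the formula.
    """
    feet = math.dist(left_foot, right_foot)
    if feet == 0:
        raise ValueError("feet keypoints coincide")
    return math.dist(left_hock, right_hock) / feet
```

A smaller angle from `joint_angle_deg` corresponds to a sharper joint, which is the direction the abstract reports for birds with suboptimal gait.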

3.

High-resolution structural variants catalogue in a large-scale whole genome sequenced bovine family cohort data.

Lee, Young-Lim; Bosse, Mirte; Takeda, Haruko; Moreira, Gabriel Costa Monteiro; Karim, Latifa; Druet, Tom; Oget-Ebrad, Claire; Coppieters, Wouter; Veerkamp, Roel F; Groenen, Martien A M; Georges, Michel; Bouwman, Aniek C; Charlier, Carole.

BMC Genomics ; 24(1): 225, 2023 May 01.

Article in English | MEDLINE | ID: mdl-37127590

ABSTRACT

BACKGROUND: Structural variants (SVs) are chromosomal segments that differ between genomes, such as deletions, duplications, insertions, inversions and translocations. The genomics revolution enabled the discovery of sub-microscopic SVs via array and whole-genome sequencing (WGS) data, paving the way to unravel the functional impact of SVs. Recent human expression QTL mapping studies demonstrated that SVs play a disproportionately large role in altering gene expression, underlining the importance of including SVs in genetic analyses. Therefore, this study aimed to generate and explore a high-quality bovine SV catalogue exploiting unique cattle family cohort data (total 266 samples, forming 127 trios).

RESULTS: We curated 13,731 SVs segregating in the population, consisting of 12,201 deletions, 1,509 duplications, and 21 multi-allelic CNVs (> 50 bp). Of these, we validated a subset of copy number variants (CNVs) utilising a direct genotyping approach in an independent cohort, indicating that at least 62% of the CNVs are true variants, segregating in the population. Among gene-disrupting SVs, we prioritised two likely high impact duplications, encompassing ORM1 and POPDC3 genes, respectively. Liver expression QTL mapping results revealed that these duplications are likely causing altered gene expression, confirming the functional importance of SVs. Although most of the accurately genotyped CNVs are tagged by single nucleotide polymorphisms (SNPs) ascertained in WGS data, most CNVs were not captured by individual SNPs obtained from a 50K genotyping array.

CONCLUSION: We generated a high-quality SV catalogue exploiting unique whole genome sequenced bovine family cohort data. Two high impact duplications upregulating ORM1 and POPDC3 are putative candidates for postpartum feed intake and hoof health traits, thus warranting further investigation. Generally, CNVs were in low LD with SNPs on the 50K array. Hence, it remains crucial to incorporate CNVs via means other than tagging SNPs, such as investigation of tagging haplotypes, direct imputation of CNVs, or direct genotyping as done in the current study. The SV catalogue and the custom genotyping array generated in the current study will serve as valuable resources accelerating utilisation of the full spectrum of genetic variants in bovine genomes.

Subjects

Genome; Genomics; Female; Humans; Cattle; Animals; Genomics/methods; Genotype; DNA Copy Number Variations; Haplotypes; Polymorphism, Single Nucleotide; Muscle Proteins/genetics; Cell Adhesion Molecules/genetics

4.

Screening of in vitro-produced cattle embryos to assess incidence and characteristics of unbalanced chromosomal aberrations.

Bouwman, Aniek C; Mullaart, Erik.

JDS Commun ; 4(2): 101-105, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36974223

ABSTRACT

In cattle, pregnancy rates of in vitro-produced embryos are lower than those of in vivo-produced embryos. One of the reasons may be the increase in chromosomal aberrations due to in vitro maturation and fertilization of the oocyte. Currently, embryo transfer is commonly applied in nucleus cattle breeding programs, and the embryos are genotyped for genomic selection. Therefore, intensity data from SNP arrays can be exploited for preimplantation genetic testing by screening the intensity data of the embryos for unbalanced chromosomal aberrations. A total of 558 stage 8 Dutch Holstein embryos genotyped with SNP arrays were screened in an observational study in retrospect. We found a 5% incidence rate of unbalanced chromosomal aberrations (aneuploidy and ploidy issues) among 430 successfully genotyped cattle embryos. The 22 affected embryos showed either aneuploidy or ploidy issues; monosomy was most frequently observed (14/22). In most cases (16/19) the maternal chromosome or chromosomes were lost or gained. One of the monosomy cases gave rise to a live-born fully diploid individual, suggesting mosaicism. Given that embryo genotypes are readily available, monitoring incidence can easily be applied. Moreover, selection for euploid embryos may improve pregnancy rates for in vitro embryo transfer.

5.

Classifying aneuploidy in genotype intensity data using deep learning.

Bouwman, Aniek C; Hulsegge, Ina; Hawken, Rachel J; Henshall, John M; Veerkamp, Roel F; Schokker, Dirkjan; Kamphuis, Claudia.

J Anim Breed Genet ; 140(3): 304-315, 2023 May.

Article in English | MEDLINE | ID: mdl-36806175

ABSTRACT

Aneuploidy is the loss or gain of one or more chromosomes. Although it is a rare phenomenon in liveborn individuals, it is observed in livestock breeding populations. These breeding populations are often routinely genotyped and the genotype intensity data from single nucleotide polymorphism (SNP) arrays can be exploited to identify aneuploidy cases. This identification is a time-consuming and costly task, because it is often performed by visual inspection of the data per chromosome, usually done in plots of the intensity data by an expert. Therefore, we wanted to explore the feasibility of automated image classification to replace (part of) the visual detection procedure for any diploid species. The aim of this study was to develop a deep learning Convolutional Neural Network (CNN) classification model based on chromosome level plots of SNP array intensity data that can classify the images into disomic, monosomic and trisomic cases. A multispecies dataset enriched for aneuploidy cases was collected containing genotype intensity data of 3321 disomic, 1759 monosomic and 164 trisomic chromosomes. The final CNN model had an accuracy of 99.9%, overall precision was 1, recall was 0.98 and the F1 score was 0.99 for classifying images from intensity data. The high precision assures that cases detected are most likely true cases; however, some trisomy cases may be missed (the recall of the class trisomic was 0.94). This supervised CNN model performed much better than an unsupervised k-means clustering, which reached an accuracy of 0.73 and had particular difficulty classifying trisomic cases correctly. The developed CNN classification model provides high accuracy to classify aneuploidy cases based on images of plotted X and Y genotype intensity values. The classification model can be used as a tool for routine screening in large diploid populations that are genotyped to get a better understanding of the incidence and inheritance, and in addition, avoid anomalies in breeding candidates.

Subjects

Deep Learning; Animals; Aneuploidy; Neural Networks, Computer; Genotype
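The accuracy, precision, recall and F1 figures quoted in the abstract above are all derived from a confusion matrix. The following is a minimal sketch of that arithmetic for a three-class problem; the function names and the toy counts in the usage note are made up for illustration and are not the study's data.

```python
def per_class_metrics(confusion, labels):
    """Precision, recall and F1 per class.

    confusion[i][j] is the number of items of true class i predicted as class j.
    """
    n = len(labels)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # missed cases of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp     # wrongly called class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(confusion):
    """Fraction of all items that lie on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```

With hypothetical counts `[[50, 0, 0], [0, 30, 0], [1, 0, 15]]` for the classes disomic, monosomic and trisomic, the trisomic class has precision 1.0 but recall 15/16, which mirrors the pattern the abstract describes: detected cases are reliable, while a few trisomies are missed.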

6.

A 12 kb multi-allelic copy number variation encompassing a GC gene enhancer is associated with mastitis resistance in dairy cattle.

Lee, Young-Lim; Takeda, Haruko; Costa Monteiro Moreira, Gabriel; Karim, Latifa; Mullaart, Erik; Coppieters, Wouter; Appeltant, Ruth; Veerkamp, Roel F; Groenen, Martien A M; Georges, Michel; Bosse, Mirte; Druet, Tom; Bouwman, Aniek C; Charlier, Carole.

PLoS Genet ; 17(7): e1009331, 2021 Jul.

Article in English | MEDLINE | ID: mdl-34288907

ABSTRACT

Clinical mastitis (CM) is an inflammatory disease occurring in the mammary glands of lactating cows. CM is under genetic control, and a prominent CM resistance QTL located on chromosome 6 was reported in various dairy cattle breeds. Nevertheless, the biological mechanism underpinning this QTL has been lacking. Herein, we mapped, fine-mapped, and discovered the putative causal variant underlying this CM resistance QTL in the Dutch dairy cattle population. We identified a ~12 kb multi-allelic copy number variant (CNV), that is in perfect linkage disequilibrium with a lead SNP, as a promising candidate variant. By implementing a fine-mapping and through expression QTL mapping, we showed that the group-specific component gene (GC), a gene encoding a vitamin D binding protein, is an excellent candidate causal gene for the QTL. The multiplicated alleles are associated with increased GC expression and low CM resistance. Ample evidence from functional genomics data supports the presence of an enhancer within this CNV, which would exert cis-regulatory effect on GC. We observed that strong positive selection swept the region near the CNV, and haplotypes associated with the multiplicated allele were strongly selected for. Moreover, the multiplicated allele showed pleiotropic effects for increased milk yield and reduced fertility, hinting that a shared underlying biology for these effects may revolve around the vitamin D pathway. These findings together suggest a putative causal variant of a CM resistance QTL, where a cis-regulatory element located within a CNV can alter gene expression and affect multiple economically important traits.

Subjects

Enhancer Elements, Genetic; Mastitis, Bovine/genetics; Vitamin D-Binding Protein/genetics; Animals; Cattle; DNA Copy Number Variations; Female; Genetic Predisposition to Disease; Haplotypes; Linkage Disequilibrium; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Whole Genome Sequencing

7.

Using prior information from humans to prioritize genes and gene-associated variants for complex traits in livestock.

Raymond, Biaty; Yengo, Loic; Costilla, Roy; Schrooten, Chris; Bouwman, Aniek C; Hayes, Ben J; Veerkamp, Roel F; Visscher, Peter M.

PLoS Genet ; 16(9): e1008780, 2020 Sep.

Article in English | MEDLINE | ID: mdl-32925905

ABSTRACT

Genome-Wide Association Studies (GWAS) in large human cohorts have identified thousands of loci associated with complex traits and diseases. For identifying the genes and gene-associated variants that underlie complex traits in livestock, especially where sample sizes are limiting, it may help to integrate the results of GWAS for equivalent traits in humans as prior information. In this study, we sought to investigate the usefulness of results from a GWAS on human height as prior information for identifying the genes and gene-associated variants that affect stature in cattle, using GWAS summary data on sample sizes of 700,000 and 58,265 for humans and cattle, respectively. Using Fisher's exact test, we observed a significant proportion of cattle stature-associated genes (30/77) that are also associated with human height (odds ratio = 5.1, p = 3.1e-10). Results of randomized sampling tests showed that cattle orthologs of human height-associated genes, hereafter referred to as candidate genes (C-genes), were more enriched for cattle stature GWAS signals than random samples of genes in the cattle genome (p = 0.01). Randomly sampled SNPs within the C-genes also tend to explain more genetic variance for cattle stature (up to 13.2%) than randomly sampled SNPs within random cattle genes (p = 0.09). The most significant SNPs from a cattle GWAS for stature within the C-genes did not explain more genetic variance for cattle stature than the most significant SNPs within random cattle genes (p = 0.87). Altogether, our findings support previous studies that suggest a similarity in the genetic regulation of height across mammalian species. However, with the availability of a powerful GWAS for stature that combined data from 8 cattle breeds, prior information from human-height GWAS does not seem to provide any additional benefit with respect to the identification of genes and gene-associated variants that affect stature in cattle.

Subjects

Body Height/genetics; Cattle/genetics; Genome-Wide Association Study/methods; Animals; Breeding/methods; Databases, Genetic; Genetic Variation/genetics; Humans; Livestock/genetics; Multifactorial Inheritance/genetics; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics
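The gene-overlap enrichment in the abstract above is assessed with Fisher's exact test on a 2×2 table. The sketch below shows how a sample odds ratio and a one-sided (enrichment) p-value are obtained from the hypergeometric distribution. The abstract does not report the full contingency table or whether the test was one- or two-sided, so the function name, the one-sided choice and the toy counts are assumptions for illustration only.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (sample odds ratio, P(X >= a)), where X is the top-left cell
    under the hypergeometric null with all margins fixed.
    """
    row1 = a + b            # e.g. genes associated with the trait in species 1
    col1 = a + c            # e.g. genes associated with the trait in species 2
    total = a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    ) / denom
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value
```

For the toy table `[[3, 1], [1, 3]]` this gives an odds ratio of 9 and a p-value of 17/70, showing that a large odds ratio alone is not significant when counts are small.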

8.

Using short read sequencing to characterise balanced reciprocal translocations in pigs.

Bouwman, Aniek C; Derks, Martijn F L; Broekhuijse, Marleen L W J; Harlizius, Barbara; Veerkamp, Roel F.

BMC Genomics ; 21(1): 576, 2020 Aug 24.

Article in English | MEDLINE | ID: mdl-32831014

ABSTRACT

BACKGROUND: A balanced constitutional reciprocal translocation (RT) is a mutual exchange of terminal segments of two non-homologous chromosomes without any loss or gain of DNA in germline cells. Carriers of balanced RTs are viable individuals with no apparent phenotypical consequences. These animals produce, however, unbalanced gametes and show therefore reduced fertility and offspring with congenital abnormalities. This cytogenetic abnormality is usually detected using chromosome staining techniques. The aim of this study was to test the possibilities of using paired end short read sequencing for detection of balanced RTs in boars and investigate their breakpoints and junctions.

RESULTS: Balanced RTs were recovered in a blinded analysis, using structural variant calling software DELLY, in 6 of the 7 carriers with 30 fold short read paired end sequencing. In 15 non-carriers we did not detect any RTs. Reducing the coverage to 20 fold, 15 fold and 10 fold showed that at least 20 fold coverage is required to obtain good results. One RT was not detected using the blind screening; however, a highly likely RT was discovered after unblinding. This RT was located in a repetitive region, showing the limitations of short read sequence data. The detailed analysis of the breakpoints and junctions suggested three junctions showing microhomology, three junctions with blunt-end ligation, and three micro-insertions at the breakpoint junctions. The detected RTs were also shown to disrupt genes.

CONCLUSIONS: We conclude that paired end short read sequence data can be used to detect and characterize balanced reciprocal translocations, if sequencing depth is at least 20 fold coverage. However, translocations in repetitive areas may require large fragments or even long read sequence data.

Subjects

Chromosome Aberrations; Translocation, Genetic; Animals; DNA; Heterozygote; Male; Swine/genetics

9.

A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices.

Raymond, Biaty; Wientjes, Yvonne C J; Bouwman, Aniek C; Schrooten, Chris; Veerkamp, Roel F.

Genet Sel Evol ; 52(1): 21, 2020 Apr 28.

Article in English | MEDLINE | ID: mdl-32345213

ABSTRACT

BACKGROUND: A multi-population genomic prediction (GP) model in which important pre-selected single nucleotide polymorphisms (SNPs) are differentially weighted (MPMG) has been shown to result in better prediction accuracy than a multi-population, single genomic relationship matrix ([Formula: see text]) GP model (MPSG) in which all SNPs are weighted equally. Our objective was to underpin theoretically the advantages and limits of the MPMG model over the MPSG model, by deriving and validating a deterministic prediction equation for its accuracy.

METHODS: Using selection index theory, we derived an equation to predict the accuracy of estimated total genomic values of selection candidates from population [Formula: see text] ([Formula: see text]), when individuals from two populations, [Formula: see text] and [Formula: see text], are combined in the training population and two [Formula: see text], made respectively from pre-selected and remaining SNPs, are fitted simultaneously in MPMG. We used simulations to validate the prediction equation in scenarios that differed in the level of genetic correlation between populations, heritability, and proportion of genetic variance explained by the pre-selected SNPs. Empirical accuracy of the MPMG model in each scenario was calculated and compared to the predicted accuracy from the equation.

RESULTS: In general, the derived prediction equation resulted in accurate predictions of [Formula: see text] for the scenarios evaluated. Using the prediction equation, we showed that an important advantage of the MPMG model over the MPSG model is its ability to benefit from the small number of independent chromosome segments ([Formula: see text]) due to the pre-selected SNPs, both within and across populations, whereas for the MPSG model, there is only a single value for [Formula: see text], calculated based on all SNPs, which is very large. However, this advantage is dependent on the pre-selected SNPs that explain some proportion of the total genetic variance for the trait.

CONCLUSIONS: We developed an equation that gives insight into why, and under which conditions, the MPMG outperforms the MPSG model for GP. The equation can be used as a deterministic tool to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for prediction of disease risk scores.

Subjects

Breeding; Metagenomics; Models, Genetic; Animals; Phenotype; Polymorphism, Single Nucleotide

10.

Functional and population genetic features of copy number variations in two dairy cattle populations.

Lee, Young-Lim; Bosse, Mirte; Mullaart, Erik; Groenen, Martien A M; Veerkamp, Roel F; Bouwman, Aniek C.

BMC Genomics ; 21(1): 89, 2020 Jan 28.

Article in English | MEDLINE | ID: mdl-31992181

ABSTRACT

BACKGROUND: Copy Number Variations (CNVs) are gain or loss of DNA segments that are known to play a role in shaping a wide range of phenotypes. In this study, we used two dairy cattle populations, Holstein Friesian and Jersey, to discover CNVs using the Illumina BovineHD Genotyping BeadChip aligned to the ARS-UCD1.2 assembly. The discovered CNVs were investigated for their functional impact and their population genetics features.

RESULTS: We discovered 14,272 autosomal CNVs, which were aggregated into 1755 CNV regions (CNVR) from 451 animals. These CNVRs together cover 2.8% of the bovine autosomes. The assessment of the functional impact of CNVRs showed that rare CNVRs (MAF < 0.01) are more likely to overlap with genes than common CNVRs (MAF ≥ 0.05). The population differentiation index (Fst) based on CNVRs revealed multiple highly diverged CNVRs between the two breeds. Some of these CNVRs overlapped with candidate genes such as MGAM and ADAMTS17 genes, which are related to starch digestion and body size, respectively. Lastly, linkage disequilibrium (LD) between CNVRs and BovineHD BeadChip SNPs was generally low, close to 0, although common deletions (MAF ≥ 0.05) showed slightly higher LD (r² ≈ 0.1 at 10 kb distance) than the rest. Nevertheless, this LD is still lower than SNP-SNP LD (r² ≈ 0.5 at 10 kb distance).

CONCLUSIONS: Our analyses showed that CNVRs detected using BovineHD BeadChip arrays are likely to be functional. This finding indicates that CNVs can potentially disrupt the function of genes and thus might alter phenotypes. Also, the population differentiation index revealed two candidate genes, MGAM and ADAMTS17, which hint at adaptive evolution between the two populations. Lastly, low CNVR-SNP LD implies that genetic variation from CNVs might not be fully captured in routine animal genetic evaluation, which relies solely on SNP markers.

Subjects

DNA Copy Number Variations; Genetics, Population; Animals; Breeding; Cattle; Genome; Linkage Disequilibrium; Quantitative Trait Loci
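The LD values in the abstract above are reported as r². One common way to obtain such a value from unphased data is the squared Pearson correlation between genotype codes at the two loci. The sketch below takes that approach; it is an assumption that this matches the study's estimator, and coding a CNV as a 0/1/2 count (for example, number of deleted copies) is likewise a simplification chosen for illustration.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two genotype vectors.

    x and y hold one numeric code per individual (e.g. 0/1/2 allele counts
    for a SNP, or a 0/1/2 deletion count for a biallelic CNV).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 entries")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # A monomorphic locus carries no information about the other locus.
        return 0.0
    return (sxy * sxy) / (sxx * syy)
```

A value near 0, as the abstract reports for most CNVR-SNP pairs, means the SNP genotype says almost nothing about the CNV genotype, which is why SNP-only evaluations may miss CNV-driven variation.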

11.

Imputation to whole-genome sequence using multiple pig populations and its use in genome-wide association studies.

van den Berg, Sanne; Vandenplas, Jérémie; van Eeuwijk, Fred A; Bouwman, Aniek C; Lopes, Marcos S; Veerkamp, Roel F.

Genet Sel Evol ; 51(1): 2, 2019 Jan 24.

Article in English | MEDLINE | ID: mdl-30678638

ABSTRACT

BACKGROUND: Use of whole-genome sequence data (WGS) is expected to improve identification of quantitative trait loci (QTL). However, this requires imputation to WGS, often with a limited number of sequenced animals for the target population. The objective of this study was to investigate imputation to WGS in two pig lines using a multi-line reference population and, subsequently, to investigate the effect of using these imputed WGS (iWGS) for GWAS.

METHODS: Phenotypes and genotypes were available on 12,184 Large White pigs (LW-line) and 4943 Dutch Landrace pigs (DL-line). Imputed 660 K and 80 K genotypes for the LW-line and DL-line, respectively, were imputed to iWGS using Beagle v.4.1. Since only 32 LW-line and 12 DL-line boars were sequenced, 142 animals from eight commercial lines were added. GWAS were performed for each line using the 80 K and 660 K SNPs, the genotype scores of iWGS SNPs that had an imputation accuracy (Beagle R²) higher than 0.6, and the dosage scores of all iWGS SNPs.

RESULTS: For the DL-line (LW-line), imputation of 80 K genotypes to iWGS resulted in an average Beagle R² of 0.39 (0.49). After quality control, 2.5 × 10⁶ (3.5 × 10⁶) SNPs had a Beagle R² higher than 0.6, resulting in an average Beagle R² of 0.83 (0.93). Compared to the 80 K and 660 K genotypes, using iWGS led to the identification of 48.9 and 64.4% more QTL regions, for the DL-line and LW-line, respectively, and the most significant SNPs in the QTL regions explained a higher proportion of phenotypic variance. Using dosage instead of genotype scores improved the identification of QTL, because the model accounted for uncertainty of imputation, and all SNPs were used in the analysis.

CONCLUSIONS: Imputation to WGS using the multi-line reference population resulted in relatively poor imputation, especially when imputing from 80 K (DL-line). In spite of the poor imputation accuracies, using iWGS instead of a lower density SNP chip increased the number of detected QTL and the estimated proportion of phenotypic variance explained by these QTL, especially when dosage scores were used instead of genotype scores. Thus, iWGS, even with poor imputation accuracy, can be used to identify possible interesting regions for fine mapping.

Subjects

Genome-Wide Association Study/methods; Swine/genetics; Whole Genome Sequencing/methods; Animals; Genome-Wide Association Study/standards; Genome-Wide Association Study/veterinary; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Whole Genome Sequencing/standards; Whole Genome Sequencing/veterinary
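The abstract above contrasts genotype scores with dosage scores. A short sketch makes the difference concrete: a genotype score keeps only the most probable genotype, while a dosage is the expected allele count and so retains the imputation uncertainty. The function names and the assumption that each imputed SNP comes with three genotype probabilities P(0), P(1), P(2) are illustrative; this is not a reproduction of the study's pipeline.

```python
def dosage(probs):
    """Expected alternate-allele count from genotype probabilities.

    probs is (P(0 copies), P(1 copy), P(2 copies)) and must sum to 1.
    """
    p0, p1, p2 = probs
    if abs(p0 + p1 + p2 - 1.0) > 1e-6:
        raise ValueError("genotype probabilities must sum to 1")
    return p1 + 2.0 * p2


def best_guess_genotype(probs):
    """Most probable genotype (0, 1 or 2); ties resolve to the lower count."""
    return max(range(3), key=lambda g: probs[g])
```

For a poorly imputed SNP with probabilities `(0.1, 0.6, 0.3)`, the best-guess genotype is 1 while the dosage is about 1.2. Regressing the phenotype on the dosage uses that extra information, which is consistent with the abstract's observation that dosage scores helped QTL detection when imputation accuracy was low.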

12.

Genomic prediction for numerically small breeds, using models with pre-selected and differentially weighted markers.

Raymond, Biaty; Bouwman, Aniek C; Wientjes, Yvonne C J; Schrooten, Chris; Houwing-Duistermaat, Jeanine; Veerkamp, Roel F.

Genet Sel Evol ; 50(1): 49, 2018 Oct 10.

Article in English | MEDLINE | ID: mdl-30314431

ABSTRACT

BACKGROUND: Genomic prediction (GP) accuracy in numerically small breeds is limited by the small size of the reference population. Our objective was to test a multi-breed multiple genomic relationship matrices (GRM) GP model (MBMG) that weighs pre-selected markers separately, uses the remaining markers to explain the remaining genetic variance that can be explained by markers, and weighs information of breeds in the reference population by their genetic correlation with the validation breed.

METHODS: Genotype and phenotype data were used on 595 Jersey bulls from New Zealand and 5503 Holstein bulls from the Netherlands, all with deregressed proofs for stature. Different sets of markers were used, containing either pre-selected markers from a meta-genome-wide association analysis on stature, remaining markers or both. We implemented a multi-breed bivariate GREML model in which we fitted either a single multi-breed GRM (MBSG), or two distinct multi-breed GRM (MBMG), one made with pre-selected markers and the other with remaining markers. Accuracies of predicting stature for Jersey individuals using the multi-breed models (Holstein and Jersey combined reference population) were compared to those obtained using either the Jersey (within-breed) or Holstein (across-breed) reference population. All the models were subsequently fitted in the analysis of simulated phenotypes, with a simulated genetic correlation between breeds of 1, 0.5, and 0.25.

RESULTS: The MBMG model always gave better prediction accuracies for stature compared to MBSG, within-, and across-breed GP models. For example, with MBSG, accuracies obtained by fitting 48,912 unselected markers (0.43), 357 pre-selected markers (0.38) or a combination of both (0.43), were lower than accuracies obtained by fitting pre-selected and unselected markers in separate GRM in MBMG (0.49). This improvement was further confirmed by results from a simulation study, with MBMG performing on average 23% better than MBSG with all markers fitted.

CONCLUSIONS: With the MBMG model, it is possible to use information from numerically large breeds to improve prediction accuracy of numerically small breeds. The superiority of MBMG is mainly due to its ability to use information on pre-selected markers, explain the remaining genetic variance and weigh information from a different breed by the genetic correlation between breeds.

Subjects

Breeding/methods; Models, Genetic; Polymorphism, Genetic; Animals; Breeding/standards; Cattle/genetics; Genetic Markers; Sample Size; Selection, Genetic
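The MBMG model in the abstract above fits two genomic relationship matrices, one built from pre-selected markers and one from the remaining markers. The sketch below shows how such a pair of matrices could be built. It uses the widely known VanRaden-style construction G = ZZ'/(2 Σ p(1 − p)) with allele frequencies estimated from the sample; the abstract does not state which GRM formula or multi-breed scaling the authors used, so this choice, the function names and the pooled-frequency simplification are all assumptions.

```python
def vanraden_grm(genotypes, marker_idx):
    """Genomic relationship matrix from a subset of markers.

    genotypes: one list per individual of 0/1/2 allele counts.
    marker_idx: indices of the markers to include.
    """
    n = len(genotypes)
    # Allele frequency per selected marker, estimated from the sample itself.
    freqs = [sum(ind[m] for ind in genotypes) / (2 * n) for m in marker_idx]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all selected markers are monomorphic")
    # Centre each genotype by twice the allele frequency.
    z = [[ind[m] - 2 * p for m, p in zip(marker_idx, freqs)] for ind in genotypes]
    return [
        [sum(a * b for a, b in zip(z[i], z[j])) / scale for j in range(n)]
        for i in range(n)
    ]


def split_grms(genotypes, preselected):
    """Return (GRM from pre-selected markers, GRM from all remaining markers)."""
    chosen = sorted(set(preselected))
    chosen_set = set(chosen)
    rest = [m for m in range(len(genotypes[0])) if m not in chosen_set]
    return vanraden_grm(genotypes, chosen), vanraden_grm(genotypes, rest)
```

Fitting the two matrices as separate random effects lets each marker set receive its own variance component, which is the mechanism the abstract credits for weighting pre-selected markers differently from the rest.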

13.

Utility of whole-genome sequence data for across-breed genomic prediction.

Raymond, Biaty; Bouwman, Aniek C; Schrooten, Chris; Houwing-Duistermaat, Jeanine; Veerkamp, Roel F.

Genet Sel Evol ; 50(1): 27, 2018 May 18.

Article in English | MEDLINE | ID: mdl-29776327

ABSTRACT

BACKGROUND: Genomic prediction (GP) across breeds has so far resulted in low accuracies of the predicted genomic breeding values. Our objective was to evaluate whether using whole-genome sequence (WGS) instead of low-density markers can improve GP across breeds, especially when markers are pre-selected from a genome-wide association study (GWAS), and to test our hypothesis that many non-causal markers in WGS data have a diluting effect on accuracy of across-breed prediction.

METHODS: Estimated breeding values for stature and bovine high-density (HD) genotypes were available for 595 Jersey bulls from New Zealand, 957 Holstein bulls from New Zealand and 5553 Holstein bulls from the Netherlands. BovineHD genotypes for all bulls were imputed to WGS using Beagle4 and Minimac2. Genomic prediction across the three populations was performed with ASReml4, with each population used as single reference and as single validation sets. In addition to the 50k, HD and WGS, markers that were significantly associated with stature in a large meta-GWAS analysis were selected and used for prediction, resulting in 10 prediction scenarios. Furthermore, we estimated the proportion of genetic variance captured by markers in each scenario.

RESULTS: Across breeds, 50k, HD and WGS markers resulted in very low accuracies of prediction ranging from −0.04 to 0.13. Accuracies were higher in scenarios with pre-selected markers from a meta-GWAS. For example, using only the 133 most significant markers in 133 QTL regions from the meta-GWAS yielded accuracies ranging from 0.08 to 0.23, while 23,125 markers with a −log10(p) higher than 7 resulted in accuracies of up to 0.35. Using WGS data did not significantly improve the proportion of genetic variance captured across breeds compared to scenarios with few but pre-selected markers.

CONCLUSIONS: Our results demonstrated that the accuracy of across-breed GP can be improved by using markers that are pre-selected from WGS based on their potential causal effect. We also showed that simply increasing the number of markers up to the WGS level does not increase the accuracy of across-breed prediction, even when markers that are expected to have a causal effect are included.

Subjects

Breeding, Cattle/anatomy & histology, Cattle/classification, Genome-Wide Association Study/veterinary, Quantitative Trait Loci, Animals, Biometry, Cattle/genetics, Computational Biology, Genetic Variation, Male, Genetic Models, Pedigree, Single Nucleotide Polymorphism

14.

Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals.

Bouwman, Aniek C; Daetwyler, Hans D; Chamberlain, Amanda J; Ponce, Carla Hurtado; Sargolzaei, Mehdi; Schenkel, Flavio S; Sahana, Goutam; Govignon-Gion, Armelle; Boitard, Simon; Dolezal, Marlies; Pausch, Hubert; Brøndum, Rasmus F; Bowman, Phil J; Thomsen, Bo; Guldbrandtsen, Bernt; Lund, Mogens S; Servin, Bertrand; Garrick, Dorian J; Reecy, James; Vilkki, Johanna; Bagnato, Alessandro; Wang, Min; Hoff, Jesse L; Schnabel, Robert D; Taylor, Jeremy F; Vinkhuyzen, Anna A E; Panitz, Frank; Bendixen, Christian; Holm, Lars-Erik; Gredler, Birgit; Hozé, Chris; Boussaha, Mekki; Sanchez, Marie-Pierre; Rocha, Dominique; Capitan, Aurelien; Tribout, Thierry; Barbat, Anne; Croiseau, Pascal; Drögemüller, Cord; Jagannathan, Vidhya; Vander Jagt, Christy; Crowley, John J; Bieber, Anna; Purfield, Deirdre C; Berry, Donagh P; Emmerling, Reiner; Götz, Kay-Uwe; Frischknecht, Mirjam; Russ, Ingolf; Sölkner, Johann.

Nat Genet ; 50(3): 362-367, 2018 03.

Article in English | MEDLINE | ID: mdl-29459679

ABSTRACT

Stature is affected by many polymorphisms of small effect in humans¹. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes²,³. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (P < 5 × 10⁻⁸) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP-seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals.

Subjects

Body Size/genetics, Cattle/genetics, Conserved Sequence, Genome-Wide Association Study, Mammals/genetics, Animals, Body Height/genetics, Cattle/classification, Genetic Association Studies/veterinary, Genetic Variation, Genome-Wide Association Study/statistics & numerical data, Genome-Wide Association Study/veterinary, Humans, Phenotype, Single Nucleotide Polymorphism, Quantitative Trait Loci/genetics

15.

Estimated allele substitution effects underlying genomic evaluation models depend on the scaling of allele counts.

Bouwman, Aniek C; Hayes, Ben J; Calus, Mario P L.

Genet Sel Evol ; 49(1): 79, 2017 10 30.

Article in English | MEDLINE | ID: mdl-29084514

ABSTRACT

BACKGROUND: Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of allele counts results in less shrinkage towards the mean for low minor allele frequency (MAF) variants. Scaling may become relevant for estimating ASE as more low MAF variants will be used in genomic evaluations. We show the impact of scaling on estimates of ASE using real data and a theoretical framework, and in terms of power, model fit and predictive performance. RESULTS: In a dairy cattle dataset with 630 K SNP genotypes, the correlation between DGV for stature from a random regression model using centered allele counts (RRc) and centered and scaled allele counts (RRcs) was 0.9988, whereas the overall correlation between ASE using RRc and RRcs was 0.27. The main difference in ASE between both methods was found for SNPs with a MAF lower than 0.01. Both the ratio (ASE from RRcs/ASE from RRc) and the regression coefficient (regression of ASE from RRcs on ASE from RRc) were much higher than 1 for low MAF SNPs. Derived equations showed that scenarios with a high heritability, a large number of individuals and a small number of variants have lower ratios between ASE from RRc and RRcs. We also investigated the optimal scaling parameter [from -1 (RRcs) to 0 (RRc) in steps of 0.1] in the bovine stature dataset. We found that the log-likelihood was maximized with a scaling parameter of -0.8, while the mean squared error of prediction was minimized with a scaling parameter of -1, i.e., RRcs. CONCLUSIONS: Large differences in estimated ASE were observed for low MAF SNPs when allele counts were scaled or not scaled because there is less shrinkage towards the mean for scaled allele counts. We derived a theoretical framework that shows that the difference in ASE due to shrinkage is heavily influenced by the power of the data. Increasing the power results in smaller differences in ASE whether allele counts are scaled or not.
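
The shrinkage argument in this abstract can be illustrated with a small sketch. It is not the authors' derivation: it assumes uncorrelated SNPs (no linkage disequilibrium), Hardy-Weinberg proportions, a phenotypic variance of 1, and a scaling of centered allele counts by [2p(1-p)]^(s/2), where s = 0 corresponds to RRc and s = -1 to RRcs. The function name `shrinkage_factors` is hypothetical.

```python
def shrinkage_factors(mafs, n, h2, s):
    """Per-SNP ridge shrinkage factor under an orthogonal-SNP approximation.

    mafs : minor allele frequencies of the SNPs in the model
    n    : number of individuals
    h2   : heritability (phenotypic variance assumed to be 1)
    s    : scaling parameter, 0 = centered only, -1 = centered and scaled
    """
    if not 0.0 < h2 < 1.0:
        raise ValueError("h2 must lie strictly between 0 and 1")
    # Expected variance of a centered allele count under Hardy-Weinberg.
    het = [2.0 * p * (1.0 - p) for p in mafs]
    # Variance of the covariate after scaling (assumed parametrization).
    var_scaled = [h ** (1.0 + s) for h in het]
    # Ridge penalty sigma_e^2 / sigma_b^2, with the genetic variance
    # spread evenly over the scaled covariates.
    penalty = (1.0 - h2) / h2 * sum(var_scaled)
    # Fraction of the least-squares estimate that survives shrinkage.
    return [n * v / (n * v + penalty) for v in var_scaled]


mafs = [0.005, 0.1, 0.3, 0.5]
rrc = shrinkage_factors(mafs, n=1000, h2=0.5, s=0.0)
rrcs = shrinkage_factors(mafs, n=1000, h2=0.5, s=-1.0)
for maf, a, b in zip(mafs, rrc, rrcs):
    print(f"MAF {maf}: RRc keeps {a:.3f}, RRcs keeps {b:.3f}, ratio {b / a:.3f}")
```

Under these assumptions the scaled model shrinks every SNP equally, whereas the centered model shrinks low-MAF SNPs more, so the ratio of ASE from RRcs to ASE from RRc exceeds 1 for rare variants and approaches 1 as n or h2 increases.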

Subjects

Algorithms, Gene Frequency, Genome-Wide Association Study/methods, Single Nucleotide Polymorphism, Animals, Cattle/genetics, Female, Genome-Wide Association Study/standards, Male, Genetic Models

16.

Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein-Friesian cattle.

Veerkamp, Roel F; Bouwman, Aniek C; Schrooten, Chris; Calus, Mario P L.

Genet Sel Evol ; 48(1): 95, 2016 12 01.

Article in English | MEDLINE | ID: mdl-27905878

ABSTRACT

BACKGROUND: Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. METHODS: Phenotypes were available for 5503 Holstein-Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. RESULTS: The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. 
CONCLUSIONS: Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.
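
For readers unfamiliar with genomic relationship matrices, the sketch below builds one from allele counts using the widely used estimator in which each variant is centered by twice its allele frequency and divided by its expected variance 2p(1-p). The abstract does not state which GRM settings were used, so this is an illustration under that assumption; the function name `genomic_relationship_matrix` is hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Build a GRM from genotypes[i][j] = allele count (0, 1 or 2)
    of variant i in animal j, averaging over polymorphic variants."""
    n_animals = len(genotypes[0])
    grm = [[0.0] * n_animals for _ in range(n_animals)]
    n_used = 0
    for counts in genotypes:
        p = sum(counts) / (2.0 * n_animals)   # allele frequency in this sample
        expected_var = 2.0 * p * (1.0 - p)
        if expected_var == 0.0:
            continue                          # skip monomorphic variants
        n_used += 1
        centered = [x - 2.0 * p for x in counts]
        for j in range(n_animals):
            for k in range(n_animals):
                grm[j][k] += centered[j] * centered[k] / expected_var
    if n_used == 0:
        raise ValueError("no polymorphic variants available")
    return [[value / n_used for value in row] for row in grm]


# Three variants (rows) scored in three animals (columns).
example = [[0, 1, 2], [2, 1, 0], [2, 2, 2]]
for row in genomic_relationship_matrix(example):
    print(row)
```

A GRM restricted to GWAS-selected variants is obtained by passing only those rows, which is how the "selected variants" and "all variants" matrices compared in the abstract would differ.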

Subjects

Genetic Variation, Genome-Wide Association Study, Genome, Genomics, Animals, Breeding, Cattle, Genomics/methods, Genotype, High-Throughput Nucleotide Sequencing, Linkage Disequilibrium, Phenotype, Single Nucleotide Polymorphism, Genetic Selection

17.

Efficient genomic prediction based on whole-genome sequence data using split-and-merge Bayesian variable selection.

Calus, Mario P L; Bouwman, Aniek C; Schrooten, Chris; Veerkamp, Roel F.

Genet Sel Evol ; 48(1): 49, 2016 06 29.

Article in English | MEDLINE | ID: mdl-27357580

ABSTRACT

BACKGROUND: Use of whole-genome sequence data is expected to increase persistency of genomic prediction across generations and breeds but affects model performance and requires increased computing time. In this study, we investigated whether the split-and-merge Bayesian stochastic search variable selection (BSSVS) model could overcome these issues. BSSVS is performed first on subsets of sequence-based variants and then on a merged dataset containing variants selected in the first step. RESULTS: We used a dataset that included 4,154,064 variants after editing and de-regressed proofs for 3415 reference and 2138 validation bulls for somatic cell score, protein yield and interval first to last insemination. In the first step, BSSVS was performed on 106 subsets each containing ~39,189 variants. In the second step, 1060 up to 472,492 variants, selected from the first step, were included to estimate the accuracy of genomic prediction. Accuracies were at best equal to those achieved with the commonly used Bovine 50k-SNP chip, although the number of variants within a few well-known quantitative trait loci regions was considerably enriched. When variant selection and the final genomic prediction were performed on the same data, predictions were biased. Predictions computed as the average of the predictions computed for each subset achieved the highest accuracies, i.e. 0.5 to 1.1% higher than the accuracies obtained with the 50k-SNP chip, and yielded the least biased predictions. Finally, the accuracy of genomic predictions obtained when all sequence-based variants were included was similar or up to 1.4% lower compared to that based on the average predictions across the subsets. By applying parallelization, the split-and-merge procedure was completed in 5 days, while the standard analysis including all sequence-based variants took more than three months.
CONCLUSIONS: The split-and-merge approach splits one large computational task into many much smaller ones, which allows the use of parallel processing and thus efficient genomic prediction based on whole-genome sequence data. The split-and-merge approach did not improve prediction accuracy, probably because we used data on a single breed for which relationships between individuals were high. Nevertheless, the split-and-merge approach may have potential for applications on data from multiple breeds.
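
The control flow of a split-and-merge analysis can be sketched as follows. This is only an outline of the first step and the merge: the per-subset selection here ranks variants by absolute covariance with the phenotype, a deliberately simple stand-in for BSSVS, and the function names and the `keep_per_subset` parameter are hypothetical.

```python
def split_indices(n_variants, subset_size):
    """Split variant indices 0..n_variants-1 into consecutive subsets."""
    return [list(range(start, min(start + subset_size, n_variants)))
            for start in range(0, n_variants, subset_size)]


def abs_covariance(x, y):
    """Absolute sample covariance (divisor n) between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))) / len(x)


def split_and_merge(genotypes, phenotypes, subset_size, keep_per_subset):
    """Select variants within each subset, then merge the selections.

    genotypes[i] holds the allele counts of variant i across animals.
    Each subset is processed independently, so the loop body could be
    dispatched to separate workers in a parallel setting.
    """
    merged = []
    for subset in split_indices(len(genotypes), subset_size):
        ranked = sorted(subset,
                        key=lambda i: abs_covariance(genotypes[i], phenotypes),
                        reverse=True)
        merged.extend(ranked[:keep_per_subset])
    return sorted(merged)


genotypes = [
    [0, 1, 2, 0, 1, 2], [1, 1, 1, 1, 1, 1], [2, 0, 1, 2, 0, 1], [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2], [2, 1, 1, 1, 0, 1], [1, 0, 1, 0, 1, 0], [0, 2, 0, 2, 0, 2],
]
phenotypes = [0.0, 0.1, 1.0, 1.1, 2.0, 2.1]
print(split_and_merge(genotypes, phenotypes, subset_size=4, keep_per_subset=1))
```

In the procedure described in the abstract, the merged variant set would then be analysed again with the full model in a second step; that step is omitted here.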

Subjects

Cattle/genetics, Computational Biology, Genomics/methods, Genetic Models, Animals, Bayes Theorem, Genotype, Male, Oligonucleotide Array Sequence Analysis, Single Nucleotide Polymorphism

18.

Consequences of splitting whole-genome sequencing effort over multiple breeds on imputation accuracy.

Bouwman, Aniek C; Veerkamp, Roel F.

BMC Genet ; 15: 105, 2014 Oct 03.

Article in English | MEDLINE | ID: mdl-25277486

ABSTRACT

BACKGROUND: The aim of this study was to determine the consequences of splitting sequencing effort over multiple breeds for imputation accuracy from a high-density SNP chip towards whole-genome sequence. Such information would assist, for instance, numerically smaller cattle breeds, but also pig and chicken breeders, who have to choose wisely how to spend their sequencing efforts over all the breeds or lines they evaluate. Sequence data from cattle breeds was used, because there are currently relatively many individuals from several breeds sequenced within the 1,000 Bull Genomes project. The advantage of whole-genome sequence data is that it carries the causal mutations, but the question is whether it is possible to impute the causal variants accurately. This study therefore focussed on imputation accuracy of variants with low minor allele frequency and breed specific variants. RESULTS: Imputation accuracy was assessed for chromosomes 1 and 29 as the correlation between observed and imputed genotypes. For chromosome 1, the average imputation accuracy was 0.70 with a reference population of 20 Holstein, and increased to 0.83 when the reference population was increased by including 3 other dairy breeds with 20 animals each. When the same number of animals from the Holstein breed were added the accuracy improved to 0.88, while adding the 3 other breeds to the reference population of 80 Holstein improved the average imputation accuracy marginally to 0.89. For chromosome 29, the average imputation accuracy was lower. Some variants benefitted from the inclusion of other breeds in the reference population, initially determined by the MAF of the variant in each breed, but even Holstein specific variants did gain imputation accuracy from the multi-breed reference population.
CONCLUSIONS: This study shows that splitting sequencing effort over multiple breeds and combining the reference populations is a good strategy for imputation from high-density SNP panels towards whole-genome sequence when reference populations are small and sequencing effort is limiting. When sequencing effort is limiting and interest lies in multiple breeds or lines, this strategy enables imputation for each breed.

Subjects

Cattle/genetics, Single Nucleotide Polymorphism, Animals, Base Sequence, Breeding, Gene Frequency, Genome, High-Throughput Nucleotide Sequencing, Male, DNA Sequence Analysis, Species Specificity

19.

Imputation of non-genotyped individuals based on genotyped relatives: assessing the imputation accuracy of a real case scenario in dairy cattle.

Bouwman, Aniek C; Hickey, John M; Calus, Mario P L; Veerkamp, Roel F.

Genet Sel Evol ; 46: 6, 2014 Feb 03.

Article in English | MEDLINE | ID: mdl-24490796

ABSTRACT

BACKGROUND: Imputation of genotypes for ungenotyped individuals could enable the use of valuable phenotypes created before the genomic era in analyses that require genotypes. The objective of this study was to investigate the accuracy of imputation of non-genotyped individuals using genotype information from relatives. METHODS: Genotypes were simulated for all individuals in the pedigree of a real (historical) dataset of phenotyped dairy cows and with part of the pedigree genotyped. The software AlphaImpute was used for imputation in its standard settings but also without phasing, i.e. using basic inheritance rules and segregation analysis only. Different scenarios were evaluated, i.e.: (1) the real data scenario, (2) addition of genotypes of sires and maternal grandsires of the ungenotyped individuals, and (3) addition of one, two, or four genotyped offspring of the ungenotyped individuals to the reference population. RESULTS: The imputation accuracy using AlphaImpute in its standard settings was lower than without phasing. Including genotypes of sires and maternal grandsires in the reference population improved imputation accuracy, i.e. the correlation of the true genotypes with the imputed genotype dosages, corrected for mean gene content, across all animals increased from 0.47 (real situation) to 0.60. Including one, two and four genotyped offspring increased the accuracy of imputation across all animals from 0.57 (no offspring) to 0.73, 0.82, and 0.92, respectively. CONCLUSIONS: At present, the use of basic inheritance rules and segregation analysis appears to be the best imputation method for ungenotyped individuals. Comparison of our empirical animal-specific imputation accuracies to predictions based on selection index theory suggested that not correcting for mean gene content considerably overestimates the true accuracy. Imputation of ungenotyped individuals can help to include valuable phenotypes for genome-wide association studies or for genomic prediction, especially when the ungenotyped individuals have genotyped offspring.
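
The accuracy measure described here, the correlation between true genotypes and imputed dosages with and without correction for mean gene content, can be sketched as below. The sketch assumes that mean gene content is twice the allele frequency at each SNP and that the correction subtracts it from both true and imputed values; the function names are hypothetical, and returning 0.0 for a zero-variance input is a convention chosen for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def imputation_accuracy(true_genotypes, imputed_dosages, allele_freqs=None):
    """Animal-specific accuracy across SNPs.

    If allele_freqs is given, the mean gene content 2p of every SNP is
    subtracted from both vectors before correlating them.
    """
    if allele_freqs is None:
        return pearson(true_genotypes, imputed_dosages)
    true_c = [g - 2.0 * p for g, p in zip(true_genotypes, allele_freqs)]
    imputed_c = [d - 2.0 * p for d, p in zip(imputed_dosages, allele_freqs)]
    return pearson(true_c, imputed_c)


freqs = [0.1, 0.5, 0.9, 0.8, 0.2]
true_genotypes = [0, 1, 2, 2, 0]
# An uninformative "imputation" that just returns the population mean.
mean_only = [2.0 * p for p in freqs]
print(imputation_accuracy(true_genotypes, mean_only))
print(imputation_accuracy(true_genotypes, mean_only, freqs))
```

The example shows why the correction matters: dosages that carry no animal-specific information still correlate strongly with the true genotypes across SNPs, because both follow the allele frequencies, while the corrected measure reports no accuracy.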

Subjects

Cattle/genetics, Genotype, Phenotype, Algorithms, Animals, Breeding, Genome, Genetic Models, Software

20.

Exploring causal networks of bovine milk fatty acids in a multivariate mixed model context.

Bouwman, Aniek C; Valente, Bruno D; Janss, Luc L G; Bovenhuis, Henk; Rosa, Guilherme J M.

Genet Sel Evol ; 46: 2, 2014 Jan 17.

Article in English | MEDLINE | ID: mdl-24438068

ABSTRACT

BACKGROUND: Knowledge regarding causal relationships among traits is important to understand complex biological systems. Structural equation models (SEM) can be used to quantify the causal relations between traits, which allow prediction of outcomes to interventions applied to such a network. Such models are fitted conditionally on a causal structure among traits, represented by a directed acyclic graph and an Inductive Causation (IC) algorithm can be used to search for causal structures. The aim of this study was to explore the space of causal structures involving bovine milk fatty acids and to select a network supported by data as the structure of a SEM. RESULTS: The IC algorithm adapted to mixed models settings was applied to study 14 correlated bovine milk fatty acids, resulting in an undirected network. The undirected pathway from C4:0 to C12:0 resembled the de novo synthesis pathway of short and medium chain saturated fatty acids. By using prior knowledge, directions were assigned to that part of the network and the resulting structure was used to fit a SEM that led to structural coefficients ranging from 0.85 to 1.05. The deviance information criterion indicated that the SEM was more plausible than the multi-trait model. CONCLUSIONS: The IC algorithm output pointed towards causal relations between the studied traits. This changed the focus from marginal associations between traits to direct relationships, thus towards relationships that may result in changes when external interventions are applied. The causal structure can give more insight into underlying mechanisms and the SEM can predict conditional changes due to such interventions.

Subjects

Algorithms, Fatty Acids/analysis, Milk/chemistry, Animals, Cattle, Fatty Acids/genetics, Genetic Models, Phenotype
