Results 1 - 20 of 85
1.
Theor Appl Genet ; 136(4): 74, 2023 Mar 23.
Article in English | MEDLINE | ID: mdl-36952013

ABSTRACT

KEY MESSAGE: For genomic selection in clonally propagated crops with diploid (-like) meiotic behavior to be effective, crossing parents should be selected based on genomic predicted cross-performance unless dominance is negligible. For genomic selection (GS) in clonal breeding programs to be effective, parents should be selected based on genomic predicted cross-performance unless dominance is negligible. Genomic prediction of cross-performance enables efficient exploitation of the additive and dominance value simultaneously. Here, we compared different GS strategies for clonally propagated crops with diploid (-like) meiotic behavior, using strawberry as an example. We used stochastic simulation to evaluate six combinations of three breeding programs and two parent selection methods. The three breeding programs included (1) a breeding program that introduced GS in the first clonal stage, and (2) two variations of a two-part breeding program with one and three crossing cycles per year, respectively. The two parent selection methods were (1) parent selection based on genomic estimated breeding values (GEBVs) and (2) parent selection based on genomic predicted cross-performance (GPCP). Selection of parents based on GPCP produced faster genetic gain than selection of parents based on GEBVs because it reduced inbreeding when the dominance degree increased. The two-part breeding programs with one and three crossing cycles per year using GPCP always produced the most genetic gain unless dominance was negligible. We conclude that (1) in clonal breeding programs with GS, parents should be selected based on GPCP, and (2) a two-part breeding program with parent selection based on GPCP to rapidly drive population improvement has great potential to improve breeding clonally propagated crops.
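
To illustrate why predicted cross-performance can differ from the average of parental breeding values when dominance is present, here is a minimal Python sketch of the expected progeny mean of a biparental cross. It assumes diploid inheritance, biallelic loci that contribute independently, and genotypic values of +a, d and -a for dosages 2, 1 and 0. The function and argument names are hypothetical, and this is not the simulation code used in the study.

```python
def expected_cross_mean(dosage_p1, dosage_p2, a, d):
    """Expected progeny genetic value of a cross, summed over loci.

    dosage_p1, dosage_p2: allele dosages (0, 1 or 2) of each parent per locus.
    a, d: additive and dominance effects per locus (genotypic values are
          +a for dosage 2, d for dosage 1 and -a for dosage 0).
    """
    total = 0.0
    for x1, x2, a_j, d_j in zip(dosage_p1, dosage_p2, a, d):
        # Probability that each parent transmits the counted allele.
        p1, p2 = x1 / 2.0, x2 / 2.0
        # Probability that the progeny is heterozygous at this locus.
        prob_het = p1 * (1 - p2) + p2 * (1 - p1)
        # P(dosage 2) - P(dosage 0) simplifies to p1 + p2 - 1.
        total += a_j * (p1 + p2 - 1) + d_j * prob_het
    return total
```

With d = 0 the result reduces to the mean of the two parental additive values, which is consistent with the abstract's statement that the two parent selection methods only diverge when dominance is not negligible.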


Subjects
Plant Breeding; Selection, Genetic; Plant Breeding/methods; Genome; Genomics/methods; Inbreeding; Crops, Agricultural/genetics; Models, Genetic
2.
Genet Sel Evol ; 55(1): 36, 2023 Jun 02.
Article in English | MEDLINE | ID: mdl-37268883

ABSTRACT

BACKGROUND: In breeding programmes, the observed genetic change is a sum of the contributions of different selection paths represented by groups of individuals. Quantifying these sources of genetic change is essential for identifying the key breeding actions and optimizing breeding programmes. However, it is difficult to disentangle the contribution of individual paths due to the inherent complexity of breeding programmes. Here we extend the previously developed method for partitioning genetic mean by paths of selection to work both with the mean and variance of breeding values. METHODS: First, we extended the partitioning method to quantify the contribution of different paths to genetic variance assuming that the breeding values are known. Second, we combined the partitioning method with the Markov Chain Monte Carlo approach to draw samples from the posterior distribution of breeding values and use these samples for computing the point and interval estimates of partitions for the genetic mean and variance. We implemented the method in the R package AlphaPart. We demonstrated the method with a simulated cattle breeding programme. RESULTS: We show how to quantify the contribution of different groups of individuals to genetic mean and variance and that the contributions of different selection paths to genetic variance are not necessarily independent. Finally, we observed that the partitioning method under the pedigree-based model has some limitations, which suggests the need for a genomic extension. CONCLUSIONS: We presented a partitioning method to quantify sources of change in genetic mean and variance in breeding programmes. The method can help breeders and researchers understand the dynamics in genetic mean and variance in a breeding programme. The developed method for partitioning genetic mean and variance is a powerful method for understanding how different selection paths interact within a breeding programme and how they can be optimised.
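
The idea of partitioning breeding values by selection path can be sketched as follows. This minimal Python example assumes known breeding values and a pedigree ordered with parents before progeny. Each individual inherits half of each parent's partition and credits its own deviation from the parent average to its own path. All names are hypothetical, and this is not the AlphaPart interface.

```python
def partition_breeding_values(pedigree, breeding_value, path):
    """Split each breeding value into contributions of selection paths.

    pedigree: list of (individual, sire, dam), parents listed before progeny;
              unknown parents are None.
    breeding_value: dict individual -> breeding value (assumed known).
    path: dict individual -> label of the path the individual belongs to.
    Returns dict individual -> {path label: contribution}.
    """
    labels = sorted(set(path.values()))
    parts = {}
    for ind, sire, dam in pedigree:
        contrib = {label: 0.0 for label in labels}
        parent_average = 0.0
        for parent in (sire, dam):
            if parent is not None:
                parent_average += 0.5 * breeding_value[parent]
                for label in labels:
                    contrib[label] += 0.5 * parts[parent][label]
        # The deviation from the parent average (the whole value for
        # founders) is credited to the individual's own path.
        contrib[path[ind]] += breeding_value[ind] - parent_average
        parts[ind] = contrib
    return parts
```

The contributions of each individual sum to its breeding value. Repeating the calculation for every posterior sample of breeding values would give the point and interval estimates that the abstract describes.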


Subjects
Genome; Genomics; Animals; Cattle/genetics; Monte Carlo Method; Pedigree; Markov Chains; Models, Genetic; Selection, Genetic
3.
Genet Sel Evol ; 55(1): 31, 2023 May 09.
Article in English | MEDLINE | ID: mdl-37161307

ABSTRACT

BACKGROUND: The Western honeybee is an economically important species globally, but has been experiencing colony losses that lead to economic damage and decreased genetic variability. This situation is spurring additional interest in honeybee breeding and conservation programs. Stochastic simulators are essential tools for rapid and low-cost testing of breeding programs and methods, yet no existing simulator allows for a detailed simulation of honeybee populations. Here we describe SIMplyBee, a holistic simulator of honeybee populations and breeding programs. SIMplyBee is an R package and hence freely available for installation from CRAN (http://cran.r-project.org/package=SIMplyBee). IMPLEMENTATION: SIMplyBee builds upon the stochastic simulator AlphaSimR that simulates individuals with their corresponding genomes and quantitative genetic values. To enable honeybee-specific simulations, we extended AlphaSimR by developing classes for global simulation parameters, SimParamBee, for a honeybee colony, Colony, and multiple colonies, MultiColony. We also developed functions to address major honeybee specificities: honeybee genome, haplodiploid inheritance, social organisation, complementary sex determination, polyandry, colony events, and quantitative genetics at the individual- and colony-levels. RESULTS: We describe its implementation for simulating a honeybee genome, creating a honeybee colony and its members, addressing haplodiploid inheritance and complementary sex determination, simulating colony events, creating and managing multiple colonies at the same time, and obtaining genomic data and honeybee quantitative genetics. Further documentation, available at http://www.SIMplyBee.info, provides details on these operations and describes additional operations related to genomics, quantitative genetics, and other functionalities. DISCUSSION: SIMplyBee is a holistic simulator of honeybee populations and breeding programs.
It simulates individual honeybees with their genomes, colonies with colony events, and individual- and colony-level genetic and breeding values. Regarding the latter, SIMplyBee takes a user-defined function to combine individual- into colony-level values and hence allows for modeling any type of interaction within a colony. SIMplyBee provides a research platform for testing breeding and conservation strategies and their effect on future genetic gain and genetic variability. Future developments of SIMplyBee will focus on improving the simulation of honeybee genomes, optimizing the simulator's performance, and including spatial awareness in mating functions and phenotype simulation. We invite the honeybee genetics and breeding community to join us in the future development of SIMplyBee.
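
The two honeybee specificities named above, haplodiploid inheritance and complementary sex determination (csd), can be illustrated with a short Python sketch. It assumes independent loci (no linkage or recombination model), a queen stored as a list of allele pairs, and a drone stored as a list of single alleles. The function names are hypothetical and do not correspond to the SIMplyBee or AlphaSimR interfaces.

```python
import random


def make_drone(queen, rng):
    """Drones are haploid: one allele per locus drawn from the queen."""
    return [rng.choice(pair) for pair in queen]


def make_diploid_offspring(queen, drone, rng):
    """Fertilised eggs get one maternal allele per locus plus the drone's allele."""
    return [(rng.choice(pair), paternal) for pair, paternal in zip(queen, drone)]


def is_viable_worker(offspring, csd_locus=0):
    """Homozygosity at the csd locus yields a diploid drone, not a worker."""
    maternal, paternal = offspring[csd_locus]
    return maternal != paternal


rng = random.Random(1)
queen = [("csd1", "csd2"), (0, 1)]            # locus 0 is the csd locus
father = ["csd3", 1]                          # unrelated drone, new csd allele
egg = make_diploid_offspring(queen, father, rng)
print(is_viable_worker(egg))                  # the parents share no csd allele
```

A mating between a queen and a drone that share a csd allele would make a fraction of the fertilised eggs homozygous at that locus, which is one reason a simulator needs to track this locus explicitly.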


Subjects
Genomics; Inheritance Patterns; Bees/genetics; Animals; Computer Simulation; Phenotype; Reproduction
4.
Genet Sel Evol ; 55(1): 42, 2023 Jun 15.
Article in English | MEDLINE | ID: mdl-37322449

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) aim at identifying genomic regions involved in phenotype expression, but identifying causative variants is difficult. Pig Combined Annotation Dependent Depletion (pCADD) scores provide a measure of the predicted consequences of genetic variants. Incorporating pCADD into the GWAS pipeline may help identify causative variants. Our objective was to identify genomic regions associated with loin depth and muscle pH, and identify regions of interest for fine-mapping and further experimental work. Genotypes for ~40,000 single nucleotide polymorphisms (SNPs) were used to perform GWAS for these two traits, using de-regressed breeding values (dEBV) for 329,964 pigs from four commercial lines. Imputed sequence data were used to identify SNPs in strong ([Formula: see text] 0.80) linkage disequilibrium with lead GWAS SNPs with the highest pCADD scores. RESULTS: Fifteen distinct regions were associated with loin depth and one with loin pH at genome-wide significance. Regions on chromosomes 1, 2, 5, 7, and 16 explained between 0.06 and 3.55% of the additive genetic variance and were strongly associated with loin depth. Only a small part of the additive genetic variance in muscle pH was attributed to SNPs. The results of our pCADD analysis suggest that high-scoring pCADD variants are enriched for missense mutations. Two close but distinct regions on SSC1 were associated with loin depth, and pCADD identified the previously identified missense variant within the MC4R gene for one of the lines. For loin pH, pCADD identified a synonymous variant in the RNF25 gene (SSC15) as the most likely candidate for the muscle pH association. The missense mutation in the PRKAG3 gene known to affect glycogen content was not prioritised by pCADD for loin pH. CONCLUSIONS: For loin depth, we identified several strong candidate regions for further statistical fine-mapping that are supported in the literature, and two novel regions.
For loin muscle pH, we identified one previously identified associated region. We found mixed evidence for the utility of pCADD as an extension of heuristic fine-mapping. The next step is to perform more sophisticated fine-mapping and expression quantitative trait loci (eQTL) analysis, and then interrogate candidate variants in vitro by perturbation-CRISPR assays.


Subjects
Genome-Wide Association Study; Muscles; Swine/genetics; Animals; Genome-Wide Association Study/methods; Genotype; Quantitative Trait Loci; Phenotype; Hydrogen-Ion Concentration; Polymorphism, Single Nucleotide
5.
Theor Appl Genet ; 135(10): 3393-3415, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36066596

ABSTRACT

KEY MESSAGE: The integration of known and latent environmental covariates within a single-stage genomic selection approach provides breeders with an informative and practical framework to utilise genotype by environment interaction for prediction into current and future environments. This paper develops a single-stage genomic selection approach which integrates known and latent environmental covariates within a special factor analytic framework. The factor analytic linear mixed model of Smith et al. (2001) is an effective method for analysing multi-environment trial (MET) datasets, but has limited practicality since the underlying factors are latent so the modelled genotype by environment interaction (GEI) is observable, rather than predictable. The advantage of using random regressions on known environmental covariates, such as soil moisture and daily temperature, is that the modelled GEI becomes predictable. The integrated factor analytic linear mixed model (IFA-LMM) developed in this paper includes a model for predictable and observable GEI in terms of a joint set of known and latent environmental covariates. The IFA-LMM is demonstrated on a late-stage cotton breeding MET dataset from Bayer CropScience. The results show that the known covariates predominately capture crossover GEI and explain 34.4% of the overall genetic variance. The most notable covariates are maximum downward solar radiation (10.1%), average cloud cover (4.5%) and maximum temperature (4.0%). The latent covariates predominately capture non-crossover GEI and explain 40.5% of the overall genetic variance. The results also show that the average prediction accuracy of the IFA-LMM is [Formula: see text] higher than conventional random regression models for current environments and [Formula: see text] higher for future environments. 
The IFA-LMM is therefore an effective method for analysing MET datasets which also utilises crossover and non-crossover GEI for genomic prediction into current and future environments. This is becoming increasingly important with the emergence of rapidly changing environments and climate change.


Subjects
Gene-Environment Interaction; Models, Genetic; Genomics; Genotype; Plant Breeding; Soil
6.
Heredity (Edinb) ; 128(1): 21-32, 2022 01.
Article in English | MEDLINE | ID: mdl-34912044

ABSTRACT

Genetic variance is a central parameter in quantitative genetics and breeding. Assessing changes in genetic variance over time as well as the genome is therefore of high interest. Here, we extend a previously proposed framework for temporal analysis of genetic variance using the pedigree-based model, to a new framework for temporal and genomic analysis of genetic variance using marker-based models. To this end, we describe the theory of partitioning genetic variance into genic variance and within-chromosome and between-chromosome linkage-disequilibrium, and how to estimate these variance components from a marker-based model fitted to observed phenotype and marker data. The new framework involves three steps: (i) fitting a marker-based model to data, (ii) sampling realisations of marker effects from the fitted model and for each sample calculating realisations of genetic values and (iii) calculating the variance of sampled genetic values by time and genome partitions. Analysing time partitions indicates breeding programme sustainability, while analysing genome partitions indicates contributions from chromosomes and chromosome pairs and linkage-disequilibrium. We demonstrate the framework with a simulated breeding programme involving a complex trait. Results show good concordance between simulated and estimated variances, provided that the fitted model is capturing genetic complexity of a trait. We observe a reduction of genetic variance due to selection and drift changing allele frequencies, and due to selection inducing negative linkage-disequilibrium.
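
Step (iii) of the framework, splitting the variance of genetic values into a genic part and a linkage-disequilibrium part, can be sketched for a single sampled realisation of marker effects. This minimal Python example makes two assumptions of its own: the genic part is taken as the sum of the empirical per-marker variances, and all between-marker covariances are pooled into one linkage-disequilibrium term rather than split within and between chromosomes. The names are hypothetical.

```python
def variance_partition(genotypes, effects):
    """Split the variance of genetic values into genic and LD parts.

    genotypes: list of individuals, each a list of allele dosages per marker.
    effects: one sampled realisation of marker effects.
    """
    n = len(genotypes)
    m = len(effects)

    def cov(u, v):
        mean_u, mean_v = sum(u) / n, sum(v) / n
        return sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v)) / n

    # Per-marker contributions (dosage times effect) for every individual.
    contrib = [[row[j] * effects[j] for row in genotypes] for j in range(m)]
    values = [sum(contrib[j][i] for j in range(m)) for i in range(n)]
    total = cov(values, values)
    genic = sum(cov(c, c) for c in contrib)
    # Twice the sum of covariances over distinct marker pairs.
    ld = 2 * sum(
        cov(contrib[j], contrib[k]) for j in range(m) for k in range(j + 1, m)
    )
    return {"total": total, "genic": genic, "ld": ld}
```

When two perfectly correlated markers carry effects of opposite sign, the linkage-disequilibrium term is negative and cancels the genic variance, which mirrors the abstract's observation that selection reduces genetic variance by inducing negative linkage-disequilibrium.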


Subjects
Breeding; Models, Genetic; Selection, Genetic; Genome; Genomics/methods; Linkage Disequilibrium; Pedigree; Phenotype
7.
Genet Sel Evol ; 54(1): 76, 2022 Nov 22.
Article in English | MEDLINE | ID: mdl-36418945

ABSTRACT

BACKGROUND: By entering the era of mega-scale genomics, we are facing many computational issues with standard genomic evaluation models due to their dense data structure and cubic computational complexity. Several scalable approaches have been proposed to address this challenge, such as the Algorithm for Proven and Young (APY). In APY, genotyped animals are partitioned into core and non-core subsets, which induces a sparser inverse of the genomic relationship matrix. This partitioning is often done at random. While APY is a good approximation of the full model, random partitioning can make results unstable, possibly affecting accuracy or even reranking animals. Here we present a stable optimisation of the core subset by choosing animals with the most informative genotype data. METHODS: We derived a novel algorithm for optimising the core subset based on a conditional genomic relationship matrix or a conditional single nucleotide polymorphism (SNP) genotype matrix. We compared the accuracy of genomic predictions with different core subsets for simulated and real pig data sets. The core subsets were constructed (1) at random, (2) based on the diagonal of the genomic relationship matrix, (3) at random with weights from (2), or (4) based on the novel conditional algorithm. To understand the different core subset constructions, we visualise the population structure of the genotyped animals with linear Principal Component Analysis and non-linear Uniform Manifold Approximation and Projection. RESULTS: All core subset constructions performed equally well when the number of core animals captured most of the variation in the genomic relationships, both in simulated and real data sets. When the number of core animals was not sufficiently large, there was substantial variability in the results with the random construction but no variability with the conditional construction. 
Visualisation of the population structure and chosen core animals showed that the conditional construction spreads core animals across the whole domain of genotyped animals in a repeatable manner. CONCLUSIONS: Our results confirm that the size of the core subset in APY is critical. Furthermore, the results show that the core subset can be optimised with the conditional algorithm that achieves an optimal and repeatable spread of core animals across the domain of genotyped animals.
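
The abstract does not give the details of the conditional algorithm, so the following Python sketch shows only one plausible reading of it, not the authors' method. It greedily picks the animal with the largest remaining variance and then conditions the genomic relationship matrix on that animal. The function name and the tolerance are hypothetical.

```python
def select_core(G, n_core):
    """Greedy core selection on a genomic relationship matrix G (list of lists).

    At each step the animal with the largest remaining (conditional) variance
    is added to the core, and the matrix is conditioned on that animal.
    """
    n = len(G)
    cond = [row[:] for row in G]  # work on a copy, leave G untouched
    core = []
    for _ in range(n_core):
        candidates = [i for i in range(n) if i not in core]
        pick = max(candidates, key=lambda i: cond[i][i])
        if cond[pick][pick] <= 1e-12:
            break  # remaining animals carry no new information
        core.append(pick)
        column = [cond[i][pick] for i in range(n)]
        pivot = cond[pick][pick]
        for i in range(n):
            for j in range(n):
                cond[i][j] -= column[i] * column[j] / pivot
    return core


# Animals 0 and 1 are identical, animal 2 is unrelated to both.
G = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 0.0],
     [0.0, 0.0, 0.9]]
print(select_core(G, 2))  # [0, 2]
```

Because the selection is deterministic, repeated runs return the same core, whereas a random draw of two animals could pick the two identical ones and leave the third unrepresented.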


Subjects
Genome; Models, Genetic; Swine; Animals; Genomics/methods; Genotype; Algorithms
8.
Genet Sel Evol ; 54(1): 39, 2022 Jun 03.
Article in English | MEDLINE | ID: mdl-35659233

ABSTRACT

BACKGROUND: It is expected that functional, mainly missense and loss-of-function (LOF), and regulatory variants are responsible for most phenotypic differences between breeds and genetic lines of livestock species that have undergone diverse selection histories. However, there is still limited knowledge about the existing missense and LOF variation in commercial livestock populations, in particular regarding population-specific variation and how it can affect applications such as across-breed genomic prediction. METHODS: We re-sequenced the whole genome of 7848 individuals from nine commercial pig lines (average sequencing coverage: 4.1×) and imputed whole-genome genotypes for 440,610 pedigree-related individuals. The called variants were categorized according to predicted functional annotation (from LOF to intergenic) and prevalence level (number of lines in which the variant segregated; from private to widespread). Variants in each category were examined in terms of their distribution along the genome, alternative allele frequency, per-site Wright's fixation index (FST), individual load, and association to production traits. RESULTS: Of the 46 million called variants, 28% were private (called in only one line) and 21% were widespread (called in all nine lines). Genomic regions with a low recombination rate were enriched with private variants. Low-prevalence variants (called in one or a few lines only) were enriched for lower allele frequencies, lower FST, and putatively functional and regulatory roles (including LOF and deleterious missense variants). On average, individuals carried fewer private deleterious missense alleles than expected compared to alleles with other predicted consequences. Only a small subset of the low-prevalence variants had intermediate allele frequencies and explained small fractions of phenotypic variance (up to 3.2%) of production traits. The significant low-prevalence variants had higher per-site FST than the non-significant ones. 
These associated low-prevalence variants were tagged by other more widespread variants in high linkage disequilibrium, including intergenic variants. CONCLUSIONS: Most low-prevalence variants have low minor allele frequencies and only a small subset of low-prevalence variants contributed detectable fractions of phenotypic variance of production traits. Accounting for low-prevalence variants is therefore unlikely to noticeably benefit across-breed analyses, such as the prediction of genomic breeding values in a population using reference populations of a different genetic background.
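
The prevalence categories used above (private for one line, widespread for all lines) can be expressed as a small Python helper. This is a sketch under the assumption that each variant comes with the list of lines in which it segregates. The function name and the label for intermediate categories are hypothetical.

```python
def classify_prevalence(segregating_lines, n_lines):
    """Label a variant by the number of lines in which it segregates."""
    k = len(set(segregating_lines))
    if k == 0:
        return "not segregating"
    if k == 1:
        return "private"
    if k == n_lines:
        return "widespread"
    return f"segregating in {k} lines"
```

For example, with nine lines a variant called only in line "A" would be labelled private, and one called in all nine would be labelled widespread.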


Subjects
Genome; Polymorphism, Single Nucleotide; Animals; Gene Frequency; Genetic Variation; Genomics; Genotype; Swine/genetics
9.
Genet Sel Evol ; 54(1): 65, 2022 Sep 24.
Article in English | MEDLINE | ID: mdl-36153511

ABSTRACT

BACKGROUND: Early simulations indicated that whole-genome sequence data (WGS) could improve the accuracy of genomic predictions within and across breeds. However, empirical results have been ambiguous so far. Large datasets that capture most of the genomic diversity in a population must be assembled so that allele substitution effects are estimated with high accuracy. The objectives of this study were to use a large pig dataset from seven intensely selected lines to assess the benefits of using WGS for genomic prediction compared to using commercial marker arrays and to identify scenarios in which WGS provides the largest advantage. METHODS: We sequenced 6931 individuals from seven commercial pig lines with different numerical sizes. Genotypes of 32.8 million variants were imputed for 396,100 individuals (17,224 to 104,661 per line). We used BayesR to perform genomic prediction for eight complex traits. Genomic predictions were performed using either data from a standard marker array or variants preselected from WGS based on association tests. RESULTS: The accuracies of genomic predictions based on preselected WGS variants were not robust across traits and lines and the improvements in prediction accuracy that we achieved so far with WGS compared to standard marker arrays were generally small. The most favourable results for WGS were obtained when the largest training sets were available and standard marker arrays were augmented with preselected variants with statistically significant associations to the trait. With this method and training sets of around 80k individuals, the accuracy of within-line genomic predictions was on average improved by 0.025. With multi-line training sets, improvements of 0.04 compared to marker arrays could be expected. CONCLUSIONS: Our results showed that WGS has limited potential to improve the accuracy of genomic predictions compared to marker arrays in intensely selected pig lines. 
Thus, although we expect that larger improvements in accuracy from the use of WGS are possible with a combination of larger training sets and optimised pipelines for generating and analysing such datasets, the use of WGS in the current implementations of genomic prediction should be carefully evaluated against the cost of large-scale WGS data on a case-by-case basis.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Alleles; Animals; Genomics/methods; Genotype; Swine/genetics
10.
Bioinformatics ; 36(15): 4369-4371, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32467963

ABSTRACT

SUMMARY: AlphaFamImpute is an imputation package for calling, phasing and imputing genome-wide genotypes in outbred full-sib families from single nucleotide polymorphism (SNP) array and genotype-by-sequencing (GBS) data. GBS data are increasingly being used to genotype individuals, especially when SNP arrays do not exist for a population of interest. Low-coverage GBS produces data with a large number of missing or incorrect naïve genotype calls, which can be improved by identifying shared haplotype segments between full-sib individuals. Here, we present AlphaFamImpute, an algorithm specifically designed to exploit the genetic structure of full-sib families. It performs imputation using a two-step approach. In the first step, it phases and imputes parental genotypes based on the segregation states of their offspring (i.e. which pair of parental haplotypes the offspring inherited). In the second step, it phases and imputes the offspring genotypes by detecting which haplotype segments the offspring inherited from their parents. With a series of simulations, we find that AlphaFamImpute obtains high-accuracy genotypes, even when the parents are not genotyped and individuals are sequenced at <1x coverage. AVAILABILITY AND IMPLEMENTATION: AlphaFamImpute is available as a Python package from the AlphaGenes website http://www.AlphaGenes.roslin.ed.ac.uk/AlphaFamImpute. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome; Polymorphism, Single Nucleotide; Genome-Wide Association Study; Genotype; Haplotypes; Humans
11.
Genet Sel Evol ; 53(1): 30, 2021 Mar 18.
Article in English | MEDLINE | ID: mdl-33736590

ABSTRACT

BACKGROUND: In this paper, we present the AlphaPart R package, an open-source implementation of a method for partitioning breeding values and genetic trends to identify the contribution of selection pathways to genetic gain. Breeding programmes improve populations for a set of traits, which can be measured with a genetic trend calculated from estimated breeding values averaged by year of birth. While sources of the overall genetic gain are generally known, their realised contributions are hard to quantify in complex breeding programmes. The aim of this paper is to present the AlphaPart R package and demonstrate it with a simulated stylized multi-tier breeding programme mimicking a pig or poultry breeding programme. RESULTS: The package includes the main partitioning function AlphaPart, that partitions the breeding values and genetic trends by pre-defined selection paths, and a set of functions for handling data and results. The package is freely available from the CRAN repository at http://CRAN.R-project.org/package=AlphaPart . We demonstrate the use of the package by partitioning the nucleus and multiplier genetic gain of the stylized breeding programme by tier-gender paths. For traits measured and selected in the multiplier, the multiplier selection generated additional genetic gain. By using AlphaPart, we show that the additional genetic gain depends on accuracy and intensity of selection in the multiplier and the extent of gene flow from the nucleus. We have proven that AlphaPart is a valuable tool for understanding the sources of genetic gain in the nucleus and especially the multiplier, and the relationship between the sources and parameters that affect them. CONCLUSIONS: AlphaPart implements the method for partitioning breeding values and genetic trends and provides a useful tool for quantifying the sources of genetic gain in breeding programmes. 
The use of AlphaPart will help breeders to improve genetic gain through a better understanding of the key selection points that are driving gains in each trait.


Subjects
Breeding/methods; Models, Genetic; Quantitative Trait, Heritable; Animals; Genetic Fitness; Poultry/genetics; Software; Swine/genetics
12.
Genet Sel Evol ; 53(1): 54, 2021 Jun 25.
Article in English | MEDLINE | ID: mdl-34171988

ABSTRACT

BACKGROUND: Meiotic recombination results in the exchange of genetic material between homologous chromosomes. Recombination rate varies between different parts of the genome, between individuals, and is influenced by genetics. In this paper, we assessed the genetic variation in recombination rate along the genome and between individuals in the pig using multilocus iterative peeling on 150,000 individuals across nine genotyped pedigrees. We used these data to estimate the heritability of recombination and perform a genome-wide association study of recombination in the pig. RESULTS: Our results confirmed known features of the recombination landscape of the pig genome, including differences in genetic length of chromosomes and marked sex differences. The recombination landscape was repeatable between lines, but at the same time, there were differences in average autosome-wide recombination rate between lines. The heritability of autosome-wide recombination rate was low but not zero (on average 0.07 for females and 0.05 for males). We found six genomic regions that are associated with recombination rate, among which five harbour known candidate genes involved in recombination: RNF212, SHOC1, SYCP2, MSH4 and HFM1. CONCLUSIONS: Our results on the variation in recombination rate in the pig genome agree with those reported for other vertebrates, with a low but nonzero heritability, and the identification of a major quantitative trait locus for recombination rate that is homologous to that detected in several other species. This work also highlights the utility of using large-scale livestock data to understand biological processes.


Subjects
Genetic Variation; Recombination, Genetic; Swine/genetics; Animals; Female; Genetic Loci; Male; Pedigree
13.
Genet Sel Evol ; 53(1): 70, 2021 Sep 08.
Article in English | MEDLINE | ID: mdl-34496773

ABSTRACT

BACKGROUND: Body weight (BW) is an economically important trait in the broiler (meat-type chickens) industry. Under the assumption of polygenicity, a "large" number of genes with "small" effects is expected to control BW. To detect such effects, a large sample size is required in genome-wide association studies (GWAS). Our objective was to conduct a GWAS for BW measured at 35 days of age with a large sample size. METHODS: The GWAS included 137,343 broilers spanning 15 pedigree generations and 392,295 imputed single nucleotide polymorphisms (SNPs). A false discovery rate of 1% was adopted to account for multiple testing when declaring significant SNPs. A Bayesian ridge regression model was implemented, using AlphaBayes, to estimate the contribution to the total genetic variance of each region harbouring significant SNPs (1 Mb up/downstream) and the combined regions harbouring non-significant SNPs. RESULTS: GWAS revealed 25 genomic regions harbouring 96 significant SNPs on 13 Gallus gallus autosomes (GGA1 to 4, 8, 10 to 15, 19 and 27), with the strongest associations on GGA4 at 65.67-66.31 Mb (Galgal4 assembly). The association of these regions points to several strong candidate genes including: (i) growth factors (GGA1, 4, 8, 13 and 14); (ii) leptin receptor overlapping transcript (LEPROT)/leptin receptor (LEPR) locus (GGA8), and the STAT3/STAT5B locus (GGA27), in connection with the JAK/STAT signalling pathway; (iii) T-box gene (TBX3/TBX5) on GGA15 and CHST11 (GGA1), which are both related to heart/skeleton development; and (iv) PLAG1 (GGA2). Combined together, these 25 genomic regions explained ~30% of the total genetic variance. The region harbouring significant SNPs that explained the largest portion of the total genetic variance (4.37%) was on GGA4 (~65.67-66.31 Mb). CONCLUSIONS: To the best of our knowledge, this is the largest GWAS that has been conducted for BW in chicken to date.
In spite of the identified regions, which showed a strong association with BW, the high proportion of genetic variance attributed to regions harbouring non-significant SNPs supports the hypothesis that the genetic architecture of BW35 is polygenic and complex. Our results also suggest that a large sample size will be required for future GWAS of BW35.
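
The abstract states that a false discovery rate of 1% was used to declare significant SNPs but does not name the procedure. The Python sketch below assumes the Benjamini-Hochberg step-up procedure, which is a common choice and may differ from what the authors applied. The function name is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return indices of tests declared significant at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])
```

All tests ranked at or below the cutoff are declared significant, even if an individual p-value among them exceeds its own threshold, which is what distinguishes the step-up rule from a fixed per-test cutoff.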


Subjects
Body Weight/genetics; Chickens/anatomy & histology; Chickens/genetics; Genome-Wide Association Study; Animals; Bayes Theorem; Female; Multifactorial Inheritance/genetics; Time Factors
14.
Genet Sel Evol ; 53(1): 76, 2021 Sep 22.
Article in English | MEDLINE | ID: mdl-34551713

ABSTRACT

BACKGROUND: Backfat thickness is an important carcass composition trait for pork production and is commonly included in swine breeding programmes. In this paper, we report the results of a large genome-wide association study for backfat thickness using data from eight lines of diverse genetic backgrounds. METHODS: Data comprised 275,590 pigs from eight lines with diverse genetic backgrounds (breeds included Large White, Landrace, Pietrain, Hampshire, Duroc, and synthetic lines) genotyped and imputed for 71,324 single-nucleotide polymorphisms (SNPs). For each line, we estimated SNP associations using a univariate linear mixed model that accounted for genomic relationships. SNPs with significant associations were identified using a threshold of p < 10⁻⁶ and used to define genomic regions of interest. The proportion of genetic variance explained by a genomic region was estimated using a ridge regression model. RESULTS: We found significant associations with backfat thickness for 264 SNPs across 27 genomic regions. Six genomic regions were detected in three or more lines. The average estimate of the SNP-based heritability was 0.48, with estimates by line ranging from 0.30 to 0.58. The genomic regions jointly explained from 3.2 to 19.5% of the additive genetic variance of backfat thickness within a line. Individual genomic regions explained up to 8.0% of the additive genetic variance of backfat thickness within a line. Some of these 27 genomic regions also explained up to 1.6% of the additive genetic variance in lines for which the genomic region was not statistically significant. We identified 64 candidate genes with annotated functions that can be related to fat metabolism, including well-studied genes such as MC4R, IGF2, and LEPR, and more novel candidate genes such as DHCR7, FGF23, MEDAG, DGKI, and PTN.
CONCLUSIONS: Our results confirm the polygenic architecture of backfat thickness and the role of genes involved in energy homeostasis, adipogenesis, fatty acid metabolism, and insulin signalling pathways for fat deposition in pigs. The results also suggest that several less well-understood metabolic pathways contribute to backfat development, such as those of phosphate, calcium, and vitamin D homeostasis.
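The step from significant SNPs to genomic regions can be illustrated with a minimal Python sketch. The threshold of 10⁻⁶ comes from the abstract; the merging rule (significant SNPs on the same chromosome within `MAX_GAP_BP` of each other form one region), the 1 Mb value, and the function name `define_regions` are assumptions for illustration only and are not stated in the abstract.

```python
from collections import defaultdict

P_THRESHOLD = 1e-6       # significance threshold stated in the abstract
MAX_GAP_BP = 1_000_000   # assumed merging distance; not given in the abstract


def define_regions(snps, p_threshold=P_THRESHOLD, max_gap=MAX_GAP_BP):
    """Group significant SNPs into regions (hypothetical merging rule).

    snps: iterable of (chromosome, position_bp, p_value) tuples.
    Returns a list of (chromosome, start_bp, end_bp, n_snps) tuples.
    """
    # Keep only positions of SNPs that pass the threshold, per chromosome.
    by_chrom = defaultdict(list)
    for chrom, pos, p_value in snps:
        if p_value < p_threshold:
            by_chrom[chrom].append(pos)

    regions = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        start = end = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - end <= max_gap:
                # Close enough to the previous SNP: extend the region.
                end = pos
                count += 1
            else:
                # Too far away: close the region and start a new one.
                regions.append((chrom, start, end, count))
                start = end = pos
                count = 1
        regions.append((chrom, start, end, count))
    return regions


example = [
    ("1", 1_000_000, 1e-8),
    ("1", 1_400_000, 5e-7),
    ("1", 9_000_000, 1e-9),
    ("2", 500_000, 1e-3),  # not significant, ignored
]
regions = define_regions(example)
```

With the toy input, the two nearby SNPs on chromosome 1 merge into one region and the distant SNP forms its own region.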


Subjects
Adipose Tissue/anatomy & histology , Genes , Genetic Background , Genome-Wide Association Study , Single Nucleotide Polymorphism , Swine/anatomy & histology , Swine/genetics , Animals , Genome , Genomics , Genotype , Swine/classification
15.
Genet Sel Evol ; 52(1): 69, 2020 Nov 16.
Article in English | MEDLINE | ID: mdl-33198636

ABSTRACT

BACKGROUND: Breeders and geneticists use statistical models to separate genetic and environmental effects on phenotype. A common way to separate these effects is to model a descriptor of an environment, a contemporary group or herd, and account for genetic relationship between animals across environments. However, separating the genetic and environmental effects in smallholder systems is challenging due to small herd sizes and weak genetic connectedness across herds. We hypothesised that accounting for spatial relationships between nearby herds can improve genetic evaluation in smallholder systems. Furthermore, geographically referenced environmental covariates are increasingly available and could model underlying sources of spatial relationships. The objective of this study was, therefore, to evaluate the potential of spatial modelling to improve genetic evaluation in dairy cattle smallholder systems. METHODS: We performed simulations and real dairy cattle data analysis to test our hypothesis. We modelled environmental variation by estimating herd and spatial effects. Herd effects were considered independent, whereas spatial effects had distance-based covariance between herds. We compared these models using pedigree or genomic data. RESULTS: The results show that in smallholder systems (i) standard models do not separate genetic and environmental effects accurately, (ii) spatial modelling increases the accuracy of genetic evaluation for phenotyped and non-phenotyped animals, (iii) environmental covariates do not substantially improve the accuracy of genetic evaluation beyond simple distance-based relationships between herds, (iv) the benefit of spatial modelling was largest when separating the genetic and environmental effects was challenging, and (v) spatial modelling was beneficial when using either pedigree or genomic data. CONCLUSIONS: We have demonstrated the potential of spatial modelling to improve genetic evaluation in smallholder systems.
This improvement is driven by establishing environmental connectedness between herds, which enhances separation of genetic and environmental effects. We suggest routine spatial modelling in genetic evaluations, particularly for smallholder systems. Spatial modelling could also have a major impact in studies of human and wild populations.
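The contrast between independent herd effects and spatial effects with distance-based covariance can be sketched in Python. The abstract does not say which covariance function was used, so the exponential decay with distance, the parameter names `variance` and `range_km`, and the herd coordinates below are all assumptions chosen for illustration.

```python
import math


def spatial_covariance(coords, variance=1.0, range_km=10.0):
    """Covariance between herd effects that decays with distance.

    coords: list of (x_km, y_km) herd locations.
    Uses an exponential kernel, variance * exp(-distance / range_km);
    the kernel choice is an assumption, not taken from the abstract.
    """
    n = len(coords)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            distance = math.dist(coords[i], coords[j])
            cov[i][j] = variance * math.exp(-distance / range_km)
    return cov


# Three hypothetical herds: two close together, one far away.
herds = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]
cov = spatial_covariance(herds)

# Independent herd effects would instead use a diagonal matrix,
# i.e. zero covariance between any two different herds.
independent = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
```

Under this kind of kernel, nearby herds share information about their environment, which is one way to picture the environmental connectedness the conclusions refer to.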


Subjects
Breeding/methods , Cattle/genetics , Gene-Environment Interaction , Genetic Models , Animals , Ecosystem
16.
Genet Sel Evol ; 52(1): 18, 2020 Apr 06.
Article in English | MEDLINE | ID: mdl-32248818

ABSTRACT

BACKGROUND: For assembling large whole-genome sequence datasets for routine use in research and breeding, the sequencing strategy should be adapted to the methods that will be used later for variant discovery and imputation. In this study, we used simulation to explore the impact that the sequencing strategy and level of sequencing investment have on the overall accuracy of imputation using hybrid peeling, a pedigree-based imputation method that is well suited for large livestock populations. METHODS: We simulated marker array and whole-genome sequence data for 15 populations with simulated or real pedigrees that had different structures. In these populations, we evaluated the effect on imputation accuracy of seven methods for selecting which individuals to sequence, the generation of the pedigree to which the sequenced individuals belonged, the use of variable or uniform coverage, and the trade-off between the number of sequenced individuals and their sequencing coverage. For each population, we considered four levels of investment in sequencing that were proportional to the size of the population. RESULTS: Imputation accuracy depended greatly on pedigree depth. The distribution of the sequenced individuals across the generations of the pedigree underlay the performance of the different methods used to select individuals to sequence and it was critical for achieving high imputation accuracy in both early and late generations. Imputation accuracy was highest with a uniform coverage across the sequenced individuals of 2× rather than variable coverage. An investment equivalent to the cost of sequencing 2% of the population at 2× provided high imputation accuracy. The gain in imputation accuracy from additional investment decreased with larger populations and higher levels of investment. However, to achieve the same imputation accuracy, a proportionally greater investment must be used in the smaller populations compared to the larger ones. 
CONCLUSIONS: Suitable sequencing strategies for subsequent imputation with hybrid peeling involve sequencing ~2% of the population at a uniform coverage of 2×, distributed preferably across all generations of the pedigree, except for the few earliest generations that lack genotyped ancestors. Such sequencing strategies are beneficial for generating whole-genome sequence data in populations with deep pedigrees of closely related individuals.
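The trade-off between the number of sequenced individuals and their coverage can be made concrete with a small Python calculation. It assumes that cost is simply proportional to coverage times the number of individuals, which ignores fixed per-sample costs such as library preparation; the function name and the population size of 10,000 are hypothetical.

```python
def individuals_sequenced(population_size, coverage,
                          fraction=0.02, reference_coverage=2.0):
    """Number of individuals affordable at a given coverage.

    The budget is defined, as in the abstract, as the cost of sequencing
    `fraction` of the population at `reference_coverage`. Cost is assumed
    to scale linearly with coverage (a simplifying assumption).
    """
    budget = population_size * fraction * reference_coverage
    return round(budget / coverage)


# Same budget spent at different uniform coverages (hypothetical population).
options = {cov: individuals_sequenced(10_000, cov) for cov in (1.0, 2.0, 4.0)}
```

This only shows the budget arithmetic; which point on the trade-off gives the best imputation accuracy is an empirical result of the study (uniform 2× in the abstract), not something the calculation predicts.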


Subjects
Breeding , Computational Biology , Genotype , Swine/genetics , Whole Genome Sequencing , Animals , Female , Male , Pedigree
17.
Genet Sel Evol ; 52(1): 38, 2020 Jul 08.
Article in English | MEDLINE | ID: mdl-32640985

ABSTRACT

BACKGROUND: We describe the latest improvements to the long-range phasing (LRP) and haplotype library imputation (HLI) algorithms for successful phasing of both datasets with one million individuals and datasets genotyped using different sets of single nucleotide polymorphisms (SNPs). Previous publicly available implementations of the LRP algorithm implemented in AlphaPhase could not phase large datasets due to the computational cost of defining surrogate parents by exhaustive all-against-all searches. Furthermore, the AlphaPhase implementations of LRP and HLI were not designed to deal with large amounts of missing data that are inherent when using multiple SNP arrays. METHODS: We developed methods that avoid the need for all-against-all searches by performing LRP on subsets of individuals and then concatenating the results. We also extended LRP and HLI algorithms to enable the use of different sets of markers, including missing values, when determining surrogate parents and identifying haplotypes. We implemented and tested these extensions in an updated version of AlphaPhase, and compared its performance to the software package Eagle2. RESULTS: A simulated dataset with one million individuals genotyped with the same 6711 SNPs for a single chromosome took less than a day to phase, compared to more than seven days for Eagle2. The percentage of correctly phased alleles at heterozygous loci was 90.2 and 99.9% for AlphaPhase and Eagle2, respectively. A larger dataset with one million individuals genotyped with 49,579 SNPs for a single chromosome took AlphaPhase 23 days to phase, with 89.9% of alleles at heterozygous loci phased correctly. The phasing accuracy was generally lower for datasets with different sets of markers than with one set of markers. For a simulated dataset with three sets of markers, 1.5% of alleles at heterozygous positions were phased incorrectly, compared to 0.4% with one set of markers. 
CONCLUSIONS: The improved LRP and HLI algorithms enable AlphaPhase to quickly and accurately phase very large and heterogeneous datasets. AlphaPhase is an order of magnitude faster than the other tested packages, although Eagle2 showed a higher level of phasing accuracy. The speed gain will make phasing achievable for very large genomic datasets in livestock, enabling more powerful breeding and genetics research and application.
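The accuracy measure quoted above, the percentage of correctly phased alleles at heterozygous loci, can be sketched as follows. This is an illustrative Python version, not code from AlphaPhase or Eagle2; it assumes that the true haplotypes are known (as in simulation), that the order of the two haplotypes is already matched between truth and inference, and that there are no unphased or missing alleles.

```python
def phasing_accuracy(true_hap1, true_hap2, inferred_hap1, inferred_hap2):
    """Percentage of heterozygous loci whose alleles are phased correctly.

    Haplotypes are equal-length sequences of 0/1 alleles. Homozygous loci
    are skipped because their phase is trivially known.
    Returns None if the individual has no heterozygous loci.
    """
    het_loci = [
        i for i, (a, b) in enumerate(zip(true_hap1, true_hap2)) if a != b
    ]
    if not het_loci:
        return None
    correct = sum(
        1
        for i in het_loci
        if inferred_hap1[i] == true_hap1[i] and inferred_hap2[i] == true_hap2[i]
    )
    return 100.0 * correct / len(het_loci)


# Toy individual with three heterozygous loci (indices 0, 2, 4);
# the inferred haplotypes have the alleles swapped at the last one.
true_1 = [0, 1, 1, 0, 1]
true_2 = [1, 1, 0, 0, 0]
inferred_1 = [0, 1, 1, 0, 0]
inferred_2 = [1, 1, 0, 0, 1]
accuracy = phasing_accuracy(true_1, true_2, inferred_1, inferred_2)
```

In the toy example two of the three heterozygous loci are phased correctly.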


Subjects
Algorithms , Datasets as Topic/standards , Genome-Wide Association Study/methods , Haplotypes , Animals , Genome-Wide Association Study/standards , Heterozygote , Livestock/genetics , Single Nucleotide Polymorphism
18.
Genet Sel Evol ; 52(1): 17, 2020 Apr 06.
Article in English | MEDLINE | ID: mdl-32248811

ABSTRACT

BACKGROUND: The coupling of appropriate sequencing strategies and imputation methods is critical for assembling large whole-genome sequence datasets from livestock populations for research and breeding. In this paper, we describe and validate the coupling of a sequencing strategy with the imputation method hybrid peeling in real animal breeding settings. METHODS: We used data from four pig populations of different size (18,349 to 107,815 individuals) that were widely genotyped at densities between 15,000 and 75,000 markers genome-wide. Around 2% of the individuals in each population were sequenced (most of them at 1× or 2× and 37-92 individuals per population, totalling 284, at 15-30×). We imputed whole-genome sequence data with hybrid peeling. We evaluated the imputation accuracy by removing the sequence data of the 284 individuals with high coverage, using a leave-one-out design. We simulated data that mimicked the sequencing strategy used in the real populations to quantify the factors that affected the individual-wise and variant-wise imputation accuracies using regression trees. RESULTS: Imputation accuracy was high for the majority of individuals in all four populations (median individual-wise dosage correlation: 0.97). Imputation accuracy was lower for individuals in the earliest generations of each population than for the rest, due to the lack of marker array data for themselves and their ancestors. The main factors that determined the individual-wise imputation accuracy were the genotyping status, the availability of marker array data for immediate ancestors, and the degree of connectedness to the rest of the population, but sequencing coverage of the relatives had no effect. The main factors that determined variant-wise imputation accuracy were the minor allele frequency and the number of individuals with sequencing coverage at each variant site. Results were validated with the empirical observations. 
CONCLUSIONS: We demonstrate that the coupling of an appropriate sequencing strategy and hybrid peeling is a powerful strategy for generating whole-genome sequence data with high accuracy in large pedigreed populations where only a small fraction of individuals (2%) had been sequenced, mostly at low coverage. This is a critical step for the successful implementation of whole-genome sequence data for genomic prediction and fine-mapping of causal variants.
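The individual-wise dosage correlation reported above can be illustrated with a short Python sketch: for one individual, it is the correlation across variants between the true genotypes and the imputed allele dosages. The plain Pearson correlation used here is an assumption; the abstract does not say whether the study applied any correction (for example for allele frequency), and the data are invented.

```python
import math


def dosage_correlation(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes (0/1/2) and imputed dosages.

    Both inputs cover the same variants for one individual.
    Returns None if either input has no variation.
    """
    n = len(true_genotypes)
    mean_true = sum(true_genotypes) / n
    mean_imputed = sum(imputed_dosages) / n
    covariance = sum(
        (t - mean_true) * (d - mean_imputed)
        for t, d in zip(true_genotypes, imputed_dosages)
    )
    var_true = sum((t - mean_true) ** 2 for t in true_genotypes)
    var_imputed = sum((d - mean_imputed) ** 2 for d in imputed_dosages)
    if var_true == 0 or var_imputed == 0:
        return None
    return covariance / math.sqrt(var_true * var_imputed)


# Hypothetical individual: five variants, imputed dosages close to the truth.
truth = [0, 1, 2, 1, 0]
imputed = [0.1, 0.9, 1.8, 1.2, 0.0]
r = dosage_correlation(truth, imputed)
```

The variant-wise version would apply the same calculation per variant across individuals instead of per individual across variants.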


Subjects
Breeding , Genotyping Techniques , Livestock/genetics , Swine/genetics , Whole Genome Sequencing/veterinary , Animals , Computational Biology , Female , Gene Frequency , Genotype , Male , Pedigree
19.
Genet Sel Evol ; 52(1): 25, 2020 May 14.
Article in English | MEDLINE | ID: mdl-32408891

ABSTRACT

BACKGROUND: In the Neolithic, domestic sheep migrated into Europe and subsequently spread in westerly and northwesterly directions. Reconstruction of these migrations and subsequent genetic events requires a more detailed characterization of the current phylogeographic differentiation. RESULTS: We collected 50 K single nucleotide polymorphism (SNP) profiles of Balkan sheep that are currently found near the major Neolithic point of entry into Europe, and combined these data with published genotypes from southwest-Asian, Mediterranean, central-European and north-European sheep and from Asian and European mouflons. We detected clines, ancestral components and admixture by using variants of common analysis tools: geography-informative supervised principal component analysis (PCA), breed-specific admixture analysis, across-breed [Formula: see text] profiles and phylogenetic analysis of regional pools of breeds. The regional Balkan sheep populations exhibit considerable genetic overlap, but are clearly distinct from the breeds in surrounding regions. The Asian mouflon did not influence the differentiation of the European domestic sheep and is only distantly related to present-day sheep, including those from Iran where the mouflons were sampled. We demonstrate the occurrence, from southeast to northwest Europe, of a continuously increasing ancestral component of up to 20% contributed by the European mouflon, which is assumed to descend from the original Neolithic domesticates. The overall patterns indicate that the Balkan region and Italy served as post-domestication migration hubs, from which wool sheep reached Spain and north Italy with subsequent migrations northwards. The documented dispersal of Tarentine wool sheep during the Roman period may have been part of this process. Our results also reproduce the documented 18th century admixture of Spanish Merino sheep into several central-European breeds. 
CONCLUSIONS: Our results contribute to a better understanding of the events that have created the present diversity pattern, which is relevant for the management of the genetic resources represented by the European sheep population.


Subjects
Population Genetics/methods , Single Nucleotide Polymorphism/genetics , Sheep/genetics , Animals , Balkan Peninsula , Breeding/methods , Domestication , Genetic Testing/methods , Genetic Variation/genetics , Genotype , Phylogeny , Phylogeography/methods
20.
Bioinformatics ; 34(19): 3408-3411, 2018 Oct 01.
Article in English | MEDLINE | ID: mdl-29722792

ABSTRACT

Summary: AlphaMate is a flexible program that optimizes selection, maintenance of genetic diversity and mate allocation in breeding programs. It can be used in animal and cross- and self-pollinating plant populations. These populations can be subject to selective breeding or conservation management. The problem is formulated as a multi-objective optimization of a valid mating plan that is solved with an evolutionary algorithm. A valid mating plan is defined by a combination of mating constraints (the number of matings, the maximal number of parents, the minimal/equal/maximal number of contributions per parent, or allowance for selfing) that are gender specific or generic. The optimization can maximize genetic gain, minimize group coancestry, minimize inbreeding of individual matings, or maximize genetic gain for a given increase in group coancestry or inbreeding. Users provide a list of candidate individuals with associated gender and selection criteria information (if applicable) and coancestry matrix. Selection criteria and coancestry matrix can be based on pedigree or genome-wide markers. Additional individual or mating specific information can be included to enrich optimization objectives. An example of rapid recurrent genomic selection in wheat demonstrates how AlphaMate can double the efficiency of converting genetic diversity into genetic gain compared to truncation selection. Another example demonstrates the use of genome editing to expand the gain-diversity frontier. Availability and implementation: Executable versions of AlphaMate for Windows, Mac and Linux platforms are available at http://www.AlphaGenes.roslin.ed.ac.uk/AlphaMate.
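The two competing objectives named in the summary, genetic gain and group coancestry, can be illustrated with a small Python sketch. This is not AlphaMate's interface or its evolutionary algorithm; it only evaluates the two quantities for given parent contributions, using the standard expressions (gain as the contribution-weighted mean of the selection criterion, group coancestry as the quadratic form of contributions with the coancestry matrix). All names and numbers are hypothetical.

```python
def expected_gain(contributions, criteria):
    """Contribution-weighted mean of the selection criterion."""
    return sum(c * a for c, a in zip(contributions, criteria))


def group_coancestry(contributions, coancestry):
    """Quadratic form x'Ax for contributions x and coancestry matrix A."""
    n = len(contributions)
    return sum(
        contributions[i] * coancestry[i][j] * contributions[j]
        for i in range(n)
        for j in range(n)
    )


# Four hypothetical candidates; the two best ones are related.
criteria = [2.0, 1.5, 1.0, 0.5]
coancestry = [
    [0.50, 0.25, 0.00, 0.00],
    [0.25, 0.50, 0.00, 0.00],
    [0.00, 0.00, 0.50, 0.00],
    [0.00, 0.00, 0.00, 0.50],
]

# Truncation selection: use only the top two candidates equally.
truncation = [0.5, 0.5, 0.0, 0.0]
# Balanced alternative: use all four candidates equally.
balanced = [0.25, 0.25, 0.25, 0.25]

plans = {
    name: (expected_gain(x, criteria), group_coancestry(x, coancestry))
    for name, x in (("truncation", truncation), ("balanced", balanced))
}
```

In this toy case truncation gives the higher gain but also the higher group coancestry, which is the kind of tension a multi-objective optimizer has to balance when searching over mating plans.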


Subjects
Breeding , Inbreeding , Software , Animal Husbandry , Animals , Genomics , Genetic Models , Pedigree , Plant Breeding , Genetic Selection