ABSTRACT
BACKGROUND: Gut microbial composition plays an important role in numerous traits, including immune response. Integration of host genomic information with microbiome data is a natural step in the prediction of complex traits, although methods to optimize this are still largely unexplored. In this paper, we assess the impact of different modelling strategies on the predictive capacity for six porcine immunocompetence traits when both genotype and microbiota data are available. METHODS: We used phenotypic data on six immunity traits and the relative abundance of gut bacterial communities on 400 Duroc pigs that were genotyped for 70 k SNPs. We compared the predictive accuracy, defined as the correlation between predicted and observed phenotypes, of a wide catalogue of models: reproducing kernel Hilbert space (RKHS), Bayes C, and an ensemble method, using a range of priors and microbial clustering strategies. Combined (holobiont) models that include both genotype and microbiome data were compared with partial models that use one source of variation only. RESULTS: Overall, holobiont models performed better than partial models. Host genotype was especially relevant for predicting adaptive immunity traits (i.e., concentration of immunoglobulins M and G), whereas microbial composition was important for predicting innate immunity traits (i.e., concentration of haptoglobin and C-reactive protein and lymphocyte phagocytic capacity). None of the models was uniformly best across all traits. We observed a greater variability in predictive accuracies across models when microbiability (the variance explained by the microbiome) was high. Clustering microbial abundances did not necessarily increase predictive accuracy. CONCLUSIONS: Gut microbiota information is useful for predicting immunocompetence traits, especially those related to innate immunity. Modelling microbiome abundances deserves special attention when microbiability is high. 
Clustering microbial data for prediction is not recommended by default.
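The accuracy measure defined in METHODS (the correlation between predicted and observed phenotypes) could be computed along these lines. This is a minimal sketch, not the authors' code: the function name `predictive_accuracy` is hypothetical, and Pearson correlation is assumed because the abstract does not say which correlation coefficient was used.

```python
import math


def predictive_accuracy(observed, predicted):
    """Pearson correlation between observed and predicted phenotypes.

    Assumption: Pearson's r is the intended coefficient; the abstract
    only says "correlation".
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length vectors with at least 2 records")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Sums of cross-products and squares around the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("correlation is undefined when a vector is constant")
    return cov / math.sqrt(ss_obs * ss_pred)
```

In a data-split setting, `observed` would hold the phenotypes of the validation animals and `predicted` the values produced by a model trained on the remaining animals.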
Subjects
Genome, Genomics, Animals, Swine, Bayes Theorem, Genotype, Phenotype, Genomics/methods
ABSTRACT
KEY MESSAGE: Transposon insertion polymorphisms can improve prediction of complex agronomic traits in rice compared to using SNPs only, especially when accessions to be predicted are less related to the training set. Transposon insertion polymorphisms (TIPs) are significant sources of genetic variation. Previous work has shown that TIPs can improve detection of causative loci for agronomic traits in rice. Here, we quantify the fraction of variance explained by single nucleotide polymorphisms (SNPs) compared to TIPs, and we explore whether TIPs can improve prediction of traits when compared to using only SNPs. We used eleven traits of agronomic relevance from five different rice population groups (Aus, Indica, Aromatic, Japonica, and Admixed), 738 accessions in total. We assessed prediction by applying data split validation in two scenarios. In the within-population scenario, we predicted performance of improved Indica varieties using the rest of the Indica accessions. In the across-population scenario, we predicted all Aromatic and Admixed accessions using the rest of the populations. In each scenario, Bayes C and a Bayesian reproducing kernel Hilbert space regression were compared. We find that TIPs can explain an important fraction of total genetic variance and that they also improve genomic prediction. In the across-population prediction scenario, TIPs outperformed SNPs in nine out of the eleven traits analyzed. In some traits like leaf senescence or grain width, using TIPs increased predictive correlation by 30-50%. Our results provide the first evidence that TIP genotyping can improve prediction of complex agronomic traits in rice, especially when accessions to be predicted are less related to training accessions.
Subjects
Oryza, Bayes Theorem, DNA Transposable Elements, Oryza/genetics, Phenotype, Single Nucleotide Polymorphism
ABSTRACT
Muscle development and lipid accumulation in muscle critically affect meat quality of livestock. However, the genetic factors underlying myofiber-type specification and intramuscular fat (IMF) accumulation remain to be elucidated. Using two independent intercrosses between Western commercial breeds and Korean native pigs (KNPs) and a joint linkage-linkage disequilibrium analysis, we identified a 488.1-kb region on porcine chromosome 12 that affects both reddish meat color (a*) and IMF. In this critical region, only the MYH3 gene, encoding myosin heavy chain 3, was found to be preferentially overexpressed in the skeletal muscle of KNPs. Subsequently, MYH3-transgenic mice demonstrated that this gene controls both myofiber-type specification and adipogenesis in skeletal muscle. We discovered a structural variant in the promoter/regulatory region of MYH3 for which Q allele carriers exhibited significantly higher values of a* and IMF than q allele carriers. Furthermore, chromatin immunoprecipitation and cotransfection assays showed that the structural variant in the 5'-flanking region of MYH3 abrogated the binding of the myogenic regulatory factors (MYF5, MYOD, MYOG, and MRF4). The allele distribution of MYH3 among pig populations worldwide indicated that the MYH3 Q allele is of Asian origin and likely predates domestication. In conclusion, we identified a functional regulatory sequence variant in porcine MYH3 that provides novel insights into the genetic basis of the regulation of myofiber type ratios and associated changes in IMF in pigs. The MYH3 variant can play an important role in improving pork quality in current breeding programs.
Subjects
Adipogenesis/genetics, Cytoskeletal Proteins/genetics, Skeletal Muscle Fibers/metabolism, Skeletal Muscle/growth & development, Myosins/genetics, Adipose Tissue/growth & development, Adipose Tissue/metabolism, Animals, Breeding, Gene Expression Regulation, Genome-Wide Association Study, Genotype, Meat, Mice, Transgenic Mice, Skeletal Muscle/metabolism, Myosin Heavy Chains/genetics, Nucleotide Motifs, Sus scrofa/genetics, Sus scrofa/metabolism, Swine
ABSTRACT
Improvements in genomic technologies have outpaced the most optimistic predictions, allowing industry-scale application of genomic selection. However, only marginal gains in genetic prediction accuracy can now be expected by increasing marker density up to sequence, unless causative mutations are identified. We argue that some of the most scientifically disruptive and industry-relevant challenges relate to 'phenomics' instead of 'genomics'. Thanks to developments in sensor technology and artificial intelligence, a wide range of analytical tools is already available and many more will be developed. We can now address some of the pressing societal demands on the industry, such as animal welfare concerns or efficiency in the use of resources. From the statistical and computational point of view, phenomics raises two important issues that require further work: penalization and dimension reduction. This will be complicated by the inherent heterogeneity and 'missingness' of the data. Overall, we can expect that precision livestock technologies will make it possible to collect hundreds of traits on a continuous basis from large numbers of animals. Perhaps the main revolution will come from redesigning animal breeding schemes to explicitly allow for high-dimensional phenomics. In the meantime, phenomics data will certainly improve our knowledge of the biological basis of phenotypes.
Subjects
Livestock/genetics, Phenomics/methods, Selective Breeding, Animals, Livestock/physiology
ABSTRACT
BACKGROUND: Analysis and prediction of complex traits using microbiome data combined with host genomic information is a topic of utmost interest. However, numerous questions remain to be answered: how useful can the microbiome be for complex trait prediction? Are estimates of microbiability reliable? Can the underlying biological links between the host's genome, microbiome, and phenome be recovered? METHODS: Here, we address these issues by (i) developing a novel simulation strategy that uses real microbiome and genotype data as inputs, and (ii) using variance-component approaches (Bayesian Reproducing Kernel Hilbert Space (RKHS) and Bayesian variable selection methods (Bayes C)) to quantify the proportion of phenotypic variance explained by the genome and the microbiome. The proposed simulation approach can mimic genetic links between the microbiome and genotype data by a permutation procedure that retains the distributional properties of the data. RESULTS: Using real genotype and rumen microbiota abundances from dairy cattle, simulation results suggest that microbiome data can significantly improve the accuracy of phenotype predictions, regardless of whether some microbiota abundances are under direct genetic control by the host or not. This improvement depends logically on the microbiome being stable over time. Overall, random-effects linear methods appear robust for variance components estimation, in spite of the typically highly leptokurtic distribution of microbiota abundances. The predictive performance of Bayes C was higher but more sensitive to the number of causative effects than RKHS. Accuracy with Bayes C depended, in part, on the number of microorganisms' taxa that influence the phenotype. 
CONCLUSIONS: While we conclude that, overall, genome-microbiome-links can be characterized using variance component estimates, we are less optimistic about the possibility of identifying the causative host genetic effects that affect microbiota abundances, which would require much larger sample sizes than are typically available for genome-microbiome-phenome studies. The R code to replicate the analyses is in https://github.com/miguelperezenciso/simubiome .
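The quantities estimated by the variance-component approaches above (the proportions of phenotypic variance explained by the genome and by the microbiome) can be illustrated with a small sketch. It assumes, as a simplification not stated in the abstract, that genomic, microbiome and residual effects are independent so that their variances add up to the phenotypic variance; the function and key names are hypothetical and this is not taken from the simubiome code.

```python
def variance_proportions(var_genome, var_microbiome, var_residual):
    """Share of phenotypic variance attributed to genome and microbiome.

    Assumption: phenotype = genome + microbiome + residual with independent
    terms, so the phenotypic variance is the sum of the three components.
    """
    components = (var_genome, var_microbiome, var_residual)
    if any(v < 0 for v in components):
        raise ValueError("variance components must be non-negative")
    total = sum(components)
    if total == 0:
        raise ValueError("total phenotypic variance must be positive")
    return {
        "heritability": var_genome / total,
        "microbiability": var_microbiome / total,
    }
```

With posterior samples from a Bayesian model, the same ratio would typically be computed per sample and then summarised, rather than computed once from point estimates.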
Subjects
Cattle/genetics, Gastrointestinal Microbiome, Genome-Wide Association Study/methods, Genome, Multifactorial Inheritance, Animals, Bayes Theorem, Cattle/microbiology, Computer Simulation, Phenotype
ABSTRACT
BACKGROUND: Short tandem repeats (STRs) are genetic markers with a greater mutation rate than single nucleotide polymorphisms (SNPs) and are widely used in genetic studies and forensics. However, most studies in pigs have focused only on SNPs or on a limited number of STRs. RESULTS: This study screened 394 deep-sequenced genomes from 22 domesticated pig breeds/populations worldwide, wild boars from both Europe and Asia, and numerous outgroup Suidae, and identified a set of 878,967 polymorphic STRs (pSTRs), which represents the largest repository of pSTRs in pigs to date. We found multiple lines of evidence that pSTRs in coding regions were affected by purifying selection. The enrichment of trinucleotide pSTRs in coding sequences (CDS), 5'UTR and H3K4me3 regions suggests that trinucleotide STRs serve as important components in the exons and promoters of the corresponding genes. We demonstrated that, compared to SNPs, pSTRs provide comparable or even greater accuracy in determining the breed identity of individuals. We identified pSTRs that showed significant population differentiation between domestic pigs and wild boars in Asia and Europe. We also observed that some pSTRs were significantly associated with environmental variables, such as average annual temperature or altitude of the originating sites of Chinese indigenous breeds, among which we identified loss-of-function and/or expanded STRs overlapping with genes such as AHR, LAS1L and PDK1. Finally, our results revealed that several pSTRs show stronger signals in domestic pig-wild boar differentiation or association with the analysed environmental variables than the flanking SNPs within a 100-kb window. CONCLUSIONS: This study provides a genome-wide high-density map of pSTRs in diverse pig populations based on genome sequencing data, enabling a more comprehensive characterization of their roles in evolutionary and environmental adaptation.
Subjects
Physiological Adaptation, Ecosystem, Molecular Evolution, Microsatellite Repeats, Swine/genetics, Animals, Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: In the early 20th century, Cuban farmers imported Charolais cattle (CHFR) directly from France. These animals are now known as Chacuba (CHCU) and have become adapted to the harsh tropical environmental conditions in Cuba. These conditions include long periods of drought and food shortage with extreme temperatures that European taurine cattle have difficulty coping with. RESULTS: In this study, we used whole-genome sequence data from 12 CHCU individuals together with 60 whole-genome sequences from six additional taurine, indicus and crossed breeds to estimate the genetic diversity and structure of the CHCU animals and to infer their ancestral origin accurately. Although CHCU animals are assumed to form a closed population, the results of our admixture analysis indicate a limited introgression of Bos indicus. We used the extended haplotype homozygosity (EHH) approach to identify regions in the genome that may have had an important role in the adaptation of CHCU to tropical conditions. Putative selection events occurred in genomic regions with a high proportion of Bos indicus, but they were not sufficient to explain adaptation of CHCU to tropical conditions by Bos indicus introgression only. EHH suggested signals of potential adaptation in genomic windows that include genes of taurine origin involved in thermogenesis (ATP9A, GABBR1, PGR, PTPN1 and UCP1) and hair development (CCHCR1 and CDSN). Within these genes, we identified single nucleotide polymorphisms (SNPs) that may have a functional impact and contribute to some of the observed phenotypic differences between CHCU and CHFR animals. CONCLUSIONS: Whole-genome data confirm that CHCU cattle are closely related to Charolais from France (CHFR) and Canada, but also reveal a limited introgression of Bos indicus genes in CHCU. We observed possible signals of recent adaptation to tropical conditions between CHCU and CHFR founder populations, which were largely independent of the Bos indicus introgression. 
Finally, we report candidate genes and variants that may have a functional impact and explain some of the phenotypic differences observed between CHCU and CHFR cattle.
Subjects
Cattle/genetics, Genotype, Genetic Polymorphism, Thermotolerance/genetics, Animal Fur/metabolism, Animals, Cattle/physiology, Haplotypes, Homozygote, Thermogenesis/genetics, Tropical Climate, Whole Genome Sequencing
ABSTRACT
BACKGROUND: Genomic prediction (GP) is a method whereby DNA polymorphism information is used to predict breeding values for complex traits. Although GP can significantly enhance predictive accuracy, it can be expensive and difficult to implement. To help design optimum breeding programs and experiments, including genome-wide association studies and genomic selection experiments, we have developed SeqBreed, a generic and flexible forward simulator programmed in Python 3. RESULTS: SeqBreed accommodates sex and mitochondrial chromosomes as well as autopolyploidy. It can simulate any number of complex phenotypes that are determined by any number of causal loci. SeqBreed implements several GP methods, including genomic best linear unbiased prediction (GBLUP), single-step GBLUP, pedigree-based BLUP, and mass selection. We illustrate its functionality with Drosophila genome reference panel (DGRP) sequence data and with tetraploid potato genotype data. CONCLUSIONS: SeqBreed is a flexible and easy-to-use tool that can be used to optimize GP or genome-wide association studies. It incorporates some of the most popular GP methods and includes several visualization tools. Code is open and can be freely modified. Software, documentation, and examples are available at https://github.com/miguelperezenciso/SeqBreed.
Subjects
Drosophila/genetics, Genomics/methods, Animals, Breeding, Female, Genome-Wide Association Study, Genotype, Male, Multifactorial Inheritance, Pedigree, Software
ABSTRACT
An amendment to this paper has been published and can be accessed via the original article.
ABSTRACT
Mitigation of greenhouse gas emissions is relevant for reducing the environmental impact of ruminant production. In this study, the rumen microbiome from Holstein cows was characterized through a combination of 16S rRNA gene and shotgun metagenomic sequencing. Methane production (CH4) and dry matter intake (DMI) were individually measured over 4-6 weeks to calculate the CH4 yield (CH4y = CH4/DMI) per cow. We implemented a combination of clustering, multivariate and mixed model analyses to identify a set of operational taxonomic units (OTUs) jointly associated with CH4y and the structure of ruminal microbial communities. Three ruminotype clusters (R1, R2 and R3) were identified, and R2 was associated with higher CH4y. The taxonomic composition of R2 had lower abundance of Succinivibrionaceae and Methanosphaera, and higher abundance of Ruminococcaceae, Christensenellaceae and Lachnospiraceae. Metagenomic data confirmed the lower abundance of Succinivibrionaceae and Methanosphaera in R2 and identified genera (Fibrobacter and unclassified Bacteroidales) not highlighted by metataxonomic analysis. In addition, the functional metagenomic analysis revealed that samples classified in cluster R2 were overrepresented by genes coding for KEGG modules associated with methanogenesis, including a significant relative abundance of the methyl-coenzyme M reductase enzyme. Based on the cluster assignment, we applied a sparse partial least-squares discriminant analysis at the taxonomic and functional levels. In addition, we implemented an sPLS regression model using the phenotypic variation of CH4y. By combining these two approaches, we identified 86 discriminant bacterial OTUs, notably including families linked to CH4 emission such as Succinivibrionaceae, Ruminococcaceae, Christensenellaceae, Lachnospiraceae and Rikenellaceae. These selected OTUs explained 24% of the CH4y phenotypic variance, whereas the host genome contribution was ~14%. 
In summary, we identified rumen microbial biomarkers associated with the methane production of dairy cows; these biomarkers could be used for targeted methane-reduction selection programmes in the dairy cattle industry provided they are heritable.
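The methane yield defined in the abstract (CH4y = CH4/DMI) is a simple ratio per cow. A minimal sketch follows; the function name and the units (grams of CH4 per day, kilograms of dry matter per day) are assumptions for illustration only, since the abstract does not state the units used.

```python
def ch4_yield(ch4_production, dry_matter_intake):
    """Methane yield per cow: CH4y = CH4 / DMI.

    Assumed units (not given in the abstract): CH4 in g/day and DMI in
    kg/day, so the result is in g CH4 per kg of dry matter intake.
    """
    if dry_matter_intake <= 0:
        raise ValueError("dry matter intake must be positive")
    if ch4_production < 0:
        raise ValueError("methane production cannot be negative")
    return ch4_production / dry_matter_intake
```

For a cow measured over several weeks, both inputs would be averages over the recording period.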
Subjects
Cattle/metabolism, Cattle/microbiology, Dairying, Gastrointestinal Tract/metabolism, Gastrointestinal Tract/microbiology, Methane/biosynthesis, Animals, Biomarkers/metabolism, Bacterial DNA/genetics, Metagenomics, Phenotype
ABSTRACT
BACKGROUND: Antiretroviral drugs are a very effective therapy against HIV infection. However, the high mutation rate of HIV permits the emergence of variants that can be resistant to the drug treatment. Predicting the drug resistance of previously unobserved variants is therefore very important for optimum medical treatment. In this paper, we propose the use of weighted categorical kernel functions to predict drug resistance from virus sequence data. These kernel functions are very simple to implement and are able to take into account HIV data particularities, such as allele mixtures, and to weigh the different importance of each protein residue, as it is known that not all positions contribute equally to the resistance. RESULTS: We analyzed 21 drugs of four classes: protease inhibitors (PI), integrase inhibitors (INI), nucleoside reverse transcriptase inhibitors (NRTI) and non-nucleoside reverse transcriptase inhibitors (NNRTI). We compared two categorical kernel functions, Overlap and Jaccard, against two well-known noncategorical kernel functions (Linear and RBF) and Random Forest (RF). Weighted versions of these kernels were also considered, where the weights were obtained from the RF decrease in node impurity. The Jaccard kernel was the best method, either in its weighted or unweighted form, for 20 out of the 21 drugs. CONCLUSIONS: Results show that kernels that take into account both the categorical nature of the data and the presence of mixtures consistently result in the best prediction model. The advantage of including weights depended on the protein targeted by the drug. In the case of reverse transcriptase, weights based on the relative importance of each position clearly increased the prediction performance, while the improvement in the protease was much smaller. This seems to be related to the distribution of weights, as measured by the Gini index. 
All methods described, together with documentation and examples, are freely available at https://bitbucket.org/elies_ramon/catkern.
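To make the idea of categorical kernels with allele mixtures concrete, the sketch below shows one plausible reading of weighted Overlap and Jaccard similarities between two protein sequences. It is an illustration under stated assumptions, not the catkern implementation: each position is represented as a set of residues (more than one element for a mixture), the per-position similarities are combined as a weighted average, and all names are hypothetical. The published kernels may include further transformations that are not reproduced here.

```python
from typing import FrozenSet, List, Optional, Sequence

Residues = Sequence[FrozenSet[str]]  # one set of residues per protein position


def _prepare(x: Residues, y: Residues, weights: Optional[Sequence[float]]) -> List[float]:
    """Validate the inputs and return position weights that sum to 1."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    if any(not a or not b for a, b in zip(x, y)):
        raise ValueError("every position needs at least one residue")
    if weights is None:
        return [1.0 / len(x)] * len(x)
    if len(weights) != len(x) or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative, one per position, not all zero")
    total = float(sum(weights))
    return [w / total for w in weights]


def overlap_kernel(x: Residues, y: Residues, weights=None) -> float:
    """Weighted share of positions whose residue sets are identical."""
    w = _prepare(x, y, weights)
    return sum(wi for wi, a, b in zip(w, x, y) if a == b)


def jaccard_kernel(x: Residues, y: Residues, weights=None) -> float:
    """Weighted mean of |A ∩ B| / |A ∪ B| per position, so mixtures score partially."""
    w = _prepare(x, y, weights)
    return sum(wi * len(a & b) / len(a | b) for wi, a, b in zip(w, x, y))


# Toy example: the second position of seq_a is a K/R mixture.
seq_a = [frozenset("A"), frozenset("KR"), frozenset("L")]
seq_b = [frozenset("A"), frozenset("K"), frozenset("M")]
```

With these toy sequences, the Overlap similarity counts only the first position as a match, whereas the Jaccard similarity also gives partial credit to the mixture at the second position. Passing `weights` (for example, importances taken from a random forest, as the abstract describes) shifts the similarity towards the positions deemed most relevant.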
Subjects
Algorithms, Computational Biology/methods, Viral Drug Resistance/genetics, HIV-1/genetics, Anti-HIV Agents/pharmacology, Viral Drug Resistance/drug effects, HIV Infections/virology, HIV-1/drug effects, HIV-1/isolation & purification, Humans, Linear Models, Principal Component Analysis
ABSTRACT
BACKGROUND: The effect of epistasis on response to selection is a highly debated topic. Here, we investigated the impact of epistasis on response to sequence-based selection via genomic best linear unbiased prediction (GBLUP) in a regime of strong non-symmetrical epistasis under divergent selection, using real Drosophila sequence data. We also explored the possible advantage of including epistasis in the evaluation model and/or of knowing the causal mutations. RESULTS: Response to selection was almost exclusively due to changes in allele frequency at a few loci with a large effect. Response was highly asymmetric (about four phenotypic standard deviations higher for upward than downward selection) due to the highly skewed site frequency spectrum. Epistasis accentuated this asymmetry and affected response to selection by modulating the additive genetic variance, which was sustained for longer under upward selection whereas it eroded rapidly under downward selection. Response to selection was quite insensitive to the evaluation model, especially under an additive scenario. Nevertheless, including epistasis in the model when there was none eventually led to lower accuracies as selection proceeded. Accounting for epistasis in the model, if it existed, was beneficial but only in the medium term. There was not much gain in response if causal mutations were known, compared to using sequence data, which is likely due to strong linkage disequilibrium, high heritability and availability of phenotypes on candidates. CONCLUSIONS: Epistatic interactions affect the response to genomic selection by modulating the additive genetic variance used for selection. Epistasis releases additive variance that may increase response to selection compared to a pure additive genetic action. Furthermore, genomic evaluation models and, in particular, GBLUP are robust, i.e. adding complexity to the model did not substantially modify the response (for a given architecture).
Subjects
Genetic Epistasis, Genetic Models, Genetic Selection, Animals, Genetic Databases, Drosophila/genetics, Genome
ABSTRACT
BACKGROUND: Pigs were domesticated independently in Eastern and Western Eurasia early during the agricultural revolution, and have since been transported and traded across the globe. Here, we present a worldwide survey on 60K genome-wide single nucleotide polymorphism (SNP) data for 2093 pigs, including 1839 domestic pigs representing 122 local and commercial breeds, 215 wild boars, and 39 out-group suids, from Asia, Europe, America, Oceania and Africa. The aim of this study was to infer global patterns in pig domestication and diversity related to demography, migration, and selection. RESULTS: A deep phylogeographic division reflects the dichotomy between early domestication centers. In the core Eastern and Western domestication regions, Chinese pigs show differentiation between breeds due to geographic isolation, whereas this is less pronounced in European pigs. The inferred European origin of pigs in the Americas, Africa, and Australia reflects European expansion during the sixteenth to nineteenth centuries. Human-mediated introgression, which is due, in particular, to importing Chinese pigs into the UK during the eighteenth and nineteenth centuries, played an important role in the formation of modern pig breeds. Inbreeding levels vary markedly between populations, from almost no runs of homozygosity (ROH) in a number of Asian wild boar populations, to up to 20% of the genome covered by ROH in a number of Southern European breeds. Commercial populations show moderate ROH statistics. For domesticated pigs and wild boars in Asia and Europe, we identified highly differentiated loci that include candidate genes related to muscle and body development, central nervous system, reproduction, and energy balance, which are putatively under artificial selection. 
CONCLUSIONS: Key events related to domestication, dispersal, and mixing of pigs from different regions are reflected in the 60K SNP data, including the globalization that has recently become full circle since Chinese pig breeders in the past decades started selecting Western breeds to improve local Chinese pigs. Furthermore, signatures of ongoing and past selection, acting at different times and on different genetic backgrounds, enhance our insight in the mechanism of domestication and selection. The global diversity statistics presented here highlight concerns for maintaining agrodiversity, but also provide a necessary framework for directing genetic conservation.
Subjects
Breeding, Genome-Wide Association Study, Single Nucleotide Polymorphism/genetics, Sus scrofa/genetics, Animals, Asia, Australia, Europe, Internationality, Genetic Selection, Sus scrofa/classification, Swine
ABSTRACT
BACKGROUND: The development of next-generation sequencing technologies (NGS) has made the use of whole-genome sequence data for routine genetic evaluations possible, which has triggered a considerable interest in animal and plant breeding fields. Here, we investigated whether complete or partial sequence data can improve upon existing SNP (single nucleotide polymorphism) array-based selection strategies by simulation using a mixed coalescence - gene-dropping approach. RESULTS: We simulated 20 or 100 causal mutations (quantitative trait nucleotides, QTN) within 65 predefined 'gene' regions, each 10 kb long, within a genome composed of ten 3-Mb chromosomes. We compared prediction accuracy by cross-validation using a medium-density chip (7.5 k SNPs), a high-density (HD, 17 k) and sequence data (335 k). Genetic evaluation was based on a GBLUP method. The simulations showed: (1) a law of diminishing returns with increasing number of SNPs; (2) a modest effect of SNP ascertainment bias in arrays; (3) a small advantage of using whole-genome sequence data vs. HD arrays i.e. ~4%; (4) a minor effect of NGS errors except when imputation error rates are high (≥20%); and (5) if QTN were known, prediction accuracy approached 1. Since this is obviously unrealistic, we explored milder assumptions. We showed that, if all SNPs within causal genes were included in the prediction model, accuracy could also dramatically increase by ~40%. However, this criterion was highly sensitive to either misspecification (including wrong genes) or to the use of an incomplete gene list; in these cases, accuracy fell rapidly towards that reached when all SNPs from sequence data were blindly included in the model. CONCLUSIONS: Our study shows that, unless an accurate prior estimate on the functionality of SNPs can be included in the predictor, there is a law of diminishing returns with increasing SNP density. 
As a result, use of whole-genome sequence data may not result in a highly increased selection response over high-density genotyping.
Subjects
Genotyping Techniques, High-Throughput Nucleotide Sequencing, Oligonucleotide Array Sequence Analysis, Single Nucleotide Polymorphism, DNA Sequence Analysis, Animals, Breeding, Cattle, Genomics/methods, Mutation, Reproducibility of Results
ABSTRACT
BACKGROUND: The oral GPCR nutrient/taste receptor gene repertoire consists of the Tas1r family (sweet and umami tastes), the Tas2r family (bitter taste) as well as several other potential candidate sensors of amino acids, peptones and fatty acids. Taste/nutrient receptors play a fundamental role in survival through the identification of dietary nutrients or potentially toxic compounds. In humans and rodents some variations in taste sensitivity have been related to receptor polymorphisms. Some allelic variants, in turn, have been linked to the adaptation to specific geographical locations and dietary regimes. In contrast, the porcine taste/nutrient receptor repertoire has been only partially characterized and limited information on genetic variation across breeds and geographical location exists. The present study aims to fill this void, which in turn will form the basis for future improvements in pig nutrition. RESULTS: Our results show that the pig oral repertoire of taste/nutrient receptors consists of at least 28 receptor genes with significant transcription measured for 27. When compared to humans and rodents, the porcine gene sequences encoding sensors for carbohydrates, amino acids and fatty acids were highly conserved whilst the bitter taste gene family (known as Tas2rs) showed high divergence. We identified 15 porcine Tas2rs of which 13 are orthologous to human sequences. The single nucleotide polymorphism (SNP) sequence analysis using 79 pig genomes, representing 14 different breeds/populations, revealed that the Tas2r subset had higher variability (average π = 2.8 × 10⁻³) than non-bitter taste genes (π = 1.2-1.5 × 10⁻³). In addition, our results show that the difference in nutrient receptor genes between Asian and European breeds accounts for only a small part of the variability, which is in contrast with previous findings involving genome-wide data. CONCLUSIONS: We have defined twenty-eight oral nutrient sensing related genes for the pig. 
The homology with the human repertoire is high for the porcine non-bitter taste gene repertoire and low for the porcine Tas2r repertoire. Our data suggests that bitter taste is a plastic trait, possibly associated with the ability of pigs to adapt to diverse environments and that may be subject to balancing selection.
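The variability statistic reported above, nucleotide diversity (π), is commonly defined as the average number of pairwise differences per site among a set of aligned sequences. The sketch below illustrates that textbook definition; it is not the pipeline used in the study, the function name is hypothetical, and it assumes gap-free aligned haplotype sequences of equal length with no missing data.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumptions: all sequences are aligned, have equal length and contain
    no gaps or missing calls; each sequence represents one haplotype.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count mismatching sites for every unordered pair of sequences.
    total_diff = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diff / n_pairs / length
```

On this scale, a value such as 2.8 × 10⁻³ corresponds to roughly 2.8 differences per kilobase between two randomly chosen sequences.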
Subjects
Molecular Evolution, G-Protein-Coupled Receptors/genetics, Taste Perception/genetics, Taste/genetics, Alleles, Amino Acid Sequence, Animals, Genome, Humans, Phylogeny, Single Nucleotide Polymorphism, Swine
ABSTRACT
BACKGROUND: A major concern in conservation genetics is to maintain the genetic diversity of populations. Genetic variation in livestock species is threatened by the progressive marginalisation of local breeds in favour of high-output pigs worldwide. We used high-density SNP and re-sequencing data to assess genetic diversity of local pig breeds from Europe. In addition, we re-sequenced pigs from commercial breeds to identify potential candidate mutations responsible for phenotypic divergence among these groups of breeds. RESULTS: Our results point out some local breeds with low genetic diversity, whose genome shows a high proportion of regions of homozygosity (>50%) and that harbour a large number of potentially damaging mutations. We also observed a high correlation between genetic diversity estimates using high-density SNP data and Next Generation Sequencing data (r = 0.96 at individual level). The study of non-synonymous SNPs that were fixed in commercial breeds and also in any local breed, but for a different allele, revealed 99 non-synonymous SNPs affecting 65 genes. Candidate mutations that may underlie differences in the adaptation to the environment were exemplified by the genes AZGP1 and TAS2R40. We also observed that highly productive breeds may have lost advantageous genotypes within genes involved in immune response (e.g. IL12RB2 and STAB1), probably as a result of strong artificial selection in intensive pig production systems. CONCLUSIONS: The high correlation between genetic diversity computed with the 60K SNP and whole genome re-sequence data indicates that the Porcine 60K SNP Beadchip provides reliable estimates of genomic diversity in European pig populations despite the expected bias. Moreover, this analysis gave insights into strategies for the genetic characterization of local breeds. 
The comparison between re-sequenced local pigs and re-sequenced commercial pigs made it possible to report candidate mutations that may be responsible for phenotypic divergence among those groups of breeds. This study highlights the importance of low-input breeds as a valuable genetic reservoir for the pig production industry. However, the high levels of ROHs, inbreeding and potentially damaging mutations emphasize the importance of the genetic characterization of local breeds to preserve their genomic variability.