ABSTRACT
Efforts to conserve marine mammals are often constrained by uncertainty over their population history. Here, we examine the evolutionary history of a harbour seal (Phoca vitulina) population in the Moray Firth, northeast Scotland using genetic tools and microsatellite markers to explore population change. Previous fine-scale analysis of UK harbour seal populations revealed three clusters in the UK, with a northeastern cluster that included our Moray Firth study population. Our analysis revealed that the Moray Firth cluster is an independent genetic group, with similar levels of genetic diversity across each of the localities sampled. These samples were used to assess historic abundance and demographic events in the Moray Firth population. Estimates of current genetic diversity and effective population size were low, but the results indicated that this population has remained at broadly similar levels following the population bottleneck that occurred after post-glacial recolonization of the area.
ABSTRACT
Understanding and predicting population abundance is a major challenge confronting scientists. Several genetic models have been developed using microsatellite markers to estimate the present and ancestral effective population sizes. However, obtaining an overview of the evolution of a population requires that past fluctuations in population size be traceable. To address this question, we developed a new model that estimates past changes in effective population size from microsatellite data, based on coalescence theory and using approximate likelihoods in a Monte Carlo Markov Chain approach. The efficiency of the model and its sensitivity to gene flow and to assumptions on the mutational process were checked using simulated data and analysis. The model was found especially useful for providing evidence of transient changes in population size in the past. Considering the distribution of coalescence times, we discuss the times beyond which past demographic events are too ancient to be detected and the risk that gene flow may lead to the false detection of a bottleneck. The method was applied to real data sets from several Atlantic salmon populations. The method, called VarEff (Variation of Effective size), is implemented in the R package VarEff and is available at https://qgsp.jouy.inra.fr and at http://cran.r-project.org/web/packages/VarEff.
ABSTRACT
The relationship between pairs of individuals is an important topic in many areas of population and quantitative genetics. It is usually measured as the proportion of the genome identical by descent shared by the pair, and it can be inferred from pedigree information. However, actual relationships vary as a consequence of Mendelian sampling, and a general formula for this variance has not been developed. The goal of this work is to develop this general formula for the one-locus situation. We provide simple expressions for the variances and covariances of all actual relationships in an arbitrarily complex pedigree. The proposed method relies on the use of the nine identity coefficients and the generalized relationship coefficients; formulas have been checked by computer simulation. Finally, two examples are given, for a short pedigree of dogs and a long pedigree of sheep.
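As a small illustration of the variance caused by Mendelian sampling at one locus, the sketch below enumerates the simplest case only: two full sibs whose parents are assumed unrelated and non-inbred. It is not the general formula of the paper; the function name and the enumeration approach are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def full_sib_relationship_distribution():
    """Enumerate one-locus Mendelian sampling for two full sibs.

    Assumption: parents are unrelated and non-inbred, so each sib receives
    allele 0 or 1 from the father and allele 0 or 1 from the mother, each
    with probability 1/2. The actual relationship at the locus is the number
    of parental origins shared identical by descent, divided by 2.
    """
    dist = {}
    # (paternal allele sib 1, maternal allele sib 1,
    #  paternal allele sib 2, maternal allele sib 2)
    for p1, m1, p2, m2 in product((0, 1), repeat=4):
        shared = (p1 == p2) + (m1 == m2)
        r = Fraction(shared, 2)
        dist[r] = dist.get(r, Fraction(0)) + Fraction(1, 16)
    return dist


dist = full_sib_relationship_distribution()
mean = sum(r * p for r, p in dist.items())
var = sum((r - mean) ** 2 * p for r, p in dist.items())
print(mean, var)  # 1/2 1/8
```

The pedigree expectation is 1/2, but the actual one-locus relationship takes the values 0, 1/2 and 1, which gives a variance of 1/8 in this simple case.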
Subjects
Chromosome Mapping, Family, Genetic Variation, Humans, Genetic Models
ABSTRACT
Detecting genetic signatures of selection is of great interest for many research issues. Common approaches to separate selective from neutral processes focus on the variance of F(ST) across loci, as does the original Lewontin and Krakauer (LK) test. Modern developments aim to minimize the false positive rate and to increase the power, by accounting for complex demographic structures. Another stimulating goal is to develop straightforward parametric and computationally tractable tests to deal with massive SNP data sets. Here, we propose an extension of the original LK statistic (T(LK)), named T(F-LK), that uses a phylogenetic estimation of the populations' kinship (F) matrix, thus accounting for historical branching and heterogeneity of genetic drift. Using forward simulations of single-nucleotide polymorphism (SNP) data under neutrality and selection, we confirm the relative robustness of the LK statistic (T(LK)) to complex demographic history, but we show that T(F-LK) is more powerful in most cases. This new statistic also outperforms a multinomial-Dirichlet-based model [estimation with Markov chain Monte Carlo (MCMC)] when historical branching occurs. Overall, T(F-LK) detects 15-35% more selected SNPs than T(LK) for low type I errors (P < 0.001). Also, simulations show that T(LK) and T(F-LK) follow a chi-square distribution provided the ancestral allele frequencies are not too extreme, suggesting the possible use of the chi-square distribution for evaluating significance. The empirical distribution of T(F-LK) can be derived using simulations conditioned on the estimated F matrix. We apply this new test to SNP data from pig breeds and pinpoint outliers using T(F-LK) that are otherwise undetected using the less powerful T(LK) statistic. This new test represents one compromise solution between advanced SNP genetic data acquisition and outlier analyses.
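For orientation, the original LK statistic can be sketched as follows. This is a minimal illustration for biallelic loci, assuming the classical form T(LK) = (n - 1) F(ST) / mean F(ST) with F(ST) computed per locus from the variance of allele frequencies across n populations. It does not implement T(F-LK), which additionally requires the estimated kinship matrix F. The function name and the toy frequencies are hypothetical.

```python
def lk_statistics(freqs):
    """Return the classical T(LK) value for each biallelic locus.

    freqs: list of loci; each locus is a list of allele frequencies,
    one per population (same n populations for every locus).
    Assumption: per-locus F(ST) = s2 / (p_bar * (1 - p_bar)), where s2 is
    the sample variance of the frequencies across populations.
    """
    n = len(freqs[0])
    fst = []
    for p in freqs:
        p_bar = sum(p) / n
        s2 = sum((x - p_bar) ** 2 for x in p) / (n - 1)
        fst.append(s2 / (p_bar * (1 - p_bar)))
    fst_bar = sum(fst) / len(fst)
    # Scale each locus by the mean F(ST) across loci.
    return [(n - 1) * f / fst_bar for f in fst]


# Hypothetical frequencies for 4 loci in 4 populations;
# the third locus is strongly differentiated.
toy_freqs = [
    [0.50, 0.55, 0.45, 0.52],
    [0.30, 0.35, 0.28, 0.33],
    [0.10, 0.90, 0.15, 0.85],
    [0.60, 0.62, 0.58, 0.64],
]
t_lk = lk_statistics(toy_freqs)
outlier = max(range(len(t_lk)), key=lambda i: t_lk[i])
print(outlier)  # index of the most differentiated locus
```

Under neutrality, each value would be compared with a chi-square distribution with n - 1 degrees of freedom, in line with the chi-square behaviour reported above.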
Subjects
Population Genetics/methods, Genetic Selection, Animals, Molecular Evolution, Genetic Drift, Genetic Markers/genetics, Genetic Models, Phylogeny, Single Nucleotide Polymorphism/genetics, Swine/genetics
ABSTRACT
We investigate the effects of past changes of the effective population size on the present allelic diversity at a microsatellite marker locus. We first derive the analytical expression of the generating function of the joint probabilities of the time to the Most Recent Common Ancestor for a pair of alleles and of their distance (the difference in allele size). We give analytical solutions in the case of constant population size and the geometrical mutation model. Otherwise, numerical inversion allows the distributions to be calculated in general cases. The effects of population expansion or decrease and the possibility to detect an ancient bottleneck are discussed. The method is extended to samples of three and four alleles, which allows investigating the covariance structure of the frequencies f(k) of pairs of alleles with a size difference of k motifs, and suggesting some approaches to the estimation of past demography.
Subjects
Alleles, Microsatellite Repeats/genetics, Genetic Models, Mutation
ABSTRACT
General and genetic statistical methods are commonly used to deal with microsatellite data (highly variable neutral genetic markers). In this paper, the self-organizing map (SOM) that belongs to the unsupervised artificial neural networks (ANNs) was applied to analyse the structure of 58 European and two Chinese pig populations (Sus scrofa) including commercial lines, local breeds and cosmopolitan breeds. Results were compared with other unsupervised classification or ordination methods such as factorial correspondence analysis, hierarchical clustering from an allele sharing distance and the Bayesian genetic model and with principal components analysis and neighbour joining from allelic frequencies and genetic distances between populations. Like other methods, SOMs were able to classify individuals according to their breed origin and to visualize similarities between breeds. They provided additional information on the within- and between-population diversity, allowed differences between similar populations to be highlighted and helped differentiate different groups of populations.
Subjects
Genetic Variation, Computer Neural Networks, Sus scrofa/genetics, Animals, China, Cluster Analysis, Europe, Genetic Structures, Population Genetics, Multivariate Analysis, Phylogeny, Species Specificity, Sus scrofa/classification
ABSTRACT
Effective population size (Ne) is an important parameter in the conservation of genetic diversity. Comparative studies of empirical data that gauge the relative accuracy of Ne methods are limited, and a better understanding of the limitations and potential of Ne estimators is needed. This paper investigates genetic diversity and Ne in four populations of wild anadromous Atlantic salmon (Salmo salar L.) in Europe, from the Rivers Oir and Scorff (France) and Spey and Shin (Scotland). We aimed to understand present diversity and historical processes influencing current population structure. Our results showed high genetic diversity for all populations studied, despite their wide range of current effective sizes. To improve understanding of high genetic diversity observed in the populations with low effective size, we developed a model predicting present diversity as a function of past demographic history. This suggested that high genetic diversity could be explained by a bottleneck occurring within recent centuries rather than by gene flow. Previous studies have demonstrated the efficiency of coalescence models to estimate Ne. Using nine subsets from 37 microsatellite DNA markers from the four salmon populations, we compared three coalescence estimators based on single and dual samples. Comparing Ne estimates confirmed the efficiency of increasing the number and variability of microsatellite markers. This efficiency was more accentuated for the smaller populations. Analysis with low numbers of neutral markers revealed uneven distributions of allelic frequencies and overestimated short-term Ne. In addition, we found evidence of artificial stock enhancement using native and non-native origin. We propose estimates of Ne for the four populations, and their applications for salmon conservation and management are discussed.
Subjects
Genetic Variation, Salmo salar/genetics, Animal Migration, Animals, Population Genetics, Population Density
ABSTRACT
The advent of fully sequenced genomes opens the ground for the reconstruction of metabolic pathways on the basis of the identification of enzyme-coding genes. Here we describe PRIAM, a method for automated enzyme detection in a fully sequenced genome, based on the classification of enzymes in the ENZYME database. PRIAM relies on sets of position-specific scoring matrices ('profiles') automatically tailored for each ENZYME entry. Automatically generated logical rules define which of these profiles is required in order to infer the presence of the corresponding enzyme in an organism. As an example, PRIAM was applied to identify potential metabolic pathways from the complete genome of the nitrogen-fixing bacterium Sinorhizobium meliloti. The results of this automated method were compared with the original genome annotation and visualised on KEGG graphs in order to facilitate the interpretation of metabolic pathways and to highlight potentially missing enzymes.
Subjects
Computational Biology/methods, Enzymes/genetics, Genome, Software, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Genetic Databases, Protein Databases, Enzymes/metabolism, Bacterial Genome, Sensitivity and Specificity, Sinorhizobium meliloti/genetics, Sinorhizobium meliloti/metabolism, Substrate Specificity
ABSTRACT
Maximum-likelihood and Bayesian (MCMC algorithm) estimates of the increase of the Wright-Malécot inbreeding coefficient, F(t), between two temporally spaced samples, were developed from the Dirichlet approximation of allelic frequency distribution (model MD) and from the admixture of the Dirichlet approximation and the probabilities of fixation and loss of alleles (model MDL). Their accuracy was tested using computer simulations in which F(t) = 10% or less. The maximum-likelihood method based on the model MDL was found to be the best estimate of F(t) provided that initial frequencies are known exactly. When founder frequencies are estimated from a limited set of founder animals, only the estimates based on the model MD can be used for the moment. In this case no method was found to be the best in all situations investigated. The likelihood and Bayesian approaches give better results than the classical F-statistics when markers exhibiting a low polymorphism (such as the SNP markers) are used. Concerning the estimations of the effective population size all the new estimates presented here were found to be better than the F-statistics classically used.
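The link between the increase in inbreeding F(t) and the effective population size can be sketched with the standard idealised relationship F(t) = 1 - (1 - 1/(2Ne))^t. This is an assumption made here for illustration (discrete generations, constant Ne); it is not the MD or MDL estimator described above, and the function names are hypothetical.

```python
def inbreeding_increase(ne, t):
    """Expected F(t) after t generations at constant effective size ne.

    Assumption: idealised population with discrete generations,
    F(t) = 1 - (1 - 1/(2*ne))**t.
    """
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** t


def effective_size(f_t, t):
    """Invert the relationship above to recover ne from F(t) and t."""
    return 1.0 / (2.0 * (1.0 - (1.0 - f_t) ** (1.0 / t)))


# Round trip with hypothetical values: ne = 50 over 10 generations.
f_10 = inbreeding_increase(50, 10)
ne_back = effective_size(f_10, 10)
print(f_10, ne_back)
```

With these hypothetical values F(t) stays below 10%, the range explored in the simulations mentioned above.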
Subjects
Gene Frequency, Population Genetics, Inbreeding, Genetic Models, Population Density, Likelihood Functions, Markov Chains, Microsatellite Repeats/genetics, Monte Carlo Method
ABSTRACT
A quantitative trait locus (QTL) analysis of carcass composition data from a three-generation experimental cross between Meishan (MS) and Large White (LW) pig breeds is presented. A total of 488 F2 males born from six F1 boars and 23 F1 sows, the progeny of six LW boars and six MS sows, were slaughtered at approximately 80 kg live weight and were subjected to a standardised cutting of the carcass. Fifteen traits were analysed: dressing percentage; loin, ham, shoulder, belly, backfat, leaf fat, feet and head weights; two backfat thickness measurements and one muscle depth measurement; ham + loin and back + leaf fat percentages; and estimated carcass lean content. Animals were typed for a total of 137 markers covering the entire porcine genome. Analyses were performed using a line-cross (LC) regression method, where founder lines were assumed to be fixed for different QTL alleles, and a half/full sib (HFS) maximum likelihood method, where allele substitution effects were estimated within each half-/full-sib family. Additional analyses were performed to search for multiple linked QTL and imprinting effects. Significant gene effects were detected for both leanness and fatness traits in the telomeric regions of SSC 1q and SSC 2p, on SSC 4, SSC 7 and SSC X. Additional significant QTL were identified for ham weight on SSC 5, for head weight on SSC 1 and SSC 7, for feet weight on SSC 7 and for dressing percentage on SSC X. LW alleles were associated with a higher lean content and a lower fat content of the carcass, except for the fatness trait on SSC 7. Suggestive evidence of linked QTL on SSC 7 and of imprinting effects on SSC 6, SSC 7, SSC 9 and SSC 17 was also obtained.
Subjects
Body Composition/genetics, Meat, Quantitative Trait Loci, Animals, Chromosome Mapping, Genetic Crosses, Female, Male, Swine/genetics, Swine/metabolism
ABSTRACT
Many works demonstrate the benefits of using highly polymorphic markers such as microsatellites to measure the genetic diversity between closely related breeds. However, it is sometimes difficult to decide which genetic distance should be used. In this paper we review the behaviour of the main distances encountered in the literature under various divergence models. In the first part, we consider that breeds are populations in which the assumption of equilibrium between drift and mutation is verified. In this case some interesting distances can be expressed as a function of divergence time, t, and can therefore be used to construct phylogenies. Distances based on allele size distribution (such as (δμ)² and derived distances), which specifically take into account a mutation model of microsatellites, the Stepwise Mutation Model, exhibit large variance and therefore should not be used to accurately infer the phylogeny of closely related breeds. In the last section, we consider that breeds are small populations and that the divergence times between them are too short for the observed diversity to be due to mutations: divergence is mainly due to genetic drift. Expectation and variance of distances were calculated as a function of the Wright-Malécot inbreeding coefficient, F. Computer simulations performed under this divergence model show that the Reynolds distance [57] is the best method for very closely related breeds.
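A drift-based distance of this kind can be sketched from allele frequencies alone. The example below assumes one commonly quoted large-sample form of the Reynolds distance, D = -ln(1 - θ), with θ taken as the sum over loci of squared frequency differences divided by twice the sum over loci of (1 - Σ x·y). Sample-size corrections are ignored, and the function name and toy frequencies are hypothetical.

```python
import math


def reynolds_distance(x, y):
    """Reynolds-type drift distance between two populations.

    x, y: lists of loci; each locus is a list of allele frequencies
    (same alleles in the same order in both populations).
    Assumption: large-sample, unweighted estimator of theta without
    sample-size correction; D = -ln(1 - theta).
    """
    num = 0.0
    den = 0.0
    for xl, yl in zip(x, y):
        num += sum((a - b) ** 2 for a, b in zip(xl, yl))
        den += 1.0 - sum(a * b for a, b in zip(xl, yl))
    theta = num / (2.0 * den)
    return -math.log(1.0 - theta)


# Hypothetical frequencies at two microsatellite loci (3 alleles each).
breed_a = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]
breed_b = [[0.4, 0.4, 0.2], [0.5, 0.3, 0.2]]
d_ab = reynolds_distance(breed_a, breed_b)
d_aa = reynolds_distance(breed_a, breed_a)
print(d_ab, d_aa)
```

Under pure drift, θ estimates the inbreeding coefficient F accumulated since divergence, so the distance grows with F rather than with mutation.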