Results 1 - 20 of 41
1.
Mol Biol Evol ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38980178

ABSTRACT

The role of balancing selection is a long-standing evolutionary puzzle. Balancing selection is a crucial evolutionary process that maintains genetic variation (polymorphism) over extended periods of time; however, detecting it poses a significant challenge. Building upon the polymorphism-aware phylogenetic models (PoMos) framework rooted in the Moran model, we introduce the PoMoBalance model. This novel approach is designed to disentangle the interplay of mutation, genetic drift, and directional selection (GC-biased gene conversion), along with the previously unexplored balancing selection pressures on ultra-long timescales comparable with species divergence times, by analysing multi-individual genomic and phylogenetic divergence data. Implemented in the open-source RevBayes Bayesian framework, PoMoBalance offers a versatile tool for inferring phylogenetic trees as well as quantifying various selective pressures. The novel aspect of our approach in studying balancing selection lies in PoMos' ability to account for ancestral polymorphisms and incorporate parameters that measure frequency-dependent selection, allowing us to determine the strength of the effect and exact frequencies under selection. We implemented validation tests and assessed the model on data simulated with SLiM and a custom Moran model simulator. Real sequence analysis of Drosophila populations reveals insights into the evolutionary dynamics of regions subject to frequency-dependent balancing selection, particularly in the context of sex-limited colour dimorphism in Drosophila erecta.

2.
Mol Biol Evol ; 41(5), 2024 May 03.
Article in English | MEDLINE | ID: mdl-38667829

ABSTRACT

Different frequencies amongst codons that encode the same amino acid (i.e. synonymous codons) have been observed in multiple species. Studies focused on uncovering the forces that drive such codon usage showed that a combined effect of mutational biases and translational selection works to produce different frequencies of synonymous codons. However, only a few have been able to measure and distinguish between these forces, which may leave similar traces on the coding regions. Here, we have developed a codon model that allows the disentangling of mutation, selection on amino acids and synonymous codons, and GC-biased gene conversion (gBGC), which we applied to an extensive dataset of 415 chordates and 191 arthropods. We found that chordates need 15 more synonymous codon categories than arthropods to explain the empirical codon frequencies, which suggests that the extent of codon usage can vary greatly between animal phyla. Moreover, methylation at CpG sites seems to partially explain these patterns of codon usage in chordates but not in arthropods. Despite the differences between the two phyla, our findings demonstrate that in both, GC-rich codons are disfavored when mutations are GC-biased, and the opposite is true when mutations are AT-biased. This indicates that selection on the genomic coding regions might act primarily to stabilize their GC/AT content on a genome-wide level. Our study shows that the degree of synonymous codon usage varies considerably among animals, but is likely governed by a common underlying dynamic.


Subject(s)
Arthropods, Codon Usage, Genetic Selection, Animals, Arthropods/genetics, Chordata/genetics, Mutation, Molecular Evolution, Codon, Genetic Models, Base Composition, Gene Conversion
3.
BMC Bioinformatics ; 25(1): 151, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627634

ABSTRACT

BACKGROUND: Genomes are inherently inhomogeneous, with features such as base composition, recombination, gene density, and gene expression varying along chromosomes. Evolutionary, biological, and biomedical analyses aim to quantify this variation, account for it during inference procedures, and ultimately determine the causal processes behind it. Since sequential observations along chromosomes are not independent, it is unsurprising that autocorrelation patterns have been observed, e.g., in human base composition. In this article, we develop a class of Hidden Markov Models (HMMs) called oHMMed (ordered HMM with emission densities; the corresponding R package of the same name is available on CRAN). They identify the number of comparably homogeneous regions within autocorrelated observed sequences. These are modelled as discrete hidden states; the observed data points are realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting only neighbouring states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are inferred. RESULTS: We apply our oHMMed algorithms to the proportion of G and C bases (modelled as a mixture of normal distributions) and the number of genes (modelled as a mixture of Poisson-gamma distributions) in windows along the human, mouse, and fruit fly genomes. This results in a partitioning of the genomes into regions by statistically distinguishable averages of these features, and in a characterisation of their continuous patterns of variation. In regard to the genomic G and C proportion, this latter result distinguishes oHMMed from segmentation algorithms based on isochore or compositional domain theory. We further use oHMMed to conduct a detailed analysis of variation of chromatin accessibility (ATAC-seq) and epigenetic markers H3K27ac and H3K27me3 (modelled as a mixture of Poisson-gamma distributions) along human chromosome 1, and of their correlations. CONCLUSIONS: Our algorithms provide a biologically assumption-free approach to characterising genomic landscapes shaped by continuous, autocorrelated patterns of variation. Despite this, the resulting genome segmentation enables extraction of compositionally distinct regions for further downstream analyses.


Subject(s)
Genome, Genomics, Animals, Humans, Mice, Markov Chains, Base Composition, Probability, Algorithms
4.
J Evol Biol ; 36(1): 29-44, 2023 01.
Article in English | MEDLINE | ID: mdl-36544394

ABSTRACT

For over a decade, experimental evolution has been combined with high-throughput sequencing techniques. In so-called Evolve-and-Resequence (E&R) experiments, populations are kept in the laboratory under controlled experimental conditions where their genomes are sampled and allele frequencies monitored. However, identifying signatures of adaptation in E&R datasets is far from trivial, and it is still necessary to develop more efficient and statistically sound methods for detecting selection in genome-wide data. Here, we present Bait-ER - a fully Bayesian approach based on the Moran model of allele evolution to estimate selection coefficients from E&R experiments. The model has overlapping generations, a feature that describes several experimental designs found in the literature. We tested our method under several different demographic and experimental conditions to assess its accuracy and precision, and it performs well in most scenarios. Nevertheless, some care must be taken when analysing trajectories where drift largely dominates and starting frequencies are low. We compare our method with other available software and report that ours has generally high accuracy even for trajectories whose complexity goes beyond a classical sweep model. Furthermore, our approach avoids the computational burden of simulating an empirical null distribution, outperforming available software in terms of computational time and facilitating its use on genome-wide data. We implemented and released our method in a new open-source software package that can be accessed at https://doi.org/10.5281/zenodo.7351736.


Subject(s)
Genetic Selection, Software, Bayes Theorem, Gene Frequency, Physiological Adaptation
5.
Mol Biol Evol ; 36(6): 1294-1301, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30825307

ABSTRACT

Molecular phylogenetics has neglected polymorphisms within present and ancestral populations for a long time. Recently, multispecies coalescent based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents handling of gene trees and directly infers species trees from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and assessment of branch support values. We exemplify an analysis of a hundred species with ten haploid individuals each, showing that PoMo can perform inference on large data sets. While PoMo is more accurate than standard substitution models applied to concatenated alignments, it is almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.


Subject(s)
Genetic Models, Mutation Rate, Phylogeny, Genetic Polymorphism, Animals, Humans, Likelihood Functions, Software
6.
J Theor Biol ; 486: 110074, 2020 02 07.
Article in English | MEDLINE | ID: mdl-31711991

ABSTRACT

Polymorphism-aware phylogenetic models (PoMo) constitute an alternative approach for species tree estimation from genome-wide data. PoMo builds on the standard substitution models of DNA evolution but expands the classic alphabet of the four nucleotide bases to include polymorphic states. By doing so, PoMo accounts for ancestral and current intra-population variation, while also accommodating population-level processes ruling the substitution process (e.g. genetic drift, mutations, allelic selection). PoMo has been shown to be a valuable tool in several phylogenetic applications, but a proof of statistical consistency (and identifiability, a necessary condition for consistency) is lacking. Here, we prove that PoMo is identifiable and, using this result, we further show that the maximum a posteriori (MAP) tree estimator of PoMo is a consistent estimator of the species tree. We complement our theoretical results with a simulated data set mimicking the diversity observed in natural populations exhibiting incomplete lineage sorting. We implemented PoMo in a Bayesian framework and show that the MAP tree easily recovers the true tree for typical numbers of sites that are sampled in genome-wide analyses.


Subject(s)
Genome-Wide Association Study, Genetic Models, Bayes Theorem, Molecular Evolution, Phylogeny, Genetic Polymorphism
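
The alphabet expansion described in the abstract of this record can be illustrated with a small sketch. It assumes a virtual population of size N in which each polymorphic state carries exactly two alleles, giving 4 fixed states plus 6(N - 1) polymorphic ones. The function name `pomo_states` and the tuple encoding are hypothetical, chosen for illustration only, and are not taken from any PoMo implementation.

```python
from itertools import combinations

NUCLEOTIDES = "ACGT"


def pomo_states(n):
    """Enumerate PoMo-style states for a virtual population of size n.

    Hypothetical encoding (an assumption of this sketch):
      (base,)               -> all n individuals carry `base` (fixed state)
      (base_i, k, base_j)   -> k copies of base_i and n - k copies of base_j
    """
    if n < 2:
        raise ValueError("n must be at least 2 to allow polymorphic states")
    # Four monomorphic states, one per nucleotide.
    fixed = [(base,) for base in NUCLEOTIDES]
    # For each of the 6 unordered allele pairs, n - 1 intermediate frequencies.
    polymorphic = [
        (a, k, b)
        for a, b in combinations(NUCLEOTIDES, 2)
        for k in range(1, n)
    ]
    return fixed + polymorphic


states = pomo_states(10)
print(len(states))  # 4 fixed + 6 * 9 polymorphic states
```

Under these assumptions the state space grows only linearly with N, which is one way to see why a substitution-model style likelihood stays tractable as the alphabet is expanded.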
7.
J Theor Biol ; 439: 166-180, 2018 02 14.
Article in English | MEDLINE | ID: mdl-29229523

ABSTRACT

A central aim of population genetics is the inference of the evolutionary history of a population. To this end, the underlying process can be represented by a model of the evolution of allele frequencies parametrized by e.g., the population size, mutation rates and selection coefficients. A large class of models use forward-in-time models, such as the discrete Wright-Fisher and Moran models and the continuous forward diffusion, to obtain distributions of population allele frequencies, conditional on an ancestral initial allele frequency distribution. Backward-in-time diffusion processes have been rarely used in the context of parameter inference. Here, we demonstrate how forward and backward diffusion processes can be combined to efficiently calculate the exact joint probability distribution of sample and population allele frequencies at all times in the past, for both discrete and continuous population genetics models. This procedure is analogous to the forward-backward algorithm of hidden Markov models. While the efficiency of discrete models is limited by the population size, for continuous models it suffices to expand the transition density in orthogonal polynomials of the order of the sample size to infer marginal likelihoods of population genetic parameters. Additionally, conditional allele trajectories and marginal likelihoods of samples from single populations or from multiple populations that split in the past can be obtained. The described approaches allow for efficient maximum likelihood inference of population genetic parameters in a wide variety of demographic scenarios.


Subject(s)
Population Genetics/methods, Genetic Models, Algorithms, Biological Evolution, Gene Frequency, Likelihood Functions, Markov Chains, Methods, Population Density, Time
8.
Stat Appl Genet Mol Biol ; 16(5-6): 387-405, 2017 11 27.
Article in English | MEDLINE | ID: mdl-29095700

ABSTRACT

In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods became popular. However, these methods such as Approximate Bayesian Computation (ABC) can be inefficient in high-dimensional problems. This led to the development of more sophisticated iterative estimation methods like particle filters. Here, we propose an alternative approach that is based on stochastic approximation. By moving along a simulated gradient or ascent direction, the algorithm produces a sequence of estimates that eventually converges to the maximum likelihood estimate, given a set of observed summary statistics. This strategy does not sample much from low-likelihood regions of the parameter space, and is fast, even when many summary statistics are involved. We put considerable efforts into providing tuning guidelines that improve the robustness and lead to good performance on problems with high-dimensional summary statistics and a low signal-to-noise ratio. We then investigate the performance of our resulting approach and study its properties in simulations. Finally, we re-estimate parameters describing the demographic history of Bornean and Sumatran orang-utans.


Subject(s)
Population Genetics/methods, Likelihood Functions, Genetic Models, Algorithms, Bayes Theorem, Computer Simulation, Molecular Evolution
9.
Mol Ecol ; 26(14): 3649-3662, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28370647

ABSTRACT

The orchid family is the largest in the angiosperms, but little is known about the molecular basis of the significant variation they exhibit. We investigate here the transcriptomic divergence between two European terrestrial orchids, Dactylorhiza incarnata and Dactylorhiza fuchsii, and integrate these results in the context of their distinct ecologies that we also document. Clear signals of lineage-specific adaptive evolution of protein-coding sequences are identified, notably targeting elements of biotic defence, including both physical and chemical adaptations in the context of divergent pools of pathogens and herbivores. In turn, a substantial regulatory divergence between the two species appears linked to adaptation/acclimation to abiotic conditions. Several of the pathways affected by differential expression are also targeted by deviating post-transcriptional regulation via sRNAs. Finally, D. incarnata appears to suffer from insufficient sRNA control over the activity of RNA-dependent DNA polymerase, resulting in increased activity of class I transposable elements and, over time, in larger genome size than that of D. fuchsii. The extensive molecular divergence between the two species suggests significant genomic and transcriptomic shock in their hybrids and offers insights into the difficulty of coexistence at the homoploid level. Altogether, biological response to selection, accumulated during the history of these orchids, appears governed by their microenvironmental context, in which biotic and abiotic pressures act synergistically to shape transcriptome structure, expression and regulation.


Subject(s)
Biological Adaptation/genetics, Biological Evolution, Orchidaceae/classification, Transcriptome, DNA Transposable Elements, Ecology, Environment, Plant Genome, Genomics
10.
J Gen Virol ; 97(9): 2323-2332, 2016 09.
Article in English | MEDLINE | ID: mdl-27267884

ABSTRACT

Complete genomes of eight reference strains representing different serotypes within the species Fowl aviadenovirus D (FAdV-D) and Fowl aviadenovirus E (FAdV-E) were sequenced. The sequenced genomes of FAdV-D and FAdV-E members comprise 43 287 to 44 336 bp, and have a gene organization identical to that of an earlier sequenced FAdV-D member (strain A-2A). Highest diversity was noticed in the hexon and fiber genes and ORF19. All genomes sequenced in this study contain one fiber gene. Phylogenetic analyses and G+C content support the division of the genus Aviadenovirus into the currently recognized species. Our data also suggest that strain SR48 should be considered as FAdV-11 instead of FAdV-2 and similarly strain HG as FAdV-8b. The present results complete the list of genome sequences of reference strains representing all serotypes in species FAdV-D and FAdV-E.


Subject(s)
Aviadenovirus/classification, Aviadenovirus/genetics, Genetic Variation, Base Composition, Capsid Proteins/genetics, Cluster Analysis, Viral DNA/chemistry, Viral DNA/genetics, Gene Order, Viral Genome, Phylogeny, DNA Sequence Analysis, Sequence Homology
11.
Bioinformatics ; 31(11): 1762-70, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-25614471

ABSTRACT

MOTIVATION: Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments not only use HTS to measure genomic features at one time point but also monitor them changing over time with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analyzing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth. RESULTS: We present the beta-binomial Gaussian process model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as the proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine it with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present simulations exploring different experimental design choices and results on real data from a Drosophila experimental evolution experiment on temperature adaptation. AVAILABILITY AND IMPLEMENTATION: R software implementing the test is available at https://github.com/handetopa/BBGP.


Subject(s)
Molecular Evolution, Gene Frequency, High-Throughput Nucleotide Sequencing/methods, Alleles, Animals, Drosophila/genetics, Genomics/methods, Statistical Models, Normal Distribution, Single Nucleotide Polymorphism, Software
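
To illustrate how a beta-binomial distribution can represent the uncertainty from finite sequencing depth mentioned in the abstract of this record, the following is a minimal sketch of the textbook beta-binomial log-probability. It is not the BBGP package's code: the function names and the parameters `alpha` and `beta` are assumptions introduced here, and the sketch leaves out the Gaussian process component entirely.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_logpmf(k, n, alpha, beta):
    """Log-probability of k alternative-allele reads out of n total reads.

    The unknown allele proportion is integrated out under a
    Beta(alpha, beta) distribution (generic textbook form; the
    parameterisation is an assumption of this sketch).
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


# Example: 12 alternative reads at depth 40 under a flat Beta(1, 1) distribution.
print(exp(beta_binomial_logpmf(12, 40, 1.0, 1.0)))
```

With a flat Beta(1, 1) distribution every read count from 0 to n is equally likely, which makes a convenient sanity check; a more concentrated Beta distribution expresses greater certainty about the underlying allele frequency.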
12.
Syst Biol ; 64(6): 1018-31, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26209413

ABSTRACT

Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches.


Subject(s)
Classification/methods, Computer Simulation, Gene Frequency, Phylogeny, Animals, Hominidae/classification, Hominidae/genetics, Mutation, Genetic Polymorphism
13.
J Theor Biol ; 407: 362-370, 2016 10 21.
Article in English | MEDLINE | ID: mdl-27480613

ABSTRACT

We present a reversible Polymorphism-Aware Phylogenetic Model (revPoMo) for species tree estimation from genome-wide data. revPoMo enables the reconstruction of large scale species trees for many within-species samples. It expands the alphabet of DNA substitution models to include polymorphic states, thereby, naturally accounting for incomplete lineage sorting. We implemented revPoMo in the maximum likelihood software IQ-TREE. A simulation study and an application to great apes data show that the runtimes of our approach and standard substitution models are comparable but that revPoMo has much better accuracy in estimating trees, divergence times and mutation rates. The advantage of revPoMo is that an increase of sample size per species improves estimations but does not increase runtime. Therefore, revPoMo is a valuable tool with several applications, from speciation dating to species tree reconstruction.


Subject(s)
Genetic Models, Phylogeny, Genetic Polymorphism, Animals, Computer Simulation, Diffusion, Hominidae/genetics, Species Specificity
14.
Mol Biol Evol ; 30(3): 725-36, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23188590

ABSTRACT

Empirical codon models (ECMs) estimated from a large number of globular protein families outperformed mechanistic codon models in their description of the general process of protein evolution. Among other factors, ECMs implicitly model the influence of amino acid properties and multiple nucleotide substitutions (MNS). However, the estimation of ECMs requires large quantities of data, and until recently, only few suitable data sets were available. Here, we take advantage of several new Drosophila species genomes to estimate codon models from genome-wide data. The availability of large numbers of genomes over varying phylogenetic depths in the Drosophila genus allows us to explore various divergence levels. In consequence, we can use these data to determine the appropriate level of divergence for the estimation of ECMs, avoiding overestimation of MNS rates caused by saturation. To account for variation in evolutionary rates along the genome, we develop new empirical codon hidden Markov models (ecHMMs). These models significantly outperform previous ones with respect to maximum likelihood values, suggesting that they provide a better fit to the evolutionary process. Using ECMs and ecHMMs derived from genome-wide data sets, we devise new likelihood ratio tests (LRTs) of positive selection. We found classical LRTs very sensitive to the presence of MNSs, showing high false-positive rates, especially with small phylogenies. The new LRTs are more conservative than the classical ones, having acceptable false-positive rates and reduced power.


Subject(s)
Codon/genetics, Drosophila/genetics, Genetic Models, Algorithms, Animals, Computer Simulation, Molecular Evolution, Genetic Speciation, Insect Genome, Likelihood Functions, Markov Chains, Mutation Rate, Open Reading Frames, Phylogeny, Genetic Selection
15.
Mol Biol Evol ; 30(10): 2249-62, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23906727

ABSTRACT

The genomes of related species contain valuable information on the history of the considered taxa. Great apes in particular exhibit variation of evolutionary patterns along their genomes. However, the great ape data also bring new challenges, such as the presence of incomplete lineage sorting and ancestral shared polymorphisms. Previous methods for genome-scale analysis are restricted to very few individuals or cannot disentangle the contribution of mutation rates and fixation biases. This represents a limitation both for the understanding of these forces as well as for the detection of regions affected by selection. Here, we present a new model designed to estimate mutation rates and fixation biases from genetic variation within and between species. We relax the assumption of instantaneous substitutions, modeling substitutions as mutational events followed by a gradual fixation. Hence, we straightforwardly account for shared ancestral polymorphisms and incomplete lineage sorting. We analyze genome-wide synonymous site alignments of human, chimpanzee, and two orangutan species. From each taxon, we include data from several individuals. We estimate mutation rates and GC-biased gene conversion intensity. We find that both mutation rates and biased gene conversion vary with GC content. We also find lineage-specific differences, with weaker fixation biases in orangutan species, suggesting a reduced historical effective population size. Finally, our results are consistent with directional selection acting on coding sequences in relation to exonic splicing enhancers.


Subject(s)
Molecular Evolution, Human Genome, Genome, Mutation Rate, Pan troglodytes/genetics, Genetic Polymorphism, Pongo/genetics, Animals, Base Composition, Exome, Gene Conversion, Genetic Variation, Humans, Markov Chains, Genetic Models, Mutation, Phylogeny, Genetic Selection
16.
J Gen Virol ; 95(Pt 1): 156-170, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24077297

ABSTRACT

There are eight species established for aviadenoviruses: Fowl adenovirus A-E, Goose adenovirus A, Falcon adenovirus A and Turkey adenovirus B. The aim of this study was to sequence and analyse the complete genomes of turkey adenovirus 4 (TAdV-4) and TAdV-5 (strain 1277BT) in addition to almost two-thirds of the genome of another TAdV-5 strain (strain D1648). By applying next-generation sequencing, the full genomes were found to be 42 940 and 43 686 bp and the G+C content was 48.5 and 51.6 mol% for TAdV-4 and TAdV-5, respectively. One fiber gene was identified in TAdV-4, whereas two fiber genes were found in TAdV-5. The genome organization of TAdV-4 resembled that of fowl adenovirus 5 (FAdV-5), but it had ORF1C near the left end of the genome. TAdV-4 also had five 123 bp tandem repeats followed by five 33 bp tandem repeats, but they occurred before and not after ORF8, as in several fowl adenoviruses. The genome organization of TAdV-5 was almost the same as that of FAdV-1 but with a possible difference in the splicing pattern of ORF11 and ORF26. Phylogenetic analyses and G+C content showed differences that seem to merit the establishment of two new species within the genus Aviadenovirus: Turkey adenovirus C (for TAdV-4) and Turkey adenovirus D (for TAdV-5). Our analyses suggest a common evolutionary origin of TAdV-5 and FAdV-1.


Subject(s)
Adenoviridae Infections/veterinary, Aviadenovirus/isolation & purification, Viral Genome, Poultry Diseases/virology, Adenoviridae Infections/virology, Amino Acid Sequence, Animals, Aviadenovirus/classification, Aviadenovirus/genetics, Base Sequence, Molecular Evolution, Molecular Sequence Data, Open Reading Frames, Phylogeny, Turkeys, Viral Proteins/genetics
17.
Genome Biol Evol ; 15(7), 2023 07 03.
Article in English | MEDLINE | ID: mdl-37341535

ABSTRACT

Experimental evolution studies are powerful approaches to examine the evolutionary history of lab populations. Such studies have shed light on how selection changes phenotypes and genotypes. Most of these studies have not examined the time course of adaptation under sexual selection manipulation by resequencing the populations' genomes at multiple time points. Here, we analyze allele frequency trajectories in Drosophila pseudoobscura where we altered their sexual selection regime for 200 generations and sequenced pooled populations at 5 time points. The intensity of sexual selection was either relaxed in monogamous populations (M) or elevated in polyandrous lines (E). We present a comprehensive study of how selection alters population genetics parameters at the chromosome and gene level. We investigate differences in the effective population size (Ne) between the treatments, and perform a genome-wide scan to identify signatures of selection from the time-series data. We found genomic signatures of adaptation to both regimes in D. pseudoobscura. There are more significant variants in E lines, as expected from stronger sexual selection. However, we found that the response on the X chromosome was substantial in both treatments, more pronounced in E and restricted to the more recently sex-linked chromosome arm XR in M. In the first generations of experimental evolution, we estimate Ne to be lower on the X in E lines, which might indicate a swift adaptive response at the onset of selection. Additionally, the third chromosome was affected by elevated polyandry, whereby its distal end harbors a region showing a strong signal of adaptive evolution, especially in E lines.


Subject(s)
Drosophila, Sexual Selection, Animals, Drosophila/genetics, Gene Frequency, Population Genetics, Physiological Adaptation/genetics, Genetic Selection, Biological Evolution
18.
Genome Biol Evol ; 14(1), 2022 01 04.
Article in English | MEDLINE | ID: mdl-34983052

ABSTRACT

Despite the importance of natural selection in species' evolutionary history, phylogenetic methods that take into account population-level processes typically ignore selection. The assumption of neutrality is often based on the idea that selection occurs at a minority of loci in the genome and is unlikely to compromise phylogenetic inferences significantly. However, genome-wide processes like GC-bias and some variation segregating at the coding regions are known to evolve in the nearly neutral range. As we are now using genome-wide data to estimate species trees, it is natural to ask whether weak but pervasive selection is likely to blur species tree inferences. We developed a polymorphism-aware phylogenetic model tailored for measuring signatures of nucleotide usage biases to test the impact of selection in the species tree. Our analyses indicate that although the inferred relationships among species are not significantly compromised, the genetic distances are systematically underestimated in a node-height-dependent manner: that is, the deeper nodes tend to be more underestimated than the shallow ones. Such biases have implications for molecular dating. We dated the evolutionary history of 30 worldwide fruit fly populations, and we found signatures of GC-bias considerably affecting the estimated divergence times (up to 23%) in the neutral model. Our findings call for the need to account for selection when quantifying divergence or dating species evolution.


Subject(s)
Codon Usage, Molecular Evolution, Animals, Codon Usage/genetics, Drosophila, Nucleotides, Phylogeny, Genetic Selection
19.
Curr Biol ; 18(12): 883-9, 2008 Jun 24.
Artículo en Inglés | MEDLINE | ID: mdl-18571414

RESUMEN

What evolutionary forces shape genes that contribute to the risk of human disease? Do similar selective pressures act on alleles that underlie simple versus complex disorders [1-3]? Answers to these questions will shed light on the origin of human disorders (e.g., [4]) and help to predict the population frequencies of alleles that contribute to disease risk, with important implications for the efficient design of mapping studies [5-7]. As a first step toward addressing these questions, we created a hand-curated version of the Mendelian Inheritance in Man database (OMIM). We then examined selective pressures on Mendelian-disease genes, genes that contribute to complex-disease risk, and genes known to be essential in mouse by analyzing patterns of human polymorphism and of divergence between human and rhesus macaque. We found that Mendelian-disease genes appear to be under widespread purifying selection, especially when the disease mutations are dominant (rather than recessive). In contrast, the class of genes that influence complex-disease risk shows few signs of evolutionary conservation, possibly because this category includes targets of both purifying and positive selection.


Subject(s)
Factual Databases, Genes/genetics, Inborn Genetic Diseases/genetics, Genetic Predisposition to Disease/genetics, Human Genome, Genetic Selection, Animals, Computational Biology, Humans, Mice, Online Systems, Genetic Polymorphism
20.
PLoS Genet ; 4(8): e1000144, 2008 Aug 01.
Article in English | MEDLINE | ID: mdl-18670650

ABSTRACT

Genome-wide scans for positively selected genes (PSGs) in mammals have provided insight into the dynamics of genome evolution, the genetic basis of differences between species, and the functions of individual genes. However, previous scans have been limited in power and accuracy owing to small numbers of available genomes. Here we present the most comprehensive examination of mammalian PSGs to date, using the six high-coverage genome assemblies now available for eutherian mammals. The increased phylogenetic depth of this dataset results in substantially improved statistical power, and permits several new lineage- and clade-specific tests to be applied. Of approximately 16,500 human genes with high-confidence orthologs in at least two other species, 400 genes showed significant evidence of positive selection (FDR < 0.05), according to a standard likelihood ratio test. An additional 144 genes showed evidence of positive selection on particular lineages or clades. As in previous studies, the identified PSGs were enriched for roles in defense/immunity, chemosensory perception, and reproduction, but enrichments were also evident for more specific functions, such as complement-mediated immunity and taste perception. Several pathways were strongly enriched for PSGs, suggesting possible co-evolution of interacting genes. A novel Bayesian analysis of the possible "selection histories" of each gene indicated that most PSGs have switched multiple times between positive selection and nonselection, suggesting that positive selection is often episodic. A detailed analysis of Affymetrix exon array data indicated that PSGs are expressed at significantly lower levels, and in a more tissue-specific manner, than non-PSGs. Genes that are specifically expressed in the spleen, testes, liver, and breast are significantly enriched for PSGs, but no evidence was found for an enrichment for PSGs among brain-specific genes. This study provides additional evidence for widespread positive selection in mammalian evolution and new genome-wide insights into the functional implications of positive selection.


Subject(s)
Molecular Evolution, Genome, Mammals/genetics, Genetic Selection, Animals, Bayes Theorem, Genetic Databases, Dogs, Gene Expression, Humans, Likelihood Functions, Macaca mulatta, Mammals/classification, Mice, Pan troglodytes, Phylogeny, Primates, Rats, Rodents, Sequence Alignment