1.
Proc Natl Acad Sci U S A ; 120(22): e2220389120, 2023 05 30.
Article in English | MEDLINE | ID: mdl-37216509

ABSTRACT

Phylogenetic comparative methods have long been a mainstay of evolutionary biology, allowing for the study of trait evolution across species while accounting for their common ancestry. These analyses typically assume a single, bifurcating phylogenetic tree describing the shared history among species. However, modern phylogenomic analyses have shown that genomes are often composed of mosaic histories that can disagree both with the species tree and with each other (so-called discordant gene trees). These gene trees describe shared histories that are not captured by the species tree, and that are therefore unaccounted for in classic comparative approaches. The application of standard comparative methods to species histories containing discordance leads to incorrect inferences about the timing, direction, and rate of evolution. Here, we develop two approaches for incorporating gene tree histories into comparative methods: one that constructs an updated phylogenetic variance-covariance matrix from gene trees, and another that applies Felsenstein's pruning algorithm over a set of gene trees to calculate trait histories and likelihoods. Using simulation, we demonstrate that our approaches generate much more accurate estimates of tree-wide rates of trait evolution than standard methods. We apply our methods to two clades of the wild tomato genus Solanum with varying rates of discordance, demonstrating the contribution of gene tree discordance to variation in a set of floral traits. Our approaches have the potential to be applied to a broad range of classic inference problems in phylogenetics, including ancestral state reconstruction and the inference of lineage-specific rate shifts.


Subjects
Algorithms , Software , Phylogeny , Computer Simulation , Probability , Genetic Models
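
Illustrative note on entry 1: the abstract mentions building an updated phylogenetic variance-covariance matrix from gene trees. The sketch below shows one way this could look under Brownian motion, assuming the updated matrix is a frequency-weighted average of per-tree matrices in which each off-diagonal entry is the root-to-common-ancestor path length shared by two tips. The tree encoding, function names, and example weights are hypothetical and are not taken from the paper's software.

```python
from itertools import combinations

def vcv(tree, tips):
    """Brownian-motion covariance matrix for one tree.

    A tree is a tip name (str) or a tuple of (subtree, branch_length) pairs.
    Entry [i][j] is the path length from the root to the most recent
    common ancestor of tips i and j (root-to-tip length on the diagonal).
    """
    index = {name: i for i, name in enumerate(tips)}
    n = len(tips)
    C = [[0.0] * n for _ in range(n)]

    def walk(node, depth):
        if isinstance(node, str):
            C[index[node]][index[node]] = depth
            return [node]
        groups = [walk(child, depth + length) for child, length in node]
        # Tips in different child groups last shared ancestry at this node.
        for g1, g2 in combinations(groups, 2):
            for a in g1:
                for b in g2:
                    C[index[a]][index[b]] = C[index[b]][index[a]] = depth
        return [name for group in groups for name in group]

    walk(tree, 0.0)
    return C

def averaged_vcv(weighted_trees, tips):
    """Weighted average of per-tree matrices (assumed form of the update)."""
    total = sum(weight for weight, _ in weighted_trees)
    n = len(tips)
    avg = [[0.0] * n for _ in range(n)]
    for weight, tree in weighted_trees:
        C = vcv(tree, tips)
        for i in range(n):
            for j in range(n):
                avg[i][j] += (weight / total) * C[i][j]
    return avg

# Hypothetical example: 80% of gene trees match ((A,B),C), 20% are ((B,C),A).
concordant = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))
discordant = (((("B", 1.0), ("C", 1.0)), 1.0), ("A", 2.0))
matrix = averaged_vcv([(0.8, concordant), (0.2, discordant)], ["A", "B", "C"])
```

In this toy case the B-C covariance is nonzero even though B and C are not sister species in the majority tree, which is the kind of shared history a single species tree would miss.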
2.
Syst Biol ; 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38421146

ABSTRACT

Hundreds or thousands of loci are now routinely used in modern phylogenomic studies. Concatenation approaches to tree inference assume that there is a single topology for the entire dataset, but different loci may have different evolutionary histories due to incomplete lineage sorting, introgression, and/or horizontal gene transfer; even single loci may not be treelike due to recombination. To overcome this shortcoming, we introduce an implementation of a multi-tree mixture model that we call MAST. This model extends a prior implementation by Boussau et al. (2009) by allowing users to estimate the weight of each of a set of pre-specified bifurcating trees in a single alignment. The MAST model allows each tree to have its own weight, topology, branch lengths, substitution model, nucleotide or amino acid frequencies, and model of rate heterogeneity across sites. We implemented the MAST model in a maximum-likelihood framework in the popular phylogenetic software, IQ-TREE. Simulations show that we can accurately recover the true model parameters, including branch lengths and tree weights for a given set of tree topologies, under a wide range of biologically realistic scenarios. We also show that we can use standard statistical inference approaches to reject a single-tree model when data are simulated under multiple trees (and vice versa). We applied the MAST model to multiple primate datasets and found that it can recover the signal of incomplete lineage sorting in the Great Apes, as well as the asymmetry in minor trees caused by introgression among several macaque species. When applied to a dataset of four Platyrrhine species for which standard concatenated maximum likelihood and gene tree approaches disagree, we observe that MAST gives the highest weight (i.e. the largest proportion of sites) to the tree also supported by gene tree approaches. 
These results suggest that the MAST model is able to analyse a concatenated alignment using maximum likelihood, while avoiding some of the biases that come with assuming there is only a single tree. We discuss how the MAST model can be extended in the future.
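
Illustrative note on entry 2: a multi-tree mixture model of the kind described above can be read as scoring each alignment site by a weighted sum of its likelihoods under the candidate trees. The sketch below assumes the per-site, per-tree likelihoods have already been computed elsewhere; the function name and the numbers are hypothetical and do not reflect the IQ-TREE implementation.

```python
import math

def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture of trees.

    site_likelihoods[i][j] is the likelihood of site i under tree j;
    weights[j] is the mixture weight of tree j (weights must sum to 1).
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("tree weights must sum to 1")
    total = 0.0
    for per_tree in site_likelihoods:
        # Each site's likelihood is the weighted sum over the trees.
        total += math.log(sum(w * lik for w, lik in zip(weights, per_tree)))
    return total

# Hypothetical values: two sites, two candidate trees with equal weight.
example = mixture_log_likelihood([[0.1, 0.3], [0.2, 0.2]], [0.5, 0.5])
```

In a full analysis the weights would be optimized jointly with branch lengths and substitution parameters; here they are fixed inputs.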

3.
Mol Biol Evol ; 40(5)2023 05 02.
Article in English | MEDLINE | ID: mdl-37158385

ABSTRACT

Despite the increasing abundance of whole transcriptome data, few methods are available to analyze global gene expression across phylogenies. Here, we present a new software package (Computational Analysis of Gene Expression Evolution [CAGEE]) for inferring patterns of increases and decreases in gene expression across a phylogenetic tree, as well as the rate at which these changes occur. In contrast to previous methods that treat each gene independently, CAGEE can calculate genome-wide rates of gene expression, along with ancestral states for each gene. The statistical approach developed here makes it possible to infer lineage-specific shifts in rates of evolution across the genome, in addition to possible differences in rates among multiple tissues sampled from the same species. We demonstrate the accuracy and robustness of our method on simulated data and apply it to a data set of ovule gene expression collected from multiple self-compatible and self-incompatible species in the genus Solanum to test hypotheses about the evolutionary forces acting during mating system shifts. These comparisons allow us to highlight the power of CAGEE, demonstrating its utility for use in any empirical system and for the analysis of most morphological traits. Our software is available at https://github.com/hahnlab/CAGEE/.


Subjects
Gene Expression Profiling , Phylogeny , Software , Solanum , Solanum/classification , Solanum/genetics , Biological Evolution
4.
Trends Genet ; 37(2): 174-187, 2021 02.
Article in English | MEDLINE | ID: mdl-32921510

ABSTRACT

The availability of whole genome sequences was expected to supply essentially unlimited data for phylogenetics. However, strict reliance on single-copy genes for this purpose has drastically limited the amount of data that can be used. Here, we review several approaches for increasing the amount of data used for phylogenetic inference, focusing on methods that allow for the inclusion of duplicated genes (paralogs). Recently developed methods that are robust to high levels of incomplete lineage sorting also appear to be robust to the inclusion of paralogs, suggesting a promising way to take full advantage of genomic data. We discuss the pitfalls of these approaches, as well as further avenues for research.


Subjects
Gene Duplication/genetics , Genome/genetics , Genomics/methods , Phylogeny , Molecular Evolution , Whole Genome Sequencing/methods
5.
Bioinformatics ; 39(9)2023 09 02.
Article in English | MEDLINE | ID: mdl-37669126

ABSTRACT

MOTIVATION: The application of machine learning approaches in phylogenetics has been impeded by the vast model space associated with inference. Supervised machine learning approaches require data from across this space to train models. Because of this, previous approaches have typically been limited to inferring relationships among unrooted quartets of taxa, where there are only three possible topologies. Here, we explore the potential of generative adversarial networks (GANs) to address this limitation. GANs consist of a generator and a discriminator: at each step, the generator aims to create data that is similar to real data, while the discriminator attempts to distinguish generated and real data. By using an evolutionary model as the generator, we use GANs to make evolutionary inferences. Since a new model can be considered at each iteration, heuristic searches of complex model spaces are possible. Thus, GANs offer a potential solution to the challenges of applying machine learning in phylogenetics. RESULTS: We developed phyloGAN, a GAN that infers phylogenetic relationships among species. phyloGAN takes as input a concatenated alignment, or a set of gene alignments, and infers a phylogenetic tree either considering or ignoring gene tree heterogeneity. We explored the performance of phyloGAN for up to 15 taxa in the concatenation case and 6 taxa when considering gene tree heterogeneity. Error rates are relatively low in these simple cases. However, run times are slow and performance metrics suggest issues during training. Future work should explore novel architectures that may result in more stable and efficient GANs for phylogenetics. AVAILABILITY AND IMPLEMENTATION: phyloGAN is available on github: https://github.com/meganlsmith/phyloGAN/.


Subjects
Benchmarking , Biological Evolution , Phylogeny , Genetic Heterogeneity , Machine Learning
6.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36383168

ABSTRACT

MOTIVATION: Site concordance factors (sCFs) have become a widely used way to summarize discordance in phylogenomic datasets. However, the original version of sCFs was calculated by sampling a quartet of tip taxa and then applying parsimony-based criteria for discordance. This approach has the potential to be strongly affected by multiple hits at a site (homoplasy), especially when substitution rates are high or taxa are not closely related. RESULTS: Here, we introduce a new method for calculating sCFs. The updated version uses likelihood to generate probability distributions of ancestral states at internal nodes of the phylogeny. By sampling from the states at internal nodes adjacent to a given branch, this approach substantially reduces, but does not abolish, the effects of homoplasy and taxon sampling. AVAILABILITY AND IMPLEMENTATION: Updated sCFs are implemented in IQ-TREE 2.2.2. The software is freely available at https://github.com/iqtree/iqtree2/releases. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.


Subjects
Software , Phylogeny , Probability
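
Illustrative note on entry 6: the original, parsimony-based calculation described above can be sketched for a single sampled quartet. The code below assumes four aligned sequences, with a and b on one side of the branch and c and d on the other, and counts only parsimony-informative site patterns. The function names and example sequences are hypothetical, and the averaging over many sampled quartets is omitted.

```python
def quartet_site_support(a, b, c, d):
    """Count decisive sites for the split ab|cd in four aligned sequences.

    Returns (concordant, discordant_1, discordant_2): sites supporting
    ab|cd, ac|bd, and ad|bc respectively. Sites with gaps or ambiguity
    codes, and sites that are not parsimony-informative, are skipped.
    """
    concordant = disc1 = disc2 = 0
    for w, x, y, z in zip(a, b, c, d):
        if any(base not in "ACGT" for base in (w, x, y, z)):
            continue
        if w == x and y == z and w != y:
            concordant += 1
        elif w == y and x == z and w != x:
            disc1 += 1
        elif w == z and x == y and w != x:
            disc2 += 1
    return concordant, disc1, disc2

def site_concordance_factor(a, b, c, d):
    """Percentage of decisive sites that support the split ab|cd."""
    conc, d1, d2 = quartet_site_support(a, b, c, d)
    decisive = conc + d1 + d2
    return 100.0 * conc / decisive if decisive else float("nan")

# Hypothetical five-site alignment: two concordant sites, one discordant.
counts = quartet_site_support("AACGT", "AACGA", "GACTT", "GATTA")
scf = site_concordance_factor("AACGT", "AACGA", "GACTT", "GATTA")
```

Because this version reads states directly from tip taxa, repeated substitutions at a site can mimic support for the wrong split, which is the homoplasy problem the likelihood-based update is meant to reduce.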
7.
Mol Phylogenet Evol ; 196: 108066, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38565358

ABSTRACT

Machine learning has increasingly been applied to a wide range of questions in phylogenetic inference. Supervised machine learning approaches that rely on simulated training data have been used to infer tree topologies and branch lengths, to select substitution models, and to perform downstream inferences of introgression and diversification. Here, we review how researchers have used several promising machine learning approaches to make phylogenetic inferences. Despite the promise of these methods, several barriers prevent supervised machine learning from reaching its full potential in phylogenetics. We discuss these barriers and potential paths forward. In the future, we expect that the application of careful network designs and data encodings will allow supervised machine learning to accommodate the complex processes that continue to confound traditional phylogenetic methods.


Subjects
Machine Learning , Phylogeny , Supervised Machine Learning , Genetic Models
8.
Nature ; 553(7686): 77-81, 2018 01 03.
Article in English | MEDLINE | ID: mdl-29300007

ABSTRACT

In contrast to infections with human immunodeficiency virus (HIV) in humans and simian immunodeficiency virus (SIV) in macaques, SIV infection of a natural host, sooty mangabeys (Cercocebus atys), is non-pathogenic despite high viraemia. Here we sequenced and assembled the genome of a captive sooty mangabey. We conducted genome-wide comparative analyses of transcript assemblies from C. atys and AIDS-susceptible species, such as humans and macaques, to identify candidates for host genetic factors that influence susceptibility. We identified several immune-related genes in the genome of C. atys that show substantial sequence divergence from macaques or humans. One of these sequence divergences, a C-terminal frameshift in the toll-like receptor-4 (TLR4) gene of C. atys, is associated with a blunted in vitro response to TLR-4 ligands. In addition, we found a major structural change in exons 3-4 of the immune-regulatory protein intercellular adhesion molecule 2 (ICAM-2); expression of this variant leads to reduced cell surface expression of ICAM-2. These data provide a resource for comparative genomic studies of HIV and/or SIV pathogenesis and may help to elucidate the mechanisms by which SIV-infected sooty mangabeys avoid AIDS.


Subjects
Acquired Immunodeficiency Syndrome/genetics , Cercocebus atys/genetics , Cercocebus atys/virology , Genetic Predisposition to Disease , Genome/genetics , Host Specificity/genetics , Simian Immunodeficiency Virus , Acquired Immunodeficiency Syndrome/virology , Amino Acid Sequence , Animals , Cell Adhesion Molecules/chemistry , Cell Adhesion Molecules/genetics , Cell Adhesion Molecules/metabolism , Cercocebus atys/immunology , Exons/genetics , Female , Frameshift Mutation/genetics , Genetic Variation , Genomics , HIV/pathogenicity , Humans , Macaca/virology , Sequence Deletion , Simian Acquired Immunodeficiency Syndrome/genetics , Simian Acquired Immunodeficiency Syndrome/virology , Simian Immunodeficiency Virus/pathogenicity , Species Specificity , Toll-Like Receptor 4/chemistry , Toll-Like Receptor 4/genetics , Toll-Like Receptor 4/immunology , Transcriptome/genetics , Whole Genome Sequencing
9.
PLoS Genet ; 17(11): e1009892, 2021 11.
Article in English | MEDLINE | ID: mdl-34748547

ABSTRACT

It is now understood that introgression can serve as a powerful evolutionary force, providing genetic variation that can shape the course of trait evolution. Introgression also induces a shared evolutionary history that is not captured by the species phylogeny, potentially complicating evolutionary analyses that use a species tree. Such analyses are often carried out on gene expression data across species, where the measurement of thousands of trait values allows for powerful inferences while controlling for shared phylogeny. Here, we present a Brownian motion model for quantitative trait evolution under the multispecies network coalescent framework, demonstrating that introgression can generate apparently convergent patterns of evolution when averaged across thousands of quantitative traits. We test our theoretical predictions using whole-transcriptome expression data from ovules in the wild tomato genus Solanum. Examining two sub-clades that both have evidence for post-speciation introgression, but that differ substantially in its magnitude, we find patterns of evolution that are consistent with histories of introgression in both the sign and magnitude of ovule gene expression. Additionally, in the sub-clade with a higher rate of introgression, we observe a correlation between local gene tree topology and expression similarity, implicating a role for introgressed cis-regulatory variation in generating these broad-scale patterns. Our results reveal a general role for introgression in shaping patterns of variation across many thousands of quantitative traits, and provide a framework for testing for these effects using simple model-informed predictions.


Subjects
Gene Expression , Quantitative Trait Loci , Solanum lycopersicum/genetics , Molecular Evolution , Plant Genes
10.
Mol Biol Evol ; 39(6)2022 06 02.
Article in English | MEDLINE | ID: mdl-35642314

ABSTRACT

Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.


Subjects
Genome , Animals , Cluster Analysis , Phylogeny
11.
Mol Biol Evol ; 39(7)2022 07 02.
Article in English | MEDLINE | ID: mdl-35771663

ABSTRACT

The mutation rate is a fundamental evolutionary parameter with direct and appreciable effects on the health and function of individuals. Here, we examine this important parameter in the domestic cat, a beloved companion animal as well as a valuable biomedical model. We estimate a mutation rate of 0.86 × 10⁻⁸ per bp per generation for the domestic cat (at an average parental age of 3.8 years). We find evidence for a significant paternal age effect, with more mutations transmitted by older sires. Our analyses suggest that the cat and the human have accrued similar numbers of mutations in the germline before reaching sexual maturity. The per-generation mutation rate in the cat is 28% lower than what has been observed in humans, but is consistent with the shorter generation time in the cat. Using a model of reproductive longevity, which takes into account differences in the reproductive age and time to sexual maturity, we are able to explain much of the difference in per-generation rates between species. We further apply our reproductive longevity model in a novel analysis of mutation spectra and find that the spectrum for the cat resembles the human mutation spectrum at a younger age of reproduction. Together, these results implicate changes in life-history as a driver of mutation rate evolution between species. As the first direct observation of the paternal age effect outside of rodents and primates, our results also suggest a phenomenon that may be universal among mammals.


Subjects
Longevity , Mutation Rate , Animals , Cats/genetics , Preschool Child , Humans , Longevity/genetics , Mammals , Mutation , Paternal Age , Reproduction/genetics
12.
Thorax ; 78(11): 1118-1125, 2023 11.
Article in English | MEDLINE | ID: mdl-37280096

ABSTRACT

BACKGROUND: Although 1 billion people live in informal (slum) settlements, the consequences for respiratory health of living in these settlements remain largely unknown. This study investigated whether children living in an informal settlement in Nairobi, Kenya are at increased risk of asthma symptoms. METHODS: Children attending schools in Mukuru (an informal settlement in Nairobi) and a more affluent area (Buruburu) were compared. Questionnaires quantified respiratory symptoms and environmental exposures; spirometry was performed; personal exposure to particulate matter (PM2.5) was estimated. RESULTS: 2373 children participated, 1277 in Mukuru (median age, IQR 11, 9-13 years, 53% girls), and 1096 in Buruburu (10, 8-12 years, 52% girls). Mukuru schoolchildren were from less affluent homes, had greater exposure to pollution sources and PM2.5. When compared with Buruburu schoolchildren, Mukuru schoolchildren had a greater prevalence of symptoms, 'current wheeze' (9.5% vs 6.4%, p=0.007) and 'trouble breathing' (16.3% vs 12.6%, p=0.01), and these symptoms were more severe and problematic. Diagnosed asthma was more common in Buruburu (2.8% vs 1.2%, p=0.004). Spirometry did not differ between Mukuru and Buruburu. Regardless of community, significant adverse associations were observed with self-reported exposure to 'vapours, dusts, gases, fumes', mosquito coil burning, adult smoker(s) in the home, refuse burning near homes and residential proximity to roads. CONCLUSION: Children living in informal settlements are more likely to develop wheezing symptoms consistent with asthma that are more severe but less likely to be diagnosed as asthma. Self-reported but not objectively measured air pollution exposure was associated with increased risk of asthma symptoms.


Subjects
Air Pollutants , Air Pollution , Asthma , Child , Adult , Female , Animals , Humans , Male , Air Pollutants/analysis , Kenya/epidemiology , Air Pollution/analysis , Asthma/diagnosis , Asthma/epidemiology , Asthma/etiology , Particulate Matter/adverse effects , Particulate Matter/analysis , Environmental Exposure/adverse effects , Environmental Exposure/analysis , Respiratory Sounds , Gases , Spirometry
13.
Genome Res ; 30(6): 826-834, 2020 06.
Article in English | MEDLINE | ID: mdl-32461224

ABSTRACT

Mutation is the ultimate source of all genetic novelty and the cause of heritable genetic disorders. Mutational burden has been linked to complex disease, including neurodevelopmental disorders such as schizophrenia and autism. The rate of mutation is a fundamental genomic parameter and direct estimates of this parameter have been enabled by accurate comparisons of whole-genome sequences between parents and offspring. Studies in humans have revealed that the paternal age at conception explains most of the variation in mutation rate: Each additional year of paternal age in humans leads to approximately 1.5 additional inherited mutations. Here, we present an estimate of the de novo mutation rate in the rhesus macaque (Macaca mulatta) using whole-genome sequence data from 32 individuals in four large pedigrees. We estimated an average mutation rate of 0.58 × 10⁻⁸ per base pair per generation (at an average parental age of 7.5 yr), much lower than found in direct estimates from great apes. As in humans, older macaque fathers transmit more mutations to their offspring, increasing the per generation mutation rate by 4.27 × 10⁻¹⁰ per base pair per year. We found that the rate of mutation accumulation after puberty is similar between macaques and humans, but that a smaller number of mutations accumulate before puberty in macaques. We additionally investigated the role of paternal age on offspring sociability, a proxy for normal neurodevelopment, by studying 203 male macaques in large social groups.


Subjects
Animal Behavior , Germ-Line Mutation , Mutation Accumulation , Paternal Age , Prenatal Exposure Delayed Effects/genetics , Social Skills , Age Factors , Animals , Female , Humans , Macaca mulatta , Male , Mutation Rate , Pregnancy , Species Specificity
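
Illustrative note on entry 13: the two numbers reported above (an average rate of 0.58 × 10^-8 per base pair per generation at an average parental age of 7.5 years, and an increase of 4.27 × 10^-10 per base pair per year of paternal age) can be combined into a simple linear extrapolation. This is a back-of-the-envelope sketch, not the authors' model: it assumes the increase is linear and that the reported average can serve as the intercept at a paternal age of 7.5 years, even though that average refers to both parents.

```python
# Values quoted in the abstract; the linear form itself is an assumption.
BASE_RATE = 0.58e-8      # mutations per bp per generation at the average age
BASE_AGE_YEARS = 7.5     # average parental age reported in the abstract
SLOPE_PER_YEAR = 4.27e-10  # added mutations per bp per year of paternal age

def extrapolated_rate(paternal_age_years):
    """Linearly extrapolated per-generation mutation rate (per base pair)."""
    return BASE_RATE + SLOPE_PER_YEAR * (paternal_age_years - BASE_AGE_YEARS)

# A sire ten years older than the study average.
older_sire_rate = extrapolated_rate(17.5)
```

Under these assumptions a sire aged 17.5 years would be expected to transmit roughly 1.0 × 10^-8 mutations per base pair per generation, a little under twice the reported average.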
14.
Syst Biol ; 71(3): 649-659, 2022 04 19.
Article in English | MEDLINE | ID: mdl-34951639

ABSTRACT

Phylogenetics has long relied on the use of orthologs, or genes related through speciation events, to infer species relationships. However, identifying orthologs is difficult because gene duplication can obscure relationships among genes. Researchers have been particularly concerned with the insidious effects of pseudoorthologs: duplicated genes that are mistaken for orthologs because they are present in a single copy in each sampled species. Because gene tree topologies of pseudoorthologs may differ from the species tree topology, they have often been invoked as the cause of counterintuitive results in phylogenetics. Despite these perceived problems, no previous work has calculated the probabilities of pseudoortholog topologies or has been able to circumscribe the regions of parameter space in which pseudoorthologs are most likely to occur. Here, we introduce a model for calculating the probabilities and branch lengths of orthologs and pseudoorthologs, including concordant and discordant pseudoortholog topologies, on a rooted three-taxon species tree. We show that the probability of orthologs is high relative to the probability of pseudoorthologs across reasonable regions of parameter space. Furthermore, the probabilities of the two discordant topologies are equal and never exceed that of the concordant topology, generally being much lower. We describe the species tree topologies most prone to generating pseudoorthologs, finding that they are likely to present problems to phylogenetic inference irrespective of the presence of pseudoorthologs. Overall, our results suggest that pseudoorthologs are unlikely to mislead inferences of species relationships under the biological scenarios considered here. [Birth-death model; orthologs; paralogs; phylogenetics.]


Subjects
Gene Duplication , Genetic Models , Phylogeny , Probability
15.
Syst Biol ; 71(2): 367-381, 2022 02 10.
Article in English | MEDLINE | ID: mdl-34245291

ABSTRACT

Many recent phylogenetic methods have focused on accurately inferring species trees when there is gene tree discordance due to incomplete lineage sorting (ILS). For almost all of these methods, and for phylogenetic methods in general, the data for each locus are assumed to consist of orthologous, single-copy sequences. Loci that are present in more than a single copy in any of the studied genomes are excluded from the data. These steps greatly reduce the number of loci available for analysis. The question we seek to answer in this study is: what happens if one runs such species tree inference methods on data where paralogy is present, in addition to or without ILS being present? Through simulation studies and analyses of two large biological data sets, we show that running such methods on data with paralogs can still provide accurate results. We use multiple different methods, some of which are based directly on the multispecies coalescent model, and some of which have been proven to be statistically consistent under it. We also treat the paralogous loci in multiple ways: from explicitly denoting them as paralogs, to randomly selecting one copy per species. In all cases, the inferred species trees are as accurate as equivalent analyses using single-copy orthologs. Our results have significant implications for the use of ILS-aware phylogenomic analyses, demonstrating that they do not have to be restricted to single-copy loci. This will greatly increase the amount of data that can be used for phylogenetic inference. [Gene duplication and loss; incomplete lineage sorting; multispecies coalescent; orthology; paralogy.]


Subjects
Gene Duplication , Genetic Models , Computer Simulation , Genome , Phylogeny
16.
PLoS Biol ; 18(12): e3000954, 2020 12.
Article in English | MEDLINE | ID: mdl-33270638

ABSTRACT

Our understanding of the evolutionary history of primates is undergoing continual revision due to ongoing genome sequencing efforts. Bolstered by growing fossil evidence, these data have led to increased acceptance of once controversial hypotheses regarding phylogenetic relationships, hybridization and introgression, and the biogeographical history of primate groups. Among these findings is a pattern of recent introgression between species within all major primate groups examined to date, though little is known about introgression deeper in time. To address this and other phylogenetic questions, here, we present new reference genome assemblies for 3 Old World monkey (OWM) species: Colobus angolensis ssp. palliatus (the black and white colobus), Macaca nemestrina (southern pig-tailed macaque), and Mandrillus leucophaeus (the drill). We combine these data with 23 additional primate genomes to estimate both the species tree and individual gene trees using thousands of loci. While our species tree is largely consistent with previous phylogenetic hypotheses, the gene trees reveal high levels of genealogical discordance associated with multiple primate radiations. We use strongly asymmetric patterns of gene tree discordance around specific branches to identify multiple instances of introgression between ancestral primate lineages. In addition, we exploit recent fossil evidence to perform fossil-calibrated molecular dating analyses across the tree. Taken together, our genome-wide data help to resolve multiple contentious sets of relationships among primates, while also providing insight into the biological processes and technical artifacts that led to the disagreements in the first place.


Subjects
Genetic Introgression/genetics , Primates/genetics , Animals , Biological Evolution , Cercopithecidae/genetics , Computational Biology/methods , Genetic Databases , Fossils , Gene Flow/genetics , Genome/genetics , Genetic Models , Phylogeny , DNA Sequence Analysis/methods
17.
Proc Natl Acad Sci U S A ; 117(50): 31583-31590, 2020 12 15.
Article in English | MEDLINE | ID: mdl-33262284

ABSTRACT

Advances in genomics have led to an appreciation that introgression is common, but its evolutionary consequences are poorly understood. In recent species radiations the sharing of genetic variation across porous species boundaries can facilitate adaptation to new environments and generate novel phenotypes, which may contribute to further diversification. Most Anopheles mosquito species that are of major importance as human malaria vectors have evolved within recent and rapid radiations of largely nonvector species. Here, we focus on one of the most medically important yet understudied anopheline radiations, the Afrotropical Anopheles funestus complex (AFC), to investigate the role of introgression in its diversification and the possible link between introgression and vector potential. The AFC comprises at least seven morphologically similar species, yet only An. funestus sensu stricto is a highly efficient malaria vector with a pan-African distribution. Based on de novo genome assemblies and additional whole-genome resequencing, we use phylogenomic and population genomic analyses to establish species relationships. We show that extensive interspecific gene flow involving multiple species pairs has shaped the evolutionary history of the AFC since its diversification. The most recent introgression event involved a massive and asymmetrical movement of genes from a distantly related AFC lineage into An. funestus, an event that predated and plausibly facilitated its subsequent dramatic geographic range expansion across most of tropical Africa. We propose that introgression may be a common mechanism facilitating adaptation to new environments and enhancing vectorial capacity in Anopheles mosquitoes.


Subjects
Anopheles/genetics , Gene Flow , Genetic Introgression , Malaria/transmission , Mosquito Vectors/genetics , Physiological Adaptation/genetics , Africa , Animal Distribution , Animals , Anopheles/parasitology , Insect Genome/genetics , Geography , Humans , Malaria/parasitology , Mosquito Vectors/parasitology , Phylogeny
18.
Mol Biol Evol ; 38(7): 2946-2957, 2021 06 25.
Article in English | MEDLINE | ID: mdl-33769517

ABSTRACT

Dissecting the genetic mechanisms underlying dioecy (i.e., separate female and male individuals) is critical for understanding the evolution of this pervasive reproductive strategy. Nonetheless, the genetic basis of sex determination remains unclear in many cases, especially in systems where dioecy has arisen recently. Within the economically important plant genus Solanum (∼2,000 species), dioecy is thought to have evolved independently at least 4 times across roughly 20 species. Here, we generate the first genome sequence of a dioecious Solanum and use it to ascertain the genetic basis of sex determination in this species. We de novo assembled and annotated the genome of Solanum appendiculatum (assembly size: ∼750 Mb; scaffold N50: 0.92 Mb; ∼35,000 genes), identified sex-specific sequences and their locations in the genome, and inferred that males in this species are the heterogametic sex. We also analyzed gene expression patterns in floral tissues of males and females, finding approximately 100 genes that are differentially expressed between the sexes. These analyses, together with observed patterns of gene-family evolution specific to S. appendiculatum, consistently implicate a suite of genes from the regulatory network controlling pectin degradation and modification in the expression of sex. Furthermore, the genome of a species with a relatively young sex-determination system provides the foundational resources for future studies on the independent evolution of dioecy in this clade.


Subjects
Biological Evolution , Genome, Plant , Sex Determination Processes/genetics , Solanum/genetics , Gene Expression Regulation, Plant , Multigene Family , Pectins/genetics
19.
Mol Biol Evol ; 38(4): 1460-1471, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33226085

ABSTRACT

Mutations play a key role in the development of disease in an individual and the evolution of traits within species. Recent work in humans and other primates has clarified the origins and patterns of single-nucleotide variants, showing that most arise in the father's germline during spermatogenesis. It remains unknown whether larger mutations, such as deletions and duplications of hundreds or thousands of nucleotides, follow similar patterns. Such mutations lead to copy-number variation (CNV) within and between species, and can have profound effects by deleting or duplicating genes. Here, we analyze patterns of CNV mutations in 32 rhesus macaque individuals from 14 parent-offspring trios. We find the rate of CNV mutations per generation is low (less than one per genome) and we observe no correlation between parental age and the number of CNVs that are passed on to offspring. We also examine segregating CNVs within the rhesus macaque sample and compare them to a similar data set from humans, finding that both species have far more segregating deletions than duplications. We contrast this with long-term patterns of gene copy-number evolution between 17 mammals, where the proportion of deletions that become fixed along the macaque lineage is much smaller than the proportion of segregating deletions. These results suggest purifying selection acting on deletions, such that the majority of them are removed from the population over time. Rhesus macaques are an important biomedical model organism, so these results will aid in our understanding of this species and the disease models it supports.


Subjects
DNA Copy Number Variations , Macaca mulatta/genetics , Mutation , Animals , Female , Gene Duplication , High-Throughput Nucleotide Sequencing , Humans , Male , Selection, Genetic , Sequence Deletion , Whole Genome Sequencing
20.
Mol Biol Evol ; 38(2): 486-501, 2021 01 23.
Article in English | MEDLINE | ID: mdl-32946576

ABSTRACT

Bumblebees are a diverse group of globally important pollinators in natural ecosystems and for agricultural food production. With both eusocial and solitary life-cycle phases, and some social parasite species, they are especially interesting models to understand social evolution, behavior, and ecology. Reports of many species in decline point to pathogen transmission, habitat loss, pesticide usage, and global climate change as interconnected causes. These threats to bumblebee diversity make our reliance on a handful of well-studied species for agricultural pollination particularly precarious. To broadly sample bumblebee genomic and phenotypic diversity, we de novo sequenced and assembled the genomes of 17 species, representing all 15 subgenera, producing the first genus-wide quantification of genetic and genomic variation potentially underlying key ecological and behavioral traits. The species phylogeny resolves subgenera relationships, whereas incomplete lineage sorting likely drives high levels of gene tree discordance. Five chromosome-level assemblies show a stable 18-chromosome karyotype, with major rearrangements creating 25 chromosomes in social parasites. Differential transposable element activity drives changes in genome sizes, with putative domestications of repetitive sequences influencing gene coding and regulatory potential. Dynamically evolving gene families and signatures of positive selection point to genus-wide variation in processes linked to foraging, diet and metabolism, immunity and detoxification, as well as adaptations for life at high altitudes. Our study reveals how bumblebee genes and genomes have evolved across the Bombus phylogeny and identifies variations potentially linked to key ecological and behavioral traits of these important pollinators.


Subjects
Adaptation, Biological/genetics , Bees/genetics , Biological Evolution , Genome, Insect , Animals , Codon Usage , DNA Transposable Elements , Diet , Feeding Behavior , Gene Components , Genome Size , Selection, Genetic