ABSTRACT
Operons are a hallmark of bacterial genomes, where they allow concerted expression of functionally related genes as single polycistronic transcripts. They are rare in eukaryotes, where each gene usually drives expression of its own independent messenger RNAs. Here, we report the horizontal operon transfer of a siderophore biosynthesis pathway from relatives of Escherichia coli into a group of budding yeast taxa. We further show that the co-linearly arranged secondary metabolism genes are expressed, exhibit eukaryotic transcriptional features, and enable the sequestration and uptake of iron. After transfer, several genetic changes occurred during subsequent evolution, including the gain of new transcription start sites that were sometimes within protein-coding sequences, acquisition of polyadenylation sites, structural rearrangements, and integration of eukaryotic genes into the cluster. We conclude that the genes were likely acquired as a unit, modified for eukaryotic gene expression, and maintained by selection to adapt to the highly competitive, iron-limited environment.
Subjects
Eukaryota/genetics , Horizontal Gene Transfer/genetics , Operon/genetics , Bacteria/genetics , Escherichia coli/genetics , Eukaryotic Cells , Molecular Evolution , Bacterial Gene Expression Regulation/genetics , Bacterial Genes/genetics , Bacterial Genome/genetics , Fungal Genome/genetics , Saccharomycetales/genetics , Siderophores/genetics
ABSTRACT
Budding yeasts (subphylum Saccharomycotina) are found in every biome and are as genetically diverse as plants or animals. To understand budding yeast evolution, we analyzed the genomes of 332 yeast species, including 220 newly sequenced ones, which represent nearly one-third of all known budding yeast diversity. Here, we establish a robust genus-level phylogeny comprising 12 major clades, infer the timescale of diversification from the Devonian period to the present, quantify horizontal gene transfer (HGT), and reconstruct the evolution of 45 metabolic traits and the metabolic toolkit of the budding yeast common ancestor (BYCA). We infer that BYCA was metabolically complex and chronicle the tempo and mode of genomic and phenotypic evolution across the subphylum, which is characterized by very low HGT levels and widespread losses of traits and the genes that control them. More generally, our results argue that reductive evolution is a major mode of evolutionary diversification.
Subjects
Molecular Evolution , Horizontal Gene Transfer , Fungal Genome , Phylogeny , Saccharomycetales/classification , Saccharomycetales/genetics
ABSTRACT
Many distantly related organisms have convergently evolved traits and lifestyles that enable them to live in similar ecological environments. However, the extent of phenotypic convergence evolving through the same or distinct genetic trajectories remains an open question. Here, we leverage a comprehensive dataset of genomic and phenotypic data from 1,049 yeast species in the subphylum Saccharomycotina (Kingdom Fungi, Phylum Ascomycota) to explore signatures of convergent evolution in cactophilic yeasts, ecological specialists associated with cacti. We inferred that the ecological association of yeasts with cacti arose independently approximately 17 times. Using a machine learning-based approach, we further found that cactophily can be predicted with 76% accuracy from both functional genomic and phenotypic data. The most informative feature for predicting cactophily was thermotolerance, which we found to be likely associated with altered evolutionary rates of genes impacting the cell envelope in several cactophilic lineages. We also identified horizontal gene transfer and duplication events of plant cell wall-degrading enzymes in distantly related cactophilic clades, suggesting that putatively adaptive traits evolved independently through disparate molecular mechanisms. Notably, we found that multiple cactophilic species and their close relatives have been reported as emerging human opportunistic pathogens, suggesting that the cactophilic lifestyle, and perhaps more generally lifestyles favoring thermotolerance, might preadapt yeasts to cause human disease. This work underscores the potential of a multifaceted approach involving high-throughput genomic and phenotypic data to shed light onto ecological adaptation and highlights how convergent evolution to wild environments could facilitate the transition to human pathogenicity.
Subjects
Cactaceae , Cactaceae/microbiology , Cactaceae/genetics , Phylogeny , Yeasts/genetics , Fungal Genome/genetics , Biological Evolution , Molecular Evolution , Phenotype , Horizontal Gene Transfer , Thermotolerance/genetics , Ascomycota/genetics , Ascomycota/pathogenicity , Machine Learning
ABSTRACT
How genomic differences contribute to phenotypic differences is a major question in biology. The recently characterized genomes, isolation environments, and qualitative patterns of growth on 122 sources and conditions of 1,154 strains from 1,049 fungal species (nearly all known) in the yeast subphylum Saccharomycotina provide a powerful, yet complex, dataset for addressing this question. We used a random forest algorithm trained on these genomic, metabolic, and environmental data to predict growth on several carbon sources with high accuracy. Known structural genes involved in assimilation of these sources and presence/absence patterns of growth in other sources were important features contributing to prediction accuracy. By further examining growth on galactose, we found that it can be predicted with high accuracy from either genomic (92.2%) or growth data (82.6%) but not from isolation environment data (65.6%). Prediction accuracy was even higher (93.3%) when we combined genomic and growth data. After the GALactose utilization genes, the most important feature for predicting growth on galactose was growth on galactitol, raising the hypothesis that several species in two orders, Serinales and Pichiales (containing the emerging pathogen Candida auris and the genus Ogataea, respectively), have an alternative galactose utilization pathway because they lack the GAL genes. Growth and biochemical assays confirmed that several of these species utilize galactose through an alternative oxidoreductive D-galactose pathway, rather than the canonical GAL pathway. Machine learning approaches are powerful for investigating the evolution of the yeast genotype-phenotype map, and their application will uncover novel biology, even in well-studied traits.
Subjects
Galactose , Machine Learning , Galactose/metabolism , Fungal Genome , Metabolic Networks and Pathways/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae/genetics
ABSTRACT
The Saccharomycotina yeasts ("yeasts" hereafter) are a fungal clade of scientific, economic, and medical significance. Yeasts are highly ecologically diverse, found across a broad range of environments in every biome and continent on earth; however, little is known about what rules govern the macroecology of yeast species and their range limits in the wild. Here, we trained machine learning models on 12,816 terrestrial occurrence records and 96 environmental variables to infer global distribution maps at ~1 km² resolution for 186 yeast species (~15% of described species from 75% of orders) and to test environmental drivers of yeast biogeography and macroecology. We found that predicted yeast diversity hotspots occur in mixed montane forests in temperate climates. Diversity in vegetation type and topography were some of the greatest predictors of yeast species richness, suggesting that microhabitats and environmental clines are key to yeast diversity. We further found that range limits in yeasts are significantly influenced by carbon niche breadth and range overlap with other yeast species, with carbon specialists and species in high-diversity environments exhibiting reduced geographic ranges. Finally, yeasts contravene many long-standing macroecological principles, including the latitudinal diversity gradient, temperature-dependent species richness, and a positive relationship between latitude and range size (Rapoport's rule). These results unveil how the environment governs the global diversity and distribution of species in the yeast subphylum. These high-resolution models of yeast species distributions will facilitate the prediction of economically relevant and emerging pathogenic species under current and future climate scenarios.
Subjects
Biodiversity , Ecosystem , Climate , Forests , Carbon , Yeasts
ABSTRACT
Siderophores are crucial for iron-scavenging in microorganisms. While many yeasts can uptake siderophores produced by other organisms, they are typically unable to synthesize siderophores themselves. In contrast, Wickerhamiella/Starmerella (W/S) clade yeasts gained the capacity to make the siderophore enterobactin following the remarkable horizontal acquisition of a bacterial operon enabling enterobactin synthesis. Yet, how these yeasts absorb the iron bound by enterobactin remains unresolved. Here, we demonstrate that Enb1 is the key enterobactin importer in the W/S-clade species Starmerella bombicola. Through phylogenomic analyses, we show that ENB1 is present in all W/S clade yeast species that retained the enterobactin biosynthetic genes. Conversely, it is absent in species that lost the ent genes, except for Starmerella stellata, making this species the only cheater in the W/S clade that can utilize enterobactin without producing it. Through phylogenetic analyses, we infer that ENB1 is a fungal gene that likely existed in the W/S clade prior to the acquisition of the ent genes and subsequently experienced multiple gene losses and duplications. Through phylogenetic topology tests, we show that ENB1 likely underwent horizontal gene transfer from an ancient W/S clade yeast to the order Saccharomycetales, which includes the model yeast Saccharomyces cerevisiae, followed by extensive secondary losses. Taken together, these results suggest that the fungal ENB1 and bacterial ent genes were cooperatively integrated into a functional unit within the W/S clade that enabled adaptation to iron-limited environments. This integrated fungal-bacterial circuit and its dynamic evolution determine the extant distribution of yeast enterobactin producers and cheaters.
Subjects
Enterobactin , Molecular Evolution , Operon , Phylogeny , Enterobactin/metabolism , Enterobactin/genetics , Siderophores/metabolism , Siderophores/genetics , Fungal Genes , Saccharomycetales/genetics , Saccharomycetales/metabolism , Horizontal Gene Transfer
ABSTRACT
Xylose is the second most abundant monomeric sugar in plant biomass. Consequently, xylose catabolism is an ecologically important trait for saprotrophic organisms, as well as a fundamentally important trait for industries that hope to convert plant mass to renewable fuels and other bioproducts using microbial metabolism. Although common across fungi, xylose catabolism is rare within Saccharomycotina, the subphylum that contains most industrially relevant fermentative yeast species. The genomes of several yeasts unable to consume xylose have been previously reported to contain the full set of genes in the XYL pathway, suggesting the absence of a gene-trait correlation for xylose metabolism. Here, we measured growth on xylose and systematically identified XYL pathway orthologs across the genomes of 332 budding yeast species. Although the XYL pathway coevolved with xylose metabolism, we found that pathway presence only predicted xylose catabolism about half of the time, demonstrating that a complete XYL pathway is necessary, but not sufficient, for xylose catabolism. We also found that XYL1 copy number was positively correlated, after phylogenetic correction, with xylose utilization. We then quantified codon usage bias of XYL genes and found that XYL3 codon optimization was significantly higher, after phylogenetic correction, in species able to consume xylose. Finally, we showed that codon optimization of XYL2 was positively correlated, after phylogenetic correction, with growth rates in xylose medium. We conclude that gene content alone is a weak predictor of xylose metabolism and that using codon optimization enhances the prediction of xylose metabolism from yeast genome sequence data.
Subjects
Saccharomycetales , Saccharomycetales/genetics , Saccharomycetales/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Xylose/genetics , Xylose/metabolism , Phylogeny , Codon Usage
ABSTRACT
Yeasts in the subphylum Saccharomycotina are found across the globe in disparate ecosystems. A major aim of yeast research is to understand the diversity and evolution of ecological traits, such as carbon metabolic breadth, insect association, and cactophily. This includes studying aspects of ecological traits like genetic architecture or association with other phenotypic traits. Genomic resources in the Saccharomycotina have grown rapidly. Ecological data, however, are still limited for many species, especially those only known from species descriptions where usually only a limited number of strains are studied. Moreover, ecological information is recorded in natural language format limiting high throughput computational analysis. To address these limitations, we developed an ontological framework for the analysis of yeast ecology. A total of 1,088 yeast strains were added to the Ontology of Yeast Environments (OYE) and analyzed in a machine-learning framework to connect genotype to ecology. This framework is flexible and can be extended to additional isolates, species, or environmental sequencing data. Widespread adoption of OYE would greatly aid the study of macroecology in the Saccharomycotina subphylum.
Subjects
Ecosystem , Ecology , Ascomycota/genetics , Ascomycota/classification , Genotype , Machine Learning , Fungal Genome/genetics
ABSTRACT
Reverse ecology is the inference of ecological information from patterns of genomic variation. One rich, heretofore underutilized, source of ecologically relevant genomic information is codon optimality or adaptation. Bias toward codons that match the tRNA pool is robustly associated with high gene expression in diverse organisms, suggesting that codon optimization could be used in a reverse ecology framework to identify highly expressed, ecologically relevant genes. To test this hypothesis, we examined the relationship between optimal codon usage in the classic galactose metabolism (GAL) pathway and known ecological niches for 329 species of budding yeasts, a diverse subphylum of fungi. We find that optimal codon usage in the GAL pathway is positively correlated with quantitative growth on galactose, suggesting that GAL codon optimization reflects increased capacity to grow on galactose. Optimal codon usage in the GAL pathway is also positively correlated with human-associated ecological niches in yeasts of the CUG-Ser1 clade and with dairy-associated ecological niches in the family Saccharomycetaceae. For example, optimal codon usage of GAL genes is greater than 85% of all genes in the genome of the major human pathogen Candida albicans (CUG-Ser1 clade) and greater than 75% of genes in the genome of the dairy yeast Kluyveromyces lactis (family Saccharomycetaceae). We further find a correlation between optimization in the GALactose pathway genes and several genes associated with nutrient sensing and metabolism. This work suggests that codon optimization harbors information about the metabolic ecology of microbial eukaryotes. This information may be particularly useful for studying fungal dark matter: species that have yet to be cultured in the lab or have only been identified by genomic material.
Subjects
Codon Usage/physiology , Ecosystem , Metabolic Networks and Pathways/genetics , Saccharomycetales , Carbohydrate Metabolism/genetics , Codon , Galactose/metabolism , Gene-Environment Interaction , Fungal Genes/physiology , Genetic Association Studies , Genetically Modified Organisms , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/growth & development , Saccharomyces cerevisiae/metabolism , Saccharomycetales/classification , Saccharomycetales/genetics , Saccharomycetales/metabolism
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pgen.1008304.].
ABSTRACT
A novel budding yeast species was isolated from a soil sample collected in the United States of America. Phylogenetic analyses of multiple loci and phylogenomic analyses conclusively placed the species within the genus Pichia. Strain yHMH446 falls within a clade that includes Pichia norvegensis, Pichia pseudocactophila, Candida inconspicua, and Pichia cactophila. Whole genome sequence data were analyzed for the presence of genes known to be important for carbon and nitrogen metabolism, and the phenotypic data from the novel species were compared to all Pichia species with publicly available genomes. Across the genus, including the novel species candidate, we found that the inability to use many carbon and nitrogen sources correlated with the absence of metabolic genes. Based on these results, Pichia galeolata sp. nov. is proposed to accommodate yHMH446T (=NRRL Y-64187 = CBS 16864). This study shows how integrated taxogenomic analysis can add mechanistic insight to species descriptions.
Subjects
Pichia , Soil , Pichia/genetics , Phylogeny , Fungal DNA/genetics , Mycological Typing Techniques , Yeasts/genetics , Carbon , Nitrogen , DNA Sequence Analysis
ABSTRACT
The wild, cold-adapted parent of hybrid lager-brewing yeasts, Saccharomyces eubayanus, has a complex and understudied natural history. The exploration of this diversity can be used both to develop new brewing applications and to enlighten our understanding of the dynamics of yeast evolution in the wild. Here, we integrate whole genome sequence and phenotypic data of 200 S. eubayanus strains, the largest collection known to date. S. eubayanus has a multilayered population structure, consisting of two major populations that are further structured into six subpopulations. Four of these subpopulations are found exclusively in the Patagonian region of South America; one is found predominantly in Patagonia and sparsely in Oceania and North America; and one is specific to the Holarctic ecozone. Plant host associations differed between subpopulations and between S. eubayanus and its sister species, Saccharomyces uvarum. S. eubayanus is most abundant and genetically diverse in northern Patagonia, where some locations harbor more genetic diversity than is found outside of South America, suggesting that northern Patagonia east of the Andes was a glacial refugium for this species. All but one subpopulation shows isolation-by-distance, and gene flow between subpopulations is low. However, there are strong signals of ancient and recent outcrossing, including two admixed lineages, one that is sympatric with and one that is mostly isolated from its parental populations. Using our extensive biogeographical data, we build a robust model that predicts all known and a handful of additional regions of the globe that are climatically suitable for S. eubayanus, including Europe where host accessibility and competitive exclusion by other Saccharomyces species may explain its continued elusiveness. We conclude that this industrially relevant species has rich natural diversity with many factors contributing to its complex distribution and natural history.
Subjects
Ecosystem , Molecular Evolution , Genetic Polymorphism , Saccharomyces/genetics , Fungal Genome , Genetic Hybridization , Phylogeography , Saccharomyces/physiology
ABSTRACT
Yeasts have broad importance as industrially and clinically relevant microbes and as powerful models for fundamental research, but we are only beginning to understand the roles yeasts play in natural ecosystems. Yeast ecology is often more difficult to study compared to other, more abundant microbes, but growing collections of natural yeast isolates are beginning to shed light on fundamental ecological questions. Here, we used environmental sampling and isolation to assemble a dataset of 1962 isolates collected from throughout the contiguous United States of America (USA) and Alaska, which were then used to uncover geographic patterns, along with substrate and temperature associations among yeast taxa. We found some taxa, including the common yeasts Torulaspora delbrueckii and Saccharomyces paradoxus, to be repeatedly isolated from multiple sampled regions of the USA, and we classify these as broadly distributed cosmopolitan yeasts. A number of yeast taxon-substrate associations were identified, some of which were novel and some of which support previously reported associations. Further, we found a strong effect of isolation temperature on the phyla of yeasts recovered, as well as for many species. We speculate that substrate and isolation temperature associations reflect the ecological diversity of and niche partitioning by yeast taxa.
Subjects
Ecosystem , Torulaspora , Temperature , Yeasts
ABSTRACT
Yeasts are ubiquitous in temperate forests. While this broad habitat is well-defined, the yeasts inhabiting it and their life cycles, niches, and contributions to ecosystem functioning are less understood. Yeasts are present on nearly all sampled substrates in temperate forests worldwide. They associate with soils, macroorganisms, and other habitats and no doubt contribute to broader ecosystem-wide processes. Researchers have gathered information leading to hypotheses about yeasts' niches and their life cycles based on physiological observations in the laboratory as well as genomic analyses, but the challenge remains to test these hypotheses in the forests themselves. Here, we summarize the habitat and global patterns of yeast diversity, give some information on a handful of well-studied temperate forest yeast genera, discuss the various strategies to isolate forest yeasts, and explain temperate forest yeasts' contributions to biotechnology. We close with a summary of the many future directions and outstanding questions facing researchers in temperate forest yeast ecology. Yeasts present an exciting opportunity to better understand the hidden world of microbial ecology in this threatened and global habitat.
Subjects
Ecosystem , Trees , Biodiversity , Forests , Yeasts/genetics
ABSTRACT
Cell-cycle checkpoints and DNA repair processes protect organisms from potentially lethal mutational damage. Compared to other budding yeasts in the subphylum Saccharomycotina, we noticed that a lineage in the genus Hanseniaspora exhibited very high evolutionary rates, low Guanine-Cytosine (GC) content, small genome sizes, and lower gene numbers. To better understand Hanseniaspora evolution, we analyzed 25 genomes, including 11 newly sequenced, representing 18/21 known species in the genus. Our phylogenomic analyses identify two Hanseniaspora lineages, a faster-evolving lineage (FEL), which began diversifying approximately 87 million years ago (mya), and a slower-evolving lineage (SEL), which began diversifying approximately 54 mya. Remarkably, both lineages lost genes associated with the cell cycle and genome integrity, but these losses were greater in the FEL. For example, all species lost the cell-cycle regulator WHIskey 5 (WHI5), and the FEL lost components of the spindle checkpoint pathway (e.g., Mitotic Arrest-Deficient 1 [MAD1], Mitotic Arrest-Deficient 2 [MAD2]) and DNA-damage-checkpoint pathway (e.g., Mitosis Entry Checkpoint 3 [MEC3], RADiation sensitive 9 [RAD9]). Similarly, both lineages lost genes involved in DNA repair pathways, including the DNA glycosylase gene 3-MethylAdenine DNA Glycosylase 1 (MAG1), which is part of the base-excision repair pathway, and the DNA photolyase gene PHotoreactivation Repair deficient 1 (PHR1), which is involved in pyrimidine dimer repair. Strikingly, the FEL lost 33 additional genes, including polymerases (i.e., POLymerase 4 [POL4] and POL32) and telomere-associated genes (e.g., Repressor/activator site binding protein-Interacting Factor 1 [RIF1], Replication Factor A 3 [RFA3], Cell Division Cycle 13 [CDC13], Pbp1p Binding Protein [PBP2]).
Echoing these losses, molecular evolutionary analyses reveal that, compared to the SEL, the FEL stem lineage underwent a burst of accelerated evolution, which resulted in greater mutational loads, homopolymer instabilities, and higher fractions of mutations associated with the common endogenously damaged base, 8-oxoguanine. We conclude that Hanseniaspora is an ancient lineage that has diversified and thrived, despite lacking many otherwise highly conserved cell-cycle and genome integrity genes and pathways, and may represent a novel, to our knowledge, system for studying cellular life without them.
Subjects
Cell Cycle/genetics , DNA Repair/genetics , Fungal Genes , Phylogeny , Saccharomycetales/cytology , Saccharomycetales/genetics , Base Sequence , DNA Damage/genetics , Molecular Evolution , Phenotype
ABSTRACT
Variation in synonymous codon usage is abundant across multiple levels of organization: between codons of an amino acid, between genes in a genome, and between genomes of different species. It is now well understood that variation in synonymous codon usage is influenced by mutational bias coupled with both natural selection for translational efficiency and genetic drift, but how these processes shape patterns of codon usage bias across entire lineages remains unexplored. To address this question, we used a rich genomic data set of 327 species that covers nearly one third of the known biodiversity of the budding yeast subphylum Saccharomycotina. We found that, while genome-wide relative synonymous codon usage (RSCU) for all codons was highly correlated with the GC content of the third codon position (GC3), the usage of codons for the amino acids proline, arginine, and glycine was inconsistent with the neutral expectation where mutational bias coupled with genetic drift drive codon usage. Examination between genes' effective numbers of codons and their GC3 contents in individual genomes revealed that nearly a quarter of genes (381,174/1,683,203; 23%), as well as most genomes (308/327; 94%), significantly deviate from the neutral expectation. Finally, by evaluating the imprint of translational selection on codon usage, measured as the degree to which genes' adaptiveness to the tRNA pool were correlated with selective pressure, we show that translational selection is widespread in budding yeast genomes (264/327; 81%). These results suggest that the contribution of translational selection and drift to patterns of synonymous codon usage across budding yeasts varies across codons, genes, and genomes; whereas drift is the primary driver of global codon usage across the subphylum, the codon bias of large numbers of genes in the majority of genomes is influenced by translational selection.
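The two quantities named in this abstract, relative synonymous codon usage (RSCU) and third-position GC content (GC3), can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `rscu` and `gc3` are hypothetical, the input is assumed to be a single in-frame coding sequence, and the standard genetic code is assumed throughout, which would not hold for yeasts that use a variant code. RSCU for a codon is its observed count divided by the mean count of all codons in its synonymous family.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: a real analysis of budding yeasts would need per-species codes,
# because some clades (e.g. CUG-Ser1) reassign codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group sense codons into synonymous families, one family per amino acid.
FAMILIES = defaultdict(list)
for _codon, _aa in CODON_TABLE.items():
    if _aa != "*":
        FAMILIES[_aa].append(_codon)


def rscu(cds):
    """Return {codon: RSCU} for every codon whose amino acid occurs in cds.

    Single-codon families (Met, Trp) are skipped because their RSCU is
    always 1 and carries no information.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for synonymous in FAMILIES.values():
        total = sum(counts[c] for c in synonymous)
        if total == 0 or len(synonymous) == 1:
            continue
        for c in synonymous:
            # observed count / mean count over the synonymous family
            values[c] = len(synonymous) * counts[c] / total
    return values


def gc3(cds):
    """Fraction of complete codons whose third position is G or C."""
    thirds = cds.upper()[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)
```

For a toy sequence such as `"GGTGGTGGCGGA"` (four glycine codons), `rscu` reports GGT as over-represented relative to its synonyms and GGG as unused, while `gc3` gives the share of G or C at third positions. Genome-wide values like those in the abstract would come from pooling counts across all genes before normalizing.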
Subjects
Codon Usage , Saccharomycetales/genetics , Bias , Genetic Variation , Fungal Genome , Genetic Selection
ABSTRACT
Secondary metabolites are key in how organisms from all domains of life interact with their environment and each other. The iron-binding molecule pulcherrimin was described a century ago, but the genes responsible for its production in budding yeasts have remained uncharacterized. Here, we used phylogenomic footprinting on 90 genomes across the budding yeast subphylum Saccharomycotina to identify the gene cluster associated with pulcherrimin production. Using targeted gene replacements in Kluyveromyces lactis, we characterized the four genes that make up the cluster, which likely encode two pulcherriminic acid biosynthesis enzymes, a pulcherrimin transporter, and a transcription factor involved in both biosynthesis and transport. The requirement of a functional putative transporter to utilize extracellular pulcherrimin-complexed iron demonstrates that pulcherriminic acid is a siderophore, a chelator that binds iron outside the cell for subsequent uptake. Surprisingly, we identified homologs of the putative transporter and transcription factor genes in multiple yeast genera that lacked the biosynthesis genes and could not make pulcherrimin, including the model yeast Saccharomyces cerevisiae. We deleted these previously uncharacterized genes and showed they are also required for pulcherrimin utilization in S. cerevisiae, raising the possibility that other genes of unknown function are linked to secondary metabolism. Phylogenetic analyses of this gene cluster suggest that pulcherrimin biosynthesis and utilization were ancestral to budding yeasts, but the biosynthesis genes and, subsequently, the utilization genes, were lost in many lineages, mirroring other microbial public goods systems that lead to the rise of cheater organisms.
Subjects
Multigene Family/genetics , Saccharomycetales/genetics , Secondary Metabolism/genetics , Iron/metabolism , Kluyveromyces/genetics , Membrane Transport Proteins/genetics , Phylogeny , Protein Biosynthesis/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Saccharomycetales/metabolism , Siderophores/genetics , Transcription Factors/genetics
ABSTRACT
New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate into the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast-evolving sequences and erroneously annotated protein-coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.
Subjects
Molecular Evolution , Fungal Proteins/genetics , Saccharomyces/genetics , Age Factors , Base Sequence , Conserved Sequence , Genetic Promoter Regions , Genetic Recombination , Genetic Selection
ABSTRACT
Budding yeasts are distributed across a wide range of habitats, including as human commensals. However, under some conditions, these commensals can cause superficial, invasive, and even lethal infections. Despite their importance to human health, little is known about the ecology of these opportunistic pathogens, aside from their associations with mammals and clinical environments. During a survey of approximately 1000 non-clinical samples across the United States of America, we isolated 54 strains of budding yeast species considered opportunistic pathogens, including Candida albicans and Candida (Nakaseomyces) glabrata. We found that, as a group, pathogenic yeasts were positively associated with fruits and soil environments, whereas the species Pichia kudriavzevii (syn. Candida krusei syn. Issatchenkia orientalis) had a significant association with plants. For the four species that cause 95% of candidiasis, we found a positive association with soil. These results suggest that pathogenic yeast ecology is more complex and diverse than is currently appreciated and raise the possibility that these additional environments could be a point of contact for human infections.
Subjects
Fruit/microbiology , Plants/microbiology , Saccharomycetales/isolation & purification , Saccharomycetales/pathogenicity , Soil Microbiology , Candida/isolation & purification , Candida/pathogenicity , Microbial Sensitivity Tests , Pichia/isolation & purification , Saccharomycetales/classification , United States
ABSTRACT
BACKGROUND: Associations between traits are prevalent in nature, occurring across a diverse range of taxa and traits. Individual traits may co-evolve with one another, and these correlations can be driven by factors intrinsic or extrinsic to an organism. However, few studies, especially in microbes, have simultaneously investigated both across a broad taxonomic range. Here we quantify pairwise associations among 48 traits across 784 diverse yeast species of the ancient budding yeast subphylum Saccharomycotina, assessing the effects of phylogenetic history, genetics, and ecology. RESULTS: We find extensive negative (traits that tend to not occur together) and positive (traits that tend to co-occur) pairwise associations among traits, as well as between traits and environments. These associations can largely be explained by the biological properties of the traits, such as overlapping biochemical pathways. The isolation environments of the yeasts explain a minor but significant component of the variance, while phylogeny (the retention of ancestral traits in descendant species) plays an even more limited role. Positive correlations are pervasive among carbon utilization traits and track with chemical structures (e.g., glucosides and sugar alcohols) and metabolic pathways, suggesting a molecular basis for the presence of suites of traits. In several cases, characterized genes from model organisms suggest that enzyme promiscuity and overlapping biochemical pathways are likely mechanisms to explain these macroevolutionary trends. Interestingly, fermentation traits are negatively correlated with the utilization of pentose sugars, which are major components of the plant biomass degraded by fungi and present major bottlenecks to the production of cellulosic biofuels. Finally, we show that mammalian pathogenic and commensal yeasts have a suite of traits that includes growth at high temperature and, surprisingly, the utilization of a narrowed panel of carbon sources.
CONCLUSIONS: These results demonstrate how both intrinsic physiological factors and extrinsic ecological factors drive the distribution of traits present in diverse organisms across macroevolutionary timescales.
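As a rough illustration of what a positive or negative pairwise association between two presence/absence traits means, the sketch below computes the phi coefficient from a 2x2 contingency table. This is an assumed stand-in chosen for simplicity, not the statistic reported in the study: it ignores phylogenetic non-independence and isolation environment, both of which the abstract says were assessed. The function name and the toy trait vectors are hypothetical.

```python
from math import sqrt


def phi_coefficient(trait_a, trait_b):
    """Phi coefficient between two equal-length binary trait vectors.

    Returns a value in [-1, 1]: positive when the traits tend to co-occur,
    negative when they tend not to occur together. Returns None when either
    trait is constant, because the coefficient is then undefined.
    """
    if len(trait_a) != len(trait_b):
        raise ValueError("trait vectors must have the same length")
    pairs = list(zip(trait_a, trait_b))
    n11 = sum(1 for a, b in pairs if a and b)          # both present
    n10 = sum(1 for a, b in pairs if a and not b)      # only A present
    n01 = sum(1 for a, b in pairs if not a and b)      # only B present
    n00 = sum(1 for a, b in pairs if not a and not b)  # both absent
    denominator = sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        return None
    return (n11 * n00 - n10 * n01) / denominator


# Hypothetical toy data: one entry per species, 1 = trait present.
fermentation = [1, 1, 0, 0]
pentose_use = [0, 0, 1, 1]
association = phi_coefficient(fermentation, pentose_use)
```

In this toy example the two traits never co-occur, so the coefficient takes its most negative value. Real trait matrices would also need a correction for shared ancestry before any biological interpretation.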