Results 1 - 20 of 36
1.
Genome Biol Evol ; 16(8)2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39004885

ABSTRACT

New protein-coding genes can evolve from previously noncoding genomic regions through a process known as de novo gene emergence. Evidence suggests that this process has likely occurred throughout evolution and across the tree of life. Yet, confidently identifying de novo emerged genes remains challenging. Ancestral sequence reconstruction is a promising approach for inferring whether a gene has emerged de novo or not, as it allows us to inspect whether a given genomic locus ancestrally harbored protein-coding capacity. However, the use of ancestral sequence reconstruction in the context of de novo emergence is still in its infancy and its capabilities, limitations, and overall potential are largely unknown. Notably, it is difficult to formally evaluate the protein-coding capacity of ancestral sequences, particularly when new gene candidates are short. How well-suited is ancestral sequence reconstruction as a tool for the detection and study of de novo genes? Here, we address this question by designing an ancestral sequence reconstruction workflow incorporating different tools and sets of parameters and by introducing a formal criterion that allows us to estimate, within a desired level of confidence, when protein-coding capacity originated at a particular locus. Applying this workflow on ∼2,600 short, annotated budding yeast genes (<1,000 nucleotides), we found that ancestral sequence reconstruction robustly predicts an ancient origin for the most widely conserved genes, which constitute "easy" cases. For less robust cases, we calculated a randomization-based empirical P-value estimating whether the observed conservation between the extant and ancestral reading frame could be attributed to chance. This formal criterion allowed us to pinpoint a branch of origin for most of the less robust cases, identifying 49 genes that can unequivocally be considered de novo originated since the split of the Saccharomyces genus, including 37 Saccharomyces cerevisiae-specific genes. We find that for the remaining equivocal cases we cannot rule out different evolutionary scenarios including rapid evolution, multiple gene losses, or a recent de novo origin. Overall, our findings suggest that ancestral sequence reconstruction is a valuable tool to study de novo gene emergence but should be applied with caution and awareness of its limitations.
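
The randomization-based empirical P-value mentioned in this abstract can be illustrated with a generic permutation test. The sketch below is not the authors' workflow: the function names, the identity score, and the choice to shuffle the ancestral sequence are all assumptions made for illustration. It only shows the general idea of comparing an observed conservation score against scores obtained from randomized sequences.

```python
import random

def empirical_p_value(observed, null_scores):
    """Share of null scores at least as large as the observed score.

    The +1 in numerator and denominator counts the observed value itself,
    so the estimate is never exactly zero.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def shuffled_null(extant, ancestral, n_iter=999, seed=0):
    """Hypothetical null model: shuffle the ancestral residues and rescore."""
    rng = random.Random(seed)
    residues = list(ancestral)
    scores = []
    for _ in range(n_iter):
        rng.shuffle(residues)
        scores.append(identity(extant, residues))
    return scores
```

A call such as `empirical_p_value(identity(extant, ancestral), shuffled_null(extant, ancestral))` would then give the estimate for one pair of sequences. A real analysis would need a null model that preserves the properties relevant to reading frames, which simple shuffling does not.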


Subject(s)
Evolution, Molecular , Saccharomyces cerevisiae/genetics , Phylogeny , Genome, Fungal , Genes, Fungal
2.
bioRxiv ; 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38659899

ABSTRACT

The current "consensus" order in which amino acids were added to the genetic code is based on potentially biased criteria such as absence of sulfur-containing amino acids from the Urey-Miller experiment which lacked sulfur. Even if inferred perfectly, abiotic abundance might not reflect abundance in the organisms in which the genetic code evolved. Here, we instead exploit the fact that proteins that emerged prior to the genetic code's completion are likely enriched in early amino acids and depleted in late amino acids. We identify the most ancient protein-coding sequences born prior to the archaeal-bacterial split. Amino acid usage in protein sequences whose ancestors date back to a single homolog in the Last Universal Common Ancestor (LUCA) largely matches the consensus order. However, our findings indicate that metal-binding (cysteine and histidine) and sulfur-containing (cysteine and methionine) amino acids were added to the genetic code much earlier than previously thought. Surprisingly, even more ancient protein sequences - those that had already diversified into multiple distinct copies in LUCA - show a different pattern to single copy LUCA sequences: significantly less depleted in the late amino acids tryptophan and tyrosine, and enriched rather than depleted in phenylalanine. This is compatible with at least some of these sequences predating the current genetic code. Their distinct enrichment patterns thus provide hints about earlier, alternative genetic codes.

3.
Int J Mol Sci ; 24(7)2023 Mar 24.
Article in English | MEDLINE | ID: mdl-37047167

ABSTRACT

Using meta-analyses, we introduce a unicellular attractor (UCA) model integrating essential features of the 'atavistic reversal', 'cancer attractor', 'somatic mutation', 'genome chaos', and 'tissue organization field' theories. The 'atavistic reversal' theory is taken as a keystone. We propose a possible mechanism of this reversal, its refinement called 'gradual atavism', and evidence for the 'serial atavism' model. We showed the gradual core-to-periphery evolutionary growth of the human interactome, resulting in higher protein interaction density and global interactome centrality in the UC center. In addition, we revealed that UC genes are more actively expressed even in normal cells. The modeling of random walk along protein interaction trajectories demonstrated that random alterations in cellular networks, caused by genetic and epigenetic changes, can result in a further gradual activation of the UC center. These changes can be induced and accelerated by cellular stress that additionally activates UC genes (especially during cell proliferation), because the genes involved in cellular stress response and cell cycle are mostly of UC origin. The functional enrichment analysis showed that cancer cells demonstrate the hyperactivation of energetics and the suppression of multicellular genes involved in communication with the extracellular environment (especially immune surveillance). Collectively, these events can unleash selfish cell behavior aimed at survival by all means. All these changes are boosted by polyploidization. The UCA model may facilitate an understanding of oncogenesis and promote the development of therapeutic strategies.


Subject(s)
Brachyura , Neoplasms , Animals , Humans , Biological Evolution , Carcinogenesis/genetics , Cell Transformation, Neoplastic , Neoplasms/genetics
4.
Mol Biol Evol ; 40(4)2023 04 04.
Article in English | MEDLINE | ID: mdl-36947137

ABSTRACT

Protein domains that emerged more recently in evolution have a higher structural disorder and greater clustering of hydrophobic residues along the primary sequence. It is hard to explain how selection acting via descent with modification could act so slowly as not to saturate over the extraordinarily long timescales over which these trends persist. Here, we hypothesize that the trends were created by a higher level of selection that differentially affects the retention probabilities of protein domains with different properties. This hypothesis predicts that loss rates should depend on disorder and clustering trait values. To test this, we inferred loss rates via maximum likelihood for animal Pfam domains, after first performing a set of stringent quality control methods to reduce annotation errors. Intermediate trait values, matching those of ancient domains, are associated with the lowest loss rates, making our results difficult to explain with reference to previously described homology detection biases. Simulations confirm that effect sizes are of the right magnitude to produce the observed long-term trends. Our results support the hypothesis that differential domain loss slowly weeds out those protein domains that have nonoptimal levels of disorder and clustering. The same preferences also shape the differential diversification of Pfam domains, thereby further impacting proteome composition.


Subject(s)
Proteome , Animals , Protein Domains , Probability , Hydrophobic and Hydrophilic Interactions , Databases, Protein
5.
Genome Biol ; 24(1): 54, 2023 03 24.
Article in English | MEDLINE | ID: mdl-36964572

ABSTRACT

We present GenEra ( https://github.com/josuebarrera/GenEra ), a DIAMOND-fueled gene-family founder inference framework that addresses previously raised limitations and biases in genomic phylostratigraphy, such as homology detection failure. GenEra also reduces computational time from several months to a few days for any genome of interest. We analyze the emergence of taxonomically restricted gene families during major evolutionary transitions in plants, animals, and fungi. Our results indicate that the impact of homology detection failure on inferred patterns of gene emergence is lineage-dependent, suggesting that plants are more prone to evolve novelty through the emergence of new genes compared to animals and fungi.
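
As a rough illustration of the phylostratigraphy idea behind this abstract, a gene is usually dated to the oldest taxonomic level at which a homolog can be detected. The sketch below is a minimal, hypothetical version of that assignment step. It is not GenEra's implementation or interface; the function name, the lineage list, and the input set are assumptions.

```python
def assign_phylostratum(lineage, taxa_with_hits):
    """Return (rank, taxon) for the oldest lineage level with a detected homolog.

    lineage: taxonomic levels of the focal species, ordered oldest to youngest,
             with the focal species itself as the last entry.
    taxa_with_hits: set of lineage levels at which a homolog was detected.
    A gene with no hits outside the focal species is assigned to the last level.
    """
    for rank, taxon in enumerate(lineage, start=1):
        if taxon in taxa_with_hits:
            return rank, taxon
    return len(lineage), lineage[-1]

# Hypothetical example lineage for a budding yeast gene.
lineage = [
    "cellular organisms",
    "Eukaryota",
    "Fungi",
    "Saccharomyces",
    "Saccharomyces cerevisiae",
]
print(assign_phylostratum(lineage, {"Fungi", "Saccharomyces"}))
```

Under this scheme a missing hit at an older level is read as absence of the gene, which is exactly where homology detection failure can make a gene look younger than it is.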


Subject(s)
Biological Evolution , Genomics , Animals , Phylogeny , Genomics/methods , Fungi/genetics , Plants/genetics , Evolution, Molecular
6.
Int J Mol Sci ; 24(6)2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36982667

ABSTRACT

Borreliella (syn. Borrelia) burgdorferi is a spirochete bacterium that causes tick-borne Lyme disease. Along its lifecycle, B. burgdorferi develops several pleomorphic forms with unclear biological and medical relevance. Surprisingly, these morphotypes have never been compared at the global transcriptome level. To fill this void, we grew B. burgdorferi spirochete, round body, bleb, and biofilm-dominated cultures and recovered their transcriptomes by RNAseq profiling. We found that round bodies share similar expression profiles with spirochetes, despite their morphological differences. This contrasts sharply with blebs and biofilms, which showed unique transcriptomes, profoundly distinct from those of spirochetes and round bodies. To better characterize differentially expressed genes in non-spirochete morphotypes, we performed functional, positional, and evolutionary enrichment analyses. Our results suggest that spirochete to round body transition relies on the delicate regulation of a relatively small number of highly conserved genes, which are located on the main chromosome and involved in translation. In contrast, spirochete to bleb or biofilm transition includes substantial reshaping of transcription profiles towards plasmid-residing and evolutionarily young genes, which originated in the ancestor of Borreliaceae. Despite their abundance, the function of these Borreliaceae-specific genes is largely unknown. However, many known Lyme disease virulence genes implicated in immune evasion and tissue adhesion originated in this evolutionary period. Taken together, these regularities point to the possibility that bleb and biofilm morphotypes might be important in the dissemination and persistence of B. burgdorferi inside the mammalian host. On the other hand, they prioritize the large pool of unstudied Borreliaceae-specific genes for functional characterization because this subset likely contains undiscovered Lyme disease pathogenesis genes.


Subject(s)
Borrelia burgdorferi , Lyme Disease , Animals , Humans , Bacterial Proteins/metabolism , Borrelia burgdorferi/genetics , Borrelia burgdorferi/metabolism , Lyme Disease/genetics , Mammals/metabolism , Transcriptome
7.
Int J Mol Sci ; 23(23)2022 Nov 29.
Article in English | MEDLINE | ID: mdl-36499258

ABSTRACT

The expression of gametogenesis-related (GG) genes and proteins, as well as whole genome duplications (WGD), are the hallmarks of cancer related to poor prognosis. Currently, it is not clear if these hallmarks are random processes associated only with genome instability or are programmatically linked. Our goal was to elucidate this via a thorough bioinformatics analysis of 1474 GG genes in the context of WGD. We examined their association in protein-protein interaction and coexpression networks, and their phylostratigraphic profiles from publicly available patient tumour data. The results show that GG genes are upregulated in most WGD-enriched somatic cancers at the transcriptome level and reveal robust GG gene expression at the protein level, as well as the ability to associate into correlation networks and enrich the reproductive modules. GG gene phylostratigraphy displayed in WGD+ cancers an attractor of early eukaryotic origin for DNA recombination and meiosis, and one relative to oocyte maturation and embryogenesis from early multicellular organisms. The upregulation of cancer-testis genes emerging with mammalian placentation was also associated with WGD. In general, the results suggest the role of polyploidy for soma-germ transition accessing latent cancer attractors in the human genome network, which appear as pre-formed along the whole Evolution of Life.


Subject(s)
Gene Duplication , Neoplasms , Animals , Humans , Genome, Plant , Proteome/genetics , Evolution, Molecular , Polyploidy , Transcriptome , Neoplasms/genetics , Mammals/genetics
8.
Zoological Lett ; 8(1): 14, 2022 Nov 26.
Article in English | MEDLINE | ID: mdl-36435814

ABSTRACT

The evolution of automixis - i.e., meiotic parthenogenesis - requires several features, including ploidy restoration after meiosis and maintenance of fertility. Characterizing the relative contribution of novel versus pre-existing genes and the similarities in their expression and sequence evolution is fundamental to understand the evolution of reproductive novelties. Here we identify gonads-biased genes in two Bacillus automictic stick-insects and compare their expression profile and sequence evolution with a bisexual congeneric species. The two parthenogens restore ploidy through different cytological mechanisms: in Bacillus atticus, nuclei derived from the first meiotic division fuse to restore a diploid egg nucleus, while in Bacillus rossius, diploidization occurs in some cells of the haploid blastula through anaphase restitution. Parthenogens' gonads transcriptional program is found to be largely assembled from genes that were already present before the establishment of automixis. The three species transcriptional profiles largely reflect their phyletic relationships, yet we identify a shared core of genes with gonad-biased patterns of expression in parthenogens which are either male gonads-biased in the sexual species or are not differentially expressed there. At the sequence level, just a handful of gonads-biased genes were inferred to have undergone instances of positive selection exclusively in the parthenogen species. This work is the first to explore the molecular underpinnings of automixis in a comparative framework: it delineates how reproductive novelties can be sustained by genes whose origin precedes the establishment of the novelty itself and shows that different meiotic mechanisms of reproduction can be associated with a shared molecular ground plan.

9.
Int J Mol Sci ; 23(19)2022 Sep 29.
Article in English | MEDLINE | ID: mdl-36232785

ABSTRACT

The biogenetic law (recapitulation law) states that ontogenesis recapitulates phylogenesis. However, this law can be distorted by the modification of development. We showed the recapitulation of phylogenesis during the differentiation of various cell types, using a meta-analysis of human single-cell transcriptomes, with the control for cell cycle activity and the improved phylostratigraphy (gene dating). The multipotent progenitors, differentiated from pluripotent embryonic stem cells (ESC), showed the downregulation of unicellular (UC) genes and the upregulation of multicellular (MC) genes, but only in the case of those originating up to the Euteleostomi (bony vertebrates). This picture strikingly resembles the evolutionary profile of regulatory gene expansion due to gene duplication in the human genome. The recapitulation of phylogenesis in the induced pluripotent stem cells (iPSC) during their differentiation resembles the ESC pattern. The unipotent erythroblasts differentiating into erythrocytes showed the downregulation of UC genes and the upregulation of MC genes originating after the Euteleostomi. The MC interactome neighborhood of a protein encoded by a UC gene reverses the gene expression pattern. The functional analysis showed that the evolved environment of the UC proteins is typical for protein modifiers and signaling-related proteins. Besides a fundamental aspect, this approach can provide a unified framework for cancer biology and regenerative/rejuvenation medicine because oncogenesis can be defined as an atavistic reversal to a UC state, while regeneration and rejuvenation require an ontogenetic reversal.


Subject(s)
Induced Pluripotent Stem Cells , Neoplasms , Animals , Biology , Cell Differentiation/genetics , Embryonic Stem Cells , Humans , Neoplasms/genetics , Neoplasms/metabolism , Regenerative Medicine
10.
G3 (Bethesda) ; 12(10)2022 09 30.
Article in English | MEDLINE | ID: mdl-35976114

ABSTRACT

Along with specialized functions, cells of multicellular organisms also perform essential functions common to most if not all cells. Whether diverse cells do this by using the same set of genes, interacting in a fixed coordinated fashion to execute essential functions, or a subset of genes specific to certain cells, remains a central question in biology. Here, we focus on gene coexpression to search for a core cellular network across a whole organism. Single-cell RNA-sequencing measures gene expression of individual cells, enabling researchers to discover gene expression patterns that contribute to the diversity of cell functions. Current efforts to study cellular functions focus primarily on identifying differentially expressed genes across cells. However, patterns of coexpression between genes are probably more indicative of biological processes than are the expression of individual genes. We constructed cell-type-specific gene coexpression networks using single-cell transcriptome datasets covering diverse cell types from the fruit fly, Drosophila melanogaster. We detected a set of highly coordinated genes preserved across cell types and present this as the best estimate of a core cellular network. This core is very small compared with cell-type-specific gene coexpression networks and shows dense connectivity. Gene members of this core tend to be ancient genes and are enriched for those encoding ribosomal proteins. Overall, we find evidence for a core cellular network in diverse cell types of the fruit fly. The topological, structural, functional, and evolutionary properties of this core indicate that it accounts for only a minority of essential functions.


Subject(s)
Drosophila , Transcriptome , Animals , Drosophila/genetics , Drosophila melanogaster/genetics , Gene Expression Profiling , Gene Regulatory Networks , RNA , Ribosomal Proteins/genetics
11.
Gene ; 817: 146168, 2022 Apr 05.
Article in English | MEDLINE | ID: mdl-34995731

ABSTRACT

Many studies in the model species Arabidopsis thaliana have characterized genes involved in embryo formation. However, much remains to be learned about the portfolio of genes involved in signal transduction and transcriptional regulation during plant embryo development in other species, particularly in an evolutionary context, given that some genes involved in embryo patterning are not exclusive to land plants. This study used a combination of domain architecture phylostratigraphy and phylogenetic reconstruction to investigate the evolutionary history of embryo patterning and auxin metabolism (EPAM) genes in Viridiplantae. This approach shed light on the co-option of auxin metabolism and other molecular mechanisms that contributed to the radiation of land plants, and specifically to embryo formation. These results have the potential to assist conservation programs by directing the development of tools for obtaining somatic embryos. In this context, we applied this methodology to the critically endangered, non-model species Araucaria angustifolia, the Brazilian pine, which is the current focus of conservation efforts using somatic embryogenesis. So far, this approach has had little success because somatic embryos fail to develop completely. By profiling the expression of genes that we identified as necessary for the emergence of land-plant embryos, we found striking differences between zygotic and somatic embryos that might explain the developmental arrest and be used to improve A. angustifolia somatic culture.


Subject(s)
Araucaria/embryology , Araucaria/genetics , Indoleacetic Acids/metabolism , Plant Somatic Embryogenesis Techniques , Seeds/growth & development , Arabidopsis/genetics , Body Patterning , Evolution, Molecular , Phylogeny , Plant Development/genetics
12.
Mol Ecol Resour ; 22(4): 1559-1581, 2022 May.
Article in English | MEDLINE | ID: mdl-34839580

ABSTRACT

Many Drosophila species differ widely in their distributions and climate niches, making them excellent subjects for evolutionary genomic studies. Here, we have developed a database of high-quality assemblies for 46 Drosophila species and one closely related Zaprionus. Fifteen of the genomes were newly sequenced, and 20 were improved with additional sequencing. New or improved annotations were generated for all 47 species, assisted by new transcriptomes for 19. Phylogenomic analyses of these data resolved several previously ambiguous relationships, especially in the melanogaster species group. However, it also revealed significant phylogenetic incongruence among genes, mainly in the form of incomplete lineage sorting in the subgenus Sophophora but also including asymmetric introgression in the subgenus Drosophila. Using the phylogeny as a framework and taking into account these incongruences, we then screened the data for genome-wide signals of adaptation to different climatic niches. First, phylostratigraphy revealed relatively high rates of recent novel gene gain in three temperate pseudoobscura and five desert-adapted cactophilic mulleri subgroup species. Second, we found differing ratios of nonsynonymous to synonymous substitutions in several hundred orthologues between climate generalists and specialists, with trends for significantly higher ratios for those in tropical and lower ratios for those in temperate-continental specialists respectively than those in the climate generalists. Finally, resequencing natural populations of 13 species revealed tropics-restricted species generally had smaller population sizes, lower genome diversity and more deleterious mutations than the more widespread species. We conclude that adaptation to different climates in the genus Drosophila has been associated with large-scale and multifaceted genomic changes.


Subject(s)
Drosophila , Genome , Adaptation, Physiological/genetics , Animals , Drosophila/genetics , Genomics , Humans , Phylogeny
13.
Int J Mol Sci ; 22(21)2021 Oct 28.
Article in English | MEDLINE | ID: mdl-34769071

ABSTRACT

The growth of complexity in evolution is a most intriguing phenomenon. Using gene phylostratigraphy, we showed this growth (as reflected in regulatory mechanisms) in the human genome, tracing the path from prokaryotes to hominids. Generally, the different regulatory gene families expanded at different times, yet only up to the Euteleostomi (bony vertebrates). The only exception was the expansion of transcription factors (TF) in placentals; however, we argue that this was not related to an increase in general complexity. Surprisingly, although TF originated in the Prokaryota while chromatin appeared only in the Eukaryota, the expansion of epigenetic factors predated the expansion of TF. Signaling receptors, tumor suppressors, oncogenes, and aging- and disease-associated genes (indicating vulnerabilities in terms of complex organization and strong enrichment in regulatory genes) also expanded only up to the Euteleostomi. The complexity-related gene properties (protein size, number of alternatively spliced mRNAs, length of untranslated mRNA, number of biological processes per gene, number of disordered regions in a protein, and density of TF-TF interactions) rose in multicellular organisms and declined after the Euteleostomi, and possibly earlier. At the same time, the speed of protein sequence evolution sharply increased in the genes that originated after the Euteleostomi. Thus, several lines of evidence indicate that molecular mechanisms of complexity growth were changing with time, and in the phyletic lineage leading to humans, the most salient shift occurred after the basic vertebrate body plan was fixed with a bony skeleton. The obtained results can be useful for evolutionary medicine.


Subject(s)
Evolution, Molecular , Gene Regulatory Networks , Genome, Human , Animals , Epigenesis, Genetic , Hominidae/genetics , Humans , Multigene Family , Oncogenes , Prokaryotic Cells/metabolism , Transcription Factors/genetics
14.
BMC Genomics ; 22(1): 794, 2021 Nov 04.
Article in English | MEDLINE | ID: mdl-34736418

ABSTRACT

BACKGROUND: The present availability of full genome sequences of a broad range of animal species across the whole range of evolutionary history enables one to ask questions as to the distribution of genes across the chromosomes. Do newly recruited genes, as new clades emerge, distribute at random or at non-random locations? RESULTS: We extracted values for the ages of the human genes and for their current chromosome locations, from published sources. A quantitative analysis showed that the distribution of newly-added genes among and within the chromosomes appears to be increasingly non-random if one observes animals along the evolutionary series from the precursors of the tetrapoda through to the great apes, whereas the oldest genes are randomly distributed. CONCLUSIONS: Randomization will result from chromosome evolution, but less and less time is available for this process as evolution proceeds. Much of the bunching of recently-added genes arises from new gene formation as paralogues in gene families, near the location of genes that were recruited in the preceding phylostratum. As examples we cite the KRTAP, ZNF, OR and some minor gene families. We show that bunching can also result from the evolution of the chromosomes themselves when, as for the KRTAP genes, blocks of genes that had previously been on disparate chromosomes become linked together.


Subject(s)
Evolution, Molecular , Genome , Animals , Chromosomes/genetics , Humans
15.
Prog Biophys Mol Biol ; 165: 49-55, 2021 10.
Article in English | MEDLINE | ID: mdl-34371024

ABSTRACT

Cancer or cancer-like phenomena pervade multicellular life, implying deep evolutionary roots. Many of the hallmarks of cancer recapitulate unicellular modalities, suggesting that cancer initiation and progression represent a systematic reversion to simpler ancestral phenotypes in response to a stress or insult. This so-called atavism theory may be tested using phylostratigraphy, which can be used to assign ages to genes. Several research groups have confirmed that cancer cells tend to over-express evolutionarily older genes, and rewire the architecture linking unicellular and multicellular gene networks. In addition, some of the elevated mutation rate - a well-known hallmark of cancer - is actually self-inflicted, driven by genes found to be homologs of the ancient SOS genes activated in stressed bacteria, and employed to evolve biological workarounds. These findings have obvious implications for therapy.


Subject(s)
Neoplasms , Bacteria/genetics , Biological Evolution , Gene Regulatory Networks , Humans , Neoplasms/genetics , Phenotype
16.
Bioessays ; 43(7): e2000305, 2021 07.
Article in English | MEDLINE | ID: mdl-33984158

ABSTRACT

It has long been recognized that cancer onset and progression represent a type of reversion to an ancestral quasi-unicellular phenotype. This general concept has been refined into the atavistic model of cancer that attempts to provide a quantitative analysis and testable predictions based on genomic data. Over the past decade, support for the multicellular-to-unicellular reversion predicted by the atavism model has come from phylostratigraphy. Here, we propose that cancer onset and progression involve more than a one-off multicellular-to-unicellular reversion, and are better described as a series of reversionary transitions. We make new predictions based on the chronology of the unicellular-eukaryote-to-multicellular-eukaryote transition. We also make new predictions based on three other evolutionary transitions that occurred in our lineage: eukaryogenesis, oxidative phosphorylation and the transition to adaptive immunity. We propose several modifications to current phylostratigraphy to improve age resolution to test these predictions. Also see the video abstract here: https://youtu.be/3unEu5JYJrQ.


Subject(s)
Biological Evolution , Neoplasms , Eukaryota , Eukaryotic Cells , Humans , Neoplasms/genetics , Phenotype
17.
Plant J ; 107(1): 315-336, 2021 07.
Article in English | MEDLINE | ID: mdl-33901335

ABSTRACT

Coastal regions contribute an estimated 20% of annual gross primary production in the oceans, despite occupying only 0.03% of their surface area. Diatoms frequently dominate coastal sediments, where they experience large variations in light regime resulting from the interplay of diurnal and tidal cycles. Here, we report on an extensive diurnal transcript profiling experiment of the motile benthic diatom Seminavis robusta. Nearly 90% (23 328) of expressed protein-coding genes and 66.9% (1124) of expressed long intergenic non-coding RNAs showed significant expression oscillations and are predominantly phasing at night with a periodicity of 24 h. Phylostratigraphic analysis found that rhythmic genes are enriched in highly conserved genes, while diatom-specific genes are predominantly associated with midnight expression. Integration of genetic and physiological cell cycle markers with silica depletion data revealed potential new silica cell wall-associated gene families specific to diatoms. Additionally, we observed 1752 genes with a remarkable semidiurnal (12-h) periodicity, while the expansion of putative circadian transcription factors may reflect adaptations to cope with highly unpredictable external conditions. Taken together, our results provide new insights into the adaptations of diatoms to the benthic environment and serve as a valuable resource for the study of diurnal regulation in photosynthetic eukaryotes.


Subject(s)
Adaptation, Physiological , Circadian Rhythm/genetics , Diatoms/cytology , Diatoms/physiology , Gene Expression , Cell Cycle/genetics , Cell Wall/genetics , Cell Wall/metabolism , Chloroplasts/genetics , Enzymes/genetics , Enzymes/metabolism , Evolution, Molecular , Mitochondria/genetics , Phylogeny , Plankton/genetics , Plankton/physiology , RNA, Long Noncoding
18.
Elife ; 10, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33416492

ABSTRACT

Extant protein-coding sequences span a huge range of ages, from those that emerged only recently to those present in the last universal common ancestor. Because evolution has had less time to act on young sequences, there might be 'phylostratigraphy' trends in any properties that evolve slowly with age. A long-term reduction in hydrophobicity and hydrophobic clustering was found in previous, taxonomically restricted studies. Here we perform integrated phylostratigraphy across 435 fully sequenced species, using sensitive HMM methods to detect protein domain homology. We find that the reduction in hydrophobic clustering is universal across lineages. However, only young animal domains have a tendency to have higher structural disorder. Among ancient domains, trends in amino acid composition reflect the order of recruitment into the genetic code, suggesting that the composition of the contemporary descendants of ancient sequences reflects amino acid availability during the earliest stages of life, when these sequences first emerged.


Subject(s)
Amino Acid Sequence , Evolution, Molecular , Genetic Code , Phylogeny , Animals , Fungi/classification , Fungi/genetics , Plants/classification , Plants/genetics , Trypanosomatina/classification , Trypanosomatina/growth & development
19.
FEMS Yeast Res ; 20(2)2020 03 01.
Article in English | MEDLINE | ID: mdl-32009158

ABSTRACT

Over the past decade, improvements in technology and methods have enabled rapid and relatively inexpensive generation of high-quality RNA-seq datasets. These datasets have been used to characterize gene expression for several yeast species and have provided systems-level insights for basic biology, biotechnology and medicine. Herein, we discuss new techniques that have emerged and existing techniques that enable analysts to extract information from multifactorial yeast RNA-seq datasets. Ultimately, this minireview seeks to inspire readers to query datasets, whether previously published or freshly obtained, with creative and diverse methods to discover and support novel hypotheses.


Subject(s)
Data Analysis , RNA, Fungal/genetics , RNA-Seq/statistics & numerical data , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/statistics & numerical data , Yeasts/genetics , Datasets as Topic , Gene Expression Profiling , RNA-Seq/methods , Transcriptome
20.
Mol Biol Evol ; 37(6): 1667-1678, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32061128

ABSTRACT

Bacilli can form dormant, highly resistant, and metabolically inactive spores to cope with extreme environmental challenges. In this study, we examined the evolutionary age of Bacillus subtilis sporulation genes using the approach known as genomic phylostratigraphy. We found that B. subtilis sporulation genes cluster in several groups that emerged at distant evolutionary time-points, suggesting that the sporulation process underwent several stages of expansion. Next, we asked whether such evolutionary stratification of the genome could be used to predict involvement in sporulation of presently uncharacterized genes (y-genes). We individually inactivated a representative sample of uncharacterized genes that arose during the same evolutionary periods as the known sporulation genes and tested the resulting strains for sporulation phenotypes. Sporulation was significantly affected in 16 out of 37 (43%) tested strains. In addition to expanding the knowledge base on B. subtilis sporulation, our findings suggest that evolutionary age could be used to help with genome mining.


Subject(s)
Bacillus subtilis/physiology , Evolution, Molecular , Genome, Bacterial , Spores, Bacterial , Phenotype