1.
Genome Res ; 34(6): 888-903, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38977308

ABSTRACT

Species-specific genes, also known as orphans, are ubiquitous across life's domains. In prokaryotes, species-specific orphan genes (SSOGs) are mostly thought to originate in external elements such as viruses followed by horizontal gene transfer, whereas the scenario of native origination, through rapid divergence or de novo, is mostly dismissed. However, quantitative evidence supporting either scenario is lacking. Here, we systematically analyzed genomes from 4644 human gut microbiome species and identified more than 600,000 unique SSOGs, representing an average of 2.6% of a given species' pangenome. These sequences are mostly rare within each species yet show signs of purifying selection. Overall, SSOGs use optimal codons less frequently, and their proteins are more disordered than those of conserved genes (i.e., non-SSOGs). Importantly, across species, the GC content of SSOGs closely matches that of conserved ones. In contrast, the ∼5% of SSOGs that share similarity to known viral sequences have distinct characteristics, including lower GC content. Thus, SSOGs with similarity to viruses differ from the remaining SSOGs, contrasting an external origination scenario for most of them. By examining the orthologous genomic region in closely related species, we show that a small subset of SSOGs likely evolved natively de novo and find that these genes also differ in their properties from the remaining SSOGs. Our results challenge the notion that external elements are the dominant source of prokaryotic genetic novelty and will enable future studies into the biological role and relevance of species-specific genes in the human gut.


Subjects
Evolution, Molecular; Gastrointestinal Microbiome; Species Specificity; Humans; Gastrointestinal Microbiome/genetics; Base Composition; Phylogeny
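
The GC-content comparison described in the abstract above can be illustrated with a minimal sketch. The function names and the toy sequences are hypothetical and are not taken from the study; the sketch only shows how a per-gene GC fraction and a group mean might be computed before comparing species-specific and conserved genes.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / total


def mean_gc(seqs) -> float:
    """Unweighted mean GC content across a list of sequences."""
    values = [gc_content(s) for s in seqs]
    return sum(values) / len(values)


# Hypothetical toy sequences, not data from the study.
conserved = ["ATGGCGCCGTAA", "ATGGGCACCTGA"]
orphans = ["ATGGCACCGTAG", "ATGAAATTTTAA"]

# A gap near zero would correspond to "closely matching" GC content.
gc_gap = mean_gc(orphans) - mean_gc(conserved)
```

Ambiguous bases such as N are ignored in both numerator and denominator, which is one common convention; a real analysis would also need to decide whether to weight genes by length.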
2.
Nucleic Acids Res ; 51(13): 6927-6943, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37254817

ABSTRACT

Casposons are transposable elements containing the CRISPR-associated gene Cas1solo. Identified in many archaeal genomes, casposons are discussed as the origin of CRISPR-Cas systems due to their proposed Cas1solo-dependent translocation. However, apart from bioinformatic approaches and the demonstration of Cas1solo integrase and endonuclease activity in vitro, casposon transposition has not yet been shown in vivo. Here, we report on active casposon translocations in Methanosarcina mazei Gö1 using two independent experimental approaches. First, mini-casposons, consisting of an R6Kγ origin and two antibiotic resistance cassettes, flanked by target site duplications (TSDs) and terminal inverted repeats (TIRs), were generated and shown to actively translocate from a suicide plasmid and integrate into the chromosomal MetMaz-C1 TSD IS1a. Second, casposon excision activity was confirmed in a long-term evolution experiment using a Cas1solo overexpression strain in comparison to an empty vector control under four different treatments (native, high temperature, high salt, mitomycin C) to study stress-induced translocation. Analysis of genomic DNA using a nested qPCR approach provided clear evidence of casposon activity in single cells and revealed significantly different casposon excision frequencies between treatments and strains. Our results, providing the first experimental evidence for in vivo casposon activity, are summarized in a modified hypothetical translocation model.


Subjects
DNA Transposable Elements; Methanosarcina; Humans; Archaeal Proteins/genetics; Integrases/genetics; Methanosarcina/genetics; Plasmids/genetics; Terminal Repeat Sequences; Translocation, Genetic
3.
Mol Biol Evol ; 40(3), 2023 03 04.
Article in English | MEDLINE | ID: mdl-36917489

ABSTRACT

Intergenic genomic regions have essential regulatory and structural roles that impose constraints on their sequences. But regions that do not currently encode proteins also carry the potential to do so in the future. De novo gene emergence, the evolution of novel genes out of previously noncoding sequences, has now been established as a potent force for genomic novelty. Recently, it was shown that intergenic regions in the genome of Saccharomyces cerevisiae harbor pervasive cryptic potential to, if theoretically translated, form transmembrane domains (TM domains) more frequently than expected by chance given their nucleotide composition, a property that we refer to as TM-forming enrichment. The source and biological relevance of this property are unknown. Here, we expand the investigation into the TM-forming potential of intergenic regions to the entire Saccharomycotina budding yeast subphylum, in an effort to explain this property and understand its importance. We find pervasive but variable enrichment in TM-forming potential across the subphylum regardless of the composition and average size of intergenic regions. This cryptic property is evenly spread across the genome, cannot be explained by the hydrophobic content of the sequence, and does not appear to localize to regions containing regulatory motifs. This TM-forming enrichment specifically, and not the actual TM-forming potential, is associated, across genomes, with more TM domains in evolutionarily young genes. Our findings shed light on this newly discovered feature of yeast genomes and constitute a first step toward understanding its evolutionary importance.


Subjects
Saccharomycetales; Yeasts; DNA, Intergenic/genetics; Yeasts/genetics; Saccharomyces cerevisiae/genetics; Genomics; Genome; Saccharomycetales/genetics
4.
Mol Biol Evol ; 35(3): 631-645, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29220506

ABSTRACT

New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.


Subjects
Evolution, Molecular; Fungal Proteins/genetics; Saccharomyces/genetics; Age Factors; Base Sequence; Conserved Sequence; Promoter Regions, Genetic; Recombination, Genetic; Selection, Genetic
5.
Genome Res ; 26(7): 918-32, 2016 07.
Article in English | MEDLINE | ID: mdl-27247244

ABSTRACT

Reconstructing genome history is complex but necessary to reveal quantitative principles governing genome evolution. Such reconstruction requires recapitulating into a single evolutionary framework the evolution of genome architecture and gene repertoire. Here, we reconstructed the genome history of the genus Lachancea that appeared to cover a continuous evolutionary range from closely related to more diverged yeast species. Our approach integrated the generation of a high-quality genome data set; the development of AnChro, a new algorithm for reconstructing ancestral genome architecture; and a comprehensive analysis of gene repertoire evolution. We found that the ancestral genome of the genus Lachancea contained eight chromosomes and about 5173 protein-coding genes. Moreover, we characterized 24 horizontal gene transfers and 159 putative gene creation events that punctuated species diversification. We retraced all chromosomal rearrangements, including gene losses, gene duplications, chromosomal inversions and translocations at single gene resolution. Gene duplications outnumbered losses and balanced rearrangements with 1503, 929, and 423 events, respectively. Gene content variations between extant species are mainly driven by differential gene losses, while gene duplications remained globally constant in all lineages. Remarkably, we discovered that balanced chromosomal rearrangements could be responsible for up to 14% of all gene losses by disrupting genes at their breakpoints. Finally, we found that nonsynonymous substitutions reached fixation at a coordinated pace with chromosomal inversions, translocations, and duplications, but not deletions. Overall, we provide a granular view of genome evolution within an entire eukaryotic genus, linking gene content, chromosome rearrangements, and protein divergence into a single evolutionary framework.


Subjects
Ascomycota/genetics; Chromosomes, Fungal/genetics; Evolution, Molecular; Gene Rearrangement; Genome, Fungal; Models, Genetic; Phylogeny
6.
Yeast ; 36(7): 425-437, 2019 07.
Article in English | MEDLINE | ID: mdl-30963617

ABSTRACT

The sequencing of over a thousand Saccharomyces cerevisiae genomes revealed a complex pangenome. Over one third of the discovered genes are not present in the S. cerevisiae core genome but instead are often restricted to a subset of yeast isolates and thus may be important for adaptation to specific environmental niches. We refer to these genes as "pan-genes," being part of the pangenome but not the core genome. Here, we describe the evolutionary journey and characterisation of a novel pan-gene, originally named hypothetical (HYPO) open-reading frame. Phylogenetic analysis reveals that HYPO has been predominantly retained in S. cerevisiae strains associated with brewing but has been repeatedly lost in most other fungal species during evolution. There is also evidence that HYPO was horizontally transferred at least once, from S. cerevisiae to Saccharomyces paradoxus. The phylogenetic analysis of HYPO exemplifies the complexity and intricacy of evolutionary trajectories of genes within the S. cerevisiae pangenome. To examine possible functions for Hypo, we overexpressed a HYPO-GFP fusion protein in both S. cerevisiae and Saccharomyces pastorianus. The protein localised to the plasma membrane where it accumulated initially in distinct foci. Time-lapse fluorescent imaging revealed that when cells are grown in wort, Hypo-gfp fluorescence spreads throughout the membrane during cell growth. The overexpression of Hypo-gfp in S. cerevisiae or S. pastorianus strains did not significantly alter cell growth in medium containing glucose, maltose, maltotriose, or wort at different concentrations.


Subjects
Beer/microbiology; Fungal Proteins/genetics; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/isolation & purification; Cell Membrane/metabolism; Chromosomes, Fungal/genetics; Evolution, Molecular; Fungal Proteins/metabolism; Gene Deletion; Gene Expression; Gene Transfer, Horizontal; Genome, Fungal/genetics; Open Reading Frames; Saccharomyces/classification; Saccharomyces/genetics; Saccharomyces/growth & development; Saccharomyces/isolation & purification; Saccharomyces cerevisiae/classification; Saccharomyces cerevisiae/growth & development
7.
Genome Biol Evol ; 16(8), 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39004885

ABSTRACT

New protein-coding genes can evolve from previously noncoding genomic regions through a process known as de novo gene emergence. Evidence suggests that this process has likely occurred throughout evolution and across the tree of life. Yet, confidently identifying de novo emerged genes remains challenging. Ancestral sequence reconstruction is a promising approach for inferring whether a gene has emerged de novo or not, as it allows us to inspect whether a given genomic locus ancestrally harbored protein-coding capacity. However, the use of ancestral sequence reconstruction in the context of de novo emergence is still in its infancy and its capabilities, limitations, and overall potential are largely unknown. Notably, it is difficult to formally evaluate the protein-coding capacity of ancestral sequences, particularly when new gene candidates are short. How well-suited is ancestral sequence reconstruction as a tool for the detection and study of de novo genes? Here, we address this question by designing an ancestral sequence reconstruction workflow incorporating different tools and sets of parameters and by introducing a formal criterion that allows us to estimate, within a desired level of confidence, when protein-coding capacity originated at a particular locus. Applying this workflow to ∼2,600 short, annotated budding yeast genes (<1,000 nucleotides), we found that ancestral sequence reconstruction robustly predicts an ancient origin for the most widely conserved genes, which constitute "easy" cases. For less robust cases, we calculated a randomization-based empirical P-value estimating whether the observed conservation between the extant and ancestral reading frame could be attributed to chance. This formal criterion allowed us to pinpoint a branch of origin for most of the less robust cases, identifying 49 genes that can unequivocally be considered de novo originated since the split of the Saccharomyces genus, including 37 Saccharomyces cerevisiae-specific genes. We find that for the remaining equivocal cases we cannot rule out different evolutionary scenarios including rapid evolution, multiple gene losses, or a recent de novo origin. Overall, our findings suggest that ancestral sequence reconstruction is a valuable tool to study de novo gene emergence but should be applied with caution and awareness of its limitations.


Subjects
Evolution, Molecular; Saccharomyces cerevisiae/genetics; Phylogeny; Genome, Fungal; Genes, Fungal
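
The randomization-based empirical P-value mentioned in the abstract above can be sketched generically. The abstract does not describe the conservation statistic or the null model, so both are assumptions here: the statistic is a simple per-position identity between an extant and an ancestral peptide, and the null is built by shuffling the ancestral sequence. The function names are hypothetical.

```python
import random


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def empirical_p_value(observed: float, null_values) -> float:
    """One-sided empirical P-value with the +1 correction (never exactly 0)."""
    null_values = list(null_values)
    as_extreme = sum(v >= observed for v in null_values)
    return (as_extreme + 1) / (len(null_values) + 1)


def shuffled_identities(extant: str, ancestral: str, n: int, seed: int = 0):
    """Null distribution: identity of extant vs. shuffled ancestral sequence.

    Shuffling is an assumed null model chosen for illustration only.
    """
    rng = random.Random(seed)
    residues = list(ancestral)
    null = []
    for _ in range(n):
        rng.shuffle(residues)
        null.append(identity(extant, "".join(residues)))
    return null
```

A call such as `empirical_p_value(identity(extant, ancestral), shuffled_identities(extant, ancestral, 1000))` would then estimate how often chance alone yields at least the observed conservation, given equal-length, pre-aligned peptides.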
8.
Nat Genet ; 56(6): 1278-1287, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38778243

ABSTRACT

Gene expression is an essential step in the translation of genotypes into phenotypes. However, little is known about the transcriptome architecture and the underlying genetic effects at the species level. Here we generated and analyzed the pan-transcriptome of ~1,000 yeast natural isolates across 4,977 core and 1,468 accessory genes. We found that the accessory genome is an underappreciated driver of transcriptome divergence. Global gene expression patterns combined with population structure showed that variation in heritable expression mainly lies within subpopulation-specific signatures, for which accessory genes are overrepresented. Genome-wide association analyses consistently highlighted that accessory genes are associated with proportionally more variants with larger effect sizes, illustrating the critical role of the accessory genome on the transcriptional landscape within and between populations.


Subjects
Gene Expression Regulation, Fungal; Genome, Fungal; Genome-Wide Association Study; Saccharomyces cerevisiae; Transcriptome; Saccharomyces cerevisiae/genetics; Genetic Variation; Gene Expression Profiling/methods; Genotype; Polymorphism, Single Nucleotide
9.
Cell Rep ; 41(12): 111808, 2022 12 20.
Article in English | MEDLINE | ID: mdl-36543139

ABSTRACT

Small open reading frames (sORFs) can encode functional "microproteins" that perform crucial biological tasks. However, their size makes them less amenable to genomic analysis, and their origins and conservation are poorly understood. Given their short length, it is plausible that some of these functional microproteins have recently originated entirely de novo from noncoding sequences. Here we sought to identify such cases in the human lineage by reconstructing the evolutionary origins of human microproteins previously found to have measurable, statistically significant fitness effects. By tracing the formation of each ORF and its transcriptional activation, we show that novel microproteins with significant phenotypic effects have emerged de novo throughout animal evolution, including two after the human-chimpanzee split. Notably, traditional methods for assessing coding potential would miss most of these cases. This evidence demonstrates that the functional potential intrinsic to sORFs can be relatively rapidly and frequently realized through de novo gene emergence.


Subjects
Evolution, Molecular; Hominidae; Animals; Humans; Hominidae/genetics; Genome; Open Reading Frames/genetics; Pan troglodytes; Micropeptides
10.
NAR Genom Bioinform ; 4(4): lqac086, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36381424

ABSTRACT

Nearly one third of Saccharomyces cerevisiae protein coding sequences correspond to duplicate genes, equally split between small-scale duplicates (SSD) and whole-genome duplicates (WGD). While duplicate genes have distinct properties compared to singletons, to date, there has been no systematic analysis of their positional preferences. In this work, we show that SSD and WGD genes are organized in distinct gene clusters that occupy different genomic regions, with SSD being more peripheral and WGD more centrally positioned close to centromeric chromatin. Duplicate gene clusters differ from the rest of the genome in terms of gene size and spacing, gene expression variability and regulatory complexity, properties that are also shared by singleton genes residing within them. Singletons within duplicate gene clusters have longer promoters, more complex structure and a higher number of protein-protein interactions. Particular chromatin architectures appear to be important for gene evolution, as we find SSD gene-pair co-expression to be strongly associated with the similarity of nucleosome positioning patterns. We propose that specific regions of the yeast genome provide a favourable environment for the generation and maintenance of small-scale gene duplicates, segregating them from WGD-enriched genomic domains. Our findings provide a valuable framework linking genomic innovation with positional genomic preferences.

11.
Pathogens ; 10(9), 2021 Sep 12.
Article in English | MEDLINE | ID: mdl-34578206

ABSTRACT

High-throughput sequencing (HTS) technologies and bioinformatic analyses are of growing interest as routine diagnostic tools in the field of plant viruses. The reliability of HTS workflows from sample preparation to data analysis and results interpretation for plant virus detection and identification must be evaluated (verified and validated) to approve this tool for diagnostics. Many different extraction methods, library preparation protocols, and sequencing and bioinformatic pipelines are available for virus sequence detection. To assess the performance of plant virology diagnostic laboratories in using the HTS of ribosomal RNA depleted total RNA (ribodepleted totRNA) as a diagnostic tool, we carried out an interlaboratory comparison study in which eight participants were required to use the same samples, (RNA) extraction kit, ribosomal RNA depletion kit, and commercial sequencing provider, but also their own bioinformatics pipeline, for analysis. The accuracy of virus detection ranged from 65% to 100%. The false-positive detection rate was very low and was related to the misinterpretation of results as well as to possible cross-contaminations in the lab or sequencing provider. The bioinformatic pipeline used by each laboratory influenced the correct detection of the viruses of this study. The main difficulty was the detection of a novel virus as its sequence was not available in a publicly accessible database at the time. The raw data were reanalysed using Virtool to assess its ability for virus detection. All virus sequences were detected using Virtool in the different pools. This study revealed that the ribodepletion target enrichment for sample preparation is a reliable approach for the detection of plant viruses with different genomes. A significant level of virology expertise is needed to correctly interpret the results. It is also important to improve and complete the reference data.

12.
Elife ; 9, 2020 02 18.
Article in English | MEDLINE | ID: mdl-32066524

ABSTRACT

The origin of 'orphan' genes, species-specific sequences that lack detectable homologues, has remained mysterious since the dawn of the genomic era. There are two dominant explanations for orphan genes: complete sequence divergence from ancestral genes, such that homologues are not readily detectable; and de novo emergence from ancestral non-genic sequences, such that homologues genuinely do not exist. The relative contribution of the two processes remains unknown. Here, we harness the special circumstance of conserved synteny to estimate the contribution of complete divergence to the pool of orphan genes. By separately comparing yeast, fly and human genes to related taxa using conservative criteria, we find that complete divergence accounts, on average, for at most a third of eukaryotic orphan and taxonomically restricted genes. We observe that complete divergence occurs at a stable rate within a phylum but at different rates between phyla, and is frequently associated with gene shortening akin to pseudogenization.


Subjects
Evolution, Molecular; Genes/genetics; Synteny/genetics; Animals; Conserved Sequence/genetics; Drosophila melanogaster; Humans; Phylogeny; Saccharomyces cerevisiae; Sequence Homology
13.
Virus Res ; 280: 197899, 2020 04 15.
Article in English | MEDLINE | ID: mdl-32067976

ABSTRACT

The Plasma membrane Cation binding Protein 1 (PCaP1) has been shown to be important for the intra-cellular movement of two members of the Potyvirus genus in arabidopsis and tobacco plants. In this study, the orthologous PCaP1 gene of pepper (Capsicum annuum) was examined for its role in the accumulation of Potato virus Y, the type member of the genus Potyvirus. Downregulation of C. annuum PCaP (CaPCaP) through tobacco rattle virus-induced gene silencing resulted in lower accumulation of potato virus Y (PVY) in pepper plants. Using an improved pepper protoplast isolation protocol, we showed that knockdown of CaPCaP negatively affected PVY accumulation at the within-cell level in pepper, in contrast with the turnip mosaic virus-arabidopsis pathosystem. Conversely, following overexpression of CaPCaP, the accumulation of PVY at the systemic level was increased. The results provide further knowledge on the role of PCaP in the potyvirus infection process and reveal differences in its action among different pathosystems.


Subjects
Capsicum/virology; Membrane Proteins/genetics; Plant Proteins/genetics; Potyvirus/physiology; Protoplasts/virology; Cations; Gene Knockdown Techniques; Membrane Proteins/metabolism; Plant Diseases/virology; Plant Proteins/metabolism; Potyvirus/genetics
14.
Nat Commun ; 11(1): 781, 2020 02 07.
Article in English | MEDLINE | ID: mdl-32034123

ABSTRACT

Recent evidence demonstrates that novel protein-coding genes can arise de novo from non-genic loci. This evolutionary innovation is thought to be facilitated by the pervasive translation of non-genic transcripts, which exposes a reservoir of variable polypeptides to natural selection. Here, we systematically characterize how these de novo emerging coding sequences impact fitness in budding yeast. Disruption of emerging sequences is generally inconsequential for fitness in the laboratory and in natural populations. Overexpression of emerging sequences, however, is enriched in adaptive fitness effects compared to overexpression of established genes. We find that adaptive emerging sequences tend to encode putative transmembrane domains, and that thymine-rich intergenic regions harbor a widespread potential to produce transmembrane domains. These findings, together with in-depth examination of the de novo emerging YBR196C-A locus, suggest a novel evolutionary model whereby adaptive transmembrane polypeptides emerge de novo from thymine-rich non-genic regions and subsequently accumulate changes molded by natural selection.


Subjects
Evolution, Molecular; Membrane Proteins/genetics; Saccharomyces cerevisiae Proteins/genetics; TATA-Binding Protein Associated Factors/genetics; Thymine; Transcription Factor TFIID/genetics; Adaptation, Biological/genetics; Endoplasmic Reticulum/genetics; Endoplasmic Reticulum/metabolism; Gene Expression Regulation, Fungal; Genetic Fitness; Intracellular Membranes/metabolism; Membrane Proteins/chemistry; Open Reading Frames; Protein Domains/genetics; Saccharomyces cerevisiae/genetics
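
The link between thymine-rich sequence and transmembrane potential reported in the abstract above has a plausible basis in the standard genetic code, where every codon with T in the second position encodes a hydrophobic residue (Phe, Leu, Ile, Met, or Val). This reasoning is not stated in the abstract; the sketch below only illustrates it. The hydrophobic residue set and the function names are assumptions, and the code does not predict transmembrane domains.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: one of several possible hydrophobic groupings.
HYDROPHOBIC = set("AVILMFWC")


def translate(dna: str) -> str:
    """Translate frame 0; trailing bases that do not fill a codon are dropped.

    Raises KeyError on characters other than A, C, G, T.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def hydrophobic_fraction(protein: str) -> float:
    """Share of hydrophobic residues, ignoring stop symbols."""
    residues = protein.replace("*", "")
    if not residues:
        return 0.0
    return sum(r in HYDROPHOBIC for r in residues) / len(residues)


# Amino acids encoded by the 16 codons with T at the second position.
second_position_t = {CODON_TABLE[a + "T" + c] for a in BASES for c in BASES}
```

Comparing `hydrophobic_fraction(translate(seq))` for T-rich and T-poor toy sequences shows the compositional effect; an actual analysis would rely on a dedicated transmembrane-domain predictor rather than this residue count.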
15.
Methods Mol Biol ; 1851: 63-81, 2019.
Article in English | MEDLINE | ID: mdl-30298392

ABSTRACT

De novo genes, that is, protein-coding genes originating from previously noncoding sequence, have gone from being considered impossibly unlikely to being recognized as an important source of genetic novelty in eukaryotic genomes. It is clear that de novo gene evolution is a rare but consistent feature of eukaryotic genomes, being detected in every genome studied. However, different studies often use different computational methods, and the numbers and identities of the detected genes vary greatly. Here we present a coherent protocol for the computational identification of de novo genes by comparative genomics. The method described uses homology searches, identification of syntenic regions, and ancestral sequence reconstruction to produce high-confidence candidates with robust evidence of de novo emergence. It is designed to be easily applicable given the basic knowledge of bioinformatic tools and scalable so that it can be applied on large and small datasets.


Subjects
Computational Biology/methods; Genomics/methods; Amino Acid Sequence; Evolution, Molecular; Phylogeny; Proteins/classification; Proteins/genetics; Synteny
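
One step of the protocol summarized in the abstract above, the identification of syntenic regions, can be sketched as follows. This is a simplified illustration and not the published method: it assumes gene orders are given as lists, that a one-to-one ortholog mapping is already available, and that only the immediate neighbors of the candidate gene are used as anchors. All names and the `max_gap` parameter are hypothetical.

```python
def syntenic_interval(focal_gene, focal_order, target_order, orthologs, max_gap=5):
    """Return the (left, right) target genes flanking the focal locus, or None.

    focal_order / target_order: gene identifiers in chromosomal order.
    orthologs: dict mapping focal-genome genes to target-genome genes.
    max_gap: maximum number of target genes allowed between the two anchors.
    """
    i = focal_order.index(focal_gene)
    if i == 0 or i == len(focal_order) - 1:
        return None  # a neighbor is missing on one side
    left = orthologs.get(focal_order[i - 1])
    right = orthologs.get(focal_order[i + 1])
    if left is None or right is None:
        return None  # at least one neighbor has no known ortholog
    position = {gene: n for n, gene in enumerate(target_order)}
    if left not in position or right not in position:
        return None
    # abs() lets the region count as syntenic even if it is inverted.
    gap = abs(position[left] - position[right]) - 1
    return (left, right) if gap <= max_gap else None


# Hypothetical toy example: "new" is the de novo candidate.
focal = ["a1", "a2", "new", "a3"]
target = ["b1", "b2", "b3"]
mapping = {"a1": "b1", "a2": "b2", "a3": "b3"}
anchors = syntenic_interval("new", focal, target, mapping)
```

The sequence between the returned anchors in the target genome is where a noncoding counterpart of the candidate gene would be sought, for example as input to ancestral sequence reconstruction.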