Results 1 - 20 of 35
1.
Mol Biol Evol ; 40(1)2023 01 04.
Article in English | MEDLINE | ID: mdl-36508357

ABSTRACT

Interspecies RNA-Seq datasets are increasingly common, and have the potential to answer new questions about the evolution of gene expression. Single-species differential expression analysis is now a well-studied problem that benefits from sound statistical methods. Extensive reviews on biological or synthetic datasets have provided the community with a clear picture on the relative performances of the available methods in various settings. However, synthetic dataset simulation tools are still missing in the interspecies gene expression context. In this work, we develop and implement a new simulation framework. This tool builds on both the RNA-Seq and the phylogenetic comparative methods literatures to generate realistic count datasets, while taking into account the phylogenetic relationships between the samples. We illustrate the usefulness of this new framework through a targeted simulation study, that reproduces the features of a recently published dataset, containing gene expression data in adult eye tissue across blind and sighted freshwater crayfish species. Using our simulated datasets, we perform a fair comparison of several approaches used for differential expression analysis. This benchmark reveals some of the strengths and weaknesses of both the classical and phylogenetic approaches for interspecies differential expression analysis, and allows for a reanalysis of the crayfish dataset. The tool has been integrated in the R package compcodeR, freely available on Bioconductor.


Subjects
Gene Expression Profiling , Software , RNA-Seq , Phylogeny , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods
2.
Genome Res ; 31(12): 2303-2315, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34810219

ABSTRACT

The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.

3.
RNA Biol ; 19(1): 1208-1227, 2022 01.
Article in English | MEDLINE | ID: mdl-36384383

ABSTRACT

This study investigates the importance of the structural context in the formation of a type I/II A-minor motif. This very frequent structural motif has been shown to be important in the spatial folding of RNA molecules. We developed an automated method to classify A-minor motif occurrences according to their 3D context similarities, and we used a graph approach to represent both the structural A-minor motif occurrences and their classes at different scales. This approach leads us to uncover new subclasses of A-minor motif occurrences according to their local 3D similarities. The majority of classes are composed of homologous occurrences, but some of them are composed of non-homologous occurrences. The different classifications we obtain allow us to better understand the importance of the context in the formation of A-minor motifs. In a second step, we investigate how much knowledge of the context around an A-minor motif can help to infer its presence (and position). More specifically, we want to determine what kind of information, contained in the structural context, can be useful to characterize and predict A-minor motifs. We show that, for some A-minor motifs, the topology combined with a sequence signal is sufficient to predict the presence and the position of an A-minor motif occurrence. In most other cases, these signals are not sufficient for predicting the A-minor motif, however we show that they are good signals for this purpose. All the classification and prediction pipelines rely on automated processes, for which we describe the underlying algorithms and parameters.


Subjects
Imaging, Three-Dimensional , RNA , Algorithms , Predictive Value of Tests , Humans , RNA/chemistry
4.
PLoS Comput Biol ; 14(3): e1005992, 2018 03.
Article in English | MEDLINE | ID: mdl-29543809

ABSTRACT

We present a new educational initiative called Meet-U that aims to train students for collaborative work in computational biology and to bridge the gap between education and research. Meet-U mimics the setup of collaborative research projects and takes advantage of the most popular tools for collaborative work and of cloud computing. Students are grouped in teams of 4-5 people and have to realize a project from A to Z that answers a challenging question in biology. Meet-U promotes "coopetition," as the students collaborate within and across the teams and are also in competition with each other to develop the best final product. Meet-U fosters interactions between different actors of education and research through the organization of a meeting day, open to everyone, where the students present their work to a jury of researchers and jury members give research seminars. This very unique combination of education and research is strongly motivating for the students and provides a formidable opportunity for a scientific community to unite and increase its visibility. We report on our experience with Meet-U in two French universities with master's students in bioinformatics and modeling, with protein-protein docking as the subject of the course. Meet-U is easy to implement and can be straightforwardly transferred to other fields and/or universities. All the information and data are available at www.meet-u.org.


Subjects
Computational Biology/education , Computational Biology/methods , Research/education , Humans , Research Design , Students , Universities
5.
BMC Genomics ; 18(1): 667, 2017 Aug 29.
Article in English | MEDLINE | ID: mdl-28851275

ABSTRACT

BACKGROUND: The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture. RESULTS: Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly-conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications. CONCLUSION: The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.


Subjects
Chromosomes, Fungal/genetics , Colletotrichum/genetics , Colletotrichum/metabolism , DNA Transposable Elements/genetics , Genomics , Multigene Family/genetics , Homologous Recombination/genetics , Molecular Sequence Annotation , Phylogeny , Point Mutation/genetics
6.
BMC Genomics ; 15 Suppl 6: S16, 2014.
Article in English | MEDLINE | ID: mdl-25573073

ABSTRACT

BACKGROUND: In comparative genomics, orthologs are used to transfer annotation from genes already characterized to newly sequenced genomes. Many methods have been developed for finding orthologs in sets of genomes. However, the application of different methods on the same proteome set can lead to distinct orthology predictions. METHODS: We developed a method based on a meta-approach that is able to combine the results of several methods for orthologous group prediction. The purpose of this method is to produce better quality results by using the overlapping results obtained from several individual orthologous gene prediction procedures. Our method proceeds in two steps. The first aims to construct seeds for groups of orthologous genes; these seeds correspond to the exact overlaps between the results of all or several methods. In the second step, these seed groups are expanded by using HMM profiles. RESULTS: We evaluated our method on two standard reference benchmarks, OrthoBench and Orthology Benchmark Service. Our method presents a higher level of accurately predicted groups than the individual input methods of orthologous group prediction. Moreover, our method increases the number of annotated orthologous pairs without decreasing the annotation quality compared to twelve state-of-the-art methods. CONCLUSIONS: The meta-approach based method appears to be a reliable procedure for predicting orthologous groups. Since a large number of methods for predicting groups of orthologous genes exist, it is quite conceivable to apply this meta-approach to several combinations of different methods.
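The first step described above (seeds defined as exact overlaps between the groups returned by all or several methods) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `seed_groups`, the `min_methods` parameter, the method names, and the gene identifiers are all hypothetical, and the second step (expansion with HMM profiles) is not covered.

```python
from collections import Counter


def seed_groups(predictions, min_methods=None):
    """Return groups predicted identically by at least `min_methods` methods.

    predictions: dict mapping a method name to a list of groups,
                 each group being an iterable of gene identifiers.
    """
    if min_methods is None:
        min_methods = len(predictions)  # default: every method must agree
    counts = Counter()
    for groups in predictions.values():
        # Deduplicate within a method so each method votes once per group.
        for group in {frozenset(g) for g in groups}:
            counts[group] += 1
    return [set(g) for g, n in counts.items() if n >= min_methods]


# Toy data: three hypothetical methods with made-up gene identifiers.
predictions = {
    "method_a": [{"g1", "g2", "g3"}, {"g4", "g5"}],
    "method_b": [{"g1", "g2", "g3"}, {"g4", "g5", "g6"}],
    "method_c": [{"g1", "g2", "g3"}, {"g4", "g5"}],
}
strict_seeds = seed_groups(predictions)                  # all methods agree
relaxed_seeds = seed_groups(predictions, min_methods=2)  # "several" methods
```

Requiring exact set equality keeps the seeds conservative; lowering `min_methods` trades some of that stringency for more seeds.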


Subjects
Computational Biology/methods , Genomics/methods , Molecular Sequence Annotation/methods , Software , Evolution, Molecular , Reproducibility of Results
7.
Genome Biol ; 25(1): 268, 2024 Oct 14.
Article in English | MEDLINE | ID: mdl-39402662

ABSTRACT

BACKGROUND: Pervasive translation is a widespread phenomenon that plays a critical role in the emergence of novel microproteins, but the diversity of translation patterns contributing to their generation remains unclear. Based on 54 ribosome profiling (Ribo-Seq) datasets, we investigated the yeast Ribo-Seq landscape using a representation framework that allows the comprehensive inventory and classification of the entire diversity of Ribo-Seq signals, including non-canonical ones. RESULTS: We show that while coding regions occupy specific areas of the Ribo-Seq landscape, noncoding regions encompass a wide diversity of Ribo-Seq signals and, conversely, populate the entire landscape. Our results show that pervasive translation can, nevertheless, be associated with high specificity, with 1055 noncoding ORFs exhibiting canonical Ribo-Seq signals. Using mass spectrometry under standard conditions or proteasome inhibition with an in-house analysis protocol, we report 239 microproteins originating from noncoding ORFs that display canonical but also non-canonical Ribo-Seq signals. Each condition yields dozens of additional microprotein candidates with comparable translation properties, suggesting a larger population of volatile microproteins that are challenging to detect. Our findings suggest that non-canonical translation signals may harbor valuable information and underscore the significance of considering them in proteogenomic studies. Finally, we show that the translation outcome of a noncoding ORF is primarily determined by the initiating codon and the codon distribution in its two alternative frames, rather than features indicative of functionality. CONCLUSION: Our results enable us to propose a topology of a species' Ribo-Seq landscape, opening the way to comparative analyses of this translation landscape under different conditions.


Subjects
Open Reading Frames , Protein Biosynthesis , Ribosomes , Saccharomyces cerevisiae , Ribosomes/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Ribosome Profiling
8.
NAR Genom Bioinform ; 6(2): lqae069, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38915823

ABSTRACT

Microbial specialized metabolite biosynthetic gene clusters (SMBGCs) are a formidable source of natural products of pharmaceutical interest. With the multiplication of genomic data available, very efficient bioinformatic tools for automatic SMBGC detection have been developed. Nevertheless, most of these tools identify SMBGCs based on sequence similarity with enzymes typically involved in specialised metabolism and thus may miss SMBGCs coding for undercharacterised enzymes. Here we present Synteruptor (https://bioi2.i2bc.paris-saclay.fr/synteruptor), a program that identifies genomic islands, known to be enriched in SMBGCs, in the genomes of closely related species. With this tool, we identified a SMBGC in the genome of Streptomyces ambofaciens ATCC23877, undetected by antiSMASH versions prior to antiSMASH 5, and experimentally demonstrated that it directs the biosynthesis of two metabolites, one of which was identified as sphydrofuran. Synteruptor is also a valuable resource for the delineation of individual SMBGCs within antiSMASH regions that may encompass multiple clusters, and for refining the boundaries of these SMBGCs.

9.
BMC Genomics ; 14: 623, 2013 Sep 14.
Article in English | MEDLINE | ID: mdl-24034898

ABSTRACT

BACKGROUND: Candida glabrata follows C. albicans as the second or third most prevalent cause of candidemia worldwide. These two pathogenic yeasts are distantly related, C. glabrata being part of the Nakaseomyces, a group more closely related to Saccharomyces cerevisiae. Although C. glabrata was thought to be the only pathogenic Nakaseomyces, two new pathogens have recently been described within this group: C. nivariensis and C. bracarensis. To gain insight into the genomic changes underlying the emergence of virulence, we sequenced the genomes of these two, and three other non-pathogenic Nakaseomyces, and compared them to other sequenced yeasts. RESULTS: Our results indicate that the two new pathogens are more closely related to the non-pathogenic N. delphensis than to C. glabrata. We uncover duplications and accelerated evolution that specifically affected genes in the lineage preceding the group containing N. delphensis and the three pathogens, which may provide clues to the higher propensity of this group to infect humans. Finally, the number of Epa-like adhesins is specifically enriched in the pathogens, particularly in C. glabrata. CONCLUSIONS: Remarkably, some features thought to be the result of adaptation of C. glabrata to a pathogenic lifestyle, are present throughout the Nakaseomyces, indicating these are rather ancient adaptations to other environments. Phylogeny suggests that human pathogenesis evolved several times, independently within the clade. The expansion of the EPA gene family in pathogens establishes an evolutionary link between adhesion and virulence phenotypes. Our analyses thus shed light onto the relationships between virulence and the recent genomic changes that occurred within the Nakaseomyces. 
SEQUENCE ACCESSION NUMBERS: Nakaseomyces delphensis: CAPT01000001 to CAPT01000179; Candida bracarensis: CAPU01000001 to CAPU01000251; Candida nivariensis: CAPV01000001 to CAPV01000123; Candida castellii: CAPW01000001 to CAPW01000101; Nakaseomyces bacillisporus: CAPX01000001 to CAPX01000186.


Subjects
Candida glabrata/classification , Genome, Fungal , Phylogeny , Saccharomycetales/classification , Candida glabrata/genetics , DNA, Fungal/genetics , Evolution, Molecular , Saccharomycetales/genetics , Selection, Genetic , Sequence Analysis, DNA
10.
Sci Rep ; 13(1): 1417, 2023 01 25.
Article in English | MEDLINE | ID: mdl-36697464

ABSTRACT

We report here a new application, CustomProteinSearch (CusProSe), whose purpose is to help users to search for proteins of interest based on their domain composition. The application is customizable. It consists of two independent tools, IterHMMBuild and ProSeCDA. IterHMMBuild allows the iterative construction of Hidden Markov Model (HMM) profiles for conserved domains of selected protein sequences, while ProSeCDA scans a proteome of interest against an HMM profile database, and annotates identified proteins using user-defined rules. CusProSe was successfully used to identify, in fungal genomes, genes encoding key enzyme families involved in secondary metabolism, such as polyketide synthases (PKS), non-ribosomal peptide synthetases (NRPS), hybrid PKS-NRPS and dimethylallyl tryptophan synthases (DMATS), as well as to characterize distinct terpene synthases (TS) sub-families. The highly configurable nature of this application makes it a generic tool, which allows the user to refine the function of predicted proteins, to extend detection to new enzyme families, and may also be applied to biological systems other than fungi and to proteins other than those involved in secondary metabolism.
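The idea of annotating proteins from their domain composition with user-defined rules can be sketched in a few lines of Python. This is only an illustration: the rule format (`required` / `forbidden` domain sets), the function name `annotate`, and the simplified domain labels are assumptions made for the example and do not reflect the actual ProSeCDA rule syntax.

```python
def annotate(domains, rules):
    """Return the names of all rules satisfied by a protein's domain set."""
    found = set(domains)
    labels = []
    for name, rule in rules.items():
        has_required = rule["required"] <= found            # all required present
        has_forbidden = bool(rule.get("forbidden", set()) & found)
        if has_required and not has_forbidden:
            labels.append(name)
    return labels


# Hypothetical rules using simplified domain labels:
# KS = ketosynthase, AT = acyltransferase, A = adenylation, C = condensation.
RULES = {
    "PKS": {"required": {"KS", "AT"}, "forbidden": {"A", "C"}},
    "NRPS": {"required": {"A", "C"}, "forbidden": {"KS"}},
    "PKS-NRPS hybrid": {"required": {"KS", "AT", "A", "C"}},
}

pks_labels = annotate({"KS", "AT", "ACP"}, RULES)
hybrid_labels = annotate({"KS", "AT", "A", "C"}, RULES)
```

In a real workflow the domain set of each protein would come from a profile-HMM scan; here it is supplied by hand to keep the sketch self-contained.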


Subjects
Fungi , Molecular Sequence Annotation , Secondary Metabolism , Software , Amino Acid Sequence , Molecular Sequence Annotation/methods , Peptide Synthases/genetics , Polyketide Synthases/genetics , Secondary Metabolism/genetics , Fungi/enzymology , Fungi/genetics , Tryptophan Synthase/genetics , Conserved Sequence/genetics
11.
J Mol Evol ; 73(3-4): 230-43, 2011 Oct.
Article in English | MEDLINE | ID: mdl-22094890

ABSTRACT

The recent availability of genome sequences of four different Fusarium species offers the opportunity to perform extensive comparative analyses, in particular of repeated sequences. In a recent work, the overall content of such sequences in the genomes of three phylogenetically related Fusarium species, F. graminearum, F. verticillioides, and F. oxysporum f. sp. lycopersici has been estimated. In this study, we present an exhaustive characterization of pogo-like elements, named Fots, in four Fusarium genomes. Overall 10 Fot and two Fot-related miniature inverted-repeat transposable element families were identified, revealing a diversification of multiple lineages of pogo-like elements, some of which were accompanied by a gain of introns. This analysis also showed that such elements are present in an unusually high proportion in the genomes of F. oxysporum f. sp. lycopersici and Nectria haematococca (anamorph F. solani f. sp. pisi) in contrast with most other fungal genomes in which retroelements are the most represented. Interestingly, our analysis showed that the most numerous Fot families all contain potentially active or mobilisable copies, thus conferring a mutagenic potential on these transposable elements and consequently a role in strain adaptation and genome evolution. This role is strongly reinforced when examining their genomic distribution which is clearly biased with a high proportion (more than 80%) located on strain- or species-specific regions enriched in genes involved in pathogenicity and/or adaptation. Finally, the different reproductive characteristics of the four Fusarium species allowed us to investigate the impact of the process of repeat-induced point mutations on the expansion and diversification of Fot elements.


Subjects
DNA Transposable Elements/genetics , Fusarium/genetics , Genome, Fungal , Base Sequence , Cluster Analysis , Evolution, Molecular , Gene Dosage , Likelihood Functions , Models, Genetic , Multigene Family , Open Reading Frames , Phylogeny , Polymorphism, Genetic
12.
Nat Commun ; 12(1): 5221, 2021 09 01.
Article in English | MEDLINE | ID: mdl-34471117

ABSTRACT

Bacteria of the genus Streptomyces are prolific producers of specialized metabolites, including antibiotics. The linear chromosome includes a central region harboring core genes, as well as extremities enriched in specialized metabolite biosynthetic gene clusters. Here, we show that chromosome structure in Streptomyces ambofaciens correlates with genetic compartmentalization during exponential phase. Conserved, large and highly transcribed genes form boundaries that segment the central part of the chromosome into domains, whereas the terminal ends tend to be transcriptionally quiescent compartments with different structural features. The onset of metabolic differentiation is accompanied by a rearrangement of chromosome architecture, from a rather 'open' to a 'closed' conformation, in which highly expressed specialized metabolite biosynthetic genes form new boundaries. Thus, our results indicate that the linear chromosome of S. ambofaciens is partitioned into structurally distinct entities, suggesting a link between chromosome folding, gene expression and genome evolution.


Subjects
Anti-Bacterial Agents/metabolism , Chromosomes, Bacterial , Streptomyces/genetics , Streptomyces/metabolism , Chromosome Structures , Gene Expression Regulation, Bacterial , Genome, Bacterial , Multigene Family , Transcriptome
13.
BMC Genomics ; 11: 81, 2010 Feb 01.
Article in English | MEDLINE | ID: mdl-20122162

ABSTRACT

BACKGROUND: More and more completely sequenced fungal genomes are becoming available and many more sequencing projects are in progress. This deluge of data should improve our knowledge of the various primary and secondary metabolisms of Fungi, including their synthesis of useful compounds such as antibiotics or toxic molecules such as mycotoxins. Functional annotation of many fungal genomes is imperfect, especially of genes encoding enzymes, so we need dedicated tools to analyze their metabolic pathways in depth. DESCRIPTION: FUNGIpath is a new tool built using a two-stage approach. Groups of orthologous proteins predicted using complementary methods of detection were collected in a relational database. Each group was further mapped on to steps in the metabolic pathways published in the public databases KEGG and MetaCyc. As a result, FUNGIpath allows the primary and secondary metabolisms of the different fungal species represented in the database to be compared easily, making it possible to assess the level of specificity of various pathways at different taxonomic distances. It is freely accessible at http://www.fungipath.u-psud.fr. CONCLUSIONS: As more and more fungal genomes are expected to be sequenced during the coming years, FUNGIpath should help progressively to reconstruct the ancestral primary and secondary metabolisms of the main branches of the fungal tree of life and to elucidate the evolution of these ancestral fungal metabolisms to various specific derived metabolisms.


Subjects
Computational Biology/methods , Databases, Protein , Genome, Fungal , Metabolic Networks and Pathways , Data Mining , Fungi/genetics
14.
Microb Genom ; 7(6)2021 Jun.
Article in English | MEDLINE | ID: mdl-33749576

ABSTRACT

Streptomyces possess a large linear chromosome (6-12 Mb) consisting of a conserved central region flanked by variable arms covering several megabases. In order to study the evolution of the chromosome across evolutionary times, a representative panel of Streptomyces strains and species (125) whose chromosomes are completely sequenced and assembled was selected. The pan-genome of the genus was modelled and shown to be open with a core-genome reaching 1018 genes. The evolution of the Streptomyces chromosome was analysed by carrying out pairwise comparisons, and by monitoring indexes measuring the conservation of genes (presence/absence) and their synteny along the chromosome. Using the phylogenetic depth offered by the chosen panel, it was possible to infer that within the central region of the chromosome, the core-genes form a highly conserved organization, which can reveal the existence of an ancestral chromosomal skeleton. Conversely, the chromosomal arms, enriched in variable genes, evolved faster than the central region under the combined effect of rearrangements and addition of new information from horizontal gene transfer. The genes hosted in these regions may be localized there because of the adaptive advantage that their rapid evolution may confer. We speculate that (i) within a bacterial population, the variability of these genes may contribute to the establishment of social characters by the production of 'public goods', and (ii) at the evolutionary scale, this variability contributes to the diversification of the genetic pool of the bacteria.
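The notions of core-genome and open pan-genome used above can be made concrete with a small Python sketch based on gene-family presence/absence. It assumes gene families have already been assigned to each strain; the function names, strain names, and family labels are hypothetical, and the sketch does not reproduce the modelling used in the study.

```python
def core_and_pan(families_by_strain):
    """Core = families present in every strain; pan = families seen in any."""
    sets = [set(f) for f in families_by_strain.values()]
    return set.intersection(*sets), set.union(*sets)


def accumulation_curves(families_by_strain):
    """Pan- and core-genome sizes as strains are added in the given order."""
    pan, core = set(), None
    pan_sizes, core_sizes = [], []
    for families in families_by_strain.values():
        families = set(families)
        pan |= families
        core = families if core is None else core & families
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Toy presence/absence data for three made-up strains.
strains = {
    "strain_1": {"famA", "famB", "famC"},
    "strain_2": {"famA", "famB", "famD"},
    "strain_3": {"famA", "famC", "famE"},
}
core, pan = core_and_pan(strains)
pan_sizes, core_sizes = accumulation_curves(strains)
```

A pan-genome is usually called open when the pan size keeps increasing as genomes are added. Real analyses typically average such curves over many strain orderings, which this sketch omits.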

15.
mBio ; 10(5)2019 09 03.
Article in English | MEDLINE | ID: mdl-31481382

ABSTRACT

In this work, by comparing genomes of closely related individuals of Streptomyces isolated at a spatial microscale (millimeters or centimeters), we investigated the extent and impact of horizontal gene transfer in the diversification of a natural Streptomyces population. We show that despite these conspecific strains sharing a recent common ancestor, all harbored significantly different gene contents, implying massive and rapid gene flux. The accessory genome of the strains was distributed across insertion/deletion events (indels) ranging from one to several hundreds of genes. Indels were preferentially located in the arms of the linear chromosomes (ca. 12 Mb) and appeared to form recombination hot spots. Some of them harbored biosynthetic gene clusters (BGCs) whose products confer an inhibitory capacity and may constitute public goods that can favor the cohesiveness of the bacterial population. Moreover, a significant proportion of these variable genes were either plasmid borne or harbored signatures of actinomycete integrative and conjugative elements (AICEs). We propose that conjugation is the main driver for the indel flux and diversity in Streptomyces populations. IMPORTANCE: Horizontal gene transfer is a rapid and efficient way to diversify bacterial gene pools. Currently, little is known about this gene flux within natural soil populations. Using comparative genomics of Streptomyces strains belonging to the same species and isolated at microscale, we reveal frequent transfer of a significant fraction of the pangenome. We show that it occurs at a time scale enabling the population to diversify and to cope with its changing environment, notably, through the production of public goods.


Subjects
Gene Transfer, Horizontal , Genes, Bacterial/genetics , Genetic Variation , Streptomyces/genetics , Actinobacteria/genetics , Biosynthetic Pathways/genetics , Chromosomes, Bacterial , Conjugation, Genetic , DNA, Bacterial/genetics , Genome, Bacterial , Multigene Family , Multilocus Sequence Typing , Phylogeny , Plasmids
16.
Microbiol Resour Announc ; 8(38)2019 Sep 19.
Article in English | MEDLINE | ID: mdl-31537669

ABSTRACT

The genomes of 11 conspecific Streptomyces strains, i.e., from the same species and inhabiting the same ecological niche, were sequenced and assembled. This data set offers an ideal framework to assess the genome evolution of Streptomyces species in their ecological context.

17.
BMC Bioinformatics ; 9: 536, 2008 Dec 16.
Article in English | MEDLINE | ID: mdl-19087285

ABSTRACT

BACKGROUND: It has been repeatedly observed that gene order is rapidly lost in prokaryotic genomes. However, persistent synteny blocks are found when comparing more or less distant species. These genes that remain consistently adjacent are appealing candidates for the study of genome evolution and a more accurate definition of their functional role. Such studies require visualizing conserved synteny blocks in a large number of genomes at all taxonomic distances. RESULTS: After comparing nearly 600 completely sequenced genomes encompassing the whole prokaryotic tree of life, the computed synteny data were assembled in a relational database, SynteBase. SynteView was designed to visualize conserved synteny blocks in a large number of genomes after choosing one of them as a reference. SynteView functions with data stored either in SynteBase or in a home-made relational database of personal data. In addition, this software can compute on-the-fly and display the distribution of synteny blocks which are conserved in pairs of genomes. This tool has been designed to provide a wealth of information on each positional orthologous gene, to be user-friendly and customizable. It is also possible to download sequences of genes belonging to these synteny blocks for further studies. SynteView is accessible through Java Webstart at http://www.synteview.u-psud.fr. CONCLUSION: SynteBase answers queries about gene order conservation and SynteView visualizes the obtained results in a flexible and powerful way which provides a comparative overview of the conserved synteny in a large number of genomes, whatever their taxonomic distances.


Subjects
Gene Order/genetics , Genome, Archaeal , Genome, Bacterial , Software , Synteny/genetics , Computational Biology/methods , Conserved Sequence , Databases, Genetic , Evolution, Molecular , Genomics/methods
18.
Biochimie ; 90(4): 595-608, 2008 Apr.
Article in English | MEDLINE | ID: mdl-17961904

ABSTRACT

The incredible development of comparative genomics during the last decade has required a correct use of the concept of homology that was previously utilized only by evolutionary biologists. Unfortunately, this concept has often been misunderstood and thus misused when exploited outside its evolutionary context. This review returns to the correct definition of homology and explains how this definition has been progressively refined in order to adapt it to the various new kinds of analysis of gene properties and of their products that have appeared with the progress of comparative genomics. Then, we illustrate the power and the proficiency of such a concept when using the available genomics data in order to study the evolution of individual genes, of entire genomes and of species, respectively. After explaining how we detect homologues by an exhaustive comparison of a hundred complete proteomes, we describe three main lines of research we have developed in recent years. The first one exploits synteny and gene context data to better understand the mechanisms of genome evolution in prokaryotes. The second one is based on phylogenomics approaches to reconstruct the tree of life. The last one is devoted to reminding that protein homology is often limited to structural segments (SOH=segment of homology or module). Detecting and numbering modules allows tracing back protein history by identifying the events of gene duplication and gene fusion. We insist that one of the main present difficulties in such studies is the lack of a reliable method to identify genuine orthologues. Finally, we show how these homology studies are helpful to annotate genes and genomes and to study the complexity of the relationships between sequence and function of a gene.


Subjects
Molecular Evolution , Genes/genetics , Genome , Genomics , Animals , Bacteria/classification , Bacteria/genetics , Phylogeny , Proteome/analysis , Nucleic Acid Sequence Homology
19.
Antibiotics (Basel) ; 7(4)2018 Oct 02.
Article in English | MEDLINE | ID: mdl-30279346

ABSTRACT

Specialized metabolites are of great interest because of their possible industrial and clinical applications. The increasing number of antimicrobial-resistant infectious agents is a major health threat, and the discovery of chemical diversity and new antimicrobials is therefore crucial. Extensive genomic data from Streptomyces spp. confirm their production potential and great importance. Genome sequencing of strains of the same species indicates that specialized metabolite biosynthetic gene cluster (SMBGC) diversity is not exhausted and that a pool of novel specialized metabolites still exists. Here, we analyze the genome sequence data from six phylogenetically close Streptomyces strains. The results reveal that the closer the strains are phylogenetically, the higher the number of shared gene clusters. Eight specialized metabolites comprise the core metabolome, although some strains have only six core gene clusters. The number of conserved gene clusters common to the isolated strains and their closest phylogenetic counterparts varies from nine to 23 SMBGCs. However, the analysis of these phylogenetic relationships is not affected by the acquisition of gene clusters, probably by horizontal gene transfer events, as each strain also harbors strain-specific SMBGCs. Between one and 15 strain-specific gene clusters were identified, of which up to six gene clusters in a single strain are unknown and have no identifiable orthologs in other species, attesting to the existing SMBGC novelty at the strain level.

20.
BMC Evol Biol ; 7: 237, 2007 Nov 29.
Article in English | MEDLINE | ID: mdl-18047665

ABSTRACT

BACKGROUND: Comparison of completely sequenced microbial genomes has revealed how fluid these genomes are. Detecting synteny blocks requires reliable methods for determining the orthologs among the whole set of homologs detected by exhaustive comparisons between each pair of completely sequenced genomes. This is a complex and difficult problem in the field of comparative genomics, but solving it will help to better understand the way prokaryotic genomes are evolving. RESULTS: We have developed a suite of programs that automate three essential steps in the study of gene order conservation, and validated them with a set of 107 bacteria and archaea that cover the majority of the prokaryotic taxonomic space. We identified the whole set of shared homologs between two or more species and computed the evolutionary distance separating each pair of homologs. We applied two strategies to extract from the set of homologs a collection of valid orthologs shared by at least two genomes. The first computes the Reciprocal Smallest Distance (RSD) using the PAM distances separating pairs of homologs. The second groups homologs into families and reconstructs each family's evolutionary tree, distinguishing bona fide orthologs as well as paralogs created after the last speciation event. Although the phylogenetic tree method often succeeds where RSD fails, the reverse can occasionally be true. Accordingly, we used the data obtained with either method, or their intersection, to enumerate the orthologs that are adjacent in each pair of genomes, the Positional Orthologous Genes (POGs), and to further study their properties. Once all these synteny blocks had been detected, we showed that POGs are subject to more evolutionary constraints than orthologs outside synteny groups, whatever the taxonomic distance separating the compared organisms.
CONCLUSION: The suite of programs described in this paper allows reliable detection of orthologs and is useful for evaluating gene order conservation in prokaryotes whatever their taxonomic distance. Thus, our approach will facilitate the rapid identification of POGs in the next few years, as we expect to be inundated with thousands of completely sequenced microbial genomes.
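
The reciprocal-smallest-distance idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' program: it assumes the pairwise evolutionary distances between homologs of two genomes have already been computed (the abstract mentions PAM distances), and the function name, the gene identifiers, and the toy distance values are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input: distance between gene `a` of genome A and gene `b` of genome B.
Distances = Dict[Tuple[str, str], float]


def reciprocal_smallest_distance(distances: Distances) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is the closest homolog of a and a is the closest of b."""
    best_for_a: Dict[str, Tuple[str, float]] = {}
    best_for_b: Dict[str, Tuple[str, float]] = {}
    for (a, b), d in distances.items():
        # Keep the smallest distance seen so far in each direction
        # (on ties, the first pair encountered is kept).
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # A pair is retained only if the best hit is mutual.
    return sorted(
        (a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a
    )


# Toy data, invented for illustration only.
toy: Distances = {
    ("a1", "b1"): 10.0,
    ("a1", "b2"): 45.0,
    ("a2", "b1"): 30.0,
    ("a2", "b2"): 50.0,
    ("a3", "b2"): 20.0,
}
pairs = reciprocal_smallest_distance(toy)
print(pairs)
```

In this toy input, `a2` is closest to `b1`, but `b1` is closest to `a1`, so `a2` gets no ortholog call; only mutual best pairs are retained. The tree-based strategy described in the abstract is a separate method and is not covered by this sketch.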


Subjects
Archaea/genetics , Bacteria/genetics , Molecular Evolution , Archaeal Genes , Bacterial Genes , Synteny , Algorithms , Cluster Analysis , Phylogeny , Proteome , Species Specificity