2.
Nucleic Acids Res ; 48(D1): D579-D589, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31647104

ABSTRACT

Large-scale genome sequencing and the increasingly massive use of high-throughput approaches produce a vast amount of new information that completely transforms our understanding of thousands of microbial species. However, despite the development of powerful bioinformatics approaches, full interpretation of the content of these genomes remains a difficult task. Launched in 2005, the MicroScope platform (https://www.genoscope.cns.fr/agc/microscope) has been under continuous development and provides analysis for prokaryotic genome projects together with metabolic network reconstruction and post-genomic experiments allowing users to improve the understanding of gene functions. Here we present new improvements of the MicroScope user interface for genome selection, navigation and expert gene annotation. Automatic functional annotation procedures of the platform have also been updated and we added several new tools for the functional annotation of genes and genomic regions. We finally focus on new tools and pipeline developed to perform comparative analyses on hundreds of genomes based on pangenome graphs. To date, MicroScope contains data for >11 800 microbial genomes, part of which are manually curated and maintained by microbiologists (>4500 personal accounts in September 2019). The platform enables collaborative work in a rich comparative genomic context and improves community-based curation efforts.


Subjects
Genes, Archaeal; Genes, Bacterial; Genomics/methods; Molecular Sequence Annotation/methods; Software; Databases, Genetic; Metabolic Networks and Pathways
3.
Brief Bioinform ; 20(4): 1071-1084, 2019 07 19.
Article in English | MEDLINE | ID: mdl-28968784

ABSTRACT

The overwhelming list of new bacterial genomes becoming available on a daily basis makes accurate genome annotation an essential step that ultimately determines the relevance of thousands of genomes stored in public databanks. The MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Starting from the results of our syntactic, functional and relational annotation pipelines, MicroScope provides an integrated environment for the expert annotation and comparative analysis of prokaryotic genomes. It combines tools and graphical interfaces to analyze genomes and to perform the manual curation of gene function in a comparative genomics and metabolic context. In this article, we describe the free-of-charge MicroScope services for the annotation and analysis of microbial (meta)genomes, transcriptomic and re-sequencing data. Then, the functionalities of the platform are presented in a way that provides practical guidance and help to nonspecialists in bioinformatics. Newly integrated analysis tools (i.e. prediction of virulence and resistance genes in bacterial genomes) and an original, recently developed method (the pan-genome graph representation) are also described. Integrated environments such as MicroScope clearly contribute, through the user community, to maintaining accurate resources.


Subjects
Genome, Microbial; Genomics/methods; Molecular Sequence Annotation/methods; Software; Computational Biology; Computer Graphics; Database Management Systems; Databases, Chemical; Genomics/statistics & numerical data; Internet; Metabolic Networks and Pathways/genetics; Microbiological Phenomena; Molecular Sequence Annotation/statistics & numerical data; User-Computer Interface
4.
Bioinformatics ; 36(Suppl_2): i651-i658, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381850

ABSTRACT

MOTIVATION: Horizontal gene transfer (HGT) is a major source of variability in prokaryotic genomes. Regions of genome plasticity (RGPs) are clusters of genes located in highly variable genomic regions. Most of them arise from HGT and correspond to genomic islands (GIs). The study of those regions at the species level has become increasingly difficult with the data deluge of genomes. To date, no methods are available to identify GIs using hundreds of genomes to explore their diversity. RESULTS: We present here the panRGP method that predicts RGPs using pangenome graphs made of all available genomes for a given species. It allows the study of thousands of genomes in order to access the diversity of RGPs and to predict spots of insertions. It gave the best predictions when benchmarked alongside other GI detection tools against a reference dataset. In addition, we illustrated its use on metagenome assembled genomes by redefining the borders of the leuX tRNA hotspot, a well-studied spot of insertion in Escherichia coli. panRGP is a scalable and reliable tool to predict GIs and spots, making it an ideal approach for large comparative studies. AVAILABILITY AND IMPLEMENTATION: The methods presented in the current work are available through the following software: https://github.com/labgem/PPanGGOLiN. Detailed results and scripts to compute the benchmark metrics are available at https://github.com/axbazin/panrgp_supdata.


Subjects
Genomic Islands; Software; Gene Transfer, Horizontal; Genomic Islands/genetics; Genomics; Metagenome
5.
PLoS Comput Biol ; 16(3): e1007732, 2020 03.
Article in English | MEDLINE | ID: mdl-32191703

ABSTRACT

The use of comparative genomics for functional, evolutionary, and epidemiological studies requires methods to classify gene families in terms of occurrence in a given species. These methods usually lack multivariate statistical models to infer the partitions and the optimal number of classes and do not account for genome organization. We introduce a graph structure to model pangenomes in which nodes represent gene families and edges represent genomic neighborhood. Our method, named PPanGGOLiN, partitions nodes using an Expectation-Maximization algorithm based on a multivariate Bernoulli Mixture Model coupled with a Markov Random Field. This approach takes into account the topology of the graph and the presence/absence of genes in pangenomes to classify gene families into persistent, cloud, and one or several shell partitions. By analyzing the partitioned pangenome graphs of isolate genomes from 439 species and metagenome-assembled genomes from 78 species, we demonstrate that our method is effective in estimating the persistent genome. Interestingly, it shows that the shell genome is a key element to understand genome dynamics, presumably because it reflects how genes present at intermediate frequencies drive adaptation of species, and its proportion in genomes is independent of genome size. The graph-based approach proposed by PPanGGOLiN is useful to depict the overall genomic diversity of thousands of strains in a compact structure and provides an effective basis for very large scale comparative genomics. The software is freely available at https://github.com/labgem/PPanGGOLiN.
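
The graph structure described in this abstract (nodes are gene families, edges record genomic neighborhood) can be illustrated with a minimal Python sketch. It is not PPanGGOLiN's code or API: the function name `build_pangenome_graph`, the input layout and the family identifiers are all hypothetical, and the statistical partitioning step (Expectation-Maximization with a Markov Random Field) is deliberately left out.

```python
from collections import defaultdict


def build_pangenome_graph(genomes):
    """Build a toy pangenome graph.

    `genomes` maps a genome name to a list of contigs; each contig is an
    ordered list of gene family identifiers (hypothetical input layout).
    Returns two dicts: family -> genomes carrying it, and
    unordered family pair -> genomes in which the two are adjacent.
    """
    node_genomes = defaultdict(set)
    edge_genomes = defaultdict(set)
    for name, contigs in genomes.items():
        for contig in contigs:
            for family in contig:
                node_genomes[family].add(name)
            # Consecutive genes on a contig define a neighborhood edge.
            for left, right in zip(contig, contig[1:]):
                if left != right:
                    edge_genomes[frozenset((left, right))].add(name)
    return dict(node_genomes), dict(edge_genomes)


# Illustrative data only: three made-up genomes with one contig each.
example_genomes = {
    "g1": [["dnaA", "famB", "famC"]],
    "g2": [["dnaA", "famB", "famD", "famC"]],
    "g3": [["dnaA", "famC"]],
}
nodes, edges = build_pangenome_graph(example_genomes)
for family, carriers in sorted(nodes.items()):
    print(family, "present in", len(carriers), "of", len(example_genomes), "genomes")
```

The per-family genome sets give the presence/absence information and the edge dictionary gives the topology; these are the two inputs the abstract says the partitioning takes into account.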


Subjects
Genome, Bacterial/genetics; Genomics/methods; Software; Algorithms; Bacteria/classification; Bacteria/genetics; Multivariate Analysis
6.
Appl Environ Microbiol ; 86(20)2020 10 01.
Article in English | MEDLINE | ID: mdl-32769182

ABSTRACT

We sought to identify and study the antibiofilm protein secreted by the marine bacterium Pseudoalteromonas sp. strain 3J6. The latter is active against marine and terrestrial bacteria, including Pseudomonas aeruginosa clinical strains forming different biofilm types. Several amino acid sequences were obtained from the partially purified antibiofilm protein, named alterocin. The Pseudoalteromonas sp. 3J6 genome was sequenced, and a candidate alt gene was identified by comparing the genome-encoded proteins to the sequences from purified alterocin. Expressing the alt gene in another nonactive Pseudoalteromonas sp. strain, 3J3, demonstrated that it is responsible for the antibiofilm activity. Alterocin is a 139-residue protein that includes a predicted 20-residue signal sequence, which would be cleaved off upon export by the general secretion system. No sequence homology was found between alterocin and proteins of known functions. The alt gene is not part of an operon and adjacent genes do not seem related to alterocin production, immunity, or regulation, suggesting that these functions are not fulfilled by devoted proteins. During growth in liquid medium, the alt mRNA level peaked during the stationary phase. A single promoter was experimentally identified, and several inverted repeats could be binding sites for regulators. alt genes were found in about 30% of the Pseudoalteromonas genomes and in only a few instances of other marine bacteria of the Hahella and Paraglaciecola genera. Comparative genomics yielded the hypothesis that alt gene losses occurred within the Pseudoalteromonas genus. Overall, alterocin is a novel kind of antibiofilm protein of ecological and biotechnological interest. IMPORTANCE: Biofilms are microbial communities that develop on solid surfaces or interfaces and are detrimental in a number of fields, including for example food industry, aquaculture, and medicine. In the latter, antibiotics are insufficient to clear biofilm infections, leading to chronic infections such as in the case of infection by Pseudomonas aeruginosa of the lungs of cystic fibrosis patients. Antibiofilm molecules are thus urgently needed to be used in conjunction with conventional antibiotics, as well as in other fields of application, especially if they are environmentally friendly molecules. Here, we describe alterocin, a novel antibiofilm protein secreted by a marine bacterium belonging to the Pseudoalteromonas genus, and its gene. Alterocin homologs were found in about 30% of Pseudoalteromonas strains, indicating that this new family of antibiofilm proteins likely plays an important albeit nonessential function in the biology of these bacteria. This study opens up the possibility of a variety of applications.


Subjects
Anti-Bacterial Agents/pharmacology; Bacterial Proteins/genetics; Biofilms/drug effects; Pseudoalteromonas/genetics; Bacterial Proteins/biosynthesis
8.
Nat Chem Biol ; 13(8): 858-866, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28581482

ABSTRACT

Experimental validation of enzyme function is crucial for genome interpretation, but it remains challenging because it cannot be scaled up to accommodate the constant accumulation of genome sequences. We tackled this issue for the MetA and MetX enzyme families, phylogenetically unrelated families of acyl-L-homoserine transferases involved in L-methionine biosynthesis. Members of these families are prone to incorrect annotation because MetX and MetA enzymes are assumed to always use acetyl-CoA and succinyl-CoA, respectively. We determined the enzymatic activities of 100 enzymes from diverse species, and interpreted the results by structural classification of active sites based on protein structure modeling. We predict that >60% of the 10,000 sequences from these families currently present in databases are incorrectly annotated, and suggest that acetyl-CoA was originally the sole substrate of these isofunctional enzymes, which evolved to use exclusively succinyl-CoA in the most recent bacteria. We also uncovered a divergent subgroup of MetX enzymes in fungi that participate only in L-cysteine biosynthesis as O-succinyl-L-serine transferases.


Subjects
Acetyltransferases/metabolism; Evolution, Molecular; Methionine/biosynthesis; Acinetobacter/enzymology; Escherichia coli/enzymology
9.
Nucleic Acids Res ; 45(D1): D517-D528, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899624

ABSTRACT

The annotation of genomes from NGS platforms needs to be automated and fully integrated. However, maintaining consistency and accuracy in genome annotation is a challenging problem because millions of protein database entries are not assigned reliable functions. This shortcoming limits the knowledge that can be extracted from genomes and metabolic models. Launched in 2005, the MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Effective comparative analysis requires a consistent and complete view of biological data, and therefore, support for reviewing the quality of functional annotation is critical. MicroScope allows users to analyze microbial (meta)genomes together with post-genomic experiment results if any (i.e. transcriptomics, re-sequencing of evolved strains, mutant collections, phenotype data). It combines tools and graphical interfaces to analyze genomes and to perform the expert curation of gene functions in a comparative context. Starting with a short overview of the MicroScope system, this paper focuses on some major improvements of the Web interface, mainly for the submission of genomic data and on original tools and pipelines that have been developed and integrated in the platform: computation of pan-genomes and prediction of biosynthetic gene clusters. Today the resource contains data for more than 6000 microbial genomes, and among the 2700 personal accounts (65% of which are now from foreign countries), 14% of the users are performing expert annotations, on at least a weekly basis, contributing to improve the quality of microbial genome annotations.


Subjects
Databases, Genetic; Metagenome; Metagenomics/methods; Microbiota/genetics; Computational Biology/methods; Evolution, Molecular; Metabolome; Metabolomics/methods; Multigene Family; Polymorphism, Single Nucleotide; Software
10.
BMC Bioinformatics ; 19(1): 132, 2018 04 11.
Article in English | MEDLINE | ID: mdl-29642842

ABSTRACT

BACKGROUND: High quality functional annotation is essential for understanding the phenotypic consequences encoded in a genome. Despite improvements in bioinformatics methods, millions of sequences in databanks are not assigned reliable functions. The curation of protein functions in the context of biological processes is a way to evaluate and improve their annotation. RESULTS: We developed an expert system using paraconsistent logic, named GROOLS (Genomic Rule Object-Oriented Logic System), that evaluates the completeness and the consistency of predicted functions through biological processes like metabolic pathways. Using a generic and hierarchical representation of knowledge, biological processes are modeled in a graph from which observations (i.e. predictions and expectations) are propagated by rules. At the end of the reasoning, conclusions are assigned to biological process components and highlight uncertainties and inconsistencies. Results on 14 microbial organisms are presented. CONCLUSIONS: GROOLS software is designed to evaluate the overall accuracy of functional unit and pathway predictions according to organism experimental data like growth phenotypes. It assists biocurators in the functional annotation of proteins by focusing on missing or contradictory observations.


Subjects
Algorithms; Biological Phenomena; Computational Biology/methods; Genome; Molecular Sequence Annotation; Software; Acinetobacter/genetics; Biosynthetic Pathways/genetics; Cysteine/biosynthesis; Databases, Factual
11.
Environ Microbiol ; 18(10): 3403-3424, 2016 10.
Article in English | MEDLINE | ID: mdl-26913973

ABSTRACT

By the time the complete genome sequence of the soil bacterium Pseudomonas putida KT2440 was published in 2002 (Nelson et al., ) this bacterium was considered a potential agent for environmental bioremediation of industrial waste and a good colonizer of the rhizosphere. However, neither the annotation tools available at that time nor the scarcely available omics data (let alone metabolic modeling and other now-common systems biology approaches) allowed them to anticipate the astonishing capacities that are encoded in the genetic complement of this unique microorganism. In this work we have adopted a suite of state-of-the-art genomic analysis tools to revisit the functional and metabolic information encoded in the chromosomal sequence of strain KT2440. We identified 242 new protein-coding genes and re-annotated the functions of 1548 genes, which are linked to almost 4900 PubMed references. Catabolic pathways for 92 compounds (carbon, nitrogen and phosphorus sources) that could not be accommodated by the previously constructed metabolic models were also predicted. The resulting examination not only accounts for some of the stress tolerance traits known in P. putida but also recognizes the capacity of this bacterium to perform difficult redox reactions, thereby multiplying its value as a platform microorganism for industrial biotechnology.


Subjects
Genome, Bacterial; Pseudomonas putida/genetics; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Carbon/metabolism; Genomics; Nitrogen/metabolism; Pseudomonas putida/metabolism
12.
Nat Chem Biol ; 10(1): 42-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24240508

ABSTRACT

Millions of protein database entries are not assigned reliable functions, preventing the full understanding of chemical diversity in living organisms. Here, we describe an integrated strategy for the discovery of various enzymatic activities catalyzed within protein families of unknown or little known function. This approach relies on the definition of a generic reaction conserved within the family, high-throughput enzymatic screening on representatives, structural and modeling investigations and analysis of genomic and metabolic context. As a proof of principle, we investigated the DUF849 Pfam family and unearthed 14 potential new enzymatic activities, leading to the designation of these proteins as β-keto acid cleavage enzymes. We propose an in vivo role for four enzymatic activities and suggest key residues for guiding further functional annotation. Our results show that the functional diversity within a family may be largely underestimated. The extension of this strategy to other families will improve our knowledge of the enzymatic landscape.


Subjects
Enzymes/metabolism; Enzymes/chemistry; Protein Conformation
13.
BMC Bioinformatics ; 16: 385, 2015 Nov 14.
Article in English | MEDLINE | ID: mdl-26573681

ABSTRACT

BACKGROUND: Metabolism is generally modeled by directed networks where nodes represent reactions and/or metabolites. In order to explore metabolic pathway conservation and divergence among organisms, previous studies were based on graph alignment to find similar pathways. A few years ago, the concept of chemical transformation modules, also called reaction modules, was introduced; it corresponds to sequences of chemical transformations which are conserved in metabolism. We propose here a novel graph representation of the metabolic network where reactions sharing the same chemical transformation type are grouped in Reaction Molecular Signatures (RMS). RESULTS: RMS were automatically computed for all reactions and encode changes in atoms and bonds. A reaction network containing all available metabolic knowledge was then reduced by an aggregation of reaction nodes and edges to obtain an RMS network. Paths in this network were explored and a substantial number of conserved chemical transformation modules was detected. Furthermore, this graph-based formalism allows us to define several path scores reflecting different biological conservation meanings. These scores are significantly higher for paths corresponding to known metabolic pathways and were used conjointly to build association rules that should predict metabolic pathway types like biosynthesis or degradation. CONCLUSIONS: This representation of metabolism in an RMS network offers new insights to capture relevant metabolic contexts. Furthermore, along with genomic context methods, it should improve the detection of gene clusters corresponding to new metabolic pathways.
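
The reduction step in this abstract (aggregating reaction nodes and edges into a signature-level network) can be pictured with a small Python sketch. This is an illustration under strong assumptions, not the authors' implementation: the signature labels are made-up strings, whereas the abstract states that real signatures encode changes in atoms and bonds, and the function name `collapse_to_signature_network` is hypothetical.

```python
from collections import Counter


def collapse_to_signature_network(reaction_edges, signature_of):
    """Aggregate a directed reaction network into a signature network.

    `reaction_edges` is an iterable of (source_reaction, target_reaction)
    pairs; `signature_of` maps each reaction to its signature label.
    Returns a Counter: (source_signature, target_signature) -> number of
    reaction-level edges merged into that signature-level edge.
    """
    merged = Counter()
    for source, target in reaction_edges:
        merged[(signature_of[source], signature_of[target])] += 1
    return merged


# Illustrative data only: five made-up reactions and three made-up signatures.
example_edges = [("r1", "r2"), ("r3", "r4"), ("r2", "r5")]
example_signatures = {
    "r1": "S_dehydrogenation",
    "r3": "S_dehydrogenation",
    "r2": "S_hydration",
    "r4": "S_hydration",
    "r5": "S_decarboxylation",
}
network = collapse_to_signature_network(example_edges, example_signatures)
for (source, target), count in sorted(network.items()):
    print(source, "->", target, "supported by", count, "reaction edge(s)")
```

In this toy setting, a signature-level edge backed by several reaction-level edges is one simple way to picture a transformation step that recurs across metabolism; the path scores mentioned in the abstract are not reproduced here.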


Subjects
Metabolic Networks and Pathways; Models, Chemical; Genome; Multigene Family
15.
Nucleic Acids Res ; 41(Database issue): D636-47, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193269

ABSTRACT

MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genome annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated, and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest.


Subjects
Bacteria/genetics; Bacteria/metabolism; Databases, Genetic; Genome, Bacterial; Enzymes/genetics; Evolution, Molecular; Gene Expression Profiling; Genome, Archaeal; Genomics; Internet; Metabolic Networks and Pathways/genetics; Software; Synteny; Systems Integration
17.
Nat Commun ; 15(1): 4933, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38858403

ABSTRACT

Native amine dehydrogenases offer sustainable access to chiral amines, so the search for scaffolds capable of converting more diverse carbonyl compounds is required to reach the full potential of this alternative to conventional synthetic reductive aminations. Here we report a multidisciplinary strategy combining bioinformatics, chemoinformatics and biocatalysis to extensively screen billions of sequences in silico and to efficiently find native amine dehydrogenase features using computational approaches. In this way, we achieve a comprehensive overview of the initial native amine dehydrogenase family, extending it from 2,011 to 17,959 sequences, and identify native amine dehydrogenases with non-reported substrate spectra, including hindered carbonyls and ethyl ketones, and accepting methylamine and cyclopropylamine as amine donors. We also present preliminary model-based structural information to inform the design of potential (R)-selective amine dehydrogenases, as native amine dehydrogenases are mostly (S)-selective. This integrated strategy paves the way for expanding the resource of other enzyme families and for highlighting enzymes with original features.


Subjects
Amines; Amines/metabolism; Amines/chemistry; Substrate Specificity; Oxidoreductases Acting on CH-NH Group Donors/metabolism; Oxidoreductases Acting on CH-NH Group Donors/genetics; Oxidoreductases Acting on CH-NH Group Donors/chemistry; Computational Biology/methods; Biocatalysis; Biodiversity; Models, Molecular
18.
BMC Genomics ; 14: 286, 2013 Apr 27.
Article in English | MEDLINE | ID: mdl-23622346

ABSTRACT

BACKGROUND: Nocardia cyriacigeorgica is recognized as one of the most prevalent etiological agents of human nocardiosis. Human exposure to these Actinobacteria stems from direct contact with contaminated environmental matrices. The full genome sequence of N. cyriacigeorgica strain GUH-2 was studied to infer major trends in its evolution, including the acquisition of novel genetic elements that could explain its ability to thrive in multiple habitats. RESULTS: The N. cyriacigeorgica strain GUH-2 genome is 6.19 Mb long; 82.7% of its CDS have homologs in at least one other actinobacterial genome, and 74.5% of these are found in N. farcinica. Among N. cyriacigeorgica specific CDS, some are likely implicated in niche specialization such as those involved in denitrification and RuBisCO production, and are found in regions of genomic plasticity (RGP). Overall, 22 RGP were identified in this genome, representing 11.4% of its content. Some of these RGP encode a recombinase and IS elements which are indicative of genomic instability. CDS playing a part in virulence were identified in this genome such as those involved in mammalian cell entry or encoding a superoxide dismutase. CDS encoding non ribosomal peptide synthetases (NRPS) and polyketide synthases (PKS) were identified, with some being likely involved in the synthesis of siderophores and toxins. COG analyses showed this genome to have an organization similar to environmental Actinobacteria. CONCLUSION: The N. cyriacigeorgica GUH-2 genome shows features suggesting a diversification from an ancestral saprophytic state. The ability of GUH-2 to acquire foreign DNA was found to be significant and to have led to functional changes likely beneficial for its environmental cycle and opportunistic colonization of a human host.


Subjects
Adaptation, Physiological/genetics; Evolution, Molecular; Genome, Bacterial; Nocardia/genetics; Actinobacteria/genetics; Animals; Comparative Genomic Hybridization; DNA Transposable Elements; DNA, Bacterial/genetics; Female; Metabolome; Mice; Mice, Inbred BALB C; Nocardia/pathogenicity; Phylogeny; Synteny; Virulence
19.
Microbiology (Reading) ; 159(Pt 4): 757-770, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23429746

ABSTRACT

Continuous updating of the genome sequence of Bacillus subtilis, the model of the Firmicutes, is a basic requirement needed by the biology community. In this work new genomic objects have been included (toxin/antitoxin genes and small RNA genes) and the metabolic network has been entirely updated. The curated view of the validated metabolic pathways present in the organism as of 2012 shows several significant differences from pathways present in the other bacterial reference, Escherichia coli: variants in synthesis of cofactors (thiamine, biotin, bacillithiol), amino acids (lysine, methionine), branched-chain fatty acids, tRNA modification and RNA degradation. In this new version, gene products that are enzymes or transporters are explicitly linked to the biochemical reactions of the RHEA reaction resource (http://www.ebi.ac.uk/rhea/), while novel compound entries have been created in the database Chemical Entities of Biological Interest (http://www.ebi.ac.uk/chebi/). The newly annotated sequence is deposited at the International Nucleotide Sequence Data Collaboration with accession number AL009126.4.


Subjects
Bacillus subtilis/metabolism; Bacterial Proteins/metabolism; Genome, Bacterial; Metabolic Networks and Pathways/genetics; Bacillus subtilis/genetics; Bacterial Proteins/genetics; Genomics; Molecular Sequence Annotation; Molecular Sequence Data; Sequence Analysis, DNA
20.
PLoS Comput Biol ; 8(5): e1002540, 2012 May.
Article in English | MEDLINE | ID: mdl-22693442

ABSTRACT

Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
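
The first step of the strategy above (locating "genomic metabolons") can be sketched in Python as a connected-components search. This is a toy reading of the definition given in the abstract, not CanOE's algorithm: the function name `find_metabolons`, the `max_gap` co-localization threshold, the one-reaction-per-gene simplification and all identifiers in the example are assumptions made for illustration.

```python
from itertools import combinations


def find_metabolons(genes, reaction_metabolites, max_gap=2):
    """Group co-localized genes whose reactions share a metabolite.

    `genes` is a list of (gene_id, rank_on_chromosome, reaction_id);
    `reaction_metabolites` maps a reaction_id to a set of metabolite ids.
    `max_gap` (an assumed parameter) is the largest rank difference still
    treated as co-localized. Returns a list of gene-id sets of size >= 2.
    """
    neighbours = {gene_id: set() for gene_id, _, _ in genes}
    for (g1, rank1, rx1), (g2, rank2, rx2) in combinations(genes, 2):
        close = abs(rank1 - rank2) <= max_gap
        shared = reaction_metabolites[rx1] & reaction_metabolites[rx2]
        if close and shared:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    # Connected components of the gene-gene link graph (depth-first search).
    seen, metabolons = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbours[gene] - component)
        seen |= component
        if len(component) > 1:
            metabolons.append(component)
    return metabolons


# Illustrative data only: geneD shares metabolites but lies far away.
example_genes = [
    ("geneA", 1, "R1"),
    ("geneB", 2, "R2"),
    ("geneC", 3, "R3"),
    ("geneD", 40, "R2"),
]
example_reactions = {"R1": {"m1", "m2"}, "R2": {"m2", "m3"}, "R3": {"m9"}}
print(find_metabolons(example_genes, example_reactions))
```

In a real metabolic network, ubiquitous compounds such as water or ATP would link almost every pair of reactions, so a practical version would probably need to exclude them from `reaction_metabolites`; that filtering is omitted here.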


Subjects
Enzymes/genetics; Genome, Archaeal; Genome, Bacterial; Genomics/methods; Metabolome/genetics; Metabolomics/methods; Archaeal Proteins/genetics; Archaeal Proteins/metabolism; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Databases, Genetic; Enzymes/metabolism; Models, Genetic