1.
Nature ; 488(7409): 86-90, 2012 Aug 02.
Article in English | MEDLINE | ID: mdl-22859206

ABSTRACT

Land plants associate with a root microbiota distinct from the complex microbial community present in surrounding soil. The microbiota colonizing the rhizosphere (immediately surrounding the root) and the endophytic compartment (within the root) contribute to plant growth, productivity, carbon sequestration and phytoremediation. Colonization of the root occurs despite a sophisticated plant immune system, suggesting finely tuned discrimination of mutualists and commensals from pathogens. Genetic principles governing the derivation of host-specific endophyte communities from soil communities are poorly understood. Here we report the pyrosequencing of the bacterial 16S ribosomal RNA gene of more than 600 Arabidopsis thaliana plants to test the hypotheses that the root rhizosphere and endophytic compartment microbiota of plants grown under controlled conditions in natural soils are sufficiently dependent on the host to remain consistent across different soil types and developmental stages, and sufficiently dependent on host genotype to vary between inbred Arabidopsis accessions. We describe different bacterial communities in two geochemically distinct bulk soils and in rhizosphere and endophytic compartments prepared from roots grown in these soils. The communities in each compartment are strongly influenced by soil type. Endophytic compartments from both soils feature overlapping, low-complexity communities that are markedly enriched in Actinobacteria and specific families from other phyla, notably Proteobacteria. Some bacteria vary quantitatively between plants of different developmental stage and genotype. Our rigorous definition of an endophytic compartment microbiome should facilitate controlled dissection of plant-microbe interactions derived from complex soil communities.


Subject(s)
Arabidopsis/microbiology , Endophytes/classification , Endophytes/isolation & purification , Metagenome , Plant Roots/microbiology , Soil Microbiology , Actinobacteria/genetics , Actinobacteria/isolation & purification , Arabidopsis/classification , Arabidopsis/growth & development , Endophytes/genetics , Genotype , In Situ Hybridization, Fluorescence , Plant Roots/classification , Plant Roots/growth & development , Proteobacteria/genetics , Proteobacteria/isolation & purification , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 16S/isolation & purification , Rhizosphere , Ribotyping , Sequence Analysis, DNA , Symbiosis
2.
Nature ; 462(7276): 1056-60, 2009 Dec 24.
Article in English | MEDLINE | ID: mdl-20033048

ABSTRACT

Sequencing of bacterial and archaeal genomes has revolutionized our understanding of the many roles played by microorganisms. There are now nearly 1,000 completed bacterial and archaeal genomes available, most of which were chosen for sequencing on the basis of their physiology. As a result, the perspective provided by the currently available genomes is limited by a highly biased phylogenetic distribution. To explore the value added by choosing microbial genomes for sequencing on the basis of their evolutionary relationships, we have sequenced and analysed the genomes of 56 culturable species of Bacteria and Archaea selected to maximize phylogenetic coverage. Analysis of these genomes demonstrated pronounced benefits (compared to an equivalent set of genomes randomly selected from the existing database) in diverse areas including the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. Our results strongly support the need for systematic 'phylogenomic' efforts to compile a phylogeny-driven 'Genomic Encyclopedia of Bacteria and Archaea' in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.


Subject(s)
Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Genome, Archaeal/genetics , Genome, Bacterial/genetics , Phylogeny , Actins/chemistry , Amino Acid Sequence , Bacterial Proteins/chemistry , Biodiversity , Databases, Genetic , Genes, rRNA/genetics , Models, Molecular , Molecular Sequence Data , Protein Structure, Tertiary , Sequence Alignment
3.
Microb Ecol ; 65(3): 709-19, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23233090

ABSTRACT

Next-generation sequencing has increased the coverage of microbial diversity surveys by orders of magnitude, but differentiating artifacts from rare environmental sequences remains a challenge. Clustering 16S rRNA sequences into operational taxonomic units (OTUs) organizes sequence data into groups of 97% identity, helping to reduce data volumes and avoid analyzing sequencing artifacts by grouping them with real sequences. Here, we analyze sequence abundance distributions across environmental samples and show that 16S rRNA sequences of >99% identity can represent functionally distinct microorganisms, rendering OTU clustering problematic when the goal is an accurate analysis of organism distribution. Strict postsequencing quality control (QC) filters eliminated the most prevalent artifacts without clustering. Further experiments proved that DNA polymerase errors in polymerase chain reaction (PCR) generate a significant number of substitution errors, most of which pass QC filters. Based on our findings, we recommend minimizing the number of PCR cycles in DNA library preparation and applying strict postsequencing QC filters to reduce the most prevalent artifacts while maintaining a high level of accuracy in diversity estimates. We further recommend correlating rare and abundant sequences across environmental samples, rather than clustering into OTUs, to identify remaining sequence artifacts without losing the resolution afforded by high-throughput sequencing.
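
For readers unfamiliar with the 97% identity grouping this abstract refers to, the following is a minimal Python sketch of greedy threshold clustering. It is not the authors' pipeline: the names `identity` and `greedy_otus` are hypothetical, and the sketch assumes the reads are already aligned to equal length, so identity is simply the fraction of matching positions.

```python
def identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches at >= threshold.

    A sequence that matches no existing centroid becomes a new centroid.
    The result depends on input order, as in any greedy scheme.
    """
    centroids = []  # one representative sequence per cluster
    clusters = []   # clusters[i] holds the members of centroids[i]
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            # no centroid was close enough: start a new cluster
            centroids.append(s)
            clusters.append([s])
    return clusters
```

With a 100-base reference, a read carrying one substitution (99% identity) falls into the reference's cluster at the default threshold, which illustrates the abstract's concern: sequences above 99% identity that may be functionally distinct are merged.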


Subject(s)
Actinomycetales/genetics , Biodiversity , Polymerase Chain Reaction/standards , Actinomycetales/classification , Actinomycetales/isolation & purification , DNA Primers/genetics , DNA, Bacterial/genetics , High-Throughput Nucleotide Sequencing , Polymerase Chain Reaction/methods , RNA, Ribosomal, 16S/genetics
4.
Nature ; 450(7169): 560-5, 2007 Nov 22.
Article in English | MEDLINE | ID: mdl-18033299

ABSTRACT

From the standpoints of both basic research and biotechnology, there is considerable interest in reaching a clearer understanding of the diversity of biological mechanisms employed during lignocellulose degradation. Globally, termites are an extremely successful group of wood-degrading organisms and are therefore important both for their roles in carbon turnover in the environment and as potential sources of biochemical catalysts for efforts aimed at converting wood into biofuels. Only recently have data supported any direct role for the symbiotic bacteria in the gut of the termite in cellulose and xylan hydrolysis. Here we use a metagenomic analysis of the bacterial community resident in the hindgut paunch of a wood-feeding 'higher' Nasutitermes species (which do not contain cellulose-fermenting protozoa) to show the presence of a large, diverse set of bacterial genes for cellulose and xylan hydrolysis. Many of these genes were expressed in vivo or had cellulase activity in vitro, and further analyses implicate spirochete and fibrobacter species in gut lignocellulose degradation. New insights into other important symbiotic functions including H2 metabolism, CO2-reductive acetogenesis and N2 fixation are also provided by this first system-wide gene analysis of a microbial community specialized towards plant lignocellulose degradation. Our results underscore how complex even a 1-μl environment can be.


Subject(s)
Bacteria/metabolism , Genome, Bacterial/genetics , Genomics , Intestines/microbiology , Isoptera/metabolism , Isoptera/microbiology , Wood/metabolism , Animals , Bacteria/enzymology , Bacteria/genetics , Bacteria/isolation & purification , Bioelectric Energy Sources , Carbon/metabolism , Catalytic Domain , Cellulose/metabolism , Costa Rica , Genes, Bacterial/genetics , Glycoside Hydrolases/chemistry , Glycoside Hydrolases/genetics , Glycoside Hydrolases/metabolism , Hydrolysis , Lignin/metabolism , Models, Biological , Molecular Sequence Data , Polymerase Chain Reaction , Symbiosis , Wood/chemistry , Xylans/metabolism
5.
Proc Natl Acad Sci U S A ; 105(23): 8102-7, 2008 Jun 10.
Article in English | MEDLINE | ID: mdl-18535141

ABSTRACT

The candidate division Korarchaeota comprises a group of uncultivated microorganisms that, by their small subunit rRNA phylogeny, may have diverged early from the major archaeal phyla Crenarchaeota and Euryarchaeota. Here, we report the initial characterization of a member of the Korarchaeota with the proposed name, "Candidatus Korarchaeum cryptofilum," which exhibits an ultrathin filamentous morphology. To investigate possible ancestral relationships between deep-branching Korarchaeota and other phyla, we used whole-genome shotgun sequencing to construct a complete composite korarchaeal genome from enriched cells. The genome was assembled into a single contig 1.59 Mb in length with a G + C content of 49%. Of the 1,617 predicted protein-coding genes, 1,382 (85%) could be assigned to a revised set of archaeal Clusters of Orthologous Groups (COGs). The predicted gene functions suggest that the organism relies on a simple mode of peptide fermentation for carbon and energy and lacks the ability to synthesize de novo purines, CoA, and several other cofactors. Phylogenetic analyses based on conserved single genes and concatenated protein sequences positioned the korarchaeote as a deep archaeal lineage with an apparent affinity to the Crenarchaeota. However, the predicted gene content revealed that several conserved cellular systems, such as cell division, DNA replication, and tRNA maturation, resemble the counterparts in the Euryarchaeota. In light of the known composition of archaeal genomes, the Korarchaeota might have retained a set of cellular features that represents the ancestral archaeal form.


Subject(s)
Biological Evolution , Genome, Archaeal/genetics , Korarchaeota/genetics , Cell Cycle , DNA Replication , Energy Metabolism , Evolution, Molecular , Korarchaeota/cytology , Korarchaeota/ultrastructure , Phylogeny , Protein Biosynthesis , Sequence Analysis, DNA , Transcription, Genetic
6.
Environ Microbiol ; 12(1): 118-23, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19725865

ABSTRACT

Massively parallel pyrosequencing of the small subunit (16S) ribosomal RNA gene has revealed that the extent of rare microbial populations in several environments, the 'rare biosphere', is orders of magnitude higher than previously thought. One important caveat with this method is that sequencing error could artificially inflate diversity estimates. Although the per-base error of 16S rDNA amplicon pyrosequencing has been shown to be as good as or lower than Sanger sequencing, no direct assessments of pyrosequencing errors on diversity estimates have been reported. Using only Escherichia coli MG1655 as a reference template, we find that 16S rDNA diversity is grossly overestimated unless relatively stringent read quality filtering and low clustering thresholds are applied. In particular, the common practice of removing reads with unresolved bases and anomalous read lengths is insufficient to ensure accurate estimates of microbial diversity. Furthermore, common and reproducible homopolymer length errors can result in relatively abundant spurious phylotypes further confounding data interpretation. We suggest that stringent quality-based trimming of 16S pyrotags and clustering thresholds no greater than 97% identity should be used to avoid overestimates of the rare biosphere.
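
The quality-based trimming recommended in this abstract can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the filter the authors used: the function names `quality_trim` and `passes_filter` are hypothetical, and the cutoffs (`min_q=20`, `min_len=50`) are placeholder values rather than figures from the paper.

```python
def quality_trim(seq, quals, min_q=20):
    """Truncate a read just before its first base whose Phred score is below min_q.

    `quals` is a list of integer Phred scores, one per base of `seq`.
    """
    for i, q in enumerate(quals):
        if q < min_q:
            return seq[:i]
    return seq


def passes_filter(seq, min_len=50):
    """Keep a trimmed read only if it is long enough and has no unresolved bases (N)."""
    return len(seq) >= min_len and "N" not in seq.upper()
```

The abstract notes that removing reads with unresolved bases and anomalous lengths is not sufficient by itself, which is why the sketch applies per-base quality trimming before the length and N checks.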


Subject(s)
Biodiversity , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA/methods , Cluster Analysis , DNA, Bacterial/genetics , Escherichia coli/genetics , Genes, Bacterial , Genetic Variation , Sequence Alignment
7.
Environ Microbiol ; 12(5): 1205-17, 2010 May.
Article in English | MEDLINE | ID: mdl-20148930

ABSTRACT

Here we report the first metatranscriptomic analysis of gene expression and regulation of 'Candidatus Accumulibacter'-enriched lab-scale sludge during enhanced biological phosphorus removal (EBPR). Medium density oligonucleotide microarrays were generated with probes targeting most predicted genes hypothesized to be important for the EBPR phenotype. RNA samples were collected at the early stage of anaerobic and aerobic phases (15 min after acetate addition and switching to aeration respectively). We detected the expression of a number of genes involved in the carbon and phosphate metabolisms, as proposed by EBPR models (e.g. polyhydroxyalkanoate synthesis, a split TCA cycle through methylmalonyl-CoA pathway, and polyphosphate formation), as well as novel genes discovered through metagenomic analysis. The comparison between the early stage anaerobic and aerobic gene expression profiles showed that expression levels of most genes were not significantly different between the two stages. The majority of upregulated genes in the aerobic sample are predicted to encode functions such as transcription, translation and protein translocation, reflecting the rapid growth phase of Accumulibacter shortly after being switched to aerobic conditions. Components of the TCA cycle and machinery involved in ATP synthesis were also upregulated during the early aerobic phase. These findings support the predictions of EBPR metabolic models that the oxidation of intracellularly stored carbon polymers through the TCA cycle provides ATP for cell growth when oxygen becomes available. Nitrous oxide reductase was among the very few Accumulibacter genes upregulated in the anaerobic sample, suggesting that its expression is likely induced by the deprivation of oxygen.


Subject(s)
Bacterial Proteins/metabolism , Betaproteobacteria/metabolism , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis/methods , Phosphorus/metabolism , Sewage/microbiology , Aerobiosis , Anaerobiosis , Bacterial Proteins/genetics , Betaproteobacteria/genetics , Betaproteobacteria/growth & development , Biodegradation, Environmental , Gene Expression Regulation , Metagenomics , RNA, Bacterial/analysis , RNA, Bacterial/genetics , RNA, Bacterial/isolation & purification
8.
Mol Syst Biol ; 4: 198, 2008.
Article in English | MEDLINE | ID: mdl-18523433

ABSTRACT

To investigate the extent of genetic stratification in structured microbial communities, we compared the metagenomes of 10 successive layers of a phylogenetically complex hypersaline mat from Guerrero Negro, Mexico. We found pronounced millimeter-scale genetic gradients that were consistent with the physicochemical profile of the mat. Despite these gradients, all layers displayed near-identical and acid-shifted isoelectric point profiles due to a molecular convergence of amino-acid usage, indicating that hypersalinity enforces an overriding selective pressure on the mat community.


Subject(s)
Genetics, Microbial , Salinity , Selection, Genetic , Amino Acids/metabolism , Mexico
9.
Nat Biotechnol ; 24(10): 1263-9, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16998472

ABSTRACT

Enhanced biological phosphorus removal (EBPR) is one of the best-studied microbially mediated industrial processes because of its ecological and economic relevance. Despite this, it is not well understood at the metabolic level. Here we present a metagenomic analysis of two lab-scale EBPR sludges dominated by the uncultured bacterium, "Candidatus Accumulibacter phosphatis." The analysis sheds light on several controversies in EBPR metabolic models and provides hypotheses explaining the dominance of A. phosphatis in this habitat, its lifestyle outside EBPR and probable cultivation requirements. Comparison of the same species from different EBPR sludges highlights recent evolutionary dynamics in the A. phosphatis genome that could be linked to mechanisms for environmental adaptation. In spite of an apparent lack of phylogenetic overlap in the flanking communities of the two sludges studied, common functional themes were found, at least one of them complementary to the inferred metabolism of the dominant organism. The present study provides a much needed blueprint for a systems-level understanding of EBPR and illustrates that metagenomics enables detailed, often novel, insights into even well-studied biological systems.


Subject(s)
Betaproteobacteria/genetics , Betaproteobacteria/metabolism , Genome, Bacterial , Phosphorus/metabolism , Sewage/microbiology , Adaptation, Biological , Phosphorus/isolation & purification , Waste Disposal, Fluid
10.
Bioinformatics ; 23(6): 764-6, 2007 Mar 15.
Article in English | MEDLINE | ID: mdl-17234642

ABSTRACT

We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a user-provided relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools is provided to facilitate trait exploration, comparison and analysis. AVAILABILITY: The program, detailed tutorial and examples are available online (http://genome.lbl.gov/vista/TreeQVista).


Subject(s)
Chromosome Mapping/methods , Databases, Genetic , Evolution, Molecular , Information Storage and Retrieval/methods , Models, Genetic , Software , User-Computer Interface , Computer Graphics , Computer Simulation , Database Management Systems , Phylogeny
11.
BMC Genomics ; 8: 460, 2007 Dec 14.
Article in English | MEDLINE | ID: mdl-18081932

ABSTRACT

BACKGROUND: Gene fusion detection - also known as the 'Rosetta Stone' method - involves the identification of fused composite genes in a set of reference genomes, which indicates potential interactions between its un-fused counterpart genes in query genomes. The precision of this method typically improves with an ever-increasing number of reference genomes. RESULTS: In order to explore the usefulness and scope of this approach for protein interaction prediction and generate a high-quality, non-redundant set of interacting pairs of proteins across a wide taxonomic range, we have exhaustively performed gene fusion analysis for 184 genomes using an efficient variant of a previously developed protocol. By analyzing interaction graphs and applying a threshold that limits the maximum number of possible interactions within the largest graph components, we show that we can reduce the number of implausible interactions due to the detection of promiscuous domains. With this generally applicable approach, we generate a robust set of over 2 million distinct and testable interactions encompassing 696,894 proteins in 184 species or strains, most of which have never been the subject of high-throughput experimental proteomics. We investigate the cumulative effect of increasing numbers of genomes on the fidelity and quantity of predictions, and show that, for large numbers of genomes, predictions do not become saturated but continue to grow linearly, for the majority of the species. We also examine the percentage of component (and composite) proteins with relation to the number of genes and further validate the functional categories that are highly represented in this robust set of detected genome-wide interactions. CONCLUSION: We illustrate the phylogenetic and functional diversity of gene fusion events across genomes, and their usefulness for accurate prediction of protein interaction and function.
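
The core idea of the 'Rosetta Stone' method described in this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the protocol the authors ran: the function name `fusion_links` and the input layout (similarity hits given as query protein, composite protein, start and end coordinates on the composite) are hypothetical, and the sketch omits the graph-based threshold the abstract uses to suppress promiscuous domains.

```python
from collections import defaultdict
from itertools import combinations


def fusion_links(hits):
    """Predict interacting protein pairs from gene fusion evidence.

    `hits` is an iterable of (query, composite, start, end) tuples, where
    start/end are inclusive coordinates of the match on the composite protein.
    Two different query proteins are linked when both match the same
    composite protein in non-overlapping regions.
    """
    by_composite = defaultdict(list)
    for query, composite, start, end in hits:
        by_composite[composite].append((query, start, end))

    links = set()
    for regions in by_composite.values():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            disjoint = e1 < s2 or e2 < s1  # the two matches do not overlap
            if q1 != q2 and disjoint:
                links.add(tuple(sorted((q1, q2))))
    return links
```

Requiring non-overlapping matches is what separates a fusion of two distinct components from two homologs that match the same region of the composite protein.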


Subject(s)
Gene Fusion , Gene Regulatory Networks , Arabidopsis/genetics , Bacterial Proteins/metabolism , Chlamydia/genetics , Genetic Variation , Genome , Phylogeny , Plant Proteins/metabolism , Protein Binding , Reproducibility of Results
12.
Bioinformatics ; 22(14): e359-67, 2006 Jul 15.
Article in English | MEDLINE | ID: mdl-16873494

ABSTRACT

The application of shotgun sequencing to environmental samples has revealed a new universe of microbial community genomes (metagenomes) involving previously uncultured organisms. Metagenome analysis, which is expected to provide a comprehensive picture of the gene functions and metabolic capacity for microbial communities, needs to be conducted in the context of a comprehensive data management and analysis system. We present in this paper IMG/M, an experimental metagenome data management and analysis system that is based on the Integrated Microbial Genomes (IMG) system. IMG/M provides tools and viewers for analyzing both metagenomes and isolate genomes individually or in a comparative context. IMG/M is available at http://img.jgi.doe.gov/m.


Subject(s)
Bacterial Physiological Phenomena , Bacterial Proteins/physiology , Database Management Systems , Databases, Genetic , Genome, Bacterial/genetics , Models, Biological , Proteome/metabolism , Information Storage and Retrieval/methods , Signal Transduction/physiology , User-Computer Interface
13.
Nucleic Acids Res ; 33(2): 616-21, 2005.
Article in English | MEDLINE | ID: mdl-15681613

ABSTRACT

Species evolutionary relationships have traditionally been defined by sequence similarities of phylogenetic marker molecules, recently followed by whole-genome phylogenies based on gene order, average ortholog similarity or gene content. Here, we introduce genome conservation, a novel metric of evolutionary distance between species that simultaneously takes into account both gene content and sequence similarity at the whole-genome level. Genome conservation represents a robust distance measure, as demonstrated by accurate phylogenetic reconstructions. The genome conservation matrix for all presently sequenced organisms exhibits a remarkable ability to define evolutionary relationships across all taxonomic ranges. An assessment of taxonomic ranks with genome conservation shows that certain ranks are inadequately described and raises the possibility for a more precise and quantitative taxonomy in the future. All phylogenetic reconstructions are available at the genome phylogeny server: .


Subject(s)
Computational Biology/methods , Genomics/methods , Phylogeny , Bacteria/classification , Bacteria/genetics , Evolution, Molecular , Genome, Bacterial , Proteobacteria/classification , Proteobacteria/genetics
14.
Nucleic Acids Res ; 33(19): 6083-9, 2005.
Article in English | MEDLINE | ID: mdl-16246909

ABSTRACT

The BioCyc database collection is a set of 160 pathway/genome databases (PGDBs) for most eukaryotic and prokaryotic species whose genomes have been completely sequenced to date. Each PGDB in the BioCyc collection describes the genome and predicted metabolic network of a single organism, inferred from the MetaCyc database, which is a reference source on metabolic pathways from multiple organisms. In addition, each bacterial PGDB includes predicted operons for the corresponding species. The BioCyc collection provides a unique resource for computational systems biology, namely global and comparative analyses of genomes and metabolic networks, and a supplement to the BioCyc resource of curated PGDBs. The Omics viewer available through the BioCyc website allows scientists to visualize combinations of gene expression, proteomics and metabolomics data on the metabolic maps of these organisms. This paper discusses the computational methodology by which the BioCyc collection has been expanded, and presents an aggregate analysis of the collection that includes the range of number of pathways present in these organisms, and the most frequently observed pathways. We seek scientists to adopt and curate individual PGDBs within the BioCyc collection. Only by harnessing the expertise of many scientists can we hope to produce biological databases that accurately reflect the depth and breadth of knowledge that the biomedical research community is producing.


Subject(s)
Databases, Genetic , Genome , Animals , Computational Biology , Genome, Archaeal , Genome, Bacterial , Genomics , Humans , Metabolism/genetics
15.
Res Microbiol ; 157(1): 57-68, 2006.
Article in English | MEDLINE | ID: mdl-16431085

ABSTRACT

Using an algorithm for ancestral state inference of gene content, given a large number of extant genome sequences and a phylogenetic tree, we aim to reconstruct the gene content of the last universal common ancestor (LUCA), a hypothetical life form that presumably was the progenitor of the three domains of life. The method allows for gene loss, previously found to be a major factor in shaping gene content, and thus the estimate of LUCA's gene content appears to be substantially higher than that proposed previously, with a typical number of over 1000 gene families, of which more than 90% are also functionally characterized. More precisely, when only prokaryotes are considered, the number varies between 1006 and 1189 gene families while when eukaryotes are also included, this number increases to between 1344 and 1529 families depending on the underlying phylogenetic tree. Therefore, the common belief that the hypothetical genome of LUCA should resemble those of the smallest extant genomes of obligate parasites is not supported by recent advances in computational genomics. Instead, a fairly complex genome similar to those of free-living prokaryotes, with a variety of functional capabilities including metabolic transformation, information processing, membrane/transport proteins and complex regulation, shared between the three domains of life, emerges as the most likely progenitor of life on Earth, with profound repercussions for planetary exploration and exobiology.


Subject(s)
Earth, Planet , Evolution, Molecular , Exobiology , Genome , Phylogeny , Algorithms , Gene Transfer, Horizontal
16.
Nucleic Acids Res ; 31(15): 4632-8, 2003 Aug 01.
Article in English | MEDLINE | ID: mdl-12888524

ABSTRACT

Accurate detection of protein families allows assignment of protein function and the analysis of functional diversity in complete genomes. Recently, we presented a novel algorithm called TribeMCL for the detection of protein families that is both accurate and efficient. This method allows family analysis to be carried out on a very large scale. Using TribeMCL, we have generated a resource called TRIBES that contains protein family information, comprising annotations, protein sequence alignments and phylogenetic distributions describing 311 257 proteins from 83 completely sequenced genomes. The analysis of at least 60 934 detected protein families reveals that, with the essential families excluded, paralogy levels are similar between prokaryotes, irrespective of genome size. The number of essential families is estimated to be between 366 and 426. We also show that the currently known space of protein families is scale free and discuss the implications of this distribution. In addition, we show that smaller families are often formed by shorter proteins and discuss the reasons for this intriguing pattern. Finally, we analyse the functional diversity of protein families in entire genome sequences. The TRIBES protein family resource is accessible at http://www.ebi.ac.uk/research/cgg/tribes/.


Subject(s)
Genome , Proteins/classification , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Cluster Analysis , Databases, Protein , Phylogeny , Proteins/chemistry , Proteins/genetics , Sequence Alignment
17.
BMC Bioinformatics ; 6: 24, 2005 Feb 09.
Article in English | MEDLINE | ID: mdl-15703069

ABSTRACT

BACKGROUND: Current protein clustering methods rely on either sequence or functional similarities between proteins, thereby limiting inferences to one of these areas. RESULTS: Here we report a new approach, named CLAN, which clusters proteins according to both annotation and sequence similarity. This approach is extremely fast, clustering the complete SwissProt database within minutes. It is also accurate, recovering consistent protein families agreeing on average in more than 97% with sequence-based protein families from Pfam. Discrepancies between sequence- and annotation-based clusters were scrutinized and the reasons reported. We demonstrate examples for each of these cases, and thoroughly discuss an example of a propagated error in SwissProt: a vacuolar ATPase subunit M9.2 erroneously annotated as vacuolar ATP synthase subunit H. The CLAN algorithm is available from the authors and the CLAN database is accessible at http://maine.ebi.ac.uk:8000/cgi-bin/clan/ClanSearch.pl. CONCLUSIONS: CLAN creates refined function-and-sequence specific protein families that can be used for identification and annotation of unknown family members. It also allows easy identification of erroneous annotations by spotting inconsistencies between similarities on annotation and sequence levels.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Adenosine Triphosphatases/chemistry , Adenosine Triphosphate/chemistry , Algorithms , Cluster Analysis , Computer Graphics , Databases, Factual , Databases, Genetic , Databases, Protein , False Negative Reactions , Genome , Humans , Information Storage and Retrieval , Internet , Models, Statistical , Programming Languages , Protein Folding , Reproducibility of Results , Sequence Alignment , Sequence Analysis, Protein , Software , Structural Homology, Protein , User-Computer Interface , Vacuolar Proton-Translocating ATPases/chemistry
18.
ISME J ; 4(5): 642-7, 2010 May.
Article in English | MEDLINE | ID: mdl-20090784

ABSTRACT

Pyrosequencing of 16S rRNA gene amplicons for microbial community profiling can, for equivalent costs, yield more than two orders of magnitude more sensitivity than traditional PCR cloning and Sanger sequencing. With this increased sensitivity and the ability to analyze multiple samples in parallel, it has become possible to evaluate several technical aspects of PCR-based community structure profiling methods. We tested the effect of amplicon length and primer pair on estimates of species richness (number of species) and evenness (relative abundance of species) by assessing the potentially tractable microbial community residing in the termite hindgut. Two regions of the 16S rRNA gene were sequenced from one of two common priming sites, spanning the V1-V2 or V8 regions, using amplicons ranging in length from 352 to 1443 bp. Our results show that both amplicon length and primer pair markedly influence estimates of richness and evenness. However, estimates of species evenness are consistent among different primer pairs targeting the same region. These results highlight the importance of experimental methodology when comparing diversity estimates across communities.
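
The two quantities compared in this abstract, richness and evenness, can be made concrete with a short Python sketch. The abstract does not say which evenness estimator was used; the sketch below assumes Pielou's index (Shannon entropy divided by its maximum), and the function names are hypothetical.

```python
import math


def richness(counts):
    """Number of taxa observed at least once; `counts` maps taxon -> read count."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Shannon entropy (natural log) of the relative abundances."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def pielou_evenness(counts):
    """Shannon entropy scaled to [0, 1] by its maximum, ln(richness).

    Evenness is undefined for fewer than two taxa; NaN is returned in that case.
    """
    s = richness(counts)
    if s < 2:
        return math.nan
    return shannon(counts) / math.log(s)
```

Two communities with the same richness can differ sharply in evenness: four taxa at equal abundance score 1.0, whereas four taxa with one holding 97% of the reads score close to 0.12.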


Subject(s)
Bacteria/classification , Bacterial Typing Techniques/methods , Isoptera/microbiology , Polymerase Chain Reaction , Sequence Analysis, DNA/methods , Animals , Bacteria/genetics , Bacterial Typing Techniques/economics , DNA Primers , DNA, Bacterial/genetics , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA/economics
19.
PLoS One ; 4(1): e4192, 2009.
Article in English | MEDLINE | ID: mdl-19145256

ABSTRACT

Halothermothrix orenii is a strictly anaerobic thermohalophilic bacterium isolated from sediment of a Tunisian salt lake. It belongs to the order Halanaerobiales in the phylum Firmicutes. The complete sequence revealed that the genome consists of one circular chromosome of 2,578,146 bp encoding 2,451 predicted genes. This is the first genome sequence of an organism belonging to the Halanaerobiales. Features of both Gram positive and Gram negative bacteria were identified, the most prominent being the presence of both a sporulating mechanism typical of Firmicutes and a characteristic Gram negative lipopolysaccharide. Protein sequence analyses and metabolic reconstruction reveal a unique combination of strategies for thermophilic and halophilic adaptation. H. orenii can serve as a model organism for the study of the evolution of the Gram negative phenotype as well as the adaptation under thermohalophilic conditions and the development of biotechnological applications under conditions that require high temperatures and high salt concentrations.


Subject(s)
Bacteria, Anaerobic/genetics , Genome, Bacterial , Halobacteriales/genetics , DNA, Circular/genetics , Gram-Negative Bacteria/genetics , Halogens , Hot Temperature , Lipopolysaccharides , Water Microbiology
20.
Nat Rev Microbiol ; 6(3): 181-6, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18157154

ABSTRACT

Arrays of clustered, regularly interspaced short palindromic repeats (CRISPRs) are widespread in the genomes of many bacteria and almost all archaea. These arrays are composed of direct repeats that are separated by similarly sized non-repetitive spacers. CRISPR arrays, together with a group of associated proteins, confer resistance to phages, possibly by an RNA-interference-like mechanism. This Progress discusses the structure and function of this newly recognized antiviral mechanism.


Subject(s)
Archaea/genetics , Bacteria/genetics , Interspersed Repetitive Sequences/physiology , Archaea/virology , Bacteria/virology , Bacterial Proteins/genetics , Bacteriophages/physiology , DNA, Intergenic , Gene Silencing , Genome, Archaeal , Genome, Bacterial , Multigene Family/genetics , Viral Interference