Results 1 - 9 of 9
1.
Brief Bioinform ; 23(5), 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36088548

ABSTRACT

A knowledge-based grouping of genes into pathways or functional units is essential for describing and understanding cellular complexity. However, it is not always clear a priori how and at what level of specificity functionally interconnected genes should be partitioned into pathways, for a given application. Here, we assess and compare nine existing and two conceptually novel functional classification systems, with respect to their discovery power and generality in gene set enrichment testing. We base our assessment on a collection of nearly 2000 functional genomics datasets provided by users of the STRING database. With these real-life and diverse queries, we assess which systems typically provide the most specific and complete enrichment results. We find many structural and performance differences between classification systems. Overall, the well-established, hierarchically organized pathway annotation systems yield the best enrichment performance, despite covering substantial parts of the human genome in general terms only. On the other hand, the more recent unsupervised annotation systems perform strongest in understudied areas and organisms, and in detecting more specific pathways, albeit with less informative labels.


Subjects
Genomics , Software , Databases, Factual , Databases, Genetic , Genomics/methods , Humans
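
The abstract above refers to gene set enrichment testing without naming a specific statistic. As an illustration only, the sketch below uses one common form, an over-representation test based on the hypergeometric tail probability. The function name and parameters are hypothetical and are not taken from the article or from STRING.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_in_set: int,
                           n_query: int, n_overlap: int) -> float:
    """Probability of seeing at least n_overlap set members in the query.

    Models the query as n_query genes drawn without replacement from
    n_background genes, of which n_in_set belong to the pathway or gene set.
    This is an assumed, textbook over-representation test, not the
    article's own procedure.
    """
    max_overlap = min(n_in_set, n_query)
    total = comb(n_background, n_query)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total
```

For example, with 10 background genes, a set of 5, and a query of 2 genes that both fall in the set, the function returns 10/45 (about 0.22). A real analysis would also correct for testing many gene sets at once.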
2.
Bioinformatics ; 33(23): 3808-3810, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-28961926

ABSTRACT

MOTIVATION: Ribosomal RNA profiling has become crucial to studying microbial communities, but meaningful taxonomic analysis and inter-comparison of such data are still hampered by technical limitations, between-study design variability and inconsistencies between taxonomies used. RESULTS: Here we present MAPseq, a framework for reference-based rRNA sequence analysis that is up to 30% more accurate (F½ score) and up to one hundred times faster than existing solutions, providing in a single run multiple taxonomy classifications and hierarchical operational taxonomic unit mappings, for rRNA sequences in both amplicon and shotgun sequencing strategies, and for datasets of virtually any size. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available at https://github.com/jfmrod/mapseq. CONTACT: mering@imls.uzh.ch. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genes, Microbial , RNA, Ribosomal/genetics , Sequence Analysis, DNA/methods , Software , Algorithms , Bacteria/genetics , Eukaryota/genetics
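
The abstract above reports accuracy as an F½ score but does not define it. Assuming this denotes the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall, it can be sketched as follows. The function name is hypothetical and is not part of MAPseq.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta = 0.5 is assumed to correspond to 'F½'.

    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With precision 1.0 and recall 0.5, the beta = 0.5 score is about 0.83, whereas the balanced F1 score (beta = 1) is about 0.67, which shows how the measure rewards precise classifications.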
3.
PLoS One ; 12(4): e0176050, 2017.
Article in English | MEDLINE | ID: mdl-28448512

ABSTRACT

To ensure faithful transmission of genetic material to progeny cells, DNA replication is tightly regulated, mainly at the initiation step. Escherichia coli cells regulate the frequency of initiation according to growth conditions. Both classical and recent studies suggest that DNA replication in E. coli starts at a predefined, constant cell volume per chromosome, but the mechanisms coordinating DNA replication with cell growth are still not fully understood. Recent investigations have revealed a role of metabolic pathway proteins in the control of cell division, and a direct link between metabolism and DNA replication has also been suggested in both Bacillus subtilis and E. coli cells. In this work we show that defects in the acetate overflow pathway suppress the temperature sensitivity of a defective replication initiator, DnaA, under acetogenic growth conditions. Transcriptomic and metabolic analyses imply that this suppression is correlated with pyruvate accumulation, resulting from alterations in pyruvate dehydrogenase (PDH) activity. Consistent with this, deletion of genes encoding the pyruvate dehydrogenase subunits likewise suppressed the temperature-sensitive growth of the dnaA46 strain. We propose that the suppressor effect may be directly related to the activity of the PDH complex, providing a link between an enzyme of central carbon metabolism and DNA replication.


Subjects
Acetates/analysis , Bacterial Proteins/metabolism , Carbon/metabolism , DNA-Binding Proteins/metabolism , Escherichia coli/genetics , Pyruvic Acid/analysis , Acetates/metabolism , Bacterial Proteins/genetics , DNA Replication , DNA-Binding Proteins/genetics , Ketone Oxidoreductases/metabolism , Metabolic Networks and Pathways/genetics , Mutation , Pyruvic Acid/metabolism , RNA, Messenger/chemistry , RNA, Messenger/isolation & purification , RNA, Messenger/metabolism , Sequence Analysis, RNA
4.
Environ Microbiol ; 17(5): 1689-706, 2015 May.
Article in English | MEDLINE | ID: mdl-25156547

ABSTRACT

The demarcation of operational taxonomic units (OTUs) from complex sequence data sets is a key step in contemporary studies of microbial ecology. However, as biologically motivated 'optimal' OTU-binning algorithms remain elusive, many conceptually distinct approaches continue to be used. Using a global data set of 887 870 bacterial 16S rRNA gene sequences, we objectively quantified biases introduced by several widely employed sequence clustering algorithms. We found that OTU-binning methods often provided surprisingly non-equivalent partitions of identical data sets, notably when clustering to the same nominal similarity thresholds; and we quantified the resulting impact on ecological data description for a well-defined human skin microbiome data set. We observed that some methods were very robust to varying clustering thresholds, while others were found to be highly susceptible even to slight threshold variations. Moreover, we comprehensively quantified the impact of the choice of 16S rRNA gene subregion, as well as of data set scope and context on algorithm performance. Our findings may contribute to an enhanced comparability of results across sequence-processing pipelines, and we arrive at recommendations towards higher levels of standardization in established workflows.


Subjects
Microbiota/genetics , RNA, Ribosomal, 16S/genetics , Sequence Analysis/methods , Skin/microbiology , Algorithms , Base Sequence , Cluster Analysis , High-Throughput Nucleotide Sequencing , Humans , Reproducibility of Results
5.
PLoS Comput Biol ; 10(4): e1003594, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24763141

ABSTRACT

Operational Taxonomic Units (OTUs), usually defined as clusters of similar 16S/18S rRNA sequences, are the most widely used basic diversity units in large-scale characterizations of microbial communities. However, it remains unclear how well the various proposed OTU clustering algorithms approximate 'true' microbial taxa. Here, we explore the ecological consistency of OTUs, based on the assumption that, like true microbial taxa, they should show measurable habitat preferences (niche conservatism). In a global and comprehensive survey of available microbial sequence data, we systematically parse sequence annotations to obtain broad ecological descriptions of sampling sites. Based on these, we observe that sequence-based microbial OTUs generally show high levels of ecological consistency. However, different OTU clustering methods result in marked differences in the strength of this signal. Assuming that ecological consistency can serve as an objective external benchmark for cluster quality, we conclude that hierarchical complete linkage clustering, which provided the most ecologically consistent partitions, should be the default choice for OTU clustering. To our knowledge, this is the first approach to assess cluster quality using an external, biologically meaningful parameter as a benchmark, on a global scale.


Subjects
Ecology , Phylogeny , RNA, Ribosomal/genetics
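
The abstract above recommends hierarchical complete linkage clustering for OTU binning. The following is a minimal, naive sketch of that technique on a small precomputed distance matrix. The function name, the input format and the threshold convention are assumptions made for illustration and do not reproduce the authors' pipeline.

```python
from itertools import combinations


def complete_linkage(dist, threshold):
    """Agglomerative clustering with complete linkage.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Two clusters are merged only while the largest distance between any
    of their members stays at or below the threshold.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: distance of the farthest pair of members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        (x, y), d = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:
            break  # closest pair is already too far apart
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters
```

In a chain where sequences 0 and 1 and sequences 1 and 2 are 0.02 apart but 0 and 2 are 0.04 apart, a threshold of 0.03 yields `[[0, 1], [2]]`: complete linkage refuses the second merge because not every pair would be within the threshold. This naive version is far too slow for real data sets and is meant only to make the merge criterion concrete.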
6.
Bioinformatics ; 30(2): 287-8, 2014 Jan 15.
Article in English | MEDLINE | ID: mdl-24215029

ABSTRACT

MOTIVATION: Nucleotide sequence data are being produced at an ever increasing rate. Clustering such sequences by similarity is often an essential first step in their analysis, intended to reduce redundancy, define gene families or suggest taxonomic units. Exact clustering algorithms, such as hierarchical clustering, scale relatively poorly in terms of run time and memory usage, yet they are desirable because heuristic shortcuts taken during clustering might have unintended consequences in later analysis steps. RESULTS: Here we present HPC-CLUST, a highly optimized software pipeline that can cluster large numbers of pre-aligned DNA sequences by running on distributed computing hardware. It allocates both memory and computing resources efficiently, and can process more than a million sequences in a few hours on a small cluster. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available at http://meringlab.org/software/hpc-clust/; the pipeline is implemented in C++ and uses the Message Passing Interface (MPI) standard for distributed computing.


Subjects
Algorithms , Cluster Analysis , Sequence Analysis, DNA/methods , Software , RNA, Bacterial/genetics , RNA, Ribosomal, 16S/genetics
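
Clustering pre-aligned sequences, as described in the abstract above, needs a pairwise similarity measure. The sketch below computes a simple identity fraction over aligned columns. The treatment of gaps shown here (columns that are gaps in both sequences are skipped, a gap opposite a base counts as a mismatch) is an assumption for illustration and may differ from what HPC-CLUST does.

```python
def aligned_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Fraction of identical columns between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("pre-aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0
```

A distance for clustering can then be taken as `1 - aligned_identity(a, b)`. For instance, `aligned_identity("ACGT-A", "ACGA-A")` compares five columns and finds four matches, giving 0.8.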
7.
BMC Syst Biol ; 5: 39, 2011 Mar 07.
Article in English | MEDLINE | ID: mdl-21385333

ABSTRACT

BACKGROUND: A metabolism is a complex network of chemical reactions. This network synthesizes multiple small precursor molecules of biomass from chemicals that occur in the environment. The metabolic network of any one organism is encoded by a metabolic genotype, defined as the set of enzyme-coding genes whose products catalyze the network's reactions. Each metabolic genotype has a metabolic phenotype. We define this metabolic phenotype as the spectrum of different sources of a chemical element that a metabolism can use to synthesize biomass. We here focus on the element sulfur. We study properties of the space of all possible metabolic genotypes in sulfur metabolism by analyzing random metabolic genotypes that are viable on different numbers of sulfur sources. RESULTS: We show that metabolic genotypes with the same phenotype form large connected genotype networks (networks of metabolic networks) that extend far through metabolic genotype space. How far they reach through this space depends linearly on the number of super-essential reactions. A super-essential reaction is an essential reaction that occurs in all networks viable in a given environment. Metabolic networks can differ in how robust their phenotype is to the removal of individual reactions. We find that this robustness depends on metabolic network size, and on other variables, such as the size of minimal metabolic networks whose reactions are all essential in a specific environment. We show that different neighborhoods of any genotype network harbor very different novel phenotypes, metabolic innovations that can sustain life on novel sulfur sources. We also analyze the ability of evolving populations of metabolic networks to explore novel metabolic phenotypes. This ability is facilitated by the existence of genotype networks, because different neighborhoods of these networks contain very different novel phenotypes. CONCLUSIONS: We show that the space of metabolic genotypes involved in sulfur metabolism is organized similarly to that of carbon metabolism. We demonstrate that the maximum genotype distance and robustness of metabolic networks can be explained by the number of super-essential reactions and by the sizes of minimal metabolic networks viable in an environment. In contrast to the genotype space of macromolecules, where phenotypic robustness may facilitate phenotypic innovation, we show that here the ability to access novel phenotypes does not monotonically increase with robustness.


Subjects
Biological Evolution , Computational Biology/methods , Metabolic Networks and Pathways/genetics , Models, Biological , Phenotype , Sulfur/metabolism , Computer Simulation , Genotype
8.
BMC Syst Biol ; 4: 30, 2010 Mar 19.
Article in English | MEDLINE | ID: mdl-20302636

ABSTRACT

BACKGROUND: A metabolic genotype comprises all chemical reactions an organism can catalyze via enzymes encoded in its genome. A genotype is viable in a given environment if it is capable of producing all biomass components the organism needs to survive and reproduce. Previous work has focused on the properties of individual genotypes while little is known about how genome-scale metabolic networks with a given function can vary in their reaction content. RESULTS: We here characterize spaces of such genotypes. Specifically, we study metabolic genotypes whose phenotype is viability in minimal chemical environments that differ in their sole carbon sources. We show that regardless of the number of reactions in a metabolic genotype, the genotypes of a given phenotype typically form vast, connected, and unstructured sets (genotype networks) that nearly span the whole of genotype space. The robustness of metabolic phenotypes to random reaction removal in such spaces has a narrow distribution with a high mean. Different carbon sources differ in the number of metabolic genotypes in their genotype network; this number decreases as a genotype is required to be viable on increasing numbers of carbon sources, but much less than if metabolic reactions were used independently across different chemical environments. CONCLUSIONS: Our work shows that phenotype-preserving genotype networks have generic organizational properties and that these properties are insensitive to the number of reactions in metabolic genotypes.


Subjects
Genotype , Metabolic Networks and Pathways/genetics , Algorithms , Cluster Analysis , Computational Biology , DNA Mutational Analysis , Escherichia coli/metabolism , Gene Regulatory Networks , Genes, Bacterial , Genome, Bacterial , Markov Chains , Models, Genetic , Phenotype , Principal Component Analysis , Systems Biology
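
The abstract above treats a metabolic genotype as a set of reactions, calls it viable if it can produce all biomass components, and measures robustness to reaction removal. The toy sketch below makes those definitions concrete. It deliberately swaps in a simple reachability check for viability, which is an assumption for illustration and not the constraint-based method used in the study. All names and the example network are hypothetical.

```python
def producible(reactions, nutrients):
    """Metabolites reachable from the nutrients by repeatedly firing
    every reaction whose substrates are all available."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def is_viable(reactions, nutrients, biomass):
    """Viable if every biomass component can be produced."""
    return biomass <= producible(reactions, nutrients)


def robustness(reactions, nutrients, biomass):
    """Fraction of single-reaction removals that preserve viability."""
    if not reactions:
        return 0.0
    survivors = 0
    for i in range(len(reactions)):
        reduced = reactions[:i] + reactions[i + 1:]
        if is_viable(reduced, nutrients, biomass):
            survivors += 1
    return survivors / len(reactions)


# Hypothetical toy genotype: (substrates, products) pairs.
TOY_REACTIONS = [
    ({"glc"}, {"pyr"}),
    ({"pyr"}, {"aa"}),
    ({"glc"}, {"aa"}),    # alternative route to "aa"
    ({"pyr"}, {"lipid"}),
]
TOY_NUTRIENTS = {"glc"}
TOY_BIOMASS = {"aa", "lipid"}
```

In this toy genotype, either of the two routes to `aa` can be removed without loss of viability, while the two reactions leading to `lipid` are essential, so `robustness(TOY_REACTIONS, TOY_NUTRIENTS, TOY_BIOMASS)` is 0.5.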
9.
PLoS Comput Biol ; 5(12): e1000613, 2009 Dec.
Article in English | MEDLINE | ID: mdl-20019795

ABSTRACT

Genome-scale metabolic networks are highly robust to the elimination of enzyme-coding genes. Their structure can evolve rapidly through mutations that eliminate such genes and through horizontal gene transfer that adds new enzyme-coding genes. Using flux balance analysis we study a vast space of metabolic network genotypes and their relationship to metabolic phenotypes, the ability to sustain life in an environment defined by an available spectrum of carbon sources. Two such networks typically differ in most of their reactions and have few essential reactions in common. Our observations suggest that the robustness of the Escherichia coli metabolic network to mutations is typical of networks with the same phenotype. We also demonstrate that networks with the same phenotype form large sets that can be traversed through single mutations, and that single mutations of different genotypes with the same phenotype can yield very different novel phenotypes. This means that the evolutionary plasticity and robustness of metabolic networks facilitate the evolution of new metabolic abilities. Our approach has broad implications for the evolution of metabolic networks, for our understanding of mutational robustness, for the design of antimetabolic drugs, and for metabolic engineering.


Subjects
Computational Biology/methods , Escherichia coli/physiology , Evolution, Molecular , Gene Regulatory Networks , Metabolic Networks and Pathways , Escherichia coli/genetics , Escherichia coli/metabolism , Genotype , Models, Genetic , Phenotype