1.
Funct Integr Genomics ; 16(2): 215-20, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26839085

ABSTRACT

The utilization of metagenomic functional interactions represents a key technique for metagenomic functional annotation efforts. By definition, metagenomic operons represent such interactions, but many operon prediction protocols rely on information about orthology and/or gene function that is frequently unavailable for metagenomic genes. Recently, the concept of the metagenomic proximon was proposed for use in metagenomic scenarios where supplemental information is sparse. In this paper, we examine the validity and utility of the proximon proposition by measuring the extent to which proximons emulate actual operons. Using the Escherichia coli K-12 genome, we compare proximons and operons from the same genome and observe the configurations and cardinalities among their corresponding mappings. The results demonstrate that the vast majority of proximons map discretely to a single operon in a conservative fashion where a typical proximon is synonymous with an equivalent or truncated operon. However, a large proportion of operons had no corresponding mappings to any proximon. Various perspectives of operon and proximon intersection are discussed, along with the potential limitations for proximon detection and usage.
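
The proximon-to-operon comparison described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the category labels ("equivalent", "truncated", "multiple", "extended_or_partial", "unmapped"), the function names, and the representation of proximons and operons as sets of gene identifiers are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def classify_proximon(proximon: Set[str], operons: Dict[str, Set[str]]) -> str:
    """Label how one proximon maps onto a set of known operons.

    The labels are hypothetical names chosen for this sketch.
    """
    # Operons that share at least one gene with the proximon.
    hits = [name for name, genes in operons.items() if proximon & genes]
    if not hits:
        return "unmapped"
    if len(hits) > 1:
        return "multiple"
    genes = operons[hits[0]]
    if proximon == genes:
        return "equivalent"
    if proximon < genes:  # proper subset of a single operon
        return "truncated"
    return "extended_or_partial"


def unmapped_operons(proximons: List[Set[str]], operons: Dict[str, Set[str]]) -> List[str]:
    """Return the names of operons that share no gene with any proximon."""
    covered: Set[str] = set().union(*proximons) if proximons else set()
    return sorted(name for name, genes in operons.items() if not (genes & covered))
```

With toy operons such as `{"op1": {"g1", "g2", "g3"}, "op2": {"g4", "g5"}}`, a proximon `{"g1", "g2"}` would be labelled as a truncated form of `op1`, while `{"g4", "g5"}` would be equivalent to `op2`.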


Subject(s)
Algorithms , Escherichia coli K12/genetics , Bacterial Genome , Metagenomics , Operon , Chromosome Mapping , Genetic Models , Software
2.
Bull Math Biol ; 75(12): 2431-49, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24078338

ABSTRACT

Biological interaction networks represent a powerful tool for characterizing intracellular functional relationships, such as transcriptional regulation and protein interactions. Although artificial neural networks are routinely employed for a broad range of applications across computational biology, their underlying connectionist basis has not been extensively applied to modeling biological interaction networks. In particular, the Hopfield network offers nonlinear dynamics that represent the minimization of a system energy function through temporally distinct rewiring events. Here, a scaled energy minimization model is presented to test the feasibility of deriving a composite biological interaction network from multiple constituent data sets using the Hebbian learning principle. The performance of the scaled energy minimization model is compared against the standard Hopfield model using simulated data. Several networks are also derived from real data, compared to one another, and then combined to produce an aggregate network. The utility and limitations of the proposed model are discussed, along with possible implications for a genomic learning analogy where the fundamental Hebbian postulate is rendered into its genomic equivalent: Genes that function together junction together.
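
For readers unfamiliar with the standard Hopfield model that this abstract uses as its baseline, the following is a minimal sketch of Hebbian weight learning, the energy function, and asynchronous recall. It shows the textbook model only; the scaled energy minimization model proposed in the paper is not reproduced here, and the function names and the use of bipolar (+1/-1) states are assumptions of this sketch.

```python
from typing import List


def hebbian_weights(patterns: List[List[int]]) -> List[List[float]]:
    """Hebbian rule: units that are active together get a stronger link."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w


def energy(w: List[List[float]], state: List[int]) -> float:
    """Standard Hopfield energy: E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(w: List[List[float]], state: List[int], sweeps: int = 5) -> List[int]:
    """Asynchronous updates; each update never increases the energy."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s
```

Storing one pattern and calling `recall` on a copy with a single flipped unit should return the stored pattern, with `energy` lower for the recalled state than for the noisy one.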


Subject(s)
Gene Regulatory Networks , Genetic Models , Algorithms , Computational Biology , Computer Simulation , DNA/metabolism , DNA Damage , Gene Expression , Mathematical Concepts , Computer Neural Networks , Nonlinear Dynamics
3.
Article in English | MEDLINE | ID: mdl-25288655

ABSTRACT

MetaProx is the database of metagenomic proximons: a searchable repository of proximon objects conceived with two specific goals. The first objective is to accelerate research involving metagenomic functional interactions by providing a database of metagenomic operon candidates. Proximons represent a special subset of directons (series of contiguous co-directional genes) where each member gene is in close proximity to its neighbours with respect to intergenic distance. As a result, proximons represent significant operon candidates where some subset of proximons is the set of true metagenomic operons. Proximons are well suited for the inference of metagenomic functional networks because predicted functional linkages do not rely on homology-dependent information that is frequently unavailable in metagenomic scenarios. The second objective is to explore representations for semistructured biological data that can offer an alternative to the traditional relational database approach. In particular, we use a serialized object implementation and advocate a Data as Data policy where the same serialized objects can be used at all levels (database, search tool and saved user file) without conversion or the use of human-readable markups. MetaProx currently includes 4,210,818 proximons consisting of 8,926,993 total member genes. Database URL: http://metaprox.uwaterloo.ca.
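
The proximon definition given in this abstract (contiguous co-directional genes separated by short intergenic distances) can be sketched as a single pass over genes sorted by position. This is an illustration, not the MetaProx implementation: the `Gene` record, the `max_gap` threshold of 50 bp, and the minimum group size of two genes are assumptions, since the abstract does not state the values used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    name: str
    start: int   # inclusive start coordinate on the contig
    end: int     # inclusive end coordinate
    strand: str  # '+' or '-'


def find_proximons(genes: List[Gene], max_gap: int = 50,
                   min_size: int = 2) -> List[List[Gene]]:
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap.

    max_gap and min_size are hypothetical parameters for this sketch.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    groups: List[List[Gene]] = []
    current: List[Gene] = []
    for gene in ordered:
        if current:
            prev = current[-1]
            gap = gene.start - prev.end - 1  # bases between the two genes
            if gene.strand == prev.strand and gap <= max_gap:
                current.append(gene)
                continue
            # The run is broken: keep it only if it is long enough.
            if len(current) >= min_size:
                groups.append(current)
        current = [gene]
    if len(current) >= min_size:
        groups.append(current)
    return groups
```

Because membership depends only on strand and coordinates, a grouping like this needs no homology or annotation data, which is the property the abstract highlights for metagenomic use.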


Subject(s)
Genetic Databases , Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Animals , Humans , Internet , Metabolic Networks and Pathways/genetics , Software , User-Computer Interface
4.
PLoS One ; 9(6): e98968, 2014.
Article in English | MEDLINE | ID: mdl-24911009

ABSTRACT

High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.


Subject(s)
Cosmids/genetics , Gene Library , High-Throughput Nucleotide Sequencing/methods , Metagenomics , DNA Sequence Analysis/methods , Molecular Cloning , Humans
5.
PLoS One ; 8(8): e71484, 2013.
Article in English | MEDLINE | ID: mdl-23940763

ABSTRACT

Next-generation sequencing projects continue to drive a vast accumulation of metagenomic sequence data. Given the growth rate of this data, automated approaches to functional annotation are indispensable and a cornerstone heuristic of many computational protocols is the concept of guilt by association. The guilt by association paradigm has been heavily exploited by genomic context methods that offer functional predictions that are complementary to homology-based annotations, thereby offering a means to extend functional annotation. In particular, operon methods that exploit co-directional intergenic distances can provide homology-free functional annotation through the transfer of functions among co-operonic genes, under the assumption that guilt by association is indeed applicable. Although guilt by association is a well-accepted annotative device, its applicability to metagenomic functional annotation has not been definitively demonstrated. Here a large-scale assessment of metagenomic guilt by association is undertaken where functional associations are predicted on the basis of co-directional intergenic distances. Specifically, functional annotations are compared within pairs of adjacent co-directional genes, as well as operons of various lengths (i.e. number of member genes), in order to reveal new information about annotative cohesion versus operon length. The results suggest that co-directional gene pairs offer reduced confidence for metagenomic guilt by association due to difficulty in resolving the existence of functional associations when intergenic distance is the sole predictor of pairwise gene interactions. However, metagenomic operons, particularly those with substantial lengths, appear to be capable of providing a superior basis for metagenomic guilt by association due to increased annotative stability. The need for improved recognition of metagenomic operons is discussed, as well as the limitations of the present work.


Subject(s)
Genetic Epistasis/physiology , Genetic Association Studies/methods , Metagenomics/methods , Operon/genetics , Animals , Biodiversity , Genes/physiology , Humans , Molecular Sequence Annotation , DNA Sequence Analysis
6.
PLoS One ; 7(8): e41283, 2012.
Article in English | MEDLINE | ID: mdl-22879885

ABSTRACT

The derivation and comparison of biological interaction networks are vital for understanding the functional capacity and hierarchical organization of integrated microbial communities. In the current work we present metagenomic annotation networks as a novel taxonomy-free approach for understanding the functional architecture of metagenomes. Specifically, metagenomic operon predictions are exploited to derive functional interactions that are translated and categorized according to their associated functional annotations. The result is a collection of discrete networks of weighted annotation linkages. These networks are subsequently examined for the occurrence of annotation modules that portray the functional and organizational characteristics of various microbial communities. A variety of network perspectives and annotation categories are applied to recover a diverse range of modules with different degrees of annotative cohesiveness. Applications to biocatalyst discovery and human health issues are discussed, as well as the limitations of the current implementation.


Subject(s)
Gene Regulatory Networks/genetics , Metagenome/genetics , Metagenomics/methods , Molecular Sequence Annotation/methods , Animals , Cellulase/genetics , Gastrointestinal Tract/microbiology , Genetic Variation , Humans , Sus scrofa
7.
Mol Biosyst ; 6(7): 1247-54, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20419183

ABSTRACT

The effectiveness of the computational inference of function by genomic context is bounded by the diversity of known microbial genomes. Although metagenomes offer access to previously inaccessible organisms, their fragmentary nature prevents the conventional establishment of orthologous relationships required for reliably predicting functional interactions. We introduce a protocol for the prediction of functional interactions using data sources without information about orthologous relationships. To illustrate this process, we use the Sargasso Sea metagenome to construct a functional interaction network for the Escherichia coli K12 genome. We identify two reliability metrics, target intergenic distance and source interaction count, and apply them to selectively filter the predictions retained to construct the network of functional interactions. The resulting network contains 2297 nodes with 10 072 edges with a positive predictive value of 0.80. The metagenome yielded 8423 functional interactions beyond those found using only the genomic orthologs as a data source. This amounted to a 134% increase in the total number of functional interactions that are predicted by combining the metagenome and the genomic orthologs versus the genomic orthologs alone. In the absence of detectable orthologous relationships it remains feasible to derive a reliable set of predicted functional interactions. This offers a strategy for harnessing other metagenomes and homologs in general. Because metagenomes allow access to previously unreachable microorganisms, this will result in expanding the universe of known functional interactions thus furthering our understanding of functional organization.


Subject(s)
Bacterial Genome/genetics , Metagenome/genetics , Metagenomics/methods , Seawater/microbiology , Base Sequence , Computational Biology/methods , Escherichia coli K12/genetics , Gene Regulatory Networks , Genetic Models , Operon/genetics , Reproducibility of Results , Water Microbiology
8.
Database (Oxford) ; 2009: bap013, 2009.
Article in English | MEDLINE | ID: mdl-20157486

ABSTRACT

While modern hardware can provide vast amounts of inexpensive storage for biological databases, the compression of nucleotide sequence data is still of paramount importance in order to facilitate fast search and retrieval operations through a reduction in disk traffic. This issue becomes even more important in light of the recent increase of very large data sets, such as metagenomes. In this article, I propose the Differential Direct Coding algorithm, a general-purpose nucleotide compression protocol that can differentiate between sequence data and auxiliary data by supporting the inclusion of supplementary symbols that are not members of the set of expected nucleotide bases, thereby offering reconciliation between sequence-specific and general-purpose compression strategies. This algorithm permits a sequence to contain a rich lexicon of auxiliary symbols that can represent wildcards, annotation data and special subsequences, such as functional domains or special repeats. In particular, the representation of special subsequences can be incorporated to provide structure-based coding that increases the overall degree of compression. Moreover, supporting a robust set of symbols removes the requirement of wildcard elimination and restoration phases, resulting in a complexity of O(n) for execution time, making this algorithm suitable for very large data sets. Because this algorithm compresses data on the basis of triplets, it is highly amenable to interpretation as a polypeptide at decompression time. Also, an encoded sequence may be further compressed using other existing algorithms, like gzip, thereby maximizing the final degree of compression. Overall, the Differential Direct Coding algorithm can offer a beneficial impact on disk traffic for database queries and other disk-intensive operations.
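
The general idea of triplet-based coding with room for auxiliary symbols can be sketched as follows. The byte layout is an assumption made for illustration and is not the published Differential Direct Coding format: here bytes 0 to 63 stand for the 64 nucleotide triplets, bytes 64 to 67 for a single leftover base, and bytes 128 to 255 for any other ASCII symbol such as the wildcard `N`. Both passes are linear in the input length and need no separate wildcard elimination or restoration step.

```python
BASES = "ACGT"


def encode(seq: str) -> bytes:
    """Pack ACGT triplets into one byte each; pass other symbols through."""
    out = bytearray()
    i, n = 0, len(seq)
    while i < n:
        tri = seq[i:i + 3]
        if len(tri) == 3 and all(c in BASES for c in tri):
            # Base-4 value of the triplet, in the range 0..63.
            out.append(BASES.index(tri[0]) * 16
                       + BASES.index(tri[1]) * 4
                       + BASES.index(tri[2]))
            i += 3
        elif seq[i] in BASES:
            out.append(64 + BASES.index(seq[i]))  # single leftover base
            i += 1
        else:
            code = ord(seq[i])
            if code > 127:
                raise ValueError("auxiliary symbols must be ASCII in this sketch")
            out.append(128 + code)  # auxiliary symbol, e.g. 'N'
            i += 1
    return bytes(out)


def decode(data: bytes) -> str:
    """Invert encode() byte by byte."""
    parts = []
    for b in data:
        if b < 64:
            parts.append(BASES[b // 16] + BASES[(b // 4) % 4] + BASES[b % 4])
        elif b < 68:
            parts.append(BASES[b - 64])
        elif b >= 128:
            parts.append(chr(b - 128))
        else:
            raise ValueError(f"unassigned byte value: {b}")
    return "".join(parts)
```

Since each triplet byte corresponds to one codon-sized unit, a decoder in this style could map triplet bytes directly to amino acids, and the encoded bytes can still be passed to a general-purpose compressor such as gzip.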
