1.
BMC Bioinformatics ; 14: 192, 2013 Jun 14.
Article in English | MEDLINE | ID: mdl-23763838

ABSTRACT

BACKGROUND: Forward-time population genetic simulations play a central role in deriving and testing evolutionary hypotheses. Such simulations may be data-intensive, depending on the settings to the various parameters controlling them. In particular, for certain settings, the data footprint may quickly exceed the memory of a single compute node. RESULTS: We develop a novel and general method for addressing the memory issue inherent in forward-time simulations by compressing and decompressing, in real-time, active and ancestral genotypes, while carefully accounting for the time overhead. We propose a general graph data structure for compressing the genotype space explored during a simulation run, along with efficient algorithms for constructing and updating compressed genotypes which support both mutation and recombination. We tested the performance of our method in very large-scale simulations. Results show that our method not only scales well, but that it also overcomes memory issues that would cripple existing tools. CONCLUSIONS: As evolutionary analyses are being increasingly performed on genomes, pathways, and networks, particularly in the era of systems biology, scaling population genetic simulators to handle large-scale simulations is crucial. We believe our method offers a significant step in that direction. Further, the techniques we provide are generic and can be integrated with existing population genetic simulators to boost their performance in terms of memory usage.


Subject(s)
Algorithms, Data Compression/methods, Genetics, Population/methods, Genotype, Computer Simulation, Evolution, Molecular, Genome, Mutation, Recombination, Genetic
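
The abstract above describes a graph structure in which genotypes are stored in compressed form and rebuilt on demand. A minimal sketch of that general idea follows; it is not the authors' data structure. The class `Genotype` and the functions `mutate`, `recombine`, and `decompress` are hypothetical names, and the single-crossover recombination model is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Genotype:
    """Node in a hypothetical genotype graph (illustrative only)."""
    root: Optional[str] = None                   # full sequence, stored only at roots
    parent: Optional["Genotype"] = None          # ancestor for a mutation node
    mutations: Tuple[Tuple[int, str], ...] = ()  # (position, new allele) differences
    left: Optional["Genotype"] = None            # donor of the prefix (recombination)
    right: Optional["Genotype"] = None           # donor of the suffix (recombination)
    crossover: int = 0                           # assumed single crossover point


def mutate(parent: Genotype, position: int, allele: str) -> Genotype:
    """Store a mutant as a one-site difference from its parent."""
    return Genotype(parent=parent, mutations=((position, allele),))


def recombine(left: Genotype, right: Genotype, crossover: int) -> Genotype:
    """Store a recombinant as two parent references plus a breakpoint."""
    return Genotype(left=left, right=right, crossover=crossover)


def decompress(g: Genotype) -> str:
    """Rebuild the full sequence by walking back through the graph."""
    if g.root is not None:
        return g.root
    if g.left is not None and g.right is not None:
        return decompress(g.left)[:g.crossover] + decompress(g.right)[g.crossover:]
    seq = list(decompress(g.parent))
    for pos, allele in g.mutations:
        seq[pos] = allele
    return "".join(seq)


ancestor = Genotype(root="AAAA")
m1 = mutate(ancestor, 1, "C")
m2 = mutate(ancestor, 3, "G")
child = recombine(m1, m2, 2)
print(decompress(child))
```

Each derived genotype here costs a few references and integers rather than a full sequence copy, at the price of recomputation during `decompress`. That is the memory-versus-time trade-off the abstract says must be accounted for.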
2.
Mol Syst Biol ; 9: 660, 2013 Apr 16.
Article in English | MEDLINE | ID: mdl-23591776

ABSTRACT

Gene regulation in bacteria is usually described as an adaptive response to an environmental change so that genes are expressed when they are required. We instead propose that most genes are under indirect control: their expression responds to signal(s) that are not directly related to the genes' function. Indirect control should perform poorly in artificial conditions, and we show that gene regulation is often maladaptive in the laboratory. In Shewanella oneidensis MR-1, 24% of genes are detrimental to fitness in some conditions, and detrimental genes tend to be highly expressed instead of being repressed when not needed. In diverse bacteria, there is little correlation between when genes are important for optimal growth or fitness and when those genes are upregulated. Two common types of indirect control are constitutive expression and regulation by growth rate; these occur for genes with diverse functions and often seem to be suboptimal. Because genes that have closely related functions can have dissimilar expression patterns, regulation may be suboptimal in the wild as well as in the laboratory.


Subject(s)
Bacterial Proteins/genetics, Gene Expression Regulation, Bacterial, Genetic Fitness, Shewanella/genetics, Bacterial Proteins/metabolism, Chromatin/metabolism, Escherichia coli K12/genetics, Escherichia coli K12/metabolism, Gene Expression Profiling, Oligonucleotide Array Sequence Analysis, Shewanella/metabolism, Stress, Physiological, Transcription, Genetic, Zymomonas/genetics, Zymomonas/metabolism
3.
Proc Natl Acad Sci U S A ; 110(19): 7754-9, 2013 May 07.
Article in English | MEDLINE | ID: mdl-23610404

ABSTRACT

Cis-regulatory networks (CRNs) play a central role in cellular decision making. Like every other biological system, CRNs undergo evolution, which shapes their properties by a combination of adaptive and nonadaptive evolutionary forces. Teasing apart these forces is an important step toward functional analyses of the different components of CRNs, designing regulatory perturbation experiments, and constructing synthetic networks. Although tests of neutrality and selection based on molecular sequence data exist, no such tests are currently available based on CRNs. In this work, we present a unique genotype model of CRNs that is grounded in a genomic context and demonstrate its use in identifying portions of the CRN with properties explainable by neutral evolutionary forces at the system, subsystem, and operon levels. We leverage our model against experimentally derived data from Escherichia coli. The results of this analysis show statistically significant and substantial neutral trends in properties previously identified as adaptive in origin (degree distribution, clustering coefficient, and motifs) within the E. coli CRN. Our model captures the tightly coupled genome-interactome of an organism and enables analyses of how evolutionary events acting at the genome level, such as mutation, and at the population level, such as genetic drift, give rise to neutral patterns that we can quantify in CRNs.


Subject(s)
Escherichia coli/genetics, Gene Regulatory Networks, Binding Sites, Cluster Analysis, Computer Simulation, DNA, Bacterial/metabolism, Escherichia coli/metabolism, Evolution, Molecular, Genetic Drift, Genetic Variation, Genetics, Population, Genome, Bacterial, Genomics, Genotype, Models, Genetic, Models, Statistical, Mutation, Promoter Regions, Genetic, RNA, Untranslated/genetics
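
Two of the network properties named in the abstract above, degree distribution and clustering coefficient, can be computed as in the following sketch. It treats the network as a simple undirected graph, which is a simplifying assumption since regulatory networks are directed. The toy adjacency map and the function names `degree_distribution` and `local_clustering` are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set


def degree_distribution(adj: Dict[str, Set[str]]) -> Counter:
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


def local_clustering(adj: Dict[str, Set[str]], node: str) -> float:
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Toy undirected graph: a triangle a-b-c with a pendant node d attached to a.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

print(degree_distribution(toy))
average = sum(local_clustering(toy, n) for n in toy) / len(toy)
print(average)
```

A neutrality test of the kind the abstract describes would compare such statistics on the observed network against their distribution in networks simulated under a neutral model; that comparison is not shown here.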
4.
BMC Evol Biol ; 12: 159, 2012 Aug 30.
Article in English | MEDLINE | ID: mdl-22935101

ABSTRACT

BACKGROUND: The amount of transcription factor binding sites (TFBS) in an organism's genome positively correlates with the complexity of the regulatory network of the organism. However, the manner by which TFBS arise and accumulate in genomes and the effects of regulatory network complexity on the organism's fitness are far from being known. The availability of TFBS data from many organisms provides an opportunity to explore these issues, particularly from an evolutionary perspective. RESULTS: We analyzed TFBS data from five model organisms - E. coli K12, S. cerevisiae, C. elegans, D. melanogaster, A. thaliana - and found a positive correlation between the amount of non-coding DNA (ncDNA) in the organism's genome and regulatory complexity. Based on this finding, we hypothesize that the amount of ncDNA, combined with the population size, can explain the patterns of regulatory complexity across organisms. To test this hypothesis, we devised a genome-based regulatory pathway model and subjected it to the forces of evolution through population genetic simulations. The results support our hypothesis, showing neutral evolutionary forces alone can explain TFBS patterns, and that selection on the regulatory network function does not alter this finding. CONCLUSIONS: The cis-regulome is not a clean functional network crafted by adaptive forces alone, but instead a data source filled with the noise of non-adaptive forces. From a regulatory perspective, this evolutionary noise manifests as complexity on both the binding site and pathway level, which has significant implications on many directions in microbiology, genetics, and synthetic biology.


Subject(s)
DNA, Intergenic/genetics, Gene Regulatory Networks, Genome/genetics, Transcription Factors/metabolism, Animals, Arabidopsis/genetics, Binding Sites/genetics, Caenorhabditis elegans/genetics, Drosophila melanogaster/genetics, Escherichia coli/genetics, Evolution, Molecular, Genetic Drift, Genotype, Models, Genetic, Mutation, Phenotype, Protein Binding, Saccharomyces cerevisiae/genetics, Species Specificity
5.
J Struct Biol ; 174(2): 360-73, 2011 May.
Article in English | MEDLINE | ID: mdl-21296162

ABSTRACT

Electron cryo-microscopy (cryo-EM) has played an increasingly important role in elucidating the structure and function of macromolecular assemblies in near native solution conditions. Typically, however, only non-atomic resolution reconstructions have been obtained for these large complexes, necessitating computational tools for integrating and extracting structural details. With recent advances in cryo-EM, maps at near-atomic resolutions have been achieved for several macromolecular assemblies from which models have been manually constructed. In this work, we describe a new interactive modeling toolkit called Gorgon targeted at intermediate to near-atomic resolution density maps (10-3.5 Å), particularly from cryo-EM. Gorgon's de novo modeling procedure couples sequence-based secondary structure prediction with feature detection and geometric modeling techniques to generate initial protein backbone models. Beyond model building, Gorgon is an extensible interactive visualization platform with a variety of computational tools for annotating a wide variety of 3D volumes. Examples from cryo-EM maps of Rotavirus and Rice Dwarf Virus are used to demonstrate its applicability to modeling protein structure.


Subject(s)
Models, Molecular, Protein Conformation, Proteins/chemistry, Software, Amino Acid Sequence, Antigens, Viral/chemistry, Capsid Proteins/chemistry, Computer Simulation, Cryoelectron Microscopy/methods, Data Display, Molecular Sequence Data
6.
Bioinformatics ; 25(9): 1178-84, 2009 May 01.
Article in English | MEDLINE | ID: mdl-19289444

ABSTRACT

MOTIVATION: The growing availability of genome-scale datasets has attracted increasing attention to the development of computational methods for automated inference of functional similarities among genes and their products. One class of such methods measures the functional similarity of genes based on their distance in the Gene Ontology (GO). To measure the functional relatedness of a gene set, these measures consider every pair of genes in the set, and the average of all pairwise distances is calculated. However, as more data becomes available and gene sets used for analysis become larger, such pair-based calculation becomes prohibitive. RESULTS: In this article, we propose GS(2) (GO-based similarity of gene sets), a novel GO-based measure of gene set similarity that is computable in linear time in the size of the gene set. The measure quantifies the similarity of the GO annotations among a set of genes by averaging the contribution of each gene's GO terms and their ancestor terms with respect to the GO vocabulary graph. To study the performance of our method, we compared our measure with an established pair-based measure when run on gene sets with varying degrees of functional similarities. In addition to a significant speed improvement, our method produced comparable similarity scores to the established method. Our method is available as a web-based tool and an open-source Python library. AVAILABILITY: The web-based tools and Python code are available at: http://bioserver.cs.rice.edu/gs2.


Subject(s)
Genes, Software, Vocabulary, Controlled, Algorithms, Databases, Genetic, Gene Expression Profiling/methods, Genomics, Internet
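
The abstract above describes a set-level similarity that avoids comparing every pair of genes. The sketch below shows one way such a measure can run in a single counting pass plus a single scoring pass over the genes. It is one plausible reading of the description, not the published GS(2) formula or the authors' Python library. The toy ontology and the names `ancestors_closure` and `set_similarity` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def ancestors_closure(terms: Set[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the given terms together with all their ancestors in the graph."""
    seen: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, []))
    return seen


def set_similarity(annotations: Dict[str, Set[str]],
                   parents: Dict[str, List[str]]) -> float:
    """Average, over genes, of how widely each gene's terms are shared in the set."""
    closures = {g: ancestors_closure(t, parents) for g, t in annotations.items()}
    n = len(closures)
    # Pass 1: count how many genes carry each term (including ancestor terms).
    counts = Counter(term for closure in closures.values() for term in closure)
    # Pass 2: score each gene by the mean shared fraction of its terms.
    gene_scores = [
        sum(counts[t] / n for t in closure) / len(closure)
        for closure in closures.values()
    ]
    return sum(gene_scores) / n


# Hypothetical toy ontology: child term -> list of parent terms.
toy_parents = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"]}

same = set_similarity({"g1": {"A1"}, "g2": {"A1"}}, toy_parents)
mixed = set_similarity({"g1": {"A1"}, "g3": {"B"}}, toy_parents)
print(same, mixed)
```

Because each gene is visited a fixed number of times and no gene pairs are enumerated, the cost grows linearly with the number of genes when the number of annotations per gene is bounded.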