ABSTRACT
We study the effects of non-determinism and gene duplication on the structure of genotype-phenotype (GP) maps by introducing a non-deterministic version of the Polyomino self-assembly model. This model has previously been used in a variety of contexts to model the assembly and evolution of protein quaternary structure. Firstly, we show the limit of the current deterministic paradigm which leads to built-in anti-correlation between evolvability and robustness at the genotypic level. We develop a set of metrics to measure structural properties of GP maps in a non-deterministic setting and use them to evaluate the effects of gene duplication and subsequent diversification. Our generalized versions of evolvability and robustness exhibit positive correlation for a subset of genotypes. This positive correlation is only possible because non-deterministic phenotypes can contribute to both robustness and evolvability. Secondly, we show that duplication increases robustness and reduces evolvability initially, but that the subsequent diversification that duplication enables has a stronger, inverse effect, greatly increasing evolvability and reducing robustness relative to their original values.
ABSTRACT
Self-assembly processes are widespread in nature and lie at the heart of many biological and physical phenomena. The characteristics of self-assembly building blocks determine the structures that they form. Two crucial properties are the determinism and boundedness of the self-assembly. The former tells us whether the same set of building blocks always generates the same structure, and the latter whether it grows indefinitely. These properties are highly relevant in the context of protein structures, as the difference between deterministic protein self-assembly and nondeterministic protein aggregation is central to a number of diseases. Here we introduce a graph theoretical approach that can determine the determinism and boundedness for several geometries and dimensionalities of self-assembly more accurately and quickly than conventional methods. We apply this methodology to a previously studied lattice self-assembly model and discuss generalizations to a wide range of other self-assembling systems.
ABSTRACT
The map between genotype and phenotype is fundamental to biology. Biological information is stored and passed on in the form of genotypes, and expressed in the form of phenotypes. A growing body of literature has examined a wide range of genotype-phenotype (GP) maps and has established a number of properties that appear to be shared by many GP maps. These properties are 'structural' in the sense that they are properties of the distribution of phenotypes across the point-mutation network of genotypes. They include: a redundancy of genotypes, meaning that many genotypes map to the same phenotypes, a highly non-uniform distribution of the number of genotypes per phenotype, a high robustness of phenotypes and the ability to reach a large number of new phenotypes within a small number of mutational steps. A further important property is that the robustness and evolvability of phenotypes are positively correlated. In this review, I give an overview of the study of GP maps with particular emphasis on these structural properties, and discuss a model that attempts to explain why these properties arise, as well as some of the fundamental ways in which the structure of GP maps can affect evolutionary outcomes.
Subjects
Chromosome Mapping; Genetic Association Studies/methods; Animals; Evolution, Molecular; Genotype; Models, Genetic; Phenotype
ABSTRACT
We investigate general properties of nondeterministic self-assembly with asymmetric interactions, using a computational model and DNA tile assembly experiments. By contrasting symmetric and asymmetric interactions we show that the latter can lead to self-limiting cluster growth. Furthermore, by adjusting the relative abundance of self-assembly particles in a two-particle mixture, we are able to tune the final sizes of these clusters. We show that this is a fundamental property of asymmetric interactions, which has potential applications in bioengineering, and provides insights into the study of diseases caused by protein aggregation.
ABSTRACT
Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the 'function' of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature.
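The two measures named in this abstract can be illustrated with a minimal sketch. It assumes synchronous updating and the usual definition of basin entropy, h = -Σ w_i log2 w_i, where w_i is the fraction of the 2^n states that drain into attractor i. The function names and the example update rule (a three-node negative feedback loop) are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import log2


def attractor_of(state, step):
    """Follow the trajectory from `state` until it repeats; return the cycle reached."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    cycle = trajectory[seen[state]:]
    # Rotate the cycle so it starts at its smallest state, giving each
    # attractor a single canonical label regardless of the entry point.
    i = cycle.index(min(cycle))
    return tuple(cycle[i:] + cycle[:i])


def basin_entropy_and_cycle_lengths(n, step):
    """Return (basin entropy in bits, sorted distinct attractor lengths)."""
    basin_sizes = {}
    for state in product((0, 1), repeat=n):
        attractor = attractor_of(state, step)
        basin_sizes[attractor] = basin_sizes.get(attractor, 0) + 1
    total = 2 ** n
    entropy = sum(-(s / total) * log2(s / total) for s in basin_sizes.values())
    lengths = sorted({len(a) for a in basin_sizes})
    return entropy, lengths


def negative_feedback_loop(state):
    """Hypothetical example rule: A is repressed by C, B copies A, C copies B."""
    a, b, c = state
    return (1 - c, a, b)


entropy, lengths = basin_entropy_and_cycle_lengths(3, negative_feedback_loop)
print(entropy, lengths)
```

Under this rule the eight states split into one basin of six states and one of two, so the sketch reports two distinct cycle lengths. Characterizing a motif in full would presumably mean repeating the calculation over the Boolean rules compatible with its wiring, which this sketch does not attempt.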
Subjects
Gene Expression Regulation/physiology; Gene Regulatory Networks/physiology; Models, Biological
ABSTRACT
Self-assembly is ubiquitous in nature, particularly in biology, where it underlies the formation of protein quaternary structure and protein aggregation. Quaternary structure assembles deterministically and performs a wide range of important functions in the cell, whereas protein aggregation is the hallmark of a number of diseases and represents a nondeterministic self-assembly process. Here we build on previous work on a lattice model of deterministic self-assembly to investigate nondeterministic self-assembly of single lattice tiles and mixtures of two tiles at varying relative concentrations. Despite limiting the model, for simplicity, to two interface types, which results in 13 topologically distinct single tiles and 106 topologically distinct sets of two tiles, we observe a wide variety of concentration-dependent behaviors. Several two-tile sets display critical behaviors in the form of a sharp transition from bound to unbound structures as the relative concentration of one tile to another increases. Other sets exhibit gradual monotonic changes in structural density, or nonmonotonic changes, while yet others show no concentration dependence at all. We catalog this extensive range of behaviors and present a model that provides a reasonably good estimate of the critical concentrations for a subset of the critical transitions. In addition, we show that the structures resulting from these tile sets are fractal, with one of two different fractal dimensions.
ABSTRACT
Biological information is stored in DNA, RNA and protein sequences, which can be understood as genotypes that are translated into phenotypes. The properties of genotype-phenotype (GP) maps have been studied in great detail for RNA secondary structure. These include a highly biased distribution of genotypes per phenotype, negative correlation of genotypic robustness and evolvability, positive correlation of phenotypic robustness and evolvability, shape-space covering, and a roughly logarithmic scaling of phenotypic robustness with phenotypic frequency. More recently, similar properties have been discovered in other GP maps, suggesting that they may be fundamental to biological GP maps in general, rather than specific to the RNA secondary structure map. Here we propose that the above properties arise from the fundamental organization of biological information into 'constrained' and 'unconstrained' sequences, in the broadest possible sense. As 'constrained' we describe sequences that affect the phenotype more immediately, and are therefore more sensitive to mutations, such as protein-coding DNA or the stems in RNA secondary structure. 'Unconstrained' sequences, on the other hand, can mutate more freely without affecting the phenotype, such as intronic or intergenic DNA or the loops in RNA secondary structure. To test our hypothesis we consider a highly simplified GP map that has genotypes with 'coding' and 'non-coding' parts. We term this the Fibonacci GP map, as it is equivalent to the Fibonacci code in information theory. Despite its simplicity, the Fibonacci GP map exhibits all the above properties of much more complex and biologically realistic GP maps. These properties are therefore likely to be fundamental to many biological GP maps.
Subjects
Chromosome Mapping/methods; Genotype; Models, Genetic; Phenotype; Animals; Humans
ABSTRACT
The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here we present a protein-DNA network between Arabidopsis thaliana transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component.
Subjects
Arabidopsis/genetics; Arabidopsis/metabolism; Cell Wall/metabolism; Gene Expression Regulation, Plant/genetics; Gene Regulatory Networks/genetics; Transcription Factors/metabolism; Arabidopsis/growth & development; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; DNA, Plant/genetics; DNA, Plant/metabolism; E2F Transcription Factors/metabolism; Feedback; Gene Expression Regulation, Developmental/genetics; Iron Deficiencies; Organ Specificity; Promoter Regions, Genetic/genetics; Reproducibility of Results; Salinity; Time Factors; Xylem/genetics; Xylem/growth & development; Xylem/metabolism
ABSTRACT
We present a quantitative measure of physical complexity, based on the amount of information required to build a given physical structure through self-assembly. Our procedure can be adapted to any given geometry, and thus, to any given type of physical structure that can be divided into building blocks. We illustrate our approach using self-assembling polyominoes, and demonstrate the breadth of its potential applications by quantifying the physical complexity of molecules and protein complexes. This measure is particularly well suited for the detection of symmetry and modularity in the underlying structure, and allows for a quantitative definition of structural modularity. Furthermore we use our approach to show that symmetric and modular structures are favored in biological self-assembly, for example in protein complexes. Lastly, we also introduce the notions of joint, mutual and conditional complexity, which provide a useful quantitative measure of the difference between physical structures.
ABSTRACT
We use a clustering signature, based on a recently introduced generalization of the clustering coefficient to directed networks, to analyze 16 directed real-world networks of five different types: social networks, genetic transcription networks, word adjacency networks, food webs, and electric circuits. We show that these five classes of networks are cleanly separated in the space of clustering signatures due to the statistical properties of their local neighborhoods, demonstrating the usefulness of clustering signatures as a classifier of directed networks.
ABSTRACT
We present an approach to the analysis of weighted networks, by providing a straightforward generalization of any network measure defined on unweighted networks, such as the average degree of the nearest neighbors, the clustering coefficient, the "betweenness," the distance between two nodes, and the diameter of a network. All these measures are well established for unweighted networks but have hitherto proven difficult to define for weighted networks. Our approach is based on the translation of a weighted network into an ensemble of edges. Having introduced this approach, we demonstrate its advantages by applying the clustering coefficient constructed in this way to two real-world weighted networks.
ABSTRACT
MOTIVATION: Following the advent of microarray technology in recent years, the challenge for biologists is to identify genes of interest from the thousands of genetic expression levels measured in each microarray experiment. In many cases the aim is to identify pattern in the data series generated by successive microarray measurements. RESULTS: Here we introduce a new method of detecting pattern in microarray data series which is independent of the nature of this pattern. Our approach provides a measure of the algorithmic compressibility of each data series. A series which is significantly compressible is much more likely to result from simple underlying mechanisms than a series which is incompressible. Accordingly, the gene associated with a compressible series is more likely to be biologically significant. We test our method on microarray time series of yeast cell cycle and show that it blindly selects genes exhibiting the expected cyclic behaviour as well as detecting other forms of pattern. Our results successfully predict two independent non-microarray experimental studies.
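The idea of scoring a data series by its compressibility can be illustrated with a rough sketch. The abstract does not specify the compression measure used, so everything below is an assumption made for illustration: the up/down/same symbol encoding, the use of `zlib` as a stand-in compressor, and the shuffle-based significance estimate are hypothetical choices, not the authors' method.

```python
import random
import zlib


def symbolize(series):
    """Encode each step of the series as b'u' (up), b'd' (down) or b's' (same)."""
    out = bytearray()
    for prev, cur in zip(series, series[1:]):
        out += b"u" if cur > prev else b"d" if cur < prev else b"s"
    return bytes(out)


def compression_ratio(series):
    """Compressed size over raw size of the symbol string; lower means more pattern.

    Requires at least two data points.
    """
    raw = symbolize(series)
    return len(zlib.compress(raw, 9)) / len(raw)


def compressibility_p_value(series, n_shuffles=200, seed=0):
    """Estimate how often a shuffled copy compresses at least as well as the original."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = compression_ratio(series)
    values = list(series)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(values)
        if compression_ratio(values) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)


periodic = [0, 1, 2, 1] * 100
rng = random.Random(1)
noisy = [rng.random() for _ in range(400)]
print(compression_ratio(periodic), compression_ratio(noisy))
print(compressibility_p_value(periodic))
```

A general-purpose compressor carries a fixed overhead of several bytes, so this sketch only separates patterned from unpatterned series when they are fairly long. Real microarray time series are often much shorter, and a measure suited to short series would be needed in practice.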