RESUMO
We demonstrate that the assembly pathway method underlying assembly theory (AT) is an encoding scheme widely used by popular statistical compression algorithms. We show that in all cases (synthetic or natural) AT performs similarly to other simple coding schemes and underperforms compared to system-related indexes based upon algorithmic probability that take into account statistical repetitions but also the likelihood of other computable patterns. Our results imply that the assembly index does not offer substantial improvements over existing methods, including traditional statistical ones, and imply that the separation between living and non-living compounds following these methods has been reported before.
Assuntos
Algoritmos , Biologia Computacional/métodosRESUMO
Cancers are complex adaptive diseases regulated by the nonlinear feedback systems between genetic instabilities, environmental signals, cellular protein flows, and gene regulatory networks. Understanding the cybernetics of cancer requires the integration of information dynamics across multidimensional spatiotemporal scales, including genetic, transcriptional, metabolic, proteomic, epigenetic, and multi-cellular networks. However, the time-series analysis of these complex networks remains vastly absent in cancer research. With longitudinal screening and time-series analysis of cellular dynamics, universally observed causal patterns pertaining to dynamical systems, may self-organize in the signaling or gene expression state-space of cancer triggering processes. A class of these patterns, strange attractors, may be mathematical biomarkers of cancer progression. The emergence of intracellular chaos and chaotic cell population dynamics remains a new paradigm in systems medicine. As such, chaotic and complex dynamics are discussed as mathematical hallmarks of cancer cell fate dynamics herein. Given the assumption that time-resolved single-cell datasets are made available, a survey of interdisciplinary tools and algorithms from complexity theory, are hereby reviewed to investigate critical phenomena and chaotic dynamics in cancer ecosystems. To conclude, the perspective cultivates an intuition for computational systems oncology in terms of nonlinear dynamics, information theory, inverse problems, and complexity. We highlight the limitations we see in the area of statistical machine learning but the opportunity at combining it with the symbolic computational power offered by the mathematical tools explored.
RESUMO
One of the challenges of defining emergence is that one observer's prior knowledge may cause a phenomenon to present itself as emergent that to another observer appears reducible. By formalizing the act of observing as mutual perturbations between dynamical systems, we demonstrate that the emergence of algorithmic information does depend on the observer's formal knowledge, while being robust vis-a-vis other subjective factors, particularly: the choice of programming language and method of measurement; errors or distortions during the observation; and the informational cost of processing. This is called observer-dependent emergence (ODE). In addition, we demonstrate that the unbounded and rapid increase of emergent algorithmic information implies asymptotically observer-independent emergence (AOIE). Unlike ODE, AOIE is a type of emergence for which emergent phenomena will be considered emergent no matter what formal theory an observer might bring to bear. We demonstrate the existence of an evolutionary model that displays the diachronic variant of AOIE and a network model that displays the holistic variant of AOIE. Our results show that, restricted to the context of finite discrete deterministic dynamical systems, computable systems and irreducible information content measures, AOIE is the strongest form of emergence that formal theories can attain. This article is part of the theme issue 'Emergent phenomena in complex physical and socio-technical systems: from cells to societies'.
Assuntos
Evolução Biológica , ConhecimentoRESUMO
Being able to objectively characterize the intrinsic complexity of behavioral patterns resulting from human or animal decisions is fundamental for deconvolving cognition and designing autonomous artificial intelligence systems. Yet complexity is difficult in practice, particularly when strings are short. By numerically approximating algorithmic (Kolmogorov) complexity (K), we establish an objective tool to characterize behavioral complexity. Next, we approximate structural (Bennett's Logical Depth) complexity (LD) to assess the amount of computation required for generating a behavioral string. We apply our toolbox to three landmark studies of animal behavior of increasing sophistication and degree of environmental influence, including studies of foraging communication by ants, flight patterns of fruit flies, and tactical deception and competition (e.g., predator-prey) strategies. We find that ants harness the environmental condition in their internal decision process, modulating their behavioral complexity accordingly. Our analysis of flight (fruit flies) invalidated the common hypothesis that animals navigating in an environment devoid of stimuli adopt a random strategy. Fruit flies exposed to a featureless environment deviated the most from Levy flight, suggesting an algorithmic bias in their attempt to devise a useful (navigation) strategy. Similarly, a logical depth analysis of rats revealed that the structural complexity of the rat always ends up matching the structural complexity of the competitor, with the rats' behavior simulating algorithmic randomness. Finally, we discuss how experiments on how humans perceive randomness suggest the existence of an algorithmic bias in our reasoning and decision processes, in line with our analysis of the animal experiments. This contrasts with the view of the mind as performing faulty computations when presented with randomized items. In summary, our formal toolbox objectively characterizes external constraints on putative models of the "internal" decision process in humans and animals.
RESUMO
In this article, we investigate limitations of importing methods based on algorithmic information theory from monoplex networks into multidimensional networks (such as multilayer networks) that have a large number of extra dimensions (i.e., aspects). In the worst-case scenario, it has been previously shown that node-aligned multidimensional networks with non-uniform multidimensional spaces can display exponentially larger algorithmic information (or lossless compressibility) distortions with respect to their isomorphic monoplex networks, so that these distortions grow at least linearly with the number of extra dimensions. In the present article, we demonstrate that node-unaligned multidimensional networks, either with uniform or non-uniform multidimensional spaces, can also display exponentially larger algorithmic information distortions with respect to their isomorphic monoplex networks. However, unlike the node-aligned non-uniform case studied in previous work, these distortions in the node-unaligned case grow at least exponentially with the number of extra dimensions. On the other hand, for node-aligned multidimensional networks with uniform multidimensional spaces, we demonstrate that any distortion can only grow up to a logarithmic order of the number of extra dimensions. Thus, these results establish that isomorphisms between finite multidimensional networks and finite monoplex networks do not preserve algorithmic information in general and highlight that the algorithmic information of the multidimensional space itself needs to be taken into account in multidimensional network complexity analysis.
RESUMO
Some established and also novel techniques in the field of applications of algorithmic (Kolmogorov) complexity currently co-exist for the first time and are here reviewed, ranging from dominant ones such as statistical lossless compression to newer approaches that advance, complement and also pose new challenges and may exhibit their own limitations. Evidence suggesting that these different methods complement each other for different regimes is presented and despite their many challenges, some of these methods can be better motivated by and better grounded in the principles of algorithmic information theory. It will be explained how different approaches to algorithmic complexity can explore the relaxation of different necessary and sufficient conditions in their pursuit of numerical applicability, with some of these approaches entailing greater risks than others in exchange for greater relevance. We conclude with a discussion of possible directions that may or should be taken into consideration to advance the field and encourage methodological innovation, but more importantly, to contribute to scientific discovery. This paper also serves as a rebuttal of claims made in a previously published minireview by another author, and offers an alternative account.
RESUMO
We show how complexity theory can be introduced in machine learning to help bring together apparently disparate areas of current research. We show that this model-driven approach may require less training data and can potentially be more generalizable as it shows greater resilience to random attacks. In an algorithmic space the order of its element is given by its algorithmic probability, which arises naturally from computable processes. We investigate the shape of a discrete algorithmic space when performing regression or classification using a loss function parametrized by algorithmic complexity, demonstrating that the property of differentiation is not required to achieve results similar to those obtained using differentiable programming approaches such as deep learning. In doing so we use examples which enable the two approaches to be compared (small, given the computational power required for estimations of algorithmic complexity). We find and report that 1) machine learning can successfully be performed on a non-smooth surface using algorithmic complexity; 2) that solutions can be found using an algorithmic-probability classifier, establishing a bridge between a fundamentally discrete theory of computability and a fundamentally continuous mathematical theory of optimization methods; 3) a formulation of an algorithmically directed search technique in non-smooth manifolds can be defined and conducted; 4) exploitation techniques and numerical methods for algorithmic search to navigate these discrete non-differentiable spaces can be performed; in application of the (a) identification of generative rules from data observations; (b) solutions to image classification problems more resilient against pixel attacks compared to neural networks; (c) identification of equation parameters from a small data-set in the presence of noise in continuous ODE system problem, (d) classification of Boolean NK networks by (1) network topology, (2) underlying Boolean function, and (3) number of incoming edges.
RESUMO
We introduce and develop a method that demonstrates that the algorithmic information content of a system can be used as a steering handle in the dynamical phase space, thus affording an avenue for controlling and reprogramming systems. The method consists of applying a series of controlled interventions to a networked system while estimating how the algorithmic information content is affected. We demonstrate the method by reconstructing the phase space and their generative rules of some discrete dynamical systems (cellular automata) serving as controlled case studies. Next, the model-based interventional or causal calculus is evaluated and validated using (1) a huge large set of small graphs, (2) a number of larger networks with different topologies, and finally (3) biological networks derived from a widely studied and validated genetic network (E. coli) as well as on a significant number of differentiating (Th17) and differentiated human cells from a curated biological network data.
RESUMO
We introduce and study a set of training-free methods of an information-theoretic and algorithmic complexity nature that we apply to DNA sequences to identify their potential to identify nucleosomal binding sites. We test the measures on well-studied genomic sequences of different sizes drawn from different sources. The measures reveal the known in vivo versus in vitro predictive discrepancies and uncover their potential to pinpoint high and low nucleosome occupancy. We explore different possible signals within and beyond the nucleosome length and find that the complexity indices are informative of nucleosome occupancy. We found that, while it is clear that the gold standard Kaplan model is driven by GC content (by design) and by k-mer training; for high occupancy, entropy and complexity-based scores are also informative and can complement the Kaplan model.
Assuntos
Nucleossomos/genética , Análise de Sequência de DNA/métodos , Algoritmos , Animais , Composição de Bases , DNA/química , DNA/genética , Humanos , Nucleossomos/química , ProbabilidadeRESUMO
The principle of maximum entropy (Maxent) is often used to obtain prior probability distributions as a method to obtain a Gibbs measure under some restriction giving the probability that a system will be in a certain state compared to the rest of the elements in the distribution. Because classical entropy-based Maxent collapses cases confounding all distinct degrees of randomness and pseudo-randomness, here we take into consideration the generative mechanism of the systems considered in the ensemble to separate objects that may comply with the principle under some restriction and whose entropy is maximal but may be generated recursively from those that are actually algorithmically random offering a refinement to classical Maxent. We take advantage of a causal algorithmic calculus to derive a thermodynamic-like result based on how difficult it is to reprogram a computer code. Using the distinction between computable and algorithmic randomness, we quantify the cost in information loss associated with reprogramming. To illustrate this, we apply the algorithmic refinement to Maxent on graphs and introduce a Maximal Algorithmic Randomness Preferential Attachment (MARPA) Algorithm, a generalisation over previous approaches. We discuss practical implications of evaluation of network randomness. Our analysis provides insight in that the reprogrammability asymmetry appears to originate from a non-monotonic relationship to algorithmic probability. Our analysis motivates further analysis of the origin and consequences of the aforementioned asymmetries, reprogrammability, and computation.
RESUMO
Natural selection explains how life has evolved over millions of years from more primitive forms. The speed at which this happens, however, has sometimes defied formal explanations when based on random (uniformly distributed) mutations. Here, we investigate the application of a simplicity bias based on a natural but algorithmic distribution of mutations (no recombination) in various examples, particularly binary matrices, in order to compare evolutionary convergence rates. Results both on synthetic and on small biological examples indicate an accelerated rate when mutations are not statistically uniform but algorithmically uniform. We show that algorithmic distributions can evolve modularity and genetic memory by preservation of structures when they first occur sometimes leading to an accelerated production of diversity but also to population extinctions, possibly explaining naturally occurring phenomena such as diversity explosions (e.g. the Cambrian) and massive extinctions (e.g. the End Triassic) whose causes are currently a cause for debate. The natural approach introduced here appears to be a better approximation to biological evolution than models based exclusively upon random uniform mutations, and it also approaches a formal version of open-ended evolution based on previous formal results. These results validate some suggestions in the direction that computation may be an equally important driver of evolution. We also show that inducing the method on problems of optimization, such as genetic algorithms, has the potential to accelerate convergence of artificial evolutionary algorithms.
RESUMO
In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point of view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predicitive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e., equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.
Assuntos
Biologia de Sistemas/métodos , Toxicologia/métodos , Algoritmos , Animais , Descoberta de Drogas/métodos , Efeitos Colaterais e Reações Adversas Relacionados a Medicamentos , Humanos , Aprendizado de Máquina , Modelos TeóricosRESUMO
The slime mould Physarum polycephalum has been used in developing unconventional computing devices for in which the slime mould played a role of a sensing, actuating, and computing device. These devices treated the slime mould as an active living substrate, yet it is a self-consistent living creature which evolved over millions of years and occupied most parts of the world, but in any case, that living entity did not own true cognition, just automated biochemical mechanisms. To "rehabilitate" slime mould from the rank of a purely living electronics element to a "creature of thoughts" we are analyzing the cognitive potential of P. polycephalum. We base our theory of minimal cognition of the slime mould on a bottom-up approach, from the biological and biophysical nature of the slime mould and its regulatory systems using frameworks such as Lyon's biogenic cognition, Muller, di Primio-Lengelers modifiable pathways, Bateson's "patterns that connect" framework, Maturana's autopoietic network, or proto-consciousness and Morgan's Canon.
Assuntos
Cognição/fisiologia , Biologia Computacional/métodos , Modelos Biológicos , Physarum polycephalum/fisiologia , Transporte Biológico , HumanosRESUMO
Is undecidability a requirement for open-ended evolution (OEE)? Using methods derived from algorithmic complexity theory, we propose robust computational definitions of open-ended evolution and the adaptability of computable dynamical systems. Within this framework, we show that decidability imposes absolute limits on the stable growth of complexity in computable dynamical systems. Conversely, systems that exhibit (strong) open-ended evolution must be undecidable, establishing undecidability as a requirement for such systems. Complexity is assessed in terms of three measures: sophistication, coarse sophistication, and busy beaver logical depth. These three complexity measures assign low complexity values to random (incompressible) objects. As time grows, the stated complexity measures allow for the existence of complex states during the evolution of a computable dynamical system. We show, however, that finding these states involves undecidable computations. We conjecture that for similar complexity measures that assign low complexity values, decidability imposes comparable limits on the stable growth of complexity, and that such behavior is necessary for nontrivial evolutionary systems. We show that the undecidability of adapted states imposes novel and unpredictable behavior on the individuals or populations being modeled. Such behavior is irreducible. Finally, we offer an example of a system, first proposed by Chaitin, that exhibits strong OEE.
Assuntos
Evolução Biológica , Modelos Biológicos , Biologia SintéticaRESUMO
We introduce a definition of algorithmic symmetry in the context of geometric and spatial complexity able to capture mathematical aspects of different objects using as a case study polyominoes and polyhedral graphs. We review, study and apply a method for approximating the algorithmic complexity (also known as Kolmogorov-Chaitin complexity) of graphs and networks based on the concept of Algorithmic Probability (AP). AP is a concept (and method) capable of recursively enumerate all properties of computable (causal) nature beyond statistical regularities. We explore the connections of algorithmic complexity-both theoretical and numerical-with geometric properties mainly symmetry and topology from an (algorithmic) information-theoretic perspective. We show that approximations to algorithmic complexity by lossless compression and an Algorithmic Probability-based method can characterize spatial, geometric, symmetric and topological properties of mathematical objects and graphs.
RESUMO
Information-theoretic-based measures have been useful in quantifying network complexity. Here we briefly survey and contrast (algorithmic) information-theoretic methods which have been used to characterize graphs and networks. We illustrate the strengths and limitations of Shannon's entropy, lossless compressibility and algorithmic complexity when used to identify aspects and properties of complex networks. We review the fragility of computable measures on the one hand and the invariant properties of algorithmic measures on the other demonstrating how current approaches to algorithmic complexity are misguided and suffer of similar limitations than traditional statistical approaches such as Shannon entropy. Finally, we review some current definitions of algorithmic complexity which are used in analyzing labelled and unlabelled graphs. This analysis opens up several new opportunities to advance beyond traditional measures.
RESUMO
We investigate the properties of a Block Decomposition Method (BDM), which extends the power of a Coding Theorem Method (CTM) that approximates local estimations of algorithmic complexity based on Solomonoff-Levin's theory of algorithmic probability providing a closer connection to algorithmic complexity than previous attempts based on statistical regularities such as popular lossless compression schemes. The strategy behind BDM is to find small computer programs that produce the components of a larger, decomposed object. The set of short computer programs can then be artfully arranged in sequence so as to produce the original object. We show that the method provides efficient estimations of algorithmic complexity but that it performs like Shannon entropy when it loses accuracy. We estimate errors and study the behaviour of BDM for different boundary conditions, all of which are compared and assessed in detail. The measure may be adapted for use with more multi-dimensional objects than strings, objects such as arrays and tensors. To test the measure we demonstrate the power of CTM on low algorithmic-randomness objects that are assigned maximal entropy (e.g., π ) but whose numerical approximations are closer to the theoretical low algorithmic-randomness expectation. We also test the measure on larger objects including dual, isomorphic and cospectral graphs for which we know that algorithmic randomness is low. We also release implementations of the methods in most major programming languages-Wolfram Language (Mathematica), Matlab, R, Perl, Python, Pascal, C++, and Haskell-and an online algorithmic complexity calculator.
RESUMO
MOTIVATION: The use of differential equations (ODE) is one of the most promising approaches to network inference. The success of ODE-based approaches has, however, been limited, due to the difficulty in estimating parameters and by their lack of scalability. Here, we introduce a novel method and pipeline to reverse engineer gene regulatory networks from gene expression of time series and perturbation data based upon an improvement on the calculation scheme of the derivatives and a pre-filtration step to reduce the number of possible links. The method introduces a linear differential equation model with adaptive numerical differentiation that is scalable to extremely large regulatory networks. RESULTS: We demonstrate the ability of this method to outperform current state-of-the-art methods applied to experimental and synthetic data using test data from the DREAM4 and DREAM5 challenges. Our method displays greater accuracy and scalability. We benchmark the performance of the pipeline with respect to dataset size and levels of noise. We show that the computation time is linear over various network sizes. AVAILABILITY AND IMPLEMENTATION: The Matlab code of the HiDi implementation is available at: www.complexitycalculator.com/HiDiScript.zip. CONTACT: hzenilc@gmail.com or narsis.kiani@ki.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Assuntos
Algoritmos , Biologia Computacional/métodos , Redes Reguladoras de Genes , Benchmarking , Expressão Gênica , Modelos GenéticosRESUMO
Unconventional computing is about breaking boundaries in thinking, acting and computing. Typical topics of this non-typical field include, but are not limited to physics of computation, non-classical logics, new complexity measures, novel hardware, mechanical, chemical and quantum computing. Unconventional computing encourages a new style of thinking while practical applications are obtained from uncovering and exploiting principles and mechanisms of information processing in and functional properties of, physical, chemical and living systems; in particular, efficient algorithms are developed, (almost) optimal architectures are designed and working prototypes of future computing devices are manufactured. This article includes idiosyncratic accounts of 'unconventional computing' scientists reflecting on their personal experiences, what attracted them to the field, their inspirations and discoveries.