Results 1 - 20 of 31
1.
Philos Trans A Math Phys Eng Sci ; 380(2227): 20200429, 2022 Jul 11.
Article in English | MEDLINE | ID: mdl-35599568

ABSTRACT

One of the challenges of defining emergence is that a phenomenon may present itself as emergent to one observer, given that observer's prior knowledge, while appearing reducible to another. By formalizing the act of observing as mutual perturbations between dynamical systems, we demonstrate that the emergence of algorithmic information does depend on the observer's formal knowledge, while being robust with respect to other subjective factors, particularly: the choice of programming language and method of measurement; errors or distortions during the observation; and the informational cost of processing. We call this observer-dependent emergence (ODE). In addition, we demonstrate that the unbounded and rapid increase of emergent algorithmic information implies asymptotically observer-independent emergence (AOIE). Unlike ODE, AOIE is a type of emergence in which phenomena remain emergent no matter what formal theory an observer brings to bear. We demonstrate the existence of an evolutionary model that displays the diachronic variant of AOIE and a network model that displays the holistic variant of AOIE. Our results show that, restricted to the context of finite discrete deterministic dynamical systems, computable systems and irreducible information content measures, AOIE is the strongest form of emergence that formal theories can attain. This article is part of the theme issue 'Emergent phenomena in complex physical and socio-technical systems: from cells to societies'.
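A schematic way to render this observer dependence (our gloss in standard algorithmic-information notation, not the paper's exact formalism) is to measure the emergent algorithmic information of a phenomenon x relative to a formal theory T by the conditional Kolmogorov complexity K(x | T):

    E_T(x) = K(x \mid T)
    \text{ODE:}\quad \exists\, T, T' :\; E_T(x) \gg E_{T'}(x)
    \text{AOIE:}\quad \forall\, \text{computable } T :\; E_T(x_t) \to \infty \text{ as } t \to \infty

Under this reading, ODE says the information surplus depends on which theory T the observer holds, while AOIE says that for some evolving phenomena the surplus eventually outgrows every computable theory.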


Subject(s)
Biological Evolution , Knowledge
2.
Nucleic Acids Res ; 47(20): e129, 2019 11 18.
Article in English | MEDLINE | ID: mdl-31511887

ABSTRACT

We introduce and study a set of training-free methods of an information-theoretic and algorithmic-complexity nature, which we apply to DNA sequences to assess their potential for identifying nucleosomal binding sites. We test the measures on well-studied genomic sequences of different sizes drawn from different sources. The measures reveal the known in vivo versus in vitro predictive discrepancies and show potential to pinpoint high and low nucleosome occupancy. We explore different possible signals within and beyond the nucleosome length and find that the complexity indices are informative of nucleosome occupancy. We find that, while the gold-standard Kaplan model is clearly driven by GC content (by design) and by k-mer training, for high occupancy, entropy- and complexity-based scores are also informative and can complement the Kaplan model.
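As a minimal illustration of a training-free, information-theoretic score of the kind described (a sketch of the general idea, not the paper's pipeline), one can slide a nucleosome-length window along a sequence and compute the Shannon entropy of its k-mer distribution:

    from collections import Counter
    from math import log2

    def kmer_entropy(seq, k=3):
        """Shannon entropy (bits) of the k-mer distribution of seq."""
        counts = Counter(seq[i:i+k] for i in range(len(seq) - k + 1))
        total = sum(counts.values())
        return -sum(c/total * log2(c/total) for c in counts.values())

    def window_scores(seq, window=147, k=3):
        """Score each window of nucleosomal length (147 bp) along seq."""
        return [kmer_entropy(seq[i:i+window], k)
                for i in range(len(seq) - window + 1)]

    print(window_scores("ACGT" * 60)[:3])  # toy sequence

Windows with unusually low or high entropy would then be flagged as candidate regions; the paper's measures are more refined.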


Subject(s)
Nucleosomes/genetics , Sequence Analysis, DNA/methods , Algorithms , Animals , Base Composition , DNA/chemistry , DNA/genetics , Humans , Nucleosomes/chemistry , Probability
3.
Entropy (Basel) ; 23(7)2021 Jun 29.
Article in English | MEDLINE | ID: mdl-34210065

ABSTRACT

In this article, we investigate limitations of importing methods based on algorithmic information theory from monoplex networks into multidimensional networks (such as multilayer networks) that have a large number of extra dimensions (i.e., aspects). In the worst case, previous work has shown that node-aligned multidimensional networks with non-uniform multidimensional spaces can display exponentially larger algorithmic information (or lossless compressibility) distortions with respect to their isomorphic monoplex networks, and that these distortions grow at least linearly with the number of extra dimensions. In the present article, we demonstrate that node-unaligned multidimensional networks, whether with uniform or non-uniform multidimensional spaces, can also display exponentially larger algorithmic information distortions with respect to their isomorphic monoplex networks. However, unlike the node-aligned non-uniform case studied in previous work, these distortions in the node-unaligned case grow at least exponentially with the number of extra dimensions. On the other hand, for node-aligned multidimensional networks with uniform multidimensional spaces, we demonstrate that any distortion can grow at most logarithmically in the number of extra dimensions. These results establish that isomorphisms between finite multidimensional networks and finite monoplex networks do not in general preserve algorithmic information, and they highlight that the algorithmic information of the multidimensional space itself needs to be taken into account in multidimensional network complexity analysis.
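In summary form (our paraphrase of the bounds stated above, with K denoting algorithmic complexity, M a multidimensional network, G an isomorphic monoplex network, and p the number of extra dimensions):

    \text{node-aligned, non-uniform:}\quad |K(M) - K(G)| = \Omega(p)
    \text{node-unaligned (uniform or not):}\quad |K(M) - K(G)| = \Omega(c^{\,p}),\; c > 1
    \text{node-aligned, uniform:}\quad |K(M) - K(G)| = O(\log p)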

4.
Entropy (Basel) ; 22(6)2020 May 30.
Article in English | MEDLINE | ID: mdl-33286384

ABSTRACT

Several established and novel techniques for applying algorithmic (Kolmogorov) complexity currently co-exist for the first time and are reviewed here, ranging from dominant approaches such as statistical lossless compression to newer methods that advance and complement them while also posing new challenges and exhibiting limitations of their own. We present evidence suggesting that these different methods complement each other in different regimes, and that, despite their many challenges, some of them are better motivated by and better grounded in the principles of algorithmic information theory. We explain how different approaches to algorithmic complexity explore the relaxation of different necessary and sufficient conditions in their pursuit of numerical applicability, with some of these approaches entailing greater risks than others in exchange for greater relevance. We conclude with a discussion of possible directions that may or should be taken to advance the field and encourage methodological innovation, but more importantly, to contribute to scientific discovery. This paper also serves as a rebuttal of claims made in a previously published mini-review by another author, and offers an alternative account.
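For readers new to the area, the dominant compression-based approach amounts to taking the length of a losslessly compressed representation as a coarse upper bound on algorithmic complexity; a minimal sketch:

    import os
    import zlib

    def compressed_length(s: bytes) -> int:
        """Length in bytes of a lossless (DEFLATE) compression of s,
        a coarse upper bound on its algorithmic complexity."""
        return len(zlib.compress(s, 9))

    print(compressed_length(b"0" * 1000))       # highly regular: very small
    print(compressed_length(os.urandom(1000)))  # incompressible: close to 1000

Compression of this statistical kind is exactly what the newer CTM/BDM approaches discussed in the review aim to move beyond.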

5.
Entropy (Basel) ; 21(6)2019 Jun 03.
Article in English | MEDLINE | ID: mdl-33267274

ABSTRACT

The principle of maximum entropy (Maxent) is often used to obtain prior probability distributions, as a method to obtain a Gibbs measure under some restriction, giving the probability that a system will be in a certain state relative to the other elements of the distribution. Because classical entropy-based Maxent collapses cases, confounding all distinct degrees of randomness and pseudo-randomness, here we take into consideration the generative mechanism of the systems in the ensemble. This allows us to separate objects that comply with the principle under some restriction, and whose entropy is maximal but which can be generated recursively, from those that are actually algorithmically random, offering a refinement of classical Maxent. We take advantage of a causal algorithmic calculus to derive a thermodynamic-like result based on how difficult it is to reprogram a computer code. Using the distinction between computable and algorithmically random objects, we quantify the cost in information loss associated with reprogramming. To illustrate this, we apply the algorithmic refinement of Maxent to graphs and introduce a Maximal Algorithmic Randomness Preferential Attachment (MARPA) algorithm, a generalisation over previous approaches. We discuss practical implications of the evaluation of network randomness. Our analysis suggests that the reprogrammability asymmetry originates from a non-monotonic relationship to algorithmic probability, and it motivates further study of the origin and consequences of these asymmetries, of reprogrammability, and of computation.
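A rough sketch of the preferential-attachment idea (our illustration only: compressed length stands in for the algorithmic-randomness estimates the paper uses, and the greedy loop is a simplification of MARPA):

    import zlib
    import networkx as nx

    def est_randomness(g):
        """Crude algorithmic-randomness estimate: compressed size of the
        adjacency matrix (a stand-in for CTM/BDM-based estimates)."""
        return len(zlib.compress(nx.to_numpy_array(g, dtype=int).tobytes(), 9))

    def add_most_randomizing_edge(g):
        """Greedily add the non-edge whose insertion maximizes the estimate."""
        def score(e):
            h = g.copy()
            h.add_edge(*e)
            return est_randomness(h)
        g.add_edge(*max(nx.non_edges(g), key=score))

    g = nx.path_graph(8)
    for _ in range(5):
        add_most_randomizing_edge(g)
    print(est_randomness(g), sorted(d for _, d in g.degree()))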

6.
Semin Cell Dev Biol ; 51: 32-43, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26802516

ABSTRACT

We survey and introduce concepts and tools at the intersection of information theory and network biology. We show that Shannon's information entropy, compressibility and algorithmic complexity quantify different local and global aspects of synthetic and biological data. We give examples such as the emergence of giant components in Erdős-Rényi random graphs and the recovery of topological properties from numerical kinetic properties simulating gene expression data. We provide exact theoretical calculations, numerical approximations and error estimations of entropy, algorithmic probability and Kolmogorov complexity for different types of graphs, characterizing their variant and invariant properties. We introduce formal definitions of complexity for both labeled and unlabeled graphs and prove that the Kolmogorov complexity of a labeled graph is a good approximation of its unlabeled Kolmogorov complexity, and thus a robust definition of graph complexity.
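The giant-component example is easy to reproduce (a sketch using networkx; the connectivity threshold for G(n, p) sits at p ≈ 1/n):

    import networkx as nx

    n = 1000
    for p in [0.0005, 0.001, 0.002, 0.004]:  # threshold is ~ 1/n = 0.001
        g = nx.gnp_random_graph(n, p, seed=1)
        giant = max(nx.connected_components(g), key=len)
        print(f"p={p}: largest component = {len(giant)} of {n} nodes")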


Subject(s)
Information Theory , Metabolic Networks and Pathways , Algorithms , Animals , Entropy , Humans
7.
Semin Cell Dev Biol ; 51: 44-52, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26851626

ABSTRACT

Network inference is a rapidly advancing field, with new methods being proposed on a regular basis. Understanding the advantages and limitations of different network inference methods is key to their effective application in different circumstances. The common structural properties shared by diverse networks naturally pose a challenge when it comes to devising accurate inference methods, yet surprisingly there is a paucity of comparison and evaluation methods. Historically, each new methodology has been tested only against purpose-designed synthetic networks and validated real-world biological networks with known gold-standard (true) structures. In this paper we assess the impact of taking topological and information-content aspects into consideration when evaluating the final accuracy of an inference procedure. Specifically, we compare the best inference methods, in both graph-theoretic and information-theoretic terms, for preserving topological properties and the original information content of synthetic and biological networks. New methods for performance comparison are introduced by borrowing ideas from gene set enrichment analysis and by applying concepts from algorithmic complexity. Experimental results show that no individual algorithm outperforms all others in all cases, and that the challenging and non-trivial nature of network inference is evident in the struggle of some of the algorithms to perform better than random guesswork. Special care should therefore be taken to suit the method to the purpose at hand. Finally, we show that evaluations from data generated using different underlying topologies have different signatures that can be used to better choose a network reconstruction method.
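As a baseline for this kind of evaluation (a simplified sketch; the paper adds topological and algorithmic-information criteria on top), an inferred edge set can be scored against a gold standard:

    def edge_prf(true_edges, inferred_edges):
        """Precision, recall and F1 of an inferred edge set against a
        gold-standard edge set (edges treated as unordered pairs)."""
        t = set(map(frozenset, true_edges))
        i = set(map(frozenset, inferred_edges))
        tp = len(t & i)
        precision = tp / len(i) if i else 0.0
        recall = tp / len(t) if t else 0.0
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        return precision, recall, f1

    print(edge_prf([(1, 2), (2, 3), (3, 4)], [(1, 2), (2, 4)]))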


Subject(s)
Gene Regulatory Networks , Algorithms , Animals , Bayes Theorem , Entropy , Humans , Models, Genetic , Reverse Genetics
8.
Bioinformatics ; 33(24): 3964-3972, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-28961895

ABSTRACT

MOTIVATION: The use of ordinary differential equations (ODEs) is one of the most promising approaches to network inference. The success of ODE-based approaches has, however, been limited by the difficulty of estimating parameters and by their lack of scalability. Here, we introduce a novel method and pipeline to reverse engineer gene regulatory networks from time-series gene expression and perturbation data, based upon an improved scheme for calculating derivatives and a pre-filtration step that reduces the number of possible links. The method introduces a linear differential equation model with adaptive numerical differentiation that is scalable to extremely large regulatory networks. RESULTS: We demonstrate the ability of this method to outperform current state-of-the-art methods on experimental and synthetic data, using test data from the DREAM4 and DREAM5 challenges. Our method displays greater accuracy and scalability. We benchmark the performance of the pipeline with respect to dataset size and level of noise, and show that computation time is linear over various network sizes. AVAILABILITY AND IMPLEMENTATION: The Matlab code of the HiDi implementation is available at: www.complexitycalculator.com/HiDiScript.zip. CONTACT: hzenilc@gmail.com or narsis.kiani@ki.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
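The core idea behind linear ODE-based inference can be sketched as follows (a minimal illustration under the model dx/dt = A x, not the HiDi pipeline itself): estimate derivatives numerically from the time series and recover the interaction matrix A by least squares.

    import numpy as np

    def infer_linear_ode(X, dt):
        """X: (timepoints, genes) expression matrix sampled every dt.
        Returns a least-squares estimate of A in dx/dt = A x."""
        dX = np.gradient(X, dt, axis=0)               # numerical derivatives
        A_T, *_ = np.linalg.lstsq(X, dX, rcond=None)  # solves X @ A.T = dX
        return A_T.T

    # Toy example: simulate a known 3-gene system and recover it
    A_true = np.array([[-1.0, 0.5, 0.0],
                       [0.0, -1.0, 0.8],
                       [0.3, 0.0, -1.0]])
    dt, x = 0.01, np.array([1.0, 0.5, 2.0])
    X = np.empty((500, 3))
    for i in range(500):
        X[i] = x
        x = x + dt * (A_true @ x)                     # Euler integration
    print(np.round(infer_linear_ode(X, dt), 2))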


Subject(s)
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Benchmarking , Gene Expression , Models, Genetic
9.
PLoS Comput Biol ; 13(4): e1005408, 2017 04.
Article in English | MEDLINE | ID: mdl-28406953

ABSTRACT

Random Item Generation tasks (RIG) are commonly used to assess higher cognitive abilities such as inhibition or sustained attention. They also draw upon our approximate sense of complexity. A detrimental effect of aging on pseudo-random productions has been demonstrated for some tasks, but little is yet known about the developmental curve of cognitive complexity over the lifespan. We investigate the complexity trajectory across the lifespan of human responses to five common RIG tasks, using a large sample (n = 3429). Our main finding is that the developmental curve of the estimated algorithmic complexity of responses is similar to what may be expected of a measure of higher cognitive abilities, with a performance peak around 25 and a decline starting around 60, suggesting that RIG tasks yield good estimates of such cognitive abilities. Our study illustrates that very short strings, of as few as 10 items, are sufficient to have their complexity reliably estimated and to allow the documentation of an age-dependent decline in the approximate sense of complexity.


Subject(s)
Behavior , Adolescent , Adult , Aged , Aged, 80 and over , Child , Child, Preschool , Humans , Middle Aged , Task Performance and Analysis , Young Adult
10.
Entropy (Basel) ; 20(7)2018 Jul 18.
Article in English | MEDLINE | ID: mdl-33265623

ABSTRACT

We introduce a definition of algorithmic symmetry in the context of geometric and spatial complexity, able to capture mathematical aspects of different objects, using polyominoes and polyhedral graphs as case studies. We review, study and apply a method for approximating the algorithmic complexity (also known as Kolmogorov-Chaitin complexity) of graphs and networks based on the concept of Algorithmic Probability (AP). AP is a concept (and method) capable of recursively enumerating all properties of a computable (causal) nature beyond statistical regularities. We explore the connections of algorithmic complexity, both theoretical and numerical, with geometric properties, mainly symmetry and topology, from an (algorithmic) information-theoretic perspective. We show that approximations to algorithmic complexity by lossless compression and an Algorithmic Probability-based method can characterize spatial, geometric, symmetric and topological properties of mathematical objects and graphs.

11.
Entropy (Basel) ; 20(8)2018 Jul 25.
Article in English | MEDLINE | ID: mdl-33265640

ABSTRACT

Information-theoretic measures have been useful in quantifying network complexity. Here we briefly survey and contrast (algorithmic) information-theoretic methods that have been used to characterize graphs and networks. We illustrate the strengths and limitations of Shannon's entropy, lossless compressibility and algorithmic complexity when used to identify aspects and properties of complex networks. We review the fragility of computable measures on the one hand and the invariant properties of algorithmic measures on the other, demonstrating how some current approaches to algorithmic complexity are misguided and suffer from limitations similar to those of traditional statistical approaches such as Shannon entropy. Finally, we review some current definitions of algorithmic complexity used in analyzing labelled and unlabelled graphs. This analysis opens up several new opportunities to advance beyond traditional measures.

12.
Entropy (Basel) ; 20(8)2018 Aug 15.
Article in English | MEDLINE | ID: mdl-33265694

ABSTRACT

We investigate the properties of a Block Decomposition Method (BDM), which extends the power of the Coding Theorem Method (CTM) that approximates local estimates of algorithmic complexity based on Solomonoff-Levin's theory of algorithmic probability, providing a closer connection to algorithmic complexity than previous attempts based on statistical regularities such as popular lossless compression schemes. The strategy behind BDM is to find small computer programs that produce the components of a larger, decomposed object. The set of short computer programs can then be artfully arranged in sequence so as to produce the original object. We show that the method provides efficient estimates of algorithmic complexity, but that it behaves like Shannon entropy where it loses accuracy. We estimate errors and study the behaviour of BDM under different boundary conditions, all of which are compared and assessed in detail. The measure can be adapted for use with multidimensional objects other than strings, such as arrays and tensors. To test the measure we demonstrate the power of CTM on low-algorithmic-randomness objects that are assigned maximal entropy (e.g., π) but whose numerical approximations are closer to the theoretical low-algorithmic-randomness expectation. We also test the measure on larger objects, including dual, isomorphic and cospectral graphs, for which we know that algorithmic randomness is low. We also release implementations of the methods in most major programming languages (Wolfram Language (Mathematica), Matlab, R, Perl, Python, Pascal, C++ and Haskell) and an online algorithmic complexity calculator.
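A minimal sketch of the BDM computation (illustrative only: the CTM values below are hypothetical placeholders; real applications use the published CTM tables or the released implementations):

    from collections import Counter
    from math import log2

    # Toy CTM lookup: algorithmic-complexity estimates (bits) for 4-bit blocks.
    # These numbers are made up; real CTM values come from massive
    # enumerations of small Turing machines.
    CTM = {"0000": 3.2, "1111": 3.2, "0101": 5.0, "1010": 5.0}

    def bdm(s, block=4):
        """Block Decomposition Method: sum CTM values of unique blocks
        plus log2 of each block's multiplicity."""
        blocks = [s[i:i+block] for i in range(0, len(s) - block + 1, block)]
        counts = Counter(blocks)
        # block-size fallback for blocks missing from the toy table
        return sum(CTM.get(b, block) + log2(m) for b, m in counts.items())

    print(bdm("0000" * 8), bdm("0101" * 8))

The sum charges repeated blocks only logarithmically, which is what keeps BDM sensitive to algorithmic rather than merely statistical regularities at the block level.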

13.
Behav Res Methods ; 48(1): 314-29, 2016 Mar.
Article in English | MEDLINE | ID: mdl-25761393

ABSTRACT

Kolmogorov-Chaitin complexity has long been believed to be impossible to approximate when it comes to short sequences (e.g. of length 5-50). However, with the newly developed coding theorem method the complexity of strings of length 2-11 can now be numerically estimated. We present the theoretical basis of algorithmic complexity for short strings (ACSS) and describe an R-package providing functions based on ACSS that will cover psychologists' needs and improve upon previous methods in three ways: (1) ACSS is now available not only for binary strings, but for strings based on up to 9 different symbols, (2) ACSS no longer requires time-consuming computing, and (3) a new approach based on ACSS gives access to an estimation of the complexity of strings of any length. Finally, three illustrative examples show how these tools can be applied to psychology.


Subject(s)
Algorithms , Psychology , Software , Humans
14.
Behav Res Methods ; 46(3): 732-44, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24311059

ABSTRACT

As human randomness production has come to be more closely studied and used to assess executive functions (especially inhibition), many normative measures for assessing the degree to which a sequence is random-like have been suggested. However, each of these measures focuses on one feature of randomness, forcing researchers to use multiple measures. Although algorithmic complexity has been suggested as a means of overcoming this inconvenience, it has never been used, because standard Kolmogorov complexity is inapplicable to short strings (e.g., of length l ≤ 50), due to both computational and theoretical limitations. Here, we describe a novel technique (the coding theorem method) based on the calculation of a universal distribution, which yields an objective and universal measure of algorithmic complexity for short strings that approximates Kolmogorov-Chaitin complexity.
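The coding theorem method rests on the Solomonoff-Levin coding theorem, K(s) ≈ -log2 m(s), where m(s) is the frequency with which s is produced over a large enumeration of small Turing machines. A schematic rendering (the frequencies below are made-up placeholders; real estimates use the published distributions behind the method):

    from math import log2

    # Hypothetical output frequencies from an enumeration of small Turing machines
    m = {"0101": 1.2e-4, "0010": 7.0e-5, "0111": 4.0e-5}

    def ctm_complexity(s):
        """Coding theorem method: K(s) approximated as -log2 of the
        algorithmic probability m(s)."""
        return -log2(m[s])

    for s in m:
        print(s, round(ctm_complexity(s), 2), "bits")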


Subject(s)
Algorithms , Behavior/physiology , Psychology/methods , Automation , Child , Cognition , Computer Simulation , Executive Function , Female , Humans , Language , Male , Memory, Short-Term , Models, Theoretical , Neuropsychological Tests , Probability
17.
Front Oncol ; 12: 850731, 2022.
Article in English | MEDLINE | ID: mdl-35957879

ABSTRACT

Cancers are complex adaptive diseases regulated by nonlinear feedback between genetic instabilities, environmental signals, cellular protein flows, and gene regulatory networks. Understanding the cybernetics of cancer requires the integration of information dynamics across multidimensional spatiotemporal scales, including genetic, transcriptional, metabolic, proteomic, epigenetic, and multi-cellular networks. However, time-series analysis of these complex networks remains largely absent from cancer research. With longitudinal screening and time-series analysis of cellular dynamics, universally observed causal patterns pertaining to dynamical systems may self-organize in the signaling or gene expression state-space of cancer-triggering processes. One class of such patterns, strange attractors, may serve as mathematical biomarkers of cancer progression. The emergence of intracellular chaos and chaotic cell population dynamics constitutes a new paradigm in systems medicine, and chaotic and complex dynamics are accordingly discussed here as mathematical hallmarks of cancer cell fate dynamics. Assuming time-resolved single-cell datasets become available, we review interdisciplinary tools and algorithms from complexity theory for investigating critical phenomena and chaotic dynamics in cancer ecosystems. To conclude, the perspective cultivates an intuition for computational systems oncology in terms of nonlinear dynamics, information theory, inverse problems, and complexity. We highlight the limitations we see in the area of statistical machine learning, but also the opportunity of combining it with the symbolic computational power offered by the mathematical tools explored.
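As one concrete instance of the kind of tool surveyed (a standard textbook computation, not specific to this paper), the largest Lyapunov exponent of the logistic map, whose positivity signals chaos:

    from math import log

    def lyapunov_logistic(r, x0=0.4, n=100_000, burn=1_000):
        """Largest Lyapunov exponent of x -> r*x*(1-x):
        average of log|f'(x)| = log|r*(1-2x)| along the orbit."""
        x, acc = x0, 0.0
        for i in range(n):
            x = r * x * (1 - x)
            if i >= burn:
                acc += log(abs(r * (1 - 2 * x)))
        return acc / (n - burn)

    print(lyapunov_logistic(3.2))   # negative: periodic regime
    print(lyapunov_logistic(4.0))   # positive (~ln 2): chaotic regime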

18.
Front Comput Neurosci ; 16: 956074, 2022.
Article in English | MEDLINE | ID: mdl-36761393

ABSTRACT

Being able to objectively characterize the intrinsic complexity of behavioral patterns resulting from human or animal decisions is fundamental for deconvolving cognition and designing autonomous artificial intelligence systems. Yet complexity is difficult to quantify in practice, particularly when strings are short. By numerically approximating algorithmic (Kolmogorov) complexity (K), we establish an objective tool for characterizing behavioral complexity. Next, we approximate structural (Bennett's logical depth) complexity (LD) to assess the amount of computation required to generate a behavioral string. We apply our toolbox to three landmark studies of animal behavior of increasing sophistication and degree of environmental influence, covering foraging communication by ants, flight patterns of fruit flies, and tactical deception and competition (e.g., predator-prey) strategies. We find that ants harness environmental conditions in their internal decision process, modulating their behavioral complexity accordingly. Our analysis of flight (fruit flies) invalidated the common hypothesis that animals navigating in an environment devoid of stimuli adopt a random strategy: fruit flies exposed to a featureless environment deviated the most from Lévy flight, suggesting an algorithmic bias in their attempt to devise a useful (navigation) strategy. Similarly, a logical depth analysis of rats revealed that the structural complexity of a rat's behavior always ends up matching the structural complexity of its competitor, with the rats' behavior simulating algorithmic randomness. Finally, we discuss how experiments on how humans perceive randomness suggest the existence of an algorithmic bias in our reasoning and decision processes, in line with our analysis of the animal experiments. This contrasts with the view of the mind as performing faulty computations when presented with randomized items. In summary, our formal toolbox objectively characterizes external constraints on putative models of the "internal" decision process in humans and animals.
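A crude but common numerical proxy pair for these two quantities (our sketch; the paper uses CTM/BDM-based estimators rather than compression): compressed size approximates K, and decompression time approximates LD, since logically deep objects are those whose short descriptions take long to unfold.

    import time
    import zlib

    def k_and_ld_proxy(s: bytes, repeats=200):
        """Compressed size as a proxy for K; mean decompression time
        (seconds) as a proxy for logical depth LD."""
        comp = zlib.compress(s, 9)
        t0 = time.perf_counter()
        for _ in range(repeats):
            zlib.decompress(comp)
        return len(comp), (time.perf_counter() - t0) / repeats

    print(k_and_ld_proxy(b"01" * 5000))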

19.
Front Artif Intell ; 3: 567356, 2020.
Article in English | MEDLINE | ID: mdl-33733213

ABSTRACT

We show how complexity theory can be introduced into machine learning to help bring together apparently disparate areas of current research. We show that this model-driven approach may require less training data and can potentially be more generalizable, as it shows greater resilience to random attacks. In an algorithmic space, the order of the elements is given by their algorithmic probability, which arises naturally from computable processes. We investigate the shape of a discrete algorithmic space when performing regression or classification using a loss function parametrized by algorithmic complexity, demonstrating that the property of differentiability is not required to achieve results similar to those obtained using differentiable programming approaches such as deep learning. In doing so we use examples that enable the two approaches to be compared (small ones, given the computational power required for estimations of algorithmic complexity). We find and report that (1) machine learning can successfully be performed on a non-smooth surface using algorithmic complexity; (2) solutions can be found using an algorithmic-probability classifier, establishing a bridge between a fundamentally discrete theory of computability and a fundamentally continuous mathematical theory of optimization; (3) an algorithmically directed search technique in non-smooth manifolds can be defined and conducted; and (4) exploitation techniques and numerical methods for algorithmic search can navigate these discrete non-differentiable spaces, with applications to (a) the identification of generative rules from data observations; (b) image classification that is more resilient to pixel attacks than neural networks; (c) the identification of equation parameters from small datasets in the presence of noise in a continuous ODE system; and (d) the classification of Boolean NK networks by network topology, underlying Boolean function, and number of incoming edges.
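One concrete, parameter-free instance of learning with algorithmic-information tools (a simplified stand-in for the paper's algorithmic-probability classifier; compression replaces CTM/BDM estimates): nearest-neighbour classification under the normalized compression distance.

    import zlib

    def ncd(x: bytes, y: bytes) -> float:
        """Normalized compression distance, a computable approximation of
        the Kolmogorov-complexity-based normalized information distance."""
        cx, cy = len(zlib.compress(x, 9)), len(zlib.compress(y, 9))
        cxy = len(zlib.compress(x + y, 9))
        return (cxy - min(cx, cy)) / max(cx, cy)

    def classify(query: bytes, labelled):
        """1-NN under NCD; labelled is a list of (bytes, label) pairs."""
        return min(labelled, key=lambda p: ncd(query, p[0]))[1]

    train = [(b"0101" * 50, "periodic"), (b"0010110111000101" * 12, "mixed")]
    print(classify(b"01" * 90, train))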

20.
iScience ; 19: 1160-1172, 2019 Sep 27.
Article in English | MEDLINE | ID: mdl-31541920

ABSTRACT

We introduce and develop a method demonstrating that the algorithmic information content of a system can be used as a steering handle in its dynamical phase space, thus affording an avenue for controlling and reprogramming systems. The method consists of applying a series of controlled interventions to a networked system while estimating how its algorithmic information content is affected. We demonstrate the method by reconstructing the phase space and generative rules of discrete dynamical systems (cellular automata) serving as controlled case studies. Next, the model-based interventional or causal calculus is evaluated and validated using (1) a large set of small graphs, (2) a number of larger networks with different topologies, and (3) biological networks derived from a widely studied and validated genetic network (E. coli), as well as a significant number of differentiating (Th17) and differentiated human cells from curated biological network data.
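The intervention loop can be sketched as follows (our simplified rendering; compressed size stands in for the BDM estimates the method employs): delete each edge in turn, record how the complexity estimate shifts, and rank elements by their information contribution.

    import zlib
    import networkx as nx

    def complexity(g):
        """Crude complexity estimate: compressed adjacency matrix (BDM stand-in)."""
        return len(zlib.compress(nx.to_numpy_array(g, dtype=int).tobytes(), 9))

    def information_signature(g):
        """Shift in estimated complexity caused by deleting each edge;
        large |shift| marks edges that steer the system's information content."""
        base = complexity(g)
        shifts = {}
        for e in list(g.edges()):
            h = g.copy()
            h.remove_edge(*e)
            shifts[e] = complexity(h) - base
        return sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True)

    g = nx.barabasi_albert_graph(30, 2, seed=7)
    print(information_signature(g)[:5])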
