Results 1 - 20 of 33
1.
J Comput Neurosci ; 48(1): 85-102, 2020 02.
Article in English | MEDLINE | ID: mdl-31993923

ABSTRACT

Neuronal responses to complex stimuli and tasks can encompass a wide range of time scales. Understanding these responses requires measures that characterize how the information on these response patterns is represented across multiple temporal resolutions. In this paper we propose a metric - which we call multiscale relevance (MSR) - to capture the dynamical variability of the activity of single neurons across different time scales. The MSR is a non-parametric, fully featureless indicator in that it uses only the time stamps of the firing activity without resorting to any a priori covariate or invoking any specific structure in the tuning curve for neural activity. When applied to neural data from the mEC and from the ADn and PoS regions of freely-behaving rodents, we found that neurons having low MSR tend to have low mutual information and low firing sparsity across the correlates that are believed to be encoded by the region of the brain where the recordings were made. In addition, neurons with high MSR contain significant information on spatial navigation and allow spatial position or head direction to be decoded as efficiently as those neurons whose firing activity has high mutual information with the covariate to be decoded, and significantly better than the set of neurons with high local variations in their interspike intervals. Given these results, we propose that the MSR can be used as a measure to rank and select neurons for their information content without the need to appeal to any a priori covariate.


Subject(s)
Action Potentials/physiology , Electrophysiological Phenomena/physiology , Neurons/physiology , Algorithms , Animals , Anterior Thalamic Nuclei/physiology , Bayes Theorem , Brain/physiology , Entorhinal Cortex/physiology , Head , Information Theory , Mice , Rats , Rodentia , Space Perception/physiology
2.
Neural Comput ; 31(8): 1592-1623, 2019 08.
Article in English | MEDLINE | ID: mdl-31260388

ABSTRACT

We investigate the complexity of logistic regression models, which is defined by counting the number of indistinguishable distributions that the model can represent (Balasubramanian, 1997). We find that the complexity of logistic models with binary inputs depends not only on the number of parameters but also on the distribution of inputs in a nontrivial way that standard treatments of complexity do not address. In particular, we observe that correlations among inputs induce effective dependencies among parameters, thus constraining the model and, consequently, reducing its complexity. We derive simple relations for the upper and lower bounds of the complexity. Furthermore, we show analytically that defining the model parameters on a finite support rather than the entire axis decreases the complexity in a manner that critically depends on the size of the domain. Based on our findings, we propose a novel model selection criterion that takes into account the entropy of the input distribution. We test our proposal on the problem of selecting the input variables of a logistic regression model in a Bayesian model selection framework. In our numerical tests, we find that while the reconstruction errors of standard model selection approaches (AIC, BIC, ℓ1 regularization) strongly depend on the sparsity of the ground truth, the reconstruction error of our method is always close to the minimum in all conditions of sparsity, data size, and strength of input correlations. Finally, we observe that when considering categorical instead of binary inputs, in a simple and mathematically tractable case, the contribution of the alphabet size to the complexity is very small compared to that of parameter space dimension. We further explore the issue by analyzing the data set of the "13 keys to the White House," a method for forecasting the outcomes of US presidential elections.

3.
Entropy (Basel) ; 20(10)2018 Oct 01.
Article in English | MEDLINE | ID: mdl-33265844

ABSTRACT

In the Minimum Description Length (MDL) principle, learning from the data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are taken as generative models of samples, they generate samples with broad empirical distributions and with a high value of the relevance, defined as the entropy of the empirical frequencies. These results are derived for different statistical models (Dirichlet model, independent and pairwise dependent spin models, and restricted Boltzmann machines). Second, MDL codes sit precisely at a second order phase transition point where the symmetry between the sampled outcomes is spontaneously broken. The order parameter controlling the phase transition is the coding cost of the samples. The phase transition is a manifestation of the optimality of MDL codes, and it arises because codes that achieve a higher compression do not exist. These results suggest a clear interpretation of the widespread occurrence of statistical criticality as a characterization of samples which are maximally informative on the underlying generative process.
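The relevance mentioned in this abstract, the entropy of the empirical frequencies, can be illustrated with a minimal sketch. It assumes the usual convention that an outcome s occurs k_s times in a sample of size N and that m_k counts the outcomes seen exactly k times; the function name `resolution_and_relevance` and the toy sample are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def resolution_and_relevance(sample):
    """Return (H[s], H[k]) in bits for a list of hashable outcomes.

    Illustrative sketch only; names and conventions are assumptions.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    # k_s: how many times each distinct outcome s occurs in the sample.
    counts = Counter(sample)
    # m_k: how many distinct outcomes occur exactly k times.
    multiplicity = Counter(counts.values())
    # Resolution H[s]: entropy of the empirical distribution k_s / N.
    h_s = -sum((k / n) * log2(k / n) for k in counts.values())
    # Relevance H[k]: entropy of the distribution k * m_k / N over frequencies k.
    h_k = -sum((k * m / n) * log2(k * m / n) for k, m in multiplicity.items())
    return h_s, h_k


# Toy sample: "a" is seen twice, "b" and "c" once each.
h_s, h_k = resolution_and_relevance(["a", "a", "b", "c"])
```

For the toy sample, half of the observations fall on outcomes seen once and half on the outcome seen twice, so the frequency distribution is as broad as it can be with two distinct values of k.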

4.
Entropy (Basel) ; 20(10)2018 Sep 27.
Article in English | MEDLINE | ID: mdl-33265828

ABSTRACT

Models can be simple for different reasons: because they yield a simple and computationally efficient interpretation of a generic dataset (e.g., in terms of pairwise dependencies), as in statistical learning, or because they capture the laws of a specific phenomenon, as in physics, leading to non-trivial falsifiable predictions. In information theory, the simplicity of a model is quantified by the stochastic complexity, which measures the number of bits needed to encode its parameters. In order to understand what simple models look like, we study the stochastic complexity of spin models with interactions of arbitrary order. We show that bijections within the space of possible interactions preserve the stochastic complexity, which allows the space of all models to be partitioned into equivalence classes. We thus find that the simplicity of a model is not determined by the order of the interactions, but rather by their mutual arrangements. Models where statistical dependencies are localized on non-overlapping groups of few variables are simple, affording predictions on independencies that are easy to falsify. On the contrary, fully connected pairwise models, which are often used in statistical learning, appear to be highly complex, because of their extended set of interactions, and they are hard to falsify.

5.
Biophys J ; 113(1): 206-213, 2017 Jul 11.
Article in English | MEDLINE | ID: mdl-28700919

ABSTRACT

Competition to bind microRNAs induces an effective positive cross talk between their targets, which are therefore known as "competing endogenous RNAs" (ceRNAs). Although such an effect is known to play a significant role in specific situations, estimating its strength from data and experimentally in physiological conditions appears to be far from simple. Here, we show that the susceptibility of ceRNAs to different types of perturbations affecting their competitors (and hence their tendency to cross talk) can be encoded in quantities as intuitive and as simple to measure as correlation functions. This scenario is confirmed by extensive numerical simulations and validated by re-analyzing phosphatase and tensin homolog's cross-talk pattern from The Cancer Genome Atlas breast cancer database. These results clarify the links between different quantities used to estimate the intensity of ceRNA cross talk and provide, to our knowledge, new keys to analyze transcriptional data sets and effectively probe ceRNA networks in silico.


Subject(s)
Algorithms , Competitive Binding , MicroRNAs/metabolism , Biological Models , Molecular Models , Breast Neoplasms/metabolism , Computer Simulation , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Genetic Databases , Gene Expression Profiling , Humans , Kinetics , MicroRNAs/chemistry , Nuclear Proteins/chemistry , Nuclear Proteins/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , Stochastic Processes , Tensins/chemistry , Tensins/metabolism , Genetic Transcription/physiology
6.
Proc Natl Acad Sci U S A ; 109(12): 4395-400, 2012 Mar 20.
Article in English | MEDLINE | ID: mdl-22383559

ABSTRACT

The very notion of social network implies that linked individuals interact repeatedly with each other. This notion allows them not only to learn successful strategies and adapt to them, but also to condition their own behavior on the behavior of others, in a strategic forward looking manner. Game theory of repeated games shows that these circumstances are conducive to the emergence of collaboration in simple games of two players. We investigate the extension of this concept to the case where players are engaged in a local contribution game and show that rationality and credibility of threats identify a class of Nash equilibria--that we call "collaborative equilibria"--that have a precise interpretation in terms of subgraphs of the social network. For large network games, the number of such equilibria is exponentially large in the number of players. When incentives to defect are small, equilibria are supported by local structures whereas when incentives exceed a threshold they acquire a nonlocal nature, which requires a "critical mass" of more than a given fraction of the players to collaborate. Therefore, when incentives are high, an individual deviation typically causes the collapse of collaboration across the whole system. At the same time, higher incentives to defect typically support equilibria with a higher density of collaborators. The resulting picture conforms with several results in sociology and in the experimental literature on game theory, such as the prevalence of collaboration in denser groups and in the structural hubs of sparse networks.


Subject(s)
Social Support , Algorithms , Communication , Cooperative Behavior , Game Theory , Humans , Psychological Models , Statistical Models , Theoretical Models
7.
Sci Rep ; 13(1): 14879, 2023 Sep 09.
Article in English | MEDLINE | ID: mdl-37689770

ABSTRACT

We use an agnostic information-theoretic approach to investigate the statistical properties of natural images. We introduce the Multiscale Relevance (MSR) measure to assess the robustness of images to compression at all scales. Starting in a controlled environment, we characterize the MSR of synthetic random textures as a function of image roughness [Formula: see text] and other relevant parameters. We then extend the analysis to natural images and find striking similarities with critical ([Formula: see text]) random textures. We show that the MSR is more robust and informative of image content than classical methods such as power spectrum analysis. Finally, we compare the MSR with classical measures for the calibration of common procedures such as color mapping and denoising. Overall, the MSR approach appears to be a good candidate for advanced image analysis and image processing, while providing a good level of physical interpretability.

8.
Proc Natl Acad Sci U S A ; 106(28): 11433-8, 2009 Jul 14.
Article in English | MEDLINE | ID: mdl-19571013

ABSTRACT

Networks describe a variety of interacting complex systems in social science, biology, and information technology. Usually the nodes of real networks are identified not only by their connections but also by some other characteristics. Examples of characteristics of nodes can be age, gender, or nationality of a person in a social network, the abundance of proteins in the cell taking part in protein-interaction networks, or the geographical position of airports that are connected by directed flights. Integrating the information on the connections of each node with the information about its characteristics is crucial to discriminating between the essential and negligible characteristics of nodes for the structure of the network. In this paper we propose a general indicator Theta, based on entropy measures, to quantify the dependence of a network's structure on a given set of features. We apply this method to social networks of friendships in U.S. schools, to the protein-interaction network of Saccharomyces cerevisiae and to the U.S. airport network, showing that the proposed measure provides information that complements other known measures.


Subject(s)
Theoretical Models , Protein Interaction Mapping/methods , Saccharomyces cerevisiae/metabolism , Social Support , Transportation , Entropy , Humans
9.
Proc Natl Acad Sci U S A ; 106(8): 2607-11, 2009 Feb 24.
Article in English | MEDLINE | ID: mdl-19196991

ABSTRACT

Understanding the organization of reaction fluxes in cellular metabolism from the stoichiometry and the topology of the underlying biochemical network is a central issue in systems biology. In this task, it is important to devise reasonable approximation schemes that rely on the stoichiometric data only, because full-scale kinetic approaches are computationally affordable only for small networks (e.g., red blood cells, approximately 50 reactions). Methods commonly used are based on finding the stationary flux configurations that satisfy mass-balance conditions for metabolites, often coupling them to local optimization rules (e.g., maximization of biomass production) to reduce the size of the solution space to a single point. Such methods have been widely applied and have proven able to reproduce experimental findings for relatively simple organisms in specific conditions. Here, we define and study a constraint-based model of cellular metabolism where neither mass balance nor flux stationarity are postulated and where the relevant flux configurations optimize the global growth of the system. In the case of Escherichia coli, steady flux states are recovered as solutions, although mass-balance conditions are violated for some metabolites, implying a nonzero net production of the latter. Such solutions furthermore turn out to provide the correct statistics of fluxes for the bacterium E. coli in different environments and compare well with the available experimental evidence on individual fluxes. Conserved metabolic pools play a key role in determining growth rate and flux variability. Finally, we are able to connect phenomenological gene essentiality with "frozen" fluxes (i.e., fluxes with smaller allowed variability) in E. coli metabolism.


Subject(s)
Escherichia coli/genetics , Bacterial Genes , Essential Genes , Escherichia coli/metabolism
10.
Phys Rev E ; 103(5-1): 052121, 2021 May.
Article in English | MEDLINE | ID: mdl-34134259

ABSTRACT

A 1929 Gedankenexperiment proposed by Szilárd, often referred to as "Szilárd's engine", has served as a foundation for computing fundamental thermodynamic bounds to information processing. While Szilárd's original box could be partitioned into two halves and contains one gas molecule, we calculate here the maximal average work that can be extracted in a system with N particles and q partitions, given an observer which counts the molecules in each partition, and given a work extraction mechanism that is limited to pressure equalization. We find that the average extracted work is proportional to the mutual information between the one-particle position and the vector containing the counts of how many particles are in each partition. We optimize this quantity over the initial locations of the dividing walls, and find that there exists a critical number of particles N^{★}(q) below which the extracted work is maximized by a symmetric configuration of the q partitions, and above which the optimal partitioning is asymmetric. Overall, the average extracted work is maximized for a number of particles \hat{N}(q)

11.
PLoS One ; 15(10): e0239331, 2020.
Article in English | MEDLINE | ID: mdl-33104709

ABSTRACT

Clustering and community detection provide a concise way of extracting meaningful information from large datasets. An ever growing plethora of data clustering and community detection algorithms have been proposed. In this paper, we address the question of ranking the performance of clustering algorithms for a given dataset. We show that, for hard clustering and community detection, Linsker's Infomax principle can be used to rank clustering algorithms. In brief, the algorithm that yields the highest value of the entropy of the partition, for a given number of clusters, is the best one. We show indeed, on a wide range of datasets of various sizes and topological structures, that the ranking provided by the entropy of the partition over a variety of partitioning algorithms is strongly correlated with the overlap with a ground truth partition. The code related to the project is available at https://github.com/Sandipan99/Ranking_cluster_algorithms.
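The ranking rule stated in this abstract (highest entropy of the partition, at a fixed number of clusters) can be sketched in a few lines. This is an illustration under stated assumptions, not the code from the linked repository: the function names, the label lists, and the algorithm names "balanced" and "skewed" are all hypothetical.

```python
from collections import Counter
from math import log


def partition_entropy(labels):
    """Entropy (in nats) of the cluster-size distribution of a hard partition."""
    n = len(labels)
    sizes = Counter(labels).values()
    return -sum((size / n) * log(size / n) for size in sizes)


def rank_by_partition_entropy(partitions):
    """Rank algorithm names from highest to lowest partition entropy.

    `partitions` maps a (hypothetical) algorithm name to its list of cluster
    labels. The comparison is only meaningful at a fixed number of clusters,
    so partitions with different cluster counts are rejected.
    """
    cluster_counts = {len(set(labels)) for labels in partitions.values()}
    if len(cluster_counts) > 1:
        raise ValueError("all partitions must have the same number of clusters")
    return sorted(
        partitions,
        key=lambda name: partition_entropy(partitions[name]),
        reverse=True,
    )


# Two hypothetical algorithms, each splitting four items into two clusters.
ranking = rank_by_partition_entropy(
    {"skewed": [0, 0, 0, 1], "balanced": [0, 0, 1, 1]}
)
```

Here the partition with equal-sized clusters has the larger entropy and is ranked first.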


Subject(s)
Algorithms , User-Computer Interface , Cluster Analysis , Factual Databases
12.
Phys Rev E Stat Nonlin Soft Matter Phys ; 79(1 Pt 2): 015101, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19257095

ABSTRACT

We define a minimal model of traffic flows in complex networks in order to study the trade-off between topological-based and traffic-based routing strategies. The resulting collective behavior is obtained analytically for an ensemble of uncorrelated networks and summarized in a rich phase diagram presenting second-order as well as first-order phase transitions between a free-flow phase and a congested phase. We find that traffic control improves global performance, enlarging the free-flow region in parameter space only in heterogeneous networks. Traffic control introduces nonlinear effects and, beyond a critical strength, may trigger the appearance of a congested phase in a discontinuous manner. The model also reproduces the crossover in the scaling of traffic fluctuations empirically observed on the Internet.

13.
Cell Rep ; 27(9): 2759-2771.e5, 2019 05 28.
Article in English | MEDLINE | ID: mdl-31141697

ABSTRACT

Loss of functional cardiomyocytes is a major determinant of heart failure after myocardial infarction. Previous high throughput screening studies have identified a few microRNAs (miRNAs) that can induce cardiomyocyte proliferation and stimulate cardiac regeneration in mice. Here, we show that all of the most effective of these miRNAs activate nuclear localization of the master transcriptional cofactor Yes-associated protein (YAP) and induce expression of YAP-responsive genes. In particular, miR-199a-3p directly targets two mRNAs coding for proteins impinging on the Hippo pathway, the upstream YAP inhibitory kinase TAOK1, and the E3 ubiquitin ligase β-TrCP, which leads to YAP degradation. Several of the pro-proliferative miRNAs (including miR-199a-3p) also inhibit filamentous actin depolymerization by targeting Cofilin2, a process that by itself activates YAP nuclear translocation. Thus, activation of YAP and modulation of the actin cytoskeleton are major components of the pro-proliferative action of miR-199a-3p and other miRNAs that induce cardiomyocyte proliferation.


Subject(s)
Apoptosis Regulatory Proteins/metabolism , Biomarkers/metabolism , Cell Proliferation , MicroRNAs/genetics , Cardiac Myocytes/cytology , Cardiac Myocytes/metabolism , Actin Cytoskeleton , Animals , Newborn Animals , Apoptosis Regulatory Proteins/genetics , Cofilin 2/genetics , Cofilin 2/metabolism , Female , Male , Rats , YAP-Signaling Proteins
14.
PLoS Comput Biol ; 2(4): e37, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16683018

ABSTRACT

The geography of codon bias distributions over prokaryotic genomes and its impact upon chromosomal organization are analyzed. To this aim, we introduce a clustering method based on information theory, specifically designed to cluster genes according to their codon usage and apply it to the coding sequences of Escherichia coli and Bacillus subtilis. One of the clusters identified in each of the organisms is found to be related to expression levels, as expected, but other groups feature an over-representation of genes belonging to different functional groups, namely horizontally transferred genes, motility, and intermediary metabolism. Furthermore, we show that genes with a similar bias tend to be close to each other on the chromosome and organized in coherent domains, more extended than operons, demonstrating a role of translation in structuring bacterial chromosomes. It is argued that a sizeable contribution to this effect comes from the dynamical compartmentalization induced by the recycling of tRNAs, leading to gene expression rates dependent on their genomic and expression context.


Subject(s)
Bacillus subtilis/genetics , Bacterial Chromosomes/genetics , Codon/genetics , Escherichia coli/genetics , Amino Acids/genetics , Base Sequence , Bacterial Gene Expression Regulation/genetics , Multigene Family/genetics , Operon/genetics
15.
Phys Rev E Stat Nonlin Soft Matter Phys ; 76(2 Pt 2): 026104, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17930101

ABSTRACT

We investigate the emergence of a structure in the correlation matrix of assets' returns as the time horizon over which returns are computed increases from the minutes to the daily scale. We analyze data from different stock markets (New York, Paris, London, Milano) and with different methods. In addition to the usual correlations, we also analyze those obtained by subtracting the dynamics of the "center of mass" (i.e., the market mode). We find that when the center of mass is not removed the structure emerges, as the time horizon increases, from splitting a single large cluster into smaller ones. By contrast, when the market mode is removed the structure of correlations observed at the daily scale is already well defined at very high frequency (5 min in the New York Stock Exchange). Moreover, this structure accounts for 80% of the classification of stocks in economic sectors. Similar results, though less sharp, are found for the other markets. We also find that the structure of correlations in the overnight returns is markedly different from that of intraday activity.
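The step of subtracting the "center of mass" before computing correlations can be sketched as follows. This is a minimal illustration that assumes the market mode is the equal-weighted average return across assets at each time step, which the abstract does not spell out; the function names and the toy return series are hypothetical.

```python
from math import sqrt


def remove_market_mode(returns):
    """Subtract the cross-sectional mean return from every asset at each time.

    `returns` is a list of equal-length return series, one per asset.
    Assumption: the market mode is the equal-weighted mean across assets.
    """
    n_assets = len(returns)
    n_times = len(returns[0])
    market = [sum(series[t] for series in returns) / n_assets for t in range(n_times)]
    return [[series[t] - market[t] for t in range(n_times)] for series in returns]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series_list):
    """Matrix of pairwise Pearson correlations between the given series."""
    return [[pearson(x, y) for y in series_list] for x in series_list]


# Toy returns for three hypothetical assets over four time steps.
toy_returns = [
    [0.01, -0.02, 0.03, 0.00],
    [0.02, -0.01, 0.01, -0.01],
    [0.00, -0.03, 0.02, 0.01],
]
residuals = remove_market_mode(toy_returns)
corr = correlation_matrix(residuals)
```

Comparing `correlation_matrix(toy_returns)` with `corr` on real data at different return horizons would be one way to reproduce the kind of comparison the abstract describes.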

16.
Sci Rep ; 7(1): 3096, 2017 06 08.
Article in English | MEDLINE | ID: mdl-28596593

ABSTRACT

Random number generation plays an essential role in technology with important applications in areas ranging from cryptography to Monte Carlo methods, and other probabilistic algorithms. All such applications require high-quality sources of random numbers, yet effective methods for assessing whether a source produces truly random sequences are still missing. Current methods either do not rely on a formal description of randomness (NIST test suite) on the one hand, or are inapplicable in principle (the characterization derived from the Algorithmic Theory of Information), on the other, for they require testing all the possible computer programs that could produce the sequence to be analysed. Here we present a rigorous method that overcomes these problems based on Bayesian model selection. We derive analytic expressions for a model's likelihood which is then used to compute its posterior distribution. Our method proves to be more rigorous than NIST's suite and Borel-Normality criterion and its implementation is straightforward. We applied our method to an experimental device based on the process of spontaneous parametric downconversion to confirm it behaves as a genuine quantum random number generator. As our approach relies on Bayesian inference our scheme transcends individual sequence analysis, leading to a characterization of the source itself.

17.
Phys Rev E Stat Nonlin Soft Matter Phys ; 73(6 Pt 2): 066127, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16906934

ABSTRACT

In this paper we study the impact of degree correlations in the subgraph statistics of scale-free networks. In particular we consider loops, simple cases of network subgraphs which encode the redundancy of the paths passing through every two nodes of the network. We provide an understanding of the scaling of the clustering coefficient in modular networks in terms of the maximal eigenvector of the average adjacency matrix of the ensemble. Furthermore we show that correlations affect in a relevant way the average number of Hamiltonian paths in a three-core of real world networks. We prove our results in the two-vertex correlated hidden variable ensemble and we check the results with exact counting of small loops in real graphs.

18.
Phys Rev E Stat Nonlin Soft Matter Phys ; 73(4 Pt 2): 046113, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16711884

ABSTRACT

We study scale-free simple graphs with an exponent of the degree distribution gamma less than 2. Generically one expects such extremely skewed networks--which occur very frequently in systems of virtually or logically connected units--to have different properties than those of scale free networks with gamma>2: The number of links grows faster than the number of nodes and they naturally possess the small world property, because the diameter increases by the logarithm of the size of the network and the clustering coefficient is finite. We discuss a simple prototype model of such networks, inspired by real world phenomena, which exhibits these properties and allows for a detailed analytical investigation.

19.
Phys Rev E Stat Nonlin Soft Matter Phys ; 74(3 Pt 2): 036106, 2006 Sep.
Article in English | MEDLINE | ID: mdl-17025707

ABSTRACT

We study a general set of models of social network evolution and dynamics. The models consist of both a dynamics on the network and evolution of the network. Links are formed preferentially between "similar" nodes, where the similarity is defined by the particular process taking place on the network. The interplay between the two processes produces phase transitions and hysteresis, as seen using numerical simulations for three specific processes. We obtain analytic results using mean-field approximations, and for a particular case we derive an exact solution for the network. In common with real-world social networks, we find coexistence of high and low connectivity phases and history dependence.

20.
Mol Biosyst ; 12(7): 2147-58, 2016 06 21.
Article in English | MEDLINE | ID: mdl-26974515

ABSTRACT

Evolution in its course has found a variety of solutions to the same optimisation problem. The advent of high-throughput genomic sequencing has made available extensive data from which, in principle, one can infer the underlying structure on which biological functions rely. In this paper, we present a new method aimed at the extraction of sites encoding structural and functional properties from a set of protein primary sequences, namely a multiple sequence alignment. The method, called critical variable selection, is based on the idea that subsets of relevant sites correspond to subsequences that occur with a particularly broad frequency distribution in the dataset. By applying this algorithm to in silico sequences, to the response regulator receiver and to the voltage sensor domain of ion channels, we show that this procedure recovers not only the information encoded in single site statistics and pairwise correlations but also captures dependencies going beyond pairwise correlations. The method proposed here is complementary to statistical coupling analysis, in that the most relevant sites predicted by the two methods differ markedly. We find robust and consistent results for datasets as small as a few hundred sequences that reveal a hidden hierarchy of sites that is consistent with the present knowledge on biologically relevant sites and evolutionary dynamics. This suggests that critical variable selection is capable of identifying a core of sites encoding functional and structural information in a multiple sequence alignment.


Subject(s)
Amino Acids/chemistry , Amino Acids/genetics , Codon , Genetic Variation , Proteins/chemistry , Proteins/genetics , Genetic Selection , Algorithms , Amino Acid Sequence , Amino Acid Substitution , Computational Biology/methods , Computer Simulation , Molecular Models , Statistical Models , Protein Conformation