Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 30
Filtrar
1.
Phys Rev Lett ; 132(7): 077402, 2024 Feb 16.
Artigo em Inglês | MEDLINE | ID: mdl-38427895

RESUMO

Studies of dynamics on temporal networks often represent the network as a series of "snapshots," static networks active for short durations of time. We argue that successive snapshots can be aggregated if doing so has little effect on the overlying dynamics. We propose a method to compress network chronologies by progressively combining pairs of snapshots whose matrix commutators have the smallest dynamical effect. We apply this method to epidemic modeling on real contact tracing data and find that it allows for significant compression while remaining faithful to the epidemic dynamics.

2.
PLoS Comput Biol ; 18(11): e1010650, 2022 11.
Artigo em Inglês | MEDLINE | ID: mdl-36413581

RESUMO

Network science has increasingly become central to the field of epidemiology and our ability to respond to infectious disease threats. However, many networks derived from modern datasets are not just large, but dense, with a high ratio of edges to nodes. This includes human mobility networks where most locations have a large number of links to many other locations. Simulating large-scale epidemics requires substantial computational resources and in many cases is practically infeasible. One way to reduce the computational cost of simulating epidemics on these networks is sparsification, where a representative subset of edges is selected based on some measure of their importance. We test several sparsification strategies, ranging from naive thresholding to random sampling of edges, on mobility data from the U.S. Following recent work in computer science, we find that the most accurate approach uses the effective resistances of edges, which prioritizes edges that are the only efficient way to travel between their endpoints. The resulting sparse network preserves many aspects of the behavior of an SIR model, including both global quantities, like the epidemic size, and local details of stochastic events, including the probability each node becomes infected and its distribution of arrival times. This holds even when the sparse network preserves fewer than 10% of the edges of the original network. In addition to its practical utility, this method helps illuminate which links of a weighted, undirected network are most important to disease spread.


Assuntos
Pandemias , Viagem , Humanos , Projetos de Pesquisa , Modelos Epidemiológicos , Probabilidade
3.
Phys Rev Lett ; 123(23): 230605, 2019 Dec 06.
Artigo em Inglês | MEDLINE | ID: mdl-31868436

RESUMO

We prove a remarkable combinatorial symmetry in the number of spanning configurations in site percolation: for a large class of lattices, the number of spanning configurations with an odd or even number of occupied sites differs by ±1. In particular, this symmetry implies that the total number of spanning configurations is always odd, independent of the size or shape of the lattice. The class of lattices that share this symmetry includes the square lattice and the hypercubic lattice in any dimension, with a wide variety of boundary conditions.

4.
Proc Natl Acad Sci U S A ; 113(50): 14207-14212, 2016 12 13.
Artigo em Inglês | MEDLINE | ID: mdl-27911773

RESUMO

With increasing amounts of information available, modeling and predicting user preferences-for books or articles, for example-are becoming more important. We present a collaborative filtering model, with an associated scalable algorithm, that makes accurate predictions of users' ratings. Like previous approaches, we assume that there are groups of users and of items and that the rating a user gives an item is determined by their respective group memberships. However, we allow each user and each item to belong simultaneously to mixtures of different groups and, unlike many popular approaches such as matrix factorization, we do not assume that users in each group prefer a single group of items. In particular, we do not assume that ratings depend linearly on a measure of similarity, but allow probability distributions of ratings to depend freely on the user's and item's groups. The resulting overlapping groups and predicted ratings can be inferred with an expectation-maximization algorithm whose running time scales linearly with the number of observed ratings. Our approach enables us to predict user preferences in large datasets and is considerably more accurate than the current algorithms for such large datasets.

5.
Proc Natl Acad Sci U S A ; 113(7): 1766-71, 2016 Feb 16.
Artigo em Inglês | MEDLINE | ID: mdl-26831113

RESUMO

How universal is human conceptual structure? The way concepts are organized in the human brain may reflect distinct features of cultural, historical, and environmental background in addition to properties universal to human cognition. Semantics, or meaning expressed through language, provides indirect access to the underlying conceptual structure, but meaning is notoriously difficult to measure, let alone parameterize. Here, we provide an empirical measure of semantic proximity between concepts using cross-linguistic dictionaries to translate words to and from languages carefully selected to be representative of worldwide diversity. These translations reveal cases where a particular language uses a single "polysemous" word to express multiple concepts that another language represents using distinct words. We use the frequency of such polysemies linking two concepts as a measure of their semantic proximity and represent the pattern of these linkages by a weighted network. This network is highly structured: Certain concepts are far more prone to polysemy than others, and naturally interpretable clusters of closely related concepts emerge. Statistical analysis of the polysemies observed in a subset of the basic vocabulary shows that these structural properties are consistent across different language groups, and largely independent of geography, environment, and the presence or absence of a literary tradition. The methods developed here can be applied to any semantic domain to reveal the extent to which its conceptual structure is, similarly, a universal attribute of human cognition and language use.


Assuntos
Semântica , Humanos
6.
Proc Natl Acad Sci U S A ; 111(51): 18144-9, 2014 Dec 23.
Artigo em Inglês | MEDLINE | ID: mdl-25489096

RESUMO

Modularity is a popular measure of community structure. However, maximizing the modularity can lead to many competing partitions, with almost the same modularity, that are poorly correlated with each other. It can also produce illusory ''communities'' in random graphs where none exist. We address this problem by using the modularity as a Hamiltonian at finite temperature and using an efficient belief propagation algorithm to obtain the consensus of many partitions with high modularity, rather than looking for a single partition that maximizes it. We show analytically and numerically that the proposed algorithm works all of the way down to the detectability transition in networks generated by the stochastic block model. It also performs well on real-world networks, revealing large communities in some networks where previous work has claimed no communities exist. Finally we show that by applying our algorithm recursively, subdividing communities until no statistically significant subcommunities can be found, we can detect hierarchical structure in real-world networks more efficiently than previous methods.


Assuntos
Modelos Teóricos , Algoritmos , Análise por Conglomerados , Processos Estocásticos
7.
Proc Natl Acad Sci U S A ; 110(52): 20935-40, 2013 Dec 24.
Artigo em Inglês | MEDLINE | ID: mdl-24277835

RESUMO

Spectral algorithms are classic approaches to clustering and community detection in networks. However, for sparse networks the standard versions of these algorithms are suboptimal, in some cases completely failing to detect communities even when other algorithms such as belief propagation can do so. Here, we present a class of spectral algorithms based on a nonbacktracking walk on the directed edges of the graph. The spectrum of this operator is much better-behaved than that of the adjacency matrix or other commonly used matrices, maintaining a strong separation between the bulk eigenvalues and the eigenvalues relevant to community structure even in the sparse case. We show that our algorithm is optimal for graphs generated by the stochastic block model, detecting communities all of the way down to the theoretical limit. We also show the spectrum of the nonbacktracking operator for some real-world networks, illustrating its advantages over traditional spectral clustering.


Assuntos
Algoritmos , Análise por Conglomerados , Interpretação Estatística de Dados , Modelos Estatísticos
8.
Nature ; 453(7191): 98-101, 2008 May 01.
Artigo em Inglês | MEDLINE | ID: mdl-18451861

RESUMO

Networks have in recent years emerged as an invaluable tool for describing and quantifying complex systems in many branches of science. Recent studies suggest that networks often exhibit hierarchical organization, in which vertices divide into groups that further subdivide into groups of groups, and so forth over multiple scales. In many cases the groups are found to correspond to known functional units, such as ecological niches in food webs, modules in biochemical networks (protein interaction networks, metabolic networks or genetic regulatory networks) or communities in social networks. Here we present a general technique for inferring hierarchical structure from network data and show that the existence of hierarchy can simultaneously explain and quantitatively reproduce many commonly observed topological properties of networks, such as right-skewed degree distributions, high clustering coefficients and short path lengths. We further show that knowledge of hierarchical structure can be used to predict missing connections in partly known networks with high accuracy, and for more general network structures than competing techniques. Taken together, our results suggest that hierarchy is a central organizing principle of complex networks, capable of offering insight into many network phenomena.


Assuntos
Algoritmos , Modelos Biológicos , Probabilidade , Vias Biossintéticas , Cadeia Alimentar , Redes Reguladoras de Genes , Redes e Vias Metabólicas , Ligação Proteica , Sensibilidade e Especificidade , Comportamento Social
9.
J Stat Mech ; 2014(5)2014 May.
Artigo em Inglês | MEDLINE | ID: mdl-26167197

RESUMO

The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for undertaking this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction and point to a general approach to model selection in network analysis.

10.
Phys Rev E ; 105(5): L052303, 2022 May.
Artigo em Inglês | MEDLINE | ID: mdl-35706256

RESUMO

Many datasets give partial information about an ordering or ranking by indicating which team won a game, which item a user prefers, or who infected whom. We define a continuous spin system whose Gibbs distribution is the posterior distribution on permutations, given a probabilistic model of these interactions. Using the cavity method, we derive a belief propagation algorithm that computes the marginal distribution of each node's position. In addition, the Bethe free energy lets us approximate the number of linear extensions of a partial order and perform model selection between competing probabilistic models, such as the Bradley-Terry-Luce model of noisy comparisons and its cousins.

11.
ArXiv ; 2022 Feb 10.
Artigo em Inglês | MEDLINE | ID: mdl-35169597

RESUMO

Most models of epidemic spread, including many designed specifically for COVID-19, implicitly assume mass-action contact patterns and undirected contact networks, meaning that the individuals most likely to spread the disease are also the most at risk to receive it from others. Here, we review results from the theory of random directed graphs which show that many important quantities, including the reproduction number and the epidemic size, depend sensitively on the joint distribution of in- and out-degrees ("risk" and "spread"), including their heterogeneity and the correlation between them. By considering joint distributions of various kinds, we elucidate why some types of heterogeneity cause a deviation from the standard Kermack-McKendrick analysis of SIR models, i.e., so-called mass-action models where contacts are homogeneous and random, and some do not. We also show that some structured SIR models informed by realistic complex contact patterns among types of individuals (age or activity) are simply mixtures of Poisson processes and tend not to deviate significantly from the simplest mass-action model. Finally, we point out some possible policy implications of this directed structure, both for contact tracing strategy and for interventions designed to prevent superspreading events. In particular, directed graphs have a forward and backward version of the classic "friendship paradox" -- forward edges tend to lead to individuals with high risk, while backward edges lead to individuals with high spread -- such that a combination of both forward and backward contact tracing is necessary to find superspreading events and prevent future cascades of infection.

12.
Phys Rev Lett ; 107(6): 065701, 2011 Aug 05.
Artigo em Inglês | MEDLINE | ID: mdl-21902340

RESUMO

We present an asymptotically exact analysis of the problem of detecting communities in sparse random networks generated by stochastic block models. Using the cavity method of statistical physics and its relationship to belief propagation, we unveil a phase transition from a regime where we can infer the correct group assignments of the nodes to one where these groups are undetectable. Our approach yields an optimal inference algorithm for detecting modules, including both assortative and disassortative functional modules, assessing their significance, and learning the parameters of the underlying block model. Our algorithm is scalable and applicable to real-world networks, as long as they are well described by the block model.

14.
Phys Rev E ; 98(2-1): 022120, 2018 Aug.
Artigo em Inglês | MEDLINE | ID: mdl-30253462

RESUMO

We use invasion percolation to compute highly accurate numerical values for bond and site percolation thresholds p_{c} on the hypercubic lattice Z^{d} for d=4,...,13. We also compute the Fisher exponent τ governing the cluster size distribution at criticality. Our results support the claim that the mean-field value τ=5/2 holds for d≥6, with logarithmic corrections to power-law scaling at d=6.

15.
Sci Adv ; 4(7): eaar8260, 2018 07.
Artigo em Inglês | MEDLINE | ID: mdl-30035220

RESUMO

We present a physically inspired model and an efficient algorithm to infer hierarchical rankings of nodes in directed networks. It assigns real-valued ranks to nodes rather than simply ordinal ranks, and it formalizes the assumption that interactions are more likely to occur between individuals with similar ranks. It provides a natural statistical significance test for the inferred hierarchy, and it can be used to perform inference tasks such as predicting the existence or direction of edges. The ranking is obtained by solving a linear system of equations, which is sparse if the network is; thus, the resulting algorithm is extremely efficient and scalable. We illustrate these findings by analyzing real and synthetic data, including data sets from animal behavior, faculty hiring, social support networks, and sports tournaments. We show that our method often outperforms a variety of others, in both speed and accuracy, in recovering the underlying ranks and predicting edge directions.

16.
Phys Rev E ; 96(4-1): 042116, 2017 Oct.
Artigo em Inglês | MEDLINE | ID: mdl-29347529

RESUMO

We use invasion percolation to compute numerical values for bond and site percolation thresholds p_{c} (existence of an infinite cluster) and p_{u} (uniqueness of the infinite cluster) of tesselations {P,Q} of the hyperbolic plane, where Q faces meet at each vertex and each face is a P-gon. Our values are accurate to six or seven decimal places, allowing us to explore their functional dependency on P and Q and to numerically compute critical exponents. We also prove rigorous upper and lower bounds for p_{c} and p_{u} that can be used to find the scaling of both thresholds as a function of P and Q.

17.
Phys Rev E ; 95(4-1): 042317, 2017 Apr.
Artigo em Inglês | MEDLINE | ID: mdl-28505768

RESUMO

Complex systems are often characterized by distinct types of interactions between the same entities. These can be described as a multilayer network where each layer represents one type of interaction. These layers may be interdependent in complicated ways, revealing different kinds of structure in the network. In this work we present a generative model, and an efficient expectation-maximization algorithm, which allows us to perform inference tasks such as community detection and link prediction in this setting. Our model assumes overlapping communities that are common between the layers, while allowing these communities to affect each layer in a different way, including arbitrary mixtures of assortative, disassortative, or directed structure. It also gives us a mathematically principled way to define the interdependence between layers, by measuring how much information about one layer helps us predict links in another layer. In particular, this allows us to bundle layers together to compress redundant information and identify small groups of layers which suffice to predict the remaining layers accurately. We illustrate these findings by analyzing synthetic data and two real multilayer networks, one representing social support relationships among villagers in South India and the other representing shared genetic substring material between genes of the malaria parasite.

18.
Phys Rev E Stat Nonlin Soft Matter Phys ; 74(3 Pt 2): 036121, 2006 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-17025722

RESUMO

There has been considerable recent interest in the properties of networks, such as citation networks and the worldwide web, that grow by the addition of vertices, and a number of simple solvable models of network growth have been studied. In the real world, however, many networks, including the web, not only add vertices but also lose them. Here we formulate models of the time evolution of such networks and give exact solutions for a number of cases of particular interest. For the case of net growth and so-called preferential attachment--in which newly appearing vertices attach to previously existing ones in proportion to vertex degree--we show that the resulting networks have power-law degree distributions, but with an exponent that diverges as the growth rate vanishes. We conjecture that the low exponent values observed in real-world networks are thus the result of vigorous growth in which the rate of addition of vertices far exceeds the rate of removal. Were growth to slow in the future--for instance, in a more mature future version of the web--we would expect to see exponents increase, potentially without bound.

19.
Phys Rev E Stat Nonlin Soft Matter Phys ; 73(2 Pt 2): 026130, 2006 Feb.
Artigo em Inglês | MEDLINE | ID: mdl-16605421

RESUMO

We study the topological and geographic structure of the national road networks of the United States, England, and Denmark. By transforming these networks into their dual representation, where roads are vertices and an edge connects two vertices if the corresponding roads ever intersect, we show that they exhibit both topological and geographic scale invariance. That is, we show that for sufficiently large geographic areas, the dual degree distribution follows a power law with exponent 2.2< or = alpha < or =2.4, and that journeys, regardless of their length, have a largely identical structure. To explain these properties, we introduce and analyze a simple fractal model of road placement that reproduces the observed structure, and suggests a testable connection between the scaling exponent and the fractal dimensions governing the placement of roads and intersections.

20.
Phys Rev E Stat Nonlin Soft Matter Phys ; 73(3 Pt 2): 036104, 2006 Mar.
Artigo em Inglês | MEDLINE | ID: mdl-16605595

RESUMO

Most current methods for identifying coherent structures in spatially extended systems rely on prior information about the form which those structures take. Here we present two approaches to automatically filter the changing configurations of spatial dynamical systems and extract coherent structures. One, local sensitivity filtering, is a modification of the local Lyapunov exponent approach suitable to cellular automata and other discrete spatial systems. The other, local statistical complexity filtering, calculates the amount of information needed for optimal prediction of the system's behavior in the vicinity of a given point. By examining the changing spatiotemporal distributions of these quantities, we can find the coherent structures in a variety of pattern-forming cellular automata, without needing to guess or postulate the form of that structure. We apply both filters to elementary and cyclical cellular automata (ECA and CCA) and find that they readily identify particles, domains, and other more complicated structures. We compare the results from ECA with earlier ones based upon the theory of formal languages and the results from CCA with a more traditional approach based on an order parameter and free energy. While sensitivity and statistical complexity are equally adept at uncovering structure, they are based on different system properties (dynamical and probabilistic, respectively) and provide complementary information.


Assuntos
Biofísica/métodos , Biologia Computacional/métodos , Automação , Modelos Estatísticos , Linguagens de Programação , Sensibilidade e Especificidade , Software , Termodinâmica , Fatores de Tempo
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA