ABSTRACT
Protein flexibility ranges from simple hinge movements to functional disorder. Around half of all human proteins contain apparently disordered regions with little 3D or functional information, and many of these proteins are associated with disease. Building on the evolutionary couplings approach previously successful in predicting 3D states of ordered proteins and RNA, we developed a method to predict the potential for ordered states for all apparently disordered proteins with sufficiently rich evolutionary information. The approach is highly accurate (79%) for residue interactions as tested in more than 60 known disordered regions captured in a bound or specific condition. Assessing the potential for structure of more than 1,000 apparently disordered regions of human proteins reveals a continuum of structural order, with at least 50% showing clear propensity for three- or two-dimensional states. Co-evolutionary constraints reveal hitherto unseen structures of functional importance in apparently disordered proteins.
Subject(s)
Intrinsically Disordered Proteins/chemistry, Directed Molecular Evolution/methods, Genomics, Humans, Intrinsically Disordered Proteins/genetics, Protein Secondary Structure, Protein Tertiary Structure, Proteome/chemistry, Proteome/genetics
ABSTRACT
Understanding how cell identity is encoded by the genome and acquired during differentiation is a central challenge in cell biology. I have developed a theoretical framework called EnhancerNet, which models the regulation of cell identity through the lens of transcription factor-enhancer interactions. I demonstrate that autoregulation in these interactions imposes a constraint on the model, resulting in simplified dynamics that can be parameterized from observed cell identities. Despite its simplicity, EnhancerNet recapitulates a broad range of experimental observations on cell identity dynamics, including enhancer selection, cell fate induction, hierarchical differentiation through multipotent progenitor states and direct reprogramming by transcription factor overexpression. The model makes specific quantitative predictions, reproducing known reprogramming recipes and the complex haematopoietic differentiation hierarchy without fitting unobserved parameters. EnhancerNet provides insights into how new cell types could evolve and highlights the functional importance of distal regulatory elements with dynamic chromatin in multicellular evolution.
Subject(s)
Cell Differentiation, Genetic Enhancer Elements, Transcription Factors, Genetic Enhancer Elements/genetics, Cell Differentiation/genetics, Animals, Transcription Factors/metabolism, Transcription Factors/genetics, Chromatin/metabolism, Cell Lineage/genetics, Humans, Biological Models, Genetic Models
ABSTRACT
The population loss of trained deep neural networks often follows precise power-law scaling relations with either the size of the training dataset or the number of parameters in the network. We propose a theory that explains the origins of and connects these scaling laws. We identify variance-limited and resolution-limited scaling behavior for both dataset and model size, for a total of four scaling regimes. The variance-limited scaling follows simply from the existence of a well-behaved infinite data or infinite width limit, while the resolution-limited regime can be explained by positing that models are effectively resolving a smooth data manifold. In the large width limit, this can be equivalently obtained from the spectrum of certain kernels, and we present evidence that large width and large dataset resolution-limited scaling exponents are related by a duality. We exhibit all four scaling regimes in the controlled setting of large random feature and pretrained models and test the predictions empirically on a range of standard architectures and datasets. We also observe several empirical relationships between datasets and scaling exponents under modifications of task and architecture aspect ratio. Our work provides a taxonomy for classifying different scaling regimes, underscores that there can be different mechanisms driving improvements in loss, and lends insight into the microscopic origin and relationships between scaling exponents.
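A resolution-limited power law of the kind described above can be identified by a log-log fit after subtracting the irreducible loss. The following is a minimal sketch on synthetic data; the names and values (`a`, `alpha`, `L_inf`) are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

# Synthetic "population loss" following a resolution-limited power law
# L(D) = a * D**(-alpha) + L_inf (illustrative parameters, not the paper's).
a, alpha, L_inf = 2.0, 0.35, 0.1
D = np.logspace(2, 6, 20)              # dataset sizes
loss = a * D**(-alpha) + L_inf

# Estimate the exponent by a linear fit in log-log space after
# subtracting the irreducible loss L_inf.
slope, intercept = np.polyfit(np.log(D), np.log(loss - L_inf), 1)
alpha_hat = -slope
print(round(alpha_hat, 3))  # recovers alpha = 0.35
```

In practice `L_inf` is itself unknown and must be estimated jointly, e.g. by scanning candidate values and maximizing the linearity of the log-log plot.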
ABSTRACT
How do we capture the breadth of behavior in animal movement, from rapid body twitches to aging? Using high-resolution videos of the nematode worm Caenorhabditis elegans, we show that a single dynamics connects posture-scale fluctuations with trajectory diffusion and longer-lived behavioral states. We take short posture sequences as an instantaneous behavioral measure, fixing the sequence length for maximal prediction. Within the space of posture sequences, we construct a fine-scale, maximum entropy partition so that transitions among microstates define a high-fidelity Markov model, which we also use as a means of principled coarse-graining. We translate these dynamics into movement using resistive force theory, capturing the statistical properties of foraging trajectories. Predictive across scales, we leverage the longest-lived eigenvectors of the inferred Markov chain to perform a top-down subdivision of the worm's foraging behavior, revealing both "runs-and-pirouettes" as well as previously uncharacterized finer-scale behaviors. We use our model to investigate the relevance of these fine-scale behaviors for foraging success, recovering a trade-off between local and global search strategies.
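The pipeline of estimating a Markov chain from a discrete state sequence and then subdividing behavior top-down using its slowest eigenvector can be sketched as follows. The four-state toy chain below is an illustrative stand-in for the posture microstates, not worm data.

```python
import numpy as np

# Hedged sketch: build a transition matrix from a sequence of discrete
# "microstates" and use the sign structure of the slowest eigenvector to
# split the states into two coarse-grained sets (cf. runs vs. pirouettes).
rng = np.random.default_rng(1)
n_states = 4
# Two weakly coupled pairs of states -> one slow relaxation mode.
P_true = np.array([[0.45, 0.45, 0.05, 0.05],
                   [0.45, 0.45, 0.05, 0.05],
                   [0.05, 0.05, 0.45, 0.45],
                   [0.05, 0.05, 0.45, 0.45]])
seq = [0]
for _ in range(20000):                       # sample a long trajectory
    seq.append(rng.choice(n_states, p=P_true[seq[-1]]))

# Count transitions and row-normalize to estimate P.
C = np.zeros((n_states, n_states))
for i, j in zip(seq[:-1], seq[1:]):
    C[i, j] += 1
P = C / C.sum(axis=1, keepdims=True)

# Eigenvalue 1 gives the stationary mode; the next eigenvalue sets the
# longest relaxation timescale, and its eigenvector's signs partition states.
vals, vecs = np.linalg.eig(P.T)
order = np.argsort(-vals.real)
slow = vecs[:, order[1]].real
partition = slow > 0
print(partition)
```

The sign of the eigenvector is arbitrary, so the partition is defined only up to relabeling of the two groups; here it recovers the two weakly coupled pairs.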
Subject(s)
Animal Behavior, Caenorhabditis elegans, Markov Chains, Animals, Caenorhabditis elegans/physiology, Animal Behavior/physiology, Biological Models, Movement/physiology
ABSTRACT
How does social complexity depend on population size and cultural transmission? Kinship structures in traditional societies provide a fundamental illustration, where cultural rules between clans determine people's marriage possibilities. Here, we propose a simple model of kinship interactions that considers kin and in-law cooperation and sexual rivalry. In this model, multiple societies compete. Societies consist of multiple families with different cultural traits and mating preferences. These values determine interactions and hence the growth rate of families and are transmitted to offspring with mutations. Through a multilevel evolutionary simulation, family traits and preferences are grouped into multiple clans with interclan mating preferences. It illustrates the emergence of kinship structures as the spontaneous formation of interdependent cultural associations. Emergent kinship structures are characterized by the cycle length of marriage exchange and the number of cycles in society. We numerically and analytically clarify their parameter dependence. The relative importance of cooperation versus rivalry determines whether attraction or repulsion exists between families. Different structures evolve as locally stable attractors. The probabilities of formation and collapse of complex structures depend on the number of families and the mutation rate, showing characteristic scaling relationships. It is now possible to explore macroscopic kinship structures based on microscopic interactions, together with their environmental dependence and the historical causality of their evolution. We propose the basic causal mechanism of the formation of typical human social structures by referring to ethnographic observations and concepts from statistical physics and multilevel evolution. Such interdisciplinary collaboration will unveil universal features in human societies.
Subject(s)
Marriage, Population Density, Humans, Mutation Rate, Family, Cultural Evolution, Male, Mutation, Female, Theoretical Models, Culture
ABSTRACT
Stochastic processes on graphs can describe a great variety of phenomena ranging from neural activity to epidemic spreading. While many existing methods can accurately describe typical realizations of such processes, computing properties of extremely rare events is a hard task, particularly so in the case of recurrent models, in which variables may return to a previously visited state. Here, we build on the matrix product cavity method, extending it fundamentally in two directions: First, we show how it can be applied to Markov processes biased by arbitrary reweighting factors that concentrate most of the probability mass on rare events. Second, we introduce an efficient scheme to reduce the computational cost of a single node update from exponential to polynomial in the node degree. Two applications are considered: inference of infection probabilities from sparse observations within the SIRS epidemic model and the computation of both typical observables and large deviations of several kinetic Ising models.
ABSTRACT
Phase separation has emerged as an essential concept for the spatial organization inside biological cells. However, despite its clear relevance to virtually all physiological functions, we understand surprisingly little about what phases form in a system of many interacting components, as in cells. Here we introduce a numerical method based on physical relaxation dynamics to study the coexisting phases in such systems. We use our approach to optimize interactions between components, similar to how evolution might have optimized the interactions of proteins. These evolved interactions robustly lead to a defined number of phases, despite substantial uncertainties in the initial composition, while random or designed interactions perform much worse. Moreover, the optimized interactions are robust to perturbations, and they allow fast adaptation to new target phase counts. We thus show that genetically encoded interactions of proteins provide versatile control of phase behavior. The phases forming in our system are also a concrete example of a robust emergent property that does not rely on fine-tuning the parameters of individual constituents.
Subject(s)
Biomolecular Condensates, Cells, Physical Phenomena, Theoretical Models, Proteins
ABSTRACT
Network effects are the added value derived solely from the popularity of a product in an economic market. Using agent-based models inspired by statistical physics, we propose a minimal theory of a competitive market for (nearly) indistinguishable goods with demand-side network effects, sold by statistically identical sellers. With weak network effects, the model reproduces conventional microeconomics: there is a statistical steady state of (nearly) perfect competition. Increasing network effects, we find a phase transition to a robust nonequilibrium phase driven by the spontaneous formation and collapse of fads in the market. When sellers update prices sufficiently quickly, an emergent monopolist can capture the market and undercut competition, leading to a symmetry- and ergodicity-breaking transition. The nonequilibrium phase simultaneously exhibits three empirically established phenomena not contained in the standard theory of competitive markets: spontaneous price fluctuations, persistent seller profits, and broad distributions of firm market shares.
ABSTRACT
All life on Earth is unified by its use of a shared set of component chemical compounds and reactions, providing a detailed model for universal biochemistry. However, this notion of universality is specific to known biochemistry and does not allow quantitative predictions about examples not yet observed. Here, we introduce a more generalizable concept of biochemical universality that is more akin to the kind of universality found in physics. Using annotated genomic datasets including an ensemble of 11,955 metagenomes, 1,282 archaea, 11,759 bacteria, and 200 eukaryotic taxa, we show how enzyme functions form universality classes with common scaling behavior in their relative abundances across the datasets. We verify that these scaling laws are not explained by the presence of compounds, reactions, and enzyme functions shared across known examples of life. We demonstrate how these scaling laws can be used as a tool for inferring properties of ancient life by comparing their predictions with a consensus model for the last universal common ancestor (LUCA). We also illustrate how network analyses shed light on the functional principles underlying the observed scaling behaviors. Together, our results establish the existence of a new kind of biochemical universality, independent of the details of life on Earth's component chemistry, with implications for guiding our search for missing biochemical diversity on Earth or for biochemistries that might deviate from the exact chemical makeup of life as we know it, such as at the origins of life, in alien environments, or in the design of synthetic life.
Subject(s)
Biochemical Phenomena, Enzymes/metabolism, Earth (Planet), Origin of Life, Synthetic Biology
ABSTRACT
Modern theories of phase transitions and scale invariance are rooted in path integral formulation and renormalization groups (RGs). Despite the applicability of these approaches in simple systems with only pairwise interactions, they are less effective in complex systems with undecomposable high-order interactions (i.e. interactions among arbitrary sets of units). To precisely characterize the universality of high-order interacting systems, we propose a simplex path integral and a simplex RG (SRG) as the generalizations of classic approaches to arbitrary high-order and heterogeneous interactions. We first formalize the trajectories of units governed by high-order interactions to define path integrals on corresponding simplices based on a high-order propagator. Then, we develop a method to integrate out short-range high-order interactions in the momentum space, accompanied by a coarse graining procedure functioning on the simplex structure generated by high-order interactions. The proposed SRG, equipped with a divide-and-conquer framework, can deal with the absence of ergodicity arising from the sparse distribution of high-order interactions and can renormalize a system with intertwined high-order interactions at the p-order according to its properties at the q-order (p ⩽ q). The associated scaling relation and its corollaries provide support to differentiate among scale-invariant, weakly scale-invariant, and scale-dependent systems across different orders. We validate our theory in multi-order scale-invariance verification, topological invariance discovery, organizational structure identification, and information bottleneck analysis. These experiments demonstrate the capability of our theory to identify intrinsic statistical and topological properties of high-order interacting systems during system reduction.
ABSTRACT
The current investigation reports the use of an adaptive neuro-fuzzy inference system (ANFIS) and an artificial neural network (ANN), two recognized machine learning techniques, in modelling tetracycline (TC) adsorption onto activated carbon (AC) derived from Cynometra ramiflora fruit biomass. Multiple characterization methods confirmed the porous structure of the synthesized AC. The ANN and ANFIS models used pH, dose, initial TC concentration, mixing speed, contact time, and temperature as input parameters, with TC removal percentage as the output parameter. The optimized configuration for the ANN model was determined to be 6-8-1, while the ANFIS model employed trimf input and linear output membership functions. The results showed a strong correlation, indicated by high R2 values (ANN R2 = 0.9939; ANFIS R2 = 0.9906) and low RMSE values (ANN RMSE = 0.0393; ANFIS RMSE = 0.0503). Apart from traditional isotherms, the dataset was fitted to statistical physics models, wherein the double-layer model with a single energy satisfactorily explained the physisorption mechanism of TC adsorption. The sorption energy was 21.06 kJ/mol, and the number of TC moieties bound per site (n) was 0.42, indicating parallel binding of TC molecules to the adsorbent surface. The adsorption capacity at saturation (Qsat) was estimated to be 466.86 mg/g, appreciably higher than previously reported values. These findings collectively demonstrate that the AC derived from C. ramiflora fruit holds great potential for efficient removal of TC, and that machine learning approaches can effectively model the adsorption process.
Subject(s)
Biomass, Charcoal, Machine Learning, Neural Networks (Computer), Tetracycline, Adsorption, Tetracycline/chemistry, Tetracycline/analysis, Charcoal/chemistry, Fruit/chemistry, Chemical Water Pollutants/chemistry, Chemical Water Pollutants/analysis
ABSTRACT
In the present study, β-cyclodextrin-modified magnetic graphene oxide/cellulose (CN/IGO/Cel) was fabricated for the removal of Cd(II) ions. The material was characterized by various analytical techniques, including FTIR, XRD, TGA/DTA, SEM, TEM, and XPS. The point of zero charge of the material was 5.38. The controllable factors were optimized by Taguchi design; the optimum values were adsorbent dose 16 mg, equilibrium time 40 min, and initial Cd(II) concentration 40 mg/L. The material shows high adsorption capacity (303.98 mg/g). The good fit of the Langmuir model to the adsorption data (R2 = 0.9918-0.9936) revealed monolayer coverage on the adsorbent surface. The statistical physics model M2 showed the best fit to the adsorption data (R2 > 0.997), suggesting that binding of Cd(II) ions occurred on two different receptor sites (n). Sterically, n > 1 confirms a vertical multi-molecular mechanism of Cd(II) adsorption on the CN/IGO/Cel surface. The adsorption energies (E1 = 23.71-28.95 kJ/mol; E2 = 22.69-29.38 kJ/mol) indicated the involvement of physical forces in Cd(II) adsorption. Kinetic data fitted well to a fractal-like pseudo-first-order model (R2 > 0.9952), indicating that adsorption of Cd(II) ions occurred on an energetically heterogeneous surface. The kinetic analysis shows that both film diffusion and pore diffusion were responsible for Cd(II) uptake. XPS analysis was used to explain the adsorption mechanism of Cd(II) ions onto CN/IGO/Cel.
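The Langmuir analysis used above to diagnose monolayer coverage can be illustrated with its standard linearized form; the synthetic equilibrium data and parameter values below are illustrative, not the study's measurements.

```python
import numpy as np

# Hedged sketch of the linearized Langmuir fit:
# q = q_max*K*C/(1 + K*C)  <=>  C/q = C/q_max + 1/(K*q_max).
# Synthetic data with illustrative parameters.
q_max, K = 300.0, 0.12                       # mg/g, L/mg
C_e = np.array([5.0, 10, 20, 40, 80, 160])   # equilibrium conc., mg/L
q_e = q_max * K * C_e / (1 + K * C_e)        # adsorbed amount, mg/g

# Linear regression of C/q against C recovers both parameters.
slope, intercept = np.polyfit(C_e, C_e / q_e, 1)
q_max_hat = 1 / slope
K_hat = slope / intercept
print(round(q_max_hat, 1), round(K_hat, 3))  # 300.0 0.12
```

With real data, nonlinear least squares on the un-linearized isotherm is usually preferred, since the linearization distorts the error structure.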
Subject(s)
Graphite, Chemical Water Pollutants, beta-Cyclodextrins, Cadmium/analysis, Adsorption, Fractals, Cellulose, Kinetics, Magnetism, Magnetic Phenomena, beta-Cyclodextrins/analysis, Chemical Water Pollutants/analysis, Hydrogen-Ion Concentration
ABSTRACT
The selection of a single molecular handedness, or homochirality, across all living matter is a mystery in the origin of life. Frank's seminal model, introduced in the 1950s, showed how chiral symmetry breaking can occur in nonequilibrium chemical networks. However, an important shortcoming of this classic model is that it considers a small number of species, while there is no reason for the prebiotic system in which homochirality first appeared to have had such a simple composition. Furthermore, this model does not provide information on what the size of the molecules involved in this homochiral prebiotic system could have been. Here, we show that large molecular systems are likely to undergo a phase transition toward a homochiral state, as a consequence of the fact that they contain a large number of chiral species. Using chemoinformatics tools, we quantify how abundant chiral species are in the chemical universe of all possible molecules of a given length. Then, we propose that Frank's model should be extended to include a large number of species in order to exhibit the transition toward homochirality, as confirmed by numerical simulations. Finally, using random matrix theory, we prove that large nonequilibrium reaction networks possess a generic and robust phase transition toward a homochiral state.
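The symmetry breaking in Frank's classic two-species model can be reproduced in a few lines: autocatalysis of each enantiomer plus mutual antagonism amplifies any tiny initial imbalance. The rate constants and initial conditions below are illustrative choices, not values from the paper.

```python
# Minimal Euler integration of Frank's 1953 model:
#   dL/dt = L*(k - mu*D),  dD/dt = D*(k - mu*L)
# (autocatalysis at rate k, mutual antagonism at rate mu).
# Parameters and initial imbalance are illustrative.
k, mu = 1.0, 1.0
L, D = 1.01, 0.99      # tiny initial enantiomeric excess
dt = 0.001
for _ in range(10000):  # integrate to t = 10
    dL = L * (k - mu * D)
    dD = D * (k - mu * L)
    L += dt * dL
    D += dt * dD

ee = (L - D) / (L + D)  # enantiomeric excess
print(round(ee, 3))     # the small initial bias is amplified toward 1
```

Starting from the racemic point `L = D = k/mu` (an unstable fixed point), any fluctuation picks one handedness; the difference `L - D` grows exponentially while the minority enantiomer is suppressed.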
ABSTRACT
Despite the tremendous success of the stochastic gradient descent (SGD) algorithm in deep learning, little is known about how SGD finds generalizable solutions at flat minima of the loss function in high-dimensional weight space. Here, we investigate the connection between SGD learning dynamics and the loss function landscape. A principal component analysis (PCA) shows that SGD dynamics follow a low-dimensional drift-diffusion motion in the weight space. Around a solution found by SGD, the loss function landscape can be characterized by its flatness in each PCA direction. Remarkably, our study reveals a robust inverse relation between the weight variance and the landscape flatness in all PCA directions, which is the opposite of the fluctuation-response relation (also known as the Einstein relation) in equilibrium statistical physics. To understand this inverse variance-flatness relation, we develop a phenomenological theory of SGD based on statistical properties of the ensemble of minibatch loss functions. We find that both the anisotropic SGD noise strength (temperature) and its correlation time depend inversely on the landscape flatness in each PCA direction. Our results suggest that SGD serves as a landscape-dependent annealing algorithm. The effective temperature decreases with the landscape flatness, so the system seeks out (prefers) flat minima over sharp ones. Based on these insights, an algorithm with landscape-dependent constraints is developed to mitigate catastrophic forgetting efficiently when learning multiple tasks sequentially. In general, our work provides a theoretical framework for understanding learning dynamics, which may eventually lead to better algorithms for different learning tasks.
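The PCA diagnostic described above can be sketched on a simulated trajectory: collect weight vectors over time, subtract the mean, and diagonalize the covariance. An anisotropic drift-diffusion motion shows up as variance concentrated in a few leading directions. The trajectory, dimensionality, and step scales below are illustrative assumptions, not weights from a trained network.

```python
import numpy as np

# Simulated anisotropic random walk standing in for an SGD weight
# trajectory: large diffusion steps in 3 directions, tiny ones elsewhere.
rng = np.random.default_rng(0)
T, dim = 2000, 50
scales = np.full(dim, 0.01)
scales[:3] = 1.0
traj = np.cumsum(rng.normal(size=(T, dim)) * scales, axis=0)

# PCA: center the trajectory and diagonalize its covariance matrix.
X = traj - traj.mean(axis=0)
cov = X.T @ X / (T - 1)
variances = np.sort(np.linalg.eigvalsh(cov))[::-1]
explained = variances[:3].sum() / variances.sum()
print(explained > 0.95)  # almost all variance lies in 3 PCA directions
```

In the paper's setting, the per-direction variances would then be compared against the loss-landscape flatness measured along the same PCA directions.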
ABSTRACT
We analyze about 200 naturally occurring networks with distinct dynamical origins to formally test whether the commonly assumed hypothesis of an underlying scale-free structure is generally viable. This has recently been questioned on the basis of statistical testing of the validity of power law distributions of network degrees. Specifically, we analyze the datasets of real networks by finite size scaling analysis to check whether the purported departures from power law behavior are due to the finiteness of sample size. We find that a large number of the networks follow a finite size scaling hypothesis without any self-tuning. This is the case for biological protein interaction networks, technological computer and hyperlink networks, and informational networks in general. Marked deviations appear in other cases, especially those involving infrastructure and transportation, but also in social networks. We conclude that underlying scale invariance properties of many naturally occurring networks are extant features, often clouded by finite size effects due to the nature of the sample data.
ABSTRACT
This paper describes the production of cobalt oxide nanoparticles and their use in the adsorption of methylene blue (MB) from solution. The X-ray diffraction patterns show that the synthesized cobalt oxide nanoparticles have a crystalline cubic structure. The adsorption study examined the effects of contact time and initial MB concentration on uptake by the adsorbent. The kinetics of adsorption were analyzed using two kinetic models (pseudo-first order and pseudo-second order), and the pseudo-second-order model was found to describe the adsorption behavior best. The study indicates that the MLTS (monolayer with the same number of molecules per site) model is the most suitable for describing the methylene blue/cobalt oxide system, and its steric parameter values help to further elucidate the adsorption process, indicating that methylene blue is adsorbed horizontally onto the cobalt oxide surface, bonded to two different receptor sites. Regarding the temperature effect, the adsorption capacity increased with temperature, with experimental values ranging from 313.7 to 405.3 mg g-1, while the MLTS model predicted 313.32 and 408.16 mg g-1. From the thermodynamic functions, high entropy was found around a concentration of 280 mg L-1. For all concentrations and temperatures examined, the Gibbs free energy was negative and the enthalpy of adsorption positive, indicating that the adsorption is spontaneous and endothermic. According to these findings, methylene blue adsorption onto cobalt oxide nanoparticles proceeds via the formation of a monolayer in which the same number of molecules is adsorbed at two distinct sites. The findings shed light on the adsorption of methylene blue onto cobalt oxide nanoparticles, which have a variety of uses, including the remediation of wastewater.
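The pseudo-second-order kinetic analysis mentioned above is conventionally done with the linearized form of the model; the synthetic kinetic data and parameter values below are illustrative, not the study's measurements.

```python
import numpy as np

# Hedged sketch of the linearized pseudo-second-order fit:
# q_t = k2*q_e^2*t / (1 + k2*q_e*t)  <=>  t/q_t = 1/(k2*q_e^2) + t/q_e.
# Synthetic data with illustrative parameters.
q_e, k2 = 350.0, 0.001                         # mg/g, g/(mg*min)
t = np.array([5.0, 10, 20, 40, 60, 90, 120])   # min
q_t = k2 * q_e**2 * t / (1 + k2 * q_e * t)

# Linear regression of t/q_t against t recovers both parameters.
slope, intercept = np.polyfit(t, t / q_t, 1)
q_e_hat = 1 / slope
k2_hat = slope**2 / intercept   # since intercept = 1/(k2*q_e^2)
print(round(q_e_hat, 1), round(k2_hat, 4))  # 350.0 0.001
```

Model selection between pseudo-first- and pseudo-second-order fits is then typically based on the R2 of these linearized regressions and on agreement between fitted and experimental q_e.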
ABSTRACT
Studies of collective motion have heretofore been dominated by a thermodynamic perspective in which the emergent "flocked" phases are analyzed in terms of their time-averaged orientational and spatial properties. Studies that attempt to scrutinize the dynamical processes that spontaneously drive the formation of these flocks from initially random configurations are far rarer, perhaps because these processes occur far from the eventual long-time steady state of the system and thus lie outside the scope of traditional statistical mechanics. For systems whose dynamics are simulated numerically, the nonstationary distribution of system configurations can be sampled at different time points, and the time evolution of the average structural properties of the system can be quantified. In this paper, we employ this strategy to characterize the spatial dynamics of the standard Vicsek flocking model using two correlation functions common in condensed matter physics. We demonstrate, for modest system sizes with 800 to 2000 agents, that the self-assembly dynamics can be characterized by three distinct and disparate time scales, which we associate with the corresponding physical processes of clustering (compaction), relaxing (expansion), and mixing (rearrangement). We further show that the behavior of these correlation functions can be used to reliably distinguish between phenomenologically similar models with different underlying interactions and, in some cases, even provide a direct measurement of key model parameters.
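The standard Vicsek update referenced above is simple to state: each agent adopts the mean heading of its neighbors within a radius, plus angular noise, then moves at constant speed. The following minimal sketch uses a smaller system and illustrative parameter values than the 800-2000-agent simulations in the paper.

```python
import numpy as np

# Minimal Vicsek model in a periodic box. Parameters are illustrative.
rng = np.random.default_rng(0)
N, L_box, r, v0, eta, steps = 200, 10.0, 1.0, 0.3, 0.1, 400

pos = rng.uniform(0, L_box, size=(N, 2))
theta = rng.uniform(-np.pi, np.pi, size=N)

for _ in range(steps):
    # Pairwise displacements with periodic boundary conditions.
    d = pos[:, None, :] - pos[None, :, :]
    d -= L_box * np.round(d / L_box)
    neigh = (d**2).sum(-1) < r**2        # neighbor mask (includes self)
    # Mean heading of neighbors via vector averaging, plus noise.
    mx = (neigh * np.cos(theta)[None, :]).sum(1)
    my = (neigh * np.sin(theta)[None, :]).sum(1)
    theta = np.arctan2(my, mx) + eta * rng.uniform(-np.pi, np.pi, size=N)
    pos = (pos + v0 * np.column_stack([np.cos(theta), np.sin(theta)])) % L_box

# Polar order parameter: 1 = perfectly flocked, ~0 = disordered.
phi = np.hypot(np.cos(theta).sum(), np.sin(theta).sum()) / N
print(phi > 0.5)
```

At this low noise amplitude the system orders within a few hundred steps; the correlation functions discussed in the abstract would be computed from snapshots of `pos` and `theta` along the way.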
ABSTRACT
Brain-computer interfaces have seen extraordinary surges of development in recent years, and a significant discrepancy now exists between the abundance of available data and the limited headway made toward a unified theoretical framework. This discrepancy becomes particularly pronounced when examining collective neural activity at the micro- and mesoscale, where a coherent formalization that adequately describes neural interactions is still lacking. Here, we introduce a mathematical framework to analyze systems of natural neurons and interpret the related empirical observations in terms of lattice field theory, an established paradigm from theoretical particle physics and statistical mechanics. Our methods are tailored to interpret data from chronic neural interfaces, especially spike rasters from measurements of single neuron activity, and generalize the maximum entropy model for neural networks so that the time evolution of the system is also taken into account. This is obtained by bridging particle physics and neuroscience, paving the way for particle physics-inspired models of the neocortex.
ABSTRACT
Stochastic thermodynamics lays down a broad framework for revisiting the venerable concepts of heat, work and entropy production for individual stochastic trajectories of mesoscopic systems. Remarkably, this approach, relying on stochastic equations of motion, introduces time into the description of thermodynamic processes, which opens the way to controlling them finely. As a result, the field of finite-time thermodynamics of mesoscopic systems has blossomed. In this article, after introducing a few concepts of control for isolated mechanical systems evolving according to deterministic equations of motion, we review the different strategies that have been developed to realize finite-time state-to-state transformations in both overdamped and underdamped regimes, through the proper design of time-dependent control parameters/driving. The systems under study are stochastic, epitomized by a Brownian object immersed in a fluid; they are thus strongly coupled to their environment, which plays the role of a reservoir. Interestingly, a few of those methods (inverse engineering, counterdiabatic driving, fast-forward) are directly inspired by their counterparts in quantum control. The review also analyzes control through reservoir engineering. Besides the reachability of a given target state from a known initial state, the question of the optimal path is discussed. Optimality is here defined with respect to a cost function, a subject intimately related to the field of information thermodynamics and the question of speed limits. Another natural extension discussed deals with the connection between arbitrary states or non-equilibrium steady states. This field of control in stochastic thermodynamics enjoys a wealth of applications, ranging from optimal mesoscopic heat engines to population control in biological systems.
Subject(s)
Heat, Stochastic Processes, Thermodynamics, Entropy, Motion (Physics)
ABSTRACT
Species-rich communities, such as the microbiota or microbial ecosystems, provide key functions for human health and climatic resilience. Increasing effort is being dedicated to designing experimental protocols for selecting community-level functions of interest. These experiments typically involve selection acting on populations of communities, each of which is composed of multiple species. While numerical simulations have begun to explore the evolutionary dynamics of this complex, multi-scale system, a comprehensive theoretical understanding of the process of artificial selection of communities is still lacking. Here, we propose a general model for the evolutionary dynamics of communities composed of a large number of interacting species, described by disordered generalised Lotka-Volterra equations. Our analytical and numerical results reveal that selection for scalar community functions leads to the emergence, along an evolutionary trajectory, of a low-dimensional structure in an initially featureless interaction matrix. Such structure reflects the combination of the properties of the ancestral community and of the selective pressure. Our analysis determines how the speed of adaptation scales with the system parameters and the abundance distribution of the evolved communities. Artificial selection for larger total abundance is thus shown to drive increased levels of mutualism and interaction diversity. Inference of the interaction matrix is proposed as a method to assess the emergence of structured interactions from experimentally accessible measures.
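The building block of the model above, disordered generalized Lotka-Volterra dynamics with a random interaction matrix, can be sketched in a few lines. The system size, disorder strength, and scalar "community function" chosen below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

# Hedged sketch of disordered generalized Lotka-Volterra dynamics:
#   dN_i/dt = N_i * (1 - N_i + sum_j A_ij N_j)
# with a Gaussian random interaction matrix A (zero diagonal).
rng = np.random.default_rng(2)
S = 50                                   # number of species
sigma = 0.1                              # weak interaction disorder
A = rng.normal(0, sigma / np.sqrt(S), size=(S, S))
np.fill_diagonal(A, 0.0)

N = rng.uniform(0.5, 1.5, size=S)        # initial abundances
dt = 0.01
for _ in range(20000):                   # Euler integration to t = 200
    N += dt * N * (1 - N + A @ N)
    N = np.clip(N, 0, None)              # abundances stay non-negative

total = N.sum()                          # a scalar "community function"
print(N.min() > 0, round(total / S, 2))
```

In the artificial-selection setting, many such communities would be simulated in parallel, ranked by a function like `total`, and the best used to seed the next generation while the interaction matrix `A` evolves.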