Results 1 - 20 of 27
1.
Entropy (Basel) ; 23(3), 2021 Mar 04.
Article in English | MEDLINE | ID: mdl-33806454

ABSTRACT

Linear separability, a core concept in supervised machine learning, refers to whether the labels of a data set can be captured by the simplest possible machine: a linear classifier. In order to quantify linear separability beyond this single bit of information, one needs models of data structure parameterized by interpretable quantities, and tractable analytically. Here, I address one class of models with these properties, and show how a combinatorial method allows for the computation, in a mean field approximation, of two useful descriptors of linear separability, one of which is closely related to the popular concept of storage capacity. I motivate the need for multiple metrics by quantifying linear separability in a simple synthetic data set with controlled correlations between the points and their labels, as well as in the benchmark data set MNIST, where the capacity alone paints an incomplete picture. The analytical results indicate a high degree of "universality", or robustness with respect to the microscopic parameters controlling data structure.
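
As a point of reference for the "storage capacity" mentioned above, the following minimal sketch evaluates Cover's classical function-counting formula for unstructured data: p points in general position in n dimensions, classified by a linear classifier through the origin. It is not the structured-data model of the article; the function names `cover_count` and `separable_fraction` are hypothetical and chosen here for illustration only.

```python
from math import comb


def cover_count(p: int, n: int) -> int:
    """Number of dichotomies of p points (general position, n dimensions)
    that a linear classifier through the origin can realize.

    Cover's formula: C(p, n) = 2 * sum_{k=0}^{n-1} binom(p - 1, k).
    Assumes p >= 1 and n >= 1.
    """
    return 2 * sum(comb(p - 1, k) for k in range(n))


def separable_fraction(p: int, n: int) -> float:
    """Fraction of the 2**p possible labelings that are linearly separable."""
    return cover_count(p, n) / 2 ** p


if __name__ == "__main__":
    n = 10
    for p in (5, 10, 20, 40):
        print(p, separable_fraction(p, n))
```

In this unstructured setting the fraction equals 1 for p <= n and crosses 1/2 exactly at p = 2n, which is the usual definition of the capacity per dimension. A single number of this kind is the "single bit" summary that the abstract argues is insufficient for structured data.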

2.
Phys Rev Lett ; 125(12): 120601, 2020 Sep 18.
Article in English | MEDLINE | ID: mdl-33016711

ABSTRACT

Data structure has a dramatic impact on the properties of neural networks, yet its significance in the established theoretical frameworks is poorly understood. Here we compute the Vapnik-Chervonenkis entropy of a kernel machine operating on data grouped into equally labeled subsets. At variance with the unstructured scenario, entropy is nonmonotonic in the size of the training set, and displays an additional critical point besides the storage capacity. Remarkably, the same behavior occurs in margin classifiers even with randomly labeled data, as is elucidated by identifying the synaptic volume encoding the transition. These findings reveal aspects of expressivity lying beyond the condensed description provided by the storage capacity, and they indicate the path towards more realistic bounds for the generalization error of neural networks.

3.
Nucleic Acids Res ; 45(14): 8190-8198, 2017 Aug 21.
Article in English | MEDLINE | ID: mdl-28854733

ABSTRACT

Genome replication, a key process for a cell, relies on stochastic initiation by replication origins, causing a variability of replication timing from cell to cell. While stochastic models of eukaryotic replication are widely available, the link between the key parameters and overall replication timing has not been addressed systematically. We use a combined analytical and computational approach to calculate how positions and strength of many origins lead to a given cell-to-cell variability of total duration of the replication of a large region, a chromosome or the entire genome. Specifically, the total replication timing can be framed as an extreme-value problem, since it is due to the last region that replicates in each cell. Our calculations identify two regimes based on the spread between characteristic completion times of all inter-origin regions of a genome. For widely different completion times, timing is set by the single specific region that is typically the last to replicate in all cells. Conversely, when the completion times of all regions are comparable, an extreme-value estimate shows that the cell-to-cell variability of genome replication timing has universal properties. Comparison with available data shows that the replication program of three yeast species falls in this extreme-value regime.


Subjects
Algorithms , DNA Replication Timing/genetics , Genome/genetics , Genetic Models , Replication Origin/genetics , S Phase/genetics , Fungal Chromosomes/genetics , Computational Biology/methods , Kinetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Saccharomycetales/cytology , Saccharomycetales/genetics , Schizosaccharomyces/cytology , Schizosaccharomyces/genetics , Species Specificity , Stochastic Processes
4.
Soft Matter ; 14(29): 6128-6136, 2018 Jul 25.
Article in English | MEDLINE | ID: mdl-29998272

ABSTRACT

Motivated by the problem of domain formation in chromosomes, we studied a co-polymer model where only a subset of the monomers feel attractive interactions. These monomers are displaced randomly from a regularly-spaced pattern, thus introducing some quenched disorder in the system. Previous work has shown that in the case of regularly-spaced interacting monomers this chain can fold into structures characterized by multiple distinct domains of consecutive segments. In each domain, attractive interactions are balanced by the entropy cost of forming loops. We show by advanced replica-exchange simulations that adding disorder in the position of the interacting monomers further stabilizes these domains. The model suggests that the partitioning of the chain into well-defined domains of consecutive monomers is a spontaneous property of heteropolymers. In the case of chromosomes, evolution could have acted on the spacing of interacting monomers to modulate in a simple way the underlying domains for functional reasons.


Subjects
Chromosomes/chemistry , Chromosomes/metabolism , Molecular Models , Polymers/chemistry , Entropy , Normal Distribution
5.
Phys Rev Lett ; 116(25): 256803, 2016 Jun 24.
Article in English | MEDLINE | ID: mdl-27391740

ABSTRACT

After more than three decades, the fractional quantum Hall effect still poses challenges to contemporary physics. Recent experiments point toward a fractal scenario for the Hall resistivity as a function of the magnetic field. Here, we consider the so-called thin-torus limit of the Hamiltonian describing interacting electrons in a strong magnetic field, restricted to the lowest Landau level, and we show that it can be mapped onto a one-dimensional lattice gas with repulsive interactions, with the magnetic field playing the role of the chemical potential. The statistical mechanics of such models leads us to interpret the sequence of Hall plateaux as a fractal phase diagram whose landscape shows a qualitative agreement with experiments.

6.
Proc Natl Acad Sci U S A ; 110(52): 21054-8, 2013 Dec 24.
Article in English | MEDLINE | ID: mdl-24324175

ABSTRACT

The development of a complex system depends on the self-coordinated action of a large number of agents, often determining unexpected global behavior. The case of software evolution has great practical importance: knowledge of what is to be considered atypical can guide developers in recognizing and reacting to abnormal behavior. Although the initial framework of a theory of software exists, the current theoretical achievements do not fully capture existing quantitative data or predict future trends. Here we show that two elementary laws describe the evolution of package sizes in a Linux-based operating system: first, relative changes in size follow a random walk with non-Gaussian jumps; second, each size change is bounded by a limit that is dependent on the starting size, an intriguing behavior that we call "soft bound." Our approach is based on data analysis and on a simple theoretical model, which is able to reproduce empirical details without relying on any adjustable parameter and generates definite predictions. The same analysis allows us to formulate and support the hypothesis that a similar mechanism is shaping the distribution of mammalian body sizes, via size-dependent constraints during cladogenesis. Whereas generally accepted approaches struggle to reproduce the large-mass shoulder displayed by the distribution of extant mammalian species, this is a natural consequence of the softly bounded nature of the process. Additionally, the hypothesis that this model is valid has the relevant implication that, contrary to a common assumption, mammalian masses are still evolving, albeit very slowly.


Subjects
Biological Evolution , Body Size/physiology , Mammals/growth & development , Theoretical Models , Software/statistics & numerical data , Software/trends , Animals , Computer Simulation , Stochastic Processes
7.
Nat Genet ; 54(7): 976-984, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35817983

ABSTRACT

Compelling evidence shows that cancer persister cells represent a major limit to the long-term efficacy of targeted therapies. However, the phenotype and population dynamics of cancer persister cells remain unclear. We developed a quantitative framework to study persisters by combining experimental characterization and mathematical modeling. We found that, in colorectal cancer, a fraction of persisters slowly replicates. Clinically approved targeted therapies induce a switch to drug-tolerant persisters and a temporary 7- to 50-fold increase of their mutation rate, thus increasing the number of persister-derived resistant cells. These findings reveal that treatment may influence persistence and mutability in cancer cells and pinpoint inhibition of error-prone DNA polymerases as a strategy to restrict tumor recurrence.


Subjects
Colorectal Neoplasms , Mutation Rate , Anti-Bacterial Agents/pharmacology , Colorectal Neoplasms/drug therapy , Colorectal Neoplasms/genetics , Humans , Population Dynamics
8.
Elife ; 10, 2021 May 20.
Article in English | MEDLINE | ID: mdl-34013887

ABSTRACT

Recent results comparing the temporal program of genome replication of yeast species belonging to the Lachancea clade support the scenario that the evolution of the replication timing program could be mainly driven by correlated acquisition and loss events of active replication origins. Using these results as a benchmark, we develop an evolutionary model defined as a birth-death process for replication origins and use it to identify the evolutionary biases that shape the replication timing profiles. Comparing different evolutionary models with data, we find that replication origin birth and death events are mainly driven by two evolutionary pressures: the first penalizes events leading to a higher double-stall probability of replication forks, while the second makes less efficient origins more prone to evolutionary loss. This analysis provides an empirically grounded predictive framework for quantitative evolutionary studies of the replication timing program.


Subjects
DNA Replication , Fungal DNA/biosynthesis , Fungal DNA/genetics , Molecular Evolution , Fungal Genome , Genetic Models , Saccharomycetales/genetics , Computer Simulation , DNA Replication Timing , Fungal Gene Expression Regulation , Phylogeny , Replication Origin , Saccharomycetales/classification , Saccharomycetales/growth & development
9.
Phys Rev E ; 102(3-1): 032119, 2020 Sep.
Article in English | MEDLINE | ID: mdl-33075947

ABSTRACT

The traditional approach of statistical physics to supervised learning routinely assumes unrealistic generative models for the data: usually, inputs are independent random variables, uncorrelated with their labels. Only recently have statistical physicists started to explore more complex forms of data, such as equally labeled points lying on (possibly low-dimensional) object manifolds. Here we provide a bridge between this recently established research area and the framework of statistical learning theory, a branch of mathematics devoted to inference in machine learning. The overarching motivation is the inadequacy of the classic rigorous results in explaining the remarkable generalization properties of deep learning. We propose a way to integrate physical models of data into statistical learning theory and address, with both combinatorial and statistical mechanics methods, the computation of the Vapnik-Chervonenkis entropy, which counts the number of different binary classifications compatible with the loss class. As a proof of concept, we focus on kernel machines and on two simple realizations of data structure introduced in recent physics literature: k-dimensional simplexes with prescribed geometric relations and spherical manifolds (equivalent to margin classification). Entropy, contrary to what happens for unstructured data, is nonmonotonic in the sample size, in contrast with the rigorous bounds. Moreover, data structure induces a transition beyond the storage capacity, which we advocate as a proxy of the nonmonotonicity, and ultimately a cue of low generalization error. The identification of a synaptic volume vanishing at the transition allows a quantification of the impact of data structure within replica theory, applicable in cases where combinatorial methods are not available, as we demonstrate for margin learning.

10.
Phys Rev E ; 102(1-1): 012306, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32794907

ABSTRACT

Many machine learning algorithms used for dimensional reduction and manifold learning rely on computing the nearest neighbors of each point of a data set to perform their tasks. These proximity relations define a so-called geometric graph, where two nodes are linked if they are sufficiently close to each other. Random geometric graphs, where the positions of nodes are randomly generated in a subset of R^d, offer a null model to study typical properties of data sets and of machine learning algorithms. Up to now, most of the literature has focused on the characterization of low-dimensional random geometric graphs, whereas typical data sets of interest in machine learning live in high-dimensional spaces (d ≫ 10^2). In this work, we consider the infinite-dimensional limit of hard and soft random geometric graphs and we show how to compute the average number of subgraphs of given finite size k, e.g., the average number of k-cliques. This analysis highlights that local observables display different behaviors depending on the chosen ensemble: soft random geometric graphs with continuous activation functions converge to the naive infinite-dimensional limit provided by Erdős-Rényi graphs, whereas hard random geometric graphs can show systematic deviations from it. We present numerical evidence that our analytical results, exact in infinite dimensions, provide a good approximation also for dimension d ≳ 10.
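
The construction of a hard random geometric graph described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the article: points are drawn uniformly in the unit hypercube [0, 1]^d, two nodes are linked when their Euclidean distance is at most a fixed `radius`, and triangles (3-cliques) are counted by brute force. The names `hard_rgg` and `count_triangles` are hypothetical.

```python
import random
from itertools import combinations
from math import dist


def hard_rgg(num_nodes: int, d: int, radius: float, seed: int = 0):
    """Sample a hard random geometric graph in the unit hypercube [0, 1]^d.

    Two nodes are linked if their Euclidean distance is <= radius.
    Returns the sampled points and the set of edges (i, j) with i < j.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(d)] for _ in range(num_nodes)]
    edges = {
        (i, j)
        for i, j in combinations(range(num_nodes), 2)
        if dist(points[i], points[j]) <= radius
    }
    return points, edges


def count_triangles(num_nodes: int, edges: set) -> int:
    """Count 3-cliques by checking every triple of nodes (O(n^3), small n only)."""
    return sum(
        1
        for i, j, k in combinations(range(num_nodes), 3)
        if (i, j) in edges and (j, k) in edges and (i, k) in edges
    )


if __name__ == "__main__":
    _, edges = hard_rgg(num_nodes=50, d=10, radius=1.2)
    print(len(edges), count_triangles(50, edges))
```

Averaging `count_triangles` over many seeds and comparing it with the value expected for an Erdős-Rényi graph of the same edge density is one simple way to probe numerically the kind of deviation the abstract refers to.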

11.
Pneumonol Alergol Pol ; 77(2): 173-9, 2009.
Article in English | MEDLINE | ID: mdl-19462352

ABSTRACT

As many patients with chronic obstructive pulmonary disease (COPD) die each year as those with lung cancer, but current guidelines make few recommendations on the care of the most severe patients, i.e. those with Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages III and IV with chronic respiratory failure. Only smoking cessation and long-term oxygen therapy (LTOT) improve survival in COPD. Although non-invasive positive pressure ventilation (NPPV) may have an adjunctive role in the management of chronic respiratory insufficiency, there is little evidence for its use in the routine management of stable hypercapnic COPD patients. By contrast, several prospective, randomised, controlled studies, systematic reviews and meta-analyses show a good level of evidence for the clinical efficacy of NPPV in the treatment of acute-on-chronic respiratory failure due to acute exacerbations of COPD. NPPV is also an alternative to invasive ventilation for symptom relief in end-stage COPD. Surgical interventions for end-stage COPD, such as bullectomy, different modalities of lung volume reduction surgery and lung transplantation, are likely to be of value to only a small percentage of patients. Nevertheless, there are specific indications which, when added to pulmonary rehabilitation, will further advance exercise capacity and quality of life. As in other chronic diseases, when the severity of disease increases along the natural history, therapy aimed at prolonging life becomes less and less important in comparison to palliative therapy aimed at relieving symptoms. The most effective treatments for dyspnoea are bronchodilators, although opiates may also improve dyspnoea. Supplemental oxygen reduces exertional breathlessness and improves exercise tolerance in hypoxaemic COPD patients. Treating frail and elderly COPD patients with antidepressants is difficult. Good clinical care can prevent or alleviate suffering by assessing symptoms and providing psychological and social support to the patients and their families.


Subjects
Palliative Care/methods , Chronic Obstructive Pulmonary Disease/therapy , Quality of Life , Disease Progression , Dyspnea/etiology , Dyspnea/prevention & control , Humans , Oxygen/therapeutic use , Oxygen Inhalation Therapy/methods , Prognosis , Smoking Cessation/methods , Social Support
12.
Sci Rep ; 9(1): 17133, 2019 Nov 20.
Article in English | MEDLINE | ID: mdl-31748557

ABSTRACT

Identifying the minimal number of parameters needed to describe a dataset is a challenging problem known in the literature as intrinsic dimension estimation. No existing intrinsic dimension estimator is reliable when the dataset is locally undersampled, and this is at the core of the so-called curse of dimensionality. Here we introduce a new intrinsic dimension estimator that leverages simple properties of the tangent space of a manifold and extends the usual correlation integral estimator to alleviate the extreme undersampling problem. Based on this insight, we explore a multiscale generalization of the algorithm that is capable of (i) identifying multiple dimensionalities in a dataset, and (ii) providing accurate estimates of the intrinsic dimension of extremely curved manifolds. We test the method on manifolds generated from global transformations of high-contrast images, relevant for invariant object recognition and considered a challenge for state-of-the-art intrinsic dimension estimators.

13.
J R Soc Interface ; 16(154): 20190101, 2019 May 31.
Article in English | MEDLINE | ID: mdl-31039692

ABSTRACT

Characterizing the spatio-temporal evolution of networks is a central topic in many disciplines. While network expansion has been studied thoroughly, less is known about how empirical networks behave when shrinking. For transportation networks, this is especially relevant on account of their connection with the socio-economic substrate, and we focus here on the evolution of the French railway network from its birth in 1840 to 2000, in relation to the country's demographic dynamics. The network evolved in parallel with technology (e.g. faster trains) and under strong constraints, such as preserving a good population coverage and balancing cost and efficiency. We show that the shrinking phase that started in 1930 decreased the total length of the network while preserving efficiency and population coverage: efficiency and robustness remained remarkably constant while the total length of the network shrank by 50% between 1930 and 2000, and the total travel time and time-diameter decreased by more than 75% during the same period. Moreover, shrinking the network did not affect overall accessibility: the average travel time has decreased steadily since the network's formation. This evolution leads naturally to an increase in transportation multimodality (such as a massive use of cars) and shows the importance of considering together transportation modes acting at different spatial scales. More generally, our results suggest that shrinking is not necessarily associated with a decay in performance and functions but can be beneficial in terms of design goals and can be part of the natural evolution of an adaptive network.


Subjects
Theoretical Models , Railroads/history , France , 19th Century History , 20th Century History , 21st Century History
14.
Phys Rev E ; 99(3-1): 032310, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30999432

ABSTRACT

To measure, predict, and prevent social segregation, it is necessary to understand the factors that cause it. While in most available descriptions space plays an essential role, one outstanding question is whether and how this phenomenon is possible in a well-mixed social network. We define and solve a simple model of segregation on networks based on discrete convictions. In our model, space does not play a role, and individuals never change their conviction, but they may choose to connect socially to other individuals based on two criteria: sharing the same conviction and individual popularity (regardless of conviction). The tradeoff between these two moves defines a parameter, analogous to the "tolerance" parameter in classical models of spatial segregation. We show numerically and analytically that this parameter determines a true phase transition (somewhat reminiscent of phase separation in a binary mixture) between a well-mixed and a segregated state. Additionally, minority convictions segregate faster and inter-specific aversion alone may lead to a segregation threshold with similar properties. Together, our results highlight the general principle that a segregation transition is possible in the absence of spatial degrees of freedom, provided that conviction-based rewiring occurs on the same time scale as popularity-based rewiring.


Subjects
Psychological Models , Social Segregation/psychology , Thinking , Computer Simulation , Humans , Prejudice , Probability , Social Behavior
15.
Science ; 366(6472): 1473-1480, 2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31699882

ABSTRACT

The emergence of drug resistance limits the efficacy of targeted therapies in human tumors. The prevalent view is that resistance is a fait accompli: when treatment is initiated, cancers already contain drug-resistant mutant cells. Bacteria exposed to antibiotics transiently increase their mutation rates (adaptive mutability), thus improving the likelihood of survival. We investigated whether human colorectal cancer (CRC) cells likewise exploit adaptive mutability to evade therapeutic pressure. We found that epidermal growth factor receptor (EGFR)/BRAF inhibition down-regulates mismatch repair (MMR) and homologous recombination DNA-repair genes and concomitantly up-regulates error-prone polymerases in drug-tolerant (persister) cells. MMR proteins were also down-regulated in patient-derived xenografts and tumor specimens during therapy. EGFR/BRAF inhibition induced DNA damage, increased mutability, and triggered microsatellite instability. Thus, like unicellular organisms, tumor cells evade therapeutic pressures by enhancing mutability.


Subjects
Colorectal Neoplasms/drug therapy , Colorectal Neoplasms/genetics , DNA Mismatch Repair/genetics , Antineoplastic Drug Resistance/genetics , ErbB Receptors/antagonists & inhibitors , Molecular Targeted Therapy , Mutagenesis , Proto-Oncogene Proteins B-raf/antagonists & inhibitors , Biological Adaptation/genetics , Down-Regulation , Humans , Genetic Selection
16.
Phys Rev E ; 97(5-1): 052109, 2018 May.
Article in English | MEDLINE | ID: mdl-29906886

ABSTRACT

The traveling-salesman problem is one of the most studied combinatorial optimization problems, because of the simplicity in its statement and the difficulty in its solution. We characterize the optimal cycle for every convex and increasing cost function when the points are thrown independently and with an identical probability distribution in a compact interval. We compute the average optimal cost for every number of points when the distance function is the square of the Euclidean distance. We also show that the average optimal cost is not a self-averaging quantity by explicitly computing the variance of its distribution in the thermodynamic limit. Moreover, we prove that the cost of the optimal cycle is not smaller than twice the cost of the optimal assignment of the same set of points. Interestingly, this bound is saturated in the thermodynamic limit.

17.
Phys Rev E ; 98(1-1): 012315, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30110773

ABSTRACT

Complex natural and technological systems can be considered, on a coarse-grained level, as assemblies of elementary components: for example, genomes as sets of genes or texts as sets of words. On one hand, the joint occurrence of components emerges from architectural and specific constraints in such systems. On the other hand, general regularities may unify different systems, such as the broadly studied Zipf and Heaps laws, respectively concerning the distribution of component frequencies and their number as a function of system size. Dependency structures (i.e., directed networks encoding the dependency relations between the components in a system) were proposed recently as a possible organizing principle underlying some of the regularities observed. However, the consequences of this assumption were explored only in binary component systems, where solely the presence or absence of components is considered, and multiple copies of the same component are not allowed. Here we consider a simple model that generates, from a given ensemble of dependency structures, a statistical ensemble of sets of components, allowing for components to appear with any multiplicity. Our model is a minimal extension that is memoryless and therefore accessible to analytical calculations. A mean-field analytical approach (analogous to the "Zipfian ensemble" in the linguistics literature) captures the relevant laws describing the component statistics, as we show by comparison with numerical computations. In particular, we recover a power-law Zipf rank plot, with a set of core components, and a Heaps law displaying three consecutive regimes (linear, sublinear, and saturating) that we characterize quantitatively.

18.
Methods Mol Biol ; 1624: 291-307, 2017.
Article in English | MEDLINE | ID: mdl-28842891

ABSTRACT

This chapter provides theoretical background and practical procedures for model-guided analysis of mobility of chromosomal loci from movies of many single trajectories. We guide the reader through existing physical models and measurable quantities, illustrating how this knowledge is useful for the interpretation of the measurements.


Subjects
Bacteria/genetics , Bacterial Chromosomes/chemistry , Bacteria/chemistry , Cell Nucleus/chemistry , Theoretical Models
19.
Phys Rev E ; 96(3-1): 032316, 2017 Sep.
Article in English | MEDLINE | ID: mdl-29346919

ABSTRACT

The costs associated with the length of links impose unavoidable constraints on the growth of natural and artificial transport networks. When future network developments cannot be predicted, the costs of building and maintaining connections cannot be minimized simultaneously, requiring competing optimization mechanisms. Here, we study a one-parameter nonequilibrium model driven by an optimization functional, defined as the convex combination of building cost and maintenance cost. By varying the coefficient of the combination, the model interpolates between global and local length minimization, i.e., between minimum spanning trees and a local version known as dynamical minimum spanning trees. We show that cost balance within this ensemble of dynamical networks is a sufficient ingredient for the emergence of tradeoffs between the network's total length and transport efficiency, and of optimal strategies of construction. At the transition between two qualitatively different regimes, the dynamics builds up power-law distributed waiting times between global rearrangements, indicating a point of nonoptimality. Finally, we use our model as a framework to analyze empirical ant trail networks, showing its relevance as a null model for cost-constrained network formation.

20.
Phys Rev E ; 96(4-1): 042402, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29347533

ABSTRACT

The short-time dynamics of bacterial chromosomal loci is a mixture of subdiffusive and active motion, in the form of rapid relocations with near-ballistic dynamics. While previous work has shown that such rapid motions are ubiquitous, we still have little grasp of their physical nature, and no positive model is available that describes them. Here, we propose a minimal theoretical model for loci movements as a fractional Brownian motion subject to a constant but intermittent driving force, and compare simulations and analytical calculations to data from high-resolution dynamic tracking in E. coli. This analysis yields the characteristic time scales for intermittency. Finally, we discuss the possible shortcomings of this model, and show that an increase in the effective local noise felt by the chromosome is associated with the active relocations.


Subjects
Bacterial Chromosomes , Genetic Models , Computer Simulation , Diffusion , Escherichia coli , Motion