ABSTRACT
MOTIVATION: As more behavioural assays are carried out in large-scale experiments on Drosophila larvae, the definitions of the archetypal actions of a larva are regularly refined. In addition, video recording and tracking technologies constantly evolve. Consequently, automatic tagging tools for Drosophila larval behaviour must be retrained to learn new representations from new data. However, existing tools cannot transfer knowledge from large amounts of previously accumulated data. We introduce LarvaTagger, a piece of software that combines a pre-trained deep neural network, providing a continuous latent representation of larva actions for stereotypical behaviour identification, with a graphical user interface to manually tag the behaviour and train new automatic taggers with the updated ground truth. RESULTS: We reproduced results from an automatic tagger with high accuracy, and we demonstrated that pre-training on large databases accelerates the training of a new tagger, achieving similar prediction accuracy using less data. AVAILABILITY AND IMPLEMENTATION: All the code is free and open source. Docker images are also available. See gitlab.pasteur.fr/nyx/LarvaTagger.jl.
Subjects
Animal Behavior , Drosophila , Larva , Software , Animals , Animal Behavior/physiology , Video Recording/methods , Computer Neural Networks
ABSTRACT
Physical and functional constraints on biological networks lead to complex topological patterns across multiple scales in their organization. A particular type of higher-order network feature that has received considerable interest is network motifs, defined as statistically regular subgraphs. These may implement fundamental logical and computational circuits and are referred to as "building blocks of complex networks". Their well-defined structures and small sizes also enable the testing of their functions in synthetic and natural biological experiments. Here, we develop a framework for motif mining based on lossless network compression using subgraph contractions. This provides an alternative definition of motif significance which allows us to compare different motifs and select the collectively most significant set of motifs as well as other prominent network features in terms of their combined compression of the network. Our approach inherently accounts for multiple testing and correlations between subgraphs and does not rely on a priori specification of an appropriate null model. It thus overcomes common problems in hypothesis testing-based motif analysis and guarantees robust statistical inference. We validate our methodology on numerical data and then apply it to synaptic-resolution biological neural networks, as a medium for comparative connectomics, by evaluating their respective compressibility and characterizing their inferred circuit motifs.
Subjects
Algorithms , Computational Biology , Computational Biology/methods , Nerve Net/physiology , Humans , Data Compression/methods
ABSTRACT
Numerous models have been developed to account for the complex properties of the random walks of biomolecules. However, when analysing experimental data, conditions are rarely met to ensure model identification. The dynamics may simultaneously be influenced by spatial and temporal heterogeneities of the environment, out-of-equilibrium fluxes and conformational changes of the tracked molecules. Recorded trajectories are often too short to reliably discern such multi-scale dynamics, which precludes unambiguous assessment of the type of random walk and its parameters. Furthermore, the motion of biomolecules may not be well described by a single, canonical random walk model. Here, we develop a two-step statistical testing scheme for comparing biomolecule dynamics observed in different experimental conditions without having to identify or make strong prior assumptions about the model generating the recorded random walks. We first train a graph neural network to perform simulation-based inference and thus learn a rich summary statistics vector describing individual trajectories. We then compare trajectories obtained in different biological conditions using a non-parametric maximum mean discrepancy (MMD) statistical test on their so-obtained summary statistics. This procedure allows us to characterise sets of random walks regardless of their generating models, without resorting to model-specific physical quantities or estimators. We first validate the relevance of our approach on numerically simulated trajectories. This demonstrates both the statistical power of the MMD test and the descriptive power of the learnt summary statistics compared to estimates of physical quantities.
We then illustrate the ability of our framework to detect changes in α-synuclein dynamics at synapses in cultured cortical neurons, in response to membrane depolarisation, and show that detected differences are largely driven by increased protein mobility in the depolarised state, in agreement with previous findings. The method provides a means of interpreting the differences it detects in terms of single trajectory characteristics. Finally, we emphasise the interest of performing various comparisons to probe the heterogeneity of experimentally acquired datasets at different levels of granularity (e.g., biological replicates, fields of view, and organelles).
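The second step of the scheme described above, a non-parametric MMD test on summary-statistics vectors, can be illustrated with a minimal sketch. It is not the authors' implementation. The Gaussian kernel, the median-distance bandwidth heuristic, the permutation-based p-value and all function names (`rbf_kernel`, `mmd2`, `mmd_permutation_test`) are assumptions chosen for illustration, and the inputs stand in for the learnt summary statistics, one row per trajectory.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (np.sum(a**2, axis=1)[:, None]
                + np.sum(b**2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd2(x, y, bandwidth):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

def mmd_permutation_test(x, y, n_permutations=500, seed=0):
    """Return (observed squared MMD, permutation p-value)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    # Assumed bandwidth choice: median pairwise distance of the pooled sample.
    diffs = pooled[:, None, :] - pooled[None, :, :]
    dists = np.sqrt((diffs**2).sum(axis=-1))
    bandwidth = np.median(dists[dists > 0])
    observed = mmd2(x, y, bandwidth)
    n = len(x)
    exceed = 0
    for _ in range(n_permutations):
        # Shuffle condition labels to sample the null distribution.
        perm = rng.permutation(len(pooled))
        stat = mmd2(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        exceed += stat >= observed
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

Because the null distribution is obtained by shuffling condition labels, the test makes no assumption about the model that generated the trajectories, which matches the model-free spirit of the abstract.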
Subjects
Computer Neural Networks , Proteins , Computer Simulation , Motion , Proteins/chemistry
ABSTRACT
MOTIVATION: Single-molecule localization microscopy allows studying the dynamics of biomolecules in cells and resolving the biophysical properties of the molecules and their environment underlying cellular function. With the continuously growing amount of data produced by individual experiments, the computational cost of quantifying these properties is increasingly becoming the bottleneck of single-molecule analysis. Mining these data requires an integrated and efficient analysis toolbox. RESULTS: We introduce TRamWAy, a modular Python library that features: (i) a conservative tracking procedure for localization data, (ii) a range of sampling techniques for meshing the spatio-temporal support of the data, (iii) computationally efficient solvers for inverse models, with the option of plugging in user-defined functions and (iv) a collection of analysis tools and a simple web-based interface. AVAILABILITY AND IMPLEMENTATION: TRamWAy is a Python library and can be installed with pip and conda. The source code is available at https://github.com/DecBayComp/TRamWAy.
Subjects
Software , Motion
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pcbi.1004579.].
ABSTRACT
We reanalyze trajectories of hOGG1 repair proteins diffusing on DNA. A previous analysis of these trajectories with the popular mean-squared-displacement approach revealed only simple diffusion. Here, a new optimal estimator of diffusion coefficients reveals two-state kinetics of the protein. A simple, solvable model, in which the protein randomly switches between a loosely bound, highly mobile state and a tightly bound, less mobile state is the simplest possible dynamic model consistent with the data. It yields accurate estimates of hOGG1's (i) diffusivity in each state, uncorrupted by experimental errors arising from shot noise, motion blur and thermal fluctuations of the DNA; (ii) rates of switching between states and (iii) rate of detachment from the DNA. The protein spends roughly equal time in each state. It detaches only from the loosely bound state, with a rate that depends on pH and the salt concentration in solution, while its rates for switching between states are insensitive to both. The diffusivity in the loosely bound state depends primarily on pH and is three to ten times higher than in the tightly bound state. We propose and discuss some new experiments that take full advantage of the new tools of analysis presented here.
Subjects
DNA Glycosylases/metabolism , DNA-Binding Proteins/metabolism , DNA/metabolism , Diffusion , Humans , Kinetics , Biological Models , Motion
ABSTRACT
We present a Bayesian framework for inferring spatio-temporal maps of diffusivity and potential fields from recorded trajectories of single molecules inside living cells. The framework naturally lets us regularise the high-dimensional inference problem using prior distributions in order to obtain robust results. To overcome the computational complexity of inferring thousands of map parameters from large single particle tracking datasets, we developed a stochastic optimisation method based on local mini-batches and parsimonious gradient calculation. We quantified the gain in convergence speed on numerical simulations, and we demonstrated for the first time temporal regularisation and aligned values of the inferred potential fields across multiple time segments. As a proof-of-concept, we mapped the dynamics of HIV-1 Gag proteins involved in the formation of virus-like particles (VLPs) on the plasma membrane of live T cells at high spatial and temporal resolutions. We focused on transient aggregation events lasting only one tenth of the time required for full VLP formation. The framework and optimisation methods are implemented in the TRamWAy open-source software platform for analysing single biomolecule dynamics.
Subjects
HIV-1/physiology , Single-Cell Analysis/methods , Human Immunodeficiency Virus gag Gene Products/metabolism , Bayes Theorem , Cell Membrane/virology , Biological Models , Spatio-Temporal Analysis , T-Lymphocytes/virology
ABSTRACT
Stochastic simulations are one of the cornerstones of the analysis of dynamical processes on complex networks, and are often the only accessible way to explore their behavior. The development of fast algorithms is paramount to allow large-scale simulations. The Gillespie algorithm can be used for fast simulation of stochastic processes, and variants of it have been applied to simulate dynamical processes on static networks. However, its adaptation to temporal networks remains non-trivial. We here present a temporal Gillespie algorithm that solves this problem. Our method is applicable to general Poisson (constant-rate) processes on temporal networks, stochastically exact, and up to multiple orders of magnitude faster than traditional simulation schemes based on rejection sampling. We also show how it can be extended to simulate non-Markovian processes. The algorithm is easily applicable in practice, and as an illustration we detail how to simulate both Poissonian and non-Markovian models of epidemic spreading. Namely, we provide pseudocode and its implementation in C++ for simulating the paradigmatic Susceptible-Infected-Susceptible and Susceptible-Infected-Recovered models and a Susceptible-Infected-Recovered model with non-constant recovery rates. For empirical networks, the temporal Gillespie algorithm is here typically from 10 to 100 times faster than rejection sampling.
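For orientation, the classical Gillespie algorithm that the abstract takes as its starting point can be sketched as follows. This is the standard direct method for a well-mixed Susceptible-Infected-Recovered model with constant rates. It is not the temporal variant introduced in the paper, nor the authors' C++ code. The function name `gillespie_sir` and the parameters `beta` (infection rate) and `mu` (recovery rate) are illustrative assumptions.

```python
import random

def gillespie_sir(n, beta, mu, initial_infected=1, seed=0):
    """Classical Gillespie simulation of a well-mixed SIR model.

    Returns a list of (time, S, I, R) tuples: the initial state,
    then one entry per event, until no infected individuals remain.
    """
    rng = random.Random(seed)
    s, i, r = n - initial_infected, initial_infected, 0
    t = 0.0
    history = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n   # total rate of S -> I events
        recovery_rate = mu * i              # total rate of I -> R events
        total_rate = infection_rate + recovery_rate
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        # Choose which event happens, proportionally to its rate.
        if rng.random() * total_rate < infection_rate:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        history.append((t, s, i, r))
    return history
```

Every drawn event is executed, so no proposals are rejected. That property is what makes Gillespie-type schemes faster than rejection sampling when most proposed events would be discarded.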
Subjects
Algorithms , Contact Tracing/methods , Contact Tracing/statistics & numerical data , Infectious Disease Transmission/statistics & numerical data , Epidemics/statistics & numerical data , Statistical Models , Computer Simulation , Humans , Risk Assessment/methods , Spatio-Temporal Analysis , Stochastic Processes
ABSTRACT
BACKGROUND: The homogeneous mixing assumption is widely adopted in epidemic modelling for its parsimony and represents the building block of more complex approaches, including very detailed agent-based models. The latter assume homogeneous mixing within schools, workplaces and households, mostly for the lack of detailed information on human contact behaviour within these settings. The recent data availability on high-resolution face-to-face interactions makes it now possible to assess the goodness of this simplified scheme in reproducing relevant aspects of the infection dynamics. METHODS: We consider empirical contact networks gathered in different contexts, as well as synthetic data obtained through realistic models of contacts in structured populations. We perform stochastic spreading simulations on these contact networks and in populations of the same size under a homogeneous mixing hypothesis. We adjust the epidemiological parameters of the latter in order to fit the prevalence curve of the contact epidemic model. We quantify the agreement by comparing epidemic peak times, peak values, and epidemic sizes. RESULTS: Good approximations of the peak times and peak values are obtained with the homogeneous mixing approach, with a median relative difference smaller than 20 % in all cases investigated. Accuracy in reproducing the peak time depends on the setting under study, while for the peak value it is independent of the setting. Recalibration is found to be linear in the epidemic parameters used in the contact data simulations, showing changes across empirical settings but robustness across groups and population sizes. CONCLUSIONS: An adequate rescaling of the epidemiological parameters can yield a good agreement between the epidemic curves obtained with a real contact network and a homogeneous mixing approach in a population of the same size. 
The use of such recalibrated homogeneous mixing approximations would enhance the accuracy and realism of agent-based simulations and limit the intrinsic biases of the homogeneous mixing.
Subjects
Contact Tracing/methods , Epidemics , Biological Models , Computer Simulation , Family Characteristics , Humans , Population Density , Schools , Stochastic Processes , Workplace
ABSTRACT
Filopodia are dynamic, finger-like plasma membrane protrusions that sense the mechanical and chemical surroundings of the cell. Here, we show in epithelial cells that the dynamics of filopodial extension and retraction are determined by the difference between the actin polymerization rate at the tip and the retrograde flow at the base of the filopodium. Adhesion of a bead to the filopodial tip locally reduces actin polymerization and leads to retraction via retrograde flow, reminiscent of a process used by pathogens to invade cells. Using optical tweezers, we show that filopodial retraction occurs at a constant speed against counteracting forces up to 50 pN. Our measurements point toward retrograde flow in the cortex together with frictional coupling between the filopodial and cortical actin networks as the main retraction-force generator for filopodia. The force exerted by filopodial retraction, however, is limited by the connection between filopodial actin filaments and the membrane at the tip. Upon mechanical rupture of the tip connection, filopodia exert a passive retraction force of 15 pN via their plasma membrane. Transient reconnection at the tip allows filopodia to continuously probe their surroundings in a load-and-fail manner within a well-defined force range.
Subjects
Actins/metabolism , Pseudopodia/physiology , Biomechanical Phenomena/physiology , Green Fluorescent Proteins , HeLa Cells , Humans , Confocal Microscopy , Microspheres , Optical Tweezers , Photobleaching , Polymerization
ABSTRACT
This paper addresses the exploration-exploitation dilemma inherent in decision-making, focusing on multiarmed bandit problems. These involve an agent deciding whether to exploit current knowledge for immediate gains or explore new avenues for potential long-term rewards. We here introduce a class of algorithms, approximate information maximization (AIM), which employs a carefully chosen analytical approximation to the gradient of the entropy to choose which arm to pull at each point in time. AIM matches the performance of Thompson sampling, which is known to be asymptotically optimal, as well as that of Infomax from which it derives. AIM thus retains the advantages of Infomax while also offering enhanced computational speed, tractability, and ease of implementation. In particular, we demonstrate how to apply it to a 50-armed bandit game. Its expression is tunable, which allows for specific optimization in various settings, making it possible to surpass the performance of Thompson sampling at short and intermediary times.
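To make the baseline concrete, here is a minimal sketch of Thompson sampling, the reference strategy that AIM is compared against, on a Bernoulli multiarmed bandit. It is not an implementation of AIM or Infomax. The Beta(1, 1) priors, the function name `thompson_sampling` and the reward probabilities passed in are illustrative assumptions.

```python
import numpy as np

def thompson_sampling(true_means, n_steps, seed=0):
    """Play a Bernoulli bandit with Thompson sampling.

    true_means: reward probability of each arm (unknown to the agent).
    Returns (number of pulls per arm, total reward collected).
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    successes = np.zeros(n_arms)
    failures = np.zeros(n_arms)
    pulls = np.zeros(n_arms, dtype=int)
    total_reward = 0
    for _ in range(n_steps):
        # Draw one plausible mean per arm from its Beta posterior
        # (uniform Beta(1, 1) prior assumed), then act greedily on the draws.
        samples = rng.beta(successes + 1.0, failures + 1.0)
        arm = int(np.argmax(samples))
        reward = int(rng.random() < true_means[arm])
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward
```

Exploration arises from the randomness of the posterior draws: arms with few pulls have wide posteriors and are occasionally sampled high, while well-characterised poor arms are rarely chosen.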
ABSTRACT
We introduce a simulation-based, amortized Bayesian inference scheme to infer the parameters of random walks. Our approach learns the posterior distribution of the walks' parameters with a likelihood-free method. In the first step a graph neural network is trained on simulated data to learn optimized low-dimensional summary statistics of the random walk. In the second step an invertible neural network generates the posterior distribution of the parameters from the learned summary statistics using variational inference. We apply our method to infer the parameters of the fractional Brownian motion model from single trajectories. The computational complexity of the amortized inference procedure scales linearly with trajectory length, and its precision scales similarly to the Cramér-Rao bound over a wide range of lengths. The approach is robust to positional noise, and generalizes to trajectories longer than those seen during training. Finally, we adapt this scheme to show that a finite decorrelation time in the environment can furthermore be inferred from individual trajectories.
ABSTRACT
Hematopoietic stem cell transplantation (HSCT) is a therapy used for multiple malignant and nonmalignant diseases, with chemotherapy used for pretransplantation myeloablation. The post-HSCT brain contains peripheral engrafted parenchymal macrophages, despite their absence in the normal brain, with the engraftment mechanism still undefined. Here we show that HSCT chemotherapy broadly disrupts mouse brain regenerative populations, including a permanent loss of adult neurogenesis. Microglial density was halved, causing microglial process expansion, coinciding with indicators of broad senescence. Although microglia expressed cell proliferation markers, they underwent cell cycle arrest in S phase with a majority expressing the senescence and antiapoptotic marker p21. In vivo single-cell tracking of microglia after recovery from chemical depletion showed loss of their regenerative capacity, subsequently replaced with donor macrophages. We propose that HSCT chemotherapy causes microglial senescence with a gradual decrease to a critical microglial density, providing a permissive niche for peripheral macrophage engraftment of the brain.
Subjects
Hematopoietic Stem Cell Transplantation , Microglia , Animals , Brain , Macrophages , Mice , Transplantation Conditioning
ABSTRACT
We devise a method to detect and estimate forces in a heterogeneous environment based on experimentally recorded stochastic trajectories. In particular, we focus on systems modeled by the heterogeneous overdamped Langevin equation. Here, the observed drift includes a "spurious" force term when the diffusivity varies in space. We show how Bayesian inference can be leveraged to reliably infer forces by taking into account such spurious forces of unknown amplitude as well as experimental sources of error. The method is based on marginalizing the force posterior over all possible spurious force contributions. The approach is combined with a Bayes factor statistical test for the presence of forces. The performance of our method is investigated analytically, numerically and tested on experimental data sets. The main results are obtained in a closed form allowing for direct exploration of their properties and fast computation. The method is incorporated into TRamWAy, an open-source software platform for automated analysis of biomolecule trajectories.
ABSTRACT
We describe how a single-particle tracking experiment should be designed in order for its recorded trajectories to contain the most information about a tracked particle's diffusion coefficient. The precision of estimators for the diffusion coefficient is affected by motion blur, limited photon statistics, and the length of recorded time series. We demonstrate for a particle undergoing free diffusion that precision is negligibly affected by motion blur in typical experiments, while optimizing photon counts and the number of recorded frames is the key to precision. Building on these results, we describe for a wide range of experimental scenarios how to choose experimental parameters in order to optimize the precision. Generally, one should choose quantity over quality: experiments should be designed to maximize the number of frames recorded in a time series, even if this means lower information content in individual frames.
ABSTRACT
Transition state theory (TST) provides a simple interpretation of many thermally activated processes. It applies successfully on timescales and length scales that differ by several orders of magnitude: to chemical reactions, breaking of chemical bonds, unfolding of proteins and RNA structures and polymers crossing entropic barriers. Here we apply TST to out-of-equilibrium transport through confined environments: the thermally activated translocation of single DNA molecules over an entropic barrier helped by an external force field. Reaction pathways are effectively one dimensional and so long that they are observable in a microscope. Reaction rates are so slow that transitions are recorded on video. We find sharp transition states that are independent of the applied force, similar to chemical bond rupture, as well as transition states that change location on the reaction pathway with the strength of the applied force. The states of equilibrium and transition are separated by micrometres as compared with angstroms/nanometres for chemical bonds.
ABSTRACT
Data describing human interactions often suffer from incomplete sampling of the underlying population. As a consequence, the study of contagion processes using data-driven models can lead to a severe underestimation of the epidemic risk. Here we present a systematic method to alleviate this issue and obtain a better estimation of the risk in the context of epidemic models informed by high-resolution time-resolved contact data. We consider several such data sets collected in various contexts and perform controlled resampling experiments. We show how the statistical information contained in the resampled data can be used to build a series of surrogate versions of the unknown contacts. We simulate epidemic processes on the resulting reconstructed data sets and show that it is possible to obtain good estimates of the outcome of simulations performed using the complete data set. We discuss limitations and potential improvements of our method.
Subjects
Communicable Diseases/epidemiology , Contact Tracing , Algorithms , Communicable Diseases/transmission , Humans , Theoretical Models
ABSTRACT
Molecular motors are responsible for numerous cellular processes from cargo transport to heart contraction. Their interactions with other cellular components are often transient and exhibit kinetics that depend on load. Here, we measure such interactions using 'harmonic force spectroscopy'. In this method, harmonic oscillation of the sample stage of a laser trap immediately, automatically and randomly applies sinusoidally varying loads to a single motor molecule interacting with a single track along which it moves. The experimental protocol and the data analysis are simple, fast and efficient. The protocol accumulates statistics fast enough to deliver single-molecule results from single-molecule experiments. We demonstrate the method's performance by measuring the force-dependent kinetics of individual human β-cardiac myosin molecules interacting with an actin filament at physiological ATP concentration. We show that a molecule's ADP release rate depends exponentially on the applied load, in qualitative agreement with cardiac muscle, which contracts with a velocity inversely proportional to external load.
Subjects
Actin Cytoskeleton/metabolism , Adenosine Diphosphate/metabolism , Ventricular Myosins/metabolism , Humans , Kinetics , Lasers , Spectrum Analysis
ABSTRACT
How does one optimally determine the diffusion coefficient of a diffusing particle from a single-time-lapse recorded trajectory of the particle? We answer this question with an explicit, unbiased, and practically optimal covariance-based estimator (CVE). This estimator is regression-free and is far superior to commonly used methods based on measured mean squared displacements. In experimentally relevant parameter ranges, it also outperforms the analytically intractable and computationally more demanding maximum likelihood estimator (MLE). For the case of diffusion on a flexible and fluctuating substrate, the CVE is biased by substrate motion. However, given some long time series and a substrate under some tension, an extended MLE can separate particle diffusion on the substrate from substrate motion in the laboratory frame. This provides benchmarks that allow removal of bias caused by substrate fluctuations in CVE. The resulting unbiased CVE is optimal also for short time series on a fluctuating substrate. We have applied our estimators to human 8-oxoguanine DNA glycosylase proteins diffusing on flow-stretched DNA, a fluctuating substrate, and found that diffusion coefficients are severely overestimated if substrate fluctuations are not accounted for.
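A covariance-based estimator of this kind can be sketched in a few lines. The abstract does not state the formulas, so the expressions below are an assumption: they are written from the standard model of free one-dimensional diffusion observed with Gaussian localization error, and should be checked against the paper before use. The function name `cve_estimate` and the motion-blur coefficient `blur` (0 for instantaneous exposure) are likewise illustrative. The substrate-fluctuation correction mentioned in the abstract is not included.

```python
import numpy as np

def cve_estimate(x, dt, blur=0.0):
    """Covariance-based estimates for a 1D trajectory sampled every dt.

    Returns (diffusion coefficient, localization-error variance).
    Assumed model: free diffusion plus independent localization noise,
    with `blur` the motion-blur coefficient of the camera exposure.
    """
    dx = np.diff(np.asarray(x, dtype=float))
    mean_sq = np.mean(dx**2)              # <dx_n^2>
    mean_cov = np.mean(dx[:-1] * dx[1:])  # <dx_n dx_{n+1}>
    # Localization noise inflates <dx^2> and makes successive displacements
    # anticorrelated; adding the covariance term cancels that bias.
    d_hat = mean_sq / (2.0 * dt) + mean_cov / dt
    sigma2_hat = blur * mean_sq + (2.0 * blur - 1.0) * mean_cov
    return d_hat, sigma2_hat
```

Unlike fits to the mean squared displacement, no regression is involved: both estimates are simple averages over the displacements of a single trajectory.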
Subjects
Algorithms , Biopolymers/chemistry , DNA Glycosylases/chemistry , DNA/chemistry , Diffusion , Chemical Models , Biopolymers/analysis , Computer Simulation , Statistical Models
ABSTRACT
Empirical temporal networks display strong heterogeneities in their dynamics, which profoundly affect processes taking place on these networks, such as rumor and epidemic spreading. Despite the recent wealth of data on temporal networks, little work has been devoted to the understanding of how such heterogeneities can emerge from microscopic mechanisms at the level of nodes and links. Here we show that long-term memory effects are present in the creation and disappearance of links in empirical networks. We thus consider a simple generative modeling framework for temporal networks able to incorporate these memory mechanisms. This allows us to study separately the role of each of these mechanisms in the emergence of heterogeneous network dynamics. In particular, we show analytically and numerically how heterogeneous distributions of contact durations, of intercontact durations, and of numbers of contacts per link emerge. We also study the individual effect of heterogeneities on dynamical processes, such as the paradigmatic susceptible-infected epidemic spreading model. Our results confirm in particular the crucial role of the distributions of intercontact durations and of the numbers of contacts per link.