Results 1 - 20 of 29
1.
Proc Natl Acad Sci U S A ; 121(12): e2310002121, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38470929

ABSTRACT

We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we reveal that the training process explores an effectively low-dimensional manifold. Networks with a wide range of architectures and sizes, trained using different optimization methods, regularization techniques, data augmentation techniques, and weight initializations, lie on the same manifold in the prediction space. We study the details of this manifold to find that networks with different architectures follow distinguishable trajectories, but other factors have a minimal influence; larger networks train along a manifold similar to that of smaller networks, just faster; and networks initialized at very different parts of the prediction space converge to the solution along a similar manifold.
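
As a flavor of this kind of analysis (a simplified stand-in, not the paper's actual pipeline), the sketch below embeds a hypothetical training trajectory: each checkpoint is represented by its softmax outputs on held-out samples, checkpoints are compared by Hellinger distance between prediction distributions, and classical multidimensional scaling recovers low-dimensional coordinates. All data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for per-checkpoint softmax outputs:
# shape is (checkpoints, held-out samples, classes)
P = rng.dirichlet(np.ones(10), size=(50, 200))

n = len(P)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        bc = np.sum(np.sqrt(P[i] * P[j]), axis=-1)      # Bhattacharyya coefficient
        D[i, j] = D[j, i] = np.sqrt(np.mean(1.0 - bc))  # rms Hellinger distance

# Classical MDS: eigendecompose the double-centered squared distances
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D**2 @ J
w, V = np.linalg.eigh(B)
wpos = np.maximum(w, 0.0)
coords = V[:, ::-1][:, :3] * np.sqrt(wpos[::-1][:3])    # trajectory coordinates
print("fraction of variance in 3 coordinates:", wpos[::-1][:3].sum() / wpos.sum())
```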

2.
J Acoust Soc Am ; 155(2): 962-970, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38341729

ABSTRACT

Separating crowd responses from raw acoustic signals at sporting events is challenging because recordings contain complex combinations of acoustic sources, including crowd noise, music, individual voices, and public address (PA) systems. This paper presents a data-driven decomposition of recordings of 30 collegiate sporting events. The decomposition uses machine-learning methods to find three principal spectral shapes that separate various acoustic sources. First, the distributions of recorded one-half-second equivalent continuous sound levels from men's and women's basketball and volleyball games are analyzed with regard to crowd size and venue. Using 24 one-third-octave bands between 50 Hz and 10 kHz, spectrograms from each type of game are then analyzed. Based on principal component analysis, 87.5% of the spectral variation in the signals can be represented with three principal components, regardless of sport, venue, or crowd composition. Using the resulting three-dimensional component coefficient representation, a Gaussian mixture model clustering analysis finds nine different clusters. These clusters separate audibly distinct signals and represent various combinations of acoustic sources, including crowd noise, music, individual voices, and the PA system.
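
A minimal sketch of this kind of decomposition, using synthetic frames in place of the actual recordings (the band layout and cluster count follow the abstract; everything else is a stand-in):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic stand-in for half-second spectrogram frames:
# rows are frames, columns the 24 one-third-octave bands (50 Hz-10 kHz), in dB
frames = rng.normal(60.0, 8.0, size=(5000, 24))

pca = PCA(n_components=3).fit(frames)
coeffs = pca.transform(frames)
print("spectral variance represented:", pca.explained_variance_ratio_.sum())

# Cluster the three-dimensional coefficient representation into nine mixtures
gmm = GaussianMixture(n_components=9, random_state=0).fit(coeffs)
labels = gmm.predict(coeffs)   # one cluster label per half-second frame
print("frames per cluster:", np.bincount(labels))
```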

3.
J Acoust Soc Am ; 154(5): 2950-2958, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37943738

ABSTRACT

The National Transportation Noise Map (NTNM) gives time-averaged traffic noise across the continental United States (CONUS) using annual average daily traffic. However, traffic noise varies significantly with time. This paper outlines the development and utility of a traffic volume model that is part of VROOM, the Vehicular Reduced-Order Observation-based Model, which uses hourly traffic volume data from thousands of traffic monitoring stations across CONUS to predict nationwide, hourly varying traffic source noise. Fourier analysis finds daily, weekly, and yearly temporal traffic volume cycles at individual traffic monitoring stations. Then, principal component analysis uses denoised Fourier spectra to find the most widespread cyclic traffic patterns. VROOM uses nine principal components to represent hourly traffic characteristics for any location, encapsulating daily, weekly, and yearly variation. The principal component coefficients are predicted across CONUS using location-specific features. Expected traffic volume model sound level errors, obtained by comparing predicted traffic counts to measured traffic counts, are presented alongside expected NTNM-like errors. VROOM errors are typically within a couple of decibels, whereas NTNM-like errors are often much larger, sometimes exceeding 10 decibels. This work details the first steps toward the creation of a temporally and spectrally variable national transportation noise map.
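
The core volume-modeling steps can be caricatured in a few lines: Fourier-transform hourly counts, keep only spectral bins near the daily, weekly, and yearly peaks, and run PCA on the denoised spectra. The traffic data below are synthetic, and VROOM's feature-based prediction step is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
hours = 24 * 365
t = np.arange(hours)
# Synthetic hourly counts for 400 hypothetical stations, with daily and
# weekly cycles buried in noise
counts = (1000 + 300 * np.cos(2 * np.pi * t / 24)
          + 100 * np.cos(2 * np.pi * t / (24 * 7))
          + rng.normal(0, 50, (400, hours)))

spectra = np.abs(np.fft.rfft(counts, axis=1))
freqs = np.fft.rfftfreq(hours, d=1.0)           # cycles per hour
# Denoise: keep only bins near the daily, weekly, and yearly frequencies
keep = np.zeros_like(freqs, dtype=bool)
for f0 in (1 / 24, 1 / (24 * 7), 1 / hours):
    keep |= np.abs(freqs - f0) < 2 / hours
spectra[:, ~keep] = 0.0

pca = PCA(n_components=9).fit(spectra)
print("variance captured by 9 components:", pca.explained_variance_ratio_.sum())
```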

4.
J Acoust Soc Am ; 154(2): 1168-1178, 2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37610283

ABSTRACT

Modeling environmental sound levels over continental scales is difficult due to the variety of geospatial environments. Moreover, current continental-scale models depend upon machine learning and therefore face additional challenges due to limited acoustic training data. In previous work, an ensemble of machine learning models was used to predict environmental sound levels in the contiguous United States using a training set composed of 51 geospatial layers (downselected from 120) and acoustic data from 496 geographic sites from Pedersen, Transtrum, Gee, Lympany, James, and Salton [JASA Express Lett. 1(12), 122401 (2021)]. In this paper, the downselection process, which is based on factors such as data quality and inter-feature correlations, is described in further detail. To investigate additional dimensionality reduction, four different feature selection methods are applied to the 51 layers. Leave-one-out median absolute deviation cross-validation errors suggest that the number of geospatial features can be reduced to 15 without significant degradation of the model's predictive error. However, ensemble predictions demonstrate that feature selection results are sensitive to variations in details of the problem formulation and, therefore, should be viewed with some skepticism. These results suggest that more sophisticated dimensionality reduction techniques are necessary for problems with limited training data and different training and testing distributions.
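
A sketch of the leave-one-out median-absolute-deviation error used to compare feature subsets, with a plain linear model standing in for the paper's ensemble of machine learning models and synthetic data standing in for the geospatial layers:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(3)
# Synthetic stand-in: 496 sites x 51 layers, with only 15 layers informative
X = rng.normal(size=(496, 51))
y = X[:, :15] @ rng.normal(size=15) + rng.normal(0, 1, 496)

def loo_mad(X, y):
    """Median absolute deviation of leave-one-out prediction errors."""
    errs = []
    for tr, te in LeaveOneOut().split(X):
        model = LinearRegression().fit(X[tr], y[tr])
        errs.append(y[te][0] - model.predict(X[te])[0])
    return np.median(np.abs(errs))

print("all 51 layers:   ", loo_mad(X, y))
print("first 15 layers: ", loo_mad(X[:, :15], y))
```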

5.
J Proteome Res ; 21(11): 2703-2714, 2022 11 04.
Article in English | MEDLINE | ID: mdl-36099490

ABSTRACT

The synthesis of new proteins and the degradation of old proteins in vivo can be quantified in serial samples using metabolic isotope labeling to measure turnover. Because serial biopsies in humans are impractical, we set out to develop a method to calculate the turnover rates of proteins from single human biopsies. This method involved a new metabolic labeling approach and adjustments to the calculations used in previous work to calculate protein turnover. We demonstrate that using a nonequilibrium isotope enrichment strategy avoids the time-dependent bias, caused by variable lag in label delivery to different tissues, observed in traditional metabolic labeling methods. Turnover rates are consistent for the same subject in biopsies from different labeling periods, and turnover rates calculated in this study are consistent with previously reported values. We also demonstrate that by measuring protein turnover we can determine where proteins are synthesized. In human subjects, a significant difference in turnover rates distinguished proteins synthesized in the salivary glands from those imported from the serum. We also provide a data analysis tool, DeuteRater-H, to calculate protein turnover using this nonequilibrium metabolic 2H2O method.
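
For a single-pool model, the turnover calculation reduces to one line: if a fraction f of a protein has been replaced after labeling time t, then f = 1 - exp(-k t), so a single biopsy determines the rate k. The numbers below are hypothetical; DeuteRater-H's peptide-level isotope calculations are far more involved.

```python
import numpy as np

t_label = 14.0   # days of 2H2O labeling (hypothetical)
f_new = 0.42     # fraction of newly synthesized protein inferred from one biopsy
k = -np.log(1.0 - f_new) / t_label
print(f"turnover rate k = {k:.3f} per day; half-life = {np.log(2) / k:.1f} days")
```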


Subject(s)
Isotopes, Proteins, Humans, Isotope Labeling/methods, Proteins/metabolism, Proteolysis, Biopsy/methods
6.
Rep Prog Phys ; 86(3)2022 Dec 28.
Article in English | MEDLINE | ID: mdl-36576176

ABSTRACT

Complex models in physics, biology, economics, and engineering are often sloppy, meaning that the model parameters are not well determined by the model predictions for collective behavior. Many parameter combinations can vary over decades without significant changes in the predictions. This review uses information geometry to explore sloppiness and its deep relation to emergent theories. We introduce the model manifold of predictions, whose coordinates are the model parameters. Its hyperribbon structure explains why only a few parameter combinations matter for the behavior. We review recent rigorous results that connect the hierarchy of hyperribbon widths to approximation theory, and to the smoothness of model predictions under changes of the control variables. We discuss recent geodesic methods to find simpler models on nearby boundaries of the model manifold: emergent theories with fewer parameters that explain the behavior equally well. We discuss a Bayesian prior which optimizes the mutual information between model parameters and experimental data, naturally favoring points on the emergent boundary theories and thus simpler models. We introduce a 'projected maximum likelihood' prior that efficiently approximates this optimal prior, and contrast both to the poor behavior of the traditional Jeffreys prior. We discuss the way the renormalization group coarse-graining in statistical mechanics introduces a flow of the model manifold, and connect stiff and sloppy directions along the model manifold with relevant and irrelevant eigendirections of the renormalization group. Finally, we discuss recently developed 'intensive' embedding methods, allowing one to visualize the predictions of arbitrary probabilistic models as low-dimensional projections of an isometric embedding, and illustrate our method by generating the model manifold of the Ising model.
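
The sloppy eigenvalue spectrum is easy to reproduce in a toy model. The sketch below computes the Fisher information matrix for a sum of exponentials (a standard sloppy example, not drawn from the review itself) and shows its eigenvalues spanning many decades:

```python
import numpy as np

theta = np.array([0.5, 1.0, 2.0, 4.0, 8.0])   # decay rates (toy values)
t = np.linspace(0.1, 5.0, 30)

# Jacobian of y(t) = sum_i exp(-theta_i t); the FIM is J^T J for unit noise
J = np.stack([-t * np.exp(-th * t) for th in theta], axis=1)
eigs = np.sort(np.linalg.eigvalsh(J.T @ J))[::-1]
print("FIM eigenvalues:", eigs)
print("decades spanned: %.1f" % np.log10(eigs[0] / eigs[-1]))
```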


Subject(s)
Statistical Models, Physics, Bayes Theorem, Engineering
7.
J Chem Phys ; 156(21): 214103, 2022 Jun 07.
Article in English | MEDLINE | ID: mdl-35676145

ABSTRACT

In this paper, we consider the problem of quantifying parametric uncertainty in classical empirical interatomic potentials (IPs) using both Bayesian (Markov Chain Monte Carlo) and frequentist (profile likelihood) methods. We interface these tools with the Open Knowledgebase of Interatomic Models and study three models based on the Lennard-Jones, Morse, and Stillinger-Weber potentials. We confirm that IPs are typically sloppy, i.e., insensitive to coordinated changes in some parameter combinations. Because the inverse problem in such models is ill-conditioned, parameters are unidentifiable. This presents challenges for traditional statistical methods, as we demonstrate and interpret within both Bayesian and frequentist frameworks. We use information geometry to illuminate the underlying cause of this phenomenon and show that IPs have global properties similar to those of sloppy models from fields such as systems biology, power systems, and critical phenomena. IPs correspond to bounded manifolds with a hierarchy of widths, leading to low effective dimensionality in the model. We show how information geometry can motivate new, natural parameterizations that improve the stability and interpretation of uncertainty quantification analysis and further suggest simplified, less-sloppy models.
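
A minimal sketch of the Bayesian side of such an analysis: random-walk Metropolis sampling of Lennard-Jones parameters against synthetic dimer energies. This is a stand-in, not the paper's OpenKIM-interfaced workflow; all values are hypothetical.

```python
import numpy as np

def lj(r, eps, sig):
    return 4.0 * eps * ((sig / r)**12 - (sig / r)**6)

rng = np.random.default_rng(4)
r = np.linspace(0.9, 2.5, 20)
data = lj(r, 1.0, 1.0) + rng.normal(0.0, 0.02, r.size)   # synthetic energies

def log_post(p):
    eps, sig = p
    if eps <= 0 or sig <= 0:
        return -np.inf                    # flat prior on the positive quadrant
    return -0.5 * np.sum((data - lj(r, eps, sig))**2) / 0.02**2

p, lp = np.array([1.2, 0.9]), -np.inf
chain = []
for _ in range(20000):
    q = p + rng.normal(0.0, 0.01, 2)      # random-walk proposal
    lq = log_post(q)
    if np.log(rng.random()) < lq - lp:
        p, lp = q, lq
    chain.append(p)
chain = np.array(chain[5000:])            # discard burn-in
print("posterior mean (eps, sigma):", chain.mean(axis=0))
print("posterior std  (eps, sigma):", chain.std(axis=0))
```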


Subject(s)
Systems Biology, Bayes Theorem, Markov Chains, Monte Carlo Method, Uncertainty
8.
Proc Natl Acad Sci U S A ; 115(8): 1760-1765, 2018 02 20.
Article in English | MEDLINE | ID: mdl-29434042

ABSTRACT

We use the language of uninformative Bayesian prior choice to study the selection of appropriately simple effective models. We advocate for the prior which maximizes the mutual information between parameters and predictions, learning as much as possible from limited data. When many parameters are poorly constrained by the available data, we find that this prior puts weight only on boundaries of the parameter space. Thus, it selects a lower-dimensional effective theory in a principled way, ignoring irrelevant parameter directions. In the limit where there are sufficient data to tightly constrain any number of parameters, this reduces to the Jeffreys prior. However, we argue that this limit is pathological when applied to the hyperribbon parameter manifolds generic in science, because it leads to dramatic dependence on effects invisible to experiment.
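
Because the mutual-information-optimal prior is the capacity-achieving input distribution of the parameter-to-data channel, it can be computed on a grid with the Blahut-Arimoto algorithm. The sketch below does this for a one-parameter Gaussian toy model (all choices are hypothetical, not the paper's examples); the resulting prior piles its mass onto a few atoms, most of it at the parameter-space boundaries.

```python
import numpy as np

theta = np.linspace(0.0, 1.0, 51)       # parameter grid
y = np.linspace(-4.0, 6.0, 201)         # observation grid
mu = 2.0 * theta                        # toy model: mean response, unit noise
lik = np.exp(-0.5 * (y[None, :] - mu[:, None])**2)
lik /= lik.sum(axis=1, keepdims=True)   # discretized p(y | theta)

p = np.full(theta.size, 1.0 / theta.size)    # start from a uniform prior
for _ in range(2000):                   # Blahut-Arimoto fixed-point iteration
    q = p @ lik                                   # marginal p(y)
    kl = np.sum(lik * np.log(lik / q), axis=1)    # D(p(y|theta) || p(y))
    p *= np.exp(kl)
    p /= p.sum()
print("prior mass on the two boundary points: %.2f, %.2f" % (p[0], p[-1]))
```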


Subject(s)
Statistical Models, Algorithms, Bayes Theorem
9.
Mol Cell Proteomics ; 16(2): 243-254, 2017 02.
Article in English | MEDLINE | ID: mdl-27932527

ABSTRACT

Control of protein homeostasis is fundamental to the health and longevity of all organisms. Because the rate of protein synthesis by ribosomes is a central control point in this process, regulation and maintenance of ribosome function could have amplified importance in the overall regulatory circuit. Indeed, ribosomal defects are commonly associated with loss of protein homeostasis, aging, and disease (1-4), whereas improved protein homeostasis, implying optimal ribosomal function, is associated with disease resistance and increased lifespan (5-7). To maintain a high-quality ribosome population within the cell, dysfunctional ribosomes are targeted for autophagic degradation. It is not known if complete degradation is the only mechanism for eukaryotic ribosome maintenance or if ribosomes might also be repaired by replacement of defective components. We used stable-isotope feeding and protein mass spectrometry to measure the kinetics of turnover of ribosomal RNA (rRNA) and 71 ribosomal proteins (r-proteins) in mice. The results indicate that exchange of individual proteins and whole ribosome degradation both contribute to ribosome maintenance in vivo. In general, peripheral r-proteins and those with more direct roles in peptide-bond formation are replaced multiple times during the lifespan of the assembled structure, presumably by exchange with a free cytoplasmic pool, whereas the majority of r-proteins are stably incorporated for the lifetime of the ribosome. Dietary signals impact the rates of both new ribosome assembly and component exchange. Signal-specific modulation of ribosomal repair and degradation could provide a mechanistic link in the frequently observed associations among diminished rates of protein synthesis, increased autophagy, and greater longevity (5, 6, 8, 9).


Subject(s)
Mass Spectrometry/methods, Ribosomal RNA/metabolism, Ribosomal Proteins/metabolism, Ribosomes/metabolism, Animals, Autophagy, Diet, Isotope Labeling, Mice
10.
Biochim Biophys Acta ; 1860(5): 957-966, 2016 May.
Article in English | MEDLINE | ID: mdl-26721335

ABSTRACT

BACKGROUND: Isothermal calorimetry allows monitoring of reaction rates via direct measurement of the rate of heat produced by the reaction. Calorimetry is one of very few techniques that can be used to measure rates without taking a derivative of the primary data. Because heat is a universal indicator of chemical reactions, calorimetry can be used to measure kinetics in opaque solutions, suspensions, and multiple-phase systems and does not require chemical labeling. The only significant limitation of calorimetry for kinetic measurements is that the time constant of the reaction must be greater than the time constant of the calorimeter, which can range from a few seconds to a few minutes. Calorimetry has the unique ability to provide both kinetic and thermodynamic data. SCOPE OF REVIEW: This article describes the calorimetric methodology for determining reaction kinetics and reviews examples from recent literature that demonstrate applications of titration calorimetry to determine kinetics of enzyme-catalyzed and ligand binding reactions. MAJOR CONCLUSIONS: A complete model for the temperature dependence of enzyme activity is presented. A previous method commonly used for blank corrections in determinations of equilibrium constants and enthalpy changes for binding reactions is shown to be subject to significant systematic error. GENERAL SIGNIFICANCE: Methods for determination of the kinetics of enzyme-catalyzed reactions and for simultaneous determination of thermodynamics and kinetics of ligand binding reactions are reviewed.
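
The basic measurement model is a single equation: the heat rate is the reaction rate scaled by the reaction volume and molar enthalpy, q(t) = -ΔH·V·v(t), with v given here by Michaelis-Menten kinetics. The sketch below simulates the heat rate for one enzyme-catalyzed reaction with hypothetical constants:

```python
import numpy as np
from scipy.integrate import solve_ivp

k2, KM = 500.0, 1e-4          # turnover number (1/s), Michaelis constant (M)
dH, E, V = -50e3, 1e-9, 1e-3  # enthalpy (J/mol), enzyme (M), cell volume (L)

def dSdt(t, S):
    return -k2 * E * S / (KM + S)      # Michaelis-Menten substrate depletion

sol = solve_ivp(dSdt, (0.0, 3000.0), [1e-3], dense_output=True)
t = np.linspace(0.0, 3000.0, 300)
S = sol.sol(t)[0]
q = -dH * V * k2 * E * S / (KM + S)    # heat rate in watts (exothermic > 0)
print("peak heat rate: %.2e W" % q.max())
```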


Subject(s)
Bacterial Proteins/chemistry, Hydro-Lyases/chemistry, Multienzyme Complexes/chemistry, NADH NADPH Oxidoreductases/chemistry, Trypsin/chemistry, beta-Fructofuranosidase/chemistry, Biocatalysis, Calorimetry/methods, Escherichia coli/chemistry, Escherichia coli/enzymology, Heat, Humans, Kinetics, Chemical Models, Sucrose/chemistry, Thermodynamics, Thermus thermophilus/chemistry, Thermus thermophilus/enzymology
11.
PLoS Comput Biol ; 12(5): e1004915, 2016 05.
Article in English | MEDLINE | ID: mdl-27187545

ABSTRACT

The inherent complexity of biological systems gives rise to complicated mechanistic models with a large number of parameters. On the other hand, the collective behavior of these systems can often be characterized by a relatively small number of phenomenological parameters. We use the Manifold Boundary Approximation Method (MBAM) as a tool for deriving simple phenomenological models from complicated mechanistic models. The resulting models are not black boxes, but remain expressed in terms of the microscopic parameters. In this way, we explicitly connect the macroscopic and microscopic descriptions, characterize the equivalence class of distinct systems exhibiting the same range of collective behavior, and identify the combinations of components that function as tunable control knobs for the behavior. We demonstrate the procedure for adaptation behavior exhibited by the EGFR pathway. From a 48-parameter mechanistic model, the system can be effectively described by a single adaptation parameter τ characterizing the ratio of time scales for the initial response and recovery time of the system, which can in turn be expressed as a combination of microscopic reaction rates, Michaelis-Menten constants, and biochemical concentrations. The situation is not unlike modeling in physics, in which microscopically complex processes can often be renormalized into simple phenomenological models with only a few effective parameters. The proposed method additionally provides a mechanistic explanation for non-universal features of the behavior.
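
A caricature of the manifold-boundary idea on a toy two-exponential model (the actual method integrates a geodesic and then applies the boundary limit analytically, and this is not the paper's 48-parameter EGFR model): repeatedly stepping along the sloppiest eigendirection of the Fisher information typically drives a parameter combination toward a manifold boundary, which MBAM would then eliminate by taking the corresponding limit.

```python
import numpy as np

t = np.linspace(0.1, 5, 40)

def model(theta):
    k1, k2 = np.exp(theta)             # rates in a log parameterization
    return np.exp(-k1 * t) + np.exp(-k2 * t)

def jac(theta, h=1e-6):
    J = np.empty((t.size, theta.size))
    for i in range(theta.size):
        d = np.zeros_like(theta); d[i] = h
        J[:, i] = (model(theta + d) - model(theta - d)) / (2 * h)
    return J

theta, prev = np.array([0.0, 0.2]), None
for _ in range(300):
    J = jac(theta)
    w, V = np.linalg.eigh(J.T @ J)     # Fisher information for unit noise
    v = V[:, 0]                        # sloppiest eigendirection
    if prev is not None and v @ prev < 0:
        v = -v                         # march consistently, not back and forth
    theta, prev = theta + 0.05 * v, v
print("log-rates after the walk:", theta)
```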


Subject(s)
Biological Models, Physiological Adaptation, Computational Biology, ErbB Receptors/metabolism, Physiological Feedback, Kinetics, MAP Kinase Signaling System, Systems Biology
12.
PLoS Comput Biol ; 12(12): e1005227, 2016 12.
Article in English | MEDLINE | ID: mdl-27923060

ABSTRACT

We explore the relationship among experimental design, parameter estimation, and systematic error in sloppy models. We show that the approximate nature of mathematical models poses challenges for experimental design in sloppy models. In many models of complex biological processes, it is unknown which physical mechanisms must be included to explain system behaviors. As a consequence, models are often overly complex, with many practically unidentifiable parameters. Furthermore, which mechanisms are relevant or irrelevant varies among experiments. By selecting complementary experiments, experimental design may inadvertently make details that were omitted from the model become relevant. When this occurs, the model will have a large systematic error and fail to give a good fit to the data. We use a simple hyper-model of model error to quantify a model's discrepancy and apply it to two models of complex biological processes (EGFR signaling and DNA repair) with optimally selected experiments. We find that although parameters may be accurately estimated, the discrepancy in the model renders it less predictive than it was in the sloppy regime where systematic error is small. We introduce the concept of a sloppy system: a sequence of models of increasing complexity that become sloppy in the limit of microscopic accuracy. We explore the limits of accurate parameter estimation in sloppy systems and argue that identifying the underlying mechanisms controlling system behavior is better approached by considering a hierarchy of models of varying detail rather than by focusing on parameter estimation in a single model.


Subject(s)
Biological Models, Research Design, Algorithms, Animals, DNA Repair, ErbB Receptors, Kinetics, Mice, Signal Transduction, Whole-Body Irradiation
13.
Methods ; 76: 194-200, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25497059

ABSTRACT

The purposes of this paper are (a) to examine the effect of calorimeter time constant (τ) on heat rate data from a single enzyme injection into substrate in an isothermal titration calorimeter (ITC), (b) to provide information that can be used to predict the optimum experimental conditions for determining the rate constant (k2), Michaelis constant (KM), and enthalpy change of the reaction (ΔRH), and (c) to describe methods for evaluating these parameters. We find that KM, k2 and ΔRH can be accurately estimated without correcting for the calorimeter time constant, τ, if (k2E/KM), where E is the total active enzyme concentration, is between 0.1/τ and 1/τ and the reaction goes to at least 99% completion. If experimental conditions are outside this domain and no correction is made for τ, errors in the inferred parameters quickly become unreasonable. A method for fitting single-injection data to the Michaelis-Menten or Briggs-Haldane model to simultaneously evaluate KM, k2, ΔRH, and τ is described and validated with experimental data. All four of these parameters can be accurately inferred provided the reaction time constant (k2E/KM) is larger than 1/τ and the data include enzyme saturated conditions.
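
The instrument response at the heart of this analysis is first order: τ·dq_m/dt + q_m = q_true, so the true heat rate can be approximately recovered as q_m + τ·dq_m/dt. A demonstration with hypothetical values (a stand-in heat-rate curve rather than a fitted Michaelis-Menten run):

```python
import numpy as np
from scipy.integrate import solve_ivp

tau = 20.0                               # calorimeter time constant, s
t = np.linspace(0.0, 2000.0, 2001)
q_true = np.exp(-t / 300.0)              # stand-in for the reaction heat rate

def filtered(ti, qm):
    # First-order instrument response: tau * dq_m/dt = q_true - q_m
    return (np.interp(ti, t, q_true) - qm) / tau

q_meas = solve_ivp(filtered, (t[0], t[-1]), [0.0], t_eval=t).y[0]
q_rec = q_meas + tau * np.gradient(q_meas, t)   # deconvolved estimate
print("max recovery error: %.2e" % np.abs(q_rec - q_true).max())
```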


Subject(s)
Calorimetry/methods, Enzymes/chemistry, Kinetics, Chemical Models, Sucrose/chemistry, Thermodynamics, beta-Fructofuranosidase/chemistry
14.
J Chem Phys ; 143(1): 010901, 2015 Jul 07.
Article in English | MEDLINE | ID: mdl-26156455

ABSTRACT

Large scale models of physical phenomena demand the development of new statistical and computational tools in order to be effective. Many such models are "sloppy," i.e., exhibit behavior controlled by a relatively small number of parameter combinations. We review an information theoretic framework for analyzing sloppy models. This formalism is based on the Fisher information matrix, which is interpreted as a Riemannian metric on a parameterized space of models. Distance in this space is a measure of how distinguishable two models are based on their predictions. Sloppy model manifolds are bounded with a hierarchy of widths and extrinsic curvatures. The manifold boundary approximation can extract the simple, hidden theory from complicated sloppy models. We attribute the success of simple effective models in physics to the same phenomenon: they likewise emerge from complicated processes exhibiting a low effective dimensionality. We discuss the ramifications and consequences of sloppy models for biochemistry and science more generally. We suggest that our complex world is understandable for the same fundamental reason: simple theories of macroscopic behavior are hidden inside complicated microscopic processes.


Subject(s)
Theoretical Models, Physics/methods, Systems Biology/methods
15.
Phys Rev Lett ; 113(9): 098701, 2014 Aug 29.
Article in English | MEDLINE | ID: mdl-25216014

ABSTRACT

Understanding the collective behavior of complex systems from their basic components is a difficult yet fundamental problem in science. Existing model reduction techniques are either applicable under limited circumstances or produce "black boxes" disconnected from the microscopic physics. We propose a new approach by translating the model reduction problem for an arbitrary statistical model into a geometric problem of constructing a low-dimensional, submanifold approximation to a high-dimensional manifold. When models are overly complex, we use the observation that the model manifold is bounded with a hierarchy of widths and propose using the boundaries as submanifold approximations. We refer to this approach as the manifold boundary approximation method. We apply this method to several models, including a sum of exponentials, a dynamical systems model of protein signaling, and a generalized Ising model. By focusing on parameters rather than physical degrees of freedom, the approach unifies many other model reduction techniques, such as singular limits, equilibrium approximations, and the renormalization group, while expanding the domain of tractable models. The method produces a series of approximations that decrease the complexity of the model and reveal how microscopic parameters are systematically "compressed" into a few macroscopic degrees of freedom, effectively building a bridge between the microscopic and the macroscopic descriptions.


Subject(s)
ErbB Receptors/metabolism, Theoretical Models, Animals, Biological Models, Signal Transduction, Systems Biology
16.
JASA Express Lett ; 4(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38949613

ABSTRACT

The model manifold, an information geometry tool, is a geometric representation of a model that can quantify the expected information content of modeling parameters. For a normal-mode sound propagation model in a shallow ocean environment, transmission loss (TL) is calculated for a vertical line array and model manifolds are constructed for both absolute and relative TL. For the example presented in this paper, relative TL yields more compact model manifolds with seabed environments that are less statistically distinguishable than manifolds of absolute TL. This example illustrates how model manifolds can be used to improve experimental design for inverse problems.
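
A toy version of the comparison (a spreading-plus-attenuation stand-in, not the paper's normal-mode propagation model): build predicted TL at a vertical array over a grid of seabed loss values, then compare how spread out the manifold is for absolute versus relative TL. All numbers are illustrative.

```python
import numpy as np

z = np.linspace(10, 100, 16)             # array element depths, m (hypothetical)
alphas = np.linspace(0.01, 0.10, 30)     # candidate seabed loss values, dB/m

# Toy TL: geometric spreading plus depth-weighted seabed attenuation
tl_abs = np.array([20 * np.log10(500.0 + z) + a * z for a in alphas])
tl_rel = tl_abs - tl_abs[:, :1]          # TL relative to the first element

for name, tl in (("absolute TL", tl_abs), ("relative TL", tl_rel)):
    d = np.linalg.norm(tl[:, None] - tl[None, :], axis=-1)
    print(name, "manifold: mean pairwise separation %.2f dB" % d.mean())
```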

17.
Phys Rev E ; 108(6-1): 064215, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38243539

ABSTRACT

Bifurcation phenomena are common in multidimensional multiparameter dynamical systems. Normal form theory suggests that bifurcations are driven by relatively few combinations of parameters. Models of complex systems, however, rarely appear in normal form, and bifurcations are controlled by nonlinear combinations of the bare parameters of differential equations. Discovering reparameterizations to transform complex equations into a normal form is often very difficult, and the reparameterization may not even exist in a closed form. Here we show that information geometry and sloppy model analysis using the Fisher information matrix can be used to identify the combination of parameters that control bifurcations. By considering observations on increasingly long timescales, we find those parameters that rapidly characterize the system's topological inhomogeneities, whether the system is in normal form or not. We anticipate that this novel analytical method, which we call time-widening information geometry (TWIG), will be useful in applied network analysis.

18.
BMC Bioinformatics ; 13: 181, 2012 Jul 27.
Article in English | MEDLINE | ID: mdl-22838836

ABSTRACT

BACKGROUND: Parameter estimation in biological models is a common yet challenging problem. In this work we explore the problem for gene regulatory networks modeled by differential equations with unknown parameters, such as decay rates, reaction rates, Michaelis-Menten constants, and Hill coefficients. We explore to what extent parameters can be efficiently estimated through appropriate experiment selection. RESULTS: A minimization formulation is used to find the parameter values that best fit the experimental data. When the data are insufficient, the minimization problem often has many local minima that fit the data reasonably well. We show that selecting a new experiment based on the local Fisher information of one local minimum generates additional data that allow one to successfully discriminate among the many local minima. The parameters can be estimated to high accuracy by iteratively performing minimization and experiment selection. We show that the experiment choices are roughly independent of which local minimum is used to calculate the local Fisher information. CONCLUSIONS: We show that by an appropriate choice of experiments, one can, in principle, efficiently and accurately estimate all the parameters of a gene regulatory network. In addition, we demonstrate that appropriate experiment selection can also allow one to restrict model predictions without constraining the parameters, using many fewer experiments. We suggest that predicting model behaviors and inferring parameters represent two different approaches to model calibration, with different requirements on data and experimental cost.
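
A sketch of the iteration the abstract describes, on a toy Hill-function model with hypothetical values: fit the current data, then pick the next measurement time that most improves the local Fisher information (here via an E-optimal-style criterion, maximizing its smallest eigenvalue).

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(6)
true = np.array([1.0, 2.0, 4.0])                  # hypothetical V, K, n

def model(p, t):
    V, K, n = p
    return V * t**n / (K**n + t**n)

def jac_row(p, t, h=1e-6):
    return np.array([(model(p + h * e, t) - model(p - h * e, t)) / (2 * h)
                     for e in np.eye(3)])

t_obs = [0.5, 1.0]
y_obs = [model(true, t) + rng.normal(0, 0.01) for t in t_obs]
for _ in range(6):
    fit = least_squares(
        lambda p: model(p, np.array(t_obs)) - np.array(y_obs),
        x0=[0.5, 1.0, 2.0], bounds=(1e-3, 10.0))
    J = np.array([jac_row(fit.x, t) for t in t_obs])
    # Choose the next time point to maximize the smallest FIM eigenvalue
    candidates = np.linspace(0.1, 8.0, 80)
    gain = [np.linalg.eigvalsh(J.T @ J + np.outer(jac_row(fit.x, tc),
                                                  jac_row(fit.x, tc)))[0]
            for tc in candidates]
    t_next = candidates[int(np.argmax(gain))]
    t_obs.append(t_next)
    y_obs.append(model(true, t_next) + rng.normal(0, 0.01))
print("recovered (V, K, n):", fit.x)
```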


Subject(s)
Algorithms, Gene Regulatory Networks, Mathematical Computing, Genetic Models, Research Design/statistics & numerical data
19.
JASA Express Lett ; 1(6): 063602, 2021 06.
Article in English | MEDLINE | ID: mdl-36154371

ABSTRACT

Outdoor acoustic data often include non-acoustic pressures caused by atmospheric turbulence, particularly below a few hundred Hz in frequency, even when using microphone windscreens. This paper describes a method for automatic wind-noise classification and reduction in spectral data without requiring measured wind speeds. The method finds individual frequency bands matching the characteristic decreasing spectral slope of wind noise. Uncontaminated data from several short-timescale spectra can be used to obtain a decontaminated long-timescale spectrum. This method is validated with field-test data and can be applied to large datasets to efficiently find and reduce the negative impact of wind noise contamination.
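
A minimal sketch of the classification step, with synthetic one-third-octave spectra: frames whose low-frequency spectral slope is steeply negative are flagged as wind-contaminated, and the long-timescale spectrum is averaged from the remaining frames. The threshold and band layout are illustrative, not the paper's calibrated values.

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic short-timescale spectra (dB): frames x low-frequency bands
f = np.array([25, 31.5, 40, 50, 63, 80, 100, 125, 160, 200])
spec = rng.normal(40.0, 3.0, size=(600, f.size))
gusty = rng.random(600) < 0.3                  # frames hit by wind
spec[gusty] += 30.0 - 25.0 * np.log10(f / 25)  # steep decreasing slope

# Flag frames whose fitted slope (dB/decade) matches wind-noise character
slopes = np.polyfit(np.log10(f), spec.T, 1)[0]
clean = slopes > -15.0
long_term = 10 * np.log10(np.mean(10**(spec[clean] / 10), axis=0))
print(f"{(~clean).sum()} of {len(spec)} frames flagged as wind-contaminated")
```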


Subject(s)
Noise, Wind, Acoustics, Noise/adverse effects, Pressure
20.
JASA Express Lett ; 1(12): 122401, 2021 12.
Article in English | MEDLINE | ID: mdl-36154374

ABSTRACT

Modeling outdoor environmental sound levels is a challenging problem. This paper reports on a validation study of two continental-scale machine learning models using geospatial layers as inputs and the summer daytime A-weighted L50 as a validation metric. The first model was developed by the National Park Service while the second was developed by the present authors. Validation errors greater than 20 dBA are observed. Large errors are attributed to limited acoustic training data. Validation environments are geospatially dissimilar to training sites, requiring models to extrapolate beyond their training sets. Results motivate further work in optimal data collection and uncertainty quantification.


Subject(s)
Machine Learning, Seasons