1.
Article in English | MEDLINE | ID: mdl-37584045

ABSTRACT

Time-series are commonly susceptible to various types of corruption due to sensor-level changes and defects, which can result in missing samples, sensor and quantization noise, unknown calibration, unknown phase shifts, etc. These corruptions cannot be easily corrected, as the noise model may be unknown at the time of deployment. This also results in the inability to employ pre-trained classifiers, trained on (clean) source data. In this paper, we present a general framework and models for time-series that can make use of (unlabeled) test samples to estimate the noise model entirely at test time. To this end, we use a coupled decoder model and an additional neural network which acts as a learned noise model simulator. We show that the framework is able to "clean" the data so as to match the source training data statistics, and the cleaned data can be directly used with a pre-trained classifier for robust predictions. We perform empirical studies on diverse application domains with different types of sensors, clearly demonstrating the effectiveness and generality of this method.
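
To make the corruption types listed in this abstract concrete, here is a minimal hand-written sketch of a sensor corruption simulator. It is not the learned noise model from the paper; the function name `corrupt` and its parameters (`gain`, `offset`, `quant_step`, `drop_prob`) are hypothetical and chosen only for illustration.

```python
import random


def corrupt(signal, gain=1.0, offset=0.0, quant_step=0.0, drop_prob=0.0, seed=0):
    """Apply simple, hand-specified sensor corruptions to a list of floats.

    Assumed (illustrative) corruption model:
      - unknown calibration: affine map gain * x + offset
      - quantization: rounding to the nearest multiple of quant_step
      - missing samples: each sample dropped (None) with probability drop_prob
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    out = []
    for x in signal:
        if rng.random() < drop_prob:
            out.append(None)  # mark the sample as missing
            continue
        y = gain * x + offset
        if quant_step > 0:
            y = round(y / quant_step) * quant_step
        out.append(y)
    return out


clean = [0.0, 1.0, 2.0]
miscalibrated = corrupt(clean, gain=2.0, offset=1.0)
```

In the setting the abstract describes, the parameters of such a simulator are not known in advance and would have to be estimated from unlabeled test samples.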

2.
Front Big Data ; 4: 589417, 2021.
Article in English | MEDLINE | ID: mdl-34337397

ABSTRACT

Interpretability has emerged as a crucial aspect of building trust in machine learning systems, aimed at providing insights into the working of complex neural networks that are otherwise opaque to a user. There is a plethora of existing solutions addressing various aspects of interpretability, ranging from identifying prototypical samples in a dataset to explaining image predictions or explaining mis-classifications. While all of these diverse techniques address seemingly different aspects of interpretability, we hypothesize that a large family of interpretability tasks are variants of the same central problem, which is identifying relative change in a model's prediction. This paper introduces MARGIN, a simple yet general approach to address a large set of interpretability tasks. MARGIN exploits ideas rooted in graph signal analysis to determine influential nodes in a graph, which are defined as those nodes that maximally describe a function defined on the graph. By carefully defining task-specific graphs and functions, we demonstrate that MARGIN outperforms existing approaches in a number of disparate interpretability challenges.
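
One simple way to score how strongly a node stands out in a function defined on a graph is a local high-pass measure: the absolute difference between the node's value and the mean value of its neighbors. The sketch below illustrates that generic graph-signal idea only. It is an assumption for illustration, not the scoring rule published for MARGIN, and `influence_scores` is a hypothetical name.

```python
def influence_scores(adjacency, f):
    """Score each node by how much its value differs from its neighborhood mean.

    adjacency: dict mapping node -> list of neighboring nodes
    f:         dict mapping node -> scalar function value at that node
    """
    scores = {}
    for node, neighbors in adjacency.items():
        if not neighbors:
            scores[node] = 0.0  # isolated node: no local variation to measure
            continue
        neighbor_mean = sum(f[n] for n in neighbors) / len(neighbors)
        scores[node] = abs(f[node] - neighbor_mean)
    return scores


# Path graph 0 - 1 - 2 - 3 with a step in the function between nodes 1 and 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = {0: 0.0, 1: 0.0, 2: 1.0, 3: 1.0}
scores = influence_scores(path, values)
```

On this toy graph, the two nodes on either side of the step receive the highest scores, which matches the intuition that influential nodes sit where the function changes.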

3.
Nat Commun ; 11(1): 5622, 2020 Nov 06.
Article in English | MEDLINE | ID: mdl-33159053

ABSTRACT

Predictive models that accurately emulate complex scientific processes can achieve speed-ups over numerical simulators or experiments and at the same time provide surrogates for improving the subsequent analysis. Consequently, there is a recent surge in utilizing modern machine learning methods to build data-driven emulators. In this work, we study an often overlooked, yet important, problem of choosing loss functions while designing such emulators. Popular choices such as the mean squared error or the mean absolute error are based on a symmetric noise assumption and can be unsuitable for heterogeneous data or asymmetric noise distributions. We propose Learn-by-Calibrating, a novel deep learning approach based on interval calibration for designing emulators that can effectively recover the inherent noise structure without any explicit priors. Using a large suite of use-cases, we demonstrate the efficacy of our approach in providing high-quality emulators, when compared to widely-adopted loss function choices, even in small-data regimes.
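
The notion of interval calibration mentioned in this abstract can be illustrated with a small check: a set of prediction intervals is calibrated at level `target` if roughly that fraction of true values falls inside them. The sketch below only measures this property. It is not the Learn-by-Calibrating training procedure, and the function names are hypothetical.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside their [lower, upper] interval."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


def calibration_error(y_true, lower, upper, target):
    """Absolute gap between observed coverage and the desired level."""
    return abs(empirical_coverage(y_true, lower, upper) - target)


y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 0.0, 0.0, 0.0]
upper = [2.0, 2.0, 2.0, 2.0]
coverage = empirical_coverage(y_true, lower, upper)
gap = calibration_error(y_true, lower, upper, target=0.9)
```

Because the intervals need not be symmetric around a point prediction, a criterion of this kind does not build in the symmetric noise assumption that the abstract attributes to the mean squared and mean absolute errors.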

4.
Proc Natl Acad Sci U S A ; 117(18): 9741-9746, 2020 05 05.
Article in English | MEDLINE | ID: mdl-32312816

ABSTRACT

Neural networks have become the method of choice in surrogate modeling because of their ability to characterize arbitrary, high-dimensional functions in a data-driven fashion. This paper advocates for the training of surrogates that are 1) consistent with the physical manifold, resulting in physically meaningful predictions, and 2) cyclically consistent with a jointly trained inverse model; i.e., backmapping predictions through the inverse results in the original input parameters. We find that these two consistencies lead to surrogates that are superior in terms of predictive performance, are more resilient to sampling artifacts, and tend to be more data efficient. Using inertial confinement fusion (ICF) as a test-bed problem, we model a one-dimensional semianalytic numerical simulator and demonstrate the effectiveness of our approach.
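
The cyclical consistency described in this abstract can be written as an extra penalty: after mapping inputs through the forward surrogate and back through the inverse model, the result should match the original inputs. The sketch below is a minimal illustration under assumptions; the squared-error form, the weight `lam`, and the toy linear maps are not taken from the paper.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cyclic_loss(forward, inverse, x, y, lam=1.0):
    """Prediction error plus a penalty for failing to recover x via the inverse."""
    y_hat = forward(x)         # surrogate prediction
    x_back = inverse(y_hat)    # map the prediction back to input space
    return squared_error(y_hat, y) + lam * squared_error(x_back, x)


# Toy models: the forward map doubles each input; a consistent inverse halves it.
def forward(x):
    return [2.0 * v for v in x]


def good_inverse(y):
    return [v / 2.0 for v in y]


def bad_inverse(y):
    return list(y)  # ignores the scaling, so the cycle does not close


x, y = [1.0, 2.0], [2.0, 4.0]
consistent = cyclic_loss(forward, good_inverse, x, y)
inconsistent = cyclic_loss(forward, bad_inverse, x, y, lam=0.5)
```

With the consistent pair, the loss is zero; with the mismatched inverse, the forward prediction is still exact but the cycle term is penalized.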

5.
IEEE Trans Vis Comput Graph ; 26(1): 291-300, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31484123

ABSTRACT

With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization. First, the utilization of black box models (e.g., deep neural networks) calls for advanced techniques in exploring and interpreting model behaviors. Second, the rapid growth in computing has produced enormous datasets that require techniques that can handle millions or more samples. Although some solutions to these interpretability challenges have been proposed, they typically do not scale beyond thousands of samples, nor do they provide the high-level intuition scientists are looking for. Here, we present the first scalable solution to explore and analyze high-dimensional functions often encountered in the scientific data analysis pipeline. By combining a new streaming neighborhood graph construction, the corresponding topology computation, and a novel data aggregation scheme, namely topology aware datacubes, we enable interactive exploration of both the topological and the geometric aspect of high-dimensional data. Following two use cases from high-energy-density (HED) physics and computational biology, we demonstrate how these capabilities have led to crucial new insights in both applications.

6.
IEEE Trans Pattern Anal Mach Intell ; 39(5): 922-936, 2017 05.
Article in English | MEDLINE | ID: mdl-28113699

ABSTRACT

Visual observations of dynamic phenomena, such as human actions, are often represented as sequences of smoothly-varying features. In cases where the feature spaces can be structured as Riemannian manifolds, the corresponding representations become trajectories on manifolds. Analysis of these trajectories is challenging due to non-linearity of underlying spaces and high-dimensionality of trajectories. In vision problems, given the nature of physical systems involved, these phenomena are better characterized on a low-dimensional manifold compared to the space of Riemannian trajectories. For instance, if one does not impose physical constraints of the human body, in data involving human action analysis, the resulting representation space will have highly redundant features. Learning an effective, low-dimensional embedding for action representations will have a huge impact in the areas of search and retrieval, visualization, learning, and recognition. Traditional manifold learning addresses this problem for static points in the Euclidean space, but its extension to Riemannian trajectories is non-trivial and remains unexplored. The difficulty lies in inherent non-linearity of the domain and temporal variability of actions that can distort any traditional metric between trajectories. To overcome these issues, we use the framework based on transported square-root velocity fields (TSRVF); this framework has several desirable properties, including a rate-invariant metric and vector space representations. We propose to learn an embedding such that each action trajectory is mapped to a single point in a low-dimensional Euclidean space, and the trajectories that differ only in temporal rates map to the same point. We utilize the TSRVF representation, and accompanying statistical summaries of Riemannian trajectories, to extend existing coding methods such as PCA, KSVD and Label Consistent KSVD to Riemannian trajectories or more generally to Riemannian functions. We show that such coding efficiently captures trajectories in applications such as action recognition, stroke rehabilitation, visual speech recognition, clustering and diverse sequence sampling. Using this framework, we obtain state-of-the-art recognition results, while reducing the dimensionality/complexity by a factor of 100-250x. Since these mappings and codes are invertible, they can also be used to interactively visualize Riemannian trajectories and synthesize actions.
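
For intuition about the square-root velocity idea, the sketch below computes the square-root velocity function of a discretized curve in flat Euclidean space, q(t) = v(t) / sqrt(|v(t)|), where v is the velocity. This is only the Euclidean special case: the TSRVF in the abstract additionally transports velocities along the manifold, which is not shown here. The function name `srvf` and the finite-difference discretization are assumptions for illustration.

```python
import math


def srvf(points, dt=1.0):
    """Square-root velocity representation of a sampled Euclidean curve.

    points: list of equal-length coordinate tuples sampled at spacing dt
    Returns one vector per segment, q = v / sqrt(|v|), with q = 0 where v = 0.
    """
    q = []
    for p0, p1 in zip(points, points[1:]):
        v = [(b - a) / dt for a, b in zip(p0, p1)]  # finite-difference velocity
        speed = math.sqrt(sum(c * c for c in v))
        if speed > 0:
            q.append([c / math.sqrt(speed) for c in v])
        else:
            q.append([0.0 for _ in v])
    return q


# Straight line traversed at constant speed 5.
line = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
q = srvf(line)
squared_norms = [sum(c * c for c in vec) for vec in q]
```

Each vector in `q` has squared norm equal to the local speed, so the squared L2 norm of the representation accumulates to the curve's length.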
