Results 1 - 13 of 13
1.
Sci Rep; 12(1): 597, 2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35022467

ABSTRACT

The rapid adoption of artificial intelligence methods in healthcare is coupled with the critical need for techniques to rigorously introspect models and thereby ensure that they behave reliably. This has led to the design of explainable AI techniques that uncover the relationships between discernible data signatures and model predictions. In this context, counterfactual explanations that synthesize small, interpretable changes to a given query while producing desired changes in model predictions have become popular. This under-constrained inverse problem is vulnerable to introducing irrelevant feature manipulations, particularly when the model's predictions are not well calibrated. Hence, in this paper, we propose TraCE (training calibration-based explainers), a technique that utilizes a novel uncertainty-based interval calibration strategy for reliably synthesizing counterfactuals. Given the widespread adoption of machine-learned solutions in radiology, our study focuses on deep models used for identifying anomalies in chest X-ray images. Through rigorous empirical studies, we demonstrate the superiority of TraCE explanations over several state-of-the-art baselines across widely adopted evaluation metrics. Our findings show that TraCE can be used to obtain a holistic understanding of deep models by enabling progressive exploration of decision boundaries, detecting shortcuts, and inferring relationships between patient attributes and disease severity.
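As a rough illustration of the counterfactual setup (not TraCE's calibration machinery, which the paper builds on top of this), the following PyTorch sketch perturbs a query toward a target class while penalizing distance from the original input; `model` and all hyperparameters are assumptions.

```python
# Hypothetical sketch: gradient-based counterfactual synthesis against a
# trained classifier. `model` is assumed to map a batch of inputs to logits.
import torch

def synthesize_counterfactual(model, x, target_class, steps=200, lr=0.05, lam=0.1):
    """Find a small perturbation of `x` that flips the model toward `target_class`.

    lam weights the proximity term keeping the counterfactual close to x.
    """
    x_cf = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        probs = model(x_cf.unsqueeze(0)).softmax(dim=-1)
        # Maximize target-class probability while staying near the query.
        loss = -torch.log(probs[0, target_class] + 1e-8) + lam * (x_cf - x).norm()
        loss.backward()
        optimizer.step()
    return x_cf.detach()
```

TraCE additionally relies on an uncertainty-based interval calibration of the predictor so that this optimization does not latch onto irrelevant features.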

2.
Front Big Data; 4: 589417, 2021.
Article in English | MEDLINE | ID: mdl-34337397

ABSTRACT

Interpretability has emerged as a crucial aspect of building trust in machine learning systems, aimed at providing insights into the working of complex neural networks that are otherwise opaque to a user. There is a plethora of existing solutions addressing various aspects of interpretability, ranging from identifying prototypical samples in a dataset to explaining image predictions or explaining misclassifications. While these diverse techniques address seemingly different aspects of interpretability, we hypothesize that a large family of interpretability tasks are variants of the same central problem: identifying relative change in a model's prediction. This paper introduces MARGIN, a simple yet general approach for addressing a large set of interpretability tasks. MARGIN exploits ideas rooted in graph signal analysis to determine influential nodes in a graph, defined as those nodes that maximally describe a function defined on the graph. By carefully defining task-specific graphs and functions, we demonstrate that MARGIN outperforms existing approaches in a number of disparate interpretability challenges.
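To make the graph-signal intuition concrete, here is a minimal sketch (an assumption-laden reading, not the paper's exact protocol) that scores nodes of a kNN graph by how sharply a function f varies at them, using the graph Laplacian as a high-pass filter.

```python
# Nodes where a function f defined on a kNN graph changes sharply relative
# to their neighborhood receive high scores. The kNN construction and the
# Laplacian high-pass filter here are illustrative choices.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def influence_scores(X, f, k=10):
    """X: (n, d) samples; f: (n,) function on the nodes (e.g., class labels)."""
    W = kneighbors_graph(X, n_neighbors=k, mode='connectivity')
    W = 0.5 * (W + W.T)                    # symmetrize the adjacency
    d = np.asarray(W.sum(axis=1)).ravel()
    L = np.diag(d) - W.toarray()           # combinatorial graph Laplacian
    return np.abs(L @ f)                   # high-pass response = local variation

# Nodes with the largest scores "maximally describe" f on the graph.
```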

3.
Front Artif Intell; 4: 589632, 2021.
Article in English | MEDLINE | ID: mdl-34179767

ABSTRACT

Dataset shift refers to the problem where the input data distribution may change over time (e.g., between the training and test stages). Since this can be a critical bottleneck in safety-critical applications such as healthcare and drug discovery, dataset shift detection has become an important research issue in machine learning. Though several existing efforts have focused on image/video data, applications with graph-structured data have not received sufficient attention. Therefore, in this paper, we investigate the problem of detecting shifts in graph-structured data through the lens of statistical hypothesis testing. Specifically, we propose a practical two-sample-test-based approach for shift detection in large-scale graph-structured data. Our approach is very flexible in that it is suitable for both undirected and directed graphs, and it eliminates the need for equal sample sizes. Using empirical studies, we demonstrate the effectiveness of the proposed test in detecting dataset shifts. We also corroborate these findings using real-world datasets characterized by directed graphs and a large number of nodes.
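For intuition, a hedged sketch of a two-sample shift test: the statistic below is a kernel MMD on scalar graph summaries (node degrees) with a permutation p-value. The paper's actual test statistic and graph representation may differ, but the unequal-sample-size flexibility carries over.

```python
# Illustrative two-sample shift test on graph summaries. `x` and `y` can be
# degree sequences of two graph snapshots and need not have equal lengths.
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    def k(a, b):   # Gaussian kernel; bandwidth is often set by a median heuristic
        return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def permutation_pvalue(x, y, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    observed = mmd2(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                # resample the group assignment
        if mmd2(pooled[:len(x)], pooled[len(x):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# p = permutation_pvalue(degrees_train, degrees_test); small p suggests shift.
```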

4.
IEEE Trans Neural Netw Learn Syst; 32(3): 1241-1253, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32305942

ABSTRACT

Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning (ML), and sequential optimization has become a popular solution. Typical examples include data summarization, sample mining for predictive modeling, and hyperparameter optimization. Existing solutions attempt to adaptively trade off between global exploration and local exploitation, in which the initial exploratory sample is critical to their success. While discrepancy-based samples have become the de facto approach for exploration, results from computer graphics suggest that coverage-based designs, e.g., Poisson disk sampling, can be a superior alternative. Coverage-based sample designs were originally developed for 2-D image analysis; to adopt them successfully in ML applications, we propose fundamental advances: constructing a parameterized family of designs with provably improved coverage characteristics and developing algorithms for effective sample synthesis. Using experiments in sample mining and hyperparameter optimization for supervised learning, we show that our approach consistently outperforms existing exploratory sampling methods in both blind exploration and sequential search with Bayesian optimization.
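A minimal dart-throwing sketch of Poisson disk sampling in d dimensions illustrates the coverage idea; the paper's parameterized family and synthesis algorithms go well beyond this naive version.

```python
# Dart throwing: accept a uniform candidate only if it lies at least `r`
# away from every accepted sample, yielding a coverage-style (rather than
# discrepancy-style) space-filling design in the unit hypercube.
import numpy as np

def poisson_disk(n_samples, dim, r, max_tries=10000, seed=0):
    rng = np.random.default_rng(seed)
    samples = []
    tries = 0
    while len(samples) < n_samples and tries < max_tries:
        candidate = rng.random(dim)
        if all(np.linalg.norm(candidate - s) >= r for s in samples):
            samples.append(candidate)
        tries += 1
    return np.array(samples)   # may return fewer points if r is too large

design = poisson_disk(100, dim=4, r=0.15)   # e.g., a hyperparameter design
```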

5.
Nat Commun; 11(1): 5622, 2020 Nov 6.
Article in English | MEDLINE | ID: mdl-33159053

ABSTRACT

Predictive models that accurately emulate complex scientific processes can achieve speed-ups over numerical simulators or experiments, and at the same time provide surrogates for improving the subsequent analysis. Consequently, there has been a recent surge in utilizing modern machine learning methods to build data-driven emulators. In this work, we study an often overlooked, yet important, problem: choosing loss functions while designing such emulators. Popular choices such as the mean squared error or the mean absolute error are based on a symmetric noise assumption and can be unsuitable for heterogeneous data or asymmetric noise distributions. We propose Learn-by-Calibrating, a novel deep learning approach based on interval calibration for designing emulators that can effectively recover the inherent noise structure without any explicit priors. Using a large suite of use cases, we demonstrate the efficacy of our approach in providing high-quality emulators, when compared to widely adopted loss function choices, even in small-data regimes.
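As a simplified stand-in for the Learn-by-Calibrating objective (the paper's exact loss is not reproduced here), the sketch below trains a network to emit an interval per input and trades empirical coverage against interval width; the point prediction can be read off as the interval midpoint.

```python
# Hedged sketch of interval-based emulator training: the network predicts an
# interval [mu - delta, mu + delta] per input, and the loss balances coverage
# of the targets against interval width.
import torch

def interval_loss(mu, delta, y, alpha=0.9):
    lower, upper = mu - delta, mu + delta
    # Penalize targets that fall outside the predicted interval...
    coverage_penalty = torch.relu(lower - y) + torch.relu(y - upper)
    # ...while discouraging needlessly wide intervals.
    width_penalty = (upper - lower).abs().mean()
    return coverage_penalty.mean() + (1 - alpha) * width_penalty
```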

6.
Sci Rep; 10(1): 16428, 2020 Oct 2.
Article in English | MEDLINE | ID: mdl-33009423

ABSTRACT

Effective patient care mandates rapid, yet accurate, diagnosis. With the abundance of non-invasive diagnostic measurements and electronic health records (EHR), manual interpretation for differential diagnosis has become time-consuming and challenging. This has led to widespread adoption of AI-powered tools, in pursuit of improving the accuracy and efficiency of this process. While the unique challenges presented by each modality and clinical task demand customized tools, the cumbersome process of making problem-specific choices has triggered the critical need for a generic solution that enables rapid model development in practice. In this spirit, we develop DDxNet, a deep architecture for time-varying clinical data, which we demonstrate to be well suited for diagnostic tasks involving different modalities (ECG/EEG/EHR), required levels of characterization (abnormality detection/phenotyping), and data fidelities (single-lead ECG/22-channel EEG). Using multiple benchmark problems, we show that DDxNet produces high-fidelity predictive models, and sometimes even provides significant performance gains over problem-specific solutions.
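DDxNet's architecture is not reproduced here; as a hedged reference point, the following minimal 1-D convolutional classifier shows the generic shape of a modality-agnostic model for multichannel clinical time series, with channel and class counts as placeholders.

```python
# Not DDxNet itself: a minimal 1-D convolutional baseline for multichannel
# clinical time series (e.g., 12-lead ECG), to make the setup concrete.
import torch
import torch.nn as nn

class TimeSeriesClassifier(nn.Module):
    def __init__(self, in_channels=12, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=7, padding=6, dilation=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # length-agnostic global pooling
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):              # x: (batch, channels, time)
        return self.head(self.features(x).squeeze(-1))
```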


Subjects
Deep Learning , Electrocardiography/methods , Electroencephalography/methods , Electronic Health Records , Patient Care/methods , Algorithms , Humans , Machine Learning , Models, Theoretical , Software
7.
Proc Natl Acad Sci U S A; 117(18): 9741-9746, 2020 May 5.
Article in English | MEDLINE | ID: mdl-32312816

ABSTRACT

Neural networks have become the method of choice in surrogate modeling because of their ability to characterize arbitrary, high-dimensional functions in a data-driven fashion. This paper advocates for the training of surrogates that are 1) consistent with the physical manifold, resulting in physically meaningful predictions, and 2) cyclically consistent with a jointly trained inverse model; i.e., backmapping predictions through the inverse results in the original input parameters. We find that these two consistencies lead to surrogates that are superior in terms of predictive performance, are more resilient to sampling artifacts, and tend to be more data efficient. Using inertial confinement fusion (ICF) as a test-bed problem, we model a one-dimensional semianalytic numerical simulator and demonstrate the effectiveness of our approach.
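The two consistencies can be summarized in a single training objective; the sketch below is a minimal reading, assuming a forward surrogate f, a jointly trained inverse g, and mean-squared-error terms (the weighting lam is a placeholder).

```python
# Sketch of the cyclic-consistency idea: the forward surrogate f maps
# simulator inputs x to outputs y, the inverse model g maps y back to x,
# and training penalizes both the prediction error and the round trip
# x -> f(x) -> g(f(x)).
import torch

def joint_loss(f, g, x, y, lam=1.0):
    y_pred = f(x)
    forward_loss = torch.nn.functional.mse_loss(y_pred, y)
    cycle_loss = torch.nn.functional.mse_loss(g(y_pred), x)  # backmapping consistency
    return forward_loss + lam * cycle_loss
```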

8.
IEEE Trans Vis Comput Graph; 26(1): 291-300, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31484123

ABSTRACT

With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization. First, the use of black-box models (e.g., deep neural networks) calls for advanced techniques for exploring and interpreting model behaviors. Second, the rapid growth in computing has produced enormous datasets that require techniques that can handle millions or more samples. Although some solutions to these interpretability challenges have been proposed, they typically do not scale beyond thousands of samples, nor do they provide the high-level intuition scientists are looking for. Here, we present the first scalable solution to explore and analyze high-dimensional functions often encountered in the scientific data analysis pipeline. By combining a new streaming neighborhood graph construction, the corresponding topology computation, and a novel data aggregation scheme, namely topology-aware datacubes, we enable interactive exploration of both the topological and the geometric aspects of high-dimensional data. Through two use cases from high-energy-density (HED) physics and computational biology, we demonstrate how these capabilities have led to crucial new insights in both applications.
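As a toy illustration of the topological viewpoint (far simpler than the streaming construction in the paper), the sketch below builds a neighborhood graph over samples and flags local maxima of a scalar function f.

```python
# Flag points whose function value dominates their k nearest neighbors;
# such local maxima are the kind of topological feature the full pipeline
# computes at scale.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_maxima(X, f, k=8):
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)            # idx[:, 0] is the point itself
    return [i for i in range(len(X)) if f[i] >= f[idx[i, 1:]].max()]
```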

9.
IEEE Trans Neural Netw Learn Syst; 31(10): 3977-3988, 2020 Oct.
Article in English | MEDLINE | ID: mdl-31725400

ABSTRACT

Modern data analysis pipelines are becoming increasingly complex due to the presence of multiview information sources. While graphs are effective in modeling complex relationships, in many scenarios a single graph is rarely sufficient to succinctly represent all interactions, and hence multilayered graphs have become popular. Though this leads to richer representations, extending solutions from the single-graph case is not straightforward. Consequently, there is a strong need for novel solutions to solve classical problems, such as node classification, in the multilayered case. In this article, we consider the problem of semisupervised learning with multilayered graphs. Though deep network embeddings, e.g., DeepWalk, are widely adopted for community discovery, we argue that feature learning with random node attributes, using graph neural networks, can be more effective. To this end, we propose to use attention models for effective feature learning and develop two novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the interlayer dependencies for building multilayered graph embeddings. Using empirical studies on several benchmark data sets, we evaluate the proposed approaches and demonstrate significant performance improvements in comparison with state-of-the-art network embedding strategies. The results also show that using simple random features is an effective choice, even in cases where explicit node attributes are not available.
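A minimal single-head graph attention layer over random node features, in NumPy, conveys the basic recipe; GrAMME-SG and GrAMME-Fusion additionally exploit interlayer dependencies across the multilayered graph, which this sketch omits.

```python
# Single-head graph attention on one layer of a graph, with random node
# attributes standing in for missing features.
import numpy as np

def graph_attention_layer(A, H, W, a):
    """A: (n, n) adjacency; H: (n, d) node features; W: (d, d') weights;
    a: (2*d',) attention vector. Returns updated node features."""
    Z = H @ W
    out = np.zeros_like(Z)
    for i in range(A.shape[0]):
        nbrs = np.flatnonzero(A[i])
        scores = np.array([a @ np.concatenate([Z[i], Z[j]]) for j in nbrs])
        alpha = np.exp(scores - scores.max())   # softmax over the neighborhood
        alpha /= alpha.sum()
        out[i] = (alpha[:, None] * Z[nbrs]).sum(axis=0)
    return np.maximum(out, 0)   # ReLU

rng = np.random.default_rng(0)
n, d, dp = 6, 4, 3
A = (rng.random((n, n)) < 0.5).astype(float)
np.fill_diagonal(A, 1)                          # keep self-loops
H = rng.standard_normal((n, d))                 # random node attributes
out = graph_attention_layer(A, H, rng.standard_normal((d, dp)),
                            rng.standard_normal(2 * dp))
```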

10.
Annu Int Conf IEEE Eng Med Biol Soc; 2018: 2571-2574, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30440933

ABSTRACT

Processing temporal sequences is central to a variety of applications in health care, and in particular multichannel electrocardiogram (ECG) is a highly prevalent diagnostic modality that relies on robust sequence modeling. While recurrent neural networks (RNNs) have led to significant advances in automated diagnosis with time-series data, they perform poorly when models are trained using a limited set of channels. A crucial limitation of existing solutions is that they rely solely on discriminative models, which tend to generalize poorly in such scenarios. In order to combat this limitation, we develop a generative modeling approach to limited-channel ECG classification. This approach first uses a Seq2Seq model to implicitly generate the missing channel information, and then uses the latent representation to perform the downstream supervised task. This decoupling enables the use of unsupervised data and also provides highly robust metric spaces for subsequent discriminative learning. Our experiments with the PhysioNet dataset clearly demonstrate the effectiveness of our approach over standard RNNs in disease prediction.
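A hedged sketch of the decoupled recipe: a Seq2Seq model maps the available ECG channel(s) to the missing ones, and its encoder's latent state feeds a separate classifier. Layer sizes and channel counts are placeholders, not the paper's configuration.

```python
# Stage 1: train this Seq2Seq model to reconstruct missing channels from the
# available ones. Stage 2: train a classifier on the encoder's latent state.
import torch
import torch.nn as nn

class ChannelSeq2Seq(nn.Module):
    def __init__(self, in_ch=1, out_ch=11, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(in_ch, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.project = nn.Linear(hidden, out_ch)

    def forward(self, x):                  # x: (batch, time, in_ch)
        enc_out, h = self.encoder(x)
        dec_out, _ = self.decoder(enc_out, h)
        return self.project(dec_out), h.squeeze(0)   # missing channels, latent

# Downstream: e.g., nn.Linear(128, n_classes) on the latent, with the
# Seq2Seq weights kept fixed.
```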


Subjects
Electrocardiography , Neural Networks, Computer
11.
IEEE Trans Vis Comput Graph; 24(1): 553-562, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28866574

ABSTRACT

Constructing distributed representations for words through neural language models and using the resulting vector spaces for analysis has become a crucial component of natural language processing (NLP). However, despite their widespread application, little is known about the structure and properties of these spaces. To gain insights into the relationship between words, the NLP community has begun to adapt high-dimensional visualization techniques. In particular, researchers commonly use t-distributed stochastic neighbor embeddings (t-SNE) and principal component analysis (PCA) to create two-dimensional embeddings for assessing the overall structure and exploring linear relationships (e.g., word analogies), respectively. Unfortunately, these techniques often produce mediocre or even misleading results and cannot address domain-specific visualization challenges that are crucial for understanding semantic relationships in word embeddings. Here, we introduce new embedding techniques for visualizing semantic and syntactic analogies, and the corresponding tests to determine whether the resulting views capture salient structures. Additionally, we introduce two novel views for a comprehensive study of analogy relationships. Finally, we augment t-SNE embeddings to convey uncertainty information in order to allow a reliable interpretation. Combined, the different views address a number of domain-specific tasks difficult to solve with existing tools.
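For the PCA-based analogy view mentioned above, a minimal sketch with placeholder vectors (real word embeddings would be substituted) projects the four analogy words onto the top two principal components.

```python
# Project word vectors onto the top two principal components to inspect a
# linear analogy such as king - man + woman ~ queen. `vectors` is a random
# stand-in for real embeddings.
import numpy as np
from sklearn.decomposition import PCA

words = ["king", "queen", "man", "woman"]
rng = np.random.default_rng(0)
vectors = rng.standard_normal((4, 50))       # placeholder embeddings

xy = PCA(n_components=2).fit_transform(vectors)
for word, (x, y) in zip(words, xy):
    print(f"{word}: ({x:.2f}, {y:.2f})")
```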

12.
IEEE Trans Neural Netw Learn Syst; 26(9): 1913-1926, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25343771

ABSTRACT

Sparse representations using learned dictionaries are being increasingly used with success in several data processing and machine learning applications. The increasing need for learning sparse models in large-scale applications motivates the development of efficient, robust, and provably good dictionary learning algorithms. Algorithmic stability and generalizability are desirable characteristics for dictionary learning algorithms that aim to build global dictionaries, which can efficiently model any test data similar to the training samples. In this paper, we propose an algorithm to learn dictionaries for sparse representations from large-scale data, and prove that the proposed learning algorithm is stable and generalizable asymptotically. The algorithm employs a 1-D subspace clustering procedure, K-hyperline clustering, to learn a hierarchical dictionary with multiple levels. We also propose an information-theoretic scheme to estimate the number of atoms needed in each level of learning and develop an ensemble approach to learn robust dictionaries. Using the proposed dictionaries, the sparse code for novel test data can be computed using a low-complexity pursuit procedure. We demonstrate the stability and generalization characteristics of the proposed algorithm using simulations. We also evaluate the utility of the multilevel dictionaries in compressed recovery and subspace learning applications.
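The 1-D subspace clustering step can be sketched directly: alternate between assigning samples to the nearest line through the origin and refitting each line's direction as the principal direction of its cluster. Initialization and stopping details here are simplifications.

```python
# K-hyperline clustering: each cluster is a line through the origin, so the
# update step is a rank-1 SVD of the assigned samples.
import numpy as np

def k_hyperline(X, K, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((K, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)   # unit line directions
    for _ in range(n_iter):
        assign = np.abs(X @ D.T).argmax(axis=1)     # nearest line per sample
        for k in range(K):
            cluster = X[assign == k]
            if len(cluster):
                # top right-singular vector = principal direction of the cluster
                _, _, Vt = np.linalg.svd(cluster, full_matrices=False)
                D[k] = Vt[0]
    assign = np.abs(X @ D.T).argmax(axis=1)          # final assignment
    return D, assign
```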


Subjects
Algorithms , Learning , Neural Networks, Computer , Cluster Analysis , Computer Simulation , Humans , Pattern Recognition, Automated
13.
IEEE Trans Image Process; 23(7): 2905-2915, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24833593

ABSTRACT

In complex visual recognition tasks, it is typical to adopt multiple descriptors, which describe different aspects of the images, to obtain improved recognition performance. Descriptors that have diverse forms can be fused into a unified feature space in a principled manner using kernel methods. Sparse models that generalize well to the test data can be learned in the unified kernel space, and appropriate constraints can be incorporated for application in supervised and unsupervised learning. In this paper, we propose to perform sparse coding and dictionary learning in the multiple kernel space, where the weights of the ensemble kernel are tuned based on graph-embedding principles such that class discrimination is maximized. In our proposed algorithm, dictionaries are inferred using multiple levels of 1-D subspace clustering in the kernel space, and the sparse codes are obtained using a simple levelwise pursuit scheme. Empirical results for object recognition and image clustering show that our algorithm outperforms existing sparse coding-based approaches, and compares favorably to other state-of-the-art methods.
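The ensemble-kernel idea, in a deliberately simplified form: weight each base kernel by a within-class versus between-class similarity ratio. The paper tunes these weights via graph-embedding principles; this heuristic only conveys the flavor.

```python
# Combine base kernels (one per descriptor) into an ensemble kernel, with
# each weight reflecting how well that kernel separates the classes.
import numpy as np

def ensemble_kernel(kernels, labels):
    """kernels: list of (n, n) Gram matrices; labels: (n,) class labels."""
    same = labels[:, None] == labels[None, :]
    weights = np.array([K[same].mean() / K[~same].mean() for K in kernels])
    weights /= weights.sum()
    return sum(w * K for w, K in zip(weights, kernels)), weights
```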


Subjects
Algorithms , Artificial Intelligence , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Cluster Analysis , Databases, Factual , Flowers