1.
Neural Netw ; 176: 106345, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38733798

ABSTRACT

Local Interpretable Model-agnostic Explanations (LIME) is a well-known post-hoc technique for explaining black-box models. While very useful, recent research highlights challenges around the explanations generated. In particular, there is a potential lack of stability, where the explanations provided vary over repeated runs of the algorithm, casting doubt on their reliability. This paper investigates the stability of LIME when applied to multivariate time series classification. We demonstrate that the traditional methods for generating neighbours used in LIME carry a high risk of creating 'fake' neighbours, which are out-of-distribution with respect to the trained model and far away from the input to be explained. This risk is particularly pronounced for time series data because of their substantial temporal dependencies. We discuss how these out-of-distribution neighbours contribute to unstable explanations. Furthermore, LIME weights neighbours based on user-defined hyperparameters which are problem-dependent and hard to tune. We show how unsuitable hyperparameters can impact the stability of explanations. We propose a two-fold approach to address these issues. First, a generative model is employed to approximate the distribution of the training data set, from which within-distribution samples and thus meaningful neighbours can be created for LIME. Second, an adaptive weighting method is designed in which the hyperparameters are easier to tune than those of the traditional method. Experiments on real-world data sets demonstrate the effectiveness of the proposed method in providing more stable explanations using the LIME framework. In addition, in-depth discussions are provided on the reasons behind these results.
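The sensitivity to the weighting hyperparameter described above can be illustrated with a small sketch. It is not the method proposed in the paper: it assumes a plain exponential kernel of the kind commonly used to weight LIME neighbours, a naive Gaussian perturbation as the neighbour generator, and a toy flattened series of 20 values. The names `kernel_weights`, `effective_sample_size` and `sample_neighbours` are hypothetical. The sketch reports how many neighbours effectively contribute to the local fit for different kernel widths.

```python
import math
import random


def kernel_weights(distances, width):
    """Exponential kernel (assumed form): distant neighbours get weights near zero."""
    return [math.exp(-(d * d) / (width * width)) for d in distances]


def effective_sample_size(weights):
    """Kish effective sample size: how many neighbours really influence the fit."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def sample_neighbours(x, n, scale, rng):
    """Naive neighbour generator: add independent Gaussian noise to every value."""
    return [[xi + rng.gauss(0.0, scale) for xi in x] for _ in range(n)]


def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


rng = random.Random(0)
x = [0.0] * 20  # toy input: a flattened series of 20 time steps
neighbours = sample_neighbours(x, 500, 1.0, rng)
distances = [euclidean(x, z) for z in neighbours]

for width in (0.5, 2.0, 8.0):
    ess = effective_sample_size(kernel_weights(distances, width))
    print(f"width={width}: effective neighbours = {ess:.1f} of {len(neighbours)}")
```

With a narrow kernel only a handful of the sampled neighbours carry meaningful weight, so the local surrogate is fitted on very few points and can change between runs; a wide kernel uses almost all neighbours, including those far from the input.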


Subject(s)
Algorithms , Time Factors , Neural Networks, Computer
2.
Lancet Digit Health ; 4(12): e862-e872, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36333179

ABSTRACT

BACKGROUND: Idiopathic pulmonary fibrosis is a progressive fibrotic lung disease with a variable clinical trajectory. Decline in forced vital capacity (FVC) is the main indicator of progression; however, missingness prevents long-term analysis of patterns in lung function. We aimed to identify distinct clusters of lung function trajectory among patients with idiopathic pulmonary fibrosis using machine learning techniques. METHODS: We did a secondary analysis of longitudinal data on FVC collected from a cohort of patients with idiopathic pulmonary fibrosis from the PROFILE study, a multicentre, prospective, observational cohort study. We evaluated the imputation performance of conventional and machine learning techniques to impute missing data and then analysed the fully imputed dataset by unsupervised clustering using self-organising maps. We compared anthropometric features, genomic associations, serum biomarkers, and clinical outcomes between clusters. We also performed a replication of the analysis on data from a cohort of patients with idiopathic pulmonary fibrosis from an independent dataset, obtained from the Chicago Consortium. FINDINGS: 415 (71%) of 581 participants recruited into the PROFILE study were eligible for further analysis. An unsupervised machine learning algorithm had the lowest imputation error among tested methods, and self-organising maps identified four distinct clusters (1-4), which was confirmed by sensitivity analysis. Cluster 1 comprised 140 (34%) participants and was associated with a disease trajectory showing a linear decline in FVC over 3 years. Cluster 2 comprised 100 (24%) participants and was associated with a trajectory showing an initial improvement in FVC before subsequently decreasing. Cluster 3 comprised 113 (27%) participants and was associated with a trajectory showing an initial decline in FVC before subsequent stabilisation. Cluster 4 comprised 62 (15%) participants and was associated with a trajectory showing stable lung function. Median survival was shortest in cluster 1 (2·87 years [IQR 2·29-3·40]) and cluster 3 (2·23 years [1·75-3·84]), followed by cluster 2 (4·74 years [3·96-5·73]), and was longest in cluster 4 (5·56 years [5·18-6·62]). Baseline FEV1 to FVC ratio and concentrations of the biomarker SP-D were significantly higher in clusters 1 and 3. Similar lung function clusters with some shared anthropometric features were identified in the replication cohort. INTERPRETATION: Using a data-driven unsupervised approach, we identified four clusters of lung function trajectory with distinct clinical and biochemical features. Enriching or stratifying longitudinal spirometric data into clusters might optimise evaluation of intervention efficacy during clinical trials and patient management. FUNDING: National Institute for Health and Care Research, Medical Research Council, and GlaxoSmithKline.


Subject(s)
Idiopathic Pulmonary Fibrosis , Humans , Idiopathic Pulmonary Fibrosis/drug therapy , Idiopathic Pulmonary Fibrosis/genetics , Prospective Studies , Vital Capacity , Cohort Studies , Biomarkers
3.
IEEE Trans Cybern ; 45(4): 622-34, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25014988

ABSTRACT

Self-labeled techniques are semi-supervised classification methods that address the shortage of labeled examples via a self-learning process based on supervised models. They progressively classify unlabeled data and use them to modify the hypothesis learned from labeled samples. Most relevant proposals are currently inspired by boosting schemes to iteratively enlarge the labeled set. Despite their effectiveness, these methods are constrained by the number of labeled examples and their distribution, which in many cases is sparse and scattered. The aim of this paper is to design a framework, named synthetic examples generation for self-labeled semi-supervised classification, to improve the classification performance of any given self-labeled method by using synthetic labeled data. These are generated via an oversampling technique and a positioning adjustment model that use both labeled and unlabeled examples as reference. Next, these examples are incorporated in the main stages of the self-labeling process. The principal aspects of the proposed framework are: 1) introducing diversity into the multiple classifiers by supplying additional (new) labeled data; 2) filling in the labeled data distribution with the aid of unlabeled data; and 3) being applicable to any kind of self-labeled method. In our empirical studies, we have applied this scheme to four recent self-labeled methods, testing their capabilities with a large number of data sets. We show that this framework significantly improves the classification capabilities of self-labeled techniques.
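The oversampling step mentioned above can be sketched as follows. The abstract does not specify which oversampling technique is used, so this sketch assumes a SMOTE-style interpolation between a labeled example and its nearest labeled neighbour of the same class; it omits the positioning adjustment model and the use of unlabeled examples. The function names and the toy data are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def interpolate(a, b, t):
    """Point at fraction t along the segment from a to b."""
    return [ai + t * (bi - ai) for ai, bi in zip(a, b)]


def synthesize(labeled, per_example, rng):
    """Create synthetic labeled points between each example and its
    nearest same-class neighbour (SMOTE-style, assumed here)."""
    synthetic = []
    for i, (x, y) in enumerate(labeled):
        same_class = [z for j, (z, yz) in enumerate(labeled) if yz == y and j != i]
        if not same_class:
            continue  # a lone example has no partner to interpolate with
        nearest = min(same_class, key=lambda z: squared_distance(x, z))
        for _ in range(per_example):
            synthetic.append((interpolate(x, nearest, rng.random()), y))
    return synthetic


# Toy labeled set: two examples per class.
labeled = [([0.0, 0.0], "a"), ([1.0, 0.0], "a"), ([5.0, 5.0], "b"), ([6.0, 5.0], "b")]
augmented = labeled + synthesize(labeled, per_example=3, rng=random.Random(1))
print(len(labeled), "labeled examples ->", len(augmented), "after augmentation")
```

The augmented set could then be passed to any self-labeled method in place of the original labeled set, which is how the framework stays independent of the underlying method.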

4.
IEEE Trans Syst Man Cybern B Cybern ; 42(5): 1383-97, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22531787

ABSTRACT

Cooperative coevolution is a successful trend in evolutionary computation which allows us to define partitions of the domain of a given problem, or to integrate several related techniques into one, by the use of evolutionary algorithms. It can be applied to the development of advanced classification methods, which integrate several machine learning techniques into a single proposal. A novel approach integrating instance selection, instance weighting, and feature weighting into the framework of a coevolutionary model is presented in this paper. We compare it with a wide range of evolutionary and nonevolutionary related methods, in order to show the benefits of using coevolution to apply the considered techniques simultaneously. The results obtained, contrasted through nonparametric statistical tests, show that our proposal outperforms the other methods in the comparison, thus becoming a suitable tool in the task of enhancing the nearest neighbor classifier.


Subject(s)
Algorithms , Artificial Intelligence , Decision Support Techniques , Models, Theoretical , Pattern Recognition, Automated/methods , Computer Simulation
5.
IEEE Trans Neural Netw Learn Syst ; 23(11): 1841-7, 2012 Nov.
Article in English | MEDLINE | ID: mdl-24808077

ABSTRACT

In this brief, we present a novel model fitting procedure for the neuro-coefficient smooth transition autoregressive model (NCSTAR), as presented by Medeiros and Veiga. The model is endowed with a statistically founded iterative building procedure and can be interpreted in terms of fuzzy rule-based systems. The interpretability of the generated models and a mathematically sound building procedure are two very important properties of forecasting models. The model fitting procedure employed by the original NCSTAR is a combination of initial parameter estimation by a grid search procedure with a traditional local search algorithm. We propose a different fitting procedure, using a memetic algorithm, in order to obtain more accurate models. An empirical evaluation of the method is performed, applying it to various real-world time series originating from three forecasting competitions. The results indicate that we can significantly enhance the accuracy of the models, making them competitive with models commonly used in the field.

6.
IEEE Trans Neural Netw ; 21(12): 1984-90, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21041159

ABSTRACT

Nearest prototype methods are a successful trend in many pattern classification tasks. However, they present several shortcomings such as response time, noise sensitivity, and storage requirements. Data reduction techniques are suitable to alleviate these drawbacks. Prototype generation is an appropriate process for data reduction, which allows the fitting of a dataset for nearest neighbor (NN) classification. This brief presents a methodology to learn iteratively the positioning of prototypes using real parameter optimization procedures. Concretely, we propose an iterative prototype adjustment technique based on differential evolution. The results obtained are contrasted with nonparametric statistical tests and show that our proposal consistently outperforms previously proposed methods, thus becoming a suitable tool in the task of enhancing the performance of the NN classifier.
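The idea of adjusting prototype positions with differential evolution can be sketched as below. This is not the technique proposed in the brief: it assumes the basic DE/rand/1/bin scheme, a fixed set of one prototype per class, training accuracy as the fitness, and a toy two-cluster data set. The names, the initialisation range and the parameters `f` and `cr` are illustrative assumptions.

```python
import random


def nearest_label(point, prototypes, labels):
    """Label of the prototype closest to point (1-NN over the prototypes)."""
    dists = [sum((p - q) ** 2 for p, q in zip(point, proto)) for proto in prototypes]
    return labels[dists.index(min(dists))]


def accuracy(flat, labels, data, dim):
    """Training accuracy of the prototypes encoded in one flat vector."""
    protos = [flat[i:i + dim] for i in range(0, len(flat), dim)]
    hits = sum(nearest_label(x, protos, labels) == y for x, y in data)
    return hits / len(data)


def differential_evolution(data, labels, dim, pop_size=12, gens=40, f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin over prototype coordinates (assumed variant)."""
    rng = random.Random(seed)
    size = dim * len(labels)
    # Initialisation range chosen to cover the toy data only.
    pop = [[rng.uniform(-1.0, 6.0) for _ in range(size)] for _ in range(pop_size)]
    fit = [accuracy(ind, labels, data, dim) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(size)  # guarantees at least one mutated gene
            trial = [
                pop[a][k] + f * (pop[b][k] - pop[c][k])
                if (rng.random() < cr or k == forced) else pop[i][k]
                for k in range(size)
            ]
            trial_fit = accuracy(trial, labels, data, dim)
            if trial_fit >= fit[i]:  # greedy selection: never accept a worse vector
                pop[i], fit[i] = trial, trial_fit
    best = max(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]


data = [([0.0, 0.2], 0), ([0.3, 0.1], 0), ([0.1, 0.4], 0),
        ([5.0, 5.1], 1), ([4.8, 5.3], 1), ([5.2, 4.9], 1)]
labels = [0, 1]  # one prototype per class
best, acc = differential_evolution(data, labels, dim=2)
print("prototype coordinates:", [round(v, 2) for v in best])
print("training accuracy:", acc)
```

Because selection is greedy, the best fitness in the population cannot decrease from one generation to the next; two prototypes here stand in for the six training points, which is the data reduction the abstract refers to.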


Subject(s)
Artificial Intelligence , Classification/methods , Pattern Recognition, Automated/methods , Algorithms , Cluster Analysis , Information Storage and Retrieval