Results 1 - 10 of 10
1.
IEEE Trans Pattern Anal Mach Intell ; 40(6): 1323-1337, 2018 Jun.
Article in English | MEDLINE | ID: mdl-28641245

ABSTRACT

A typical objective of data visualization is to generate low-dimensional plots that maximally convey the information within the data. The visualization output should help the user not only identify the local neighborhood structure of individual samples, but also obtain a global view of the relative positioning and separation between cohorts. Here, we propose a novel visualization framework designed to satisfy these needs. By incorporating additional cohort positioning and discriminative constraints into local neighbor preservation models through the use of computed cohort prototypes, effective control over the arrangements and proximities of data cohorts can be obtained. We introduce various embedding and projection algorithms based on objective functions addressing the different visualization requirements. Their underlying models are optimized effectively using matrix manifold procedures to incorporate the problem constraints. Additionally, to facilitate large-scale applications, a matrix decomposition based model is also proposed to accelerate the computation. The improved capabilities of the new methods are demonstrated using various state-of-the-art dimensionality reduction algorithms. We present many qualitative and quantitative comparisons, on both synthetic problems and real-world tasks of complex text and image data, that show notable improvements over existing techniques.

2.
IEEE Trans Pattern Anal Mach Intell ; 38(5): 833-48, 2016 May.
Article in English | MEDLINE | ID: mdl-26353365

ABSTRACT

This work is related to the combinatorial data analysis problem of seriation used for data visualization and exploratory analysis. Seriation re-sequences the data, so that more similar samples or objects appear closer together, whereas dissimilar ones are further apart. Despite the large number of current algorithms to realize such re-sequencing, there has not been a systematic way for analyzing the resulting sequences, comparing them, or fusing them to obtain a single unifying one. We propose a new positional proximity measure that evaluates the similarity of two arbitrary sequences based on their agreement on pairwise positional information of the sequenced objects. Furthermore, we present various statistical properties of this measure as well as its normalized version modeled as an instance of the generalized correlation coefficient. Based on this measure, we define a new procedure for consensus seriation that fuses multiple arbitrary sequences based on a quadratic assignment problem formulation and an efficient way of approximating its solution. We also derive theoretical links with other permutation distance functions and present their associated combinatorial optimization forms for consensus tasks. The utility of the proposed contributions is demonstrated through the comparison and fusion of multiple seriation algorithms we have implemented, using many real-world datasets from different application domains.
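
The abstract above does not give the formula for the proposed positional proximity measure, so it is not reproduced here. As a rough illustration of the general idea of comparing two sequences through pairwise order information, the following sketch uses the Kendall tau distance, a well-known permutation distance. It is a stand-in chosen for illustration and should not be read as the authors' measure; the function names are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(seq_a, seq_b):
    """Count the object pairs that the two sequences place in opposite order.

    Assumption: both sequences contain the same distinct, hashable objects.
    """
    if len(seq_a) != len(seq_b) or len(set(seq_a)) != len(seq_a) \
            or set(seq_a) != set(seq_b):
        raise ValueError("sequences must be permutations of the same distinct objects")
    pos_a = {obj: i for i, obj in enumerate(seq_a)}
    pos_b = {obj: i for i, obj in enumerate(seq_b)}
    discordant = 0
    for x, y in combinations(seq_a, 2):
        # A pair is discordant when its relative order differs between sequences.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant


def normalized_agreement(seq_a, seq_b):
    """Rescale the distance to [-1, 1]: 1 for identical order, -1 for reversed."""
    n = len(seq_a)
    if n < 2:
        raise ValueError("need at least two objects to compare orderings")
    total_pairs = n * (n - 1) // 2
    return 1.0 - 2.0 * kendall_tau_distance(seq_a, seq_b) / total_pairs
```

For instance, `kendall_tau_distance(list("abcd"), list("bacd"))` returns 1, because only the pair (a, b) changes order.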

3.
IEEE Trans Biomed Eng ; 52(9): 1549-62, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16189968

ABSTRACT

In recent years, the use of motion tracking systems for the acquisition of functional biomechanical gait data has received increasing interest due to the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, and this makes the dimensionality of the collected data far higher than the number of available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small sample size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for selection of raw kinematic variables. Finally, we propose a novel integrated method which fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons are made using robust resampling techniques such as the .632+ bootstrap and k-fold cross-validation. Results from experiments with lesion subjects suffering from pathological plantar hyperkeratosis show that the proposed method can lead to approximately 96% correct classification rates with less than 10% of the original features.
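
To make the k-fold cross-validation mentioned above concrete, here is a minimal sketch of how sample indices can be split into folds so that every sample is tested exactly once. This matters when few subjects are available. The function name, the shuffling seed, and the round-robin fold assignment are illustrative assumptions, not details taken from the paper.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumption: folds are built by shuffling once and dealing indices
    round-robin, so fold sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out, test_idx in enumerate(folds):
        # Training set: every fold except the one currently held out.
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx
```

A classifier would be fitted on each `train_idx` and scored on the matching `test_idx`, with the k scores averaged into one error estimate.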


Subject(s)
Artificial Intelligence, Computer-Assisted Diagnosis/methods, Foot Dermatoses/diagnosis, Foot Dermatoses/physiopathology, Foot/physiopathology, Palmoplantar Keratoderma/diagnosis, Palmoplantar Keratoderma/physiopathology, Biological Models, Adult, Algorithms, Biomechanical Phenomena/methods, Computer Simulation, Discriminant Analysis, Female, Gait, Humans, Leg/physiopathology, Male, Middle Aged, Automated Pattern Recognition/methods, Pressure Ulcer/diagnosis, Pressure Ulcer/physiopathology, Reproducibility of Results, Sensitivity and Specificity
4.
IEEE Trans Pattern Anal Mach Intell ; 35(10): 2340-56, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23969381

ABSTRACT

In this paper, we study the co-embedding problem of how to map different types of patterns into one common low-dimensional space, given only the associations (relation values) between samples. We conduct a generic analysis to discover the commonalities between existing co-embedding algorithms and indirectly related approaches and investigate possible factors controlling the shapes and distributions of the co-embeddings. The primary contribution of this work is a novel method for computing co-embeddings, termed the automatic co-embedding with adaptive shaping (ACAS) algorithm, based on an efficient transformation of the co-embedding problem. Its advantages include flexible model adaptation to the given data, an economical set of model variables leading to a parametric co-embedding formulation, and a robust model fitting criterion for model optimization based on a quantization procedure. The secondary contribution of this work is the introduction of a set of generic schemes for the qualitative analysis and quantitative assessment of the output of co-embedding algorithms, using existing labeled benchmark datasets. Experiments with synthetic and real-world datasets show that the proposed algorithm is very competitive compared to existing ones.


Subject(s)
Algorithms, Artificial Intelligence, Computer-Assisted Image Interpretation/methods, Automated Pattern Recognition/methods, Reproducibility of Results, Sensitivity and Specificity
5.
IEEE Trans Neural Netw Learn Syst ; 24(10): 1575-87, 2013 Oct.
Article in English | MEDLINE | ID: mdl-24808595

ABSTRACT

Spectral embedding methods have played a very important role in dimensionality reduction and feature generation in machine learning. Supervised spectral embedding methods additionally improve the classification of labeled data, using proximity information that considers both features and class labels. However, these methods calculate the proximity information by treating all intraclass similarities homogeneously for all classes, and similarly for all interclass samples. In this paper, we propose a novel and generic method which can treat all the intra- and interclass sample similarities heterogeneously by potentially using a different proximity function for each class and each class pair. To handle the complexity of selecting these functions, we employ evolutionary programming as a powerful automated formula induction engine. In addition, for computational efficiency and expressive power, we use a compact matrix tree representation equipped with a broad set of functions that can build most currently used similarity functions as well as new ones. Model selection is data driven, because the entire model is symbolically instantiated using only problem training data, and no user-selected functions or parameters are required. We perform thorough comparative experiments with multiple classification datasets and many existing state-of-the-art embedding methods, which show that the proposed algorithm is very competitive in terms of classification accuracy and generalization ability.

6.
IEEE Trans Pattern Anal Mach Intell ; 34(11): 2216-32, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23289130

ABSTRACT

This paper is about supervised and semi-supervised dimensionality reduction (DR) by generating spectral embeddings from multi-output data based on the pairwise proximity information. Two flexible and generic frameworks are proposed to achieve supervised DR (SDR) for multilabel classification. One is able to extend any existing single-label SDR to multilabel via sample duplication, referred to as MESD. The other is a multilabel design framework that tackles the SDR problem by computing weight (proximity) matrices based on simultaneous feature and label information, referred to as MOPE, as a generalization of many current techniques. A diverse set of schemes for label-based proximity calculation, as well as a mechanism for combining label-based and feature-based weight information by considering information importance and prioritization, are proposed for MOPE. Additionally, we summarize many current spectral methods for unsupervised DR (UDR), single/multilabel SDR, and semi-supervised DR (SSDR) and express them under a common template representation as a general guide to researchers in the field. We also propose a general framework for achieving SSDR by combining existing SDR and UDR models, as well as a procedure for reducing the computational cost via learning with a target set of relation features. The effectiveness of our proposed methodologies is demonstrated through experiments on document collections for multilabel text categorization from the natural language processing domain.


Subject(s)
Algorithms, Artificial Intelligence, Electronic Data Processing, Computer-Assisted Image Interpretation/methods, Natural Language Processing, Automated Pattern Recognition/methods, Subtraction Technique, Image Enhancement/methods, Reproducibility of Results, Sensitivity and Specificity
7.
IEEE Trans Neural Netw Learn Syst ; 23(7): 1169-75, 2012 Jul.
Article in English | MEDLINE | ID: mdl-24807143

ABSTRACT

The operation of instance-based learning algorithms is based on storing a large set of prototypes in the system's database. However, such systems often experience issues with storage requirements, sensitivity to noise, and computational complexity, which result in high search and response times. In this brief, we introduce a novel framework that employs spectral graph theory to efficiently partition the dataset into border and internal instances. This is achieved by using a diverse set of border-discriminating features that capture the local friend and enemy profiles of the samples. The fused information from these features is then used via a graph-cut modeling approach to generate the final dataset partitions of border and nonborder samples. The proposed method is referred to as the spectral instance reduction (SIR) algorithm. Experiments with a large number of datasets show that SIR performs competitively compared to many other reduction algorithms, in terms of both objectives of classification accuracy and data condensation.

8.
IEEE Trans Neural Netw ; 21(8): 1281-95, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20624706

ABSTRACT

Projection techniques are frequently used as the principal means for the implementation of feature extraction and dimensionality reduction for machine learning applications. A well-established and broad class of such projection techniques is projection pursuit (PP). Its core design parameter is a projection index, which is the driving force in obtaining the transformation function via optimization, and represents in an explicit or implicit way the user's perception of the useful information contained within the datasets. This paper seeks to address the problem related to the design of PP index functions for the linear feature extraction case. We achieve this using an evolutionary search framework, capable of building new indices to fit the properties of the available datasets. The high expressive power of this framework is sustained by a rich set of function primitives. The performance of several PP indices previously proposed by human experts is compared with these automatically generated indices for the task of classification, and results show a decrease in the classification errors.


Subject(s)
Algorithms, Artificial Intelligence, Data Mining/standards, Linear Models, Neural Networks (Computer), Automated Pattern Recognition/standards, Artifacts, Computational Biology/methods, Humans, Software/standards
9.
IEEE Trans Inf Technol Biomed ; 14(2): 418-24, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19726270

ABSTRACT

Plantar lesions induced by biomechanical dysfunction pose a considerable socioeconomic health care challenge, and failure to detect lesions early can have significant effects on patient prognoses. Most of the previous works on plantar lesion identification employed the analysis of biomechanical microenvironment variables like pressure and thermal fields. This paper focuses on foot kinematics and applies kernel principal component analysis (KPCA) for nonlinear dimensionality reduction of features, followed by Fisher's linear discriminant analysis for the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. Performance comparisons are made using leave-one-out cross-validation. Results show that the proposed method can lead to approximately 94% correct classification rates, with a reduction of feature dimensionality from 2100 to 46, without any manual preprocessing or elaborate feature extraction methods. The results imply that foot kinematics contain information that is highly relevant to pathology classification and also that the nonlinear KPCA approach has considerable power in unraveling abstract biomechanical features into a relatively low-dimensional pathology-relevant space.
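
As a small illustration of the first stage of KPCA named above, the sketch below builds a kernel (Gram) matrix and centers it in feature space. The abstract does not state which kernel was used, so the RBF kernel and the `gamma` value are assumptions, and the function names are hypothetical. The later stages (eigendecomposition of the centered matrix, projection onto the leading components, and Fisher's linear discriminant analysis) are omitted.

```python
import math


def rbf_kernel_matrix(X, gamma=1.0):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2).

    Assumption: an RBF kernel; the source does not specify the kernel.
    """
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def center_kernel_matrix(K):
    """Double-center a symmetric kernel matrix.

    Equivalent to subtracting the feature-space mean from every mapped
    sample, which KPCA requires before the eigendecomposition.
    """
    n = len(K)
    # K is symmetric, so row means and column means coincide.
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]
```

After centering, every row and column of the matrix sums to zero up to rounding error, which is a quick sanity check before extracting components.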


Subject(s)
Foot Ulcer/physiopathology, Gait/physiology, Palmoplantar Keratoderma/physiopathology, Automated Pattern Recognition/methods, Pressure Ulcer/physiopathology, Algorithms, Artificial Intelligence, Biomechanical Phenomena, Discriminant Analysis, Foot/physiopathology, Humans, Biological Models, Nonlinear Dynamics, Principal Component Analysis, Reproducibility of Results, Sensitivity and Specificity
10.
IEEE Trans Biomed Eng ; 56(3): 871-9, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19272902

ABSTRACT

Driven by the demands on healthcare resulting from the shift toward more sedentary lifestyles, considerable effort has been devoted to the monitoring and classification of human activity. In previous studies, various classification schemes and feature extraction methods have been used to identify different activities from a range of different datasets. In this paper, we present a comparison of 14 methods to extract classification features from accelerometer signals. These are based on the wavelet transform and other well-known time- and frequency-domain signal characteristics. To allow an objective comparison between the different features, we used two datasets of activities collected from 20 subjects. The first set comprised three commonly used activities, namely, level walking, stair ascent, and stair descent, and the second a total of eight activities. Furthermore, we compared the classification accuracy for each feature set across different combinations of three different accelerometer placements. The classification analysis has been performed with robust subject-based cross-validation methods using a nearest-neighbor classifier. The findings show that, although the wavelet transform approach can be used to characterize nonstationary signals, it does not perform as accurately as frequency-based features when classifying dynamic activities performed by healthy subjects. Overall, the best feature sets achieved over 95% intersubject classification accuracy.
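
The evaluation protocol described above, a nearest-neighbor classifier scored with subject-based cross-validation, can be sketched as follows. Leave-one-subject-out is assumed here as one common form of subject-based validation; the abstract does not say which variant was used. The Euclidean distance and the function names are likewise illustrative assumptions.

```python
import math


def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the training sample closest to x (Euclidean, 1-NN)."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]


def leave_one_subject_out_accuracy(X, y, subjects):
    """Intersubject accuracy: each subject's samples are classified using
    only the samples of the other subjects."""
    if len(set(subjects)) < 2:
        raise ValueError("need samples from at least two subjects")
    correct = 0
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test:
            if nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]:
                correct += 1
    return correct / len(X)
```

Keeping all samples from one subject on the same side of the split prevents the classifier from being tested on a person it has already seen, which is why intersubject accuracy is usually a stricter figure than accuracy from a random split.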


Subject(s)
Ambulatory Monitoring/methods, Movement, Computer-Assisted Signal Processing, Adult, Algorithms, Ankle/physiology, Female, Humans, Locomotion/physiology, Male, Theoretical Models, Reproducibility of Results, Thigh/physiology