1.
IEEE Trans Pattern Anal Mach Intell ; 32(10): 1809-21, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20724758

ABSTRACT

Object matching is a fundamental operation in data analysis. It typically requires the definition of a similarity measure between the classes of objects to be matched. Instead, we develop an approach that performs matching while requiring a similarity measure only within each of the classes. This is achieved by maximizing the dependency between matched pairs of observations by means of the Hilbert-Schmidt Independence Criterion. This can be cast as a quadratic assignment problem with special structure, and we present a simple algorithm for finding a locally optimal solution.
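As an illustration of the idea only (a minimal sketch, not the authors' implementation): with a Gaussian kernel inside each class, maximizing the empirical HSIC over permutations can be attacked by repeatedly solving linear assignment problems on the gradient of the quadratic objective. The kernel choice, bandwidth, and identity initialization below are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def centered_gram(X, gamma=1.0):
    # Gaussian kernel matrix, centered as H K H with H = I - (1/n) 11^T
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def hsic_match(X, Y, iters=50, gamma=1.0):
    """Local search for a permutation P maximizing trace(K P L P^T), an
    empirical HSIC between matched pairs; each step solves a linear
    assignment problem on the gradient K P L."""
    K, L = centered_gram(X, gamma), centered_gram(Y, gamma)
    P = np.eye(K.shape[0])                    # start from the identity matching
    for _ in range(iters):
        rows, cols = linear_sum_assignment(-(K @ P @ L))  # maximize => negate
        P_new = np.zeros_like(P)
        P_new[rows, cols] = 1.0
        if np.allclose(P_new, P):             # reached a local optimum
            break
        P = P_new
    return P.argmax(axis=1)                   # index into Y matched to each X[i]
```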

2.
IEEE Trans Pattern Anal Mach Intell ; 31(6): 1048-58, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19372609

ABSTRACT

As a fundamental problem in pattern recognition, graph matching has applications in a variety of fields, from computer vision to computational biology. In graph matching, patterns are modeled as graphs and pattern recognition amounts to finding a correspondence between the nodes of different graphs. Many formulations of this problem can be cast in general as a quadratic assignment problem, where a linear term in the objective function encodes node compatibility and a quadratic term encodes edge compatibility. Since the quadratic assignment problem is NP-hard, the main research focus in this area has been on designing efficient algorithms for solving it approximately. In this paper we turn our attention to a different question: how to estimate compatibility functions such that the solution of the resulting graph matching problem best matches the expected solution that a human would manually provide. We present a method for learning graph matching: the training examples are pairs of graphs and the 'labels' are matches between them. Our experimental results reveal that learning can substantially improve the performance of standard graph matching algorithms. In particular, we find that simple linear assignment with such a learning scheme outperforms Graduated Assignment with bistochastic normalisation, a state-of-the-art quadratic assignment relaxation algorithm.
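For intuition only, a simplified sketch of the learning setting, not the authors' large-margin estimator: if the compatibility of pairing node i of one graph with node j of the other is a linear function w·phi(i, j) of hand-crafted features, prediction reduces to linear assignment, and w can be adjusted with a structured-perceptron update so that predicted matches agree with the human-provided ones. The feature layout and the perceptron update are assumptions of this sketch; there is no quadratic (edge) term here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def predict_matching(w, features):
    """features[i, j] is the feature vector phi for pairing node i of G with
    node j of G'; the predicted match maximizes sum_i w . phi(i, pi(i)),
    which is a linear assignment problem."""
    scores = features @ w                        # shape (n, n)
    rows, cols = linear_sum_assignment(-scores)  # maximize total score
    return cols

def structured_perceptron(training_set, dim, epochs=10):
    """Learn w so that predicted matchings reproduce the labelled ones.
    training_set: list of (features, true_match) with features of shape
    (n, n, dim) and true_match a permutation array of length n."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for features, true_match in training_set:
            pred = predict_matching(w, features)
            n = len(true_match)
            # move w toward features of the correct pairs, away from predicted ones
            w += features[np.arange(n), true_match].sum(axis=0)
            w -= features[np.arange(n), pred].sum(axis=0)
    return w
```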


Subject(s)
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Subtraction Technique; Image Enhancement/methods; Reproducibility of Results; Sensitivity and Specificity
3.
Bioinformatics ; 23(13): i490-8, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646335

ABSTRACT

MOTIVATION: Identifying significant genes among thousands of sequences on a microarray is a central challenge for cancer research in bioinformatics. The ultimate goal is to detect the genes that are involved in disease outbreak and progression. A multitude of methods have been proposed for this task of feature selection, yet the selected gene lists differ greatly between different methods. To accomplish biologically meaningful gene selection from microarray data, we have to understand the theoretical connections and the differences between these methods. In this article, we define a kernel-based framework for feature selection based on the Hilbert-Schmidt independence criterion and backward elimination, called BAHSIC. We show that several well-known feature selectors are instances of BAHSIC, thereby clarifying their relationship. Furthermore, by choosing a different kernel, BAHSIC allows us to easily define novel feature selection algorithms. As a further advantage, feature selection via BAHSIC works directly on multiclass problems.
RESULTS: In a broad experimental evaluation, the members of the BAHSIC family reach high levels of accuracy and robustness when compared to other feature selection techniques. Experiments show that features selected with a linear kernel provide the best classification performance in general, but if strong non-linearities are present in the data then non-linear kernels can be more suitable.
AVAILABILITY: Accompanying homepage is http://www.dbs.ifi.lmu.de/~borgward/BAHSIC.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
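A toy rendering of the backward-elimination idea (a sketch under assumptions: linear kernels on features and labels, one feature removed per round; the paper also covers non-linear kernels and batch removal):

```python
import numpy as np

def hsic(K, L):
    """Biased empirical HSIC estimate, trace(K H L H) / (n-1)^2, between two Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def linear_gram(X):
    return X @ X.T

def bahsic_sketch(X, y, n_keep):
    """Backward elimination guided by HSIC: repeatedly drop the feature whose
    removal decreases dependence on the labels the least. X: (n_samples,
    n_features) array, y: (n_samples,) label array."""
    L = linear_gram(np.asarray(y, dtype=float).reshape(-1, 1))
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        scores = []
        for f in remaining:                      # naive O(d) re-evaluation per round
            subset = [g for g in remaining if g != f]
            scores.append(hsic(linear_gram(X[:, subset]), L))
        remaining.pop(int(np.argmax(scores)))    # least useful feature goes first
    return remaining
```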


Subject(s)
Algorithms; Biomarkers, Tumor/analysis; Diagnosis, Computer-Assisted/methods; Gene Expression Profiling/methods; Neoplasm Proteins/analysis; Neoplasms/diagnosis; Neoplasms/metabolism; Humans; Oligonucleotide Array Sequence Analysis/methods; Sensitivity and Specificity
4.
Bioinformatics ; 22(14): e49-57, 2006 Jul 15.
Article in English | MEDLINE | ID: mdl-16873512

ABSTRACT

MOTIVATION: Many problems in data integration in bioinformatics can be posed as one common question: Are two sets of observations generated by the same distribution? We propose a kernel-based statistical test for this problem, based on the fact that two distributions are different if and only if there exists at least one function having different expectation on the two distributions. Consequently we use the maximum discrepancy between function means as the basis of a test statistic. The Maximum Mean Discrepancy (MMD) can take advantage of the kernel trick, which allows us to apply it not only to vectors, but also to strings, sequences, graphs, and other common structured data types arising in molecular biology.
RESULTS: We study the practical feasibility of an MMD-based test on three central data integration tasks: testing cross-platform comparability of microarray data, cancer diagnosis, and data-content based schema matching for two different protein function classification schemas. In all of these experiments, including high-dimensional ones, MMD is very accurate in detecting samples that were generated from the same distribution, and outperforms its best competitors.
CONCLUSIONS: We have defined a novel statistical test of whether two samples are from the same distribution, compatible with both multivariate and structured data, that is fast, easy to implement, and works well, as confirmed by our experiments.
AVAILABILITY: http://www.dbs.ifi.lmu.de/~borgward/MMD.
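For concreteness, a minimal version of the statistic with a Gaussian kernel (the bandwidth and the biased estimator are choices made for this sketch; significance can then be assessed, e.g., by a permutation test over the pooled sample):

```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    # pairwise squared distances, then Gaussian kernel
    d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d / (2 * sigma**2))

def mmd2_biased(X, Y, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy:
    MMD^2 = E k(x,x') + E k(y,y') - 2 E k(x,y)."""
    return (gaussian_gram(X, X, sigma).mean()
            + gaussian_gram(Y, Y, sigma).mean()
            - 2 * gaussian_gram(X, Y, sigma).mean())
```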


Subject(s)
Algorithms; Computational Biology/methods; Data Interpretation, Statistical; Databases, Factual; Information Storage and Retrieval/methods; Models, Biological; Models, Statistical; Computer Simulation; Sample Size; Statistical Distributions; Systems Integration
5.
Bioinformatics ; 21 Suppl 1: i47-56, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15961493

ABSTRACT

MOTIVATION: Computational approaches to protein function prediction infer protein function by finding proteins with similar sequence, structure, surface clefts, chemical properties, amino acid motifs, interaction partners or phylogenetic profiles. We present a new approach that combines sequential, structural and chemical information into one graph model of proteins. We predict functional class membership of enzymes and non-enzymes using graph kernels and support vector machine classification on these protein graphs.
RESULTS: Our graph model, derivable from protein sequence and structure only, is competitive with vector models that require additional protein information, such as the size of surface pockets. If we include this extra information into our graph model, our classifier yields significantly higher accuracy levels than the vector models. Hyperkernels allow us to select and to optimally combine the most relevant node attributes in our protein graphs. We have laid the foundation for a protein function prediction system that integrates protein information from various sources efficiently and effectively.
AVAILABILITY: More information available via www.dbs.ifi.lmu.de/Mitarbeiter/borgwardt.html.
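A schematic of the classification step only: the toy node-label histogram kernel below merely stands in for the graph kernels computed on protein graphs, and the input format (a list of integer node-label arrays) is a hypothetical simplification.

```python
import numpy as np
from sklearn.svm import SVC

def label_histogram_kernel(graphs, n_labels):
    """Toy graph kernel: inner product of node-label histograms.
    graphs: list of 1-D integer arrays of node labels (hypothetical format)."""
    H = np.array([np.bincount(g, minlength=n_labels) for g in graphs], dtype=float)
    return H @ H.T

def train_enzyme_classifier(graphs, labels, n_labels):
    # SVM on a precomputed Gram matrix, as in kernel-based function prediction
    K = label_histogram_kernel(graphs, n_labels)
    clf = SVC(kernel="precomputed").fit(K, labels)
    return clf, K
```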


Subject(s)
Computational Biology/methods; Enzymes/chemistry; Algorithms; Databases, Protein; Models, Statistical; Protein Conformation; Protein Structure, Secondary; Sequence Analysis, Protein/methods; Software
7.
Neural Netw ; 17(1): 127-41, 2004 Jan.
Article in English | MEDLINE | ID: mdl-14690713

ABSTRACT

In Support Vector (SV) regression, a parameter nu controls the number of Support Vectors and the number of points that come to lie outside of the so-called epsilon-insensitive tube. For various noise models and SV parameter settings, we experimentally determine the values of nu that lead to the lowest generalization error. We find good agreement with the values that had previously been predicted by a theoretical argument based on the asymptotic efficiency of a simplified model of SV regression. As a side effect of the experiments, valuable information about the generalization behavior of the remaining SVM parameters and their dependencies is gained. The experimental findings are valid even for complex 'real-world' data sets. Based on our results on the role of the nu-SVM parameters, we discuss various model selection methods.
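The role of nu described above can be reproduced with any standard nu-SVR implementation; a brief sketch using scikit-learn's NuSVR on synthetic 1-D data (the data, kernel width and C are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sinc(X).ravel() + rng.normal(scale=0.1, size=200)   # noisy 1-D target

# nu upper-bounds the fraction of points outside the epsilon-insensitive tube
# and lower-bounds the fraction of support vectors.
for nu in (0.1, 0.3, 0.6):
    model = NuSVR(nu=nu, C=1.0, gamma=1.0).fit(X, y)
    frac_sv = len(model.support_) / len(X)
    print(f"nu={nu:.1f}  fraction of support vectors={frac_sv:.2f}")
```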


Subject(s)
Models, Theoretical; Neural Networks, Computer; Nonlinear Dynamics; Regression Analysis; Generalization, Psychological; Normal Distribution
8.
Neural Netw ; 11(4): 637-649, 1998 Jun.
Article in English | MEDLINE | ID: mdl-12662802

ABSTRACT

In this paper a correspondence is derived between regularization operators used in regularization networks and support vector kernels. We prove that the Green's functions associated with regularization operators are suitable support vector kernels with equivalent regularization properties. Moreover, the paper provides an analysis of currently used support vector kernels in view of regularization theory, together with the corresponding operators for the classes of both polynomial kernels and translation-invariant kernels. The latter are also analyzed on periodic domains. As a by-product we show that a large number of radial basis functions, namely conditionally positive definite functions, may be used as support vector kernels.
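In compact form, the correspondence can be restated as follows (notation ours; P is the regularization operator, P* its adjoint, and tildes denote Fourier transforms):

```latex
% k is an admissible SV kernel with the same regularization properties as P
% if it is a Green's function of P^*P, i.e. it satisfies the self-consistency
% condition
\[
  \bigl\langle P\,k(x,\cdot),\; P\,k(x',\cdot) \bigr\rangle \;=\; k(x, x').
\]
% For translation-invariant kernels k(x, x') = k(x - x'), the induced penalty
% acts in the Fourier domain, damping frequencies where \tilde{k} is small:
\[
  \lVert P f \rVert^{2} \;=\; \int \frac{|\tilde{f}(\omega)|^{2}}{\tilde{k}(\omega)}\, d\omega .
\]
```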
