Kernel learning at the first level of inference.
Cawley, Gavin C; Talbot, Nicola L C.
Affiliation
  • Cawley GC; School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK. Electronic address: G.Cawley@uea.ac.uk.
  • Talbot NL; School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK. Electronic address: nlct@cmp.uea.ac.uk.
Neural Netw ; 53: 69-80, 2014 May.
Article in English | MEDLINE | ID: mdl-24561452
ABSTRACT
Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense.
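The idea described in the abstract — optimising the ARD kernel parameters together with the kernel-expansion coefficients under a training criterion with an extra regulariser on the kernel parameters — can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a squared-error LS-SVM criterion, uses an alternating update (an exact linear-system solve for the coefficients, gradient steps on the log length-scales) rather than fully joint optimisation, and the names `fit_first_level`, `lam1`, `lam2` are illustrative. Only the two regularisation parameters `lam1` and `lam2` would remain for second-level model selection.

```python
import numpy as np

def ard_kernel(X1, X2, log_ls):
    """ARD-RBF kernel with one length-scale per input feature."""
    d = (X1[:, None, :] - X2[None, :, :]) / np.exp(log_ls)
    return np.exp(-0.5 * np.sum(d * d, axis=2))

def fit_first_level(X, y, lam1=1e-2, lam2=1e-2, lr=0.1, n_iter=50):
    """Learn LS-SVM coefficients and ARD kernel parameters together.

    Minimises ||f - y||^2 + lam1 * a'Ka + lam2 * ||log_ls||^2, where the
    second regulariser acts on the kernel parameters, so only two
    hyper-parameters (lam1, lam2) are left for model selection.
    Sketch only: (a, b) are solved exactly at each step for numerical
    robustness, while the kernel parameters take gradient steps.
    """
    n, d = X.shape
    log_ls = np.zeros(d)                     # log length-scales (theta)
    for _ in range(n_iter):
        K = ard_kernel(X, X, log_ls)
        # LS-SVM linear system: [[0, 1'], [1, K + lam1*I]] [b; a] = [0; y]
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = K + lam1 * np.eye(n)
        sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
        b, alpha = sol[0], sol[1:]
        r = K @ alpha + b - y                # training residuals
        ls = np.exp(log_ls)
        grad = 2.0 * lam2 * log_ls           # regulariser on kernel params
        for j in range(d):
            dj = (X[:, None, j] - X[None, :, j]) / ls[j]
            dK = K * dj * dj                 # dK / d log_ls[j]
            grad[j] += 2.0 * r @ (dK @ alpha) + lam1 * alpha @ dK @ alpha
        log_ls -= lr * grad / n
    return alpha, b, log_ls
```

On a toy binary problem the learned length-scales can grow for irrelevant features, shrinking their influence on the kernel — the ARD behaviour the abstract warns can over-fit when each length-scale is instead tuned at the second level of inference.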

Full text: 1 Databases: MEDLINE Main subject: Support Vector Machine Study type: Prognostic_studies Language: En Journal: Neural Netw Journal subject: NEUROLOGY Year: 2014 Document type: Article