1.
Bioinformatics ; 36(22-23): 5304-5312, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33367584

ABSTRACT

MOTIVATION: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck. RESULTS: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, the computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users. AVAILABILITY AND IMPLEMENTATION: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.
Knowl Inf Syst ; 59(1): 137-166, 2019.
Article in English | MEDLINE | ID: mdl-32647403

ABSTRACT

Hubness is an aspect of the curse of dimensionality related to the distance concentration effect. Hubs occur in high-dimensional data spaces as objects that are particularly often among the nearest neighbors of other objects. Conversely, other data objects become antihubs, which are rarely or never nearest neighbors to other objects. Many machine learning algorithms rely on nearest neighbor search and some form of measuring distances, which are both impaired by high hubness. Degraded performance due to hubness has been reported for various tasks such as classification, clustering, regression, visualization, recommendation, retrieval and outlier detection. Several hubness reduction methods based on different paradigms have previously been developed. Local and global scaling as well as shared neighbors approaches aim at repairing asymmetric neighborhood relations. Global and localized centering try to eliminate spatial centrality, while the related global and local dissimilarity measures are based on density gradient flattening. Additional methods and alternative dissimilarity measures that were argued to mitigate detrimental effects of distance concentration also influence the related hubness phenomenon. In this paper, we present a large-scale empirical evaluation of all available unsupervised hubness reduction methods and dissimilarity measures. We investigate several aspects of hubness reduction as well as its influence on data semantics, which we measure via nearest neighbor classification. Scaling and density gradient flattening methods consistently improve evaluation measures such as hubness and classification accuracy for data sets from a wide range of domains, while centering approaches achieve the same only under specific settings.
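
The abstract above does not say how hubness is quantified. A minimal sketch, assuming the commonly used measure (skewness of the k-occurrence distribution, i.e. of how often each object appears in the k-nearest-neighbor lists of the others), could look as follows. The function names, the Gaussian toy data and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math
import random


def k_occurrence(points, k):
    """Count how often each point is among the k nearest neighbors of the others."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Euclidean distances from point i to every other point (self excluded).
        dists = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Third standardized moment; larger positive values indicate a few strong hubs."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def hubness_by_dimension(n=200, dims=(3, 100), k=5, seed=0):
    """Assumed toy experiment: k-occurrence skewness of Gaussian data per dimension."""
    rng = random.Random(seed)
    result = {}
    for d in dims:
        pts = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
        result[d] = skewness(k_occurrence(pts, k))
    return result
```

Calling `hubness_by_dimension()` returns one skewness value per dimensionality, which allows a low- and a high-dimensional sample to be compared under this assumed measure.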

3.
J New Music Res ; 47(1): 17-28, 2018.
Article in English | MEDLINE | ID: mdl-29348779

ABSTRACT

This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness.
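
To make the reachability problem concrete, the following sketch builds a directed knn graph and a mutual knn graph from a distance matrix and reports each vertex's in-degree. It only illustrates the two baseline graph types named in the abstract; it is not the mutual proximity graph construction, which the abstract does not specify. The function names and the four-point toy data are assumptions.

```python
def knn_graph(dist, k):
    """Directed knn graph: edge (i, j) means j is among the k nearest neighbours of i."""
    n = len(dist)
    edges = set()
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        edges.update((i, j) for j in others[:k])
    return edges


def mutual_knn_graph(dist, k):
    """Undirected mutual knn graph: keep a pair only if each lists the other."""
    edges = knn_graph(dist, k)
    return {(i, j) for (i, j) in edges if i < j and (j, i) in edges}


def in_degrees(edges, n):
    """Number of incoming edges per vertex; 0 means the item is never recommended."""
    deg = [0] * n
    for _, j in edges:
        deg[j] += 1
    return deg


# Hypothetical catalogue of four items placed on a line.
positions = [0.0, 1.0, 3.0, 7.0]
dist = [[abs(a - b) for b in positions] for a in positions]

directed = knn_graph(dist, k=1)
print(sorted(directed))                     # [(0, 1), (1, 0), (2, 1), (3, 2)]
print(in_degrees(directed, 4))              # [1, 2, 1, 0]
print(sorted(mutual_knn_graph(dist, k=1)))  # [(0, 1)]
```

In this toy case item 1 collects the most incoming edges, item 3 has none and is therefore unreachable by following recommendations, and the mutual knn graph keeps only a single edge, which shows why it can end up poorly connected.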

4.
J New Music Res ; 45(3): 239-251, 2016 Jul 02.
Article in English | MEDLINE | ID: mdl-28190932

ABSTRACT

One of the central goals of Music Information Retrieval (MIR) is the quantification of similarity between or within pieces of music. These quantitative relations should mirror the human perception of music similarity, which is, however, highly subjective with low inter-rater agreement. Unfortunately, this principal problem has been given little attention in MIR so far. Since it is not meaningful to have computational models that go beyond the level of human agreement, these levels of inter-rater agreement present a natural upper bound for any algorithmic approach. We will illustrate this fundamental problem in the evaluation of MIR systems using results from two typical application scenarios: (i) modelling of music similarity between pieces of music; (ii) music structure analysis within pieces of music. For both applications, we derive upper bounds of performance which are due to the limited inter-rater agreement. We compare these upper bounds to the performance of state-of-the-art MIR systems and show how the upper bounds prevent further progress in developing better MIR systems.

5.
Neurocomputing (Amst) ; 169: 281-287, 2015 Dec 02.
Article in English | MEDLINE | ID: mdl-26640321

ABSTRACT

The hubness phenomenon is a recently discovered aspect of the curse of dimensionality. Hub objects have a small distance to an exceptionally large number of data points while anti-hubs lie far from all other data points. A closely related problem is the concentration of distances in high-dimensional spaces. Previous work has already advocated the use of fractional ℓp norms instead of the ubiquitous Euclidean norm to avoid the negative effects of distance concentration. However, which exact fractional norm to use is a largely unsolved problem. The contribution of this work is an empirical analysis of the relation of different ℓp norms and hubness. We propose an unsupervised approach for choosing an ℓp norm which minimizes hubs while simultaneously maximizing nearest neighbor classification accuracy. Our approach is evaluated on seven high-dimensional data sets and compared to three approaches that re-scale distances to avoid hubness.
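
As a small illustration of the quantities discussed above, the sketch below computes an ℓp dissimilarity for arbitrary p > 0 (including fractional p < 1) and a simple relative-contrast statistic that is sometimes used to describe distance concentration. It does not reproduce the paper's selection procedure for p. The function names and the choice of statistic are assumptions.

```python
def lp_distance(x, y, p):
    """l_p dissimilarity (sum |x_i - y_i|^p)^(1/p).

    For 0 < p < 1 this is not a true norm-induced metric, because the
    triangle inequality no longer holds.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def relative_contrast(points, query, p):
    """(d_max - d_min) / d_min over the distances from query to all points.

    Values near 0 mean that nearest and farthest neighbors are almost
    equally far away. Assumes the query does not coincide with any point.
    """
    d = [lp_distance(query, pt, p) for pt in points]
    return (max(d) - min(d)) / min(d)


print(lp_distance([0, 0], [3, 4], 2))  # 5.0
print(lp_distance([0, 0], [3, 4], 1))  # 7.0
```

Evaluating `relative_contrast` on the same data for several values of p (for example 0.5, 1 and 2) is one way to inspect how strongly distances concentrate under each norm.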

6.
Neural Netw ; 18(7): 998-1005, 2005 Sep.
Article in English | MEDLINE | ID: mdl-15990276

ABSTRACT

One of the standard applications of Independent Component Analysis (ICA) to EEG is removal of artifacts due to movements of the eye bulbs. Short blinks as well as slower saccadic movements are removed by subtracting respective independent components (ICs). EEG recorded from blind subjects poses special problems, since it shows a higher quantity of eye movements, which are also more irregular and very different across subjects. It is demonstrated that ICA can still be of use by comparing results from four blind subjects with results from one subject without eye bulbs who therefore does not show eye movement artifacts at all.


Subject(s)
Algorithms; Artifacts; Blindness/physiopathology; Electroencephalography/methods; Eye Movements; Cerebral Cortex/physiology; Cerebral Cortex/physiopathology; Humans; Linear Models; Neural Networks, Computer; Normal Distribution; Signal Processing, Computer-Assisted
7.
Artif Intell Med ; 33(3): 199-207, 2005 Mar.
Article in English | MEDLINE | ID: mdl-15811785

ABSTRACT

OBJECTIVE: We developed a probabilistic continuous sleep stager based on Hidden Markov models using only a single EEG signal. It offers the advantage of being objective by not relying on human scorers, having much finer temporal resolution (1 s instead of 30 s), and being based on solid probabilistic principles rather than a predefined set of rules (Rechtschaffen & Kales). METHODS AND MATERIAL: Sixty-eight whole night sleep recordings from two different sleep labs are analysed using Gaussian observation Hidden Markov models. RESULTS: Our unsupervised approach detects the cornerstones of human sleep (wakefulness, deep and REM sleep) with around 80% accuracy based on data from a single EEG channel. There are some difficulties in generalizing results across sleep labs. CONCLUSION: Using data from a single electrode is sufficient for reliable continuous sleep staging. Sleep recordings from different sleep labs are not directly comparable. Training of separate models for the sleep labs is necessary.
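
To show how a Gaussian observation Hidden Markov model can yield a probability for each state at every time step, here is a minimal sketch of the scaled forward recursion for one-dimensional observations. The abstract does not state which inference procedure or features the authors used, so the two-state setup, the parameter values and the function names are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(obs, start, trans, means, stds):
    """Return P(state_t | obs_1..t) for every t via the scaled forward recursion.

    start[s]    : initial state probabilities
    trans[r][s] : probability of moving from state r to state s
    means, stds : Gaussian emission parameters per state
    Assumes at least one state has non-zero likelihood at every step.
    """
    n_states = len(start)
    posteriors = []
    prev = None
    for x in obs:
        like = [gaussian_pdf(x, means[s], stds[s]) for s in range(n_states)]
        if prev is None:
            alpha = [start[s] * like[s] for s in range(n_states)]
        else:
            alpha = [
                like[s] * sum(prev[r] * trans[r][s] for r in range(n_states))
                for s in range(n_states)
            ]
        total = sum(alpha)
        prev = [a / total for a in alpha]  # normalise to avoid underflow
        posteriors.append(prev)
    return posteriors


# Hypothetical two-state example with one scalar feature per second.
features = [0.1, -0.2, 5.1, 4.9]
probs = forward_filter(
    features,
    start=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    means=[0.0, 5.0],
    stds=[1.0, 1.0],
)
print([p.index(max(p)) for p in probs])  # [0, 0, 1, 1]
```

Each entry of `probs` is a full probability vector over states, so the output is a continuous trace over time and not a single hard label per epoch.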


Subject(s)
Electroencephalography/methods; Signal Processing, Computer-Assisted; Sleep Stages/physiology; Adult; Aged; Aged, 80 and over; Algorithms; Electroencephalography/statistics & numerical data; Female; Humans; Male; Markov Chains; Middle Aged; Neural Networks, Computer; Normal Distribution; Probability; Reproducibility of Results; Sleep/physiology; Sleep, REM/physiology; Time Factors; Wakefulness/physiology