Results 1 - 5 of 5
1.
Sci Rep ; 9(1): 20353, 2019 12 30.
Article in English | MEDLINE | ID: mdl-31889137

ABSTRACT

In many research areas scientists are interested in clustering objects within small datasets while making use of prior knowledge from large reference datasets. We propose a method to apply the machine learning concept of transfer learning to unsupervised clustering problems and show its effectiveness in the field of single-cell RNA sequencing (scRNA-Seq). The goal of scRNA-Seq experiments is often the definition and cataloguing of cell types from the transcriptional output of individual cells. To improve the clustering of small disease- or tissue-specific datasets, for which the identification of rare cell types is often problematic, we propose a transfer learning method to utilize large and well-annotated reference datasets, such as those produced by the Human Cell Atlas. Our approach modifies the dataset of interest while incorporating key information from the larger reference dataset via Non-negative Matrix Factorization (NMF). The modified dataset is subsequently provided to a clustering algorithm. We empirically evaluate the benefits of our approach on simulated scRNA-Seq data as well as on publicly available datasets. Finally, we present results for the analysis of a recently published small dataset and find improved clustering when transferring knowledge from a large reference dataset. Implementations of the method are available at https://github.com/nicococo/scRNA.
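The abstract above describes the pipeline only at a high level, so the following is a minimal sketch of one way such an NMF-based transfer could look, not the authors' implementation (see the linked repository for that). It assumes a genes x cells layout, a plain multiplicative-update NMF, and a hypothetical `mix` parameter that blends the reconstruction with the original target data; all function names are illustrative.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: X (genes x cells) ~ W (genes x k) @ H (k x cells)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def project(X, W, n_iter=200, seed=0, eps=1e-9):
    """Fit a non-negative H for a fixed dictionary W so that X ~ W @ H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return H

def transfer(X_target, X_source, k, mix=0.5):
    """Blend the target data with its reconstruction from the source dictionary.

    `mix` is an assumed knob: 0 keeps the target unchanged, 1 uses only
    the reconstruction W_source @ H_target.
    """
    W_source, _ = nmf(X_source, k)
    H_target = project(X_target, W_source)
    return mix * (W_source @ H_target) + (1.0 - mix) * X_target

# Synthetic stand-ins for a large reference dataset and a small target dataset.
rng = np.random.default_rng(42)
X_source = rng.random((30, 200))   # 30 genes, 200 reference cells
X_target = rng.random((30, 12))    # same 30 genes, 12 target cells
X_modified = transfer(X_target, X_source, k=4, mix=0.5)
```

The modified matrix `X_modified` has the same shape as the target data and could then be passed to any clustering algorithm in place of `X_target`.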


Subjects
Cluster Analysis, Computational Biology, Gene Expression Profiling, Machine Learning, RNA Sequence Analysis, Single-Cell Analysis, Algorithms, Computational Biology/methods, Gene Expression Profiling/methods, High-Throughput Nucleotide Sequencing, Humans, ROC Curve, Reproducibility of Results, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Transcriptome
2.
PLoS One ; 12(3): e0174392, 2017.
Article in English | MEDLINE | ID: mdl-28346487

ABSTRACT

High prediction accuracy is not the only objective to consider when solving problems using machine learning. Instead, particular scientific applications require some explanation of the learned prediction function. For computational biology, positional oligomer importance matrices (POIMs) have been successfully applied to explain the decisions of support vector machines (SVMs) using weighted-degree (WD) kernels. To extract relevant biological motifs from POIMs, the motifPOIM method was devised and has shown promising results on real-world data. Our contribution in this paper is twofold: as an extension to POIMs, we propose gPOIM, a general measure of feature importance for arbitrary learning machines and feature sets (including, but not limited to, SVMs and CNNs) and devise a sampling strategy for efficient computation. As a second contribution, we derive a convex formulation of motifPOIMs that leads to more reliable motif extraction from gPOIMs. Empirical evaluations confirm the usefulness of our approach on artificially generated data as well as on real-world datasets.
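The abstract does not spell out the gPOIM sampling strategy, so the sketch below only illustrates the general idea of a sampling-based importance score for a black-box model. It assumes the usual POIM-style reading of importance (expected score when a k-mer is present at a position, minus the overall expected score) under uniformly random sequences; `toy_score` and `sampled_importance` are hypothetical names, and the toy model stands in for any trained classifier.

```python
import random

ALPHABET = "ACGT"

def toy_score(seq):
    """Hypothetical black-box model: fires when 'ACG' starts at position 3."""
    return 1.0 if seq[3:6] == "ACG" else 0.0

def sampled_importance(score, kmer, pos, seq_len=10, n_samples=2000, seed=0):
    """Monte Carlo estimate of E[score | kmer at pos] - E[score].

    Assumes sequences are drawn uniformly at random, so conditioning on the
    k-mer amounts to planting it into each sampled sequence.
    """
    rng = random.Random(seed)
    planted_total = 0.0
    baseline_total = 0.0
    for _ in range(n_samples):
        seq = "".join(rng.choice(ALPHABET) for _ in range(seq_len))
        baseline_total += score(seq)
        planted = seq[:pos] + kmer + seq[pos + len(kmer):]
        planted_total += score(planted)
    return (planted_total - baseline_total) / n_samples

relevant = sampled_importance(toy_score, "ACG", 3)    # the k-mer the model looks for
irrelevant = sampled_importance(toy_score, "ACG", 0)  # same k-mer, wrong position
```

Because the estimate only calls `score`, the same loop works for any model that maps a sequence to a number, which is the sense in which such a measure is independent of the learning machine.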


Subjects
Computational Biology/methods, Machine Learning, Support Vector Machine, Algorithms
3.
IEEE Trans Neural Syst Rehabil Eng ; 24(9): 961-970, 2016 09.
Article in English | MEDLINE | ID: mdl-26513794

ABSTRACT

Fundamental changes over time in surface EMG signal characteristics are a challenge for myocontrol algorithms controlling prosthetic devices. These changes are generally caused by electrode shifts after donning and doffing, sweating, additional weight or varying arm positions, which result in a change of the signal distribution, a scenario often referred to as covariate shift. A substantial decrease in classification accuracy due to these factors makes it difficult to directly translate EMG signals into accurate myoelectric control patterns outside laboratory conditions. To overcome this limitation, we propose the use of supervised adaptation methods. The approach is based on adapting a trained classifier using only a small calibration set, which incorporates the relevant aspects of the nonstationarities but requires less than 1 min of data recording. The method was first tested through an offline analysis on signals acquired across 5 days from seven able-bodied individuals and four amputees. Moreover, we conducted a three-day online experiment on eight able-bodied individuals and one amputee, assessing user performance and user ratings of the controllability. Across different testing days, both offline and online performance improved significantly when shrinking the training model parameters by a given estimator towards the calibration set parameters. In the offline data analysis, the classification accuracy remained above 92% over five days with the proposed approach, whereas it decreased to 75% without adaptation. Similarly, in the online study, with the proposed approach the performance increased by 25% compared to a test without adaptation. These results indicate that the proposed methodology can contribute to improving the robustness of myoelectric pattern recognition methods in daily life applications.
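The shrinkage idea in this abstract can be illustrated with a small sketch. The abstract does not specify the classifier or the estimator, so the example assumes a nearest-class-mean classifier, a fixed shrinkage weight `lam`, and made-up two-dimensional feature values; it is an illustration of shrinking training parameters towards calibration parameters, not the published method.

```python
def shrink(train, calib, lam):
    """Convex combination (1 - lam) * train + lam * calib, element-wise."""
    return [(1.0 - lam) * t + lam * c for t, c in zip(train, calib)]

def adapt_means(train_means, calib_means, lam):
    """Shrink every class mean from the training set towards the calibration set."""
    return {label: shrink(train_means[label], calib_means[label], lam)
            for label in train_means}

def classify(x, means):
    """Nearest-class-mean rule using squared Euclidean distance."""
    def dist2(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda label: dist2(means[label]))

# Hypothetical class means from the original training day.
train_means = {"rest": [0.0, 0.0], "grasp": [2.0, 2.0]}
# Hypothetical means from a short calibration recording after an electrode shift.
calib_means = {"rest": [1.5, 1.5], "grasp": [3.5, 3.5]}

adapted_means = adapt_means(train_means, calib_means, lam=0.8)

sample = [1.6, 1.4]  # a 'rest' sample recorded after the shift
before = classify(sample, train_means)    # misclassified by the stale model
after = classify(sample, adapted_means)   # recovered after adaptation
```

With `lam = 0` the classifier is unchanged and with `lam = 1` it relies on the calibration data alone; intermediate values trade the stability of the large training set against the recency of the small calibration set.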


Subjects
Amputation Stumps/physiopathology, Artificial Limbs, Electromyography/methods, Muscle Contraction/physiology, Skeletal Muscle/physiology, Automated Pattern Recognition/methods, Adult, Algorithms, Amputees/rehabilitation, Statistical Data Interpretation, Humans, Male, Middle Aged, Radius/surgery, Reproducibility of Results, Sensitivity and Specificity, Young Adult
4.
PLoS One ; 10(12): e0144782, 2015.
Article in English | MEDLINE | ID: mdl-26690911

ABSTRACT

Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performance in genomic discrimination tasks, but, due to their black-box character, the motifs underlying their decision functions are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although they are a major step towards the explanation of trained SVM models, their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs (regardless of their length and complexity) underlying the predictions of a trained SVM model. Our framework considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set.
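To make the exponential growth concrete, the short calculation below counts POIM entries for a DNA alphabet, assuming one row per possible k-mer (4^k rows) and one column per start position in a sequence of length L (L - k + 1 columns). The sequence length of 100 is an arbitrary illustrative value, not a figure from the paper.

```python
def poim_entries(k, seq_len, alphabet_size=4):
    """Number of entries in a degree-k POIM: one row per k-mer, one column per start position."""
    n_kmers = alphabet_size ** k
    n_positions = seq_len - k + 1
    return n_kmers * n_positions

SEQ_LEN = 100  # assumed sequence length, for illustration only

for k in (1, 2, 5, 10):
    print(f"k={k:>2}: {poim_entries(k, SEQ_LEN):,} entries")
```

Under these assumptions a degree-5 POIM already holds 98,304 entries and a degree-10 POIM more than 95 million, which is why direct inspection is limited to small k.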


Subjects
Machine Learning, Genetic Models, Nucleotide Motifs, DNA Sequence Analysis/methods, Humans
5.
Article in English | MEDLINE | ID: mdl-25570960

ABSTRACT

Ensuring robustness of myocontrol algorithms for prosthetic devices is an important challenge. Robustness needs to be maintained under nonstationarities, e.g. due to electrode shifts after donning and doffing, sweating, additional weight or varying arm positions. Such nonstationary behavior changes the signal distributions, a scenario often referred to as covariate shift. This circumstance causes a significant decrease in classification accuracy in daily life applications. Re-training is possible, but it is time-consuming since it requires a large number of trials. In this paper, we propose to adapt the EMG classifier using only a small calibration set, which is able to capture the relevant aspects of the nonstationarities but requires re-training data of only very short duration. We tested this strategy on signals acquired across 5 days in able-bodied individuals. The results showed that an estimator that shrinks the training model parameters towards the calibration set parameters significantly increased the classifier performance across different testing days. Even when using only one trial per class as re-training data for each day, the classification accuracy remained > 92% over five days. These results indicate that the proposed methodology can be a practical means for improving robustness in pattern recognition methods for myocontrol.


Subjects
Electromyography/methods, Prostheses and Implants, Adult, Algorithms, Discriminant Analysis, Electromyography/instrumentation, Female, Hand/physiology, Humans, Male, Movement, Automated Pattern Recognition, Time Factors, Young Adult