Results 1 - 4 of 4
1.
Entropy (Basel); 24(8), 2022 Jul 31.
Article in English | MEDLINE | ID: mdl-36010721

ABSTRACT

The problem addressed by dictionary learning (DL) is the representation of data as a sparse linear combination of columns of a matrix called the dictionary. Both the dictionary and the sparse representations are learned from the data. We show how DL can be employed in the imputation of multivariate time series. We use a structured dictionary, which comprises one block for each time series and a common block shared by all the time series. The size of each block and the sparsity level of the representation are selected by using information theoretic criteria. The objective function used in learning is designed to minimize either the sum of the squared errors or the sum of the magnitudes of the errors. We propose dimensionality reduction techniques for the case of high-dimensional time series. To demonstrate how the new algorithms can be used in practical applications, we conduct a large set of experiments on five real-life data sets. The missing data (MD) are simulated according to various scenarios in which both the percentage of MD and the length of the sequences of MD are varied. This allows us to identify the situations in which the novel DL-based methods are superior to the existing methods.
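The core imputation step described above can be illustrated with a minimal sketch: given an already-learned dictionary, fit a sparse code using only the observed samples of a series (here via a simple orthogonal-matching-pursuit-style greedy loop, a stand-in for whichever sparse coder the paper uses), then reconstruct the missing samples from the dictionary. The function name and the fixed sparsity level are illustrative assumptions, not the paper's interface.

```python
import numpy as np

def impute_with_dictionary(y, mask, D, sparsity=2):
    """Impute the missing entries of y (where mask is False) by greedily
    fitting a sparse code over the observed entries (an OMP-style loop),
    then reconstructing the gaps from the dictionary D."""
    Do, yo = D[mask], y[mask]              # keep only observed rows
    support, residual = [], yo.copy()
    x_s = np.zeros(0)
    for _ in range(sparsity):
        corr = np.abs(Do.T @ residual)     # atom/residual correlations
        corr[support] = -np.inf            # never pick an atom twice
        support.append(int(np.argmax(corr)))
        # re-fit coefficients on the current support (least squares)
        x_s, *_ = np.linalg.lstsq(Do[:, support], yo, rcond=None)
        residual = yo - Do[:, support] @ x_s
    x = np.zeros(D.shape[1])
    x[support] = x_s
    y_hat = y.copy()
    y_hat[~mask] = (D @ x)[~mask]          # fill only the missing samples
    return y_hat
```

Observed samples are passed through unchanged; only the masked positions receive the dictionary-based reconstruction.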

2.
Entropy (Basel); 20(1), 2018 Jan 19.
Article in English | MEDLINE | ID: mdl-33265161

ABSTRACT

This work is focused on latent-variable graphical models for multivariate time series. We show how an algorithm which was originally used for finding zeros in the inverse of the covariance matrix can be generalized to identify the sparsity pattern of the inverse of the spectral density matrix. When applied to a given time series, the algorithm produces a set of candidate models. Various information theoretic (IT) criteria are employed for deciding the winner. A novel IT criterion, tailored to our model selection problem, is introduced. Some options for reducing the computational burden are proposed and tested via numerical examples. We conduct an empirical study in which the algorithm is compared with the state of the art. The results are good, and the major advantage is that the subjective choices made by the user are less important than in the case of other methods.
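The i.i.d. problem that the paper generalizes, finding zeros in the inverse covariance (precision) matrix, can be sketched directly: a zero entry of the precision matrix corresponds to conditional independence, so thresholding the normalized precision entries (partial correlations) yields a candidate sparsity pattern. This is only the covariance-domain analogue, not the paper's spectral-domain algorithm or its IT criteria; the threshold value is an illustrative assumption.

```python
import numpy as np

def precision_sparsity_pattern(X, threshold=0.2):
    """Estimate which off-diagonal entries of the inverse covariance
    (precision) matrix are nonzero, via thresholded partial correlations.
    X is (n_samples, n_variables)."""
    K = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(K))
    partial = np.abs(K) / np.outer(d, d)   # normalized precision entries
    np.fill_diagonal(partial, 0.0)
    return partial > threshold
```

For a Markov chain x0 -> x1 -> x2, the pattern should flag the edges (0,1) and (1,2) but not (0,2), since x0 and x2 are conditionally independent given x1.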

3.
J Comput Biol; 11(4): 660-82, 2004.
Article in English | MEDLINE | ID: mdl-15579237

ABSTRACT

Grouping genes into clusters according to their expression levels is important for deriving biological information, e.g., on gene functions, based on microarray and other related analyses. The paper introduces a method for selecting the number of clusters in gene expression data based on the minimum description length (MDL) principle. The main feature of the new method is its ability to evaluate quickly the number of clusters according to the sound MDL principle, without exhaustive evaluation over all possible partitions of the gene set. The estimation method can be used in conjunction with various clustering algorithms. A recent clustering algorithm using principal component analysis, the "gene shaving" (GS) procedure, can be modified to use the new MDL estimation method, replacing the Gap statistic originally used in the GS algorithm. The resulting clustering algorithm is shown to perform better than GS-Gap and CEM (classification expectation maximization) in simulations using artificial data. The proposed method is applied to B-cell differentiation data, and the resulting clusters are compared with those found by self-organizing maps (SOM).
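The idea of MDL-based selection of the number of clusters can be sketched generically: score each candidate k by a two-part code length (bits to describe the model plus bits to describe the data given the model) and pick the k that minimizes the total. The sketch below pairs a plain k-means with a spherical-Gaussian two-part code; this is a generic illustration of the principle, not the paper's criterion or its gene-shaving integration, and the cost terms are illustrative assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm with deterministic farthest-point init."""
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.array([X[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

def mdl_score(X, labels, centers):
    """Two-part code length: bits for the model (centres + assignments)
    plus bits for the data under a spherical Gaussian residual model."""
    n, d = X.shape
    k = len(centers)
    sigma2 = max(np.mean((X - centers[labels]) ** 2), 1e-12)
    data_bits = 0.5 * n * d * np.log2(2 * np.pi * np.e * sigma2)
    model_bits = 0.5 * k * d * np.log2(n) + n * np.log2(k)
    return data_bits + model_bits
```

Scoring every k in a small range and taking the argmin replaces an exhaustive search over partitions with one clustering run per candidate k.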


Subjects
Multigene Family , Algorithms , B-Lymphocytes/cytology , B-Lymphocytes/physiology , Cell Differentiation , Cluster Analysis , Computational Biology , Databases, Genetic , Gene Expression Profiling/statistics & numerical data , Humans , Information Theory , Models, Genetic , Probability Theory
4.
Comput Methods Programs Biomed; 67(3): 177-86, 2002 Mar.
Article in English | MEDLINE | ID: mdl-11853943

ABSTRACT

The paper presents a new lossless ECG compression scheme. The short-term predictor and the coder use conditioning on a small number of contexts. The long-term prediction is based on an algorithm for R-R interval estimation. Several QRS detection algorithms are investigated in order to select a low-complexity, reliable detection algorithm. The coding of prediction residuals primarily uses Golomb-Rice (GR) codes, but, to improve the coding results, escape codes (GR-ESC) are used in some contexts for a limited number of samples. Experimental results indicate good overall performance of the lossless ECG compression algorithms (reducing storage needs from 12 to about 3-4 bits per sample). The scheme consistently outperforms other waveform or general-purpose coding algorithms.
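The Golomb-Rice residual coding mentioned above can be sketched compactly: signed prediction residuals are first mapped to non-negative integers (a zigzag map), then each value is split by the divisor 2**k into a quotient, sent in unary, and a k-bit remainder. This illustrates plain GR coding only (assuming k >= 1); the paper's context conditioning and GR-ESC escape mechanism are not modeled here.

```python
def rice_encode(values, k):
    """Golomb-Rice code with divisor 2**k (k >= 1 assumed): zigzag-map
    each residual, emit the quotient in unary ('1'*q + '0') and the
    remainder in k binary bits. Returns a bit string."""
    out = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1    # zigzag: 0,-1,1,-2 -> 0,1,2,3
        q, r = divmod(u, 1 << k)
        out.append('1' * q + '0' + format(r, 'b').zfill(k))
    return ''.join(out)

def rice_decode(bits, k, count):
    """Decode `count` residuals from a Golomb-Rice bit string."""
    out, i = [], 0
    for _ in range(count):
        q = 0
        while bits[i] == '1':                  # unary quotient
            q += 1
            i += 1
        i += 1                                 # skip the '0' terminator
        u = (q << k) + int(bits[i:i + k], 2)   # add the k-bit remainder
        i += k
        out.append(u // 2 if u % 2 == 0 else -(u + 1) // 2)  # undo zigzag
    return out
```

Small residuals, which dominate after good prediction, produce short codewords, which is why GR codes suit this scheme; the escape codes handle the occasional large residual that unary coding would make expensive.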


Subjects
Algorithms , Electrocardiography/methods , Heart/physiopathology , Humans