ABSTRACT
This paper presents a new method for analyzing array comparative genomic hybridization (aCGH) data based on correntropy. A new formulation combining a low-rank model of aCGH data with correntropy is proposed, and its solution is derived using the half-quadratic method. Compared with existing methods, the proposed method is more robust to severe corruption and to various kinds of noise. Moreover, it analyzes all aCGH profiles in a data set simultaneously. Experimental results illustrate the robustness of the proposed method when the noise is non-Gaussian and show its excellent performance in other cases.
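The robustness described above comes from the correntropy criterion, which weights errors through a Gaussian kernel so that large outliers contribute almost nothing, unlike a squared-error loss. The following is a minimal sketch of the sample correntropy between two signals; it illustrates the similarity measure only, not the paper's low-rank formulation or half-quadratic solver, and the kernel width `sigma` is an illustrative choice.

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    """Sample correntropy V(X, Y) = mean of exp(-(x - y)^2 / (2 sigma^2))."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(np.exp(-d**2 / (2 * sigma**2))))

clean = np.zeros(100)
print(correntropy(clean, clean))      # identical signals reach the maximum, 1.0

# A single huge outlier barely moves correntropy: the Gaussian kernel
# maps it to ~0, so the mean only drops by roughly 1/100.
corrupted = clean.copy()
corrupted[0] = 100.0
print(correntropy(clean, corrupted))  # approximately 0.99
```

A squared-error loss on the same pair would be dominated entirely by the single outlier, which is why correntropy-based objectives behave well under non-Gaussian, heavy-tailed noise.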
Subjects
Algorithms, Comparative Genomic Hybridization/methods, DNA Copy Number Variations, Animals, Genomics/methods, Humans

ABSTRACT
Hierarchical Temporal Memory (HTM) is an unsupervised machine learning algorithm. It models several fundamental neocortical computational principles. The Spatial Pooler (SP) is one of the main components of HTM; it continuously encodes streams of binary input from various layers and regions into sparse distributed representations. In this paper, the goal is to evaluate the sparsification in the SP algorithm from the perspective of information theory, using the information bottleneck (IB), the Cramer-Rao lower bound, and the Fisher information matrix. This paper makes two main contributions. First, we introduce a new upper bound for the standard information bottleneck relation, which we refer to as the modified-IB. This measure is used to evaluate the performance of the SP algorithm at different sparsity levels and under various amounts of noise. The MNIST, Fashion-MNIST, and NYC-Taxi datasets were fed to the SP algorithm separately. The SP algorithm with learning was found to be resistant to noise: adding up to 40% noise to the input resulted in no discernible change in the output. Using the probabilistic mapping method and a Hidden Markov Model, the sparse SP output representation was reconstructed in the input space. Numerical evaluation of the modified-IB relation shows that a lower noise level and a higher sparsity level in the SP algorithm lead to more effective reconstruction, with 2% sparsity producing the best results. Our second contribution is a mathematical proof that more sparsity leads to better performance of the SP algorithm. Assuming a Cauchy data distribution, the Cramer-Rao lower bound was analyzed to estimate the SP's output at different sparsity levels.
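The core sparsification step the abstract evaluates can be pictured as a k-winners-take-all operation: of all SP columns, only the top fraction (e.g. the 2% with the highest overlap scores) become active. The sketch below shows that step in isolation; it is not the full SP algorithm (no learning, boosting, or topology), and the column count and scores are illustrative assumptions.

```python
import numpy as np

def sparsify(overlaps, sparsity=0.02):
    """k-winners-take-all step of a spatial-pooler-like encoder:
    keep only the top `sparsity` fraction of columns active."""
    overlaps = np.asarray(overlaps, dtype=float)
    k = max(1, int(round(len(overlaps) * sparsity)))
    sdr = np.zeros(len(overlaps), dtype=int)
    sdr[np.argsort(overlaps)[-k:]] = 1  # activate the k highest-scoring columns
    return sdr

rng = np.random.default_rng(0)
overlaps = rng.random(2048)            # hypothetical per-column overlap scores
sdr = sparsify(overlaps, sparsity=0.02)
print(sdr.sum(), len(sdr))             # 41 active columns out of 2048
```

Because activity is fixed at a small constant fraction regardless of the input, moderate input perturbations tend to change only which columns win, not how many, which is one intuition behind the noise resistance reported above.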