Your browser doesn't support javascript.
loading
Generalized Matrix Local Low Rank Representation by Random Projection and Submatrix Propagation.
Dang, Pengtao; Zhu, Haiqi; Guo, Tingbo; Wan, Changlin; Zhao, Tong; Salama, Paul; Wang, Yijie; Cao, Sha; Zhang, Chi.
  • Dang P; Purdue University, Indianapolis, IN, USA.
  • Zhu H; Indiana University, Bloomington, IN, USA.
  • Guo T; School of Medicine, Indiana University, Indianapolis, IN, USA.
  • Wan C; Purdue University, West Lafayette, IN, USA.
  • Zhao T; Uber, Inc, Seattle, WA, USA.
  • Salama P; Purdue University, Indianapolis, IN, USA.
  • Wang Y; Indiana University, Bloomington, IN, USA.
  • Cao S; School of Medicine, Indiana University, Indianapolis, IN, USA.
  • Zhang C; School of Medicine, Indiana University, Indianapolis, IN, USA.
KDD ; 2023: 390-401, 2023 Aug.
Article en En | MEDLINE | ID: mdl-38948121
ABSTRACT
Matrix low rank approximation is an effective method to reduce or eliminate the statistical redundancy of its components. Compared with the traditional global low rank methods such as singular value decomposition (SVD), local low rank approximation methods are more advantageous to uncover interpretable data structures when clear duality exists between the rows and columns of the matrix. Local low rank approximation is equivalent to low rank submatrix detection. Unfortunately, existing local low rank approximation methods can detect only submatrices of specific mean structure, which may miss a substantial amount of true and interesting patterns. In this work, we develop a novel matrix computational framework called RPSP (Random Probing based submatrix Propagation) that provides an effective solution for the general matrix local low rank representation problem. RPSP detects local low rank patterns that grow from small submatrices of low rank property, which are determined by a random projection approach. RPSP is supported by theories of random projection. Experiments on synthetic data demonstrate that RPSP outperforms all state-of-the-art methods, with the capacity to robustly and correctly identify the low rank matrices when the pattern has a similar mean as the background, background noise is heteroscedastic and multiple patterns present in the data. On real-world datasets, RPSP also demonstrates its effectiveness in identifying interpretable local low rank matrices.
Palabras clave