Deep network embedding with dimension selection.
Neural Netw; 179: 106512, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39032394
ABSTRACT
Network embedding is a general-purpose machine learning technique that converts network data from a non-Euclidean space to a Euclidean space, facilitating downstream analyses of the networks. However, existing embedding methods are often optimization-based, with the embedding dimension determined in a heuristic or ad hoc way, which can bias downstream statistical inference. Additionally, existing deep embedding methods can suffer from a nonidentifiability issue owing to the universal approximation power of deep neural networks. We address these issues within a rigorous statistical framework. We treat the embedding vectors as missing data, reconstruct the network features using a sparse decoder, and simultaneously impute the embedding vectors and train the sparse decoder using an adaptive stochastic gradient Markov chain Monte Carlo (MCMC) algorithm. Under mild conditions, we show that the sparse decoder provides a parsimonious mapping from the embedding space to the network features, enabling effective selection of the embedding dimension and overcoming the nonidentifiability issue encountered by existing deep embedding methods. Furthermore, we show that the embedding vectors converge weakly to the desired posterior distribution in the 2-Wasserstein distance, addressing the potential bias issue experienced by existing embedding methods. This work lays the first theoretical foundation for network embedding within the framework of missing-data imputation.
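The imputation scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it replaces the paper's sparse deep decoder with a toy bilinear decoder sigmoid(z_i' W z_j), and the adaptive stochastic gradient MCMC with plain stochastic gradient Langevin dynamics (SGLD) at a constant step size. It only shows the core idea of jointly imputing the embedding vectors Z (treated as missing data) and training the decoder parameters W against the observed adjacency matrix; all variable names and settings below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Clip logits so np.exp never overflows.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

# Toy network: two communities, dense links within each (assumed data).
n, d = 20, 2
groups = np.repeat([0, 1], n // 2)
A = (groups[:, None] == groups[None, :]).astype(float)
np.fill_diagonal(A, 0.0)

Z = rng.normal(scale=0.1, size=(n, d))  # embeddings, treated as missing data
W = rng.normal(scale=0.1, size=(d, d))  # toy decoder parameters

eps = 0.01  # constant SGLD step size (the paper uses an adaptive schedule)
for step in range(3000):
    P = sigmoid(Z @ W @ Z.T)   # decoded link probabilities
    R = A - P                  # gradient of the Bernoulli log-likelihood wrt logits
    # Gradients of the log-posterior, with N(0, I) priors on Z and W.
    grad_Z = R @ Z @ W.T + R.T @ Z @ W - Z
    grad_W = Z.T @ R @ Z - W
    # Langevin update: half-step along the gradient plus injected Gaussian
    # noise, jointly imputing Z and training the decoder W.
    Z += 0.5 * eps * grad_Z + np.sqrt(eps) * rng.normal(size=Z.shape)
    W += 0.5 * eps * grad_W + np.sqrt(eps) * rng.normal(size=W.shape)
```

After the loop, within-community link probabilities under the decoder should exceed between-community ones; the latent dimension d here stands in for the embedding dimension that the paper selects automatically through the sparse decoder.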
Full text:
1
Collection:
01-internacional
Database:
MEDLINE
Main subject:
Algorithms
/
Markov Chains
Limits:
Humans
Language:
En
Journal:
Neural Netw
Journal subject:
Neurology
Year:
2024
Document type:
Article
Affiliation country:
United States of America
Publication country:
United States of America