Integrated analysis of multimodal single-cell data with structural similarity.

Cao, Yingxin; Fu, Laiyi; Wu, Jie; Peng, Qinke; Nie, Qing; Zhang, Jing; Xie, Xiaohui

Affiliation

Cao Y; Department of Computer Science, University of California, Irvine, CA 92697, USA.
Fu L; Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA.
Wu J; NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA 92697, USA.
Peng Q; Department of Computer Science, University of California, Irvine, CA 92697, USA.
Nie Q; Systems Engineering Institute, School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China.
Zhang J; Department of Biological Chemistry, University of California, Irvine, CA 92697, USA.
Xie X; Systems Engineering Institute, School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China.

Nucleic Acids Res; 50(21): e121, 2022 Nov 28.

Article in En | MEDLINE | ID: mdl-36130281

ABSTRACT

Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and to worse clustering results than vanilla single-modality analysis. How to efficiently utilize the extra information from single-cell multi-omics to delineate cell states and identify meaningful signal remains a significant computational challenge. In this work, we propose a deep learning framework, named SAILERX, for efficient, robust, and flexible analysis of multimodal single-cell data. SAILERX consists of a variational autoencoder with invariant representation learning to correct technical noise from the sequencing process, and a multimodal data alignment mechanism to integrate information from different modalities. Instead of performing hard alignment by projecting both modalities to a shared latent space, SAILERX encourages the local structures of the two modalities, measured by pairwise similarities, to be similar. This strategy is more robust against overfitting to noise, which facilitates various downstream analyses such as clustering, imputation, and marker gene detection. Furthermore, the invariant representation learning part enables SAILERX to perform integrative analysis on both multi- and single-modal datasets, making it an applicable and scalable tool for more general scenarios.
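
The idea of aligning local structure instead of coordinates can be illustrated with a small NumPy sketch. This is not the SAILERX implementation: the abstract does not state which similarity measure or loss the authors use, so cosine similarity and a mean squared difference are assumptions made here for illustration, and the names `pairwise_cosine`, `structural_alignment_loss`, `z_rna`, `z_rotated`, and `z_unrelated` are hypothetical.

```python
import numpy as np


def pairwise_cosine(z):
    """Return the n x n matrix of cosine similarities between the rows of z."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    unit = z / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return unit @ unit.T


def structural_alignment_loss(z_a, z_b):
    """Mean squared difference between two cell-by-cell similarity matrices.

    Assumed loss for illustration only. The two embeddings must describe the
    same cells in the same order, but may have different numbers of columns.
    """
    s_a = pairwise_cosine(z_a)
    s_b = pairwise_cosine(z_b)
    return float(np.mean((s_a - s_b) ** 2))


rng = np.random.default_rng(0)

# Hypothetical embedding of 6 cells in 4 latent dimensions.
z_rna = rng.normal(size=(6, 4))

# A rotated copy uses different coordinates but keeps every pairwise
# similarity, so a structure-based loss does not penalize it.
q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
z_rotated = z_rna @ q

# An unrelated embedding has a different neighborhood structure.
z_unrelated = rng.normal(size=(6, 4))

loss_rotated = structural_alignment_loss(z_rna, z_rotated)
loss_unrelated = structural_alignment_loss(z_rna, z_unrelated)
print(loss_rotated, loss_unrelated)
```

Under these assumptions the loss is close to zero for the rotated copy even though its coordinates differ from the original. A hard alignment that forces both modalities into the same latent coordinates would penalize that case, which is the distinction the abstract draws.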

Subjects

Genomics; Multiomics; Cluster Analysis; Single-Cell Analysis

Full text: 1 Database: MEDLINE Main subject: Genomics / Multiomics Language: En Publication year: 2022 Document type: Article