NeuronMotif: Deciphering cis-regulatory codes by layer-wise demixing of deep neural networks.

Wei, Zheng; Hua, Kui; Wei, Lei; Ma, Shining; Jiang, Rui; Zhang, Xuegong; Li, Yanda; Wong, Wing H; Wang, Xiaowo

Affiliation

Wei Z; Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China.
Hua K; Beijing Academy of Artificial Intelligence, Beijing 100084, China.
Wei L; Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China.
Ma S; Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China.
Jiang R; Department of Statistics, Stanford University, Stanford, CA 94305.
Zhang X; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305.
Li Y; Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China.
Wong WH; Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China.
Wang X; Ministry of Education Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China.

Proc Natl Acad Sci U S A; 120(15): e2216698120, 2023 04 11.

Article in English | MEDLINE | ID: mdl-37023129

ABSTRACT

Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Although deep convolutional neural networks (CNNs) have achieved great success in predicting cis-regulatory elements, the discovery of motifs and their combinatorial patterns from these CNN models has remained difficult. We show that the main difficulty is due to the problem of multifaceted neurons which respond to multiple types of sequence patterns. Since existing interpretation methods were mainly designed to visualize the class of sequences that can activate the neuron, the resulting visualization will correspond to a mixture of patterns. Such a mixture is usually difficult to interpret without resolving the mixed patterns. We propose the NeuronMotif algorithm to interpret such neurons. Given any convolutional neuron (CN) in the network, NeuronMotif first generates a large sample of sequences capable of activating the CN, which typically consists of a mixture of patterns. Then, the sequences are "demixed" in a layer-wise manner by backward clustering of the feature maps of the involved convolutional layers. NeuronMotif can output the sequence motifs, and the syntax rules governing their combinations are depicted by position weight matrices organized in tree structures. Compared to existing methods, the motifs found by NeuronMotif have more matches to known motifs in the JASPAR database. The higher-order patterns uncovered for deep CNs are supported by the literature and ATAC-seq footprinting. Overall, NeuronMotif enables the deciphering of cis-regulatory codes from deep CNs and enhances the utility of CNN in genome interpretation.
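The demixing idea in the abstract can be illustrated with a toy sketch. This is not the NeuronMotif implementation: it clusters raw one-hot sequences with a simple two-cluster k-means rather than clustering the feature maps of convolutional layers backward through the network, it builds plain position frequency matrices rather than the position weight matrices and tree structures the abstract describes, and the sequences, function names, and deterministic initialization are all hypothetical choices made for illustration. It only shows why a motif built from a mixture of two patterns is blurred, while motifs built after clustering are clean.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) one-hot array."""
    idx = [ALPHABET.index(base) for base in seq]
    return np.eye(4)[idx]

def pfm(seqs):
    """Position frequency matrix: per-position base frequencies, shape (len, 4)."""
    return np.mean([one_hot(s) for s in seqs], axis=0)

def consensus(matrix):
    """Most frequent base at each position (ties go to the first base in ALPHABET)."""
    return "".join(ALPHABET[i] for i in matrix.argmax(axis=1))

def two_means(seqs, n_iter=10):
    """Toy 2-cluster k-means on flattened one-hot vectors.

    Deterministic init (an assumption of this sketch): the first sequence
    and the sequence farthest from it. No guard against empty clusters.
    """
    x = np.array([one_hot(s).ravel() for s in seqs])
    c0 = x[0]
    c1 = x[np.argmax(((x - c0) ** 2).sum(axis=1))]
    centers = np.stack([c0, c1])
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        centers = np.stack([x[labels == k].mean(axis=0) for k in range(2)])
    return labels

# Made-up stand-ins for sequences that activate one "multifaceted" neuron:
# two unrelated patterns, each with one noisy variant.
mixed = ["ACGTAC", "ACGTAC", "ACGTAA", "ACGTAC",
         "TTACGT", "TTACGT", "TTACGG", "TTACGT"]

mixed_pfm = pfm(mixed)        # no base exceeds frequency 0.5 at any position
labels = two_means(mixed)     # demix the sample into two groups
demixed = [
    consensus(pfm([s for s, l in zip(mixed, labels) if l == k]))
    for k in range(2)
]

print("max frequency in mixed PFM:", mixed_pfm.max())
print("demixed consensus motifs:", demixed)
```

In the mixed matrix every position is split between two bases, so no single motif can be read off; after clustering, each group yields its own consensus.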

Subjects

Algorithms; Neural Networks, Computer; Nucleotide Motifs/genetics; Regulatory Sequences, Nucleic Acid/genetics; Databases, Factual

Keywords

cis-regulatory grammar; deep neural network; model interpretation; motif combination; multifaceted neuron

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Neural Networks, Computer | Type of study: Prognostic_studies | Language: En | Journal: Proc Natl Acad Sci U S A | Year of publication: 2023 | Document type: Article | Country of affiliation: China