Classification of G-protein coupled receptors based on a rich generation of convolutional neural network, N-gram transformation and multiple sequence alignments.
Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang.
Affiliation
  • Li M; Department of Computer Science and Technology, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China.
  • Ling C; Department of Computer Science and Technology, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China. s0897918@gmail.com.
  • Xu Q; Department of Computer Science and Technology, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China.
  • Gao J; Department of Computer Science and Technology, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China.
Amino Acids; 50(2): 255-266, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29151135
Sequence classification is crucial for predicting the function of newly discovered sequences. In recent years, prediction over the growing volume and diversity of sequences has relied heavily on machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach that combines multiple sequence alignment algorithms, an N-gram probabilistic language model, and deep learning. The idea behind the proposed method is that if each group of sequences can be represented by one feature sequence composed of homologous sites, then rebuilding that feature sequence after a more relevant sequence is added to the group should incur less loss. On this basis, predicting whether a query sequence belongs to a group of sequences can be recast as calculating the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein coupled receptors (GPCRs). It begins by extracting the feature sequences from the multiple sequence alignment results of the GPCR sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are fed into a convolutional neural network to make a prediction. The experimental results show that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family level II) in comparison with the current state-of-the-art methods. The implementation of the proposed work is freely available at: https://github.com/alanFchina/CNN .
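The abstract does not specify how the feature sequences or the N-gram input vectors are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation (which is at the repository linked above). It assumes a per-column majority vote as a stand-in for extracting homologous sites, and normalized bigram frequencies as a stand-in for the N-gram model. The function names `feature_sequence` and `ngram_vector`, the gap symbol, and the toy alignment are all hypothetical.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, used as the N-gram alphabet (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def feature_sequence(alignment, gap="-"):
    """Collapse an alignment into one sequence by majority vote per column.

    This is a simple consensus, used here as a stand-in for the paper's
    feature-sequence extraction. Columns that contain only gaps are skipped.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("aligned rows must all have the same length")
    residues_out = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        if residues:
            residues_out.append(Counter(residues).most_common(1)[0][0])
    return "".join(residues_out)


def ngram_vector(sequence, n=2, alphabet=AMINO_ACIDS):
    """Return normalized N-gram frequencies over a fixed vocabulary.

    The vector has len(alphabet) ** n entries, ordered as itertools.product
    orders them. N-grams containing symbols outside the alphabet are ignored.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=n)]
    index = {gram: i for i, gram in enumerate(vocab)}
    counts = [0] * len(vocab)
    total = 0
    for i in range(len(sequence) - n + 1):
        gram = sequence[i:i + n]
        if gram in index:
            counts[index[gram]] += 1
            total += 1
    if total == 0:
        return [0.0] * len(vocab)
    return [c / total for c in counts]


# Toy alignment of three short sequences (made-up data, not GPCRs).
toy_alignment = ["MK-LV", "MKALV", "MRALV"]
consensus = feature_sequence(toy_alignment)
vector = ngram_vector(consensus, n=2)
```

A fixed-length vector of this kind (400 entries for bigrams over 20 amino acids) could then be reshaped or stacked as input to a convolutional network; the network architecture itself is not described in the abstract and is left out here.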

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Algorithms / Sequence Alignment / Neural Networks, Computer / Sequence Analysis, Protein / Receptors, G-Protein-Coupled / Models, Theoretical | Type of study: Prognostic_studies | Language: En | Journal: Amino Acids | Journal subject: BIOCHEMISTRY | Year: 2018 | Document type: Article | Country of affiliation: China
