iMPT-FDNPL: Identification of Membrane Protein Types with Functional Domains and a Natural Language Processing Approach.
Chen, Wei; Chen, Lei; Dai, Qi.
Affiliation
  • Chen W; College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China.
  • Chen L; College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China.
  • Dai Q; College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou 310018, China.
Comput Math Methods Med ; 2021: 7681497, 2021.
Article in English | MEDLINE | ID: mdl-34671418
ABSTRACT
Membrane proteins are an important class of proteins that play essential roles in several cellular processes. Based on their intramolecular arrangements and positions in a cell, membrane proteins can be divided into several types. The types of a membrane protein are reported to be highly related to its functions, and determining membrane protein types has been a hot topic in recent years. Many computational methods have been proposed so far. Some of them used functional domain information to encode proteins, but this encoding procedure was still crude. In this study, we designed a novel feature extraction scheme to obtain informative features of proteins from their functional domain information. The scheme treats domains as words and proteins, represented by their domains, as sentences. The natural language processing approach word2vec was applied to obtain the features of domains, which were further refined into protein features. Based on these features, RAndom k-labELsets with random forest as the base classifier was employed to build the multilabel classifier, namely, iMPT-FDNPL. The tenfold cross-validation results indicated the good performance of this classifier. Furthermore, it was superior to other classifiers based on features derived from functional domains via a one-hot scheme or derived from other properties of proteins, suggesting the effectiveness of the protein features generated by the proposed scheme.
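
The encoding idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the protein and domain identifiers (`P1`, `D1`, ...) are hypothetical, the two-dimensional vectors are invented stand-ins for what a trained word2vec model would supply, and mean pooling is only one plausible way to refine domain features into protein features, since the abstract does not state the refinement used. The one-hot function shows the baseline encoding the abstract compares against.

```python
from typing import Dict, List

# Hypothetical toy data: each protein is a "sentence" whose "words" are domain IDs.
proteins: Dict[str, List[str]] = {
    "P1": ["D1", "D2"],
    "P2": ["D2", "D3", "D3"],
}

# Invented stand-in for domain vectors; in practice these would come from a
# word2vec model trained on the corpus of protein "sentences".
domain_vectors: Dict[str, List[float]] = {
    "D1": [1.0, 0.0],
    "D2": [0.0, 1.0],
    "D3": [1.0, 1.0],
}


def protein_embedding(domains: List[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Mean-pool the vectors of a protein's domains (an assumed refinement step)."""
    known = [vectors[d] for d in domains if d in vectors]
    if not known:
        raise ValueError("protein has no domain with a known vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def one_hot_encoding(domains: List[str], vocabulary: List[str]) -> List[int]:
    """Binary presence vector over the domain vocabulary (the one-hot baseline)."""
    present = set(domains)
    return [1 if d in present else 0 for d in vocabulary]


vocabulary = sorted(domain_vectors)
features = {pid: protein_embedding(doms, domain_vectors) for pid, doms in proteins.items()}
baseline = {pid: one_hot_encoding(doms, vocabulary) for pid, doms in proteins.items()}
```

With the one-hot scheme the feature length grows with the number of distinct domains, whereas the pooled embedding keeps a fixed, dense dimension. Either feature matrix could then be passed to a multilabel classifier such as the RAndom k-labELsets and random forest combination named in the abstract.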

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Natural Language Processing / Membrane Proteins Type of study: Diagnostic_studies Limits: Humans Language: En Journal: Comput Math Methods Med Journal subject: INFORMATICA MEDICA Year: 2021 Document type: Article Affiliation country: China

...