Search | BVS España Search Portal

Prediction of RNA-protein interactions by combining deep convolutional neural network with feature selection ensemble method.

Wang, Lei; Yan, Xin; Liu, Meng-Lin; Song, Ke-Jian; Sun, Xiao-Fei; Pan, Wen-Wen.

J Theor Biol ; 461: 230-238, 2019 01 14.

Article in English | MEDLINE | ID: mdl-30321541

ABSTRACT

RNA-protein interaction (RPI) plays an important role in the basic cellular processes of organisms. Unfortunately, because of time and cost constraints, it is difficult to determine RNA-protein relationships on a large scale through biological experiments. There is therefore an urgent need for reliable computational methods that predict RNA-protein interactions quickly and accurately. In this study, we propose a novel computational method, RPIFSE (predicting RPI with Feature Selection Ensemble method), which predicts RPI from RNA and protein sequence information. First, RPIFSE perturbs the features extracted by the convolutional neural network (CNN) and generates multiple data sets according to the feature weights, and then uses an extreme learning machine (ELM) classifier to classify these data sets. Finally, the results of the classifiers are combined, and the highest score is chosen as the final prediction by a weighted voting method. In 5-fold cross-validation experiments, RPIFSE achieved 91.87%, 89.74%, 97.76% and 98.98% accuracy on the RPI369, RPI2241, RPI488 and RPI1807 data sets, respectively. To further evaluate the performance of RPIFSE, we compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on those data sets. Furthermore, we also predicted RPIs on the independent data set NPInter2.0 and drew a network graph based on the prediction results. These promising comparison results demonstrate the effectiveness of RPIFSE and indicate that it could be a useful tool for predicting RPI.
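The combination step described in the abstract can be illustrated with a minimal weighted-voting sketch. The abstract does not say how the per-classifier weights are derived, so the weights below are hypothetical placeholders, and the function name `weighted_vote` is an assumption made for illustration, not part of RPIFSE.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier into a single label.

    predictions: list of class labels, one per base classifier.
    weights: list of non-negative floats, one per base classifier
             (hypothetical here; the abstract does not define them).
    Returns the winning label and the accumulated score per label.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight  # each classifier adds its weight to its label
    best = max(scores, key=scores.get)  # label with the highest total score
    return best, dict(scores)


# Four base classifiers vote on one RNA-protein pair (1 = interacting).
label, scores = weighted_vote([1, 0, 1, 0], [0.75, 0.5, 0.5, 0.5])
```

Here the two classifiers voting for label 1 carry a combined weight of 1.25 against 1.0 for label 0, so the pair is predicted as interacting even though the raw vote count is tied.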

Subject(s)

Neural Networks, Computer , RNA/metabolism , Computational Biology/methods , Datasets as Topic , Protein Binding , Sequence Analysis , Support Vector Machine

Identification of potential drug-targets by combining evolutionary information extracted from frequency profiles and molecular topological structures.

Wang, Lei; You, Zhu-Hong; Li, Li-Ping; Yan, Xin; Zhang, Wei; Song, Ke-Jian; Song, Chuan-Dong.

Chem Biol Drug Des ; 96(2): 758-767, 2020 08.

Article in English | MEDLINE | ID: mdl-31393672

ABSTRACT

Identifying interactions among drug compounds and target proteins is the basis of drug research and plays a crucial role in drug discovery. However, determining drug-target interactions (DTIs) and potential protein-compound interactions by biological experiment-based methods alone is a very complicated, expensive, and time-consuming process. Hence, there is strong motivation to design in silico prediction methods to overcome these obstacles. In this work, we designed a novel in silico strategy to predict proteome-scale DTIs based on the assumption that DTI pairs can be expressed through the evolutionary information derived from frequency profiles and the structural properties of drugs. To achieve this, drug molecules are encoded as substructure fingerprints that represent certain fragments; target proteins are first converted into a position-specific scoring matrix (PSSM) and then encoded as 2-dimensional principal component analysis (2DPCA) descriptors. In the prediction phase, the feature-weighted rotation forest (RF) classifier is used to estimate whether a drug and a target interact with each other on four benchmark datasets: Enzymes, Ion Channels, GPCRs, and Nuclear Receptors. The cross-validation prediction accuracy on the four datasets is 95.40%, 88.82%, 85.67%, and 82.22%, respectively. To assess the proposed approach more clearly, we compared it with the discrete cosine transform (DCT) descriptor model, the support vector machine (SVM) classifier model, and existing approaches, including DBSI, NetCBP, KBMF2K, SIMCOMP, and RFDT. The experimental results indicate that the proposed approach can effectively improve DTI prediction accuracy and can be used as a practical tool for the research and design of new drugs.
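To make the encoding step concrete, the sketch below shows how a drug might be turned into a binary substructure fingerprint and joined with a protein descriptor to form one feature vector per drug-target pair. It is a toy illustration only: the fragment vocabulary, the descriptor values, and both function names are hypothetical, and the PSSM and 2DPCA computations named in the abstract are not reproduced here.

```python
def substructure_fingerprint(present_fragments, vocabulary):
    """Return a 0/1 list marking which vocabulary fragments the molecule has."""
    return [1 if fragment in present_fragments else 0 for fragment in vocabulary]


def pair_features(fingerprint, protein_descriptor):
    """Concatenate drug bits and protein descriptor values into one vector."""
    return [float(bit) for bit in fingerprint] + list(protein_descriptor)


# Hypothetical fragment vocabulary; real fingerprints use hundreds of keys.
VOCABULARY = ["hydroxyl", "carboxyl", "amine", "benzene_ring"]

fingerprint = substructure_fingerprint({"hydroxyl", "benzene_ring"}, VOCABULARY)
# Placeholder numbers standing in for a 2DPCA descriptor of the target protein.
features = pair_features(fingerprint, [0.12, -0.40, 0.88])
```

A classifier such as the rotation forest mentioned in the abstract would then be trained on one such vector per drug-target pair.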

Subject(s)

Enzymes/chemistry , Ion Channels/chemistry , Pharmaceutical Preparations/chemistry , Receptors, Cytoplasmic and Nuclear/chemistry , Receptors, G-Protein-Coupled/chemistry , Computer Simulation , Databases, Protein , Drug Discovery , Drug Interactions , Enzymes/metabolism , Humans , Ion Channels/metabolism , Molecular Structure , Pharmaceutical Preparations/metabolism , Position-Specific Scoring Matrices , Principal Component Analysis , Receptors, Cytoplasmic and Nuclear/metabolism , Receptors, G-Protein-Coupled/metabolism , Support Vector Machine

NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information.

Jia, Li-Na; Yan, Xin; You, Zhu-Hong; Zhou, Xi; Li, Li-Ping; Wang, Lei; Song, Ke-Jian.

Evol Bioinform Online ; 16: 1176934320984171, 2020.

Article in English | MEDLINE | ID: mdl-33488064

ABSTRACT

The study of self-interacting proteins (SIPs) can not only reveal the function of proteins at the molecular level, but is also crucial for understanding activities such as growth, development, differentiation, and apoptosis, providing an important theoretical basis for exploring the mechanisms of major diseases. With the rapid advances in biotechnology, a large number of SIPs have been discovered. However, because of the long duration and high cost inherent to biological experiments, the gap between the identification of SIPs and the accumulation of data is growing. Therefore, fast and accurate computational methods are needed to effectively predict SIPs. In this study, we designed a new method, NLPEI, for predicting SIPs based on natural language understanding theory and evolutionary information. Specifically, we first treat the protein sequence as natural language and use natural language processing algorithms to extract its features. Then, we use the Position-Specific Scoring Matrix (PSSM) to represent the evolutionary information of the protein and extract its features with the Stacked Auto-Encoder (SAE) deep learning algorithm. Finally, we fuse the natural language features of proteins with the evolutionary features and make predictions with an Extreme Learning Machine (ELM) classifier. On the gold-standard SIP data sets of human and yeast, NLPEI achieved 94.19% and 91.29% prediction accuracy, respectively. Compared with different classifier models, different feature models, and other existing methods, NLPEI obtained the best results. These experimental results indicate that NLPEI is an effective tool for predicting SIPs and can provide reliable candidates for biological experiments.
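One common way to treat a protein sequence as text is to split it into overlapping k-mers that play the role of words. The abstract does not name the natural language processing algorithm NLPEI uses, so the sketch below is an assumption chosen for illustration; the function name `kmer_tokens`, the choice k = 3, and the example sequence are all hypothetical.

```python
from collections import Counter


def kmer_tokens(sequence, k=3):
    """Split an amino-acid sequence into overlapping k-mer 'words'."""
    sequence = sequence.upper()
    # A sequence shorter than k yields no tokens (empty range).
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Toy sequence; real proteins are typically hundreds of residues long.
tokens = kmer_tokens("MKVLAAMKV", k=3)
counts = Counter(tokens)  # bag-of-words style frequency of each k-mer
```

The resulting token list can be fed to a word-embedding or bag-of-words model in the same way as a sentence, and the learned features can then be fused with PSSM-derived features as the abstract describes.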

Predicting Protein-Protein Interactions from Matrix-Based Protein Sequence Using Convolution Neural Network and Feature-Selective Rotation Forest.

Wang, Lei; Wang, Hai-Feng; Liu, San-Rong; Yan, Xin; Song, Ke-Jian.

Sci Rep ; 9(1): 9848, 2019 07 08.

Article in English | MEDLINE | ID: mdl-31285519

ABSTRACT

Protein is an essential component of the living organism. The prediction of protein-protein interactions (PPIs) has important implications for understanding the behavioral processes of life, preventing diseases, and developing new drugs. Although the development of high-throughput technology makes it possible to identify PPIs in large-scale biological experiments, constraints of time, cost, false-positive rate and other conditions restrict the extensive use of experimental methods. Therefore, there is an urgent need for computational methods that supplement experimental methods and predict PPIs rapidly and accurately. In this paper, we propose a novel approach, namely CNN-FSRF, for predicting PPIs from protein sequence by combining a deep learning Convolution Neural Network (CNN) with Feature-Selective Rotation Forest (FSRF). The proposed method first converts the protein sequence into a Position-Specific Scoring Matrix (PSSM) containing biological evolution information, then uses the CNN to objectively and efficiently extract the deeply hidden features of the protein, and finally removes redundant noise information with FSRF and gives the prediction results. When applied to the Yeast and Helicobacter pylori PPI datasets, CNN-FSRF achieved prediction accuracies of 97.75% and 88.96%, respectively. To further evaluate the prediction performance, we compared CNN-FSRF with SVM and other existing methods. In addition, we also verified the performance of CNN-FSRF on independent datasets. The experimental results indicate that CNN-FSRF can be used as a useful complement to biological experiments to identify protein interactions.
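The core operation a CNN applies to a matrix such as a PSSM can be sketched in a few lines: slide a kernel along the sequence positions, apply a ReLU, and keep the maximum activation. This is a generic illustration under stated assumptions, not the CNN-FSRF architecture, which the abstract does not specify; the function name, the toy matrix (2 columns instead of the 20 of a real PSSM), and the kernel values are hypothetical.

```python
def conv_relu_maxpool(matrix, kernel):
    """Apply one convolution filter along the rows, then ReLU and max pooling.

    matrix: list of L rows, each with C numeric columns (one row per residue).
    kernel: list of W rows, each with C weights, with W <= L.
    Returns a single pooled feature value for this filter.
    """
    width = len(kernel)
    if width > len(matrix):
        raise ValueError("kernel is longer than the input matrix")
    activations = []
    for start in range(len(matrix) - width + 1):
        total = 0.0
        for offset in range(width):
            row = matrix[start + offset]
            total += sum(m * k for m, k in zip(row, kernel[offset]))
        activations.append(max(0.0, total))  # ReLU keeps only positive responses
    return max(activations)  # global max pooling over all window positions


toy_matrix = [[1, 0], [0, 2], [3, -1], [-2, 1]]  # 4 positions x 2 columns
toy_kernel = [[1, 0], [0, 1]]                     # window of 2 positions
feature = conv_relu_maxpool(toy_matrix, toy_kernel)
```

A real network learns many such kernels and stacks several layers; the pooled outputs form the feature vector that a downstream classifier, such as the rotation forest named in the abstract, would consume.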

Subject(s)

Computational Biology/methods , Helicobacter pylori/metabolism , Protein Interaction Mapping/methods , Saccharomyces cerevisiae/metabolism , Bacterial Proteins/metabolism , Databases, Protein , Deep Learning , Neural Networks, Computer , Position-Specific Scoring Matrices , Saccharomyces cerevisiae Proteins/metabolism

A Computational-Based Method for Predicting Drug-Target Interactions by Using Stacked Autoencoder Deep Neural Network.

Wang, Lei; You, Zhu-Hong; Chen, Xing; Xia, Shi-Xiong; Liu, Feng; Yan, Xin; Zhou, Yong; Song, Ke-Jian.

J Comput Biol ; 25(3): 361-373, 2018 03.

Article in English | MEDLINE | ID: mdl-28891684

ABSTRACT

Identifying the interaction between drugs and target proteins is an important area of drug research, which provides a broad prospect for low-risk and faster drug development. However, because of the limitations of traditional experiments for revealing drug-target interactions (DTIs), the screening of targets not only takes a lot of time and money but also has high false-positive and false-negative rates. Therefore, it is imperative to develop effective automatic computational methods to accurately predict DTIs in the postgenome era. In this article, we propose a new computational method for predicting DTIs from drug molecular structure and protein sequence by using the stacked autoencoder of deep learning, which can adequately extract the raw data information. The proposed method has the advantage that it can automatically mine the hidden information from protein sequences and generate highly representative features through iterations of multiple layers. The feature descriptors are then constructed by combining these features with the molecular substructure fingerprint information, and are fed into the rotation forest for prediction. The experimental results of fivefold cross-validation indicate that the proposed method achieves superior performance on gold standard data sets (enzymes, ion channels, GPCRs [G-protein-coupled receptors], and nuclear receptors) with accuracies of 0.9414, 0.9116, 0.8669, and 0.8056, respectively. We further explore the performance of the proposed method by comparing it with other feature extraction algorithms, state-of-the-art classifiers, and other methods on the same data set. The comparison results demonstrate that the proposed method is highly competitive when predicting drug-target interactions.
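The fivefold cross-validation protocol behind the reported accuracies can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the function names, the fixed random seed, and the round-robin way of forming folds are assumptions, and the labels in the usage lines are made up.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


splits = list(k_fold_indices(10, k=5))
example_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

Each sample is used for testing exactly once; the reported figure is usually the mean of the per-fold accuracies.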

Subject(s)

Deep Learning , Molecular Docking Simulation/methods , Sequence Analysis, Protein/methods , Databases, Chemical , Molecular Docking Simulation/standards , Protein Binding , Reproducibility of Results , Sequence Analysis, Protein/standards
