Results 1 - 20 of 27
1.
Micromachines (Basel) ; 14(7)2023 Jul 03.
Article in English | MEDLINE | ID: mdl-37512678

ABSTRACT

Equilibrium propagation (EP) has been proposed recently as a new neural network training algorithm based on a local learning concept, where only local information is used to calculate the weight update of the neural network. Despite the advantages of local learning, numerical iteration for solving the EP dynamic equations makes the EP algorithm less practical for realizing edge intelligence hardware. Some analog circuits have been suggested to solve the EP dynamic equations physically, not numerically, using the original EP algorithm. However, there are still a few problems in terms of circuit implementation: for example, the need for storing the free-phase solution and the lack of essential peripheral circuits for calculating and updating synaptic weights. Therefore, in this paper, a new analog circuit technique is proposed to realize the EP algorithm in practical and implementable hardware. This work has two major contributions in achieving this objective. First, the free-phase and nudge-phase solutions are calculated by the proposed analog circuits simultaneously, not at different times. With this process, analog voltage memories or digital memories with converting circuits between digital and analog domains for storing the free-phase solution temporarily can be eliminated in the proposed EP circuit. Second, a simple EP learning rule relying on a fixed amount of conductance change per programming pulse is newly proposed and implemented in peripheral circuits. The modified EP learning rule can make the weight update circuit practical and implementable without requiring the use of a complicated program verification scheme. The proposed memristor conductance update circuit is simulated and verified for training synaptic weights on memristor crossbars. The simulation results showed that the proposed EP circuit could be used for realizing on-device learning in edge intelligence hardware.
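
As a rough illustration of the "fixed amount of conductance change per programming pulse" idea, the sketch below applies at most one fixed-size step to each synapse, with the direction taken from the usual EP contrast between nudge-phase and free-phase activity products. The abstract does not give the exact rule, so the sign-based form, the function name `ep_fixed_step_update`, and the values of `delta_g`, `g_min`, and `g_max` are all assumptions made for this example.

```python
def ep_fixed_step_update(g, pre_free, post_free, pre_nudge, post_nudge,
                         delta_g=1e-6, g_min=1e-6, g_max=1e-4):
    """Return new conductances after one fixed-size pulse per synapse.

    g[i][j] is the conductance between pre-neuron i and post-neuron j.
    Assumption: the pulse direction is the sign of the EP contrast
    (nudge-phase product minus free-phase product); the step size is
    always delta_g, so no program-verify loop is modelled.
    """
    updated = []
    for i, row in enumerate(g):
        new_row = []
        for j, g_ij in enumerate(row):
            contrast = (pre_nudge[i] * post_nudge[j]
                        - pre_free[i] * post_free[j])
            if contrast > 0:
                step = delta_g       # one potentiating pulse
            elif contrast < 0:
                step = -delta_g      # one depressing pulse
            else:
                step = 0.0           # no pulse
            # Keep the conductance inside the assumed device range.
            new_row.append(min(g_max, max(g_min, g_ij + step)))
        updated.append(new_row)
    return updated


g_new = ep_fixed_step_update(
    g=[[5e-5], [5e-5]],
    pre_free=[1.0, 0.0], post_free=[0.5],
    pre_nudge=[1.0, 0.0], post_nudge=[1.0],
)
```

Because both phases are passed in together, the sketch also reflects the point that the free-phase solution does not need to be stored between phases.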

2.
Micromachines (Basel) ; 14(2)2023 Jan 25.
Article in English | MEDLINE | ID: mdl-36838009

ABSTRACT

Memristor crossbars can be very useful for realizing edge-intelligence hardware, because the neural networks implemented by memristor crossbars can save significantly more computing energy and layout area than the conventional CMOS (complementary metal-oxide-semiconductor) digital circuits. One of the important operations used in neural networks is convolution. For performing the convolution by memristor crossbars, the full image should be partitioned into several sub-images. By doing so, each sub-image convolution can be mapped to small-size unit crossbars, of which the size should be defined as 128 × 128 or 256 × 256 to avoid the line resistance problem caused from large-size crossbars. In this paper, various convolution schemes with 3D, 2D, and 1D kernels are analyzed and compared in terms of neural network's performance and overlapping overhead. The neural network's simulation indicates that the 2D + 1D kernels can perform the sub-image convolution using a much smaller number of unit crossbars with less rate loss than the 3D kernels. When the CIFAR-10 dataset is tested, the mapping of sub-image convolution of 2D + 1D kernels to crossbars shows that the number of unit crossbars can be reduced almost by 90% and 95%, respectively, for 128 × 128 and 256 × 256 crossbars, compared with the 3D kernels. On the contrary, the rate loss of 2D + 1D kernels can be less than 2%. To improve the neural network's performance more, the 2D + 1D kernels can be combined with 3D kernels in one neural network. When the normalized ratio of 2D + 1D layers is around 0.5, the neural network's performance indicates very little rate loss compared to when the normalized ratio of 2D + 1D layers is zero. However, the number of unit crossbars for the normalized ratio = 0.5 can be reduced by half compared with that for the normalized ratio = 0.
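
The sub-image partitioning described above can be sketched in software as follows. This is only an illustration of why neighbouring sub-images must overlap by (kernel size - 1) pixels; it does not model crossbars, line resistance, or the 2D + 1D kernel schemes. The function names and the `tile_out` parameter are hypothetical, and the convolution is the unflipped (cross-correlation) form common in neural networks.

```python
def conv2d_valid(image, kernel):
    """Plain 'valid' 2D convolution (no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def tiled_conv2d(image, kernel, tile_out):
    """Convolve by splitting the image into overlapping sub-images.

    Each sub-image produces at most tile_out x tile_out outputs, so it
    must cover (tile_out + kernel size - 1) input pixels per side. The
    extra kernel size - 1 pixels are the overlap between neighbours.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    n_tiles = 0
    for r in range(0, oh, tile_out):
        for c in range(0, ow, tile_out):
            rows = min(tile_out, oh - r)
            cols = min(tile_out, ow - c)
            sub = [row[c:c + cols + kw - 1]
                   for row in image[r:r + rows + kh - 1]]
            part = conv2d_valid(sub, kernel)
            for i in range(rows):
                out[r + i][c:c + cols] = part[i]
            n_tiles += 1
    return out, n_tiles


image = [[(3 * i + 5 * j) % 7 for j in range(8)] for i in range(8)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
tiled, n_tiles = tiled_conv2d(image, kernel, tile_out=4)
```

The tiled result matches the full-image convolution exactly; the cost is that overlapping pixels are read more than once, which is the overlapping overhead the abstract refers to.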

3.
BMC Bioinformatics ; 23(1): 299, 2022 Jul 25.
Article in English | MEDLINE | ID: mdl-35879658

ABSTRACT

BACKGROUND: A large body of evidence from biological experiments has confirmed that miRNAs play an important role in the progression and development of various complex human diseases. However, traditional experimental methods are expensive and time-consuming. Therefore, developing more accurate and efficient methods for predicting potential associations between miRNAs and diseases is a challenging task. RESULTS: In this study, we developed a computational model that combines a heterogeneous graph convolutional network with an enhanced layer for miRNA-disease association prediction (HGCNELMDA). The major improvements of our method are that a random walk with restart optimizes the original features of the nodes, and that a reinforcement layer added to the hidden layer of the graph convolutional network retains similarity information between nodes in the feature space. In addition, the proposed approach recalculates the influence of neighborhood nodes on target nodes by introducing an attention mechanism. The reliable performance of HGCNELMDA was confirmed by an AUC of 93.47% in global leave-one-out cross-validation (LOOCV) and an average AUC of 93.01% in fivefold cross-validation. We also compared HGCNELMDA with state-of-the-art methods. The comparative results indicated that HGCNELMDA is very promising and may provide a cost-effective alternative for miRNA-disease association prediction. Moreover, we applied HGCNELMDA to 3 case studies to predict potential miRNAs related to lung cancer, prostate cancer, and pancreatic cancer. Results showed that 48, 50, and 50 of the top 50 predicted miRNAs were supported by experimental association evidence. Therefore, HGCNELMDA is a reliable method for predicting disease-related miRNAs. CONCLUSIONS: The results of the HGCNELMDA method in LOOCV and fivefold cross-validation were 93.47% and 93.01%, respectively. Compared with other typical methods, the performance of HGCNELMDA is higher. Three case studies of lung cancer, prostate cancer, and pancreatic cancer were conducted. Among the top 50 predicted candidate miRNAs, 48, 50, and 50 were verified in the biological database HDMMV2.0. This further confirms the feasibility and effectiveness of our method. To facilitate future research on disease-related miRNAs, we developed a freely available web server called HGCNELMDA, which is available at http://124.221.62.44:8080/HGCNELMDA.jsp .
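
For readers unfamiliar with the random-walk step mentioned in this abstract, the following is a minimal sketch of a generic random walk with restart on a small graph. It is not the authors' implementation: the column normalization, the restart probability of 0.3, and the function name `random_walk_with_restart` are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj is a square adjacency (or similarity) matrix as nested lists,
    W is adj with each column scaled to sum to 1, and p0 puts all
    probability on the seed node.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
          for j in range(n)] for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        p_next = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
                  + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Three nodes in a line: 0 - 1 - 2, walking from node 0.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
scores = random_walk_with_restart(path, seed=0)
```

The resulting scores rank nodes by proximity to the seed, which is how such walks are typically used to smooth or enrich node features before a graph neural network is applied.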


Subject(s)
Lung Neoplasms; MicroRNAs; Pancreatic Neoplasms; Prostatic Neoplasms; Algorithms; Computational Biology/methods; Genetic Association Studies; Genetic Predisposition to Disease; Humans; Lung Neoplasms/genetics; Male; MicroRNAs/genetics; Prostatic Neoplasms/genetics
4.
Micromachines (Basel) ; 13(2)2022 Feb 08.
Article in English | MEDLINE | ID: mdl-35208396

ABSTRACT

To overcome the limitations of CMOS digital systems, emerging computing circuits such as memristor crossbars have been investigated as potential candidates for significantly increasing the speed and energy efficiency of next-generation computing systems, which are required for implementing future AI hardware. Unfortunately, manufacturing yield still remains a serious challenge in adopting memristor-based computing systems due to the limitations of immature fabrication technology. To compensate for malfunction of neural networks caused from the fabrication-related defects, a new crossbar training scheme combining the synapse-aware with the neuron-aware together is proposed in this paper, for optimizing the defect map size and the neural network's performance simultaneously. In the proposed scheme, the memristor crossbar's columns are divided into 3 groups, which are the severely-defective, moderately-defective, and normal columns, respectively. Here, each group is trained according to the trade-off relationship between the neural network's performance and the hardware overhead of defect-tolerant training. As a result of this group-based training method combining the neuron-aware with the synapse-aware, in this paper, the new scheme can be successful in improving the network's performance better than both the synapse-aware and the neuron-aware while minimizing its hardware burden. For example, when testing the defect percentage = 10% with MNIST dataset, the proposed scheme outperforms the synapse-aware and the neuron-aware by 3.8% and 3.4% for the number of crossbar's columns trained for synapse defects = 10 and 138 among 310, respectively, while maintaining the smaller memory size than the synapse-aware. When the trained columns = 138, the normalized memory size of the synapse-neuron-aware scheme can be smaller by 3.1% than the synapse-aware.

5.
Micromachines (Basel) ; 12(7)2021 Jul 01.
Article in English | MEDLINE | ID: mdl-34357201

ABSTRACT

Voltages and currents in a memristor crossbar can be significantly affected due to nonideal effects such as parasitic source, line, and neuron resistance. These nonideal effects related to the parasitic resistance can cause the degradation of the neural network's performance realized with the nonideal memristor crossbar. To avoid performance degradation due to the parasitic-resistance-related nonideal effects, adaptive training methods were proposed previously. However, the complicated training algorithm could add a heavy computational burden to the neural network hardware. Especially, the hardware and algorithmic burden can be more serious for edge intelligence applications such as Internet of Things (IoT) sensors. In this paper, a memristor-CMOS hybrid neuron circuit is proposed for compensating the parasitic-resistance-related nonideal effects during not the training phase but the inference one, where the complicated adaptive training is not needed. Moreover, unlike the previous linear correction method performed by the external hardware, the proposed correction circuit can be included in the memristor crossbar to minimize the power and hardware overheads for compensating the nonideal effects. The proposed correction circuit has been verified to be able to restore the degradation of source and output voltages in the nonideal crossbar. For the source voltage, the average percentage error of the uncompensated crossbar is as large as 36.7%. If the correction circuit is used, the percentage error in the source voltage can be reduced from 36.7% to 7.5%. For the output voltage, the average percentage error of the uncompensated crossbar is as large as 65.2%. The correction circuit can improve the percentage error in the output voltage from 65.2% to 8.6%. Almost the percentage error can be reduced to ~1/7 if the correction circuit is used. The nonideal memristor crossbar with the correction circuit has been tested for MNIST and CIFAR-10 datasets in this paper. 
For MNIST, the uncompensated and compensated crossbars indicate the recognition rate of 90.4% and 95.1%, respectively, compared to 95.5% of the ideal crossbar. For CIFAR-10, the nonideal crossbars without and with the nonideal-effect correction show the rate of 85.3% and 88.1%, respectively, compared to the ideal crossbar achieving the rate as large as 88.9%.

6.
BioData Min ; 14(1): 3, 2021 Jan 20.
Article in English | MEDLINE | ID: mdl-33472664

ABSTRACT

BACKGROUND: Prediction of novel drug-target interactions (DTIs) plays an important role in discovering new drug candidates and finding new proteins to target. Because experimental methods are time-consuming and expensive, developing efficient computational approaches for accurately predicting potential associations between drugs and targets is a challenging task. RESULTS: In this paper, we propose a novel computational method called WELM-SURF, based on drug fingerprints and protein evolutionary information, for identifying DTIs. More specifically, to exploit protein sequence features, a Position Specific Scoring Matrix (PSSM) is used to capture protein evolutionary information, and Speeded-Up Robust Features (SURF) is employed to extract key sequence features from the PSSM. For drug fingerprints, molecular substructure fingerprints of the chemical structure were used to represent each drug as a feature vector. The Weighted Extreme Learning Machine (WELM) has a short training time, good generalization ability, and, most importantly, the ability to perform classification efficiently by optimizing the loss function of the weight matrix. Therefore, the WELM classifier is used to carry out classification based on the extracted features for predicting DTIs. The performance of the WELM-SURF model was evaluated on enzyme, ion channel, GPCR, and nuclear receptor datasets using fivefold cross-validation. WELM-SURF obtained average accuracies of 93.54%, 90.58%, 85.43%, and 77.45% on the enzyme, ion channel, GPCR, and nuclear receptor datasets, respectively. We also compared our method with the Extreme Learning Machine (ELM) and the state-of-the-art Support Vector Machine (SVM) on the enzyme and ion channel datasets, and with other existing methods on all four datasets. In these comparisons, the performance of WELM-SURF is significantly better than that of ELM, SVM, and other previous methods in the domain. CONCLUSION: The results demonstrated that the proposed WELM-SURF model is competent for predicting DTIs with high accuracy and robustness. It is anticipated that the WELM-SURF method will be a useful computational tool for a wide range of bioinformatics studies related to DTI prediction.

7.
BMC Bioinformatics ; 21(1): 470, 2020 Oct 21.
Article in English | MEDLINE | ID: mdl-33087064

ABSTRACT

BACKGROUND: Many studies prove that miRNAs have significant roles in diagnosing and treating complex human diseases. However, conventional biological experiments are too costly and time-consuming to identify unconfirmed miRNA-disease associations. Thus, computational models predicting unidentified miRNA-disease pairs in an efficient way are becoming promising research topics. Although existing methods have performed well to reveal unidentified miRNA-disease associations, more work is still needed to improve prediction performance. RESULTS: In this work, we present a novel multiple meta-paths fusion graph embedding model to predict unidentified miRNA-disease associations (M2GMDA). Our method takes full advantage of the complex structure and rich semantic information of miRNA-disease interactions in a self-learning way. First, a miRNA-disease heterogeneous network was derived from verified miRNA-disease pairs, miRNA similarity and disease similarity. All meta-path instances connecting miRNAs with diseases were extracted to describe intrinsic information about miRNA-disease interactions. Then, we developed a graph embedding model to predict miRNA-disease associations. The model is composed of linear transformations of miRNAs and diseases, the means encoder of a single meta-path instance, the attention-aware encoder of meta-path type and attention-aware multiple meta-path fusion. We innovatively integrated meta-path instances, meta-path based neighbours, intermediate nodes in meta-paths and more information to strengthen the prediction in our model. In particular, distinct contributions of different meta-path instances and meta-path types were combined with attention mechanisms. The data sets and source code that support the findings of this study are available at https://github.com/dangdangzhang/M2GMDA . CONCLUSIONS: M2GMDA achieved AUCs of 0.9323 and 0.9182 in global leave-one-out cross validation and fivefold cross validation with HDMM V2.0. 
The results showed that our method outperforms other prediction methods. Three kinds of case studies with lung neoplasms, breast neoplasms, prostate neoplasms, pancreatic neoplasms, lymphoma and colorectal neoplasms demonstrated that 47, 50, 49, 48, 50 and 50 out of the top 50 candidate miRNAs predicted by M2GMDA were validated by biological experiments. Therefore, it further confirms the prediction performance of our method.


Subject(s)
Computational Biology/methods; Computer Graphics; MicroRNAs/genetics; Neoplasms/genetics; Algorithms; Area Under Curve; Genetic Predisposition to Disease/genetics; Humans; Male; Risk Factors
8.
Evol Bioinform Online ; 16: 1176934320924674, 2020.
Article in English | MEDLINE | ID: mdl-32550764

ABSTRACT

Self-interacting proteins (SIPs) play crucial roles in the biological activities of organisms. Many high-throughput methods can be used to identify SIPs. However, these methods are both time-consuming and expensive. Developing effective computational approaches for identifying SIPs is therefore a challenging task. In this article, we present a novel computational method called RNN-SIFT, which combines the recurrent neural network (RNN) with the scale-invariant feature transform (SIFT) to predict SIPs based on protein evolutionary information. The main advantage of the proposed RNN-SIFT model is that it uses SIFT to extract key features by exploring the evolutionary information embedded in the position-specific scoring matrix constructed by Position-Specific Iterated BLAST, and employs an RNN classifier to perform classification based on the extracted features. Extensive experiments show that RNN-SIFT obtained average accuracies of 94.34% and 97.12% on the yeast and human datasets, respectively. We also compared its performance with the back propagation neural network (BPNN), the state-of-the-art support vector machine (SVM), and other existing methods. In these comparisons, the performance of RNN-SIFT is significantly better than that of the BPNN, SVM, and other previous methods in the domain. Therefore, we conclude that the proposed RNN-SIFT model is a useful tool for predicting SIPs, as well as for solving other bioinformatics tasks. To facilitate wider studies and encourage future proteomics research, a freely available web server called RNN-SIFT-SIPs was developed at http://219.219.62.123:8888/RNNSIFT/ including the source code and the SIP datasets.

9.
Evol Bioinform Online ; 15: 1176934319879920, 2019.
Article in English | MEDLINE | ID: mdl-31619921

ABSTRACT

BACKGROUND: Increasing evidence has indicated that protein-protein interactions (PPIs) play important roles in various aspects of the structural and functional organization of a cell. Thus, continuing to uncover potential PPIs is an important topic in the biomedical domain. Although various feature extraction methods with machine learning approaches have enhanced the prediction of PPIs. There remains room for improvement by developing novel and effective feature extraction methods and classifier approaches to identify PPIs. METHOD: In this study, we proposed a sequence-based feature extraction method called LCPSSMMF, which combined local coding position-specific scoring matrix (PSSM) with multifeatures fusion. First, we used a novel local coding method based on PSSM to build a new PSSM (CPSSM); the advantage of this method is that it incorporated global and local feature extraction, which can account for the interactions between residues in both continuous and discontinuous regions of amino acid sequences. Second, we adopted 2 different feature extraction methods (Local Average Group [LAG] and Bigram Probability [BP]) to capture multiple key feature information by employing the evolutionary information embedded in the CPSSM matrix. Finally, feature vectors were acquired by using multifeatures fusion method. RESULT: To evaluate the performance of the proposed feature extraction approach, we employed support vector machine (SVM) as a prediction classifier and applied this method to yeast and human PPI datasets. The prediction accuracies of LCPSSMMF were 93.43% and 90.41% on the yeast and human datasets, respectively. Moreover, we also compared the proposed method with the previous sequence-based approaches on the yeast datasets by using the same SVM classifier. The experimental results indicated that the performance of LCPSSMMF significantly exceeded that of several other state-of-the-art methods. 
It is proven that the LCPSSMMF approach can capture more local and global discriminatory information than almost all previous methods and can function remarkably well in identifying PPIs. To facilitate extensive research in future proteomics studies, we developed a LCPSSMMFSVM server, which is freely available for academic use at http://219.219.62.123:8888/LCPSSMMFSVM.

10.
Evol Bioinform Online ; 15: 1176934319844522, 2019.
Article in English | MEDLINE | ID: mdl-31080346

ABSTRACT

Protein-protein interactions (PPIs) are essential to a number of biological processes. The PPIs generated by biological experiment are both time-consuming and expensive. Therefore, many computational methods have been proposed to identify PPIs. However, most of these methods are limited as they are difficult to compute and rely on a large number of homologous proteins. Accordingly, it is urgent to develop effective computational methods to detect PPIs using only protein sequence information. The kernel parameter of relevance vector machine (RVM) is set by experience, which may not obtain the optimal solution, affecting the prediction performance of RVM. In this work, we presented a novel computational approach called GWORVM-BIG, which used Bi-gram (BIG) to represent protein sequences on a position-specific scoring matrix (PSSM) and GWORVM classifier to perform classification for predicting PPIs. More specifically, the proposed GWORVM model can obtain the optimum solution of kernel parameters using gray wolf optimizer approach, which has the advantages of less control parameters, strong global optimization ability, and ease of implementation compared with other optimization algorithms. The experimental results on yeast and human data sets demonstrated the good accuracy and efficiency of the proposed GWORVM-BIG method. The results showed that the proposed GWORVM classifier can significantly improve the prediction performance compared with the RVM model using other optimizer algorithms including grid search (GS), genetic algorithm (GA), and particle swarm optimization (PSO). In addition, the proposed method is also compared with other existing algorithms, and the experimental results further indicated that the proposed GWORVM-BIG model yields excellent prediction performance. For facilitating extensive studies for future proteomics research, the GWORVMBIG server is freely available for academic use at http://219.219.62.123:8888/GWORVMBIG.
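
The bi-gram representation mentioned above is commonly computed from a PSSM by summing, over consecutive sequence positions, the product of the score for one amino acid type at position k and another at position k + 1. The sketch below follows that common formulation, which is an assumption here because the abstract does not spell it out; the function name `bigram_features` is hypothetical, and any normalization of the PSSM is left to the caller.

```python
def bigram_features(pssm):
    """Flattened bi-gram matrix of a PSSM given as nested lists.

    pssm has one row per residue and one column per amino acid type.
    Entry (i, j) of the bi-gram matrix is
        sum over k of pssm[k][i] * pssm[k + 1][j],
    so a standard 20-column PSSM gives 20 * 20 = 400 features
    regardless of the protein length.
    """
    n_rows, n_cols = len(pssm), len(pssm[0])
    features = []
    for i in range(n_cols):
        for j in range(n_cols):
            features.append(sum(pssm[k][i] * pssm[k + 1][j]
                                for k in range(n_rows - 1)))
    return features


# Toy "PSSM" with 3 residues and 2 columns, for illustration only.
toy = [[1, 2],
       [3, 4],
       [5, 6]]
feats = bigram_features(toy)
```

The fixed output length is what allows proteins of different lengths to be fed to a single classifier such as an RVM.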

11.
Biotechnol Biofuels ; 12: 6, 2019.
Article in English | MEDLINE | ID: mdl-30622648

ABSTRACT

BACKGROUND: Based on our previous studies of 17 Prunus sibirica germplasms, one plus tree with high quality and quantity of seed oils has emerged as novel potential source of biodiesel. To better develop P. sibirica seed oils as woody biodiesel, a concurrent exploration of oil content, FA composition, biodiesel yield and fuel properties as well as prediction model construction for fuel properties was conducted on developing seeds to determine the optimal seed harvest time for producing high-quality biodiesel. Oil synthesis required supply of carbon source, energy and FA, but their transport mechanisms still remains enigmatic. Our recent 454 sequencing of P. sibirica could provide long-read sequences to identify membrane transporters for a better understanding of regulatory mechanism for high oil production in developing seeds. RESULTS: To better develop the seed oils of P. sibirica as woody biodiesel, we firstly focused on a temporal and comparative evaluation of growth tendency, oil content, FA composition, biodiesel yield and fuel properties as well as model construction for biodiesel property prediction in different developing seeds from P. sibirica plus tree (accession AS-80), revealing that the oils from developing seeds harvested after 60 days after flowering (DAF) could be as novel potential feedstock for producing biodiesel with ideal fuel property. To gain new insight into membrane transport mechanism for high oil yield in developing seeds of P. sibirica, we presented a global analysis of transporter based on our recent 454 sequencing data of P. sibirica. We annotated a total of 116 genes for membrane-localized transporters at different organelles (plastid, endoplasmatic reticulum, tonoplast, mitochondria and peroxisome), of which some specific transporters were identified to be involved in carbon allocation, metabolite transport and energy supply for oil synthesis by both RT-PCR and qRT-PCR. 
Importantly, the transporter-mediated model was well established for high oil synthesis in developing P. sibirica seeds. Our findings could help to reveal molecular mechanism of increased oil production and may also present strategies for engineering oil accumulation in oilseed plants. CONCLUSIONS: This study presents a temporal and comparative evaluation of developing P. sibirica seed oils as a potential feedstock for producing high-quality biodiesel and a global identification for membrane transporters was to gain better insights into regulatory mechanism of high oil production in developing seeds of P. sibirica. Our findings may present strategies for developing woody biodiesel resources and engineering oil accumulation.

12.
Int J Mol Sci ; 19(4)2018 Mar 29.
Article in English | MEDLINE | ID: mdl-29596363

ABSTRACT

Protein-protein interactions (PPI) are key to protein functions and regulations within the cell cycle, DNA replication, and cellular signaling. Therefore, detecting whether a pair of proteins interact is of great importance for the study of molecular biology. As researchers have become aware of the importance of computational methods in predicting PPIs, many techniques have been developed for performing this task computationally. However, there are few technologies that really meet the needs of their users. In this paper, we develop a novel and efficient sequence-based method for predicting PPIs. The evolutionary features are extracted from the position-specific scoring matrix (PSSM) of protein. The features are then fed into a robust relevance vector machine (RVM) classifier to distinguish between the interacting and non-interacting protein pairs. In order to verify the performance of our method, five-fold cross-validation tests are performed on the Saccharomyces cerevisiae dataset. A high accuracy of 94.56%, with 94.79% sensitivity at 94.36% precision, was obtained. The experimental results illustrated that the proposed approach can extract the most significant features from each protein sequence and can be a bright and meaningful tool for the research of proteomics.


Subject(s)
Databases, Protein; Models, Genetic; Saccharomyces cerevisiae Proteins; Saccharomyces cerevisiae; Software; Support Vector Machine; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism
14.
J Cheminform ; 9(1): 47, 2017 Aug 18.
Article in English | MEDLINE | ID: mdl-29086182

ABSTRACT

Self-interacting proteins (SIPs) are important for biological activity owing to the inherent interactions among their secondary structures or domains. However, due to the limitations of experimental self-interaction detection, one major challenge in SIP prediction is how to exploit computational approaches that detect SIPs based on the evolutionary information contained in the protein sequence. In this work, we present a novel computational approach named WELM-LAG, which combines the Weighted Extreme Learning Machine (WELM) classifier with Local Average Group (LAG) to predict SIPs based on protein sequence. The major improvement of our method lies in an effective feature extraction method that represents candidate self-interacting proteins by exploring the evolutionary information embedded in the PSI-BLAST-constructed position specific scoring matrix (PSSM), followed by a reliable and robust WELM classifier that carries out the classification. In addition, Principal Component Analysis (PCA) is used to reduce the impact of noise. The WELM-LAG method gave very high average accuracies of 92.94% and 96.74% on the yeast and human datasets, respectively. We also compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the human and yeast datasets. Comparative results indicated that our approach is very promising and may provide a cost-effective alternative for predicting SIPs. In addition, we developed a freely available web server called WELM-LAG-SIPs to predict SIPs. The web server is available at http://219.219.62.123:8888/WELMLAG/ .

15.
J Theor Biol ; 432: 80-86, 2017 11 07.
Article in English | MEDLINE | ID: mdl-28802824

ABSTRACT

Determining whether proteins can interact with their partners is a challenging task in fundamental research. Protein self-interaction is a special case of protein-protein interaction (PPI), and self-interacting proteins (SIPs) play a key role in the regulation of cellular functions. Due to the limitations of experimental self-interaction identification, it is very important to develop an effective computational tool for predicting SIPs based on protein sequences. In this study, we developed a novel computational method called RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) for detecting SIPs from protein sequences. First, the Average Blocks (AB) feature extraction method is employed to represent protein sequences on a Position Specific Scoring Matrix (PSSM). Second, Principal Component Analysis (PCA) is used to reduce the dimension of the AB vector and thereby the influence of noise. Then, by employing the Relevance Vector Machine (RVM) algorithm, the performance of RVM-AB is assessed and compared with the state-of-the-art support vector machine (SVM) classifier and other existing methods on yeast and human datasets. Using fivefold cross-validation, the RVM-AB model achieved very high accuracies of 93.01% and 97.72% on the yeast and human datasets, respectively, which are significantly better than those of the SVM-based method and other previous methods. The experimental results showed that the RVM-AB prediction model is efficient and robust. It can serve as an automatic decision support tool for detecting SIPs. To facilitate future proteomics research, the RVMAB server is freely available for academic use at http://219.219.62.123:8888/SIP_AB.
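
A minimal sketch of an Average Blocks style descriptor is given below. It assumes the commonly used formulation in which the PSSM rows are divided into 20 consecutive blocks and each block is averaged column by column; the abstract does not state the block count, so `n_blocks=20` and the function name `average_blocks` are assumptions.

```python
def average_blocks(pssm, n_blocks=20):
    """Average consecutive row blocks of a PSSM into a fixed-length vector.

    pssm is a list of rows (one per residue). The rows are split into
    n_blocks nearly equal consecutive blocks; each block contributes the
    column-wise mean of its rows. A 20-column PSSM with 20 blocks gives
    400 features.
    """
    n_rows, n_cols = len(pssm), len(pssm[0])
    if n_rows < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    features = []
    for b in range(n_blocks):
        start = b * n_rows // n_blocks
        end = (b + 1) * n_rows // n_blocks
        block = pssm[start:end]
        for j in range(n_cols):
            features.append(sum(row[j] for row in block) / len(block))
    return features


# Toy example: 4 residues, 2 columns, 2 blocks.
toy = [[1, 2], [3, 4], [5, 6], [7, 8]]
ab = average_blocks(toy, n_blocks=2)
```

A vector of this kind is what a dimensionality reduction step such as PCA would then compress before classification.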


Subject(s)
Algorithms; Position-Specific Scoring Matrices; Protein Interaction Mapping; Humans; Protein Binding; ROC Curve; Reproducibility of Results; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/metabolism; Support Vector Machine
16.
Molecules ; 22(7)2017 Jul 05.
Article in English | MEDLINE | ID: mdl-28678206

ABSTRACT

Knowledge of drug-target interaction (DTI) plays an important role in discovering new drug candidates. Unfortunately, experimental methods for identifying DTIs have unavoidable shortcomings, including their time-consuming and expensive nature. This motivates us to develop an effective computational method to predict DTI based on protein sequence. In this paper, we propose a novel computational approach based on protein sequence, namely PDTPS (Predicting Drug Targets with Protein Sequence), to predict DTI. The PDTPS method combines Bi-gram probabilities (BIGP), Position Specific Scoring Matrix (PSSM), and Principal Component Analysis (PCA) with Relevance Vector Machine (RVM). To evaluate the prediction capacity of PDTPS, experiments were carried out on enzyme, ion channel, GPCR, and nuclear receptor datasets using five-fold cross-validation. The proposed PDTPS method achieved average accuracies of 97.73%, 93.12%, 86.78%, and 87.78% on the enzyme, ion channel, GPCR, and nuclear receptor datasets, respectively. The experimental results showed that our method has good prediction performance. To further evaluate the prediction performance of the proposed PDTPS method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the enzyme and ion channel datasets, and with other existing methods on all four datasets. The promising comparison results further demonstrate the efficiency and robustness of the proposed PDTPS method. This makes it a useful tool that is suitable for predicting DTI, as well as for other bioinformatics tasks.


Subject(s)
Computational Biology/methods , Pharmaceutical Preparations/chemistry , Proteins/genetics , Amino Acid Sequence , Databases, Protein , Drug Interactions , Molecular Structure , Position-Specific Scoring Matrices , Principal Component Analysis , Proteins/metabolism , Support Vector Machine
17.
Biotechnol Biofuels ; 10: 134, 2017.
Article in English | MEDLINE | ID: mdl-28559925

ABSTRACT

BACKGROUND: Lindera glauca fruit, with its high oil quality and quantity, has emerged as a novel potential source of biodiesel in China, but the molecular regulatory mechanism of carbon flux and energy source for oil biosynthesis in developing fruits is still unknown. To better develop fruit oils of L. glauca as woody biodiesel, a combination of two different sequencing platforms (454 and Illumina) and qRT-PCR analysis was used to define a minimal reference transcriptome of developing L. glauca fruits, and to construct a carbon and energy metabolic model for the regulation of carbon partitioning and energy supply for FA biosynthesis and oil accumulation. RESULTS: We first analyzed the dynamic patterns of growth tendency, oil content, FA compositions, biodiesel properties, and the contents of ATP and pyridine nucleotide of L. glauca fruits from seven different developing stages. Comprehensive characterization of the transcriptome of the developing L. glauca fruit was performed using a combination of two different next-generation sequencing platforms, for which three representative fruit samples (50, 125, and 150 DAF) and one mixed sample from seven developing stages were selected for Illumina and 454 sequencing, respectively. The unigenes separately obtained from long and short reads (201, and 259, respectively, in total) were reconciled using TGICL software, resulting in a total of 60,031 unigenes (mean length = 1061.95 bp) to describe a transcriptome for developing L. glauca fruits. Notably, 198 genes were annotated for photosynthesis, sucrose cleavage, carbon allocation, metabolite transport, acetyl-CoA formation, oil synthesis, and energy metabolism, among which some specific transporters, transcription factors, and enzymes were identified as implicated in carbon partitioning and energy source for oil synthesis by an integrated analysis of transcriptomic sequencing and qRT-PCR. Importantly, the carbon and energy metabolic model was well established for oil biosynthesis of developing L. glauca fruits, which could help to reveal the molecular regulatory mechanism of the increased oil production in developing fruits. CONCLUSIONS: This study presents, for the first time, the integrated application of two different sequencing analyses (Illumina and 454) and qRT-PCR detection to define a minimal reference transcriptome for developing L. glauca fruits, and to elucidate the molecular regulatory mechanism of carbon flux control and energy provision for oil synthesis. Our results will provide a valuable resource for future fundamental and applied research on woody biodiesel plants.

18.
Mol Biosyst ; 12(12): 3702-3710, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27759121

ABSTRACT

Self-interacting proteins (SIPs) play an essential role in cellular functions and the evolution of protein interaction networks (PINs). Because of the limitations of experimental technologies for detecting self-interacting proteins, it is very important to develop a robust and accurate computational approach for SIP prediction. In this study, we propose a novel computational method for predicting SIPs from protein amino acid sequences. First, a novel feature representation scheme based on Local Binary Pattern (LBP) is developed, in which the evolutionary information, in the form of multiple sequence alignments, is taken into account. Then, by employing the Relevance Vector Machine (RVM) classifier, the performance of our proposed method is evaluated on yeast and human datasets using a five-fold cross-validation test. The experimental results show that the proposed method can achieve high accuracies of 94.82% and 97.28% on yeast and human datasets, respectively. To further assess the performance of our method, we compared it with the state-of-the-art Support Vector Machine (SVM) classifier, and with other existing methods, on the same datasets. The comparison results demonstrate that the proposed method is very promising and could provide a cost-effective alternative for predicting SIPs. In addition, to facilitate extensive studies for future proteomics research, a web server is freely available for academic use at .


Subject(s)
Amino Acids/chemistry , Computational Biology/methods , Proteins/chemistry , Algorithms , Amino Acid Sequence , Databases, Protein , Evolution, Molecular , Humans , Position-Specific Scoring Matrices , Protein Binding , Protein Interaction Mapping/methods , Proteins/metabolism , ROC Curve , Reproducibility of Results , Sensitivity and Specificity , Support Vector Machine , Web Browser
19.
Sci Rep ; 6: 35675, 2016 10 20.
Article in English | MEDLINE | ID: mdl-27762296

ABSTRACT

Recently, our transcriptomic analysis identified some functional genes responsible for oil biosynthesis in developing Siberian apricot seed kernel (SASK), yet miRNA-mediated regulation of SASK development and oil accumulation is poorly understood. Here, 3 representative periods of 10, 30 and 60 DAF were selected for sRNA sequencing based on the dynamic patterns of growth tendency and oil content of developing SASK. By miRNA transcriptomic analysis, we characterized 296 known and 44 novel miRNAs in developing SASK, among which 36 known and 6 novel miRNAs respond specifically to SASK development. Importantly, we performed an integrated analysis of the mRNA and miRNA transcriptomes, as well as qRT-PCR detection, to identify some key miRNAs and their targets (miR156-SPL, miR160-ARF18, miR164-NAC1, miR171h-SCL6, miR172-AP2, miR395-AUX22B, miR530-P2C37, miR393h-TIR1/AFB2 and psi-miRn5-SnRK2A) potentially involved in the developmental response and hormone signaling of SASK. Our results provide new insights into the important regulatory function of cross-talk between developmental response and hormone signaling for SASK oil accumulation.


Subject(s)
Gene Expression Profiling , MicroRNAs/analysis , Plant Growth Regulators/metabolism , Prunus armeniaca/growth & development , RNA, Messenger/analysis , Seeds/growth & development , Signal Transduction , Gene Expression Regulation, Developmental , Gene Expression Regulation, Plant , Oils, Volatile/metabolism , Plant Development , Real-Time Polymerase Chain Reaction , Seeds/genetics , Seeds/metabolism , Sequence Analysis, DNA , Time Factors
20.
Oncotarget ; 7(50): 82440-82449, 2016 Dec 13.
Article in English | MEDLINE | ID: mdl-27732957

ABSTRACT

Self-interacting proteins (SIPs) play an essential role in a wide range of biological processes, such as gene expression regulation, signal transduction, enzyme activation and immune response. Because of the limitations of experimental identification of self-interacting proteins, developing an effective computational method based on protein sequence to detect SIPs is very important. In this study, we propose a novel computational approach called RVMBIGP that combines the Relevance Vector Machine (RVM) model and Bi-gram probability (BIGP) to predict SIPs based on protein sequence. The proposed prediction model includes the following steps: (1) an effective feature extraction method named BIGP is used to represent protein sequences on a Position Specific Scoring Matrix (PSSM); (2) the Principal Component Analysis (PCA) method is employed to integrate the useful information and reduce the influence of noise; (3) the robust Relevance Vector Machine (RVM) classifier is used to carry out classification. When applied to yeast and human datasets, the proposed RVMBIGP model achieves very high accuracies of 95.48% and 98.80%, respectively. The experimental results show that our proposed method is very promising and may provide a cost-effective alternative for SIP identification. In addition, to facilitate extensive studies for future proteomics research, the RVMBIGP server is freely available for academic use at http://219.219.62.123:8888/RVMBIGP.
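
The first two steps listed in the abstract above (bi-gram features on a PSSM, then PCA) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used bi-gram definition B[m, n] = sum over positions i of P[i, m] * P[i+1, n], and it assumes the PSSM has already been computed and normalized. The function names bigram_features and pca_reduce are hypothetical. The RVM classification step is omitted.

```python
import numpy as np


def bigram_features(pssm):
    """Flatten a PSSM of shape (L, K) into a K*K bi-gram feature vector.

    Assumed definition: B[m, n] = sum_i pssm[i, m] * pssm[i + 1, n].
    For a standard PSSM, K = 20, which gives 400 features.
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[0] < 2:
        raise ValueError("pssm must be 2-D with at least two rows")
    # Rows 0..L-2 paired with rows 1..L-1, summed over sequence positions.
    return (pssm[:-1].T @ pssm[1:]).ravel()


def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their top principal components."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for ten proteins' PSSMs (30 residues, 20 columns).
    pssms = [rng.random((30, 20)) for _ in range(10)]
    X = np.stack([bigram_features(p) for p in pssms])
    print(X.shape, pca_reduce(X, 3).shape)
```

The reduced matrix would then be passed to whatever classifier is available. The abstract names RVM, but any classifier substituted here would be a different technique from the one reported.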


Subject(s)
Computational Biology/methods , Fungal Proteins/chemistry , Position-Specific Scoring Matrices , Protein Interaction Mapping/methods , Support Vector Machine , Databases, Protein , Fungal Proteins/classification , Humans , Principal Component Analysis , Sequence Analysis, Protein