1.
Protein Sci ; 33(6): e5015, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38747369

ABSTRACT

Prokaryotic DNA-binding proteins (DBPs) play pivotal roles in governing gene regulation, DNA replication, and various cellular functions. Accurate computational models for predicting prokaryotic DBPs hold immense promise in accelerating the discovery of novel proteins, fostering a deeper understanding of prokaryotic biology, and facilitating the development of targeted therapeutics for potential disease interventions. However, existing generic prediction models often exhibit lower accuracy in predicting prokaryotic DBPs. To address this gap, we introduce ProkDBP, a novel machine learning-driven computational model for the prediction of prokaryotic DBPs. For prediction, a total of nine shallow learning algorithms and five deep learning models were utilized, with the shallow learning models demonstrating higher performance metrics than their deep learning counterparts. The light gradient boosting machine (LGBM), coupled with evolutionarily significant features selected via the random forest variable importance measure (RF-VIM), yielded the highest five-fold cross-validation accuracy. The model achieved the highest auROC (0.9534) and auPRC (0.9575) among the 14 machine learning models evaluated. Additionally, ProkDBP performed well on an independent dataset, with an auROC of 0.9332 and an auPRC of 0.9371. Notably, when benchmarked against several cutting-edge existing models, ProkDBP showed superior predictive accuracy. Furthermore, to promote accessibility and usability, ProkDBP (https://iasri-sg.icar.gov.in/prokdbp/) is available as an online prediction tool, freely accessible to interested users. This tool is a significant contribution that adds to the resources available for accurate and efficient prediction of prokaryotic DBPs.


Subjects
Bacterial Proteins, DNA-Binding Proteins, Machine Learning, Algorithms, Bacterial Proteins/chemistry, Bacterial Proteins/metabolism, Bacterial Proteins/genetics, Computational Biology/methods, DNA-Binding Proteins/chemistry, DNA-Binding Proteins/metabolism
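The abstract above reports auROC and auPRC without saying how they were computed. As a minimal sketch, assuming binary labels (1 = DBP, 0 = non-DBP) and real-valued classifier scores, the two metrics can be computed as below. The function names `au_roc` and `au_prc` are illustrative and not part of ProkDBP, and average precision is used here as one common estimator of the area under the precision-recall curve.

```python
from typing import Sequence


def au_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def au_prc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean precision at the rank of each positive.

    Assumption: ties in the scores are broken by input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    if true_positives == 0:
        raise ValueError("at least one positive example is required")
    return precision_sum / true_positives
```

The pairwise form of auROC is quadratic in the number of examples, which is acceptable for a sketch but not for large benchmark sets.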
2.
Comput Struct Biotechnol J ; 23: 1631-1640, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38660008

ABSTRACT

RNA-binding proteins (RBPs) are central to key functions such as post-transcriptional regulation, mRNA stability, and adaptation to varied environmental conditions in prokaryotes. While the majority of research has concentrated on eukaryotic RBPs, recent developments underscore the crucial involvement of prokaryotic RBPs. Although computational methods have emerged in recent years to identify RBPs, they have fallen short in accurately identifying prokaryotic RBPs due to their generic nature. To bridge this gap, we introduce RBProkCNN, a novel machine learning-driven computational model meticulously designed for the accurate prediction of prokaryotic RBPs. The prediction process involves the utilization of eight shallow learning algorithms and four deep learning models, incorporating PSSM-based evolutionary features. By leveraging a convolutional neural network (CNN) and evolutionarily significant features selected through extreme gradient boosting variable importance measure, RBProkCNN achieved the highest accuracy in five-fold cross-validation, yielding 98.04% auROC and 98.19% auPRC. Furthermore, RBProkCNN demonstrated robust performance with an independent dataset, showcasing a commendable 95.77% auROC and 95.78% auPRC. Noteworthy is its superior predictive accuracy when compared to several state-of-the-art existing models. RBProkCNN is available as an online prediction tool (https://iasri-sg.icar.gov.in/rbprokcnn/), offering free access to interested users. This tool represents a substantial contribution, enriching the array of resources available for the accurate and efficient prediction of prokaryotic RBPs.
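The abstract above describes selecting features by a variable importance measure before training the final classifier. A minimal sketch of that selection step follows, assuming the importance scores have already been obtained from some fitted model. The helper names `select_top_features` and `project` and the cutoff `top_k` are hypothetical and are not taken from RBProkCNN.

```python
from typing import List, Sequence


def select_top_features(importances: Sequence[float], top_k: int) -> List[int]:
    """Return the column indices of the top_k most important features.

    Ties are broken in favour of the lower column index, and the result is
    sorted so that the original column order is preserved.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    ranked = sorted(range(len(importances)), key=lambda j: (-importances[j], j))
    return sorted(ranked[:top_k])


def project(rows: Sequence[Sequence[float]], columns: Sequence[int]) -> List[List[float]]:
    """Keep only the selected columns of a feature matrix (list of rows)."""
    return [[row[j] for j in columns] for row in rows]
```

To avoid optimistic estimates, the importance scores would normally be computed on the training portion of each cross-validation split only.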

3.
Biochim Biophys Acta Gen Subj ; 1868(6): 130597, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38490467

ABSTRACT

BACKGROUND: Abiotic stresses pose a serious threat to the growth and yield of crop plants. Several studies suggest that in plants, transcription factors (TFs) are important regulators of gene expression, especially when it comes to coping with abiotic stresses. Therefore, it is crucial to identify TFs associated with abiotic stress response for breeding abiotic stress-tolerant crop cultivars. METHODS: Based on a machine learning framework, a computational model was developed to predict TFs associated with abiotic stress response in plants. To numerically encode TF sequences, four distinct sequence-derived features were generated. The prediction was performed using ten shallow learning and four deep learning algorithms. Feature selection techniques were also employed so that prediction could use the more pertinent and informative features. RESULTS: Using the features chosen by the light gradient boosting machine variable importance measure (LGBM-VIM), the LGBM achieved the highest cross-validation performance metrics (accuracy: 86.81%, auROC: 92.98%, and auPRC: 94.03%). The proposed model (LGBM prediction method + LGBM-VIM selected features) was further evaluated using an independent test dataset, where the accuracy, auROC and auPRC were observed to be 81.98%, 90.65% and 91.30%, respectively. CONCLUSIONS: To facilitate the adoption of the proposed strategy by users, the approach was implemented as a prediction server called ASPTF, accessible at https://iasri-sg.icar.gov.in/asptf/. The developed approach and the corresponding web application are anticipated to supplement experimental methods in the identification of TFs responsive to abiotic stress in plants.


Subjects
Machine Learning, Physiological Stress, Transcription Factors, Transcription Factors/metabolism, Transcription Factors/genetics, Algorithms, Plant Gene Expression Regulation, Computational Biology/methods, Plant Proteins/genetics, Plant Proteins/metabolism, Plants/metabolism, Plants/genetics
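The cross-validation metrics quoted above depend on how the data are split into folds. As a hedged illustration, the sketch below builds stratified folds, so that each fold keeps roughly the class proportions of the full dataset. The abstract does not state whether stratification, five folds, or a fixed seed were used, so all three are assumptions here, and the function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Iterator, List, Sequence, Tuple


def stratified_k_fold(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split example indices into k folds, class by class.

    Indices of each class are shuffled and then dealt round-robin to the
    folds, so fold sizes per class differ by at most one.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validation_splits(
    labels: Sequence[int], k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), using each fold once as test set."""
    folds = stratified_k_fold(labels, k, seed)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test
```

A model would be fitted on each `train` list and scored on the matching `test` list, with the reported metric taken as the mean over the folds.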
4.
Funct Integr Genomics ; 23(2): 113, 2023 Mar 31.
Article in English | MEDLINE | ID: mdl-37000299

ABSTRACT

Abiotic stresses are detrimental to plant growth and development and have a major negative impact on crop yields. A growing body of evidence indicates that a large number of long non-coding RNAs (lncRNAs) are key to many abiotic stress responses. Thus, identifying abiotic stress-responsive lncRNAs is essential in crop breeding programs in order to develop crop cultivars resistant to abiotic stresses. In this study, we have developed the first machine learning-based computational model for predicting abiotic stress-responsive lncRNAs. The lncRNA sequences that were responsive and non-responsive to abiotic stresses served as the two classes of the dataset for binary classification using the machine learning algorithms. The training dataset was created using 263 stress-responsive and 263 non-stress-responsive sequences, whereas the independent test set consists of 101 sequences from both classes. As machine learning models accept only numeric data, Kmer features of sizes 1 to 6 were used to represent lncRNAs in numeric form. To select important features, four different feature selection strategies were utilized. Among the seven learning algorithms, the support vector machine (SVM) achieved the highest cross-validation accuracy with the selected feature sets. The observed 5-fold cross-validation accuracy, AU-ROC, and AU-PRC were 68.84, 72.78, and 75.86%, respectively. Furthermore, the robustness of the developed model (SVM with the selected features) was evaluated using an independent test dataset, where the overall accuracy, AU-ROC, and AU-PRC were 76.23, 87.71, and 88.49%, respectively. The developed computational approach was also implemented in an online prediction tool, ASLncR, accessible at https://iasri-sg.icar.gov.in/aslncr/. The proposed computational model and the developed prediction tool are expected to supplement existing efforts to identify abiotic stress-responsive lncRNAs in plants.


Subjects
Long Noncoding RNA, Long Noncoding RNA/genetics, Computational Biology, Plant Breeding, Algorithms, Plants/genetics, Physiological Stress/genetics
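The abstract above encodes each lncRNA with Kmer features of sizes 1 to 6. A minimal sketch of such an encoding is given below. It assumes normalized k-mer frequencies over the alphabet A, C, G, U with T mapped to U and windows containing other symbols skipped. The abstract does not say whether raw counts or frequencies were used, and the function names are illustrative rather than part of ASLncR.

```python
from itertools import product
from typing import List

ALPHABET = "ACGU"


def kmer_frequencies(sequence: str, k: int) -> List[float]:
    """Frequency of every possible k-mer, in lexicographic ACGU order.

    Windows containing characters outside the alphabet (for example N)
    are ignored; frequencies are relative to the number of valid windows.
    """
    seq = sequence.upper().replace("T", "U")
    kmers = ["".join(letters) for letters in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    valid_windows = 0
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        if window in counts:
            counts[window] += 1
            valid_windows += 1
    return [counts[kmer] / valid_windows if valid_windows else 0.0 for kmer in kmers]


def kmer_feature_vector(sequence: str, k_min: int = 1, k_max: int = 6) -> List[float]:
    """Concatenate k-mer frequency vectors for k = k_min .. k_max."""
    features: List[float] = []
    for k in range(k_min, k_max + 1):
        features.extend(kmer_frequencies(sequence, k))
    return features
```

With sizes 1 to 6 the vector has 4 + 16 + 64 + 256 + 1024 + 4096 = 5460 entries, far more than the 526 training sequences, which is consistent with the abstract's emphasis on feature selection.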
5.
Funct Integr Genomics ; 23(2): 92, 2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36939943

ABSTRACT

Abiotic stresses have become a major challenge in recent years due to their pervasive nature and severe impacts on plant growth, development, and quality. MicroRNAs (miRNAs) play a significant role in plant response to different abiotic stresses. Thus, identification of specific abiotic stress-responsive miRNAs is of great importance in crop breeding programmes that aim to develop cultivars resistant to abiotic stresses. In this study, we developed a machine learning-based computational model for prediction of miRNAs associated with four specific abiotic stresses: cold, drought, heat and salt. The pseudo K-tuple nucleotide compositional features of Kmer size 1 to 5 were used to represent miRNAs in numeric form. A feature selection strategy was employed to select important features. With the selected feature sets, the support vector machine (SVM) achieved the highest cross-validation accuracy in all four abiotic stress conditions. The highest cross-validated prediction accuracies in terms of area under the precision-recall curve were 90.15, 90.09, 87.71, and 89.25% for cold, drought, heat and salt, respectively. Overall prediction accuracies for the independent dataset were 84.57, 80.62, 80.38 and 82.78%, respectively, for the four abiotic stresses. The SVM also outperformed different deep learning models for prediction of abiotic stress-responsive miRNAs. To make the method easy to use, an online prediction server, "ASmiR", has been established at https://iasri-sg.icar.gov.in/asmir/. The proposed computational model and the developed prediction tool are expected to supplement existing efforts to identify specific abiotic stress-responsive miRNAs in plants.


Subjects
MicroRNAs, MicroRNAs/genetics, Plant Breeding, Plants/genetics, Machine Learning, Sodium Chloride, Physiological Stress/genetics, Plant Gene Expression Regulation
6.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36416116

ABSTRACT

DNA-binding proteins (DBPs) play crucial roles in numerous cellular processes including nucleotide recognition, transcriptional control and the regulation of gene expression. The majority of existing computational techniques for identifying DBPs are mainly applicable to human and mouse datasets. Even though some models have been tested on Arabidopsis, they produce poor accuracy when applied to other plant species. Therefore, it is imperative to develop an effective computational model for predicting plant DBPs. In this study, we developed a comprehensive computational model for plant-specific DBP identification. Five shallow learning and six deep learning models were initially used for prediction, where shallow learning methods outperformed deep learning algorithms. In particular, the support vector machine achieved the highest repeated 5-fold cross-validation performance, with 94.0% area under the receiver operating characteristic curve (AUC-ROC) and 93.5% area under the precision-recall curve (AUC-PR). With an independent dataset, the developed approach secured 93.8% AUC-ROC and 94.6% AUC-PR. When compared with state-of-the-art existing tools using an independent dataset, the proposed model achieved much higher accuracy. Overall, the results suggest that the developed computational model is more efficient and reliable than the existing models for the prediction of DBPs in plants. For the convenience of experimental scientists, the developed prediction server PlDBPred is publicly accessible at https://iasri-sg.icar.gov.in/pldbpred/. The source code is also provided at https://iasri-sg.icar.gov.in/pldbpred/source_code.php for prediction using a large-size dataset.


Subjects
Arabidopsis, DNA-Binding Proteins, Algorithms, Arabidopsis/genetics, Arabidopsis/metabolism, Computational Biology/methods, Computer Simulation, DNA-Binding Proteins/genetics, DNA-Binding Proteins/metabolism, ROC Curve, Software
7.
BMC Plant Biol ; 21(1): 604, 2021 Dec 22.
Article in English | MEDLINE | ID: mdl-34937558

ABSTRACT

BACKGROUND: Picrorhiza kurroa Royle ex Benth., a rich source of phytochemicals, is a promising high-altitude medicinal herb of the Himalaya. Its medicinal potential is attributed to picrosides, i.e. iridoid glycosides, which are synthesized in an organ-specific manner through highly complex pathways. Here, we present a large-scale proteome reference map of P. kurroa, consisting of four morphologically differentiated organs and two developmental stages. RESULTS: We were able to identify 5186 protein accessions (FDR < 1%), providing deep coverage of the protein abundance range, spanning around six orders of magnitude. Most of the identified proteins are associated with metabolic processes, response to abiotic stimuli and cellular processes. Organ-specific sub-proteomes highlight organ-specialized functions that would offer insights for exploring tissue profiles of specific protein classes. With reference to P. kurroa development, the vegetative phase is enriched with growth-related processes, whereas the generative phase invests more energy in secondary metabolic pathways. Furthermore, stress-responsive proteins, RNA-binding proteins (RBPs) and post-translational modifications (PTMs), particularly phosphorylation and ADP-ribosylation, play an important role in P. kurroa adaptation to the alpine environment. The proteins involved in the synthesis of secondary metabolites are well represented in the P. kurroa proteome. The phytochemical analysis revealed that marker compounds were highly accumulated in the rhizome and, overall, during the late stage of development. CONCLUSIONS: This report represents the first extensive proteomic description of P. kurroa dissected by organ and developmental stage, providing a platform for future studies related to stress tolerance and medical applications.


Subjects
Plant Organogenesis, Picrorhiza/chemistry, Plant Proteins/analysis, Datasets as Topic, Mass Spectrometry, Metabolic Networks and Pathways, Peptide Mapping, Proteome, Physiological Stress
8.
iScience ; 24(12): 103381, 2021 Dec 17.
Article in English | MEDLINE | ID: mdl-34841226

ABSTRACT

Identifying the factors that determine RBP-RNA interactions remains a big challenge. Binding involves sparse binding motifs and a suitable sequence context. The present work describes an approach to detect RBP binding sites in RNAs using an ultra-fast inexact k-mer search for statistically significant seeds. The seeds work as anchors to evaluate the context and binding potential using flanking region information, leveraging a deep feed-forward neural network. The developed models also received support from MD-simulation studies. The implemented software, RBPSpot, scored consistently high on all the performance metrics, including an average accuracy of ∼90% across a large number of validated datasets. It outperformed the compared tools, including some with much more complex deep-learning models, during a comprehensive benchmarking process. RBPSpot can identify RBP binding sites in the human system and can also be used to develop new models, making it a valuable resource in the area of regulatory system studies.
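The abstract above rests on an inexact k-mer search that locates seed matches and then examines the flanking sequence. The sketch below illustrates the idea with a naive Hamming-distance scan. It is not the ultra-fast search or the statistical seed selection used by RBPSpot, and the function names, the mismatch threshold, and the flank width are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def inexact_kmer_hits(sequence: str, seed: str, max_mismatches: int = 1) -> List[Tuple[int, int]]:
    """Return (start, mismatches) for every window within max_mismatches of the seed."""
    k = len(seed)
    hits = []
    for start in range(len(sequence) - k + 1):
        distance = hamming_distance(sequence[start:start + k], seed)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


def flanking_context(sequence: str, start: int, k: int, flank: int) -> str:
    """Return the hit plus up to `flank` bases on each side, clipped at the ends."""
    left = max(0, start - flank)
    right = min(len(sequence), start + k + flank)
    return sequence[left:right]
```

For example, scanning `AAUGCAUGGA` for the seed `AUGC` with one allowed mismatch returns an exact hit at position 1 and a one-mismatch hit at position 5. The context returned by `flanking_context` is the kind of input a downstream classifier could then score.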

9.
PLoS One ; 16(10): e0258550, 2021.
Article in English | MEDLINE | ID: mdl-34637468

ABSTRACT

The formation of mature miRNAs and their expression is a highly controlled process that depends strongly on post-transcriptional regulatory events. Recent findings suggest that several RNA-binding proteins beyond Drosha/Dicer are involved in the processing of miRNAs. Deciphering conditional networks for these RBP-miRNA interactions may help explain the spatio-temporal nature of miRNAs and can also be used to predict miRNA profiles. In this direction, >25TB of data from different platforms (CLIP-seq/RNA-seq/miRNA-seq) were studied to develop Bayesian causal networks capable of reasoning about miRNA biogenesis. The networks ably explained miRNA formation when tested across a large number of conditions and experimentally validated data. The networks were modeled into an XGBoost machine learning system, where expression information of the network components was found capable of quantitatively explaining miRNA formation levels and their profiles. The models were developed for 1,204 human miRNAs whose accurate expression levels could be detected directly from RNA-seq data alone, without any need for separate miRNA profiling experiments like miRNA-seq or arrays. A first of its kind, miRbiom performed consistently well with high average accuracy (91%) when tested across a large number of experimentally established data from several conditions. It has been implemented as an interactive open-access web server where, besides finding the profiles of miRNAs, their downstream functional analysis can also be done. miRbiom will help to obtain accurate predictions of human miRNA profiles in the absence of profiling experiments and will be an asset for regulatory research areas. The study also shows the importance of having RBP interaction information in better understanding the miRNAs and their functional projectiles, and it lays the foundation for such studies and software in the future.


Subjects
Machine Learning, MicroRNAs/metabolism, RNA-Binding Proteins/metabolism, User-Computer Interface, Bayes Theorem, Genetic Databases, Humans, Protein Binding