Results 1 - 20 of 54
1.
Methods ; 228: 65-79, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38768931

ABSTRACT

This study proposes an intelligent model for predicting abiotic stress-responsive microRNAs in plants. MicroRNAs (miRNAs) are short RNA molecules that regulate stress-related genes. Experimental methods are costly and time-consuming compared with in-silico prediction. To address this gap, the study seeks to develop an efficient computational model for plant stress response prediction. Two benchmark datasets, one for miRNA and one for pre-miRNA, were acquired. Four ensemble approaches were employed: bagging, boosting, stacking, and blending. The classifiers used were Random Forest (RF), Extra Trees (ET), AdaBoost (ADB), Light Gradient Boosting Machine (LGBM), and Support Vector Machine (SVM). Stacking and blending used all of these classifiers as base learners and Logistic Regression (LR) as the meta-classifier. Four types of testing were used: independent set, self-consistency, cross-validation with 5 and 10 folds, and jackknife. The evaluation metrics were accuracy, specificity, sensitivity, Matthews correlation coefficient (MCC), and AUC. The proposed methodology outperformed the existing state-of-the-art study on both datasets in independent set testing. The SVM-based approach achieved an accuracy of 0.659 on the miRNA dataset, which is better than the previous study. The ET classifier achieved an accuracy of 0.67 on the pre-miRNA dataset, surpassing the existing benchmark study. The proposed method can be used in future research to predict abiotic stress responses in plants.
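
Accuracy, sensitivity, specificity and MCC, the metrics named in the abstract above, are all derived from the four counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `binary_metrics` and the 0/1 label encoding are assumptions made for illustration.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and MCC for 0/1 labels.

    Hypothetical helper for illustration; labels are assumed to be 1 for
    the positive class and 0 for the negative class.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Sensitivity: fraction of true positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC is conventionally set to 0 when its denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it is informative even when the positive and negative classes are imbalanced.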


Subjects
MicroRNAs , Stress, Physiological , Support Vector Machine , MicroRNAs/genetics , Stress, Physiological/genetics , RNA, Plant/genetics , Computational Biology/methods , Plants/genetics , Algorithms , Gene Expression Regulation, Plant/genetics
2.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35048955

ABSTRACT

DNA replication is an important process for the cell division cycle, gene expression regulation and other biological evolution processes. It also has a crucial role in a living organism's physical growth and structure. DNA replication comprises three stages: initiation, elongation and termination. The origin of replication (ORI) is the site at which DNA replication is initiated. Various methodologies exist to identify ORIs in genomic sequences; however, these methods either require extensive computation or are poorly optimized for large datasets. Herein, a model called ORI-Deep is proposed to identify ORIs from multiple-cell-type genomic sequence benchmark data. The method uses a deep neural network to identify ORIs for four different eukaryotic species. For better representation of the data, a feature vector is constructed using statistical moments for training and testing and is then fed to a long short-term memory (LSTM) network. To demonstrate the effectiveness of the proposed model, we applied several validation techniques at different levels to obtain seven accuracy metrics; the accuracy scores for self-consistency, 10-fold cross-validation, jackknife and the independent set test are 0.977, 0.948, 0.976 and 0.977, respectively. Based on these results, it can be concluded that ORI-Deep can efficiently predict origin of replication sites in DNA sequences with high accuracy. A web server for ORI-Deep is available at https://share.streamlit.io/waqarhusain/orideep/main/app.py, and the source code is available at https://github.com/WaqarHusain/OriDeep.
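
The abstract above, like several others in this listing, builds its feature vector from statistical moments of the sequence. As a rough illustration of the idea, the sketch below computes raw and central moments of a numerically encoded DNA string. It is a simplified one-dimensional version and not the ORI-Deep feature extractor: the integer encoding in `NUCLEOTIDE_CODE`, the function names and the choice of moment orders are all assumptions.

```python
# Hypothetical integer encoding of nucleotides, chosen only for illustration.
NUCLEOTIDE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}


def raw_moment(values, order):
    """Raw moment of the given order: the mean of x**order."""
    return sum(v ** order for v in values) / len(values)


def central_moment(values, order):
    """Central moment of the given order: the mean of (x - mean)**order."""
    mean = raw_moment(values, 1)
    return sum((v - mean) ** order for v in values) / len(values)


def moment_features(sequence, max_order=3):
    """Return raw moments 1..max_order followed by central moments 2..max_order."""
    values = [NUCLEOTIDE_CODE[base] for base in sequence.upper()]
    features = [raw_moment(values, k) for k in range(1, max_order + 1)]
    # The first central moment is always zero, so it is skipped.
    features += [central_moment(values, k) for k in range(2, max_order + 1)]
    return features
```

Under these assumptions every sequence maps to a fixed-length vector regardless of its length, which is what allows sequences of different sizes to be fed to the same classifier.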


Subjects
Memory, Short-Term , Replication Origin , Eukaryota , Neural Networks, Computer , Software
3.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35262658

ABSTRACT

B-cell epitopes can recognize and attach to the surface of antigen receptors to stimulate the immune system against pathogens. Identifying B-cell epitopes in antigens is of great significance in several biomedical and biotechnological applications: it supports the development of therapeutics, the design and development of epitope-based vaccines, and antibody production. However, identifying epitopes with experimental mapping approaches is challenging and usually requires extensive laboratory effort. Considerable effort has recently been devoted to identifying epitopes with computational methods, but without considerable success. In this study, we present LBCEPred, a Python-based web tool (http://lbcepred.pythonanywhere.com/) built with a random forest classifier and statistical moment-based descriptors to predict B-cell epitopes from protein sequences. LBCEPred outperforms all sequence-based models currently in use for B-cell epitope prediction, with an accuracy of 0.868 and an area under the curve of 0.934. Moreover, the prediction performance of the proposed model is on average 56.3% higher than that of other state-of-the-art models in terms of Matthews correlation coefficient. LBCEPred is easy to use even for novice users and has shown stability and reliability; we therefore believe it makes a significant contribution to the research community and to bioinformatics.


Subjects
Computational Biology , Epitopes, B-Lymphocyte , Amino Acid Sequence , Computational Biology/methods , Machine Learning , Reproducibility of Results
4.
Anal Biochem ; 676: 115247, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37437648

ABSTRACT

Pseudouridine (ψ) is reported to occur frequently in all types of RNA. This uridine modification has been shown to be essential for processes such as RNA stability and stress response. It is also linked to a few human diseases, such as prostate cancer and anemia. A few laboratory techniques, such as Pseudo-seq and N3-CMC-enriched pseudouridine sequencing (CeU-Seq), are used to detect ψ sites. However, these methods are laborious and slow. The availability of sequencing data has enabled the development of computationally intelligent models for improving ψ site identification. The proposed work provides a prediction model for identifying ψ sites through popular ensemble methods such as stacking, bagging, and boosting. Features were obtained through a novel feature extraction mechanism that incorporates statistical moments, and these features were used to train the ensemble models. Cross-validation and an independent set test were used to evaluate the performance of the trained models. The proposed model outperformed preexisting predictors and achieved 87% accuracy, 0.90 specificity, 0.85 sensitivity, and a Matthews correlation coefficient of 0.75. A web server has been built and is publicly available to researchers at https://taseersuleman-y-test-pseu-pred-c2wmtj.streamlit.app/.


Subjects
Pseudouridine , RNA , Humans , Pseudouridine/metabolism , RNA Processing, Post-Transcriptional
5.
Int J Mol Sci ; 23(19)2022 Sep 29.
Article in English | MEDLINE | ID: mdl-36232840

ABSTRACT

Genes are composed of DNA, and each gene has a specific sequence. Recombination or replication within a gene can result in a permanent change in the nucleotide sequence of the DNA, called a mutation, and some mutations can lead to cancer. Breast adenocarcinoma starts in secretory cells and is the most common of all cancers that occur in women. According to a survey in the United States of America, more than 282,000 breast adenocarcinoma patients are registered each year, and most of them are women. Recognizing cancer in its early stages saves many lives. A framework is proposed for the early detection of breast adenocarcinoma using an ensemble learning technique with multiple deep learning algorithms, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and bi-directional LSTM. There are 99 driver genes involved in breast adenocarcinoma. This study uses a dataset of 4127 samples, including men and women, taken from more than 12 cohorts of cancer detection institutes. The dataset encompasses a total of 6170 mutations that occur in the 99 genes. Different algorithms are applied to these gene sequences for feature extraction. Three testing techniques, namely independent set testing, self-consistency testing, and 10-fold cross-validation, are applied to validate and test the learning approaches. Subsequently, multiple deep learning approaches such as LSTM, GRU, and bi-directional LSTM are applied. The evaluation metrics reported are accuracy, sensitivity, specificity, Matthews correlation coefficient, area under the curve, training loss, precision, recall, F1 score, and Cohen's kappa, with values of 99.57, 99.50, 99.63, 0.99, 1.0, 0.2027, 99.57, 99.57, 99.57, and 99.14, respectively.


Subjects
Adenocarcinoma , Breast Neoplasms , Deep Learning , Adenocarcinoma/diagnosis , Adenocarcinoma/genetics , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Carcinogens , Female , Humans , Male , Mutation , Nucleotides
6.
Anal Biochem ; 633: 114385, 2021 11 15.
Article in English | MEDLINE | ID: mdl-34571005

ABSTRACT

N4-methylcytosine (4mC) is an important epigenetic modification that is produced enzymatically by DNA methyltransferases. 4mC sites exist in prokaryotes and eukaryotes and play a vital role in regulating gene expression, DNA replication, and the cell cycle. Efficient and accurate prediction of 4mC sites contributes significantly to understanding the biological properties and functions of 4mC. Therefore, a sequence-based predictor, 4mC-RF, is proposed for identifying 4mC sites by integrating statistical moments with position- and composition-dependent features. Relative and absolute position-based features are computed to extract optimal features. Random Forest, a popular machine learning classifier, was used to train the model. Validation results were obtained through self-consistency, 10-fold cross-validation, independent set testing, and jackknife testing, yielding accuracies of 95.1%, 95.2%, 97.0%, and 94.7%, respectively. The proposed model shows the highest prediction accuracy compared with existing models. The 4mC-RF model was subsequently deployed as a web server. A more accurate predictor of 4mC sites helps experimental scientists obtain faster, more efficient, and more cost-effective results.


Subjects
Cytosine/analogs & derivatives , Machine Learning , Cytosine/chemistry , Cytosine/metabolism
7.
Anal Biochem ; 615: 114069, 2021 02 15.
Article in English | MEDLINE | ID: mdl-33340540

ABSTRACT

Deep representations can be used to replace human-engineered representations, as the latter are constrained by certain limitations. To predict protein post-translational modification (PTM) sites, the research community uses different feature extraction techniques applied to pseudo amino acid composition (PseAAC). Serine phosphorylation is one of the most important PTMs, as it is the most frequently occurring and is important for various biological functions. Creating efficient representations from large protein sequences to predict PTM sites is a time- and resource-intensive task. In this study, we propose, implement and evaluate the use of deep learning to learn effective protein data representations from PseAAC in order to develop data-driven PTM detection systems, and we compare them with two human-engineered representations. The comparisons are performed by training an XGBoost-based classifier on each representation. The best scores were achieved by the RNN-LSTM-based deep representation and the CNN-based representation, with accuracies of 81.1% and 78.3%, respectively. The human-engineered representations scored 77.3% and 74.9%, respectively. Based on these results, it is concluded that deep features are a promising replacement for feature engineering for identifying PhosS sites efficiently and accurately, which can help scientists understand the mechanism of this modification in proteins.


Subjects
Computational Biology/methods , Protein Processing, Post-Translational , Proteins/chemistry , Serine/metabolism , Amino Acid Sequence , Amino Acids/chemistry , Deep Learning , Humans , Models, Biological , Phosphorylation , Proteins/metabolism
8.
Anal Biochem ; 588: 113477, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31654612

ABSTRACT

Proteases are enzymes that carry out proteolysis. Proteolysis normally refers to protein and peptide degradation, which is crucial for the survival, growth and wellbeing of a cell. Moreover, proteases have a strong association with therapeutics and drug development. Proteases are classified into five types according to their nature and physicochemical characteristics. Most methods used to differentiate proteases from other proteins and to identify their class require a clinical test, which is usually time-consuming and operator-dependent. Herein, we report a classifier named iProtease-PseAAC (2L) for identifying proteases and their classes. The predictor is developed following the 5-step rule, starting with the collection of a benchmark dataset and ending with the development of the predictor. Rigorous verification and validation tests are performed, and metrics are collected to assess the reliability of the trained model. Self-consistency validation gives 98.32% accuracy, cross-validation gives 90.71% and jackknife gives 96.07%. The average accuracy for level 2, i.e. protease classification, is 95.77%. Based on these results, it is concluded that iProtease-PseAAC (2L) is well able to identify proteases and their classes from a given protein sequence.
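
The validation schemes named in the abstract above differ only in how samples are divided into training and test sets: k-fold cross-validation holds out one of k folds in turn, and jackknife testing is the special case that holds out one sample at a time. The sketch below generates the index splits for both. It is illustrative only; the function names are assumptions, and shuffling and stratification are omitted for brevity.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def jackknife_splits(n_samples):
    """Jackknife (leave-one-out) testing: every sample is held out exactly once."""
    return k_fold_splits(n_samples, n_samples)
```

Self-consistency testing needs no split at all, since the model is evaluated on the same samples it was trained on. That is why it typically gives the most optimistic of the reported accuracies.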


Subjects
Algorithms , Computational Biology/methods , Peptide Hydrolases/classification , Proteins/classification , Software , Databases, Protein
9.
Curr Genomics ; 21(7): 536-545, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33214770

ABSTRACT

INTRODUCTION: Hydroxylation is one of the most important post-translational modifications (PTMs) in cellular functions and is linked to various diseases. The addition of a hydroxyl group (OH) to a lysine residue produces hydroxylysine. METHODS: The method used in this study for identifying hydroxylysine sites is based on a powerful mathematical and statistical methodology that incorporates the sequence-order effect and the composition of each object within protein sequences. The predictor is called "iHyd-LysSite (EPSV)" (identifying hydroxylysine sites by extracting enhanced position and sequence variant technique). Identifying hydroxylysine sites by experimental methods is difficult, laborious and highly expensive; in silico techniques are an alternative approach to identifying hydroxylysine sites in proteins. RESULTS: The predictive model is required to have high sensitivity and specificity and to be more accurate. Self-consistency, independent, 10-fold cross-validation and jackknife tests are performed for validation. The tests were carried out with three well-known classifiers: Neural Networks (NN), Random Forest (RF) and Support Vector Machine (SVM). The overall predictive outcomes are far superior to the results obtained by previous predictors. The proposed model achieved an excellent prediction rate with the NN, RF, and SVM classifiers. The sensitivity and specificity results using these classifiers for the jackknife test are 96.08%, 94.99%, 98.16% and 97.52%, 98.52%, 80.95%, respectively. CONCLUSION: The results obtained by the proposed tool show that this method may meet the future demand for hydroxylysine site identification with a better prediction rate than existing methods.

10.
Anal Biochem ; 568: 14-23, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30593778

ABSTRACT

S-Palmitoylation is a uniquely reversible and biologically important post-translational modification, as it plays an essential role in a variety of cellular processes including signal transduction, protein-membrane interactions, neuronal development, lipid raft targeting, subcellular localization and apoptosis. Because of its association with neuronal development, it plays a pivotal role in a variety of neurodegenerative diseases, mainly Alzheimer's disease, schizophrenia and Huntington's disease. It is also essential for the developmental life cycles and pathogenesis of Toxoplasma gondii and Plasmodium falciparum, which cause toxoplasmosis and malaria, respectively. This shows the strong biological significance of S-palmitoylation; thus, the timely and accurate identification of S-palmitoylation sites is crucial. Herein, we propose a predictor for S-palmitoylation sites in proteins, named SPalmitoylC-PseAAC, that integrates Chou's Pseudo Amino Acid Composition (PseAAC) with relative/absolute position-based features. Self-consistency testing and 10-fold cross-validation are performed to evaluate the performance of SPalmitoylC-PseAAC using accuracy metrics. For self-consistency testing, 99.79% Acc, 99.77% Sp, 99.80% Sn and 1.00 MCC were observed, whereas for 10-fold cross-validation, 97.22% Acc, 98.85% Sp, 95.80% Sn and 0.94 MCC were observed. Thus, the proposed predictor can help in predicting palmitoylation sites efficiently and accurately. SPalmitoylC-PseAAC is available at biopred.org/palm.


Subjects
Membrane Proteins/metabolism , Models, Biological , Acyltransferases/chemistry , Acyltransferases/metabolism , Amino Acids/chemistry , Amino Acids/metabolism , Base Sequence , Databases, Protein , Humans , Membrane Proteins/chemistry , Palmitic Acid/chemistry , Palmitic Acid/metabolism , Protein Processing, Post-Translational
11.
J Theor Biol ; 473: 1-8, 2019 07 21.
Article in English | MEDLINE | ID: mdl-31005614

ABSTRACT

Antioxidant proteins are considered crucial in life sciences and pharmacology research. They prevent the damage to cells and DNA that is caused by free radicals. The role of antioxidants in the ageing process makes their accurate identification all the more significant. Disease prevention through antioxidant proteins has also been an area of study in the recent past. The existing process of identifying and testing every single antioxidant protein in order to obtain its properties is inefficient and expensive. For this reason, antioxidant proteins are regarded as attractive targets for many pharmaceutical agents. Computational approaches have emerged as a highly desirable resource for annotating and identifying antioxidant proteins. In this study, we have developed a prediction method built on computational intelligence and features based on statistical moments. The proposed system achieved better accuracy than state-of-the-art systems in distinguishing antioxidant from non-antioxidant proteins in 10-fold cross-validation tests. These outcomes suggest that the use of statistical moments with a multilayer neural network could yield more effective and efficient results.


Subjects
Amino Acids/metabolism , Antioxidants/metabolism , Proteins/metabolism , Statistics as Topic , Algorithms , Databases, Protein , Internet , Neural Networks, Computer , Reproducibility of Results
12.
J Theor Biol ; 468: 1-11, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30768975

ABSTRACT

Protein prenylation (or S-prenylation) is one of the most essential modifications; it is required for the membrane association of a plethora of signalling proteins involved in key biological processes such as protein trafficking, cell growth, proliferation and differentiation. Because of the ubiquitous nature of S-prenylation and its role in cellular functions, any defect in the biosynthesis or regulation of the isoprenoid leads to a variety of diseases, including neurodegenerative disorders, metabolic issues, cardiovascular diseases and one of the most fatal diseases, cancer. This shows the strong biological significance of S-prenylation; thus, the timely and accurate identification of S-prenylation sites is crucial and may provide ways to understand the mechanism of this modification in proteins. To avoid laborious, resource-demanding and expensive experimental techniques for identifying S-prenylation sites, we propose a novel predictor, named SPrenylC-PseAAC, that integrates Chou's Pseudo Amino Acid Composition (PseAAC) with relative/absolute position-based features. A two-tier classification was performed: at the first tier, prenylation sites are distinguished from non-prenylation sites, while at the second tier, S-farnesylation sites are distinguished from S-geranylgeranylation sites. With jackknife testing, prediction model validation gave 95.31% accuracy for tier 1 classification and 91.42% for tier 2 classification, while 10-fold cross-validation gave 93.68% accuracy for tier 1 classification and 89.70% for tier 2 classification. Thus, the proposed predictor can help in predicting prenylation sites efficiently and accurately. SPrenylC-PseAAC is available at biopred.org/prenyl.


Subjects
Algorithms , Amino Acids/chemistry , Models, Molecular , Protein Prenylation , Amino Acid Sequence , Internet , Neural Networks, Computer , Polyisoprenyl Phosphates/chemistry , ROC Curve , Reproducibility of Results , Sesquiterpenes/chemistry , User-Computer Interface
13.
J Theor Biol ; 463: 47-55, 2019 02 21.
Article in English | MEDLINE | ID: mdl-30550863

ABSTRACT

Disulfide bonds give the structure of a protein additional stability against various detrimental effects. The formation of correct disulfide bonds between cysteine residues ensures proper in vivo and in vitro folding of the protein. Many cysteine residues can be present in the polypeptide chain of a protein; however, not all of them are involved in the formation of a disulfide bond, and therefore accurate prediction of these bonds is crucial for identifying the biophysical characteristics of a protein. In the present study, a novel method is proposed for the accurate prediction of intramolecular disulfide bonds using statistical moments and PseAAC. pSSbond-PseAAC uses PseAAC along with position- and composition-relative features to calculate statistical moments. Statistical moments are important because they are very sensitive to the position of data within sequences; for the prediction of intramolecular disulfide bonds, the moments are combined to train neural networks. The overall accuracy of pSSbond-PseAAC is 98.97%, with a sensitivity of 98.92%, a specificity of 98.99% and an MCC of 0.98, and it outperforms various previously reported studies.


Subjects
Cysteine/metabolism , Disulfides/chemistry , Proteins/chemistry , Computational Biology/methods , Neural Networks, Computer , Supervised Machine Learning
14.
Curr Genomics ; 20(4): 306-320, 2019 May.
Article in English | MEDLINE | ID: mdl-32030089

ABSTRACT

BACKGROUND: Amino acid residues in proteins undergo post-translational modification (PTM) during protein synthesis, a process of chemical and physical change in an amino acid that in turn alters the behavioral properties of proteins. Tyrosine sulfation is a ubiquitous post-translational modification that is known to be associated with the regulation of various biological functions and pathological processes. Its identification is therefore necessary to understand its mechanism. Experimental determination through site-directed mutagenesis and high-throughput mass spectrometry is costly and time-consuming; thus, a reliable computational model is required for the identification of sulfotyrosine sites. METHODOLOGY: In this paper, we present a computational model for the prediction of sulfotyrosine sites, named iSulfoTyr-PseAAC, in which feature vectors are constructed using statistical moments of protein amino acid sequences and various position/composition-relative features. These features are incorporated into PseAAC. The model is validated by jackknife, cross-validation, self-consistency and independent testing. RESULTS: The accuracy determined through validation was 93.93% for the jackknife test, 95.16% for cross-validation, 94.3% for self-consistency and 94.3% for independent testing. CONCLUSION: The proposed model performs better than the existing predictors; however, the accuracy can be improved further in future, owing to the increasing number of known sulfotyrosine sites in proteins.

15.
Curr Genomics ; 20(4): 275-292, 2019 May.
Article in English | MEDLINE | ID: mdl-32030087

ABSTRACT

BACKGROUND: Methylation is one of the most important post-translational modifications in the human body; it usually occurs on lysine, which is among the most intensely modified residues. It plays a dynamic role in numerous biological processes, such as regulation of gene expression, regulation of protein function and RNA processing. Identifying lysine methylation sites is therefore an important challenge, as some experimental procedures are time-consuming. OBJECTIVE: Herein, we propose a computational predictor named iMethylK_pseAAC to identify lysine methylation sites. METHODS: First, we constructed feature vectors based on PseAAC using position- and composition-relative features and statistical moments. A neural network is trained on the extracted features. The performance of the proposed method is then validated using cross-validation and jackknife testing. RESULTS: The objective evaluation of the predictor showed an accuracy of 96.7% for self-consistency, 91.61% for 10-fold cross-validation and 93.42% for jackknife testing. CONCLUSION: It is concluded that iMethylK_pseAAC outperforms its counterparts for identifying lysine methylation sites, such as iMethyl_pseACC, BPB_pPMS and PMeS.

16.
Anal Biochem ; 550: 109-116, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29704476

ABSTRACT

Among all the post-translational modifications (PTMs) of proteins, phosphorylation is known to be the most important and most frequently occurring PTM in eukaryotes and prokaryotes. It is an important regulatory mechanism required in most pathological and physiological processes, including neural activity and cell signal transduction. Threonine phosphorylation modifies threonine by adding a phosphoryl group to its polar side chain, generating phosphothreonine sites. The investigation and prediction of phosphorylation sites is important, and various methods based on high-throughput mass spectrometry have been developed, but such experiments are time-consuming and laborious. Therefore, an efficient and accurate novel method is proposed in this study for the prediction of phosphothreonine sites. The proposed method uses context-based data to calculate statistical moments. Position-relative statistical moments are combined to train neural networks. Using 10-fold cross-validation, an accuracy of 94.97% was obtained, whereas jackknife testing gave an accuracy of 96%. The overall accuracy of the system is 94.4%, with a sensitivity of 94% and a specificity of 94.6%. These results suggest that the proposed method may play an essential role alongside the other existing methods for phosphothreonine site prediction.


Subjects
Phosphoproteins , Phosphothreonine/chemistry , Sequence Analysis, Protein/methods , Software , Phosphoproteins/chemistry , Phosphoproteins/genetics , Phosphorylation
17.
Mol Biol Rep ; 45(6): 2295-2306, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30238411

ABSTRACT

Membrane proteins (MPs) are considered crucial for many biological functions. For this reason, they are regarded as attractive targets for many pharmaceutical agents. It is therefore of indispensable importance that MPs are predicted accurately using effective and efficient computational models (CMs). Annotating MPs using in vitro analytical techniques is time-consuming and expensive, and in some cases it can prove intractable. Consequently, automated prediction and annotation of MPs through CM-based techniques have proved useful. An MP prediction framework is proposed based on computational intelligence and a feature set built from statistical moments. Furthermore, the previously used dataset has been enhanced by incorporating new MPs from the latest release of UniProtKB. Rigorous experimentation shows that the use of statistical moments with a multilayer neural network trained using back-propagation yields more thorough results.


Subjects
Membrane Proteins/chemistry , Membrane Proteins/metabolism , Sequence Analysis, Protein/methods , Algorithms , Amino Acids , Animals , Computational Biology/methods , Computer Simulation , Databases, Protein , Humans , Membrane Proteins/physiology
18.
Mol Biol Rep ; 45(6): 2501-2509, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30311130

ABSTRACT

Protein phosphorylation is one of the most fundamental types of post-translational modification, and it plays a vital role in various cellular processes of eukaryotes. Among the three types of phosphorylation, i.e. serine, threonine and tyrosine phosphorylation, tyrosine phosphorylation is one of the most frequent, and it is important for mediating signal transduction in eukaryotic cells. Site-directed mutagenesis and mass spectrometry help in the experimental determination of cellular signalling networks; however, these techniques are costly, time-consuming and labour-intensive. Thus, efficient and accurate prediction of these sites through computational approaches can help reduce cost and time. Here, we present a more accurate and efficient sequence-based computational method for predicting phosphotyrosine (PhosY) sites by incorporating statistical moments into PseAAC. The study follows Chou's 5-step rule, and various position- and composition-relative features are used to train a neural network for prediction. The results of the proposed prediction method are validated through jackknife testing, which gave an overall accuracy of 93.9%. These results suggest that the proposed prediction model can play a fundamental role in predicting PhosY sites accurately and efficiently.


Subjects
Computational Biology/methods , Forecasting/methods , Sequence Analysis, DNA/methods , Algorithms , Amino Acids , Biometry , Databases, Protein , Phosphorylation/genetics , Phosphotyrosine/genetics , Phosphotyrosine/metabolism , Protein Processing, Post-Translational
19.
J Membr Biol ; 250(1): 55-76, 2017 02.
Article in English | MEDLINE | ID: mdl-27866233

ABSTRACT

Membrane proteins are vital mediating molecules responsible for the interaction of a cell with its surroundings. These proteins are involved in functions such as ferrying molecules and nutrients across the membrane, recognizing foreign bodies, and receiving outside signals and relaying them into the cell. Membrane proteins play a significant role in drug interaction, as nearly 50% of drug targets are membrane proteins. Because of the momentous role of membrane proteins in cell activity, computational models able to predict membrane proteins accurately are of indispensable importance. The conventional experimental methods used for annotating membrane proteins are time-consuming and costly, and in some cases impossible. Computationally intelligent techniques have emerged as a useful resource for automating the prediction and hence the annotation process. In this study, various techniques based on different computational intelligence models used for prediction are reviewed. These techniques were formulated by different researchers and were further evaluated to provide a comparative analysis. The analysis shows that support vector machine-based prediction techniques yield better results.


Subjects
Amino Acids/chemistry , Computational Biology/methods , Membrane Proteins/chemistry , Membrane Proteins/classification , Algorithms , Hydrophobic and Hydrophilic Interactions , Neural Networks, Computer , Position-Specific Scoring Matrices , Reproducibility of Results , Sensitivity and Specificity , Support Vector Machine
20.
ScientificWorldJournal ; 2014: 723595, 2014.
Article in English | MEDLINE | ID: mdl-24977221

ABSTRACT

This paper presents a biometric technique for identification of a person using the iris image. The iris is first segmented from the acquired image of an eye using an edge detection algorithm. The disk shaped area of the iris is transformed into a rectangular form. Described moments are extracted from the grayscale image which yields a feature vector containing scale, rotation, and translation invariant moments. Images are clustered using the k-means algorithm and centroids for each cluster are computed. An arbitrary image is assumed to belong to the cluster whose centroid is the nearest to the feature vector in terms of Euclidean distance computed. The described model exhibits an accuracy of 98.5%.
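
The final classification step described in the abstract above assigns an image to the cluster whose centroid is nearest to its feature vector in Euclidean distance. That step can be sketched as follows. The function names are assumptions, the feature vectors are taken to be plain numeric tuples, and the k-means iterations that produce the clusters are not shown.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def nearest_centroid(feature_vector, centroids):
    """Return the index of the centroid closest to feature_vector."""
    # math.dist computes the Euclidean distance between two points (Python 3.8+).
    distances = [math.dist(feature_vector, c) for c in centroids]
    return distances.index(min(distances))
```

If two centroids are equally close, `list.index` returns the first one, so ties go to the lower cluster index in this sketch.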


Subjects
Algorithms , Artificial Intelligence , Biometry/methods , Image Interpretation, Computer-Assisted/methods , Iris/anatomy & histology , Pattern Recognition, Automated/methods , Humans , Image Enhancement/methods