Results 1 - 7 of 7
1.
Database (Oxford) ; 2020, 2020 Jan 01.
Article in English | MEDLINE | ID: mdl-32147717

ABSTRACT

Liver cancer is the fourth major lethal malignancy worldwide. To understand the development and progression of liver cancer, biomedical research has generated a tremendous amount of transcriptomics and disease-specific biomarker data. However, this information is dispersed, which poses practical hurdles to delineating the significant markers for the disease. Hence, a dedicated resource for liver cancer is required that integrates scattered datasets in multiple formats with information on disease-specific biomarkers. Liver Cancer Expression Resource (CancerLivER) is a database that maintains gene expression datasets of liver cancer along with the putative biomarkers defined for it in the literature. It manages 115 datasets that include gene-expression profiles of 9611 samples. Each incorporated dataset was manually curated to remove artefacts; subsequently, a standard, uniform pipeline specific to each technique was employed for processing. Additionally, it contains comprehensive information on 594 liver cancer biomarkers, which include mainly 315 gene biomarkers or signatures, 178 protein-based biomarkers and 46 miRNA-based biomarkers. To explore the full potential of data on liver cancer, a web-based interactive platform was developed for searching, browsing and analysis. Analysis tools were also integrated to explore and visualize the expression patterns of desired genes among different types of samples based on individual genes, Gene Ontology (GO) terms and pathways. Furthermore, a dataset matrix download facility is provided so that users can carry out their own extensive analyses to elucidate more robust disease-specific signatures. Overall, CancerLivER is a comprehensive resource that is highly useful for the scientific community working in the field of liver cancer. Availability: CancerLivER can be accessed on the web at https://webs.iiitd.edu.in/raghava/cancerliver.

Subjects
Biomarkers, Tumor/genetics , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Liver Neoplasms/genetics , Data Curation/methods , Data Mining/methods , Gene Ontology , Humans , Internet
2.
Chem Cent J ; 7(1): 49, 2013 Mar 08.
Article in English | MEDLINE | ID: mdl-23497593

ABSTRACT

BACKGROUND: Mycobacterium tuberculosis (M.tb) is the causative agent of tuberculosis, which kills ~1.7 million people annually. The remarkable capacity of this pathogen to evade the host immune system for decades and then cause active tuberculosis disease makes M.tb a successful pathogen. Currently available anti-mycobacterial therapy has poor compliance because it requires prolonged treatment, which accelerates the emergence of drug-resistant strains. Hence, there is an urgent need to identify new chemical entities with novel mechanisms of action and potent activity against drug-resistant strains. RESULTS: This study describes novel computational models developed for predicting inhibitors against both the replicative and non-replicative phases of drug-tolerant M.tb under carbon starvation. These models were trained on a highly diverse dataset of 2135 compounds using four classes of binary fingerprints, namely PubChem, MACCS, EState and SubStructure. The best performance, a Matthews correlation coefficient (MCC) of 0.45, was achieved by the model based on MACCS fingerprints for the replicative phase inhibitor dataset. For the non-replicative phase, a hybrid model based on PubChem, MACCS, EState and SubStructure fingerprints performed better, with a maximum MCC of 0.28. We have shown that the molecular weight, polar surface area and rotatable bond count of inhibitors (replicating and non-replicating phase) are significantly different from those of non-inhibitors. The fragment analysis suggests that substructures such as hetero_N_nonbasic, heterocyclic, carboxylic_ester and hetero_N_basic_no_H are predominant in replicating phase inhibitors, while hetero_O, ketone and secondary_mixed_amine are preferred in non-replicative phase inhibitors. Nitro, alkyne and enamine were observed to be important for molecules inhibiting bacilli residing in both phases. We also introduced a new MCC-based algorithm for feature selection, called MCCA, and found it to be better than or comparable to the frequency-based approach. CONCLUSION: We have developed computational models to predict phase-specific inhibitors against drug-resistant strains of M.tb grown under carbon starvation. Based on simple molecular properties, we have derived some rules that would be useful in the robust identification of tuberculosis inhibitors. Based on these observations, we have developed a webserver for predicting inhibitors against drug-tolerant M.tb H37Rv, available at http://crdd.osdd.net/oscadd/mdri/.
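
The MCC values quoted in this abstract can be reproduced from confusion-matrix counts. The sketch below shows the standard MCC formula and, as an illustration only, one way to rank binary fingerprint bits by their MCC against the inhibitor label. The abstract does not describe how MCCA works internally, so `rank_bits_by_mcc` is a hypothetical stand-in and not the authors' algorithm; all function names here are assumptions.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def bit_mcc(bits, labels):
    """MCC of one binary fingerprint bit treated as a predictor of the label."""
    tp = sum(1 for b, y in zip(bits, labels) if b == 1 and y == 1)
    tn = sum(1 for b, y in zip(bits, labels) if b == 0 and y == 0)
    fp = sum(1 for b, y in zip(bits, labels) if b == 1 and y == 0)
    fn = sum(1 for b, y in zip(bits, labels) if b == 0 and y == 1)
    return mcc(tp, tn, fp, fn)


def rank_bits_by_mcc(fingerprints, labels):
    """Hypothetical MCC-based ranking (NOT the published MCCA procedure).

    fingerprints: list of equal-length 0/1 lists, one per compound.
    labels: list of 0/1 activity labels.
    Returns (bit_index, mcc) pairs sorted by decreasing |MCC|.
    """
    n_bits = len(fingerprints[0])
    scores = []
    for j in range(n_bits):
        column = [fp[j] for fp in fingerprints]
        scores.append((j, bit_mcc(column, labels)))
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)


if __name__ == "__main__":
    # Toy data: bit 0 tracks the label exactly, bit 1 is uninformative.
    toy_fps = [[1, 0], [1, 1], [0, 0], [0, 1]]
    toy_labels = [1, 1, 0, 0]
    print(rank_bits_by_mcc(toy_fps, toy_labels))
```

MCC ranges from -1 to 1, with 0 meaning no better than chance, which is why values such as 0.45 and 0.28 indicate moderate rather than strong discrimination.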

3.
BMC Res Notes ; 4: 237, 2011 Jul 20.
Article in English | MEDLINE | ID: mdl-21774797

ABSTRACT

BACKGROUND: Predicting the function of a protein is one of the major challenges in the post-genomic era, in which a large number of protein sequences of unknown function are accumulating rapidly. Lectins are proteins that specifically recognize and bind to carbohydrate moieties present on either proteins or lipids. Cancerlectins are those lectins that play various important roles in tumor cell differentiation and metastasis. Although the two types of proteins are related, no computational method is yet available that can distinguish cancerlectins from the large pool of non-cancerlectins. Hence, it is imperative to develop a method that can distinguish between cancerlectins and non-cancerlectins. RESULTS: All the models developed in this study are based on a non-redundant dataset containing 178 cancerlectins and 226 non-cancerlectins in which no two sequences have more than 50% sequence similarity. We applied a similarity-search-based technique, i.e. BLAST, and achieved a maximum accuracy of 43.25%. Amino acid composition analysis showed that certain residues (e.g. Leucine, Proline) were preferred in cancerlectins whereas others (e.g. Aspartic acid, Asparagine) were preferred in non-cancerlectins. The PROSITE domain "Crystalline beta gamma" was found to be abundant in cancerlectins, whereas domains like "SUEL-type lectin domain" were found mainly in non-cancerlectins. An SVM-based model using amino acid composition was developed to differentiate between cancerlectins and non-cancerlectins; it achieved a maximum Matthews correlation coefficient (MCC) of 0.32 with an accuracy of 64.84%. A model based on dipeptide composition achieved an MCC of 0.30 with an accuracy of 64.84%. Models based on split compositions (2 and 4 parts) achieved MCC values of 0.31 and 0.32 with accuracies of 65.10% and 66.09%, respectively. An SVM model based on the Position Specific Scoring Matrix (PSSM) generated by PSI-BLAST achieved an MCC of 0.36 with an accuracy of 68.34%. Finally, we integrated the PROSITE domain information with the PSSM and developed an SVM model that achieved an MCC of 0.38 with 69.09% accuracy. CONCLUSION: BLAST was found to be ineffective at distinguishing between cancerlectins and non-cancerlectins. We analyzed the protein sequences of cancerlectins and non-cancerlectins and identified interesting patterns. We were able to identify PROSITE domains that are preferred in cancerlectins and in non-cancerlectins, and thus provide interesting insights into the two types of proteins. The method developed in this study will be useful for researchers studying cancerlectins, lectins and cancer biology. The web server based on this study is available at http://www.imtech.res.in/raghava/cancer_pred/
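
The sequence features named in this abstract (amino acid composition, dipeptide composition and split composition) are simple to compute. The following is a minimal sketch under stated assumptions: percentages are taken over the full sequence length, the 20 standard residues are listed in alphabetical one-letter order, and "split composition" means cutting the sequence into equal consecutive parts and concatenating the per-part compositions. The abstract does not give these details, so the function names and conventions are illustrative rather than the authors' exact implementation.

```python
from itertools import product

# The 20 standard residues in alphabetical one-letter order (an assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Percentage of each standard residue: a 20-value vector.

    Non-standard letters count toward the length but get no slot.
    """
    seq = seq.upper()
    n = len(seq)
    return [100.0 * seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Percentage of each ordered residue pair: a 400-value vector."""
    seq = seq.upper()
    index = {a + b: k for k, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}
    counts = [0.0] * 400
    total = max(len(seq) - 1, 0)  # number of overlapping pairs
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in index:
            counts[index[pair]] += 1
    return [100.0 * c / total if total else 0.0 for c in counts]


def split_composition(seq, parts=2):
    """Concatenate the amino acid composition of `parts` consecutive segments."""
    n = len(seq)
    bounds = [i * n // parts for i in range(parts + 1)]
    vector = []
    for start, end in zip(bounds, bounds[1:]):
        vector.extend(aa_composition(seq[start:end]))
    return vector


if __name__ == "__main__":
    example = "MKTLLVAGAL"  # arbitrary toy sequence, not from the study
    print(len(aa_composition(example)))        # 20 features
    print(len(dipeptide_composition(example)))  # 400 features
    print(len(split_composition(example, 4)))   # 80 features
```

Each vector would then serve as the fixed-length input to a classifier such as an SVM, regardless of the protein's length.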

4.
Immunome Res ; 6: 6, 2010 Oct 20.
Article in English | MEDLINE | ID: mdl-20961417

ABSTRACT

BACKGROUND: One of the major challenges in the field of vaccine design is to predict conformational B-cell epitopes in an antigen. In the past, several methods have been developed for predicting conformational B-cell epitopes in an antigen from its tertiary structure. This is the first attempt in this area to predict conformational B-cell epitopes in an antigen from its amino acid sequence. RESULTS: All support vector machine (SVM) models were trained and tested on 187 non-redundant protein chains comprising 2261 antibody-interacting residues of B-cell epitopes. Models developed using the binary profile of patterns (BPP) and the physicochemical profile of patterns (PPP) achieved maximum MCCs of 0.22 and 0.17, respectively. In this study, an SVM model was developed for the first time using the composition profile of patterns (CPP), and it achieved a maximum MCC of 0.73 with an accuracy of 86.59%. We compared our CPP-based model with existing structure-based methods and observed that our sequence-based model is as good as the structure-based methods. CONCLUSION: This study demonstrates that prediction of conformational B-cell epitopes in an antigen is possible from its primary sequence. This study will be very useful in predicting conformational B-cell epitopes in antigens whose tertiary structures are not available. A web server, CBTOPE, has been developed for predicting B-cell epitopes: http://www.imtech.res.in/raghava/cbtope/.
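
The abstract names the binary and composition profiles of patterns without defining them. One plausible reading, sketched below, treats a "pattern" as a fixed-size window centered on each residue, with the binary profile as a one-hot encoding of the window and the composition profile as residue fractions within it. The window size, the `X` padding character and all function names are assumptions for illustration, not details confirmed by the source.

```python
# The 20 standard residues in alphabetical one-letter order (an assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def windows(seq, size=5, pad="X"):
    """One window ("pattern") per residue, centered on it.

    Sequence ends are padded with `pad` so every residue gets a full window.
    `size` must be odd so that the window has a center.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = pad * half + seq.upper() + pad * half
    return [padded[i:i + size] for i in range(len(seq))]


def binary_profile(pattern):
    """One-hot encoding: 20 bits per position; padding maps to all zeros."""
    vector = []
    for residue in pattern:
        vector.extend(1 if residue == a else 0 for a in AMINO_ACIDS)
    return vector


def composition_profile(pattern):
    """Fraction of each standard residue inside the window (20 values).

    Padding characters count toward the window length but get no slot.
    """
    return [pattern.count(a) / len(pattern) for a in AMINO_ACIDS]


if __name__ == "__main__":
    for pattern in windows("ACDKLM", size=5):  # arbitrary toy sequence
        print(pattern, composition_profile(pattern)[:3])
```

Under this reading, each residue yields one feature vector and one label (antibody-interacting or not), which is what allows a per-residue classifier to be trained from sequence alone.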

5.
BMC Pharmacol ; 10: 8, 2010 Jul 16.
Article in English | MEDLINE | ID: mdl-20637097

ABSTRACT

BACKGROUND: Different isoforms of Cytochrome P450 (CYP) metabolize different types of substrates (or drug molecules) and make them soluble during biotransformation. Therefore, the fate of any drug molecule depends on how it is treated or metabolized by CYP isoforms. There is a need to develop models for predicting the substrate specificity of the major isoforms of P450, in order to understand whether a given drug will be metabolized or not. This paper describes an in-silico method for predicting the metabolizing capability of major isoforms (e.g. CYP 3A4, 2D6, 1A2, 2C9 and 2C19). RESULTS: All models were trained and tested on 226 approved drug molecules. Firstly, 2392 molecular descriptors for each drug molecule were calculated using various software packages. Secondly, the best 41 descriptors were selected using general and genetic algorithms. Thirdly, Support Vector Machine (SVM) based QSAR models were developed using the 41 best descriptors and achieved an average accuracy of 86.02%, evaluated using fivefold cross-validation. We also evaluated the performance of our model on an independent dataset of 146 drug molecules and achieved an average accuracy of 70.55%. In addition, SVM-based models were developed using 26 Chemistry Development Kit (CDK) molecular descriptors and achieved an average accuracy of 86.60%. CONCLUSIONS: This study demonstrates that SVM-based QSAR models can predict the substrate specificity of major CYP isoforms with high accuracy. These models can be used to predict the isoform responsible for metabolizing a drug molecule, and thus to understand whether a molecule will be metabolized or not. It is possible to develop highly accurate models for predicting the substrate specificity of major isoforms using CDK descriptors. A web server, MetaPred, has been developed for predicting the metabolizing isoform of a drug molecule: http://crdd.osdd.net/raghava/metapred/.
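
The cross-validated accuracy reported in this abstract comes from fivefold cross-validation, but the abstract does not say how the folds were drawn. The sketch below assumes a seeded random shuffle into five near-equal folds and averages the per-fold accuracy. The study used SVMs; to keep the example free of third-party dependencies, `fit_majority` and `predict_majority` are hypothetical placeholder functions that simply predict the most common training label.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Mean of per-fold accuracies for any fit/predict pair."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


def fit_majority(train_features, train_labels):
    """Placeholder 'model': the most frequent training label."""
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, feature_vector):
    """Placeholder prediction: always the stored majority label."""
    return model


if __name__ == "__main__":
    # Toy descriptors and labels, not data from the study.
    toy_x = [[float(i)] for i in range(20)]
    toy_y = [1] * 14 + [0] * 6
    print(cross_validated_accuracy(toy_x, toy_y, fit_majority, predict_majority))
```

Because every molecule is tested exactly once by a model that never saw it in training, the cross-validated figure is less optimistic than training accuracy; the lower accuracy on the independent set of 146 molecules suggests it can still overestimate performance on new data.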


Subjects
Computational Biology/methods , Cytochrome P-450 Enzyme System/metabolism , Expert Systems , Metabolic Detoxication, Phase I , Pharmaceutical Preparations/metabolism , Algorithms , Artificial Intelligence , Databases, Factual , Internet , Isoenzymes/metabolism , Models, Biological , Quantitative Structure-Activity Relationship , Reproducibility of Results , Software , Substrate Specificity
6.
BMC Bioinformatics ; 9: 201, 2008 Apr 16.
Article in English | MEDLINE | ID: mdl-18416838

ABSTRACT

BACKGROUND: The malaria parasite secretes various proteins into infected RBCs for its growth and survival. Thus, identification of these secretory proteins is important for developing vaccines or drugs against malaria. The existing motif-based methods have had limited success because no universal motif is present in all secretory proteins of the malaria parasite. RESULTS: In this study, a systematic attempt has been made to develop a general method for predicting secretory proteins of the malaria parasite. All models were trained and tested on a non-redundant dataset of 252 secretory and 252 non-secretory proteins. We developed SVM models and achieved a maximum MCC of 0.72 with 85.65% accuracy and an MCC of 0.74 with 86.45% accuracy using amino acid and dipeptide composition, respectively. SVM models developed using split-amino acid and split-dipeptide composition achieved a maximum MCC of 0.74 with 86.40% accuracy and an MCC of 0.77 with 88.22% accuracy, respectively. In this study, PSSM profiles obtained from PSI-BLAST were used for the first time for predicting secretory proteins. We achieved a maximum MCC of 0.86 with 92.66% accuracy using a PSSM-based SVM model. All models developed in this study were evaluated using 5-fold cross-validation. CONCLUSION: This study demonstrates that secretory proteins have a different residue composition from non-secretory proteins. Thus, it is possible to predict secretory proteins from their residue composition using machine learning techniques. A multiple sequence alignment provides more information than the sequence itself, so the method based on PSSM profiles is more accurate than the methods based on sequence composition. A web server, PSEApred, has been developed for predicting secretory proteins of malaria parasites; the URL can be found in the Availability and requirements section.
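
A PSSM has one row of 20 scores per residue, so its size varies with protein length, while an SVM needs a fixed-length input. The abstract does not say how the profiles were converted, so the sketch below shows one commonly used option as an assumption: squash each score with a logistic function, then accumulate rows by the residue type at that position to obtain a 20 x 20 = 400-value vector. The column order of the input matrix is assumed to match `AMINO_ACIDS`; PSI-BLAST's own text output uses a different column order, so a real pipeline would need to reorder columns first.

```python
import math

# Assumed column order of the input PSSM (alphabetical one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_to_400(seq, pssm):
    """Collapse an L x 20 PSSM into a fixed 400-value vector.

    seq:  protein sequence of length L.
    pssm: L rows of 20 numeric scores, columns ordered as AMINO_ACIDS.
    Each score is mapped to (0, 1) by a logistic function, rows are summed
    per residue type of `seq`, and the sums are divided by L.
    """
    if len(seq) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    sums = {a: [0.0] * 20 for a in AMINO_ACIDS}
    for residue, row in zip(seq.upper(), pssm):
        if residue not in sums:
            continue  # skip non-standard residues
        for j, score in enumerate(row):
            sums[residue][j] += 1.0 / (1.0 + math.exp(-score))
    length = len(seq)
    if length == 0:
        return [0.0] * 400
    return [value / length for a in AMINO_ACIDS for value in sums[a]]


if __name__ == "__main__":
    # Toy two-residue profile with all-zero scores, not real PSI-BLAST output.
    toy_vector = pssm_to_400("AC", [[0] * 20, [0] * 20])
    print(len(toy_vector), toy_vector[0])
```

The resulting vector captures which substitutions are tolerated at positions occupied by each residue type, which is the evolutionary information that plain composition features lack.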


Subjects
Artificial Intelligence , Blood Proteins/chemistry , Blood Proteins/metabolism , Erythrocytes/parasitology , Gene Expression Profiling/methods , Malaria, Falciparum/metabolism , Plasmodium falciparum/metabolism , Proteome/metabolism , Sequence Analysis, Protein/methods , Amino Acid Sequence , Animals , Erythrocytes/metabolism , Humans , Molecular Sequence Data
7.
BMC Bioinformatics ; 8: 337, 2007 Sep 13.
Article in English | MEDLINE | ID: mdl-17854501

ABSTRACT

BACKGROUND: In the past, a number of methods have been developed for predicting the subcellular location of eukaryotic, prokaryotic (Gram-negative and Gram-positive bacteria) and human proteins, but no method has been developed for mycobacterial proteins, which may represent a repertoire of potent immunogens of this dreaded pathogen. In this study, an attempt has been made to develop a method for predicting the subcellular location of mycobacterial proteins. RESULTS: The models were trained and tested on 852 mycobacterial proteins and evaluated using five-fold cross-validation. First, an SVM (Support Vector Machine) model was developed using amino acid composition, and an overall accuracy of 82.51% was achieved with an average accuracy (mean of class-wise accuracy) of 68.47%. In order to utilize evolutionary information, an SVM model was developed using PSSM (Position-Specific Scoring Matrix) profiles obtained from PSI-BLAST (Position-Specific Iterated BLAST), and the overall accuracy achieved was 86.62% with an average accuracy of 73.71%. In addition, HMM (Hidden Markov Model), MEME/MAST (Multiple EM for Motif Elicitation/Motif Alignment and Search Tool) and hybrid models that combined two or more models were also developed. We achieved a maximum overall accuracy of 86.8% with an average accuracy of 89.00% using a combination of the PSSM-based SVM model and MEME/MAST. The performance of our method was compared with that of existing methods developed for predicting subcellular locations of Gram-positive bacterial proteins. CONCLUSION: A highly accurate method has been developed for predicting the subcellular location of mycobacterial proteins. This method also predicts a very important class of proteins, namely membrane-attached proteins. This method will be useful in annotating newly sequenced or hypothetical mycobacterial proteins. Based on the above study, a freely accessible web server, TBpred (http://www.imtech.res.in/raghava/tbpred/), has been developed.
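
This abstract reports two different figures per model: overall accuracy and average accuracy, the latter defined as the mean of class-wise accuracy. The two diverge when classes are imbalanced, as the minimal sketch below illustrates. The function name and the toy class labels are illustrative assumptions and do not come from the study's data.

```python
from collections import defaultdict


def overall_and_average_accuracy(true_labels, predicted_labels):
    """Return (overall accuracy, mean class-wise accuracy, per-class accuracy).

    Overall accuracy weights every protein equally; the class-wise mean
    weights every class equally, however many proteins it contains.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    overall = sum(correct.values()) / len(true_labels)
    per_class = {label: correct[label] / totals[label] for label in totals}
    average = sum(per_class.values()) / len(per_class)
    return overall, average, per_class


if __name__ == "__main__":
    # Toy example: a classifier that always predicts the larger class.
    truth = ["cytoplasmic"] * 8 + ["secreted"] * 2
    predicted = ["cytoplasmic"] * 10
    print(overall_and_average_accuracy(truth, predicted))
```

In the toy case the overall accuracy looks respectable while one class is never predicted correctly, which is why the class-wise mean is reported alongside it for multi-class location prediction.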


Subjects
Artificial Intelligence , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Mycobacterium tuberculosis/metabolism , Pattern Recognition, Automated/methods , Sequence Analysis, Protein/methods , Subcellular Fractions/metabolism , Amino Acid Motifs , Amino Acid Sequence , Bacterial Proteins/chemistry , Databases, Protein , Molecular Sequence Data , Mycobacterium tuberculosis/genetics , Sequence Alignment/methods