1.
BMC Bioinformatics ; 9 Suppl 1: S12, 2008.
Article in English | MEDLINE | ID: mdl-18315843

ABSTRACT

BACKGROUND: Protein domains provide some of the most useful information for understanding protein structure and function. Recent research on protein domain boundary prediction has been based mainly on widely known machine learning techniques, such as Artificial Neural Networks and Support Vector Machines. In this study, we propose a new machine learning model (IGRN) that can achieve accurate and reliable classification with significantly reduced computation. The IGRN was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and inter-domain linker index to detect possible domain boundaries for a target sequence. RESULTS: The proposed model achieved an average prediction accuracy of 67% on the Benchmark_2 dataset for domain boundary identification in multi-domain proteins and showed superior predictive performance and generalisation ability compared with the most widely used neural network models. On the CASP7 benchmark dataset, it also demonstrated performance comparable to existing domain boundary predictors such as DOMpro, DomPred, DomSSEA, DomCut and DomainDiscovery, with 70.10% prediction accuracy. CONCLUSION: The proposed model compares favourably with other existing machine-learning-based methods, as well as with widely known domain boundary predictors, on two benchmark datasets, and excels in the identification of domain boundaries in terms of model bias, generalisation and computational requirements.
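
Editorial sketch (not from the paper): the abstract lists per-residue inputs (PSSM, secondary structure, solvent accessibility, linker index) but does not say how they are presented to the model. One common convention, assumed here, is to concatenate the feature rows of neighbouring residues in a sliding window. The function name `window_features`, the window size and the zero padding are all hypothetical choices for illustration.

```python
from typing import List, Sequence


def window_features(per_residue: Sequence[Sequence[float]],
                    half_width: int = 2) -> List[List[float]]:
    """Concatenate per-residue feature rows over a window centred on each residue.

    Assumption: positions that fall outside the sequence are zero-padded.
    """
    if not per_residue:
        return []
    n_feat = len(per_residue[0])
    pad = [0.0] * n_feat
    encoded = []
    for i in range(len(per_residue)):
        row: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            # Use the real feature row when j is inside the sequence, else padding.
            row.extend(per_residue[j] if 0 <= j < len(per_residue) else pad)
        encoded.append(row)
    return encoded


# Toy input: three residues, two made-up features each (not real PSSM values).
toy = [[1.0, 0.0], [0.5, 0.2], [0.0, 1.0]]
encoded = window_features(toy, half_width=1)
```

Each output row has `(2 * half_width + 1) * n_feat` values, so a classifier would see the local sequence context of a residue rather than the residue alone.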


Subjects
Algorithms ; Neural Networks, Computer ; Pattern Recognition, Automated/methods ; Proteins/chemistry ; Sequence Analysis, Protein/methods ; Amino Acid Sequence ; Molecular Sequence Data ; Protein Structure, Tertiary ; Regression Analysis
2.
BMC Bioinformatics ; 7 Suppl 5: S6, 2006 Dec 18.
Article in English | MEDLINE | ID: mdl-17254311

ABSTRACT

BACKGROUND: Knowledge of protein domain boundaries is critical for the characterisation and understanding of protein function. The ability to identify domains without knowledge of the structure, using sequence information only, is an essential step in many types of protein analyses. In the present study, we demonstrate that the performance of DomainDiscovery is improved significantly by including the inter-domain linker index value for domain identification from sequence-based information. Improved DomainDiscovery uses a Support Vector Machine (SVM) approach and a unique training dataset built on the principle of consensus among experts in defining domains in protein structure. The SVM was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and inter-domain linker index to detect possible domain boundaries for a target sequence. RESULTS: Improved DomainDiscovery is compared with other methods by benchmarking against a structurally non-redundant dataset and also CASP5 targets. Improved DomainDiscovery achieves 70% accuracy for domain boundary identification in multi-domain proteins. CONCLUSION: Improved DomainDiscovery compares favourably with other methods and excels in the identification of domain boundaries for multi-domain proteins as a result of introducing a support vector machine with the Benchmark_2 dataset.


Subjects
Protein Structure, Tertiary ; Sequence Analysis, Protein/methods ; Software ; Structural Homology, Protein ; Amino Acid Motifs ; Amino Acid Sequence ; Databases, Protein ; Molecular Sequence Data ; Sequence Alignment
3.
Int J Bioinform Res Appl ; 6(5): 508-21, 2010.
Article in English | MEDLINE | ID: mdl-21224207

ABSTRACT

Dengue virus, a member of the flavivirus family, is a mosquito-borne viral pathogen for which no specific treatment or vaccine-based control of infection has yet been conclusively established. The envelope glycoprotein, E, mediates viral entry by membrane fusion. Elucidation of post-translational modification sites in E protein, followed by sequence alignment, revealed stretches of residues that are conserved in most flaviviruses. The presence of protein kinase A (PKA) and protein kinase G (PKG) phosphorylation sites suggests that E protein may activate PKA and PKG through phosphorylation, which is responsible for inhibition of platelet activation, thereby causing thrombocytopenia. Here, we attempt to decipher the novel role of Dengue virus E protein in pathogenesis.


Subjects
Dengue Virus/pathogenicity ; Protein Processing, Post-Translational ; Viral Envelope Proteins/chemistry ; Viral Envelope Proteins/metabolism ; Base Sequence ; Computational Biology/methods ; Dengue Virus/genetics ; Dengue Virus/metabolism ; Molecular Sequence Data ; Sequence Alignment ; Viral Envelope Proteins/genetics ; Virus Replication
4.
Int J Bioinform Res Appl ; 5(1): 20-37, 2009.
Article in English | MEDLINE | ID: mdl-19136362

ABSTRACT

Protein phosphorylation plays a fundamental role in most cellular regulatory pathways. Experimental detection of protein phosphorylation sites is labour intensive and often limited by the availability and optimisation of enzymatic reactions. In silico prediction of phosphorylation sites from a protein's primary sequence may provide guidelines for further experimental consideration and for the interpretation of phosphoproteomic data. An array of such tools is available online, providing predictions for protein kinase families. We developed an independent dataset to compare the performance of these methods and to give scientists a better understanding of which method to use for their research.
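
Editorial sketch (not from the paper): the abstract does not name the measures used to compare the predictors. A minimal way to score one tool against an independent set of annotated sites, assuming sensitivity, specificity and the Matthews correlation coefficient as the metrics, might look like the following. The function name `site_metrics` and the residue positions are hypothetical.

```python
import math
from typing import Dict, Set


def site_metrics(predicted: Set[int], actual: Set[int],
                 candidates: Set[int]) -> Dict[str, float]:
    """Score predicted phosphorylation sites against annotated ones.

    `candidates` is every position that could be a site (for example all
    S/T/Y residues); `predicted` and `actual` are assumed to be subsets of it.
    """
    tp = len(predicted & actual)               # predicted and annotated
    fp = len(predicted - actual)               # predicted but not annotated
    fn = len(actual - predicted)               # annotated but missed
    tn = len(candidates - predicted - actual)  # correctly left unpredicted
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Made-up positions for illustration only.
scores = site_metrics(predicted={10, 17},
                      actual={10, 25},
                      candidates={3, 10, 17, 25, 31, 40})
```

Running the same function over every tool's predictions on the same dataset would give directly comparable numbers, which is the kind of side-by-side comparison the abstract describes.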


Subjects
Computational Biology/methods ; Phosphoproteins/chemistry ; Algorithms ; Catalytic Domain ; Phosphoproteins/metabolism ; Phosphorylation ; Protein Kinases/metabolism ; Proteome/metabolism
5.
IEEE Trans Nanobioscience ; 7(3): 200-5, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18779100

ABSTRACT

Wetlaufer introduced the classification of domains into continuous and discontinuous. Continuous domains form from a single chain segment, whereas discontinuous domains are composed of two or more chain segments. Richardson identified approximately 100 domains in her review. Her assignment was based on the concept that a domain would be independently stable and/or could undergo rigid-body-like movements with respect to the entire protein. There are now several instances where structurally similar domains occur in different proteins in the absence of noticeable sequence similarity. Possibly the most notable of such domains is the triose-phosphate isomerase (TIM) barrel. With the increase in the number of known sequences, computer algorithms are required to identify the discontinuous domains of an unknown protein chain in order to determine its structure and function. We have developed a novel algorithm for discontinuous-domain boundary prediction based on a machine learning algorithm and inter-residue contact interaction values. We used 415 proteins, including 100 discontinuous-domain chains, for training. No available method is designed solely on a sequence basis for the prediction of discontinuous domains. DomainDiscovery performed well compared with structure-based methods such as structural classification of proteins (SCOP), class, architecture, topology and homologous superfamily (CATH), and DOMain MAKer (DOMAK).
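
Editorial sketch (not from the paper): the continuous/discontinuous distinction described above can be expressed as a small check on the chain segments assigned to a domain. The representation of a segment as an inclusive `(start, end)` residue range, and the names `merge_segments` and `classify_domain`, are assumptions made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # inclusive (start, end) residue numbers


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Merge overlapping or directly adjacent segments into maximal runs."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1] + 1:
            # Extends or touches the previous run, so fold it in.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def classify_domain(segments: List[Segment]) -> str:
    """Return 'continuous' for one chain segment, 'discontinuous' for two or more."""
    if not segments:
        raise ValueError("a domain needs at least one segment")
    return "continuous" if len(merge_segments(segments)) == 1 else "discontinuous"


# Made-up residue ranges for illustration only.
single = classify_domain([(1, 120)])
split = classify_domain([(1, 60), (150, 210)])
adjacent = classify_domain([(1, 60), (61, 120)])
```

Merging first means that two ranges which simply abut (as in the last call) are treated as one segment, so only a genuine gap in the chain makes a domain discontinuous.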


Subjects
Algorithms ; Artificial Intelligence ; Pattern Recognition, Automated/methods ; Proteins/chemistry ; Proteins/ultrastructure ; Sequence Alignment/methods ; Sequence Analysis, Protein/methods ; Amino Acid Sequence ; Information Storage and Retrieval/methods ; Molecular Sequence Data ; Protein Structure, Tertiary