1.
Protein Pept Lett ; 19(1): 70-8, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21919857

ABSTRACT

Phosphorylation is one of the most important post-translational modifications, and identifying protein phosphorylation sites is particularly important for studying disease diagnosis. However, experimental detection of phosphorylation sites is labor intensive, so computational methods that provide an additional reference for phosphorylation sites would be beneficial. Here we developed a novel sequence-based method for predicting serine, threonine, and tyrosine phosphorylation sites. The Nearest Neighbor algorithm was employed as the prediction engine. Peptides with a fixed length of thirteen amino acid residues around the phosphorylation sites were extracted via a sliding window along the protein chains concerned. Each such peptide was encoded as a vector of 6,072 features derived from the Amino Acid Index (AAIndex) database for classification. Incremental Feature Selection, a feature selection algorithm based on the Maximum Relevance Minimum Redundancy (mRMR) method, was used to select a compact feature set and further improve classification performance. Three predictors were established for identifying the three types of phosphorylation sites, achieving overall accuracies of 66.64%, 66.11%, and 66.69%, respectively. These rates were obtained by rigorous jackknife cross-validation tests.
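
The sliding-window step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `extract_windows`, the padding character `X` for positions beyond the chain ends, and the choice to center each window on the candidate residue are all assumptions, since the abstract states only the window length of thirteen.

```python
from typing import List, Tuple

WINDOW = 13          # peptide length stated in the abstract
HALF = WINDOW // 2   # 6 residues on each side of the candidate site
PAD = "X"            # assumed placeholder for positions beyond the chain ends


def extract_windows(sequence: str, targets: str = "STY") -> List[Tuple[int, str]]:
    """Return (1-based position, 13-residue peptide) for each S, T or Y."""
    padded = PAD * HALF + sequence + PAD * HALF
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # In the padded string, residue i sits at index i + HALF, so the
            # slice [i, i + WINDOW) is centered on it.
            windows.append((i + 1, padded[i:i + WINDOW]))
    return windows
```

Each returned peptide would then be converted to a numeric feature vector before classification; padding at the termini keeps every window the same length.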


Subjects
Peptides/chemistry , Phosphoproteins/chemistry , Protein Sequence Analysis/methods , Support Vector Machine , Binding Sites , Computational Biology , Data Mining , Protein Databases , Peptides/metabolism , Phosphoproteins/metabolism , Phosphorylation , Predictive Value of Tests , Post-Translational Protein Processing , Serine/metabolism , Threonine/metabolism , Tyrosine/metabolism
2.
Biochimie ; 93(3): 489-96, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21075167

ABSTRACT

Palmitoylation is a universal and important lipid modification involved in a series of basic cellular processes, such as membrane trafficking, protein stability, and protein aggregation. With the avalanche of new protein sequences generated in the post-genomic era, it is highly desirable to develop computational methods that rapidly and effectively identify the potential palmitoylation sites of uncharacterized proteins, and so provide timely information for revealing the mechanism of protein palmitoylation. To this end, a new predictor named IFS-Palm was developed using the Incremental Feature Selection approach based on amino acid factors, conservation, disorder features, and specific features of the palmitoylation site. The overall success rate achieved by the jackknife test on a newly constructed benchmark dataset was 90.65%. An in-depth analysis showed that palmitoylation was closely correlated with the features of the upstream residue directly adjacent to the cysteine site, as well as with the conservation of the cysteine residue. Protein disorder regions might also play an important role in this post-translational modification. These findings may provide useful insights for revealing the mechanisms of palmitoylation.
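
Incremental Feature Selection evaluated by a jackknife test can be sketched as below. This is a sketch under stated assumptions rather than the IFS-Palm implementation: the abstract does not name the classifier, so a 1-nearest-neighbor rule with squared Euclidean distance is used purely for brevity, and the function names and the `ranked` input (a feature ordering produced by some earlier ranking step) are hypothetical.

```python
import numpy as np


def jackknife_accuracy(X: np.ndarray, y: np.ndarray) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier (assumed)."""
    # Pairwise squared Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)      # a sample may not be its own neighbor
    nearest = dist.argmin(axis=1)
    return float((y[nearest] == y).mean())


def incremental_feature_selection(X, y, ranked):
    """Grow the feature set in ranked order; keep the best-scoring prefix."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked) + 1):
        acc = jackknife_accuracy(X[:, ranked[:k]], y)
        if acc > best_acc:              # ties keep the smaller feature set
            best_k, best_acc = k, acc
    return ranked[:best_k], best_acc
```

The pairwise-distance array grows with the square of the sample count, so this form suits small illustrative datasets only.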


Subjects
Computational Biology/methods , Lipoylation , Proteins/chemistry , Proteins/metabolism , Algorithms , Amino Acid Sequence , Binding Sites , Protein Databases , Reproducibility of Results , Saccharomycetales/metabolism
3.
Protein Pept Lett ; 17(7): 899-908, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20394581

ABSTRACT

A transcription factor (TF) is a protein that binds DNA at specific sites to help regulate transcription from DNA to RNA. The mechanism of transcriptional regulation can be much better understood if the category of a transcription factor is known. We introduce a system that automatically categorizes transcription factors using their primary structures. A feature analysis strategy called "mRMR" (Minimum Redundancy, Maximum Relevance) is used to analyze the contribution of the TF properties to the classification. mRMR is coupled with forward feature selection to choose an optimized feature subset. The TF properties comprise the amino acid composition and the physicochemical properties of the proteins, which together generate over a hundred features/parameters. These features/parameters are fed into a classifier, the nearest neighbor algorithm (NNA). The classification accuracy is 93.81%, evaluated by a jackknife test. Feature analysis using the mRMR algorithm shows that secondary structure, amino acid composition, and hydrophobicity are the most relevant features for classification. A free online classifier is available at http://app3.biosino.org/132dvc/tf/.
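
The mRMR idea named in this abstract, ranking features by relevance to the class label minus redundancy with features already chosen, can be sketched as below. This is a simplified illustration and not the published mRMR software: it assumes features have already been discretized, estimates mutual information from raw counts, and uses the difference form of the criterion. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def mutual_information(a: Sequence, b: Sequence) -> float:
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (pa[x] * pb[y]))
        for (x, y), c in pab.items()
    )


def mrmr_rank(features: List[Sequence], target: Sequence) -> List[int]:
    """Greedy mRMR ordering: relevance minus mean redundancy."""
    relevance = [mutual_information(f, target) for f in features]
    selected: List[int] = []
    remaining = list(range(len(features)))
    while remaining:
        def score(j: int) -> float:
            if not selected:
                return relevance[j]
            # Average mutual information with the features picked so far.
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

In this toy form an exact duplicate of an already selected feature scores near zero and drops to the end of the ranking, which is the behavior the redundancy term is meant to produce.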


Subjects
Algorithms , Amino Acid Sequence , Automated Pattern Recognition/methods , Transcription Factors , Amino Acids/chemistry , Cysteine/chemistry , Hydrophobic and Hydrophilic Interactions , Molecular Sequence Data , Software , Transcription Factors/chemistry , Transcription Factors/classification , Tryptophan/chemistry
4.
PLoS One ; 5(12): e15917, 2010 Dec 31.
Article in English | MEDLINE | ID: mdl-21209839

ABSTRACT

BACKGROUND: Hydroxylation is an important post-translational modification and is closely related to various diseases. Besides biotechnology experiments, in silico prediction methods offer an alternative way to identify potential hydroxylation sites. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we developed a novel sequence-based method for identifying the two main types of hydroxylation sites: hydroxyproline and hydroxylysine. First, feature selection was performed on three kinds of features computed over a sliding window of 13 amino acids: amino acid indices (AAindex), which cover various physicochemical and biochemical properties of amino acids; Position-Specific Scoring Matrices (PSSM), which represent the evolutionary information of amino acids; and the structural disorder of amino acids. The prediction model was then built using the incremental feature selection method. The resulting prediction accuracies, evaluated by jackknife cross-validation, are 76.0% on the hydroxyproline dataset and 82.1% on the hydroxylysine dataset. Feature analysis suggested that the physicochemical properties, biochemical properties, and evolutionary information of amino acids contribute much to the identification of protein hydroxylation sites, while structural disorder had little relation to protein hydroxylation. It was also found that the amino acid adjacent to the hydroxylation site tends to exert more influence on hydroxylation determination than other sites. CONCLUSIONS/SIGNIFICANCE: These findings may provide useful insights for exploiting the mechanisms of hydroxylation.
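
Encoding a 13-residue window with a per-residue physicochemical index can be sketched as follows. The abstract does not say which AAindex entries were used, so the Kyte-Doolittle hydropathy scale appears here only as one familiar example of such an index. The function name `encode_window` and the value 0.0 assigned to unknown or padding residues are assumptions.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy values, used here only as one example of a
# per-residue physicochemical index; the study's actual indices are not given.
HYDROPATHY: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_window(peptide: str,
                  index: Dict[str, float] = HYDROPATHY,
                  unknown: float = 0.0) -> List[float]:
    """Map each residue of a window to its index value, position by position."""
    # Residues missing from the index (e.g. padding) get the assumed default.
    return [index.get(residue, unknown) for residue in peptide]
```

Concatenating the vectors produced by several indices, together with PSSM and disorder scores for the same positions, would give a combined feature vector of the kind the abstract describes.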


Subjects
Computational Biology/methods , Hydroxylysine/chemistry , Hydroxyproline/chemistry , Algorithms , Amino Acids/chemistry , Binding Sites , Biochemistry/methods , Computational Biology/instrumentation , Hydroxylation , Hydroxylysine/metabolism , Hydroxyproline/metabolism , Statistical Models , Theoretical Models , Peptides/chemistry , Position-Specific Scoring Matrices , Protein Conformation , Reproducibility of Results