Results 1 - 20 of 32
1.
Int J Immunogenet ; 41(1): 74-80, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23800159

ABSTRACT

Granulocyte-macrophage colony-stimulating factor (GM-CSF) is a cytokine that is essential for growth and development of progenitors of granulocytes and monocytes/macrophages. In this study, we report the molecular cloning, sequencing and characterization of GM-CSF from the Indian water buffalo, Bubalus bubalis. In addition, we performed sequence and structural analysis of buffalo GM-CSF. Buffalo GM-CSF was compared with 17 mammalian GM-CSFs using multiple sequence alignment and a phylogenetic tree. A three-dimensional model of the buffalo GM-CSF and human receptor complex was built using homology modelling to study cross-reactivity between the two species. Detailed analysis was performed to study the GM-CSF interface and the various interactions at the interface.


Subjects
Buffaloes/genetics , Granulocyte-Macrophage Colony-Stimulating Factor/genetics , Animals , Cloning, Molecular , Granulocyte-Macrophage Colony-Stimulating Factor/chemistry , Sequence Analysis, DNA
2.
J Theor Biol ; 317: 377-83, 2013 Jan 21.
Article in English | MEDLINE | ID: mdl-23123454

ABSTRACT

The extracellular matrix (ECM) is a major component of tissues of multicellular organisms. It consists of secreted macromolecules, mainly polysaccharides and glycoproteins. Malfunctions of ECM proteins lead to severe disorders such as Marfan syndrome, osteogenesis imperfecta, numerous chondrodysplasias, and skin diseases. In this work, we report a random forest approach, EcmPred, for the prediction of ECM proteins from protein sequences. EcmPred was trained on a dataset containing 300 ECM and 300 non-ECM proteins and tested on a dataset containing 145 ECM and 4187 non-ECM proteins. EcmPred achieved 83% accuracy on the training and 77% on the test dataset. EcmPred predicted 15 out of 20 experimentally verified ECM proteins. By scanning the entire human proteome, we predicted novel ECM proteins validated with gene ontology and InterPro. The dataset and standalone version of the EcmPred software are available at http://www.inb.uni-luebeck.de/tools-demos/Extracellular_matrix_proteins/EcmPred.


Subjects
Algorithms , Computational Biology/methods , Extracellular Matrix Proteins/metabolism , Artificial Intelligence , Databases, Protein , Humans , Proteome/metabolism , ROC Curve
3.
BMC Bioinformatics ; 12: 345, 2011 Aug 17.
Article in English | MEDLINE | ID: mdl-21849049

ABSTRACT

BACKGROUND: Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi, etc., also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is challenging due to their poor similarity in sequence. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence. RESULTS: In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated by an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches, ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated. CONCLUSION: BLProt achieves 80% accuracy from training (5-fold cross-validation) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. The BLProt software is available at http://www.inb.uni-luebeck.de/tools-demos/bioluminescent%20protein/BLProt.


Subjects
Luminescent Proteins/chemistry , Software , Support Vector Machine , Animals , Humans , Markov Chains
4.
J Theor Biol ; 270(1): 56-62, 2011 Feb 07.
Article in English | MEDLINE | ID: mdl-21056045

ABSTRACT

Some creatures living in extremely low temperatures can produce special materials called "antifreeze proteins" (AFPs), which can prevent the cell and body fluids from freezing. AFPs are present in vertebrates, invertebrates, plants, bacteria, fungi, etc. Although AFPs have a common function, they show a high degree of diversity in sequences and structures. Therefore, sequence similarity based search methods often fail to predict AFPs from sequence databases. In this work, we report a random forest approach, "AFP-Pred", for the prediction of antifreeze proteins from protein sequence. AFP-Pred was trained on a dataset containing 300 AFPs and 300 non-AFPs and tested on a dataset containing 181 AFPs and 9193 non-AFPs. AFP-Pred achieved 81.33% accuracy from training and 83.38% from testing. The performance of AFP-Pred was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that AFP-Pred can be a useful approach to identify antifreeze proteins from sequence information, irrespective of their sequence similarity.


Subjects
Algorithms , Amino Acid Sequence/genetics , Antifreeze Proteins/analysis , Computational Biology/methods , Proteins/classification , Amino Acids/chemistry , Antifreeze Proteins/genetics , Artificial Intelligence , Chemical Phenomena , Protein Structure, Secondary/genetics , Protein Structure, Tertiary/genetics , Proteins/genetics , ROC Curve
5.
Biochem Biophys Res Commun ; 391(3): 1306-11, 2010 Jan 15.
Article in English | MEDLINE | ID: mdl-19995554

ABSTRACT

Eukaryotic protein secretion generally occurs via the classical secretory pathway that traverses the ER and Golgi apparatus. Secreted proteins usually contain a signal sequence with all the essential information required to target them for secretion. However, some proteins like fibroblast growth factors (FGF-1, FGF-2), interleukins (IL-1 alpha, IL-1 beta), galectins and thioredoxin are exported by an alternative pathway. This is known as leaderless or non-classical secretion and works without a signal sequence. Most computational methods for the identification of secretory proteins use the signal peptide as an indicator and are therefore not able to identify substrates of non-classical secretion. In this work, we report a random forest method, SPRED, to identify secretory proteins from protein sequences irrespective of N-terminal signal peptides, thus also allowing correct classification of non-classical secretory proteins. Training was performed on a dataset containing 600 extracellular proteins and 600 cytoplasmic and/or nuclear proteins. The algorithm was tested on 180 extracellular proteins and 1380 cytoplasmic and/or nuclear proteins. We obtained 85.92% accuracy from training and 82.18% accuracy from testing. Since SPRED does not use N-terminal signals, it can detect non-classical secreted proteins once SignalP has been used to filter out the predicted secreted proteins that carry an N-terminal signal. SPRED predicted 15 out of 19 experimentally verified non-classical secretory proteins. By scanning the entire human proteome we identified 566 protein sequences potentially undergoing non-classical secretion. The dataset and standalone version of the SPRED software are available at http://www.inb.uni-luebeck.de/tools-demos/spred/spred.


Subjects
Artificial Intelligence , Genome, Human , Proteins/metabolism , Proteome , Sequence Analysis, Protein/methods , Animals , Humans , Proteins/chemistry , Proteins/genetics
6.
Bioinformatics ; 25(2): 204-10, 2009 Jan 15.
Article in English | MEDLINE | ID: mdl-19038986

ABSTRACT

MOTIVATION: So far, various bioinformatics and machine learning techniques have been applied for identification of sequence and functionally conserved residues in proteins. Although a few computational methods are available for the prediction of structurally conserved residues from protein structure, almost all methods require homologous structural information and structure-based alignments, which still prove to be a bottleneck in protein structure comparison studies. In this work, we developed a neural network approach for identification of structurally important residues from a single protein structure without using homologous structural information and structural alignment. RESULTS: A neural network ensemble (NNE) method that utilizes a negative correlation learning (NCL) approach was developed for identification of structurally conserved residues (SCRs) in proteins using features that represent amino acid conservation and composition, physico-chemical properties and structural properties. The NCL-NNE method was applied to 6042 SCRs that have been extracted from 496 protein domains. This method obtained high prediction sensitivity (92.8%) and quality (Matthews correlation coefficient of 0.852) in identification of SCRs. Further benchmarking using 60 protein domains containing 1657 SCRs that were not part of the training and testing datasets shows that the NCL-NNE can correctly predict SCRs with approximately 90% sensitivity. These results suggest the usefulness of NCL-NNE for facilitating the identification of SCRs utilizing information derived from a single protein structure. Therefore, this method could be extremely effective in large-scale benchmarking studies where reliable structural homologs and alignments are limited.


Subjects
Neural Networks, Computer , Proteins/chemistry , Computational Biology/methods , Conserved Sequence , Databases, Protein , Protein Conformation , Protein Structure, Tertiary , Proteins/genetics
7.
Amino Acids ; 39(5): 1385-91, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20411285

ABSTRACT

Real-world datasets commonly have issues with data imbalance. There are several approaches, such as weighting, sub-sampling, and data modeling, for handling these data. Learning in the presence of data imbalances presents a great challenge to machine learning. Techniques such as support-vector machines have excellent performance for balanced data, but may fail when applied to imbalanced datasets. In this paper, we propose a new undersampling technique for selecting instances from the majority class. The performance of this approach was evaluated on several real imbalanced biological datasets. The ratios of negative to positive samples vary from ~9:1 to ~100:1. Useful classifiers have high sensitivity and specificity. Our results demonstrate that the proposed selection technique improves the sensitivity compared to the weighted support-vector machine and to available results in the literature for the same datasets.


Subjects
Algorithms , Amino Acids/chemistry , Catalytic Domain , Chemistry, Physical , Databases, Factual , Molecular Structure , Molecular Weight
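The abstract of entry 7 does not describe how its selection technique chooses majority-class instances, so the following is only a minimal sketch of the simplest baseline it builds on: plain random undersampling. The function name `random_undersample`, the `ratio` parameter, and the toy 9:1 dataset are assumptions made for illustration; this is not the authors' method.

```python
import random

def random_undersample(majority, minority, ratio=1.0, seed=0):
    """Randomly keep about ratio * len(minority) majority-class instances.

    Baseline illustration only (hypothetical helper), not the selection
    technique proposed in the paper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_keep = min(len(majority), int(round(ratio * len(minority))))
    kept_majority = rng.sample(majority, n_keep)  # sampling without replacement
    return kept_majority, list(minority)

# Toy data with a ~9:1 negative-to-positive ratio, as in the abstract's lower bound.
negatives = [f"neg_{i}" for i in range(900)]
positives = [f"pos_{i}" for i in range(100)]

kept_negatives, kept_positives = random_undersample(negatives, positives)
print(len(kept_negatives), len(kept_positives))
```

Setting `ratio` above 1.0 keeps more negatives than positives, which trades some sensitivity for specificity.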
8.
Amino Acids ; 39(3): 777-83, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20186553

ABSTRACT

Lipocalins are functionally diverse proteins that are composed of 120-180 amino acid residues. Members of this family have several important biological functions including ligand transport, cryptic coloration, sensory transduction, endonuclease activity, stress response activity in plants, odorant binding, prostaglandin biosynthesis, cellular homeostasis regulation, immunity, immunotherapy and so on. Identification of lipocalins from protein sequence is challenging due to the poor sequence identity, which often falls below the twilight zone. So far, no specific method has been reported to identify lipocalins from primary sequence. In this paper, we report a support vector machine (SVM) approach to predict lipocalins from protein sequence using sequence-derived properties. LipoPred was trained using a dataset consisting of 325 lipocalin proteins and 325 non-lipocalin proteins, and evaluated by an independent set of 140 lipocalin proteins and 21,447 non-lipocalin proteins. LipoPred achieved 88.61% accuracy with 89.26% sensitivity, 85.27% specificity and a Matthews correlation coefficient (MCC) of 0.74. When applied to the test dataset, LipoPred achieved 84.25% accuracy with 88.57% sensitivity, 84.22% specificity and an MCC of 0.16. LipoPred performed better than the PSI-BLAST, HMM and SVM-Prot methods. Out of 218 lipocalins, LipoPred correctly predicted 194 proteins including 39 lipocalins that are non-homologous to any protein in the SWISSPROT database. This result shows that LipoPred is potentially useful for predicting the lipocalin proteins that have no sequence homologs in the sequence databases. Further, successful prediction of nine hypothetical lipocalin proteins and five new members of the lipocalin family shows that LipoPred can be efficiently used to identify and annotate new lipocalin proteins from sequence databases. The LipoPred software and dataset are available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/lipopred.htm.


Subjects
Lipocalins/chemistry , Sequence Alignment/methods , Databases, Protein , Humans , Protein Structure, Tertiary , Sequence Alignment/instrumentation , Sequence Homology, Amino Acid
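The test-set figures in entry 8 show how a heavily imbalanced dataset can pair high sensitivity and specificity with a low MCC. The sketch below rebuilds an approximate confusion matrix from the reported rates (140 positives, 21,447 negatives, 88.57% sensitivity, 84.22% specificity) and computes accuracy and MCC from it. Rounding the counts to whole proteins is an assumption, and the helper names are illustrative.

```python
import math

def confusion_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Approximate (tp, fp, tn, fn) from class sizes and reported rates."""
    tp = round(sensitivity * n_pos)   # true positives
    tn = round(specificity * n_neg)   # true negatives
    return tp, n_neg - tn, tn, n_pos - tp

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

tp, fp, tn, fn = confusion_from_rates(140, 21447, 0.8857, 0.8422)
accuracy = (tp + tn) / (tp + fp + tn + fn)
test_mcc = mcc(tp, fp, tn, fn)
print(tp, fp, tn, fn)                          # 124 3384 18063 16
print(round(accuracy, 4), round(test_mcc, 2))  # 0.8425 0.16
```

Under these assumptions the reconstructed accuracy and MCC agree with the reported 84.25% and 0.16. The MCC is low because the roughly 3,400 false positives far outnumber the 124 true positives, so precision is only a few percent.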
9.
Nucleic Acids Res ; 36(Database issue): D218-21, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17933773

ABSTRACT

Structural motifs are important for the integrity of a protein fold and can be employed to design and rationalize protein engineering and folding experiments. Such conserved segments represent the conserved core of a family or superfamily and can be crucial for the recognition of potential new members in sequence and structure databases. We present a database, MegaMotifBase, that compiles a set of important structural segments or motifs for protein structures. Motifs are recognized on the basis of both sequence conservation and preservation of important structural features such as amino acid preference, solvent accessibility, secondary structural content, hydrogen-bonding pattern and residue packing. This database provides 3D orientation patterns of the identified motifs in terms of inter-motif distances and torsion angles. Important applications of structural motifs are also provided in several crucial areas such as similar sequence and structure search, multiple sequence alignment and homology modeling. MegaMotifBase can be a useful resource to gain knowledge about structure and functional relationship of proteins. The database can be accessed from the URL http://caps.ncbs.res.in/MegaMotifbase/index.html.


Subjects
Amino Acid Motifs , Databases, Protein , Proteins/classification , Amino Acid Sequence , Conserved Sequence , Internet , Proteins/chemistry , Sequence Analysis, Protein , Structural Homology, Protein
10.
Protein Pept Lett ; 27(3): 178-186, 2020.
Article in English | MEDLINE | ID: mdl-31577193

ABSTRACT

BACKGROUND: N-glycosylation is one of the most important post-translational mechanisms in eukaryotes. N-glycosylation predominantly occurs in the N-X-[S/T] sequon, where X is any amino acid other than proline. However, not all N-X-[S/T] sequons in proteins are glycosylated. Therefore, accurate prediction of N-glycosylation sites is essential to understand the N-glycosylation mechanism. OBJECTIVE: In this article, our motivation is to develop a computational method to predict N-glycosylation sites in eukaryotic protein sequences. METHODS: In this article, we report a random forest method, Nglyc, to predict N-glycosylation sites from protein sequence, using 315 sequence features. The method was trained using a dataset of 600 N-glycosylation sites and 600 non-glycosylation sites and tested on a dataset containing 295 N-glycosylation sites and 253 non-glycosylation sites. Nglyc prediction was compared with the NetNGlyc, EnsembleGly and GPP methods. Further, the performance of Nglyc was evaluated using human and mouse N-glycosylation sites. RESULT: The Nglyc method achieved an overall training accuracy of 0.8033 with all 315 features. Performance comparison with the NetNGlyc, EnsembleGly and GPP methods shows that Nglyc performs better than the other methods, with high sensitivity and specificity. CONCLUSION: Our method achieved an overall accuracy of 0.8248 with 0.8305 sensitivity and 0.8182 specificity. The comparison study shows that our method performs better than the other methods. The applicability and success of our method were further evaluated using human and mouse N-glycosylation sites. The Nglyc method is freely available at https://github.com/bioinformaticsML/Ngly.


Subjects
Computational Biology/methods , Proteins/chemistry , Sequence Analysis, Protein/methods , Animals , Databases, Protein , Glycosylation , Humans , Mice , Software
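The N-X-[S/T] sequon rule stated in entry 10 can be expressed as a short pattern search. The sketch below only enumerates candidate sequons; as the abstract notes, not every sequon is glycosylated, so this is a candidate-generation step and not the Nglyc predictor. The function name and the example sequence are made up for illustration.

```python
import re

# N, then any residue except proline, then S or T.
# The lookahead makes the match zero-width so overlapping sequons are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of the N in every N-X-[S/T] sequon (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical peptide: NGT and NAS qualify, NPS is excluded because X is proline.
print(find_sequons("ANGTNPSNAS"))  # [2, 8]
```

A classifier such as the one described would then score each candidate position, for example from features of the residues flanking the asparagine.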
11.
Biochem Biophys Res Commun ; 384(2): 155-9, 2009 Jun 26.
Article in English | MEDLINE | ID: mdl-19394310

ABSTRACT

Identification of functionally important sites (FIS) in proteins is a critical problem and can have profound importance where protein structural information is limited. Machine learning techniques have been very useful in successful classification of many important biological problems. In this paper, we adopt the sparse kernel least squares classifiers (SKLSC) approach for classification and/or prediction of FIS using protein sequence derived features. The SKLSC algorithm was applied to 5435 FIS that have been extracted from 312 reliable alignments for a wide range of protein families. We obtained 68.28% sensitivity and 68.66% specificity for the training dataset and 65.34% sensitivity and 66.88% specificity for the testing dataset. Further, a large-scale benchmarking study using alignments of 101 protein families containing 1899 FIS showed that our method achieved an average sensitivity of approximately 70% in predicting different types of FIS, such as active sites and metal, ligand or protein binding sites. Our findings also indicate that active sites and metal binding sites are comparatively easier to predict than ligand and protein binding sites. Despite moderate success, our results suggest the usefulness and potential of the SKLSC approach in prediction of FIS using only protein sequence derived information.


Subjects
Binding Sites , Proteins/chemistry , Sequence Analysis, Protein/methods , Amino Acid Sequence , Catalytic Domain , Least-Squares Analysis , Proteins/classification
12.
J Biomol Struct Dyn ; 26(6): 679-86, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19385697

ABSTRACT

DNA-binding proteins (DNABPs) are important for various cellular processes, such as transcriptional regulation, recombination, replication, repair, and DNA modification. So far, various bioinformatics and machine learning techniques have been applied for identification of DNA-binding proteins from protein structure. Only a few methods are available for the identification of DNA-binding proteins from protein sequence. In this work, we report a random forest method, DNA-Prot, to identify DNA-binding proteins from protein sequence. Training was performed on a dataset containing 146 DNA-binding proteins and 250 non DNA-binding proteins. The algorithm was tested on a dataset containing 92 DNA-binding proteins and 100 non DNA-binding proteins. We obtained 80.31% accuracy from training and 84.37% accuracy from testing. Benchmarking analysis on an independent dataset of 823 DNA-binding proteins and 823 non DNA-binding proteins shows that our approach can distinguish DNA-binding proteins from non DNA-binding proteins with more than 80% accuracy. We also compared our method with the DNAbinder method on the test dataset and two independent datasets. Comparable performance was observed from both methods on the test dataset. In the benchmark dataset containing 823 DNA-binding proteins and 823 non DNA-binding proteins, we obtained significantly better performance from DNA-Prot with 81.83% accuracy, whereas DNAbinder achieved only 61.42% accuracy using amino acid composition and 63.5% using the PSSM profile. Similarly, DNA-Prot performed better on the benchmark dataset containing 88 DNA-binding proteins and 233 non DNA-binding proteins. This result shows that DNA-Prot can be efficiently used to identify DNA-binding proteins from sequence information. The dataset and standalone version of the DNA-Prot software can be obtained from http://www3.ntu.edu.sg/home/EPNSugan/index_files/dnaprot.htm.


Subjects
Algorithms , DNA-Binding Proteins/analysis , Databases, Protein , Amino Acids/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Hydrophobic and Hydrophilic Interactions , Reproducibility of Results
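Entry 12 mentions amino acid composition as one of the sequence-derived inputs used by DNAbinder in the comparison. As a rough illustration of that kind of feature, the sketch below turns a protein sequence into a 20-dimensional composition vector. The abstract does not list the features DNA-Prot itself uses, so the function name, the residue ordering, and the handling of non-standard residues are all assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical by one-letter code

def amino_acid_composition(sequence):
    """Return the fraction of each standard residue, in AMINO_ACIDS order.

    Non-standard symbols (e.g. X, B, Z) are ignored (an assumption of this sketch).
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Hypothetical six-residue peptide: 2 x A, 2 x K, 1 x L, 1 x M.
features = amino_acid_composition("MKKLAA")
print(dict(zip(AMINO_ACIDS, features))["K"])
```

A vector like this could be fed to any standard classifier, but it discards residue order, which is one reason richer feature sets are often used alongside it.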
13.
Biochem Biophys Res Commun ; 367(3): 630-4, 2008 Mar 14.
Article in English | MEDLINE | ID: mdl-18206645

ABSTRACT

Identification of catalytic residues can provide valuable insights into protein function. With the increasing number of protein 3D structures having been solved by X-ray crystallography and NMR techniques, it is highly desirable to develop an efficient method to identify their catalytic sites. In this paper, we present an SVM method for the identification of catalytic residues using sequence and structural features. The algorithm was applied to the 2096 catalytic residues derived from Catalytic Site Atlas database. We obtained overall prediction accuracy of 88.6% from 10-fold cross validation and 95.76% from resubstitution test. Testing on the 254 catalytic residues shows our method can correctly predict all 254 residues. This result suggests the usefulness of our approach for facilitating the identification of catalytic residues from protein structures.


Subjects
Algorithms , Catalytic Domain , Computational Biology/methods , Computer Simulation , Databases, Protein , Proteins/chemistry , Predictive Value of Tests , Protein Conformation , Reproducibility of Results
14.
Bioinformatics ; 23(5): 637-8, 2007 Mar 01.
Article in English | MEDLINE | ID: mdl-17237055

ABSTRACT

SMotif is a server that identifies important structural segments or motifs for a given protein structure(s) based on conservation of both sequence and important structural features such as solvent inaccessibility, secondary structural content, hydrogen bonding pattern and residue packing. This server also provides three-dimensional orientation patterns of the identified motifs in terms of inter-motif distances and torsion angles. These motifs may form the common core and can therefore also be employed to design and rationalize protein engineering and folding experiments. AVAILABILITY: The SMotif server is available via the URL http://caps.ncbs.res.in/SMotif/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Amino Acid Motifs , Software , Conserved Sequence , Databases, Protein , Hydrogen Bonding , Internet , Protein Folding , Protein Structure, Secondary , Proteins/chemistry , User-Computer Interface
15.
J Theor Biol ; 253(2): 375-80, 2008 Jul 21.
Article in English | MEDLINE | ID: mdl-18423492

ABSTRACT

Determination of protein structural class solely from sequence information is a challenging task. Several attempts to solve this problem using various methods can be found in the literature. We present a support vector machine (SVM) approach in which probability-based decisions are used along with class-wise optimized feature sets. This approach has two characteristics that distinguish it from earlier attempts: (1) it uses class-wise optimized features and (2) decisions of different SVM classifiers are coupled with probability estimates to make the final prediction. The algorithm was tested on three datasets, containing 498 domains, 1092 domains and 5261 domains. Ten-fold external cross-validation was performed to assess the performance of the algorithm. A significantly high accuracy of 92.89% was obtained for the 498-dataset. We achieved 54.67% accuracy for the dataset with 1092 domains, which is better than the previously reported best accuracy of 53.8%. We obtained 59.43% prediction accuracy for the larger and less redundant 5261-dataset. We also investigated the advantage of using class-wise features over the union of these features (the conventional approach) in a one-vs.-all SVM framework. Our results clearly show the advantage of using class-wise optimized features. Brief analysis of the selected class-wise features indicates their biological significance.


Subjects
Protein Conformation , Sequence Analysis, Protein/methods , Algorithms , Animals , Computational Biology/methods , Databases, Protein , Pattern Recognition, Automated/methods
16.
Nucleic Acids Res ; 34(Database issue): D285-6, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381866

ABSTRACT

Realization of conserved residues that represent a protein family is crucial for clearer understanding of biological function as well as for the better recognition of additional members in sequence databases. Functionally important residues are recognized well due to their high degree of conservation in closely related sequences and are annotated in functional motif databases. Structural motifs are central to the integrity of the fold and require careful analysis for their identification. We report the availability of a database of spatially interacting motifs in single protein structures as well as those among distantly related protein structures that belong to a superfamily. Spatial interactions amongst conserved motifs are automatically measured using sequence similarity scores and distance calculations. Interactions between pairs of conserved motifs are described in the form of pseudoenergies. iMOTdb database provides information for 854,488 motifs corresponding to 60,849 protein structural domains and 22,648 protein structural entries.


Subjects
Amino Acid Motifs , Databases, Protein , Conserved Sequence , Internet , Sequence Analysis, Protein , User-Computer Interface
17.
BMC Bioinformatics ; 8: 351, 2007 Sep 19.
Article in English | MEDLINE | ID: mdl-17880712

ABSTRACT

BACKGROUND: Odorant binding proteins (OBPs) are believed to shuttle odorants from the environment to the underlying odorant receptors, for which they could potentially serve as odorant presenters. Although several sequence based search methods have been exploited for protein family prediction, less effort has been devoted to the prediction of OBPs from sequence data and this area is more challenging due to poor sequence identity between these proteins. RESULTS: In this paper, we propose a new algorithm that uses Regularized Least Squares Classifier (RLSC) in conjunction with multiple physicochemical properties of amino acids to predict odorant-binding proteins. The algorithm was applied to the dataset derived from Pfam and GenDiS database and we obtained overall prediction accuracy of 97.7% (94.5% and 98.4% for positive and negative classes respectively). CONCLUSION: Our study suggests that RLSC is potentially useful for predicting the odorant binding proteins from sequence-derived properties irrespective of sequence similarity. Our method predicts 92.8% of 56 odorant binding proteins non-homologous to any protein in the swissprot database and 97.1% of the 414 independent dataset proteins, suggesting the usefulness of RLSC method for facilitating the prediction of odorant binding proteins from sequence information.


Subjects
Algorithms , Artificial Intelligence , Models, Chemical , Pattern Recognition, Automated/methods , Protein Interaction Mapping/methods , Receptors, Odorant/chemistry , Sequence Analysis, Protein/methods , Amino Acid Sequence , Binding Sites , Computer Simulation , Molecular Sequence Data , Protein Binding
18.
Nucleic Acids Res ; 33(Web Server issue): W130-2, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980441

ABSTRACT

DIAL is a web server for the automatic identification of structural domains given the 3D coordinates of a protein. Delineation of the structural domains and their exact boundaries are the starting points for the better realization of distantly related members of the domain families, for the rational design of the experiments and for clearer understanding of the biological function. The current server can examine crystallographic multiple chains and provide structural domain solutions that can also describe domain swapping events. The server can be accessed from http://www.ncbs.res.in/~faculty/mini/DIAL/home.html. The Supplementary data can be accessed from http://www.ncbs.res.in/~faculty/mini/DIAL/supplement.html.


Subjects
Protein Structure, Tertiary , Software , Amino Acid Motifs , Amino Acids/chemistry , Crystallography, X-Ray , Internet , Models, Molecular , Protein Structure, Secondary , Proteins/chemistry
19.
Nucleic Acids Res ; 33(Database issue): D252-5, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608190

ABSTRACT

Several proteins that have substantially diverged during evolution retain similar three-dimensional structures and biological function in spite of poor sequence identity. The database on Genomic Distribution of protein structural domain Superfamilies (GenDiS) provides a record of the distribution of 4001 protein domains organized as 1194 structural superfamilies across 18,997 genomes at various levels of hierarchy in taxonomy. The GenDiS database provides a survey of protein domains listed in sequence databases, employing a 3-fold sequence search approach. Lineage-specific literature is obtained from the taxonomy database for individual protein members to provide a platform for performing genomic and phyletic studies across organisms. The database documents residual properties and provides alignments for the various superfamily members in genomes, offering insights into the rational design of experiments and a better understanding of a superfamily. The GenDiS database can be accessed at http://www.ncbs.res.in/~faculty/mini/gendis/home.html.


Subjects
Databases, Protein , Genomics , Protein Structure, Tertiary , Proteins/classification , Proteins/genetics , Sequence Analysis, Protein , Sequence Homology, Amino Acid , Software , User-Computer Interface
20.
Nucleic Acids Res ; 33(Web Server issue): W274-6, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980468

ABSTRACT

Establishment of similarities between proteins is very important for the study of the relationship between sequence, structure and function and for the analysis of evolutionary relationships. Motif-based search methods play a crucial role in establishing the connections between proteins that are particularly useful for distant relationships. This paper reports SCANMOT, a web-based server that searches for similarities between proteins by simultaneous matching of multiple motifs. SCANMOT searches for similar sequences in entire sequence databases using multiple conserved regions and utilizes inter-motif spacing as restraints. The SCANMOT server is available via http://www.ncbs.res.in/~faculty/mini/scanmot/scanmot.html.


Subjects
Amino Acid Motifs , Sequence Analysis, Protein/methods , Software , Algorithms , Amino Acid Sequence , Conserved Sequence , Databases, Protein , Evolution, Molecular , Internet , Sequence Alignment