Results 1 - 11 of 11
1.
BMC Bioinformatics ; 13: 286, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-23116496

ABSTRACT

BACKGROUND: Protein structures are composed of modular elements known as domains. These units are used and reused throughout nature, and each usually serves a particular function in the structure. It is therefore useful to be able to break up a protein of interest into its component domains, for example prior to similarity searching. Numerous computational methods exist for doing so, but most operate only on a single protein chain, and many are limited to making a series of cuts to the sequence, even though domains can and do span multiple chains. RESULTS: This study presents a novel clustering-based approach to domain identification, which works equally well on individual chains or entire complexes. The method is simple and fast, taking only a few milliseconds to run, and works by clustering either vectors representing secondary structure elements, or buried alpha-carbon positions, using average-linkage clustering. Each resulting cluster corresponds to a domain of the structure. The method is competitive with others, achieving 70% agreement with SCOP on a large non-redundant data set, and 80% on a set more heavily weighted in multi-domain proteins on which both SCOP and CATH agree. CONCLUSIONS: It is encouraging that a basic method such as this performs nearly as well as, or better than, some far more complex approaches. This suggests that protein domains are, for the most part, simply compact regions of structure with a higher density of buried contacts within themselves than between each other. Representing the structure as a set of points or vectors in space frees the method from artificial limitations that other approaches may depend upon.
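
The clustering step described in this abstract can be illustrated with a minimal sketch of average-linkage clustering over 3D points. This is not the authors' implementation: the function name `average_linkage`, the `cutoff` stopping threshold, and the toy coordinates are all assumptions made for illustration, and the abstract does not say how buried residues are selected or when merging stops.

```python
import math
from itertools import combinations


def average_linkage(points, cutoff):
    """Agglomerative average-linkage clustering of 3D points.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance, stopping once that distance exceeds `cutoff` (a hypothetical
    threshold, not a value taken from the paper).
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def mean_dist(a, b):
        # Average linkage: mean of all inter-cluster pairwise distances.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, i, j = min(
            (mean_dist(clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if d > cutoff:
            break
        # j > i always holds, so deleting j leaves index i valid.
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


# Toy coordinates (made up): two compact groups about 20 units apart.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (20, 0, 0), (21, 0, 0), (20, 1, 0)]
domains = average_linkage(points, cutoff=8.0)
print(domains)
```

In this toy case each resulting cluster plays the role of one domain. The naive search over all cluster pairs is adequate for a sketch but would need a distance-matrix update scheme for large structures.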


Subjects
Algorithms, Computational Biology/methods, Tertiary Protein Structure, Proteins/chemistry, Protein Sequence Analysis/methods, Protein Sequence Analysis/statistics & numerical data, Cluster Analysis, Secondary Protein Structure
2.
J Chem Inf Model ; 50(8): 1466-75, 2010 Aug 23.
Article in English | MEDLINE | ID: mdl-20690656

ABSTRACT

A novel method for measuring protein pocket similarity was devised, using only the alpha-carbon (Cα) positions of the pocket residues. Pockets were compared pairwise using an exhaustive three-dimensional Cα common subset search, grouping residues by physicochemical properties. At least five Cα matches were required for each hit, and distances between corresponding points were fitted to an extreme value distribution, yielding a probabilistic score or likelihood for any given superposition. A set of 85 structures from 13 diverse protein families was clustered based on binding sites alone, using this score. It was also successfully used to cluster 25 kinases into a number of subfamilies. Using a test kinase query to retrieve other kinase pockets, it was found that a specificity of 99.2% and sensitivity of 97.5% could be achieved using an appropriate cutoff score. Searching the entire Protein Data Bank (133 800 pockets) took from 2 to 10 min on a single 3.4 GHz CPU, depending on the number of hits returned.
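
As a rough illustration of turning a raw value into a probabilistic score with an extreme value distribution, the sketch below fits a Gumbel (maximum) distribution by the method of moments and evaluates its upper tail. The abstract does not state the fitting procedure, the form of the distribution, or which tail is used, so all of these are assumptions, as are the function names and the made-up `background` values.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(samples):
    """Method-of-moments fit of a Gumbel (maximum) distribution.

    Uses mean = mu + gamma * beta and stdev = beta * pi / sqrt(6).
    This is an assumed fitting procedure, not the one used in the paper.
    """
    beta = statistics.stdev(samples) * math.sqrt(6) / math.pi
    mu = statistics.mean(samples) - EULER_GAMMA * beta
    return mu, beta


def tail_probability(x, mu, beta):
    """P(X >= x) under a Gumbel (maximum) distribution."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / beta))


# Made-up background values standing in for scores of unrelated pockets.
background = [2.1, 2.4, 2.2, 2.9, 3.5, 2.6, 2.3, 4.0, 2.7, 3.1]
mu, beta = fit_gumbel(background)
p_typical = tail_probability(2.5, mu, beta)
p_extreme = tail_probability(6.0, mu, beta)
print(f"mu={mu:.3f} beta={beta:.3f}")
print(f"P(X>=2.5)={p_typical:.3g}  P(X>=6.0)={p_extreme:.3g}")
```

A value far in the tail of the background distribution receives a small probability, which is the general idea behind using a cutoff on such a score to separate true hits from chance matches.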


Subjects
Carbon/chemistry, Proteins/chemistry, Binding Sites, Chorismate Mutase/chemistry, Cluster Analysis, Protein Databases, Humans, Ligands, Molecular Models, Protein Conformation, Protein Kinases/chemistry, Tyrosine-tRNA Ligase/chemistry
4.
BMC Bioinformatics ; 7: 152, 2006 Mar 17.
Article in English | MEDLINE | ID: mdl-16545112

ABSTRACT

BACKGROUND: Accurate small molecule binding site information for a protein can facilitate studies in drug docking, drug discovery and function prediction, but small molecule binding site protein sequence annotation is sparse. The Small Molecule Interaction Database (SMID), a database of protein domain-small molecule interactions, was created using structural data from the Protein Data Bank (PDB). More importantly, it provides a means to predict small molecule binding sites on proteins with a known or unknown structure and, unlike prior approaches, removes large numbers of false positive hits arising from transitive alignment errors, non-biologically significant small molecules and crystallographic conditions that overpredict ion binding sites. DESCRIPTION: Using a set of co-crystallized protein-small molecule structures as a starting point, SMID interactions were generated by identifying protein domains that bind to small molecules, using NCBI's Reverse Position Specific BLAST (RPS-BLAST) algorithm. SMID records are available for viewing at http://smid.blueprint.org. The SMID-BLAST tool provides accurate transitive annotation of small-molecule binding sites for proteins not found in the PDB. Given a protein sequence, SMID-BLAST identifies domains using RPS-BLAST and then lists potential small molecule ligands based on SMID records, as well as their aligned binding sites. A heuristic ligand score is calculated based on E-value, ligand residue identity and domain entropy to assign a level of confidence to hits found. SMID-BLAST predictions were validated against a set of 793 experimental small molecule interactions from the PDB. For 472 of these (60%), the predicted interaction identically matched the experimental small molecule, and 344 of those had greater than 80% of the binding site residues correctly identified. Further, we estimate that 45% of predictions which were not observed in the PDB validation set may be true positives.
CONCLUSION: By focusing on protein domain-small molecule interactions, SMID is able to cluster similar interactions and detect subtle binding patterns that would not otherwise be obvious. Using SMID-BLAST, small molecule targets can be predicted for any protein sequence, with the only limitation being that the small molecule must exist in the PDB. Validation results and specific examples within illustrate that SMID-BLAST has a high degree of accuracy in terms of predicting both the small molecule ligand and binding site residue positions for a query protein.


Subjects
Protein Databases, Documentation/methods, Information Storage and Retrieval/methods, Protein Interaction Mapping/methods, Proteins/chemistry, Proteins/classification, Protein Sequence Analysis/methods, Binding Sites, Database Management Systems, Ligands, Protein Binding, Sequence Alignment/methods
5.
FEBS Lett ; 580(6): 1649-53, 2006 Mar 06.
Article in English | MEDLINE | ID: mdl-16494871

ABSTRACT

A complete set of 6300 small molecule ligands was extracted from the Protein Data Bank and deposited online in PubChem as data source 'SMID'. This set's major improvement over prior methods is the inclusion of cyclic polypeptides and branched polysaccharides, with an unambiguous nomenclature, in addition to normal monomeric ligands. Only the best available example of each ligand structure is retained, and an additional dataset is maintained containing co-ordinates for all examples of each structure. Attempts are made to correct ambiguous atomic elements and other common errors, and a perception algorithm is used to determine bond order and aromaticity when no other information is available.


Subjects
Protein Databases, Ligands, Proteins/chemistry, Molecular Structure
6.
J Mol Biol ; 350(5): 1061-73, 2005 Jul 29.
Article in English | MEDLINE | ID: mdl-15978619

ABSTRACT

The identification and annotation of protein domains provides a critical step in the accurate determination of molecular function. Both computational and experimental methods of protein structure determination may be deterred by large multi-domain proteins or flexible linker regions. Knowledge of domains and their boundaries may reduce the experimental cost of protein structure determination by allowing researchers to work on a set of smaller and possibly more successful alternatives. Current domain prediction methods often rely on sequence similarity to conserved domains and as such are ill-suited to detecting domain structure in poorly conserved or orphan proteins. We present here a simple computational method to identify protein domain linkers and their boundaries from sequence information alone. Our domain predictor, Armadillo (http://armadillo.blueprint.org), uses any amino acid index to convert a protein sequence to a smoothed numeric profile from which domains and domain boundaries may be predicted. We derived an amino acid index called the domain linker propensity index (DLI) from the amino acid composition of domain linkers using a non-redundant structure dataset. The index indicates that Pro and Gly show a propensity for linker residues while small hydrophobic residues do not. Armadillo predicts domain linker boundaries from Z-score distributions and obtains 35% sensitivity with DLI in a two-domain, single-linker dataset (within ±20 residues from linker). The combination of DLI and an entropy-based amino acid index increases the overall Armadillo sensitivity to 56% for two-domain proteins. Moreover, Armadillo achieves 37% sensitivity for multi-domain proteins, surpassing most other prediction methods. Armadillo provides a simple but effective method by which prediction of domain boundaries can be obtained with reasonable sensitivity. Armadillo should prove to be a valuable tool for rapidly delineating protein domains in poorly conserved proteins or those with no sequence neighbors. As a first-line predictor, Armadillo could also supply predictions that improve the results of domain meta-predictors.
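
The general pipeline described here (amino acid index, smoothed profile, Z-scores, candidate linker positions) can be sketched as follows. This is not Armadillo's code: `TOY_INDEX` is a made-up index that only reflects the stated tendency of Pro and Gly toward linkers and is not the published DLI, and the window size and Z-score cutoff are arbitrary assumptions.

```python
import statistics

# Toy index, NOT the published DLI: only Pro and Gly get a positive value.
TOY_INDEX = {"P": 1.0, "G": 0.8}


def smoothed_profile(seq, index, window=5):
    """Centered moving average of per-residue index values."""
    raw = [index.get(aa, 0.0) for aa in seq]
    half = window // 2
    profile = []
    for i in range(len(raw)):
        # The window is truncated at the sequence ends.
        chunk = raw[max(0, i - half): i + half + 1]
        profile.append(sum(chunk) / len(chunk))
    return profile


def z_scores(values):
    """Standardize values; returns zeros if there is no variation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def candidate_linker_positions(seq, index, window=5, z_cutoff=1.5):
    """Indices whose smoothed profile Z-score reaches the cutoff."""
    z = z_scores(smoothed_profile(seq, index, window))
    return [i for i, score in enumerate(z) if score >= z_cutoff]


# Made-up sequence: two poly-Ala stretches joined by a Pro/Gly-rich segment.
seq = "A" * 15 + "PGPGP" + "A" * 15
hits = candidate_linker_positions(seq, TOY_INDEX)
print(hits)
```

Because the index is a plain dictionary, a different index, or a combination of several, can be swapped in without changing the profile or Z-score steps.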


Subjects
Molecular Models, Proteins/chemistry, Software, Structural Homology of Proteins, Amino Acid Sequence, Conserved Sequence, Tertiary Protein Structure
7.
FEBS Lett ; 579(21): 4685-91, 2005 Aug 29.
Article in English | MEDLINE | ID: mdl-16098521

ABSTRACT

A novel chemical ontology, based on chemical functional groups assigned automatically and objectively by a computer program, was developed to categorize small molecules. It has been applied to PubChem and the small molecule interaction database to demonstrate its utility as a basic pharmacophore search system. Molecules can be compared using a semantic similarity score based on functional group assignments rather than 3D shape, which succeeds in identifying small molecules known to bind a common binding site. This ontology will serve as a powerful tool for searching chemical databases and identifying key functional groups responsible for biological activities.


Subjects
Macromolecular Substances/chemistry, Semantics, Software, Binding Sites, Factual Databases, Molecular Models, Molecular Conformation
8.
J Am Soc Echocardiogr ; 26(2): 114-25, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23261149

ABSTRACT

Atherosclerosis of the proximal branches of the aortic arch has compelling clinical implications that warrant the application of direct noninvasive detection of the disease. The prevalence of aortic arch vessel disease in an aging and at-risk community and clinical population has been underreported and undertreated despite an associated increase of all-cause and cardiovascular mortality. Intrathoracic duplex imaging has been validated as an accurate noninvasive tool to detect, characterize, and follow native aortic arch vessel disease and its sequelae and correction. Such duplex techniques are easily integrated into routine echocardiography with focused training and minimal time investment in the examination. A paucity of available resources exists across disciplines regarding ultrasonographic investigation of these supra-aortic trunk vessels, including textbooks, journal articles, seminars, and manuals. This review has been compiled to familiarize physicians and sonographers with the relevant anatomy, pathophysiology, treatment, and diagnostic duplex surveillance of aortic arch vessel disease. Illustrative cases along with clinical rationale are discussed with the intent to facilitate the integration of arch vessel duplex imaging into the scope and practice of echocardiography.


Subjects
Thoracic Aorta/diagnostic imaging, Aortic Diseases/diagnostic imaging, Echocardiography/methods, Image Enhancement/methods, Mass Screening/methods, Duplex Doppler Ultrasonography/methods, Humans
10.
Proteins ; 46(1): 8-23, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11746699

ABSTRACT

Protein structure prediction from sequence alone by "brute force" random methods is a computationally expensive problem. Estimates have suggested that it could take all the computers in the world longer than the age of the universe to compute the structure of a single 200-residue protein. Here we investigate the use of a faster version of our FOLDTRAJ probabilistic all-atom protein-structure-sampling algorithm. We have improved the method so that it is now over twenty times faster than originally reported, and capable of rapidly sampling conformational space without lattices. It uses geometrical constraints and a Lennard-Jones type potential for self-avoidance. We have also implemented a novel method to add secondary structure-prediction information to make protein-like amounts of secondary structure in sampled structures. In a set of 100,000 probabilistic conformers generated for each of 1VII, 1ENH, and 1PMC, the structures with smallest Cα RMSD from native are 3.95, 5.12, and 5.95 Å, respectively. Expanding this test to a set of 17 distinct protein folds, we find that all-helical structures are "hit" by brute force more frequently than beta or mixed structures. For small helical proteins or very small non-helical ones, this approach should have a "hit" close enough to detect with a good scoring function in a pool of several million conformers. By fitting the distribution of RMSDs from the native state of each of the 17 sets of conformers to the extreme value distribution, we are able to estimate the size of conformational space for each. With a 0.5 Å RMSD cutoff, the number of conformers is roughly 2^N, where N is the number of residues in the protein. This is smaller than previous estimates, indicating an average of only two possible conformations per residue when sterics are accounted for. Our method reduces the effective number of conformations available at each residue by probabilistic bias, without requiring any particular discretization of residue conformational space, and is the fastest method of its kind. With computer speeds doubling every 18 months and parallel and distributed computing becoming more practical, the brute force approach to protein structure prediction may yet have some hope in the near future.
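
To give a sense of scale for the estimate of roughly 2^N conformers (two conformations per residue), the short sketch below evaluates it for the 200-residue case mentioned in the abstract, together with the speedup implied by a doubling of computer speed every 18 months. The function names and the 15-year horizon are illustrative assumptions. The arithmetic only restates the abstract's figures and is not a model of FOLDTRAJ.

```python
def conformer_count(n_residues, per_residue=2):
    """Size of conformational space assuming a fixed number of states per residue."""
    return per_residue ** n_residues


def speedup_after(years, doubling_period_years=1.5):
    """Relative compute speed after `years`, doubling every 18 months."""
    return 2 ** (years / doubling_period_years)


n_conformers = conformer_count(200)
print(f"2^200 is about {n_conformers:.2e} conformers")
print(f"Speedup after 15 years: about {speedup_after(15):.0f}x")
```

Even with only two states per residue the count for a 200-residue protein is on the order of 10^60, while fifteen years of doubling yields a factor of about a thousand. This is consistent with the abstract limiting its optimism to small proteins and pools of several million conformers.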


Subjects
Algorithms, Protein Conformation, Animals, Computer Simulation, Humans, Markov Chains, Molecular Models, Statistical Models, Secondary Protein Structure
11.
Biochemistry ; 43(49): 15329-38, 2004 Dec 14.
Article in English | MEDLINE | ID: mdl-15581345

ABSTRACT

Intramolecular cross-linking of peptides by the light-sensitive compound diiodoacetamideazobenzene has been shown to permit reversible photocontrol of the helix-coil transition. Cross-linking between Cys residues spaced at i and i + 7 positions with the trans form of the linker was found to produce a decreased helix content compared to that of the non-cross-linked peptide. Photoisomerization to the cis form of the linker led to substantially higher helix content than in the non-cross-linked peptide. Detailed conformational analysis of the system leads to the conclusion that photocontrol of helix content does not involve specific interactions between the linker and the peptide. Instead, the change in peptide helix content caused by photoisomerization can be predicted by comparing the length ranges of the cis and trans forms of the linker with the expected distance distribution of the Cys attachment points in the intrinsic conformational ensemble of the peptide. The analysis presented here should help to guide the use of these and related linkers for the conformational control of a variety of peptide and protein systems.


Subjects
Light, Peptides/metabolism, Amino Acid Sequence, Azo Compounds/chemistry, Azo Compounds/metabolism, Circular Dichroism, Cross-Linking Reagents/metabolism, Isomerism, Molecular Sequence Data, Biomolecular Nuclear Magnetic Resonance, Peptides/chemical synthesis, Peptides/radiation effects, Photochemistry, Protein Conformation/radiation effects, Secondary Protein Structure/radiation effects, Thermodynamics