Results 1 - 5 of 5
1.
J Mol Biol ; 304(4): 599-619, 2000 Dec 08.
Article in English | MEDLINE | ID: mdl-11099383

ABSTRACT

A method is described in which proteins that match PROSITE patterns are filtered by the root-mean-square deviation of the local 3D structures of the probe and target over the pattern components. This was found to increase the discrimination between true and false members of the protein family, but the gain depended on how unique the structural features in the pattern were compared to equivalent fragments extracted from the structure databank (for example, if the pattern fell in an alpha-helix, discrimination was poor). We then generalised the sequence patterns (by widening the range of amino acid residues allowed at each position) and monitored how well the structural information helped retain specificity. While the discrimination of the pure sequence pattern had generally disappeared at information content values less than ten bits, the discrimination of the combined sequence-structure probe remained high at this point before following a similar decay. The displacement between these curves indicates that the structural component is, on average, equivalent to about ten bits. The sequence patterns were also filtered using the structure comparison program SAP, giving a global, rather than local, "view" of the proteins. This allowed the information content of the sequence patterns to be reduced even further, but raised the problem of whether some proteins encountered with the same fold but no PROSITE pattern should constitute family members.
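To give a feel for the "ten bits" scale, the sketch below computes one common measure of pattern information content. It assumes a uniform background over the 20 amino acids, so a position that allows k residues contributes log2(20/k) bits. The abstract does not state which measure the authors used, and the function names `position_bits` and `pattern_bits` are hypothetical.

```python
import math

def position_bits(allowed: str, alphabet_size: int = 20) -> float:
    """Bits contributed by one pattern position that allows the residues in
    `allowed`, assuming a uniform background (an assumption of this sketch)."""
    return math.log2(alphabet_size / len(set(allowed)))

def pattern_bits(positions) -> float:
    """Total information content of a pattern given as a list of allowed-residue strings."""
    return sum(position_bits(p) for p in positions)

ANY = "ACDEFGHIKLMNPQRSTVWY"  # wildcard position: all 20 residues allowed, 0 bits

# Widening a position from one residue to two costs exactly one bit.
strict = pattern_bits(["C", ANY, ANY, "C", "H"])
relaxed = pattern_bits(["C", ANY, ANY, "CS", "H"])
print(f"strict pattern:  {strict:.2f} bits")
print(f"relaxed pattern: {relaxed:.2f} bits")
```

Under this measure a fully conserved position carries about 4.32 bits, so ten bits corresponds to little more than two conserved positions.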


Subjects
Databases as Topic , Pattern Recognition, Automated , Proteins/chemistry , Sequence Alignment , Amino Acid Motifs , Amino Acid Sequence , Cytochrome c Group/chemistry , Endopeptidases/chemistry , Epidermal Growth Factor/chemistry , Kringles , Molecular Sequence Data , Sensitivity and Specificity , Sequence Homology, Amino Acid , Software
2.
Comput Chem ; 24(1): 3-12, 2000 Jan.
Article in English | MEDLINE | ID: mdl-10642876

ABSTRACT

A method of multiple sequence alignment is described based on the double dynamic programming (DDP) algorithm previously used for treating structural constraints encountered in structure comparison and threading. Following these applications, the inconsistencies that emerge when trying to combine pair-wise alignments into a multiple alignment are reconciled by summing all the possibly inconsistent paths (low-level alignments) into a matrix, which is then used to provide a final (high-level) alignment. This process is applied to all sequence pairs and the pair-wise results are combined in a simple multiple sequence alignment program. From this alignment, further constraints are selected to bias the low-level alignments in the DDP algorithm, and the process is iterated. The results, however, showed that this overall iteration was not needed: a single pass gave results at least as good as the 'standard' progressive method of multiple sequence alignment. Further applications of the method are discussed.
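The high-level step described above can be illustrated with a much-simplified sketch. It takes the low-level alignment paths as given inputs (lists of aligned index pairs), sums them into a matrix, and extracts the maximum-weight monotonic path by ordinary dynamic programming without gap penalties. This is not the authors' implementation: how the low-level paths are generated, weighted, and penalised is not stated in the abstract, and the names `sum_paths` and `best_path` are hypothetical.

```python
def sum_paths(paths, n, m):
    """Accumulate aligned (i, j) pairs from several, possibly inconsistent,
    low-level paths into an n x m count matrix."""
    total = [[0] * m for _ in range(n)]
    for path in paths:
        for i, j in path:
            total[i][j] += 1
    return total

def best_path(total):
    """High-level alignment: the maximum-weight monotonic path through the
    summed matrix (no gap penalty, an assumption of this sketch)."""
    n, m = len(total), len(total[0])
    # best[i][j] holds the best score for the first i rows and first j columns.
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + total[i - 1][j - 1],
                             best[i - 1][j],
                             best[i][j - 1])
    # Trace back from the corner, recording only cells that carry weight.
    i, j, pairs = n, m, []
    while i > 0 and j > 0:
        cell = total[i - 1][j - 1]
        if cell > 0 and best[i][j] == best[i - 1][j - 1] + cell:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]

# Three toy low-level paths over two length-3 sequences; the second disagrees.
paths = [[(0, 0), (1, 1), (2, 2)],
         [(0, 0), (1, 2), (2, 1)],
         [(0, 0), (1, 1), (2, 2)]]
consensus = best_path(sum_paths(paths, 3, 3))
print(consensus)
```

In this toy case the two agreeing paths outvote the inconsistent one, which is the reconciling effect the summed matrix is meant to provide.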


Subjects
Sequence Alignment , Software , Algorithms , Flavodoxin/chemistry , Globins/chemistry , Proteins/chemistry
3.
J Comput Biol ; 7(5): 685-716, 2000.
Article in English | MEDLINE | ID: mdl-11153094

ABSTRACT

This article investigates aspects of pairwise and multiple structure comparison, and the problem of automatically discovering common patterns in a set of structures. Representations of structures and patterns are described, as well as scoring schemes and algorithms for comparison and discovery. A framework and nomenclature are developed for classifying different methods, and many of these are reviewed and placed into this framework.


Subjects
Algorithms , Proteins/chemistry , Cluster Analysis , Databases, Factual , Molecular Structure , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data
4.
Proteins ; 34(2): 206-19, 1999 Feb 01.
Article in English | MEDLINE | ID: mdl-10022356

ABSTRACT

We present a language for describing structural patterns of residues in protein structures and a method for the discovery of such patterns that recur in a set of protein structures. The patterns impose restrictions on the spatial position of each residue, their order along the amino acid chain, and which amino acids are allowed in each position. Unlike other methods for comparing sets of protein structures, our method is not based on the use of pairwise structure comparisons, which are often time-consuming and can produce inconsistent results. Instead, the method simultaneously takes into account information from all structures in the search for conserved structure patterns, which are potential structure motifs. The method is based on describing the spatial neighborhoods of each residue in each structure as a string and applying a sequence pattern discovery method to find patterns common to subsets of these strings. Finally, it is checked whether the similarities between the neighborhood strings correspond to spatially similar substructures. We apply the method to analyze sets of very disparate proteins from four different protein families: serine proteases, cuprodoxins, cysteine proteinases, and ferredoxins. The motifs found by the method correspond well to the site and motif information given in the annotation of these proteins in PDB, Swiss-Prot, and PROSITE. Furthermore, the motifs are confirmed by using the motif data to constrain the structural alignment of the proteins obtained with the program SAP. This gave the best superposition/alignment of the proteins given the motif assignment.
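The idea of encoding each residue's spatial neighborhood as a string can be sketched as follows. The sketch assumes one coordinate per residue (for example a C-alpha position), a fixed distance cutoff, and neighbors written in chain order; the abstract does not give the actual encoding or cutoff, so the `radius` value and the function name `neighborhood_strings` are hypothetical.

```python
import math

def neighborhood_strings(sequence, coords, radius=10.0):
    """For each residue, return the one-letter codes of all residues whose
    coordinates lie within `radius` of it, listed in chain order.
    The residue itself is included because its distance to itself is zero."""
    strings = []
    for centre in coords:
        members = [sequence[j]
                   for j, point in enumerate(coords)
                   if math.dist(centre, point) <= radius]
        strings.append("".join(members))
    return strings

# Toy chain of four residues placed along the x axis (coordinates invented).
toy_sequence = "ACDE"
toy_coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_strings = neighborhood_strings(toy_sequence, toy_coords, radius=5.0)
print(toy_strings)
```

Strings produced this way could then be passed to any sequence pattern discovery method, after which matches still need the spatial check the abstract describes, since similar strings do not guarantee similar 3D arrangements.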


Subjects
Algorithms , Models, Molecular , Protein Conformation , Azurin/chemistry , Cysteine Endopeptidases/chemistry , Databases, Factual , Eukaryotic Cells/chemistry , Ferredoxins/chemistry , Plastocyanin/chemistry , Prokaryotic Cells/chemistry , Serine Endopeptidases/chemistry
5.
J Comput Biol ; 5(2): 279-305, 1998.
Article in English | MEDLINE | ID: mdl-9672833

ABSTRACT

This paper surveys approaches to the discovery of patterns in biosequences and places these approaches within a formal framework that systematises the types of patterns and the discovery algorithms. Patterns with expressive power in the class of regular languages are considered, and a classification of pattern languages in this class is developed, covering the patterns that are the most frequently used in molecular bioinformatics. A formulation is given of the problem of the automatic discovery of such patterns from a set of sequences, and an analysis is presented of the ways in which an assessment can be made of the significance of the discovered patterns. It is shown that the problem is related to problems studied in the field of machine learning. The major part of this paper comprises a review of a number of existing methods developed to solve the problem and how these relate to each other, focusing on the algorithms underlying the approaches. A comparison is given of the algorithms, and examples are given of patterns that have been discovered using the different methods.
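As an illustration of a pattern language within the class of regular languages, the sketch below translates a PROSITE-style pattern into a Python regular expression. It covers only the common subset of the syntax (`x` wildcards, `[...]` allowed sets, `{...}` excluded sets, `(n)` and `(n,m)` repeats, and `<` / `>` anchors). The function name `prosite_to_regex` and the example pattern are illustrative and are not taken from the paper.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (common subset only) into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchor at the sequence start
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchor at the sequence end
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        repeat = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if repeat:
            core, count = repeat.group(1), "{" + repeat.group(2) + "}"
        else:
            core, count = element, ""
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        parts.append(prefix + core + count + suffix)
    return "".join(parts)

example = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
regex = prosite_to_regex(example)
print(regex)

# A made-up sequence built to satisfy the pattern.
sequence = "AA" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H"
print(bool(re.search(regex, sequence)))
```

Because every construct maps onto a regular-expression operator, patterns of this kind can be matched with standard finite-automaton machinery.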


Subjects
Algorithms , Databases, Factual , Models, Theoretical , Proteins , Sequence Alignment/methods , Base Composition , Mathematical Computing , Software