Results 1 - 20 of 39
1.
Nucleic Acids Res ; 32(Database issue): D351-3, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681432

ABSTRACT

All the protein sequences from plants (including Arabidopsis thaliana) available from SwissProt/TrEMBL have been the subject of an all-by-all systematic comparison and grouped into clusters of related proteins. Within each cluster, the sequences have been submitted to pyramidal classification; in the case where two or several subfamilies have been grouped together, the pyramidal tree helps in finding which sequences make the links between subfamilies. In addition, the 'domains' that are common to two or more sequences within a cluster were determined and displayed à la ProDom. The resulting graphical representations proved to be quite efficient in pinpointing those protein sequences suffering from a probable error in the annotation of their genes. The clusters can be searched through various criteria and their pyramidal classifications and their domain representations can be displayed by querying http://genoplante-info.infobiogen.fr/phytoprot. The user can also launch a BLAST search of a query sequence against all the clusters.


Subjects
Computational Biology , Databases, Protein , Plant Proteins/classification , Proteome , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/classification , Cluster Analysis , Internet , Plant Proteins/chemistry , Protein Structure, Tertiary , Proteome/chemistry , Proteome/classification , Proteomics
2.
J Mol Biol ; 216(2): 411-24, 1990 Nov 20.
Article in English | MEDLINE | ID: mdl-2254937

ABSTRACT

The crystal structure of the tryptic fragment of the methionyl-tRNA synthetase from Escherichia coli, complexed with ATP, has been refined to a crystallographic R-factor of 0.220, at 2.5 A resolution (for 4433 protein atoms). In the last stages of the refinement, the simulated annealing refinement method was fully applied, contributing to a drastic improvement of the model and the identification of the missing atoms. In the final model, the root-mean-square deviation from ideality for bond distances is 0.021 A and for angle distances is 0.054 A. The position of the zinc ion has been confirmed and is located near the active site. The tryptic fragment is composed of two globular domains. The first domain, from the N terminus to Thr360, contains a nucleotide-binding fold into which two long polypeptides of 101 and 70 residues are inserted. The nucleotide-binding fold is strengthened by the presence of the zinc ion in the vicinity of the active site. The second domain, up to Pro526, is mainly alpha-helical. The C-terminal polypeptide, Phe527 to Lys551, folds back towards the first domain, making a link between the two domains. The heptapeptide 528-534 partly shapes a deep cavity that plunges into the central core of the nucleotide-binding fold, where the ATP molecule is located. The adenine ring, deeply buried in the bottom of the cleft, is blocked between the first helix HA, and the strands A and D of the beta-sheet and makes no polar interaction with the enzyme. The 2' and 3' hydroxyl groups of the ribose, whose conformation is C2' endo, interact with the main-chain carbonyl oxygen atoms of Ile231 and Glu241, respectively. The side-chain nitrogen atom of Lys142 is at hydrogen-bonding distance from the ring oxygen O-4' of the ribose. 
One of the alpha-phosphate oxygen atoms and one of the gamma-phosphate oxygen atoms interact with the imidazole ring of His21, which is well conserved in many of the known synthetases; this indicates a possible crucial role for this residue in binding ATP. The beta-phosphate group is linked to the main-chain carbonyl oxygen atom of Tyr15 through an intermediate water molecule. The gamma-phosphate group interacts with the carbonyl oxygen atom and the side-chain of Asn17. (ABSTRACT TRUNCATED AT 400 WORDS)


Subjects
Adenosine Triphosphate/metabolism , Escherichia coli/enzymology , Methionine-tRNA Ligase/metabolism , Computer Simulation , Methionine-tRNA Ligase/chemistry , Models, Molecular , Protein Binding , Protein Conformation , Thermodynamics , X-Ray Diffraction , Zinc/metabolism
3.
J Mol Biol ; 250(2): 123-7, 1995 Jul 07.
Article in English | MEDLINE | ID: mdl-7608964

ABSTRACT

The availability of specialized sequence databanks for Escherichia coli, Saccharomyces cerevisiae and Bacillus subtilis made it possible to build a set of 105 protein-coding genes that are homologous in these three species. An analysis of the triplets at both the nucleotide and amino acid level revealed that the codon bias of some amino acids is significantly higher at conserved than at non-conserved positions. Comparisons of homologous genes in E. coli and Salmonella typhimurium, and in S. cerevisiae and Drosophila melanogaster, led to the same conclusion. A special case was made for serine in E. coli, whose major codon is AGC for non-conserved and TCC for conserved residues. We interpret this observation as evidence that the primordial codons for serine were TCN, while codons AGY appeared later. This conclusion is substantiated by an analysis of the codon usage of catalytic serine residues in ancient, ubiquitous and essential proteins (ATP synthases and topoisomerases). It is shown that in these proteins the proportion of the catalytic serine residues coded by TCN is significantly higher than the one expected from the overall codon usage of serine residues.


Subjects
Biological Evolution , Codon/genetics , Conserved Sequence/genetics , Genetic Code/genetics , Serine/genetics , Amino Acid Sequence , Bacillus subtilis/genetics , Base Sequence , Escherichia coli/genetics , Saccharomyces cerevisiae/genetics
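The comparison described in the abstract above (record 3), serine codon usage at conserved versus non-conserved positions, can be illustrated with a small counting sketch. The function names, the toy coding sequence and the set of conserved positions are all hypothetical; the paper's actual alignment and statistics are not reproduced here.

```python
from collections import Counter

# The six standard serine codons: the TCN family and the AGY family.
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}

def split_codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]

def serine_codon_usage(cds, conserved_positions):
    """Count serine codons separately at conserved and non-conserved codon positions.

    `conserved_positions` holds 0-based codon indices judged conserved; how
    conservation is decided is outside this sketch.
    """
    conserved, other = Counter(), Counter()
    for pos, codon in enumerate(split_codons(cds)):
        if codon in SERINE_CODONS:
            target = conserved if pos in conserved_positions else other
            target[codon] += 1
    return conserved, other

# Toy example (hypothetical data): codons 1 and 3 are marked as conserved.
cds = "ATGTCCAGCTCCAGC"
conserved, other = serine_codon_usage(cds, conserved_positions={1, 3})
```

In this toy input the conserved positions carry TCC and the others carry AGC, mirroring the E. coli pattern reported in the abstract only by construction.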
4.
J Mol Biol ; 204(4): 1019-29, 1988 Dec 20.
Article in English | MEDLINE | ID: mdl-3221397

ABSTRACT

Amino acid substitutions in evolutionarily related proteins have been studied from a structural point of view. We consider here that an amino acid a1 in a protein p1 has been replaced by the amino acid a2 in the structurally similar protein p2 if, after superposition of the p1 and p2 structures, the a1 and a2 C alpha atoms are no more than 1.2 A apart. Thirty-two proteins, grouped in 11 classes, have been analysed by this method. This produced 2860 amino acid pairs (substitutions), which were analysed by multi-dimensional statistical methods. The main results are as follows: (1) according to the observed exchangeability of amino acid side-chains, only four groups (strong clusters) could be delineated; (i) Ile and Val, (ii) Leu and Met, (iii) Lys, Arg and Gln, and (iv) Tyr and Phe. The other residues could not be classified. (2) The matrix of distances between amino acids, or scoring matrix, determined from this study, is different from any other published matrix. (3) Except for the distance matrices based on the chemical properties of amino acid side-chains, which can be grouped together, all other published matrices are different from one another. (4) The distance matrix determined in this study seems to be very efficient for aligning distantly related protein sequences.


Subjects
Amino Acids/metabolism , Proteins/metabolism , Amino Acid Sequence , Amino Acids/classification , Animals , Bacterial Proteins/metabolism , Biological Evolution , Humans , Molecular Sequence Data , Pattern Recognition, Automated , Statistics as Topic
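The pairing criterion in the abstract above (record 4) is simple enough to sketch: two residues count as a substitution when their C alpha atoms lie within 1.2 A of each other after superposition. The sketch below assumes the two structures are already superposed and represents each residue as a hypothetical `(name, (x, y, z))` tuple; the superposition step and the coordinates are not from the paper.

```python
import math
from collections import Counter

CUTOFF = 1.2  # angstroms, the threshold quoted in the abstract

def substitution_pairs(p1, p2, cutoff=CUTOFF):
    """Count residue pairs whose C-alpha atoms lie within `cutoff` of each other.

    p1 and p2 are lists of (residue_name, (x, y, z)) for two structures that
    are assumed to be already superposed.
    """
    counts = Counter()
    for name1, xyz1 in p1:
        for name2, xyz2 in p2:
            if math.dist(xyz1, xyz2) <= cutoff:
                counts[(name1, name2)] += 1
    return counts

# Toy coordinates (hypothetical): two close pairs and one unmatched residue.
p1 = [("ILE", (0.0, 0.0, 0.0)), ("LEU", (3.8, 0.0, 0.0))]
p2 = [("VAL", (0.5, 0.5, 0.0)), ("MET", (3.8, 1.0, 0.0)), ("GLY", (9.0, 0.0, 0.0))]
pairs = substitution_pairs(p1, p2)
```

Accumulating such counts over many protein pairs would give the raw exchange table from which a distance matrix could be derived; that later step is not shown.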
5.
J Mol Biol ; 171(4): 571-6, 1983 Dec 25.
Article in English | MEDLINE | ID: mdl-6363712

ABSTRACT

The three-dimensional structures of two aminoacyl-tRNA synthetases, the methionyl-tRNA synthetase from Escherichia coli (MetRS) and the tyrosyl-tRNA synthetase from Bacillus stearothermophilus (TyrRS), show a remarkable similarity over a span of about 140 amino acids. The region of homologous folding corresponds to a five-stranded parallel beta-sheet, including a mononucleotide-binding fold. One cysteine and two histidine residues that were found to be invariant in the amino acid sequences occupy similar places in the nucleotide-binding fold. In TyrRS, these residues are close to the adenylate binding site, and in MetRS to the Mg2+-ATP binding site.


Subjects
Amino Acyl-tRNA Synthetases , Methionine-tRNA Ligase , Tyrosine-tRNA Ligase , Amino Acid Sequence , Escherichia coli/enzymology , Geobacillus stearothermophilus/enzymology , Models, Molecular , Protein Conformation
6.
J Mol Biol ; 306(4): 863-76, 2001 Mar 02.
Article in English | MEDLINE | ID: mdl-11243794

ABSTRACT

Amino acid selection by aminoacyl-tRNA synthetases requires efficient mechanisms to avoid incorrect charging of the cognate tRNAs. A proofreading mechanism prevents Escherichia coli methionyl-tRNA synthetase (EcMet-RS) from activating in vivo L-homocysteine, a natural competitor of L-methionine recognised by the enzyme. The crystal structure of the complex between EcMet-RS and L-methionine solved at 1.8 A resolution exhibits some conspicuous differences with the recently published free enzyme structure. Thus, the methionine delta-sulphur atom replaces a water molecule H-bonded to Leu13N and Tyr260O(eta) in the free enzyme. Rearrangements of aromatic residues enable the protein to form a hydrophobic pocket around the ligand side-chain. The subsequent formation of an extended water molecule network contributes to relative displacements, up to 3 A, of several domains of the protein. The structure of this complex supports a plausible mechanism for the selection of L-methionine versus L-homocysteine and suggests the possibility of information transfer between the different functional domains of the enzyme.


Subjects
Escherichia coli/enzymology , Methionine-tRNA Ligase/chemistry , Methionine-tRNA Ligase/metabolism , Methionine/metabolism , Allosteric Regulation , Allosteric Site , Amino Acid Sequence , Binding, Competitive , Crystallization , Crystallography, X-Ray , Homocysteine/metabolism , Hydrogen Bonding , Methionine/chemistry , Models, Molecular , Molecular Sequence Data , Protein Structure, Secondary , Protein Structure, Tertiary , Sequence Alignment , Substrate Specificity , Water/chemistry , Water/metabolism
7.
DNA Res ; 4(4): 257-65, 1997 Aug 31.
Article in English | MEDLINE | ID: mdl-9405933

ABSTRACT

Analysis of the codon usage of genes coding for the structural components of the outer membrane in Escherichia coli is consistent with the requirement for high expression of these genes. Because porins (which constitute the major protein component of the outer membrane) and LPS (which constitutes the major outermost constituent of the outer membrane) are synthesized from genes displaying widely different codon usage, it is possible to investigate the origin of the outer membrane. The analysis predicts that the outer membrane might originate from a genome other than the genome coding for the major part of the cell. Such a special origin would explain, in structural terms, the likely lethality of porins if they were inadvertently inserted within the inner membrane, giving rise to the Gram-negative bacterial type, having an envelope comprising two membranes, instead of a single cytoplasmic membrane and a murein sacculus.


Subjects
Bacterial Outer Membrane Proteins/genetics , Codon , Escherichia coli/genetics , Genome, Bacterial , RNA, Transfer/genetics
8.
FEBS Lett ; 179(1): 133-7, 1985 Jan 01.
Article in English | MEDLINE | ID: mdl-3965297

ABSTRACT

The crystal structure of a small calcium-binding protein, the parvalbumin IIIf from Opsanus tau in which Tb was substituted for Ca, has been analysed by multiwavelength anomalous diffraction. Data at a resolution of 2.3 A were collected at three wavelengths near the L3 absorption edge of Tb (1.645-1.650 A), using the synchrotron radiation emitted by a storage ring and a multiwire proportional counter. The phases of the reflections were determined from this single derivative, without native data. Prior to any refinement, the resulting electron density map shows a good agreement with the model of the homologous carp parvalbumin in regions of identical amino-acid sequence.


Subjects
Muscle Proteins , Parvalbumins , Animals , Carps , Fishes , Models, Molecular , Protein Conformation , Species Specificity , X-Ray Diffraction
9.
FEBS Lett ; 446(1): 6-8, 1999 Mar 05.
Article in English | MEDLINE | ID: mdl-10100603

ABSTRACT

Poly(ADP-ribose)polymerase is a nuclear NAD-dependent enzyme and an essential nick sensor involved in cellular processes where nicking and rejoining of DNA strands are required. The inter-alpha-inhibitor family is comprised of several plasma proteins that all harbor one or more so-called heavy chains designated H1-H4. The latter originate from precursor polypeptides H1P-H4P whose upper two thirds are highly homologous. We now describe a novel protein that includes (i) a so-called BRCT domain found in many proteins involved in DNA repair, (ii) an area that is homologous to the NAD-dependent catalytic domain of poly(ADP-ribose)polymerase, (iii) an area that is homologous to the upper two thirds of precursor polypeptides H1P-H4P and (iv) a proline-rich region with a potential nuclear localization signal. This protein, now designated PH5P, points to as yet unsuspected links between poly(ADP-ribose)polymerase and the inter-alpha-inhibitor family and is likely to be involved in DNA repair.


Subjects
Alpha-Globulins/metabolism , DNA Repair , Nuclear Proteins/metabolism , Poly(ADP-ribose) Polymerases/metabolism , Alpha-Globulins/genetics , Animals , Humans
10.
Biochimie ; 78(5): 311-4, 1996.
Article in English | MEDLINE | ID: mdl-8905149

ABSTRACT

A significant proportion of coding sequences or open reading frames discovered in the course of sequencing projects do not show any similarity with other sequences deposited with the protein databanks. In such cases the search for similarities must be performed with as many comparison algorithms as possible, so as to increase the chance of finding weak relationships. A specialised parallel hardware (SAMBA) implementing the Smith & Waterman algorithm has been developed at the 'Institut de Recherche en Informatique et Systèmes Aléatoires' (IRISA). It makes it possible to scan protein databanks at a speed comparable with that of BLAST or FASTA. We report here a study performed with SAMBA on 814 orphan sequences from S. cerevisiae and compare the results with those from BLAST and FASTA.


Subjects
DNA, Fungal/genetics , Genes, Fungal , Open Reading Frames , Sequence Homology, Amino Acid , Algorithms , Amino Acid Sequence , Molecular Sequence Data , Multigene Family
11.
Biochimie ; 74(6): 571-80, 1992 Jun.
Article in English | MEDLINE | ID: mdl-1520737

ABSTRACT

A simple and efficient method is described for analyzing quantitatively multiple protein sequence alignments and finding the most conserved blocks as well as the maxima of divergence within the set of aligned sequences. It consists of calculating the mean distance and the root-mean-square distance in each column of the multiple alignment, averaging the values in a window of defined length and plotting the results as a function of the position of the window. Due attention is paid to the presence of gaps in the columns. Several examples are provided, using the sequences of several cytochromes c, serine proteases, lysozymes and globins. Two distance matrices are compared, namely the matrix derived by Gribskov and Burgess from the Dayhoff matrix, and the Risler Structural Superposition Matrix. In each case, the divergence plots effectively point to the specific residues which are known to be essential for the catalytic activity of the proteins. In addition, the regions of maximum divergence are clearly delineated. Interestingly, they are generally observed in positions immediately flanking the most conserved blocks. The method should therefore be useful for delineating the peptide segments which will be good candidates for site-directed mutagenesis and for visualizing the evolutionary constraints along homologous polypeptide chains.


Subjects
Biological Evolution , Proteins/chemistry , Proteins/classification , Sequence Alignment , Amino Acid Sequence , Animals , Molecular Sequence Data , Proteins/genetics , Sequence Alignment/methods , Sequence Homology, Nucleic Acid , Software
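The procedure in the abstract above (record 11) has three steps: compute the mean and root-mean-square pairwise distance in each alignment column, average over a window, and plot against window position. A minimal sketch follows, under two assumptions that are not from the paper: a toy 0/1 distance stands in for a real amino acid distance matrix, and pairs involving a gap are simply skipped (the abstract says gaps receive "due attention" but does not say how).

```python
import math
from itertools import combinations

def toy_distance(a, b):
    # Placeholder for a real amino acid distance matrix: 0 if identical, else 1.
    return 0.0 if a == b else 1.0

def column_stats(column, dist=toy_distance):
    """Return (mean, rms) of the pairwise distances in one alignment column.

    Pairs involving a gap ('-') are skipped; this gap rule is an assumption.
    """
    values = [dist(a, b) for a, b in combinations(column, 2)
              if a != "-" and b != "-"]
    if not values:
        return 0.0, 0.0
    mean = sum(values) / len(values)
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return mean, rms

def divergence_profile(alignment, window=3, dist=toy_distance):
    """Average the per-column mean distance over a sliding window."""
    columns = list(zip(*alignment))
    means = [column_stats(col, dist)[0] for col in columns]
    return [sum(means[i:i + window]) / window
            for i in range(len(means) - window + 1)]

# Toy alignment (hypothetical): only the fourth column varies.
alignment = ["ACDEF", "ACDQF", "AC-KF"]
profile = divergence_profile(alignment, window=3)
```

Plotting `profile` against the window start position would give a divergence plot in the sense of the abstract; low values mark conserved blocks and peaks mark divergent segments.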
12.
Biochimie ; 77(3): 194-203, 1995.
Article in English | MEDLINE | ID: mdl-7647112

ABSTRACT

The superimposable dinucleotide fold domains of MetRS, GlnRS and TyrRS define structurally equivalent amino acids which have been used to constrain the sequence alignments of the 10 class I aminoacyl-tRNA synthetases (aaRS). The conservation of those residues which have been shown to be critical in some aaRS makes it possible to predict their location and function in the other synthetases, particularly: i) a conserved negatively-charged residue which binds the alpha-amino group of the amino acid substrate; ii) conserved residues within the inserted domain bridging the two halves of the dinucleotide-binding fold; and iii) conserved residues in the second half of the fold which bind the amino acid and ATP substrate. The alignments also indicate that the class I synthetases may be partitioned into two subgroups: a) MetRS, IleRS, LeuRS, ValRS, CysRS and ArgRS; b) GlnRS, GluRS, TyrRS and TrpRS.


Subjects
Amino Acyl-tRNA Synthetases/chemistry , Sequence Alignment/classification , Amino Acid Sequence , Amino Acyl-tRNA Synthetases/classification , Escherichia coli/chemistry , Escherichia coli/enzymology , Methionine-tRNA Ligase/chemistry , Models, Chemical , Molecular Sequence Data , Protein Conformation , Sequence Homology, Amino Acid
13.
J Comput Biol ; 8(4): 381-99, 2001.
Article in English | MEDLINE | ID: mdl-11571074

ABSTRACT

We propose and study a new approach for the analysis of families of protein sequences. This method is related to the LogDet distances used in phylogenetic reconstructions; it can be viewed as an attempt to embed these distances into a multidimensional framework. The proposed method starts by associating a Markov matrix to each pairwise alignment deduced from a given multiple alignment. The central objects under consideration here are matrix-valued logarithms L of these Markov matrices, which exist under conditions that are compatible with fairly large divergence between the sequences. These logarithms allow us to compare data from a family of aligned proteins with simple models (in particular, continuous reversible Markov models) and to test the adequacy of such models. If one neglects fluctuations arising from the finite length of sequences, any continuous reversible Markov model with a single rate matrix Q over an arbitrary tree predicts that all the observed matrices L are multiples of Q. Our method exploits this fact, without relying on any tree estimation. We test this prediction on a family of proteins encoded by the mitochondrial genome of 26 multicellular animals, which include vertebrates, arthropods, echinoderms, molluscs, and nematodes. A principal component analysis of the observed matrices L shows that a single rate model can be used as a rough approximation to the data, but that systematic deviations from any such model are unmistakable and related to the evolutionary history of the species under consideration.


Subjects
Computational Biology , Proteins/genetics , Sequence Alignment/statistics & numerical data , Computer Simulation , DNA, Mitochondrial/genetics , Evolution, Molecular , Markov Chains , Phylogeny , Sequence Analysis, Protein/statistics & numerical data , Stochastic Processes
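The prediction stated in the abstract above (record 13), that every observed logarithm L is a multiple of Q, follows from the standard matrix-exponential form of a continuous-time Markov model. A short derivation is given below; the symbols P_{ij} and t_{ij} are notation assumed here for illustration and are not taken from the paper.

```latex
% P_{ij}: Markov matrix estimated from the pairwise alignment of sequences i and j
% t_{ij}: total evolutionary time along the tree path joining i and j
% Q:      the single rate matrix shared by all branches (reversible model)
P_{ij} = e^{t_{ij} Q}
\quad\Longrightarrow\quad
L_{ij} = \log P_{ij} = t_{ij}\, Q .
```

Under this model all the matrices L_{ij} lie on a single line through the origin in matrix space, which is why a principal component analysis of the observed logarithms can test the model without estimating the tree.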
14.
Comput Biol Chem ; 28(3): 211-8, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15261151

ABSTRACT

Although the characterization of proteins cannot solely rely upon sequence similarity, it has been widely proved that all-vs-all massive sequence comparisons may be an effective approach and a good basis for the prediction of biochemical functions or for the delineation of common shared properties. The program Cluster-C presented here enables a stand-alone and efficient construction of protein families within whole proteomes. The algorithm, which is based on the detection of cliques, ensures a high level of connectivity within the clusters. As opposed to the single transitive linkage method, Cluster-C allows a large number of sequences to be classified in such a way that the multidomain proteins do not produce a chain-grouping effect resulting in meaningless clusters. Moreover, some proteins can be present in several different but relevant clusters, which is of help in the determination of their functional domains. In the present analysis we used the Z-value, an evaluation of the significance of the similarity score, as the criterion for connecting sequences (the user can freely define the threshold of the similarity criterion). The clusters built with a rather low threshold (Z= 14) include more than 97% of the sequences and are consistent with known protein families and PROSITE patterns.


Subjects
Algorithms , Sequence Alignment/methods , Amino Acid Sequence/genetics , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Cluster Analysis , Computational Biology/methods , Databases, Protein , Fungal Proteins/chemistry , Fungal Proteins/genetics , Proteome/chemistry , Proteome/genetics
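The contrast drawn in the abstract above (record 14), clique-based clusters versus single transitive linkage, can be shown on a toy similarity graph. The abstract does not describe Cluster-C's internals, so the sketch below swaps in the textbook Bron-Kerbosch recursion for clique detection and plain connected components for single linkage; the graph and protein names are hypothetical.

```python
def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already-explored vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques

def connected_components(graph):
    """Single transitive linkage: any chain of edges merges sequences."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        components.append(frozenset(comp))
    return components

# Toy graph (hypothetical): C is a multidomain protein similar to two families.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("D", "E")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

cliques = maximal_cliques(graph)
components = connected_components(graph)
```

Single linkage chains all five proteins into one group through C, whereas the clique view keeps two families and places C in both, which matches the behaviour the abstract attributes to clique-based clustering.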
19.
Comput Appl Biosci ; 10(4): 453-4, 1994 Jul.
Article in English | MEDLINE | ID: mdl-7804879

ABSTRACT

Fast sequence databank search algorithms generally make use of hash tables and look for exactly matching words. An increased sensitivity--at the expense of a decreased selectivity--can be attained in the case of proteins by using a reduced amino acid alphabet. We propose here an alphabet reduced to 10 symbols, that we used in modified versions of the FASTP and SCAN programs. An application to the aminoacyl-tRNA synthetases shows that this technique may be useful in detecting distant relationships between proteins.


Subjects
Databases, Factual , Proteins/genetics , Software , Algorithms , Amino Acid Sequence , Amino Acyl-tRNA Synthetases/genetics , Escherichia coli/enzymology , Escherichia coli/genetics , Molecular Sequence Data , Oligopeptides/genetics , Sequence Alignment/methods , Terminology as Topic
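The idea in the abstract above (record 19), hashing words over a reduced alphabet so that similar but non-identical peptides still collide, can be sketched as follows. The abstract does not list the 10 symbols, so the grouping in `GROUPS` is a hypothetical one chosen only for illustration, and the functions are not the modified FASTP or SCAN code.

```python
from collections import defaultdict

# Hypothetical 10-class grouping of the 20 amino acids; NOT the paper's alphabet.
GROUPS = ["AG", "ST", "C", "P", "ILMV", "FWY", "DE", "NQ", "KR", "H"]
# Map every amino acid to the first letter of its group.
REDUCE = {aa: group[0] for group in GROUPS for aa in group}

def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)

def word_index(seq, k=3):
    """Hash table from each reduced k-letter word to its start positions."""
    index = defaultdict(list)
    reduced = reduce_sequence(seq)
    for i in range(len(reduced) - k + 1):
        index[reduced[i:i + k]].append(i)
    return index

def shared_words(query, target, k=3):
    """Return (query_pos, target_pos) for every word matching after reduction."""
    target_index = word_index(target, k)
    reduced = reduce_sequence(query)
    return [(i, j)
            for i in range(len(reduced) - k + 1)
            for j in target_index.get(reduced[i:i + k], [])]

# MKV and LRI share no exact 3-letter word, but they match once reduced.
hits = shared_words("MKV", "LRI")
```

The extra hit illustrates the trade-off named in the abstract: sensitivity rises because conservative replacements still match, and selectivity falls because more unrelated words collide.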
20.
Nucleic Acids Res ; 20(14): 3631-7, 1992 Jul 25.
Article in English | MEDLINE | ID: mdl-1641329

ABSTRACT

The present work describes an attempt to identify reliable criteria which could be used as distance indices between protein sequences. Seven different criteria have been tested: i and ii) the scores of the alignments as given by the BESTFIT and the FASTA programs; iii) the ratio parameter, i.e. the BESTFIT score divided by the length of the aligned peptides; iv and v) the statistical significance (Z-scores) of the scores calculated by BESTFIT and FASTA, as obtained by comparison with shuffled sequences; vi) the Z-scores provided by the program RELATE which performs a segment-by-segment comparison of 2 sequences, and vii) an original distance index calculated by the program DOCMA from all the pairwise dotplots between the sequences. These 7 criteria have been tested against the amino acid sequences of 39 globins and those of the 20 aminoacyl-tRNA synthetases from E. coli. The distances between the sequences were analyzed by multivariate analysis techniques. The results show that the distances calculated from the scores of the pairwise alignments are not adequately sensitive. The Z-score from RELATE is not selective enough and too demanding in computer time. Three criteria gave a classification consistent with the known similarities between the sequences in the sets, namely the Z-scores from BESTFIT and FASTA and the multiple dotplot comparison distance index from DOCMA.


Subjects
Amino Acyl-tRNA Synthetases/classification , Globins/classification , Sequence Alignment/classification , Algorithms , Amino Acyl-tRNA Synthetases/chemistry , Globins/chemistry , Multivariate Analysis , Sequence Alignment/statistics & numerical data , Software
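Criteria iv and v in the abstract above (record 20) are Z-scores obtained by comparison with shuffled sequences. The general recipe can be sketched as below, assuming a toy ungapped identity count in place of the BESTFIT or FASTA alignment score; the function names, the number of shuffles and the seed are all illustrative choices, not values from the paper.

```python
import random
import statistics

def identity_score(a, b):
    # Toy stand-in for an alignment score: identical residues at equal positions.
    return sum(x == y for x, y in zip(a, b))

def z_score(a, b, score=identity_score, n_shuffles=200, seed=0):
    """Z-score of score(a, b) against scores obtained after shuffling b.

    Shuffling preserves the composition of b while destroying its order.
    """
    rng = random.Random(seed)
    observed = score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        null_scores.append(score(a, "".join(letters)))
    mean = statistics.mean(null_scores)
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        return 0.0  # degenerate null distribution: no meaningful Z-score
    return (observed - mean) / sd

# A sequence compared with itself should score far above its shuffled versions.
seq = "ACDEFGHIKLMNPQRSTVWY"
z_self = z_score(seq, seq)
```

Because the Z-score is normalised by the spread of the shuffled scores, it is less dependent on sequence length and composition than the raw score, which is one plausible reason it served better as a distance index in the study.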