1.
Nucleic Acids Res ; 32(Database issue): D351-3, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681432

ABSTRACT

All the protein sequences from plants (including Arabidopsis thaliana) available from SwissProt/TrEMBL have been the subject of an all-by-all systematic comparison and grouped into clusters of related proteins. Within each cluster, the sequences have been submitted to pyramidal classification; in the case where two or several subfamilies have been grouped together, the pyramidal tree helps in finding which sequences make the links between subfamilies. In addition, the 'domains' that are common to two or more sequences within a cluster were determined and displayed à la ProDom. The resulting graphical representations proved to be quite efficient in pinpointing those protein sequences suffering from a probable error in the annotation of their genes. The clusters can be searched through various criteria and their pyramidal classifications and their domain representations can be displayed by querying http://genoplante-info.infobiogen.fr/phytoprot. The user can also launch a BLAST search of a query sequence against all the clusters.


Subject(s)
Computational Biology, Protein Databases, Plant Proteins/classification, Proteome, Arabidopsis Proteins/chemistry, Arabidopsis Proteins/classification, Cluster Analysis, Internet, Plant Proteins/chemistry, Tertiary Protein Structure, Proteome/chemistry, Proteome/classification, Proteomics
2.
J Mol Biol ; 216(2): 411-24, 1990 Nov 20.
Article in English | MEDLINE | ID: mdl-2254937

ABSTRACT

The crystal structure of the tryptic fragment of the methionyl-tRNA synthetase from Escherichia coli, complexed with ATP, has been refined to a crystallographic R-factor of 0.220, at 2.5 Å resolution (for 4433 protein atoms). In the last stages of the refinement, the simulated annealing refinement method was fully applied, contributing to a drastic improvement of the model and the identification of the missing atoms. In the final model, the root-mean-square deviation from ideality for bond distances is 0.021 Å and for angle distances is 0.054 Å. The position of the zinc ion has been confirmed and is located near the active site. The tryptic fragment is composed of two globular domains. The first domain, from the N terminus to Thr360, contains a nucleotide-binding fold into which two long polypeptides of 101 and 70 residues are inserted. The nucleotide-binding fold is strengthened by the presence of the zinc ion in the vicinity of the active site. The second domain, up to Pro526, is mainly alpha-helical. The C-terminal polypeptide, Phe527 to Lys551, folds back towards the first domain, making a link between the two domains. The heptapeptide 528-534 partly shapes a deep cavity that plunges into the central core of the nucleotide-binding fold, where the ATP molecule is located. The adenine ring, deeply buried in the bottom of the cleft, is blocked between the first helix HA, and the strands A and D of the beta-sheet and makes no polar interaction with the enzyme. The 2' and 3' hydroxyl groups of the ribose, whose conformation is C2' endo, interact with the main-chain carbonyl oxygen atoms of Ile231 and Glu241, respectively. The side-chain nitrogen atom of Lys142 is at hydrogen-bonding distance from the ring oxygen O-4' of the ribose. One of the alpha-phosphate oxygen atoms and one of the gamma-phosphate oxygen atoms interact with the imidazole ring of His21, which is well conserved in many of the known synthetases; this indicates a possible crucial role for this residue in binding ATP. The beta-phosphate group is linked to the main-chain carbonyl oxygen atom of Tyr15 through an intermediate water molecule. The gamma-phosphate group interacts with the carbonyl oxygen atom and the side-chain of Asn17. (ABSTRACT TRUNCATED AT 400 WORDS)


Subject(s)
Adenosine Triphosphate/metabolism, Escherichia coli/enzymology, Methionine-tRNA Ligase/metabolism, Computer Simulation, Methionine-tRNA Ligase/chemistry, Molecular Models, Protein Binding, Protein Conformation, Thermodynamics, X-Ray Diffraction, Zinc/metabolism
3.
J Mol Biol ; 250(2): 123-7, 1995 Jul 07.
Article in English | MEDLINE | ID: mdl-7608964

ABSTRACT

The availability of specialized sequence databanks for Escherichia coli, Saccharomyces cerevisiae and Bacillus subtilis made it possible to build a set of 105 protein-coding genes that are homologous in these three species. An analysis of the triplets at both the nucleotide and amino acid level revealed that the codon bias of some amino acids is significantly higher at conserved than at non-conserved positions. Comparisons of homologous genes in E. coli and Salmonella typhimurium, and in S. cerevisiae and Drosophila melanogaster, led to the same conclusion. A special case was made for serine in E. coli, whose major codon is AGC for non-conserved and TCC for conserved residues. We interpret this observation as evidence that the primordial codons for serine were TCN, while codons AGY appeared later. This conclusion is substantiated by an analysis of the codon usage of catalytic serine residues in ancient, ubiquitous and essential proteins (ATP synthases and topoisomerases). It is shown that in these proteins the proportion of the catalytic serine residues coded by TCN is significantly higher than the one expected from the overall codon usage of serine residues.


Subject(s)
Biological Evolution, Codon/genetics, Conserved Sequence/genetics, Genetic Code/genetics, Serine/genetics, Amino Acid Sequence, Bacillus subtilis/genetics, Base Sequence, Escherichia coli/genetics, Saccharomyces cerevisiae/genetics
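
Illustrative sketch for record 3 (not from the paper): the comparison of serine codon families at conserved and non-conserved positions can be pictured as a simple tally. The function name, the input format (an in-frame DNA string plus a set of 0-based conserved codon indices) and the toy sequence are assumptions made for this example only.

```python
from collections import Counter

# The two serine codon families discussed in the abstract.
TCN = {"TCT", "TCC", "TCA", "TCG"}
AGY = {"AGT", "AGC"}

def serine_codon_families(cds, conserved_positions):
    """Tally TCN vs AGY serine codons at conserved and non-conserved codons.

    cds: in-frame coding DNA sequence (hypothetical input format).
    conserved_positions: set of 0-based codon indices judged conserved.
    """
    counts = {"conserved": Counter(), "non_conserved": Counter()}
    for i in range(len(cds) // 3):
        codon = cds[3 * i:3 * i + 3].upper()
        if codon in TCN:
            family = "TCN"
        elif codon in AGY:
            family = "AGY"
        else:
            continue  # not a serine codon
        key = "conserved" if i in conserved_positions else "non_conserved"
        counts[key][family] += 1
    return counts

# Toy sequence: ATG TCC AGC TCT AGT AAA, with codons 1 and 3 marked conserved.
toy_counts = serine_codon_families("ATGTCCAGCTCTAGTAAA", {1, 3})
```

A real analysis would need aligned homologous genes to decide which positions are conserved, and a statistical test on the resulting counts; neither is attempted here.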
4.
J Mol Biol ; 204(4): 1019-29, 1988 Dec 20.
Article in English | MEDLINE | ID: mdl-3221397

ABSTRACT

Amino acid substitutions in evolutionarily related proteins have been studied from a structural point of view. We consider here that an amino acid a1 in a protein p1 has been replaced by the amino acid a2 in the structurally similar protein p2 if, after superposition of the p1 and p2 structures, the a1 and a2 Cα atoms are no more than 1.2 Å apart. Thirty-two proteins, grouped in 11 classes, have been analysed by this method. This produced 2860 amino acid pairs (substitutions), which were analysed by multi-dimensional statistical methods. The main results are as follows: (1) according to the observed exchangeability of amino acid side-chains, only four groups (strong clusters) could be delineated; (i) Ile and Val, (ii) Leu and Met, (iii) Lys, Arg and Gln, and (iv) Tyr and Phe. The other residues could not be classified. (2) The matrix of distances between amino acids, or scoring matrix, determined from this study, is different from any other published matrix. (3) Except for the distance matrices based on the chemical properties of amino acid side-chains, which can be grouped together, all other published matrices are different from one another. (4) The distance matrix determined in this study seems to be very efficient for aligning distantly related protein sequences.


Subject(s)
Amino Acids/metabolism, Proteins/metabolism, Amino Acid Sequence, Amino Acids/classification, Animals, Bacterial Proteins/metabolism, Biological Evolution, Humans, Molecular Sequence Data, Automated Pattern Recognition, Statistics as Topic
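
Illustrative sketch for record 4 (not the authors' program): once two structures have been superposed, the substitution criterion in the abstract reduces to a distance test between Cα atoms. The superposition step itself is assumed to have been done elsewhere, and the function name, input format and coordinates below are hypothetical.

```python
import math

def equivalent_pairs(ca_p1, ca_p2, cutoff=1.2):
    """List residue pairs whose C-alpha atoms lie within `cutoff` angstroms.

    ca_p1, ca_p2: lists of (residue_name, (x, y, z)) for two structures that
    are ALREADY superposed (assumption; no fitting is performed here).
    """
    pairs = []
    for name1, xyz1 in ca_p1:
        for name2, xyz2 in ca_p2:
            if math.dist(xyz1, xyz2) <= cutoff:
                pairs.append((name1, name2))
    return pairs

# Toy coordinates, invented for the example.
p1 = [("ILE", (0.0, 0.0, 0.0)), ("LYS", (3.8, 0.0, 0.0))]
p2 = [("VAL", (0.5, 0.3, 0.0)), ("ARG", (3.8, 1.0, 0.2)), ("GLY", (10.0, 0.0, 0.0))]
toy_pairs = equivalent_pairs(p1, p2)
```

Counting such pairs over many protein families gives the raw substitution table from which a distance matrix could then be derived; that statistical step is outside this sketch.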
5.
J Mol Biol ; 306(4): 863-76, 2001 Mar 02.
Article in English | MEDLINE | ID: mdl-11243794

ABSTRACT

Amino acid selection by aminoacyl-tRNA synthetases requires efficient mechanisms to avoid incorrect charging of the cognate tRNAs. A proofreading mechanism prevents Escherichia coli methionyl-tRNA synthetase (EcMet-RS) from activating in vivo L-homocysteine, a natural competitor of L-methionine recognised by the enzyme. The crystal structure of the complex between EcMet-RS and L-methionine solved at 1.8 Å resolution exhibits some conspicuous differences with the recently published free enzyme structure. Thus, the methionine delta-sulphur atom replaces a water molecule H-bonded to Leu13N and Tyr260O(eta) in the free enzyme. Rearrangements of aromatic residues enable the protein to form a hydrophobic pocket around the ligand side-chain. The subsequent formation of an extended water molecule network contributes to relative displacements, up to 3 Å, of several domains of the protein. The structure of this complex supports a plausible mechanism for the selection of L-methionine versus L-homocysteine and suggests the possibility of information transfer between the different functional domains of the enzyme.


Subject(s)
Escherichia coli/enzymology, Methionine-tRNA Ligase/chemistry, Methionine-tRNA Ligase/metabolism, Methionine/metabolism, Allosteric Regulation, Allosteric Site, Amino Acid Sequence, Competitive Binding, Crystallization, X-Ray Crystallography, Homocysteine/metabolism, Hydrogen Bonding, Methionine/chemistry, Molecular Models, Molecular Sequence Data, Secondary Protein Structure, Tertiary Protein Structure, Sequence Alignment, Substrate Specificity, Water/chemistry, Water/metabolism
6.
J Mol Biol ; 171(4): 571-6, 1983 Dec 25.
Article in English | MEDLINE | ID: mdl-6363712

ABSTRACT

The three-dimensional structures of two aminoacyl-tRNA synthetases, the methionyl-tRNA synthetase from Escherichia coli (MetRS) and the tyrosyl-tRNA synthetase from Bacillus stearothermophilus (TyrRS), show a remarkable similarity over a span of about 140 amino acids. The region of homologous folding corresponds to a five-stranded parallel beta-sheet, including a mononucleotide-binding fold. One cysteine and two histidine residues that were found to be invariant in the amino acid sequences occupy similar places in the nucleotide-binding fold. In TyrRS, these residues are close to the adenylate binding site, and in MetRS to the Mg2+-ATP binding site.


Subject(s)
Amino Acyl-tRNA Synthetases, Methionine-tRNA Ligase, Tyrosine-tRNA Ligase, Amino Acid Sequence, Escherichia coli/enzymology, Geobacillus stearothermophilus/enzymology, Molecular Models, Protein Conformation
7.
DNA Res ; 4(4): 257-65, 1997 Aug 31.
Article in English | MEDLINE | ID: mdl-9405933

ABSTRACT

Analysis of the codon usage of genes coding for the structural components of the outer membrane in Escherichia coli is consistent with the requirement for high expression of these genes. Because porins (which constitute the major protein component of the outer membrane) and LPS (which constitutes the major outermost constituent of the outer membrane) are synthesized from genes displaying widely different codon usage, it is possible to investigate the origin of the outer membrane. The analysis predicts that the outer membrane might originate from a genome other than the genome coding for the major part of the cell. Such a special origin would explain, in structural terms, the likely lethality of porins if they were inadvertently inserted within the inner membrane, giving rise to the Gram-negative bacterial type, having an envelope comprising two membranes, instead of a single cytoplasmic membrane and a murein sacculus.


Subject(s)
Bacterial Outer Membrane Proteins/genetics, Codon, Escherichia coli/genetics, Bacterial Genome, Transfer RNA/genetics
8.
FEBS Lett ; 179(1): 133-7, 1985 Jan 01.
Article in English | MEDLINE | ID: mdl-3965297

ABSTRACT

The crystal structure of a small calcium-binding protein, the parvalbumin IIIf from Opsanus tau in which Tb was substituted for Ca, has been analysed by multiwavelength anomalous diffraction. Data at a resolution of 2.3 Å were collected at three wavelengths near the L3 absorption edge of Tb (1.645-1.650 Å), using the synchrotron radiation emitted by a storage ring and a multiwire proportional counter. The phases of the reflections were determined from this single derivative, without native data. Prior to any refinement, the resulting electron density map shows a good agreement with the model of the homologous carp parvalbumin in regions of identical amino-acid sequence.


Subject(s)
Muscle Proteins, Parvalbumins, Animals, Carps, Fishes, Molecular Models, Protein Conformation, Species Specificity, X-Ray Diffraction
9.
FEBS Lett ; 446(1): 6-8, 1999 Mar 05.
Article in English | MEDLINE | ID: mdl-10100603

ABSTRACT

Poly(ADP-ribose)polymerase is a nuclear NAD-dependent enzyme and an essential nick sensor involved in cellular processes where nicking and rejoining of DNA strands are required. The inter-alpha-inhibitor family is comprised of several plasma proteins that all harbor one or more so-called heavy chains designated H1-H4. The latter originate from precursor polypeptides H1P-H4P whose upper two thirds are highly homologous. We now describe a novel protein that includes (i) a so-called BRCT domain found in many proteins involved in DNA repair, (ii) an area that is homologous to the NAD-dependent catalytic domain of poly(ADP-ribose)polymerase, (iii) an area that is homologous to the upper two thirds of precursor polypeptides H1P-H4P and (iv) a proline-rich region with a potential nuclear localization signal. This protein now designated PH5P points to as yet unsuspected links between poly(ADP-ribose)polymerase and the inter-alpha-inhibitor family and is likely to be involved in DNA repair.


Subject(s)
Alpha-Globulins/metabolism, DNA Repair, Nuclear Proteins/metabolism, Poly(ADP-ribose) Polymerases/metabolism, Alpha-Globulins/genetics, Animals, Humans
10.
Biochimie ; 78(5): 311-4, 1996.
Article in English | MEDLINE | ID: mdl-8905149

ABSTRACT

A significant proportion of coding sequences or open reading frames discovered in the course of sequencing projects do not show any similarity with other sequences deposited with the protein databanks. In such cases the search for similarities must be performed with as many comparison algorithms as possible, so as to increase the chance of finding weak relationships. A specialised parallel hardware (SAMBA) implementing the Smith & Waterman algorithm has been developed at the 'Institut de Recherche en Informatique et Systèmes Aléatoires' (IRISA). It makes it possible to scan protein databanks at a speed comparable with that of BLAST or FASTA. We report here a study performed with SAMBA on 814 orphan sequences from S. cerevisiae and compare the results with those from BLAST and FASTA.


Subject(s)
Fungal DNA/genetics, Fungal Genes, Open Reading Frames, Amino Acid Sequence Homology, Algorithms, Amino Acid Sequence, Molecular Sequence Data, Multigene Family
11.
Biochimie ; 74(6): 571-80, 1992 Jun.
Article in English | MEDLINE | ID: mdl-1520737

ABSTRACT

A simple and efficient method is described for analyzing quantitatively multiple protein sequence alignments and finding the most conserved blocks as well as the maxima of divergence within the set of aligned sequences. It consists of calculating the mean distance and the root-mean-square distance in each column of the multiple alignment, averaging the values in a window of defined length and plotting the results as a function of the position of the window. Due attention is paid to the presence of gaps in the columns. Several examples are provided, using the sequences of several cytochromes c, serine proteases, lysozymes and globins. Two distance matrices are compared, namely the matrix derived by Gribskov and Burgess from the Dayhoff matrix, and the Risler Structural Superposition Matrix. In each case, the divergence plots effectively point to the specific residues which are known to be essential for the catalytic activity of the proteins. In addition, the regions of maximum divergence are clearly delineated. Interestingly, they are generally observed in positions immediately flanking the most conserved blocks. The method should therefore be useful for delineating the peptide segments which will be good candidates for site-directed mutagenesis and for visualizing the evolutionary constraints along homologous polypeptide chains.


Subject(s)
Biological Evolution, Proteins/chemistry, Proteins/classification, Sequence Alignment, Amino Acid Sequence, Animals, Molecular Sequence Data, Proteins/genetics, Sequence Alignment/methods, Nucleic Acid Sequence Homology, Software
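
Illustrative sketch for record 11 (not the published implementation): the procedure in the abstract computes a mean and a root-mean-square distance per alignment column, then averages over a sliding window. The abstract does not say how gaps are handled, so skipping any pair that contains a gap is an assumption here, and the 0/1 identity distance merely stands in for the real amino acid distance matrices.

```python
import math
from itertools import combinations

def column_distances(alignment, dist):
    """Return one (mean, rms) pairwise-distance tuple per alignment column.

    Pairs involving a gap ('-') are skipped; this gap rule is an assumption.
    """
    out = []
    for col in zip(*alignment):
        d = [dist(a, b) for a, b in combinations(col, 2) if a != "-" and b != "-"]
        if d:
            mean = sum(d) / len(d)
            rms = math.sqrt(sum(x * x for x in d) / len(d))
        else:
            mean = rms = 0.0
        out.append((mean, rms))
    return out

def windowed(values, width):
    """Average `values` over every window of `width` consecutive columns."""
    return [sum(values[i:i + width]) / width for i in range(len(values) - width + 1)]

def identity_distance(a, b):
    """Placeholder distance: 0 for identical residues, 1 otherwise."""
    return 0.0 if a == b else 1.0

toy_alignment = ["ACDE", "ACEE", "A-FE"]
per_column = column_distances(toy_alignment, identity_distance)
mean_profile = windowed([mean for mean, _ in per_column], 2)
```

Plotting `mean_profile` against window position would give a divergence plot in the spirit of the abstract: low values mark conserved blocks, peaks mark regions of maximum divergence.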
12.
Biochimie ; 77(3): 194-203, 1995.
Article in English | MEDLINE | ID: mdl-7647112

ABSTRACT

The superimposable dinucleotide fold domains of MetRS, GlnRS and TyrRS define structurally equivalent amino acids which have been used to constrain the sequence alignments of the 10 class I aminoacyl-tRNA synthetases (aaRS). The conservation of those residues which have been shown to be critical in some aaRS makes it possible to predict their location and function in the other synthetases, particularly: i) a conserved negatively-charged residue which binds the alpha-amino group of the amino acid substrate; ii) conserved residues within the inserted domain bridging the two halves of the dinucleotide-binding fold; and iii) conserved residues in the second half of the fold which bind the amino acid and ATP substrate. The alignments also indicate that the class I synthetases may be partitioned into two subgroups: a) MetRS, IleRS, LeuRS, ValRS, CysRS and ArgRS; b) GlnRS, GluRS, TyrRS and TrpRS.


Subject(s)
Amino Acyl-tRNA Synthetases/chemistry, Sequence Alignment/classification, Amino Acid Sequence, Amino Acyl-tRNA Synthetases/classification, Escherichia coli/chemistry, Escherichia coli/enzymology, Methionine-tRNA Ligase/chemistry, Chemical Models, Molecular Sequence Data, Protein Conformation, Amino Acid Sequence Homology
13.
J Comput Biol ; 8(4): 381-99, 2001.
Article in English | MEDLINE | ID: mdl-11571074

ABSTRACT

We propose and study a new approach for the analysis of families of protein sequences. This method is related to the LogDet distances used in phylogenetic reconstructions; it can be viewed as an attempt to embed these distances into a multidimensional framework. The proposed method starts by associating a Markov matrix to each pairwise alignment deduced from a given multiple alignment. The central objects under consideration here are matrix-valued logarithms L of these Markov matrices, which exist under conditions that are compatible with fairly large divergence between the sequences. These logarithms allow us to compare data from a family of aligned proteins with simple models (in particular, continuous reversible Markov models) and to test the adequacy of such models. If one neglects fluctuations arising from the finite length of sequences, any continuous reversible Markov model with a single rate matrix Q over an arbitrary tree predicts that all the observed matrices L are multiples of Q. Our method exploits this fact, without relying on any tree estimation. We test this prediction on a family of proteins encoded by the mitochondrial genome of 26 multicellular animals, which include vertebrates, arthropods, echinoderms, molluscs, and nematodes. A principal component analysis of the observed matrices L shows that a single rate model can be used as a rough approximation to the data, but that systematic deviations from any such model are unmistakable and related to the evolutionary history of the species under consideration.


Subject(s)
Computational Biology, Proteins/genetics, Sequence Alignment/statistics & numerical data, Computer Simulation, Mitochondrial DNA/genetics, Molecular Evolution, Markov Chains, Phylogeny, Protein Sequence Analysis/statistics & numerical data, Stochastic Processes
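
Illustrative sketch for record 13 (not the authors' code): the prediction that every observed logarithm L is a multiple of a single rate matrix Q can be checked numerically on a toy two-state model. The series-based `matrix_log` below is a simple stand-in that converges only when the Markov matrix is close enough to the identity; the two matrices are exact transition matrices of a symmetric two-state model at divergence times t and 2t, chosen for the example.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matrix_log(m, terms=200):
    """Matrix logarithm via the series sum_{k>=1} (-1)^(k+1) (M - I)^k / k.

    Valid only when the spectral radius of (M - I) is below 1 (assumption).
    """
    n = len(m)
    x = [[m[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    power = [row[:] for row in x]
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        sign = 1.0 if k % 2 else -1.0
        for i in range(n):
            for j in range(n):
                result[i][j] += sign * power[i][j] / k
        power = matmul(power, x)
    return result

# Transition matrices exp(tQ) and exp(2tQ) for Q = [[-1, 1], [1, -1]], t = ln(2)/2.
m_t = [[0.75, 0.25], [0.25, 0.75]]
m_2t = [[0.625, 0.375], [0.375, 0.625]]
log_t = matrix_log(m_t)
log_2t = matrix_log(m_2t)
ratio = log_2t[0][1] / log_t[0][1]
```

Under a single-rate model the two logarithms differ only by the scalar `ratio`; with real alignments, finite sequence length adds noise, which is why the abstract turns to a principal component analysis of the observed matrices.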
14.
Comput Biol Chem ; 28(3): 211-8, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15261151

ABSTRACT

Although the characterization of proteins cannot solely rely upon sequence similarity, it has been widely proved that all-vs-all massive sequence comparisons may be an effective approach and a good basis for the prediction of biochemical functions or for the delineation of common shared properties. The program Cluster-C presented here enables a stand-alone and efficient construction of protein families within whole proteomes. The algorithm, which is based on the detection of cliques, ensures a high level of connectivity within the clusters. As opposed to the single transitive linkage method, Cluster-C allows a large number of sequences to be classified in such a way that the multidomain proteins do not produce a chain-grouping effect resulting in meaningless clusters. Moreover, some proteins can be present in several different but relevant clusters, which is of help in the determination of their functional domains. In the present analysis we used the Z-value, an evaluation of the significance of the similarity score, as the criterion for connecting sequences (the user can freely define the threshold of the similarity criterion). The clusters built with a rather low threshold (Z = 14) include more than 97% of the sequences and are consistent with known protein families and PROSITE patterns.


Subject(s)
Algorithms, Sequence Alignment/methods, Amino Acid Sequence/genetics, Arabidopsis Proteins/chemistry, Arabidopsis Proteins/genetics, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Cluster Analysis, Computational Biology/methods, Protein Databases, Fungal Proteins/chemistry, Fungal Proteins/genetics, Proteome/chemistry, Proteome/genetics
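
Illustrative sketch for record 14 (not Cluster-C itself, whose internals the abstract does not give): clique-based grouping can be pictured with a plain Bron-Kerbosch enumeration over a graph whose edges join sequences scoring at or above a threshold. The Z-values and sequence labels are invented; only the threshold of 14 comes from the abstract.

```python
def build_graph(scores, threshold):
    """Build an adjacency map, linking pairs whose score reaches the threshold."""
    adj = {}
    for (a, b), z in scores.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if z >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adj), set())
    return cliques

# Hypothetical Z-values; C plays the role of a two-domain protein.
toy_scores = {
    ("A", "B"): 20, ("A", "C"): 18, ("B", "C"): 25,
    ("C", "D"): 30, ("C", "E"): 22, ("D", "E"): 19,
    ("A", "D"): 3,
}
toy_cliques = sorted(sorted(c) for c in maximal_cliques(build_graph(toy_scores, 14)))
```

In this toy graph, single linkage would chain all five sequences into one cluster through C, whereas the clique view yields two groups that both contain C, matching the behaviour the abstract describes for multidomain proteins.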
19.
Comput Appl Biosci ; 10(4): 453-4, 1994 Jul.
Article in English | MEDLINE | ID: mdl-7804879

ABSTRACT

Fast sequence databank search algorithms generally make use of hash tables and look for exactly matching words. An increased sensitivity, at the expense of a decreased selectivity, can be attained in the case of proteins by using a reduced amino acid alphabet. We propose here an alphabet reduced to 10 symbols, that we used in modified versions of the FASTP and SCAN programs. An application to the aminoacyl-tRNA synthetases shows that this technique may be useful in detecting distant relationships between proteins.


Subject(s)
Factual Databases, Proteins/genetics, Software, Algorithms, Amino Acid Sequence, Amino Acyl-tRNA Synthetases/genetics, Escherichia coli/enzymology, Escherichia coli/genetics, Molecular Sequence Data, Oligopeptides/genetics, Sequence Alignment/methods, Terminology as Topic
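
Illustrative sketch for record 19 (not the modified FASTP or SCAN code): word matching through a hash table can be made more permissive by first recoding sequences in a reduced alphabet. The abstract does not list the 10 symbols, so the grouping below is hypothetical and serves only to show the mechanism.

```python
from collections import defaultdict

# HYPOTHETICAL 10-class grouping of the 20 amino acids (not the published one).
GROUPS = ["IV", "LM", "KRQ", "YFW", "DE", "NH", "ST", "AG", "C", "P"]
REDUCE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}

def reduce_seq(seq):
    """Recode a protein sequence, one digit per amino acid class."""
    return "".join(REDUCE[aa] for aa in seq.upper())

def word_index(seq, k):
    """Hash table mapping each reduced k-letter word to its start positions."""
    index = defaultdict(list)
    reduced = reduce_seq(seq)
    for i in range(len(reduced) - k + 1):
        index[reduced[i:i + k]].append(i)
    return index

def shared_words(query, target, k=3):
    """Return (query_pos, target_pos) for every reduced word common to both."""
    index = word_index(target, k)
    reduced_query = reduce_seq(query)
    return [
        (i, j)
        for i in range(len(reduced_query) - k + 1)
        for j in index.get(reduced_query[i:i + k], [])
    ]

toy_hits = shared_words("ILK", "VMR")
```

The two toy peptides share no identical residue, so an exact-word search on the full alphabet finds nothing, while the reduced words coincide. This is the sensitivity gain the abstract mentions; the matching loss of selectivity comes from unrelated words collapsing onto the same code.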
20.
Comput Appl Biosci ; 9(2): 191-6, 1993 Apr.
Article in English | MEDLINE | ID: mdl-8481822

ABSTRACT

A method aimed at classifying protein sequences without resorting to pairwise alignment is presented. Called DOCMA (DOt-plot Comparisons by Multivariate Analysis), it is based on a multivariate analysis of the pairwise dot-plots between all the sequences in the set. The dot-plots are first simplified by considering only the projections of the 'diagonal' segments of similarity onto the axes. From these projections a data matrix is built, in which each column is representative of the comparisons of one given sequence with all the other ones. This data matrix is then transformed into a distance matrix by a chi-squared analysis, from which the coordinates of the sequences in an orthonormal Euclidean space are obtained. The sequences are finally classified by a dynamic clustering procedure followed by a search for strong clusters. Application of this method to protein families such as the globins, the cytochromes c and the aminoacyl-tRNA synthetases shows that it is quite effective in delineating subgroups that contain even distantly related sequences.


Subject(s)
Amino Acid Sequence, Multivariate Analysis, Software, Algorithms, Cluster Analysis