Results 1 - 10 of 10
1.
J Mol Evol ; 91(6): 854-864, 2023 12.
Article in English | MEDLINE | ID: mdl-38060007

ABSTRACT

A fold is the architecture and topology of a protein domain. Fold categories are very few compared with the astronomical number of sequences. Eukaryotes have more protein folds than Archaea and Bacteria. These folds are of two types: those shared with Archaea and/or Bacteria, and those specific to eukaryotic clades. The first type is inherited from the first endosymbiosis and confirms the mixed origin of eukaryotes. In a dataset of 1073 folds whose presence or absence has been established among 210 species equally distributed across the three super-kingdoms, we have identified 28 eukaryotic folds unambiguously inherited from Bacteria and 40 eukaryotic folds unambiguously inherited from Archaea. Compared with previous studies, the proportion of informational functions is higher than expected for folds originating from Bacteria and as high as expected for folds inherited from Archaea. The second type of fold is specifically eukaryotic and is associated with an increase in new folds within eukaryotes, distributed in particular clades. Reconstructed ancestral states coupled with dating of each node on the tree of life provided fold appearance rates. The rate is on average twice as high within Eukaryota as within Bacteria or Archaea. The highest rates are found at the origins of eukaryotes, holozoans, metazoans, metazoans stricto sensu, and vertebrates: the roots of these clades correspond to bursts of fold evolution. We could correlate the functions of some of the fold synapomorphies within eukaryotes with significant evolutionary events. Among them, we find evidence for the rise of multicellularity, the adaptive immune system, and virus folds that could be linked to an ecological shift made by tetrapods.


Subjects
Archaea, Bacteria, Animals, Phylogeny, Bacteria/genetics, Archaea/genetics, Proteins, Eukaryota/genetics, Biological Evolution
2.
J Struct Biol ; 211(2): 107543, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32522553

ABSTRACT

The effects of a single residue substitution on the protein backbone are frequently quite small, and there are many other potential sources of structural variation in proteins. We present here a methodology that considers different sources of distortion in order to isolate the effect of the mutation itself. To validate our methodology, we consider a well-studied family with many single mutants: human lysozyme. Most of the perturbations are expected to be at the very site of the mutation, but in many cases the effects are propagated at long range. We show that the distances between the mutated residue and the 5% most disturbed residues decrease exponentially. One third of the affected residues are in direct contact with the mutated position; the remaining two thirds are potential allosteric effects. We confirm the reliability of the residues identified as significantly perturbed by comparing our results with experimental studies. The present method recovers all the previously identified perturbations. This study shows that mutations have a long-range impact on the protein backbone that can be detected, although the displacement of the affected atoms is small.


Subjects
Muramidase/ultrastructure, Mutant Proteins/ultrastructure, Protein Conformation, Proteins/ultrastructure, Amino Acid Sequence/genetics, Humans, Muramidase/chemistry, Muramidase/genetics, Mutant Proteins/chemistry, Mutant Proteins/genetics, Mutation/genetics, Point Mutation/genetics, Proteins/chemistry, Proteins/genetics
3.
Bioinformatics ; 35(20): 3970-3980, 2019 10 15.
Article in English | MEDLINE | ID: mdl-30942864

ABSTRACT

MOTIVATION: Multiple sequence alignment programs have proved very useful and have already been evaluated in the literature, but alignment programs based on structure, or on both sequence and structure, have not. In the present article we evaluate the added value of considering structures. RESULTS: We compared the multiple alignments produced by 25 programs based on sequence, structure or both with reference alignments deposited in five databases (BALIBASE 2 and 3, HOMSTRAD, OXBENCH and SISYPHUS). On the whole, the structure-based methods compute more reliable alignments than the sequence-based ones, and even than the sequence+structure-based programs, whatever the database. Two programs lead, MAMMOTH and MATRAS; nevertheless, MUSTANG, MATT, 3DCOMB, TCOFFEE+TM_ALIGN and TCOFFEE+SAP perform better for some alignments. The advantage of structure-based methods increases at low levels of sequence identity, or for residues that are buried or in regular secondary structures. Concerning gap management, sequence-based programs insert fewer gaps than structure-based programs. Concerning the databases, the alignments of the manually built databases are more challenging for the programs. AVAILABILITY AND IMPLEMENTATION: All data and results presented in this study are available at: http://wwwabi.snv.jussieu.fr/people/mathilde/download/AliMulComp/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Sequence Alignment, Algorithms, Protein Databases, Protein Secondary Structure, Proteins, Software
4.
Proteins ; 86(8): 853-867, 2018 08.
Article in English | MEDLINE | ID: mdl-29569365

ABSTRACT

A structural database of 11 families of chains differing by a single amino acid substitution has been built. Another structural dataset of 5 families with identical sequences has been used for comparison. The RMSD computed after a global superimposition of the mutated protein on each native one is smaller than the RMSD calculated among proteins of identical sequences. The effect of the perturbation is very local, and not necessarily highest at the position of the mutation. An RMSD between mutated and native proteins is computed over a 3-residue or a 7-residue window at each position. To separate the structural fluctuations due to point mutations from those due to other sources, pairwise RMSDs have been translated into P values, which are in turn combined into a score called P-RANK. This score highlights small backbone distortions by comparing the RMSD between mutated and native positions with the RMSD at the same positions in the absence of a mutation. According to P-RANK, 38% of all mutations produce a significant effect on the displacement. By comparison with a random distribution of RMSD at un-mutated positions, we show that, even if the RMSD is greater when the mutation is in loops than in regular secondary structure, the relative effect is more important for regular secondary structures and for buried positions. We confirm the absence of correlation between RMSD and the predicted variation of free energy of folding, but we found a small correlation between high RMSD and the error in the prediction of ΔΔG.
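
The per-position window RMSD described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which atoms are used or whether each window is re-superimposed, so the sketch assumes one C-alpha coordinate per residue and two structures that have already been globally superimposed. The function name `window_rmsd` is hypothetical, and the P value and P-RANK steps are not reproduced.

```python
import math

def window_rmsd(native, mutant, window=3):
    """Per-position RMSD over a sliding window of residues.

    native, mutant: equal-length lists of (x, y, z) coordinates, one per
    residue (assumed here to be C-alpha atoms, already superimposed).
    Returns one value per position; positions too close to either end to
    hold a full window get None.
    """
    if len(native) != len(mutant):
        raise ValueError("structures must have the same number of residues")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n = len(native)
    profile = [None] * n
    for i in range(half, n - half):
        squared = 0.0
        for j in range(i - half, i + half + 1):
            squared += sum((a - b) ** 2 for a, b in zip(native[j], mutant[j]))
        profile[i] = math.sqrt(squared / window)
    return profile

# Toy example: five residues on a line, with the middle one displaced by 3 units.
native = [(float(i), 0.0, 0.0) for i in range(5)]
mutant = [(float(i), 0.0, 0.0) for i in range(5)]
mutant[2] = (2.0, 3.0, 0.0)
profile = window_rmsd(native, mutant, window=3)
```

In the toy example every full window contains the displaced residue, so the three central positions share the same value while the two end positions stay undefined. Passing `window=7` gives the wider of the two window sizes mentioned in the abstract.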


Subjects
Point Mutation, Proteins/chemistry, Protein Databases, Molecular Models, Protein Secondary Structure, Thermodynamics
5.
Methods Mol Biol ; 2627: 195-210, 2023.
Article in English | MEDLINE | ID: mdl-36959449

ABSTRACT

Evaluation of the structural perturbations introduced by a single amino acid mutation is the main issue for protein structural biology. We present here some recent advances in methods that allow the distortion to be split between the actual substitution effect and the contribution of the local flexibility of the position where the mutation occurs. Their main drawback is the need for many structures, each with a single mutation. To bypass this difficulty, we propose to use molecular modeling tools: several software packages can build a model from a template, given the sequence. As a proof of concept, we rely on a gold standard, human lysozyme. The wild-type structure and three mutant structures are available in the PDB. Two of these mutations result in amyloid fibril formation, and the last one is neutral. In conclusion, irrespective of the algorithm used for modeling, side chain conformations at the site of mutation are reliable, although long-range effects are out of reach of these tools.


Subjects
Proteins, Software, Humans, Mutation, Proteins/chemistry, Molecular Models, Algorithms, Protein Conformation
6.
Evolution ; 76(8): 1706-1719, 2022 08.
Article in English | MEDLINE | ID: mdl-35765784

ABSTRACT

Several studies have shown that the distribution of folds (topologies of protein secondary structures) in proteomes may be a global proxy for building phylogenies. Some folds should then be synapomorphies (derived characters exclusively shared among taxa). However, previous studies used methods that did not allow synapomorphy identification, which requires congruence analysis of folds as individual characters. Here, we map SCOP folds onto a sample of 210 species across the tree of life (TOL). Congruence is assessed using the retention index of each fold for the TOL, and principal component analysis for deeper branches. Using a bicluster mapping approach, we define synapomorphic blocks of folds (SBFs) sharing similar presence/absence patterns. Among the 1232 folds, 20% are universally present in our TOL, whereas 54% are reliable synapomorphies. These results are similar to those obtained with the CATH and ECOD databases. Eukaryotes are characterized by a large number of synapomorphies, and several SBFs clearly support nested eukaryotic clades (divergence times from 1100 to 380 mya). Although clearly separated, the three superkingdoms reveal a strong mosaic pattern. This pattern is consistent with the dual origin of eukaryotes and bears witness to secondary endosymbiosis in their photosynthetic clades. Our study shows that fold synapomorphies, analysed directly, are key characters for unraveling the evolutionary history of species.
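
As an illustration of the retention index mentioned in this abstract, the sketch below computes the standard index RI = (g - s) / (g - m) for one presence/absence character on a rooted binary tree, where s is the number of changes found by Fitch parsimony, m the minimum possible number of changes and g the maximum. The tree encoding (nested tuples of leaf names) and the function names are assumptions made for the example; the abstract does not describe the software or settings the authors used.

```python
def fitch_steps(tree, states):
    """Fitch parsimony for one character on a rooted binary tree.

    tree: a leaf name (str) or a tuple (left, right) of subtrees.
    states: dict mapping each leaf name to 0 (fold absent) or 1 (present).
    Returns (set of possible states at this node, changes in the subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1

def retention_index(tree, states):
    """RI = (g - s) / (g - m) for a binary presence/absence character."""
    values = list(states.values())
    ones = sum(values)
    zeros = len(values) - ones
    g = min(ones, zeros)  # maximum changes: the rarer state arises once per taxon
    m = 1                 # minimum changes for a character with two states
    if g <= m:
        return None       # uninformative character, RI is undefined
    _, s = fitch_steps(tree, states)
    return (g - s) / (g - m)

# Toy tree with six hypothetical species A..F.
tree = ((("A", "B"), ("C", "D")), ("E", "F"))
clean = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0, "F": 0}   # fold unique to clade (A, B)
noisy = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0, "F": 0}   # fold scattered across clades
ri_clean = retention_index(tree, clean)
ri_noisy = retention_index(tree, noisy)
```

A fold confined to a single clade changes state once and reaches the maximum index, whereas a fold scattered across unrelated species needs as many changes as it has occurrences and scores zero.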


Subjects
Biological Evolution, Eukaryota, Phylogeny, Symbiosis
7.
J Bacteriol ; 193(8): 2076-7, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21217001

ABSTRACT

Dickeya dadantii is a plant-pathogenic enterobacterium responsible for the soft rot disease of many plants of economic importance. We present here the sequence of strain 3937, a strain widely used as a model system for research on the molecular biology and pathogenicity of this group of bacteria.


Subjects
Bacterial DNA/chemistry, Bacterial DNA/genetics, Enterobacteriaceae/genetics, Bacterial Genome, Enterobacteriaceae/isolation & purification, Molecular Sequence Data, Plant Diseases/microbiology, Plants/microbiology, DNA Sequence Analysis
8.
Proteins ; 61(1): 137-51, 2005 Oct 01.
Article in English | MEDLINE | ID: mdl-16049912

ABSTRACT

YAKUSA is a program designed for rapid scanning of a structural database with a query protein structure. It searches for the longest common substructures, called SHSPs (structural high-scoring pairs), existing between a query structure and every structure in the structural database. It makes use of protein backbone internal coordinates (alpha angles) in order to describe protein structures as sequences of symbols. The structural similarities are established in 5 steps, the first 3 being analogous to those used in BLAST: (1) building up a deterministic finite automaton describing all patterns identical or similar to those in the query structure; (2) searching for all these patterns in every structure in the database; (3) extending the patterns to longer matching substructures (i.e., SHSPs); (4) selecting compatible SHSPs for each query-database structure pair; and (5) ranking the query-database structure pairs using 3 scores based on SHSP similarity, on SHSP probabilities, and on spatial compatibility of SHSPs. Structural fragment probabilities are estimated according to a mixture transition distribution model, which is an approximation of a high-order Markov chain model. With regard to sensitivity and selectivity of the structural matches, YAKUSA compares well to the best related programs, although it is far faster: a typical database scan takes about 40 s of CPU time on a desktop personal computer. It has also been implemented on a Web server for real-time searches.
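
The idea of describing a backbone as a sequence of symbols can be sketched as below. This is not YAKUSA's code: it assumes that an alpha angle is the dihedral angle defined by four consecutive C-alpha atoms, and the 30-degree bins and the letters A to L are a hypothetical discretisation chosen only for the example. The automaton, SHSP extension and scoring steps are not shown.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, in [-180, 180], around the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central axis.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def encode_alpha(ca_coords, bin_width=30):
    """Turn consecutive C-alpha dihedrals into letters (assumed binning)."""
    n_bins = 360 // bin_width
    symbols = []
    for i in range(len(ca_coords) - 3):
        angle = dihedral(*ca_coords[i:i + 4])
        index = int((angle + 180.0) // bin_width) % n_bins
        symbols.append(chr(ord("A") + index))
    return "".join(symbols)

# Five hypothetical C-alpha positions give two alpha angles, hence two symbols.
chain = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
         (1.0, 1.0, 1.0), (1.0, 1.0, 0.0)]
first_angle = dihedral(*chain[:4])
encoded = encode_alpha(chain)
```

Once structures are strings over a small alphabet, pattern-matching machinery of the kind used for sequences becomes applicable, which is the analogy with BLAST drawn in the abstract.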


Subjects
Computational Biology/methods, Protein Databases, Proteins/chemistry, Software, Algorithms, Internet, Markov Chains, Molecular Models, Protein Folding, Protein Quaternary Structure, Protein Tertiary Structure, Proteins/classification, Proteins/metabolism, Sensitivity and Specificity, Protein Structural Homology, Time Factors
9.
PLoS One ; 10(4): e0125098, 2015.
Article in English | MEDLINE | ID: mdl-25915049

ABSTRACT

BACKGROUND: Formation of the folding nucleus of globular proteins starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow the subsequent formation and stability of the 3D structure. These early steps can be predicted by simulating the folding process with a Monte Carlo (MC) coarse-grained model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such a simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically 15-20% of the sequence. Having in mind experiments with two sequences of very high sequence identity (up to 90%) but different folds, we combined the MIR method, which takes the sequence as its single input, with the "fuzzy oil drop" (FOD) model, which requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the "drop". If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. One therefore obtains a distribution of the amino acids of a protein according to their agreement or disagreement with the FOD model. RESULTS: We compared and combined the MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as MIRs and assigned as accordant with the FOD model. It is shown here that, for these two folds, the intersection of the predicted sets of residues differs significantly from random selection. It reduces the number of residues selected by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.
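
The two ingredients described in this abstract, an idealised 3D Gaussian hydrophobicity density and the intersection of the MIR and FOD residue sets, can be sketched as follows. Everything specific here is an assumption: the abstract does not state how the Gaussian widths are chosen or how the observed density is computed, so `sigmas` is a plain input, only the idealised distribution is shown, and the residue numbers are made up for the example.

```python
import math

def ideal_hydrophobicity(coords, sigmas):
    """Idealised 3D Gaussian density at each residue position.

    coords: list of (x, y, z) positions, one per residue.
    sigmas: (sx, sy, sz) widths of the Gaussian "drop" (assumed inputs).
    The Gaussian is centred on the mean position, and the values are
    normalised so that they sum to 1 over all residues.
    """
    n = len(coords)
    centre = [sum(c[k] for c in coords) / n for k in range(3)]
    raw = [
        math.exp(-sum((c[k] - centre[k]) ** 2 / (2.0 * sigmas[k] ** 2)
                      for k in range(3)))
        for c in coords
    ]
    total = sum(raw)
    return [r / total for r in raw]

# Toy positions: one residue at the centre, two near it, two far from it.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
          (4.0, 0.0, 0.0), (-4.0, 0.0, 0.0)]
ideal = ideal_hydrophobicity(coords, sigmas=(2.0, 2.0, 2.0))

# Combining the two predictions is a set intersection (hypothetical residue numbers).
mir_positions = {3, 12, 27, 40}    # residues predicted as MIRs
fod_accordant = {5, 12, 40, 58}    # residues assigned as accordant with FOD
keystone = sorted(mir_positions & fod_accordant)
```

The intersection is never larger than either input set, which matches the abstract's remark that combining the methods reduces the number of selected residues.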


Subjects
Flavodoxin/chemistry, Immunoglobulins/chemistry, Molecular Models, Algorithms, Amino Acid Sequence, Animals, Bacteria/chemistry, Bacteria/metabolism, Binding Sites, Humans, Hydrophobic and Hydrophilic Interactions, Monte Carlo Method, Protein Folding, Protein Secondary Structure
10.
J Comput Biol ; 16(12): 1635-60, 2009 Dec.
Article in English | MEDLINE | ID: mdl-20047489

ABSTRACT

The geometrical configurations of atoms in protein structures can be viewed as approximate relations among them. Finding similar common substructures within a set of protein structures then belongs to a new class of problems that generalizes that of finding repeated motifs. The novelty lies in the addition of constraints on the motifs in terms of relations that must hold between pairs of positions of the motifs. We hence call them relational motifs. For this class of problems, we present an algorithm that is a suitable extension of the KMR paradigm and, in particular, of KMRC, as it uses a degenerate alphabet. Our algorithm contains several improvements that become especially useful when, as is required for relational motifs, the inference is made by partially overlapping shorter motifs rather than concatenating them. The efficiency, correctness and completeness of the algorithm are ensured by several non-trivial properties that are proven in this paper. The algorithm has been applied in the important field of protein common 3D substructure searching. The methods implemented have been tested on several examples of protein families, such as serine proteases, globins and, additionally, cytochromes P450. The detected motifs have been compared with those found by multiple structural alignment methods.
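
For readers unfamiliar with the KMR paradigm that this abstract extends, the sketch below shows the classic Karp-Miller-Rosenberg idea on a plain string: positions are labelled so that equal labels mean equal substrings, and labels for longer substrings are built from pairs of labels for shorter ones. It handles exact matches only; the degenerate alphabet and the relational constraints of the paper are not reproduced, and the function names are chosen for this example.

```python
def kmr_labels(text, length):
    """Label each start position so that equal labels mean equal
    substrings of the given length (exact-match KMR doubling)."""
    if length < 1 or length > len(text):
        raise ValueError("length must be between 1 and len(text)")
    # Length-1 classes: one label per distinct character.
    alphabet = {ch: i for i, ch in enumerate(sorted(set(text)))}
    labels = [alphabet[ch] for ch in text]
    k = 1
    while k < length:
        # Combine two length-k pieces; let them overlap if doubling would overshoot.
        step = min(k, length - k)
        new_k = k + step
        pairs = [(labels[i], labels[i + step])
                 for i in range(len(text) - new_k + 1)]
        ranks = {p: r for r, p in enumerate(sorted(set(pairs)))}
        labels = [ranks[p] for p in pairs]
        k = new_k
    return labels

def repeated_substrings(text, length):
    """Map each substring of the given length that occurs at least twice
    to the list of its start positions."""
    groups = {}
    for pos, label in enumerate(kmr_labels(text, length)):
        groups.setdefault(label, []).append(pos)
    return {text[p[0]:p[0] + length]: p
            for p in groups.values() if len(p) > 1}

repeats = repeated_substrings("abcabcab", 3)
```

Even this exact-match version combines overlapping pieces whenever the target length is not a power of two, which is a simple instance of building longer motifs from partially overlapping shorter ones.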


Subjects
Amino Acid Motifs, Computational Biology/methods, Molecular Models, Proteins/chemistry, Algorithms, Protein Databases, Globins/chemistry, Sequence Alignment, Serine Proteases/chemistry