1.
Proteins ; 89(12): 1618-1632, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34350630

ABSTRACT

An evolutionary-based definition and classification of target evaluation units (EUs) is presented for the 14th round of the critical assessment of structure prediction (CASP14). CASP14 targets included 84 experimental models submitted by various structural groups (designated T1024-T1101). Targets were split into EUs based on the domain organization of available templates and performance of server groups. Several targets required splitting (19 out of 25 multidomain targets) due in part to observed conformation changes. All in all, 96 CASP14 EUs were defined and assigned to tertiary structure assessment categories (Topology-based FM or High Accuracy-based TBM-easy and TBM-hard) considering their evolutionary relationship to existing ECOD fold space: 24 family level, 50 distant homologs (H-group), 12 analogs (X-group), and 10 new folds. Principal component analysis and heatmap visualization of sequence and structure similarity to known templates as well as performance of servers highlighted trends in CASP14 target difficulty. The assigned evolutionary levels (i.e., H-groups) and assessment classes (i.e., FM) displayed overlapping clusters of EUs. Many viral targets diverged considerably from their template homologs and thus were more difficult for prediction than other homology-related targets. On the other hand, some targets did not have sequence-identifiable templates, but were predicted better than expected due to relatively simple arrangements of secondary structural elements. An apparent improvement in overall server performance in CASP14 further complicated traditional classification, which ultimately assigned EUs into high-accuracy modeling (27 TBM-easy and 31 TBM-hard), topology (23 FM), or both (15 FM/TBM).

2.
Proteins ; 89(12): 1700-1710, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34455641

ABSTRACT

The high accuracy of some CASP14 models at the domain level prompted a more detailed evaluation of structure predictions on whole targets. For the first time in critical assessment of structure prediction (CASP), we evaluated accuracy of difficult domain assembly in models submitted for multidomain targets where the community predicted individual evaluation units (EUs) with greater accuracy than full-length targets. Ten proteins with domain interactions that did not show evidence of conformational change and were not involved in significant oligomeric contacts were chosen as targets for the domain interaction assessment. Groups were ranked using complementary interaction scores (F1, QS score, and Jaccard coefficient), and their predictions were evaluated for their ability to correctly model inter-domain interfaces and overall protein folds. Target performance was broadly grouped into two clusters. The first consisted primarily of targets containing two EUs wherein predictors more broadly predicted domain positioning and interfacial contacts correctly. The other consisted of complex two-EU and three-EU targets where few predictors performed well. The highest ranked predictor, AlphaFold2, produced high-accuracy models on eight out of 10 targets. Their interdomain scores on three of these targets were significantly higher than all other groups and were responsible for their overall outperformance in the category. We further highlight the performance of AlphaFold2 and the next best group, BAKER-experimental, on several interesting targets.
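
The F1 and Jaccard scores named in this abstract can be illustrated on sets of inter-domain contacts. The sketch below is a minimal illustration, not the assessors' implementation: it assumes a contact is represented as a pair of residue numbers (one residue per domain), the function name `contact_scores` and the example contacts are hypothetical, and the QS score is not shown.

```python
def contact_scores(reference, model):
    """Return (precision, recall, F1, Jaccard) for two sets of residue-pair contacts."""
    ref, mod = set(reference), set(model)
    shared = len(ref & mod)                      # contacts present in both sets
    precision = shared / len(mod) if mod else 0.0
    recall = shared / len(ref) if ref else 0.0
    # F1 is the harmonic mean of precision and recall (0 when nothing is shared).
    f1 = 2 * precision * recall / (precision + recall) if shared else 0.0
    union = len(ref | mod)
    # Jaccard coefficient: shared contacts over all contacts seen in either set.
    jaccard = shared / union if union else 0.0
    return precision, recall, f1, jaccard


# Hypothetical contacts: (residue in domain 1, residue in domain 2).
reference_contacts = {(10, 120), (11, 121), (14, 125), (18, 130)}
model_contacts = {(10, 120), (11, 121), (15, 126)}

precision, recall, f1, jaccard = contact_scores(reference_contacts, model_contacts)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f} Jaccard={jaccard:.3f}")
```

F1 rewards a balance between predicting only true contacts and recovering all of them, whereas the Jaccard coefficient penalizes any contact that is not shared, which may be why the abstract describes the scores as complementary.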

3.
Proteins ; 89(12): 1673-1686, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34240477

ABSTRACT

This report describes the tertiary structure prediction assessment of difficult modeling targets in the 14th round of the Critical Assessment of Structure Prediction (CASP14). We implemented an official ranking scheme that used the same scores as the previous CASP topology-based assessment, but combined these scores with one that emphasized physically realistic models. The top performing AlphaFold2 group outperformed the rest of the prediction community on all but two of the difficult targets considered in this assessment. They provided high-quality models for most of the targets (86% over GDT_TS 70), including larger targets above 150 residues, and they correctly predicted the topology of almost all the rest. AlphaFold2 performance was followed by two manual Baker methods, a Feig method that refined Zhang-server models, two notable automated Zhang server methods (QUARK and Zhang-server), and a Zhang manual group. Despite the remarkable progress in protein structure prediction of difficult targets, both the prediction community and AlphaFold2, to a lesser extent, faced challenges with flexible regions and obligate oligomeric assemblies. The official ranking of top-performing methods was supported by performance-generated PCA and heatmap clusters that gave insight into target difficulties and the most successful state-of-the-art structure prediction methodologies.

4.
Science ; 373(6557): 871-876, 2021 08 20.
Article in English | MEDLINE | ID: mdl-34282049

ABSTRACT

DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.


Subjects
Deep Learning; Protein Conformation; Protein Folding; Proteins/chemistry; ADAM Proteins/chemistry; Amino Acid Sequence; Computer Simulation; Cryoelectron Microscopy; Crystallography, X-Ray; Databases, Protein; Membrane Proteins/chemistry; Models, Molecular; Multiprotein Complexes/chemistry; Neural Networks, Computer; Protein Subunits/chemistry; Proteins/physiology; Receptors, G-Protein-Coupled/chemistry; Sphingosine N-Acyltransferase/chemistry
5.
ACS Omega ; 6(24): 15698-15707, 2021 Jun 22.
Article in English | MEDLINE | ID: mdl-34179613

ABSTRACT

Domain classifications are a useful resource for computational analysis of protein structure, but elements of their composition are often opaque to potential users. We perform a comparative analysis of our classification ECOD against the SCOPe, SCOP2, and CATH domain classifications with respect to their constituent domain boundaries and hierarchical organization. The coverage of these domain classifications with respect to ECOD and to the PDB was assessed by structure and by sequence. We also conducted domain pair analysis to determine broad differences in hierarchy between domains shared by ECOD and other classifications. Finally, we present domains from the major facilitator superfamily (MFS) of transporter proteins and provide evidence that supports their split into domains and the existence of multiple conformations within these families. We find that ECOD and CATH provide the most extensive structural coverage of the PDB. ECOD and SCOPe have the most consistent domain boundary conditions, whereas CATH and SCOP2 both differ significantly.
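
One simple way to picture a domain-boundary comparison between two classifications is to measure residue overlap between their domain definitions for the same chain. The sketch below is an assumption-laden illustration rather than the procedure used in the paper: domains are modeled as lists of inclusive residue ranges, the overlap measure is a Jaccard ratio, and the function names and example ranges are hypothetical.

```python
def residues(segments):
    """Expand inclusive (start, end) segments into a set of residue numbers."""
    out = set()
    for start, end in segments:
        out.update(range(start, end + 1))
    return out


def boundary_overlap(domain_a, domain_b):
    """Jaccard overlap (0 to 1) between two domain definitions."""
    a, b = residues(domain_a), residues(domain_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(domains_a, domains_b):
    """For each domain in A, the highest overlap with any domain in B."""
    return [
        max((boundary_overlap(da, db) for db in domains_b), default=0.0)
        for da in domains_a
    ]


# Hypothetical definitions of one 200-residue chain in two classifications.
classification_a = [[(1, 100)], [(101, 200)]]
classification_b = [[(1, 110)], [(111, 200)]]

print(best_matches(classification_a, classification_b))
```

Values near 1 indicate that the two classifications place the boundary at nearly the same residues; a low best match suggests that one classification splits or merges a region the other treats differently.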

6.
PLoS Comput Biol ; 15(12): e1007569, 2019 12.
Article in English | MEDLINE | ID: mdl-31869345

ABSTRACT

Rossmann folds are ancient, frequently diverged domains found in many biological reaction pathways where they have adapted for different functions. Consequently, discernment and classification of their homologous relations and function can be complicated. We define a minimal Rossmann-like structure motif (RLM) that corresponds to the common core of known Rossmann domains and use this motif to identify all RLM domains in the Protein Data Bank (PDB), thus finding they constitute about 20% of all known 3D structures. The Evolutionary Classification of protein structure Domains (ECOD) classifies RLM domains into a number of groups that lack evidence for homology (X-groups), which suggests that they could have evolved independently multiple times. Closely related, homologous RLM enzyme families can diverge to bind different ligands using similar binding sites and to catalyze different reactions. Conversely, non-homologous RLM domains can converge to catalyze the same reactions or to bind the same ligand with alternate binding modes. We discuss a special case of such convergent evolution that is relevant to the polypharmacology paradigm, wherein the same drug (methotrexate) binds to multiple non-homologous RLM drug targets with different topologies. Finally, assigning proteins with RLM domains to the Enzyme Commission classification suggests that RLM enzymes function mainly in metabolism (and comprise 38% of reference metabolic pathways) and are overrepresented in extant pathways that represent ancient biosynthetic routes such as nucleotide metabolism, energy metabolism, and metabolism of amino acids. In fact, RLM enzymes take part in five out of eight enzymatic reactions of the Wood-Ljungdahl metabolic pathway thought to be used by the last universal common ancestor (LUCA). The prevalence of RLM domains in this ancient metabolism might explain their wide distribution among enzymes.


Subjects
Evolution, Molecular; Protein Domains/genetics; Binding Sites/genetics; Catalytic Domain/genetics; Computational Biology; Databases, Protein; Enzymes/chemistry; Enzymes/genetics; Enzymes/metabolism; Humans; Ligands; Metabolic Networks and Pathways/genetics; Models, Molecular; Protein Binding/genetics; Software; Structural Homology, Protein
7.
BMC Mol Cell Biol ; 20(1): 18, 2019 06 21.
Article in English | MEDLINE | ID: mdl-31226926

ABSTRACT

The manual classification of protein domains is approaching its 20th anniversary. ECOD is our mixed manual-automatic domain classification. Over time, the types of proteins that require manual curation have changed. Depositions with complex multidomain and multichain arrangements are commonplace. Transmembrane domains are regularly classified. Repeatedly, domains which are initially believed to be novel are found to have homologous links to existing classified domains. Here we present a brief summary of recent manual curation efforts in ECOD generally combined with specific case studies of transmembrane and multidomain proteins wherein manual curation was useful for discovering new homologous relationships. We present a new taxonomy for the classification of ABC transporter transmembrane domains. We examine alternate topologies of the leucine-specific (LS) domain of Leucine tRNA-synthetase. Finally, we elaborate on distant homologous links between two helical dimerization domains.


Subjects
ATP-Binding Cassette Transporters/chemistry; ATP-Binding Cassette Transporters/classification; Protein Domains; Structural Homology, Protein; Carrier Proteins/chemistry; Cell Cycle Proteins/chemistry; Crystallography, X-Ray; Databases, Protein; Endopeptidases/chemistry; Escherichia coli/chemistry; Humans; Leucine-tRNA Ligase/chemistry; Membrane Proteins/chemistry; Organic Cation Transport Proteins/chemistry; Protein Multimerization; Protein Structure, Secondary; ras Proteins/chemistry
8.
Curr Protoc Bioinformatics ; 61(1): e45, 2018 03.
Article in English | MEDLINE | ID: mdl-30040199

ABSTRACT

ECOD is a database of evolutionary domains from structures deposited in the PDB. Domains in ECOD are classified by a mixed manual/automatic method wherein the bulk of newly deposited structures are classified automatically by protein-protein BLAST. Those structures that cannot be classified automatically are referred to manual curators who use a combination of alignment results, functional analysis, and close reading of the literature to generate novel assignments. ECOD differs from other structural domain resources in that it is continually updated, classifying thousands of proteins per week. ECOD recognizes homology as its key organizing concept, rather than structural or sequence similarity alone. Such a classification scheme provides functional information about proteins of interest by placing them in the correct evolutionary context among all proteins of known structure. This unit demonstrates how to access ECOD via the Web and how to search the database by sequence or structure. It also details the distributable data files available for large-scale bioinformatics users. © 2018 by John Wiley & Sons, Inc.


Subjects
Computational Biology/methods; Databases, Protein; Protein Domains; Proteins/chemistry; Sequence Homology, Amino Acid; Structural Homology, Protein; Amino Acid Sequence; Sequence Alignment
9.
Bioinformatics ; 34(17): 2997-3003, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29659718

ABSTRACT

Motivation: The ECOD database classifies protein domains based on their evolutionary relationships, considering both remote and close homology. The family group in ECOD provides classification of domains that are closely related to each other based on sequence similarity. Due to different perspectives on domain definition, direct application of existing sequence domain databases, such as Pfam, to ECOD struggles with several shortcomings. Results: We created multiple sequence alignments and profiles from ECOD domains with the help of structural information in alignment building and boundary delineation. We validated the alignment quality by scoring structure superposition to demonstrate that they are comparable to curated seed alignments in Pfam. Comparison to Pfam and CDD reveals that 27 and 16% of ECOD families are new, but they are also dominated by small families, likely because of the sampling bias from the PDB database. There are 35 and 48% of families whose boundaries are modified compared to counterparts in Pfam and CDD, respectively. Availability and implementation: The new families are now integrated in the ECOD website. The aggregate HMMER profile library and alignment are available for download on the ECOD website (http://prodata.swmed.edu/ecod). Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Databases, Protein; Proteins/chemistry; Protein Domains; Sequence Alignment; Software
10.
Nucleic Acids Res ; 45(D1): D296-D302, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899594

ABSTRACT

Evolutionary Classification Of protein Domains (ECOD) (http://prodata.swmed.edu/ecod) comprehensively classifies proteins with known spatial structures maintained by the Protein Data Bank (PDB) into evolutionary groups of protein domains. ECOD relies on a combination of automatic and manual weekly updates to achieve its high accuracy and coverage with a short update cycle. ECOD classifies the approximately 120 000 depositions of the PDB into more than 500 000 domains in ∼3400 homologous groups. We show the performance of the weekly update pipeline since the release of ECOD, describe improvements to the ECOD website and available search options, and discuss novel structures and homologous groups that have been classified in the recent updates. Finally, we discuss the future directions of ECOD and further improvements planned for the hierarchy and update process.


Subjects
Databases, Protein; Evolution, Molecular; Models, Molecular; Protein Domains; Proteins; Computational Biology/methods; Protein Conformation; Proteins/chemistry; Proteins/classification; Proteins/genetics
11.
PLoS One ; 11(5): e0154786, 2016.
Article in English | MEDLINE | ID: mdl-27149620

ABSTRACT

The Critical Assessment of techniques for protein Structure Prediction (or CASP) is a community-wide blind test experiment to reveal the best accomplishments of structure modeling. Assessors have been using the Global Distance Test (GDT_TS) measure to quantify prediction performance since CASP3 in 1998. However, identifying significant score differences between close models is difficult because of the lack of uncertainty estimations for this measure. Here, we utilized the atomic fluctuations caused by structure flexibility to estimate the uncertainty of GDT_TS scores. Structures determined by nuclear magnetic resonance are deposited as ensembles of alternative conformers that reflect the structural flexibility, whereas standard X-ray refinement produces the static structure averaged over time and space for the dynamic ensembles. To recapitulate the structurally heterogeneous ensemble in the crystal lattice, we performed time-averaged refinement for X-ray datasets to generate structural ensembles for our GDT_TS uncertainty analysis. Using those generated ensembles, our study demonstrates that the time-averaged refinements produced structure ensembles with better agreement with the experimental datasets than the averaged X-ray structures with B-factors. The uncertainty of the GDT_TS scores, quantified by their standard deviations (SDs), increases for scores lower than 50 and 70, with maximum SDs of 0.3 and 1.23 for X-ray and NMR structures, respectively. We also applied our procedure to the high-accuracy version of the GDT-based score and produced similar results with slightly higher SDs. To facilitate score comparisons by the community, we developed a user-friendly web server that produces structure ensembles for NMR and X-ray structures and is accessible at http://prodata.swmed.edu/SEnCS. Our work helps to identify the significance of GDT_TS score differences, as well as to provide structure ensembles for estimating SDs of any scores.


Subjects
Models, Theoretical; Uncertainty; X-Rays
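
The idea in the abstract above, scoring one model against every member of a structural ensemble and reporting the standard deviation of GDT_TS, can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes the Cα coordinates are already superposed and of equal length (a full GDT_TS calculation searches over superpositions for each distance cutoff), it uses the conventional 1, 2, 4, and 8 Å cutoffs, and the function names and coordinates are hypothetical.

```python
import math
from statistics import pstdev


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-like score (0-100) for already superposed, equal-length CA traces."""
    if len(model) != len(reference) or not reference:
        raise ValueError("model and reference must be non-empty and equal in length")
    distances = [math.dist(m, r) for m, r in zip(model, reference)]
    # Fraction of residues within each cutoff, averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def gdt_ts_spread(model, ensemble):
    """Mean and population SD of the score over all conformers in an ensemble."""
    scores = [gdt_ts(model, conformer) for conformer in ensemble]
    return sum(scores) / len(scores), pstdev(scores)


# Hypothetical 4-residue example: deviations of 0.5, 1.5, 3.0 and 9.0 angstroms.
model = [(0.0, 0.0, 0.0)] * 4
conformer = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]

print(gdt_ts(model, conformer))
print(gdt_ts_spread(model, [conformer, conformer]))
```

A difference between two models' scores that is smaller than the SD obtained this way would be hard to call significant, which is the practical use the abstract describes.
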
12.
Protein Sci ; 25(7): 1188-203, 2016 07.
Article in English | MEDLINE | ID: mdl-26833690

ABSTRACT

Proteins and their domains evolve by a set of events commonly including the duplication and divergence of small motifs. The presence of short repetitive regions in domains has generally constituted a difficult case for structural domain classifications and their hierarchies. We developed the Evolutionary Classification Of protein Domains (ECOD) in part to implement a new schema for the classification of these types of proteins. Here we document the ways in which ECOD classifies proteins with small internal repeats, widespread functional motifs, and assemblies of small domain-like fragments in its evolutionary schema. We illustrate the ways in which the structural genomics project impacted the classification and characterization of new structural domains and sequence families over the decade.


Subjects
Amino Acid Motifs; Proteins/chemistry; Proteomics/methods; Databases, Protein; Evolution, Molecular; Models, Molecular; Protein Domains; Proteins/genetics
13.
Proteins ; 84 Suppl 1: 20-33, 2016 09.
Article in English | MEDLINE | ID: mdl-26756794

ABSTRACT

Protein target structures for the Critical Assessment of Structure Prediction round 11 (CASP11) and CASP ROLL were split into domains and classified into categories suitable for assessment of template-based modeling (TBM) and free modeling (FM) based on their evolutionary relatedness to existing structures classified by the Evolutionary Classification of Protein Domains (ECOD) database. First, target structures were divided into domain-based evaluation units. Target splits were based on the domain organization of available templates as well as the performance of servers on whole targets compared to split target domains. Second, evaluation units were classified into TBM and FM categories using a combination of measures that evaluate prediction quality and template detectability. Generally, target domains with sequence-related templates and good server prediction performance were classified as TBM, whereas targets without sequence-identifiable templates and low server performance were classified as FM. As in previous CASP experiments, the boundaries for classification were blurred due to the presence of significant insertions and deteriorations in the targets with respect to homologous templates, as well as the presence of templates with partial coverage of new folds. The FM category included 45 target domains, which represents an unprecedented number of difficult CASP targets provided for modeling. Proteins 2016; 84(Suppl 1):20-33. © 2016 Wiley Periodicals, Inc.


Subjects
Computational Biology/statistics & numerical data; Models, Molecular; Models, Statistical; Proteins/chemistry; Software; Animals; Bacteriophages/chemistry; Computational Biology/methods; Computer Graphics; Databases, Protein; Humans; International Cooperation; Protein Folding; Protein Interaction Domains and Motifs; Protein Multimerization; Protein Structure, Secondary; Proteins/classification; Sequence Homology, Amino Acid
14.
Proteins ; 83(7): 1238-51, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25917548

ABSTRACT

ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, those proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties with assessing homology with scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide the most accurate and up-to-date view of the protein structure world as a result of combined computational and expert-driven analysis.


Subjects
Algorithms; Computational Biology/methods; Databases, Protein; Terminology as Topic; Amino Acid Sequence; Animals; Dimethylallyltranstransferase/chemistry; Dimethylallyltranstransferase/classification; Evolution, Molecular; Humans; Hydrogen Bonding; Hydrophobic and Hydrophilic Interactions; Models, Molecular; Molecular Sequence Data; Neuropeptides/chemistry; Neuropeptides/classification; Neurotoxins/chemistry; Neurotoxins/classification; Protein Structure, Secondary; Protein Structure, Tertiary; Sequence Alignment; Sequence Homology, Amino Acid; Software; Spider Venoms/chemistry; Spider Venoms/classification; Static Electricity
15.
PLoS Comput Biol ; 10(12): e1003926, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25474468

ABSTRACT

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.


Subjects
Computational Biology/methods; Databases, Protein; Protein Structure, Tertiary; Proteins/chemistry; Proteins/classification; Evolution, Molecular; Models, Molecular
16.
Bioinformatics ; 27(1): 46-54, 2011 Jan 01.
Article in English | MEDLINE | ID: mdl-21068000

ABSTRACT

MOTIVATION: The discovery of new protein folds is a relatively rare occurrence even as the rate of protein structure determination increases. This rarity reinforces the concept of folds as reusable units of structure and function shared by diverse proteins. If the folding mechanism of proteins is largely determined by their topology, then the folding pathways of members of existing folds could encompass the full set used by globular protein domains. RESULTS: We have used recent versions of three common protein domain dictionaries (SCOP, CATH and Dali) to generate a consensus domain dictionary (CDD). Surprisingly, 40% of the metafolds in the CDD are not composed of autonomous structural domains, i.e. they are not plausible independent folding units. This finding has serious ramifications for bioinformatics studies mining these domain dictionaries for globular protein properties. However, our main purpose in deriving this CDD was to generate an updated CDD to choose targets for MD simulation as part of our dynameomics effort, which aims to simulate the native and unfolding pathways of representatives of all globular protein consensus folds (metafolds). Consequently, we also compiled a list of representative protein targets of each metafold in the CDD. AVAILABILITY AND IMPLEMENTATION: This domain dictionary is available at www.dynameomics.org.


Subjects
Dictionaries as Topic; Protein Structure, Tertiary; Computational Biology; Models, Molecular; Molecular Sequence Annotation; Protein Folding
17.
Protein Eng Des Sel ; 24(1-2): 11-9, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051320

ABSTRACT

The classification of protein folds is necessarily based on the structural elements that distinguish domains. Classification of protein domains consists of two problems: the partition of structures into domains and the classification of domains into sets of similar structures (or folds). Although similar topologies may arise by convergent evolution, the similarity of their respective folding pathways is unknown. The discovery and the characterization of the majority of protein folds will be followed by a similar enumeration of available protein folding pathways. Consequently, understanding the intricacies of structural domains is necessary to understanding their collective folding pathways. We review the current state of the art in the field of protein domain classification and discuss methods for the systematic and comprehensive study of protein folding across protein fold space via atomistic molecular dynamics simulation. Finally, we discuss our large-scale Dynameomics project, which includes simulations of representatives of all autonomous protein folds.


Subjects
Proteins/chemistry; Proteins/classification; Animals; Humans; Molecular Dynamics Simulation; Protein Conformation; Protein Folding
18.
Biochemistry ; 50(6): 1029-41, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21190388

ABSTRACT

To provide insight into the role of local sequence in the nonrandom coil behavior of the denatured state, we have extended our measurements of histidine-heme loop formation equilibria for cytochrome c' to 6 M guanidine hydrochloride. We observe that there is some reduction in the scatter about the best fit line of loop stability versus loop size data in 6 M versus 3 M guanidine hydrochloride, but the scatter is not eliminated. The scaling exponent, ν(3), of 2.5 ± 0.2 is also similar to that found previously in 3 M guanidine hydrochloride (2.6 ± 0.3). Rates of histidine-heme loop breakage in the denatured state of cytochrome c' show that some histidine-heme loops are significantly more persistent than others at both 3 and 6 M guanidine hydrochloride. Rates of histidine-heme loop formation more closely approximate random coil behavior. This observation indicates that heterogeneity in the denatured state ensemble results mainly from contact persistence. When mapped onto the structure of cytochrome c', the histidine-heme loops with slow breakage rates coincide with chain reversals between helices 1 and 2 and between helices 2 and 3. Molecular dynamics simulations of the unfolding of cytochrome c' at 498 K show that these reverse turns persist in the unfolded state. Thus, these portions of the primary structure of cytochrome c' set up the topology of cytochrome c' in the denatured state, predisposing the protein to fold efficiently to its native structure.


Subjects
Bacterial Proteins/chemistry; Cytochromes c'/chemistry; Rhodopseudomonas/metabolism; Guanidine/metabolism; Hydrogen-Ion Concentration; Kinetics; Models, Molecular; Protein Conformation; Protein Denaturation; Protein Folding
19.
Structure ; 18(4): 423-35, 2010 Mar 14.
Article in English | MEDLINE | ID: mdl-20399180

ABSTRACT

The dynamic behavior of proteins is important for an understanding of their function and folding. We have performed molecular dynamics simulations of the native state and unfolding pathways of over 2000 protein/peptide systems (approximately 11,000 independent simulations) representing the majority of folds in globular proteins. These data are stored and organized using an innovative database approach, which can be mined to obtain both general and specific information about the dynamics and folding/unfolding of proteins, relevant subsets thereof, and individual proteins. Here we describe the project in general terms and the type of information contained in the database. Then we provide examples of mining the database for information relevant to protein folding, structure building, the effect of single-nucleotide polymorphisms, and drug design. The native state simulation data and corresponding analyses for the 100 most populated metafolds, together with related resources, are publicly accessible through http://www.dynameomics.org.


Subjects
Proteins/chemistry; Algorithms; Animals; Computational Biology/methods; Databases, Protein; Humans; Models, Molecular; Molecular Conformation; Polymorphism, Single Nucleotide; Protein Denaturation; Protein Folding; Proteomics/methods
20.
Biomol Concepts ; 1(5-6): 335-44, 2010 Dec 01.
Article in English | MEDLINE | ID: mdl-25962007

ABSTRACT

All currently known structures of proteins together define 'protein fold space'. To increase the general understanding of protein dynamics and protein folding, we selected a set of 807 proteins and protein domains that represent 95% of the currently known autonomous folded domains present in globular proteins. Native state and unfolding simulations of these representatives are now complete and accessible via a novel database containing over 11 000 simulations. Because protein folding is a microscopically reversible process, these simulations effectively sample protein folding across all of protein fold space. Here, we give an overview of how the representative proteins were selected and how the simulations were performed and validated. We then provide examples of different types of analyses that can be performed across our large set of simulations, made possible by the database approach. We further show how the unfolding simulations can be used to compare unfolding of structural elements in isolation and in different structural contexts, using as an example a short, triple-stranded β-sheet that forms the WW domain and is present in several larger unrelated proteins.
