1.
Biophys J ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38894539

ABSTRACT

Aquaporins (AQPs) are recognized as transmembrane water channels that facilitate selective water permeation through their monomeric pores. Among the AQP family, AQP6 has an intriguing anion channel activity, which is allosterically controlled by pH conditions and is eliminated by a single amino acid mutation. However, the molecular mechanism of anion permeation through AQP6 remains unclear. Using molecular dynamics simulations in the presence of a transmembrane voltage generated by an ion concentration gradient, we show that chloride ions permeate through the pore corresponding to the central axis of the AQP6 homotetramer. Under low pH conditions, the hydrophobic selectivity filter (SF), located near the extracellular part of the central pore, opens subtly, becomes wetted, and enables anion permeation. Our simulations also indicate that a single mutation (N63G) in human AQP6, located at the central pore, significantly reduces anion conduction, consistent with experimental data. Moreover, we demonstrate a pH-sensing mechanism in which the protonation of H184 and H189 under low pH conditions allosterically triggers the gating of the SF region. These results suggest a unique pH-dependent allosteric anion permeation mechanism in AQP6 and could clarify the role of the central pore in some of the AQP tetramers.

2.
Bioinformatics ; 39(12)2023 12 01.
Article in English | MEDLINE | ID: mdl-37995286

ABSTRACT

MOTIVATION: Predicting protein structures with high accuracy is a critical challenge for the broad community of life sciences and industry. Despite progress made by deep neural networks like AlphaFold2, there is a need for further improvements in the quality of detailed structures, such as side-chains, along with protein backbone structures. RESULTS: Building upon the successes of AlphaFold2, we modified the losses for side-chain torsion angles and frame aligned point error, added loss functions for side-chain confidence and secondary structure prediction, and replaced template feature generation with a new alignment method based on conditional random fields. We also performed re-optimization by conformational space annealing using a molecular mechanics energy function which integrates the potential energies obtained from distogram and side-chain prediction. In the CASP15 blind test for single protein and domain modeling (109 domains), DeepFold ranked fourth among 132 groups with improvements in the details of the structure in terms of backbone, side-chain, and MolProbity. In terms of protein backbone accuracy, DeepFold achieved a median GDT-TS score of 88.64 compared with 85.88 for AlphaFold2. For TBM-easy/hard targets, DeepFold ranked at the top based on Z-scores for GDT-TS. This shows its practical value to the structural biology community, which demands highly accurate structures. In addition, a thorough analysis of 55 domains from 39 targets with publicly available structures indicates that DeepFold shows superior side-chain accuracy and MolProbity scores among the top-performing groups. AVAILABILITY AND IMPLEMENTATION: DeepFold tools are open-source software available at https://github.com/newtonjoo/deepfold.


Subject(s)
Proteins , Software , Protein Conformation , Proteins/chemistry , Protein Structure, Secondary , Protein Folding
3.
Cell ; 136(1): 85-96, 2009 Jan 09.
Article in English | MEDLINE | ID: mdl-19135891

ABSTRACT

Condensins are key mediators of chromosome condensation across organisms. Like other condensins, the bacterial MukBEF condensin complex consists of an SMC family protein dimer containing two ATPase head domains, MukB, and two interacting subunits, MukE and MukF. We report complete structural views of the intersubunit interactions of this condensin along with ensuing studies that reveal a role for the ATPase activity of MukB. MukE and MukF together form an elongated dimeric frame, and MukF's C-terminal winged-helix domains (C-WHDs) bind MukB heads to constitute closed ring-like structures. Surprisingly, one of the two bound C-WHDs is forced to detach upon ATP-mediated engagement of MukB heads. This detachment reaction depends on the linker segment preceding the C-WHD, and mutations on the linker restrict cell growth. Thus ATP-dependent transient disruption of the MukB-MukF interaction, which creates openings in condensin ring structures, is likely to be a critical feature of the functional mechanism of condensins.


Subject(s)
Adenosine Triphosphatases/chemistry , Bacteria/chemistry , Bacterial Proteins/chemistry , DNA-Binding Proteins/chemistry , Multiprotein Complexes/chemistry , Adenosine Triphosphatases/metabolism , Adenosine Triphosphate , Bacteria/metabolism , Bacterial Proteins/metabolism , Binding Sites , DNA/metabolism , DNA-Binding Proteins/metabolism , Models, Molecular , Multiprotein Complexes/metabolism , Protein Structure, Tertiary
4.
J Comput Chem ; 44(30): 2332-2346, 2023 11 15.
Article in English | MEDLINE | ID: mdl-37585026

ABSTRACT

Conformational space annealing (CSA), a global optimization method, has been applied to various protein structure modeling tasks. In this paper, we applied CSA to the cryo-EM structure modeling task by combining the Python subroutine of CSA (PyCSA) and the fast relax (FastRelax) protocol of PyRosetta. CSA was used to refine initial structures generated by two methods: rigid fitting of predicted structures to the cryo-EM map and de novo protein modeling by tracing the cryo-EM map. In the refinement of the rigid-fitted structures, the final models showed that CSA can generate reliable atomic structures of proteins, even when large movements of protein domains were required. In the de novo modeling case, although the overall structural quality of the final models depended considerably on the initial models, the final models generated by CSA showed improved MolProbity scores and cross-correlation coefficients to the maps. These results suggest that CSA can accomplish flexible fitting and refinement together by sampling diverse conformations effectively and thus can be utilized for cryo-EM structure modeling.


Subject(s)
Proteins , Models, Molecular , Cryoelectron Microscopy/methods , Proteins/chemistry , Molecular Conformation , Protein Domains , Protein Conformation
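
The bank-based search idea behind CSA, as named in entry 4, can be illustrated with a toy sketch. This is not PyCSA and not the authors' protocol: the Rastrigin test function, the crude stochastic local minimizer, the crossover rule, and all parameter values below are assumptions chosen only to show how a bank of local minima is updated while a distance cutoff shrinks.

```python
import math
import random

def energy(x):
    """Rastrigin test function, a toy stand-in for a molecular energy."""
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

def local_minimize(x, step=0.1, iters=200, rng=random):
    """Crude stochastic descent: keep a random move only if it lowers the energy."""
    best, e_best = list(x), energy(x)
    for _ in range(iters):
        trial = [v + rng.uniform(-step, step) for v in best]
        e = energy(trial)
        if e < e_best:
            best, e_best = trial, e
    return best, e_best

def csa(dim=2, bank_size=10, rounds=30, seed=0):
    """Toy CSA loop; returns (coordinates, energy) of the best bank member."""
    rng = random.Random(seed)
    # The bank holds locally minimized solutions as (coordinates, energy) pairs.
    bank = [local_minimize([rng.uniform(-5, 5) for _ in range(dim)], rng=rng)
            for _ in range(bank_size)]
    pairs = [(i, j) for i in range(bank_size) for j in range(i + 1, bank_size)]
    d_avg = sum(math.dist(bank[i][0], bank[j][0]) for i, j in pairs) / len(pairs)
    d_cut, d_min = d_avg / 2, d_avg / 5          # assumed cutoff schedule
    shrink = (d_min / d_cut) ** (1.0 / rounds)
    for _ in range(rounds):
        seed_x, _ = rng.choice(bank)
        partner_x, _ = rng.choice(bank)
        # Crossover between two bank members, then a small random perturbation.
        child = [p if rng.random() < 0.5 else s for s, p in zip(seed_x, partner_x)]
        child = [v + rng.gauss(0, 0.3) for v in child]
        trial, e_trial = local_minimize(child, rng=rng)
        nearest = min(range(bank_size), key=lambda k: math.dist(trial, bank[k][0]))
        if math.dist(trial, bank[nearest][0]) < d_cut:
            target = nearest                      # similar solution: compete with it
        else:
            target = max(range(bank_size), key=lambda k: bank[k][1])  # else: worst member
        if e_trial < bank[target][1]:
            bank[target] = (trial, e_trial)
        d_cut = max(d_min, d_cut * shrink)        # anneal the diversity cutoff
    return min(bank, key=lambda m: m[1])

best_x, best_e = csa(seed=1)
print(best_x, best_e)
```

A large cutoff early on keeps the bank diverse; as the cutoff shrinks, the search concentrates on refining the low-energy regions already found.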
5.
Molecules ; 27(12)2022 Jun 09.
Article in English | MEDLINE | ID: mdl-35744836

ABSTRACT

Sequence-structure alignment for protein sequences is an important task for the template-based modeling of 3D structures of proteins. Building a reliable sequence-structure alignment is a challenging problem, especially for remote homologue target proteins. We built a method of sequence-structure alignment called CRFalign, which improves upon a base alignment model based on HMM-HMM comparison by employing pairwise conditional random fields in combination with nonlinear scoring functions of structural and sequence features. The nonlinear scoring part is implemented by a set of gradient boosted regression trees. In addition to sequence profile features, various position-dependent structural features are employed, including secondary structures and solvent accessibilities. Training is performed on reference alignments at the superfamily or twilight zone levels chosen from the SABmark benchmark set. We found that the CRFalign method produces a relative improvement in average alignment accuracy for validation sets of the SABmark benchmark. We also tested CRFalign on 51 sequence-structure pairs involving 15 FM target domains of CASP14, where CRFalign improved the average modeling accuracy on these hard targets (TM-CRFalign ≃ 42.94%) compared with HHalign (TM-HHalign ≃ 39.05%) and MRFalign (TM-MRFalign ≃ 36.93%). CRFalign was incorporated into our template search framework called CRFpred and was tested on a random set of 300 target proteins consisting of Easy, Medium and Hard sets, on which it showed reasonable template search performance.


Subject(s)
Algorithms , Proteins , Amino Acid Sequence , Protein Structure, Secondary , Proteins/chemistry , Sequence Alignment , Solvents
6.
Bioinformatics ; 35(14): 2411-2417, 2019 07 15.
Article in English | MEDLINE | ID: mdl-30500873

ABSTRACT

MOTIVATION: Domain boundary prediction is one of the most important problems in the study of protein structure and function. Many sequence-based domain boundary prediction methods are either template-based or machine learning (ML) based. ML-based methods often perform poorly due to their use of only local (i.e. short-range) features. These conventional features such as sequence profiles, secondary structures and solvent accessibilities are typically restricted to be within 20 residues of the domain boundary candidate. RESULTS: To address the performance of ML-based methods, we developed a new protein domain boundary prediction method (ConDo) that utilizes novel long-range features such as coevolutionary information in addition to the aforementioned local window features as inputs for ML. Toward this purpose, two types of coevolutionary information were extracted from multiple sequence alignment using direct coupling analysis: (i) partially aligned sequences, and (ii) correlated mutation information. Both the partially aligned sequence information and the modularity of residue-residue couplings possess long-range correlation information. AVAILABILITY AND IMPLEMENTATION: https://github.com/gicsaw/ConDo.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Machine Learning , Proteins/chemistry , Protein Domains , Protein Structure, Secondary , Sequence Alignment
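
The "local window" features that entry 6 contrasts with long-range features can be pictured with a minimal sketch. The function name, the zero padding at chain ends, and the example values are assumptions for illustration; this is not taken from the ConDo code.

```python
def window_features(per_residue, center, half_width=20, pad=0.0):
    """Collect per-residue values within half_width residues of a candidate position.

    Positions that fall off either end of the chain are filled with `pad`
    (an assumed convention), so every window has 2 * half_width + 1 entries.
    """
    n = len(per_residue)
    return [per_residue[i] if 0 <= i < n else pad
            for i in range(center - half_width, center + half_width + 1)]

# Hypothetical per-residue solvent accessibilities for a 100-residue chain.
accessibility = [float(i % 7) for i in range(100)]
features = window_features(accessibility, center=50)
print(len(features))
```

Because such a window only sees residues close to the candidate boundary in sequence, it cannot capture couplings between distant residues, which is the gap the coevolutionary features described in the abstract are meant to fill.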
7.
J Chem Inf Model ; 60(3): 1844-1864, 2020 03 23.
Article in English | MEDLINE | ID: mdl-31999919

ABSTRACT

The method for protein-structure prediction, which combines the physics-based coarse-grained UNRES force field with knowledge-based modeling, has been developed further and tested in the 13th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13). The method implements restraints from the consensus fragments common to server models. In this work, the server models to derive fragments have been chosen on the basis of quality assessment; a fully automatic fragment-selection procedure has been introduced, and Dynamic Fragment Assembly pseudopotentials have been fully implemented. The Global Distance Test Score (GDT_TS), averaged over our "Model 1" predictions, increased by over 10 units with respect to CASP12 for the free-modeling category to reach 40.82. Our "Model 1" predictions ranked 20 and 14 for all and free-modeling targets, respectively (upper 20.2% and 14.3% of all models submitted to CASP13 in these categories, respectively), compared to 27 (upper 21.1%) and 24 (upper 18.9%) in CASP12, respectively. For oligomeric targets, the Interface Patch Similarity (IPS) and Interface Contact Similarity (ICS) averaged over our best oligomer models increased from 0.28 to 0.36 and from 12.4 to 17.8, respectively, from CASP12 to CASP13, and top-ranking models of 2 targets (H0968 and T0997o) were obtained (none in CASP12). The improvement of our method in CASP13 over CASP12 was ascribed to the combined effect of the overall enhancement of server-model quality, our success in selecting server models and fragments to derive restraints, and improvements of the restraint and potential-energy functions.


Subject(s)
Algorithms , Proteins , Computational Biology , Consensus , Models, Molecular , Protein Conformation
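
Several entries in this list report GDT_TS values. As a rough guide to what the number means, the sketch below averages the percentage of residues whose Cα deviation is within 1, 2, 4 and 8 Å. It assumes the deviations come from a single, already chosen superposition; the official score searches over many superpositions for each cutoff, so this simplified version can only underestimate it. The function name and input values are hypothetical.

```python
def gdt_ts(distances):
    """Approximate GDT_TS from per-residue C-alpha deviations in angstroms."""
    if not distances:
        raise ValueError("need at least one residue")
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances)
    # Fraction of residues within each cutoff, averaged and scaled to 0-100.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

deviations = [0.5, 1.5, 3.0, 6.0, 10.0]  # made-up deviations for five residues
print(gdt_ts(deviations))
```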
8.
Int J Mol Sci ; 21(7)2020 Apr 01.
Article in English | MEDLINE | ID: mdl-32244797

ABSTRACT

Human SNF5 and BAF155 constitute the core subunit of multi-protein SWI/SNF chromatin-remodeling complexes that are required for ATP-dependent nucleosome mobility and transcriptional control. Human SNF5 (hSNF5) utilizes its repeat 1 (RPT1) domain to associate with the SWIRM domain of BAF155. Here, we employed X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and various biophysical methods in order to investigate the detailed binding mechanism between hSNF5 and BAF155. Multi-angle light scattering data clearly indicate that hSNF5(171-258) and BAF155(SWIRM) are both monomeric in solution and they form a heterodimer. NMR data and crystal structure of the hSNF5(171-258)/BAF155(SWIRM) complex further reveal a unique binding interface, which involves a coil-to-helix transition upon protein binding. The newly formed αN helix of hSNF5(171-258) interacts with the β2-α1 loop of hSNF5 via hydrogen bonds and it also displays a hydrophobic interaction with BAF155(SWIRM). Therefore, the N-terminal region of hSNF5(171-258) plays an important role in tumorigenesis and our data will provide a structural clue for the pathogenesis of Rhabdoid tumors and malignant melanomas that originate from mutations in the N-terminal loop region of hSNF5.


Subject(s)
Chromatin Assembly and Disassembly/genetics , Mutation , Nucleosomes/genetics , SMARCB1 Protein/genetics , Transcription Factors/genetics , Binding Sites/genetics , Circular Dichroism , Crystallography, X-Ray , Gene Expression Regulation , Humans , Magnetic Resonance Spectroscopy , Melanoma/genetics , Melanoma/metabolism , Melanoma/pathology , Nucleosomes/metabolism , Protein Binding , Rhabdoid Tumor/genetics , Rhabdoid Tumor/metabolism , Rhabdoid Tumor/pathology , SMARCB1 Protein/chemistry , SMARCB1 Protein/metabolism , Transcription Factors/chemistry , Transcription Factors/metabolism
9.
Proteins ; 86 Suppl 1: 240-246, 2018 03.
Article in English | MEDLINE | ID: mdl-29341255

ABSTRACT

In CASP12, 2 types of data-assisted protein structure modeling were tested. Either SAXS experimental data or cross-linking experimental data was provided for a selected number of CASP12 targets, which the CASP12 predictors could utilize for better protein structure modeling. We devised 2 separate energy terms for SAXS data and cross-linking data to drive the model structures into more native-like structures that satisfied the given experimental data as much as possible. In CASP11, we successfully performed protein structure modeling using simulated sparse and ambiguously assigned NOE data and/or correct residue-residue contact information, where the only energy term that folded the protein into its native structure was the term which originated from the given experimental data. However, the 2 types of experimental data provided in CASP12 were far from sufficient to fold the target protein into its native structure because SAXS data provides only the overall shape of the molecule and the cross-linking contact information provides only very low-resolution distance information. For this reason, we combined the SAXS or cross-linking energy term with our regular modeling energy function that includes both the template energy term and the de novo energy terms. By optimizing the newly formulated energy function, we obtained protein models that fit the provided SAXS data better than the X-ray structure of the target did. However, the improvement of the model relative to the one modeled without the SAXS data was not significant. Consistent structural improvement was achieved by incorporating cross-linking data into the protein structure modeling.


Subject(s)
Computational Biology/methods , Databases, Protein , Models, Molecular , Protein Conformation , Proteins/chemistry , Scattering, Small Angle , Algorithms , Humans , Molecular Dynamics Simulation , X-Ray Diffraction
10.
Proteins ; 86 Suppl 1: 361-373, 2018 03.
Article in English | MEDLINE | ID: mdl-28975666

ABSTRACT

Methods to reliably estimate the quality of 3D models of proteins are essential drivers for the wide adoption and serious acceptance of protein structure predictions by life scientists. In this article, the most successful groups in CASP12 describe their latest methods for estimates of model accuracy (EMA). We show that pure single model accuracy estimation methods have shown clear progress since CASP11; the 3 top methods (MESHI, ProQ3, SVMQA) all perform better than the top method of CASP11 (ProQ2). Although the pure single model accuracy estimation methods outperform quasi-single (ModFOLD6 variations) and consensus methods (Pcons, ModFOLDclust2, Pcomb-domain, and Wallner) in model selection, they are still not as good as those methods in absolute model quality estimation and predictions of local quality. Finally, we show that when using contact-based model quality measures (CAD, lDDT) the single model quality methods perform relatively better.


Subject(s)
Computational Biology/methods , Models, Molecular , Protein Conformation , Proteins/chemistry , Databases, Protein , Humans , Sequence Alignment , Sequence Analysis, Protein
11.
Proteins ; 86 Suppl 1: 122-135, 2018 03.
Article in English | MEDLINE | ID: mdl-29159837

ABSTRACT

For protein structure modeling in the CASP12 experiment, we have developed a new protocol based on our previous CASP11 approach. The global optimization method of conformational space annealing (CSA) was applied to 3 stages of modeling: multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain re-modeling. For better template selection and model selection, we updated our model quality assessment (QA) method with the newly developed SVMQA (support vector machine for quality assessment). For 3D chain building, we updated our energy function by including restraints generated from predicted residue-residue contacts. New energy terms for the predicted secondary structure and predicted solvent accessible surface area were also introduced. For difficult targets, we proposed a new method, LEEab, where the template term played a less significant role than it did in LEE, complemented by increased contributions from other terms such as the predicted contact term. For TBM (template-based modeling) targets, LEE performed better than LEEab, but for FM targets, LEEab was better. For model refinement, we modified our CASP11 molecular dynamics (MD) based protocol by using explicit solvents and tuning down restraint weights. Refinement results from MD simulations that used a new augmented statistical energy term in the force field were quite promising. Finally, when using inaccurate information (such as the predicted contacts), it was important to use the Lorentzian function for which the maximal penalty arising from wrong information is always bounded.


Subject(s)
Computational Biology/methods , Machine Learning , Models, Molecular , Molecular Dynamics Simulation , Protein Conformation , Protein Folding , Proteins/chemistry , Algorithms , Crystallography, X-Ray , Humans , Models, Statistical , Protein Interaction Domains and Motifs , Sequence Analysis, Protein , Support Vector Machine
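
The closing remark of entry 11, that a Lorentzian function keeps the penalty from wrong information bounded, can be made concrete with a small sketch. The functional form below is one common Lorentzian-type choice and is an assumption, not the exact energy term used in the paper; the distances and width are made up.

```python
def harmonic_penalty(d, d0, sigma):
    """Unbounded penalty: grows as the square of the deviation from d0."""
    x = (d - d0) / sigma
    return x * x

def lorentzian_penalty(d, d0, sigma, weight=1.0):
    """Bounded penalty: approaches `weight` but never exceeds it (assumed form)."""
    x = (d - d0) / sigma
    return weight * x * x / (1.0 + x * x)

# A predicted contact at 8 A that is wrong: the true distance is 30 A.
print(harmonic_penalty(30.0, 8.0, 1.0))
print(lorentzian_penalty(30.0, 8.0, 1.0))
```

With the harmonic form a single wrong contact can dominate the total energy, whereas the Lorentzian-type form caps its contribution at `weight`, so correct restraints and other energy terms can still outvote it.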
12.
Biochim Biophys Acta Proteins Proteom ; 1866(2): 205-213, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29122686

ABSTRACT

We have analyzed the crystal structure of the dimeric form of d-glycero-d-manno-heptose-1,7-bisphosphate phosphatase from Burkholderia thailandensis (BtGmhB), which catalyzes the removal of the phosphate at the 7 position of d-glycero-d-manno-heptose-1,7-bisphosphate. The crystal structure of BtGmhB revealed a dimeric form caused by a disruption of a short zinc-binding loop. The dimeric BtGmhB structure was induced by the loss of Zn2+ via the protonation of cysteine residues at pH 4.8, the pH of the crystallization condition. Similarly, the addition of EDTA also causes the dimerization of BtGmhB. It appears there are two dimeric forms in solution, with and without the disulfide bridge mediated by Cys95. The disulfide-free dimer produced by the loss of Zn2+ in the short zinc-binding loop is further converted to a stable disulfide-bonded dimer in vitro. Though the two dimeric forms are reversible, both of them are inactive due to a deformation of the active site. Single and triple mutant experiments confirmed the presence of two dimeric forms in vitro. Phosphatase assay results showed that only the zinc-bound monomeric form has catalytic activity, in contrast to the inactive zinc-free dimeric forms. The monomer-to-dimer transition caused by the loss of Zn2+ observed in this study is the reverse of the phenomenon reported for artificial proteins containing engineered zinc-finger motifs, in which monomer-to-dimer transitions occurred in the presence of Zn2+. Therefore, this unusual dimerization process may be applicable to designing proteins possessing a short zinc-binding loop with a novel regulatory role.


Subject(s)
Bacterial Proteins/chemistry , Burkholderia/enzymology , Phosphoric Monoester Hydrolases/chemistry , Protein Engineering , Protein Multimerization , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Burkholderia/genetics , Phosphoric Monoester Hydrolases/genetics , Phosphoric Monoester Hydrolases/metabolism , Protein Structure, Secondary
13.
J Comput Chem ; 38(31): 2730-2746, 2017 12 05.
Article in English | MEDLINE | ID: mdl-28940211

ABSTRACT

Molecular simulations restrained to single or multiple templates are commonly used in protein-structure modeling. However, the restraints introduce additional barriers, thus impairing the ergodicity of simulations, which can affect the quality of the resulting models. In this work, the effect of restraint types and simulation schemes on ergodicity and model quality was investigated by performing template-restrained canonical molecular dynamics (MD), multiplexed replica-exchange molecular dynamics, and Hamiltonian replica exchange molecular dynamics (HREMD) simulations with the coarse-grained UNRES force field on nine selected proteins, with pseudo-harmonic log-Gaussian (unbounded) or Lorentzian (bounded) restraint functions. The best ergodicity was exhibited by HREMD. It has been found that non-ergodicity does not affect model quality if good templates are used to generate restraints. However, when poor-quality restraints not covering the entire protein are used, the improved ergodicity of HREMD can lead to significantly improved protein models. © 2017 Wiley Periodicals, Inc.


Subject(s)
Proteins/chemistry , Algorithms , Databases, Protein , Molecular Dynamics Simulation , Protein Conformation , Temperature , Thermodynamics
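
For readers unfamiliar with the replica-exchange schemes compared in entry 13, the sketch below shows the standard Metropolis acceptance rule for swapping two replicas that differ only in temperature. It is a textbook formula, not code from the study; the Hamiltonian variant (HREMD) uses a different criterion based on evaluating each configuration under both Hamiltonians. The energies, temperatures and the choice of kcal/mol units are assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def exchange_probability(e_i, e_j, t_i, t_j, k_b=K_B):
    """Probability of swapping configurations between temperatures t_i and t_j.

    e_i is the potential energy of the configuration currently at t_i,
    e_j that of the configuration currently at t_j.
    Acceptance is min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# The colder replica (300 K) holds the higher energy, so the swap is always accepted.
print(exchange_probability(-90.0, -100.0, 300.0, 350.0))
# The colder replica already holds the lower energy, so the swap is accepted only sometimes.
print(exchange_probability(-100.0, -90.0, 300.0, 350.0))
```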
14.
J Chem Inf Model ; 57(5): 1068-1078, 2017 05 22.
Article in English | MEDLINE | ID: mdl-28398048

ABSTRACT

We have developed a protein loop structure prediction method by combining a new energy function, which we call EPLM (energy for protein loop modeling), with the conformational space annealing (CSA) global optimization algorithm. The energy function includes stereochemistry, dynamic fragment assembly, distance-scaled finite ideal gas reference (DFIRE), and generalized orientation- and distance-dependent terms. For the conformational search of loop structures, we used the CSA algorithm, which has been quite successful in dealing with various hard global optimization problems. We assessed the performance of EPLM with two widely used loop-decoy sets, Jacobson and RAPPER, and compared the results against the DFIRE potential. The accuracy of model selection from a pool of loop decoys as well as de novo loop modeling starting from randomly generated structures was examined separately. For the selection of a nativelike structure from a decoy set, EPLM was more accurate than DFIRE in the case of the Jacobson set and had similar accuracy in the case of the RAPPER set. In terms of sampling more nativelike loop structures, EPLM outperformed DFIRE for both decoy sets. This new approach equipped with EPLM and CSA can serve as the state-of-the-art de novo loop modeling method.


Subject(s)
Biochemistry/methods , Models, Chemical , Proteins/chemistry , Protein Conformation , Protein Folding
15.
Proteins ; 84 Suppl 1: 189-99, 2016 09.
Article in English | MEDLINE | ID: mdl-26677100

ABSTRACT

We have applied the conformational space annealing method to the contact-assisted protein structure modeling in CASP11. For Tp targets, where predicted residue-residue contact information was provided, the contact energy term in the form of the Lorentzian function was implemented together with the physical energy terms used in our template-free modeling of proteins. Although we observed some structural improvement of Tp models over the models predicted without the Tp information, the improvement was not substantial on average. This is partly due to the inaccuracy of the provided contact information, where only about 18% of it was correct. For Ts targets, where the information of ambiguous NOE (Nuclear Overhauser Effect) restraints was provided, we formulated the modeling in terms of the two-tier optimization problem, which covers: (1) the assignment of NOE peaks and (2) the three-dimensional (3D) model generation based on the assigned NOEs. Although solving the problem in a direct manner appears to be intractable at first glance, we demonstrate through CASP11 that remarkably accurate protein 3D modeling is possible by brute force optimization of a relevant energy function. For 19 Ts targets of the average size of 224 residues, generated protein models were of about 3.6 Å Cα atom accuracy. Even greater structural improvement was observed when additional Tc contact information was provided. For 20 out of the total 24 Tc targets, we were able to generate protein structures which were better than the best model from the rest of the CASP11 groups in terms of GDT-TS. Proteins 2016; 84(Suppl 1):189-199. © 2015 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data , Models, Molecular , Models, Statistical , Proteins/chemistry , Software , Algorithms , Amino Acid Motifs , Computational Biology/methods , Computer Simulation , Databases, Protein , Internet , Nuclear Magnetic Resonance, Biomolecular , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Folding , Protein Interaction Domains and Motifs , Thermodynamics
16.
Proteins ; 84 Suppl 1: 118-30, 2016 09.
Article in English | MEDLINE | ID: mdl-26474186

ABSTRACT

For the template-free modeling of human targets of CASP11, we utilized two of our modeling protocols, LEE and LEER. The LEE protocol took CASP11-released server models as the input and used some of them as templates for 3D (three-dimensional) modeling. The template selection procedure was based on the clustering of the server models aided by a community detection method of a server-model network. Restraining energy terms generated from the selected templates together with physical and statistical energy terms were used to build 3D models. Side-chains of the 3D models were rebuilt using target-specific consensus side-chain library along with the SCWRL4 rotamer library, which completed the LEE protocol. The first success factor of the LEE protocol was due to efficient server model screening. The average backbone accuracy of selected server models was similar to that of top 30% server models. The second factor was that a proper energy function along with our optimization method guided us, so that we successfully generated better quality models than the input template models. In 10 out of 24 cases, better backbone structures than the best of input template structures were generated. LEE models were further refined by performing restrained molecular dynamics simulations to generate LEER models. CASP11 results indicate that LEE models were better than the average template models in terms of both backbone structures and side-chain orientations. LEER models were of improved physical realism and stereo-chemistry compared to LEE models, and they were comparable to LEE models in the backbone accuracy. Proteins 2016; 84(Suppl 1):118-130. © 2015 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data , Models, Statistical , Molecular Dynamics Simulation , Proteins/chemistry , Software , Algorithms , Amino Acid Motifs , Bacteria/chemistry , Computational Biology/methods , Databases, Protein , Humans , Internet , Protein Folding , Protein Interaction Domains and Motifs , Stereoisomerism , Thermodynamics , Viruses/chemistry
17.
Proteins ; 84 Suppl 1: 221-32, 2016 09.
Article in English | MEDLINE | ID: mdl-26329522

ABSTRACT

For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.


Subject(s)
Computational Biology/statistics & numerical data , Models, Molecular , Models, Statistical , Proteins/chemistry , Software , Algorithms , Amino Acid Sequence , Computational Biology/methods , Computer Simulation , Databases, Protein , Humans , Internet , Protein Folding , Protein Interaction Domains and Motifs , Protein Structure, Secondary , Sequence Alignment , Structural Homology, Protein , Thermodynamics
18.
J Chem Inf Model ; 56(11): 2263-2279, 2016 11 28.
Article in English | MEDLINE | ID: mdl-27749055

ABSTRACT

Recently, we developed a new approach to protein-structure prediction, which combines template-based modeling with the physics-based coarse-grained UNited RESidue (UNRES) force field. In this approach, restrained multiplexed replica-exchange molecular dynamics simulations are carried out with UNRES, with Cα-distance and virtual-bond-dihedral-angle restraints derived from knowledge-based models. In this work, we report a test of this approach in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11), in which we used the template-based models from early-stage predictions by the LEE group CASP11 server (group 038, called "nns"), and we describe further improvement of the method. The quality of the models obtained in CASP11 was better than that resulting from unrestrained UNRES simulations; however, the obtained models were generally worse than the final nns models. Calculations with the final nns models, performed after CASP11, resulted in substantial improvement, especially for multi-domain proteins. Based on these results, we modified the procedure by deriving restraints from models from multiple servers, in this study the four top-performing servers in CASP11 (nns, BAKER-ROSETTASERVER, Zhang-server, and QUARK), and implementing either all restraints or only the restraints on the fragments that appear similar in the majority of models (the consensus fragments), with outlier models discarded. Tests with 29 CASP11 human-prediction targets shorter than 400 amino-acid residues demonstrated that the consensus-fragment approach gave better results than the best of the parent server models, i.e., lower α-carbon root-mean-square deviation from the experimental structures and higher template modeling score and global distance test total score values. Apart from global improvement (repacking and improving the orientation of domains and other substructures), improvement was also reached for template-based modeling targets, indicating that the approach has refinement capacity. Therefore, the consensus-fragment analysis is able to remove lower-quality models and poor-quality parts of the models without knowing the experimental structure.
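The abstract does not say how the consensus fragments are identified, so the following Python sketch is only an illustration of the general idea of keeping restraints that a majority of server models agree on. The function name, the input layout (one dictionary of Cα-Cα distances per model), the use of the median as the consensus value, and the 1.0 Å tolerance are all assumptions made here, not the authors' procedure.

```python
from statistics import median


def consensus_restraints(model_distances, tolerance=1.0):
    """Keep a distance restraint only where most models agree.

    model_distances: list of dicts, one per server model, mapping a residue
    pair (i, j) to its C-alpha distance in angstroms (hypothetical layout).
    tolerance: assumed agreement cutoff in angstroms.
    Returns a dict mapping each retained pair to the median distance.
    """
    if not model_distances:
        raise ValueError("at least one model is required")

    n_models = len(model_distances)

    # Only residue pairs present in every model can be compared.
    shared_pairs = set(model_distances[0])
    for distances in model_distances[1:]:
        shared_pairs &= set(distances)

    restraints = {}
    for pair in sorted(shared_pairs):
        values = [distances[pair] for distances in model_distances]
        centre = median(values)
        agreeing = sum(1 for v in values if abs(v - centre) <= tolerance)
        # Strict majority of models must lie close to the median.
        if 2 * agreeing > n_models:
            restraints[pair] = centre
    return restraints


# Four hypothetical server models: they agree on pair (1, 5) except for one
# outlier, and disagree completely on pair (1, 20).
models = [
    {(1, 5): 6.0, (1, 20): 10.0},
    {(1, 5): 6.2, (1, 20): 15.0},
    {(1, 5): 5.9, (1, 20): 20.0},
    {(1, 5): 12.0, (1, 20): 25.0},
]
kept = consensus_restraints(models)
print(kept)
```

In this toy input only the pair (1, 5) survives, because three of the four models lie within the tolerance of the median, whereas no model is close to the median for the pair (1, 20).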


Subject(s)
Caspases/chemistry , Consensus Sequence , Molecular Dynamics Simulation , Peptide Fragments/chemistry , Humans , Protein Conformation
19.
BMC Bioinformatics ; 16: 94, 2015 Mar 21.
Article in English | MEDLINE | ID: mdl-25886990

ABSTRACT

BACKGROUND: In template-based modeling with a single template, inter-atomic distances of an unknown protein structure are assumed to follow Gaussian probability density functions whose peaks are centered at the distances between corresponding atoms in the template structure. The width of the Gaussian distribution, i.e., the variability of a spatial restraint, is closely related to the reliability of the restraint information extracted from a template, and it should be accurately estimated for successful template-based protein structure modeling. RESULTS: To predict the variability of the spatial restraints in template-based modeling, we have devised a prediction model, Sigma-RF, by using the random forest (RF) algorithm. The benchmark results on 22 CASP9 targets show that the variability values from Sigma-RF correlate more strongly with the true distance deviation than those from Modeller. We assessed the effect of the new sigma values by performing single-domain homology modeling of 22 CASP9 targets and 24 CASP10 targets. For most of the targets tested, we could obtain more accurate 3D models from the identical alignments by using the Sigma-RF results than by using the Modeller ones. CONCLUSIONS: By investigating the importance of the input features used in the RF machine learning, we find that the average alignment quality of residues located between and at two aligned residues, which is quasi-local information, is the largest contributing factor. This average alignment quality is shown to be more important than the previously identified local-information quantity: the product of alignment qualities at two aligned residues.
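The role of the Gaussian width described in the BACKGROUND can be illustrated with a short Python sketch. It evaluates the negative log of a Gaussian density centered at the template distance, which is the standard way such a restraint becomes a penalty term. The function name and the numeric values are assumptions chosen for illustration, and this is neither Sigma-RF nor Modeller code.

```python
import math


def gaussian_restraint_penalty(d, d_template, sigma):
    """Negative log of a Gaussian density centred at the template distance.

    d: inter-atomic distance in the model (angstroms).
    d_template: distance between the corresponding atoms in the template.
    sigma: width of the Gaussian, i.e. the variability of the restraint.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (d - d_template) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))


# The same 1.0 angstrom deviation from the template, scored with a narrow
# and a wide assumed width.
tight = gaussian_restraint_penalty(11.0, 10.0, sigma=0.5)
loose = gaussian_restraint_penalty(11.0, 10.0, sigma=2.0)
print(f"sigma=0.5: {tight:.3f}   sigma=2.0: {loose:.3f}")
```

A narrow width penalizes the deviation more heavily than a wide one, which is why an overconfident sigma on an unreliable template region can pull the model toward a wrong distance.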


Subject(s)
Algorithms , Models, Statistical , Structural Homology, Protein , Artificial Intelligence , Models, Molecular , Sequence Alignment
20.
Proteins ; 83(12): 2251-62, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26454251

ABSTRACT

We have carried out numerical experiments to investigate whether the global optimization method of conformational space annealing (CSA) can improve NMR protein structure determination relative to existing PDB structures. NMR protein structure determination is driven by the optimization of collective multiple restraints arising from experimental data and the basic stereochemical properties of a protein-like molecule. By rigorous and straightforward application of CSA to the identical NMR experimental data used to generate existing PDB structures, we redetermined 56 recent PDB protein structures starting from fully randomized structures. The quality of CSA-generated structures and existing PDB structures was assessed by multiobjective functions in terms of their consistency with experimental data and the requirements of protein-like stereochemistry. In 54 out of 56 cases, CSA-generated structures were better than existing PDB structures in a Pareto-dominant manner, while the remaining two cases were ties with mixed results. As a whole, all structural features tested improved in a statistically meaningful manner. The most improved feature was the Ramachandran-favored portion of backbone torsion angles, with an improvement of about 8.6 percentage points, from 88.9% to 97.5% (P-value < 10⁻¹⁷). We show that by straightforward application of CSA to the efficient global optimization of an energy function, NMR structures will be of better quality than existing PDB structures.
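The Pareto-dominance comparison mentioned in the abstract can be sketched in Python as follows. The sketch assumes that each structure is summarized by a tuple of scores for which lower is better; the two objectives and their values are hypothetical and are not taken from the paper.

```python
def pareto_dominates(a, b):
    """Return True if score tuple a Pareto-dominates score tuple b.

    Lower is assumed to be better for every objective: a must be no worse
    than b everywhere and strictly better on at least one objective.
    """
    if len(a) != len(b):
        raise ValueError("score tuples must have the same length")
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def compare_structures(first, second):
    """Classify a pair of structures as a win for one side or a tie."""
    if pareto_dominates(first, second):
        return "first dominates"
    if pareto_dominates(second, first):
        return "second dominates"
    return "tie"


# Hypothetical scores: (restraint violation, stereochemistry violation).
print(compare_structures((0.1, 1.0), (0.3, 2.0)))  # better on both objectives
print(compare_structures((0.1, 3.0), (0.3, 2.0)))  # mixed result
```

Under this definition a mixed result, where each structure is better on some objective, counts as a tie, which matches how the abstract describes the two remaining cases.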


Subject(s)
Nuclear Magnetic Resonance, Biomolecular/methods , Protein Conformation , Proteins/chemistry , Databases, Protein , Models, Molecular