1.

Ding, Wenze; Nakai, Kenta; Gong, Haipeng.

Brief Bioinform ; 23(3)2022 05 13.

Article in English | MEDLINE | ID: mdl-35348602

ABSTRACT

Proteins with desired functions and properties are important in fields like nanotechnology and biomedicine. De novo protein design enables the production of previously unseen proteins from the ground up and is regarded as key to addressing real societal challenges. The recent introduction of deep learning into design methods has had a transformative influence and is expected to represent a promising and exciting future direction. In this review, we survey the major aspects of current advances in deep-learning-based design procedures and illustrate their novelty in comparison with conventional knowledge-based approaches through notable cases. We not only describe deep learning developments in structure-based protein design and direct sequence design, but also highlight recent applications of deep reinforcement learning in protein design. Future perspectives on design goals, challenges and opportunities are also comprehensively discussed.

Subject(s)

Deep Learning , Knowledge Bases , Proteins

2.

Study on the diagnostic value of MDCT extramural vascular invasion in preoperative N staging of gastric cancer patients.

Zhu, Zhengqi; Mao, Mimi; Song, Anyi; Gong, Haipeng; Gu, Jianan; Dai, Yongfeng; Feng, Feng.

BMC Med Imaging ; 24(1): 20, 2024 Jan 19.

Article in English | MEDLINE | ID: mdl-38243288

ABSTRACT

BACKGROUND: To explore the diagnostic value of multidetector computed tomography (MDCT) extramural vascular invasion (EMVI) in preoperative N staging of gastric cancer patients. METHODS: Based on the MR-defined EMVI scoring standard for rectal cancer, we developed a 5-point scoring system to evaluate the status of CT-detected extramural vascular invasion (ctEMVI): scores of 0-2 were ctEMVI-negative and scores of 3-4 were ctEMVI-positive. Patients were divided into a ctEMVI-positive group and a ctEMVI-negative group. The correlation between ctEMVI and clinical features was analyzed. Receiver operating characteristic (ROC) curves were used to evaluate the diagnostic efficacy of ctEMVI for pathological metastatic lymph nodes and N staging. The sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) of pathological N staging using ctEMVI and short-axis diameter were generated and compared. RESULTS: The occurrence rate of lymphovascular invasion (LVI) and the proportion of tumors with a greatest diameter > 6 cm were higher in the ctEMVI-positive group than in the ctEMVI-negative group (P < 0.05). Spearman correlation analysis showed a positive correlation between ctEMVI and LVI, N stage, and tumor size (P < 0.05). For ctEMVI scores ≥ 3, the AUCs of ctEMVI for diagnosing lymph node metastasis, N stage ≥ N2, and N3 stage were 0.857, 0.802, and 0.758, respectively. The sensitivity, NPV and accuracy of ctEMVI for diagnosing N stage ≥ N2 were superior to those of short-axis diameter (P < 0.05), while sensitivity, specificity, PPV, NPV, and accuracy of ctEMVI for diagnosing N3 stage were superior to those of short-axis diameter (P < 0.05). CONCLUSION: ctEMVI has important value in diagnosing metastatic lymph nodes and advanced N staging. As an important imaging marker, ctEMVI can be included in the preoperative imaging evaluation of patients, providing important assistance for clinical guidance and treatment.
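
The diagnostic measures named in this abstract can be illustrated with a minimal Python sketch. It assumes a per-patient ctEMVI score (0-4) and a binary pathological reference label; the function name `diagnostic_metrics` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def diagnostic_metrics(scores: Sequence[int],
                       labels: Sequence[int],
                       threshold: int = 3) -> Dict[str, float]:
    """Threshold a score and compare it with a binary reference standard.

    A score >= threshold counts as a positive call (the abstract treats
    ctEMVI scores of 3-4 as positive). Labels are 1 for disease present
    and 0 for disease absent.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive and not label:
            fp += 1
        elif not predicted_positive and label:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a class is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Toy data only: eight hypothetical patients.
toy_scores = [0, 1, 2, 3, 4, 4, 3, 1]
toy_labels = [0, 0, 1, 1, 1, 1, 0, 0]
toy_result = diagnostic_metrics(toy_scores, toy_labels)
```

Sweeping `threshold` over all score values and plotting sensitivity against 1 - specificity gives the points of an ROC curve; the AUC values quoted in the abstract come from the study's own data, not from this sketch.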

Subject(s)

Multidetector Computed Tomography , Stomach Neoplasms , Humans , Stomach Neoplasms/diagnostic imaging , Stomach Neoplasms/surgery , Stomach Neoplasms/pathology , Neoplasm Invasiveness/diagnostic imaging , Neoplasm Invasiveness/pathology , Retrospective Studies , Lymph Nodes/pathology , Neoplasm Staging

3.

BERT-Kcr: prediction of lysine crotonylation sites by a transfer learning method with pre-trained BERT models.

Qiao, Yanhua; Zhu, Xiaolei; Gong, Haipeng.

Bioinformatics ; 38(3): 648-654, 2022 01 12.

Article in English | MEDLINE | ID: mdl-34643684

ABSTRACT

MOTIVATION: As one of the most important post-translational modifications (PTMs), protein lysine crotonylation (Kcr) has attracted wide attention because it is involved in important physiological activities, such as cell differentiation and metabolism. However, experimental methods for Kcr identification are expensive and time-consuming. Instead, computational methods can predict Kcr sites in silico with high efficiency and low cost. RESULTS: In this study, we proposed a novel predictor, BERT-Kcr, for protein Kcr site prediction, which was developed by using a transfer learning method with pre-trained bidirectional encoder representations from transformers (BERT) models. These models were originally used for natural language processing (NLP) tasks, such as sentence classification. Here, we converted each amino acid into a word as the input information to the pre-trained BERT model. The features encoded by BERT were extracted and then fed to a BiLSTM network to build our final model. Compared with the models built by other machine learning and deep learning classifiers, BERT-Kcr achieved the best performance with AUROC of 0.983 for 10-fold cross validation. Further evaluation on the independent test set indicates that BERT-Kcr outperforms the state-of-the-art model Deep-Kcr with an improvement of about 5% for AUROC. The results of our experiment indicate that the direct use of sequence information and advanced pre-trained models of NLP could be an effective way for identifying PTM sites of proteins. AVAILABILITY AND IMPLEMENTATION: The BERT-Kcr model is publicly available at http://zhulab.org.cn/BERT-Kcr_models/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
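
The idea of treating each amino acid as a word can be sketched in a few lines of Python. This is only an illustration of the input-preparation step: the flank size, the padding character and the function name `lysine_windows` are assumptions, and the sketch does not reproduce the authors' preprocessing or call any BERT model.

```python
from typing import List, Tuple


def lysine_windows(sequence: str,
                   flank: int = 15,
                   pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, space-separated window) for every lysine.

    Each window is centred on a K residue and has 2 * flank + 1 residues.
    Windows that run past the sequence ends are padded with `pad`.
    Joining residues with spaces turns the peptide into a "sentence"
    whose words are single amino acids.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for index, residue in enumerate(sequence):
        if residue == "K":
            # index in `sequence` maps to index + flank in `padded`,
            # so the window starts at `index` in the padded string.
            window = padded[index:index + 2 * flank + 1]
            windows.append((index + 1, " ".join(window)))
    return windows


# Hypothetical short peptide, small flank for readability.
example_windows = lysine_windows("MAKTK", flank=2)
```

Each resulting string could then be passed to a word-level tokenizer; which tokenizer and vocabulary to use depends on the pre-trained model and is not specified here.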

Subject(s)

Lysine , Machine Learning , Lysine/metabolism , Language , Natural Language Processing , Protein Processing, Post-Translational

4.

SAMF: a self-adaptive protein modeling framework.

Ding, Wenze; Xu, Qijiang; Liu, Siyuan; Wang, Tong; Shao, Bin; Gong, Haipeng; Liu, Tie-Yan.

Bioinformatics ; 37(22): 4075-4082, 2021 11 18.

Article in English | MEDLINE | ID: mdl-34042965

ABSTRACT

MOTIVATION: Gradient descent-based protein modeling is a popular protein structure prediction approach that takes as input the predicted inter-residue distances and other necessary constraints and folds protein structures by minimizing protein-specific energy potentials. The constraints from multiple predicted protein properties provide redundant and sometimes conflicting information that can trap the optimization process in local minima and impair the modeling efficiency. RESULTS: To address these issues, we developed a self-adaptive protein modeling framework, SAMF. It eliminates redundancy of constraints and resolves conflicts, folds protein structures in an iterative way, and picks the best structures by a deep quality analysis system. Without a large amount of complicated domain knowledge and numerous patches as barriers, SAMF achieves state-of-the-art performance by exploiting the power of cutting-edge techniques of deep learning. SAMF has a modular design and can be easily customized and extended. As the quality of input constraints is ever growing, the superiority of SAMF will be amplified over time. AVAILABILITY AND IMPLEMENTATION: The source code and data for reproducing the results are available at https://msracb.blob.core.windows.net/pub/psp/SAMF.zip. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Proteins , Software , Proteins/metabolism

5.

RDb₂C2: an improved method to identify the residue-residue pairing in β strands.

Shao, Di; Mao, Wenzhi; Xing, Yaoguang; Gong, Haipeng.

BMC Bioinformatics ; 21(1): 133, 2020 Apr 03.

Article in English | MEDLINE | ID: mdl-32245403

ABSTRACT

BACKGROUND: Despite great advances in protein structure prediction, accurate prediction of the structures of mainly β proteins is still highly challenging, but could be assisted by the knowledge of residue-residue pairing in β strands. Previously, we proposed a ridge-detection-based algorithm RDb2C that adopted a multi-stage random forest framework to predict the β-β pairing given the amino acid sequence of a protein. RESULTS: In this work, we developed a second version of this algorithm, RDb2C2, by employing the residual neural network to further enhance the prediction accuracy. In the benchmark test, this new algorithm improves the F1-score by > 10 percentage points, reaching impressively high values of ~ 72% and ~ 73% in the BetaSheet916 and BetaSheet1452 sets, respectively. CONCLUSION: Our new method raises the prediction accuracy of β-β pairing to a new level and the prediction results could better assist the structure modeling of mainly β proteins. We prepared an online server of RDb2C2 at http://structpred.life.tsinghua.edu.cn/rdb2c2.html.

Subject(s)

Algorithms , Protein Conformation, beta-Strand , Sequence Analysis, Protein/methods , Neural Networks, Computer

6.

Identification of residue pairing in interacting β-strands from a predicted residue contact map.

Mao, Wenzhi; Wang, Tong; Zhang, Wenxuan; Gong, Haipeng.

BMC Bioinformatics ; 19(1): 146, 2018 04 19.

Article in English | MEDLINE | ID: mdl-29673311

ABSTRACT

BACKGROUND: Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors. However, information of residue pairing in β strands could be extracted from a noisy contact map, due to the presence of characteristic contact patterns in β-β interactions. This information may benefit the tertiary structure prediction of mainly β proteins. In this work, we propose a novel ridge-detection-based β-β contact predictor to identify residue pairing in β strands from any predicted residue contact map. RESULTS: Our algorithm RDb2C adopts ridge detection, a well-developed technique in computer image processing, to capture consecutive residue contacts, and then utilizes a novel multi-stage random forest framework to integrate the ridge information and additional features for prediction. Starting from the predicted contact map of CCMpred, RDb2C remarkably outperforms all state-of-the-art methods on two conventional test sets of β proteins (BetaSheet916 and BetaSheet1452), and achieves F1-scores of ~ 62% and ~ 76% at the residue level and strand level, respectively. Taking the prediction of the more advanced RaptorX-Contact as input, RDb2C achieves impressively higher performance, with F1-scores reaching ~ 76% and ~ 86% at the residue level and strand level, respectively. In a test of structural modeling using the top 1 L predicted contacts as constraints, for 61 mainly β proteins, the average TM-score achieves 0.442 when using the raw RaptorX-Contact prediction, but increases to 0.506 when using the improved prediction by RDb2C. CONCLUSION: Our method can significantly improve the prediction of β-β contacts from any predicted residue contact maps. Prediction results of our algorithm could be directly applied to effectively facilitate the practical structure prediction of mainly β proteins.
AVAILABILITY: All source data and codes are available at http://166.111.152.91/Downloads.html or the GitHub address of https://github.com/wzmao/RDb2C .

Subject(s)

Amino Acids/chemistry , Computational Biology/methods , Proteins/chemistry , Algorithms , Models, Molecular , Protein Conformation, beta-Strand , Protein Structure, Tertiary , Reproducibility of Results

7.

A deep learning framework for improving long-range residue-residue contact prediction using a hierarchical strategy.

Xiong, Dapeng; Zeng, Jianyang; Gong, Haipeng.

Bioinformatics ; 33(17): 2675-2683, 2017 Sep 01.

Article in English | MEDLINE | ID: mdl-28472263

ABSTRACT

MOTIVATION: Residue-residue contacts are of great value for protein structure prediction, since contact information, especially from long-range residue pairs, can significantly reduce the complexity of conformational sampling for protein structure prediction in practice. Despite progress in the past decade on protein targets with abundant homologous sequences, accurate contact prediction for proteins with limited sequence information is still far from satisfactory. Methodologies for these hard targets still need further improvement. RESULTS: We presented a computational program DeepConPred, which includes a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) as well as a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences. When compared with previous prediction approaches, our framework employed an effective scheme to identify optimal and important features for contact prediction, and was only trained with coevolutionary information derived from a limited number of homologous sequences to ensure robustness and usefulness for hard targets. Independent tests showed that 59.33%/49.97%, 64.39%/54.01% and 70.00%/59.81% of the top L/5, top L/10 and top 5 predictions were correct for CASP10/CASP11 proteins, respectively. In general, our algorithm ranked as one of the best methods for CASP targets. AVAILABILITY AND IMPLEMENTATION: All source data and codes are available at http://166.111.152.91/Downloads.html . CONTACT: hgong@tsinghua.edu.cn or zengjy321@tsinghua.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
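
The "top L/5" style figures above are precisions over the highest-scoring predicted contacts, where L is the sequence length. A minimal Python sketch of that evaluation follows. The sequence-separation cutoff of 24 residues for "long-range" is a commonly used convention assumed here, not a value stated in the abstract, and the function name `top_precision` is hypothetical.

```python
from typing import Iterable, Set, Tuple


def top_precision(predicted: Iterable[Tuple[int, int, float]],
                  true_contacts: Set[Tuple[int, int]],
                  n_top: int,
                  min_separation: int = 24) -> float:
    """Fraction of the n_top highest-scoring long-range pairs that are real.

    predicted      -- (i, j, score) triples with residue indices i and j
    true_contacts  -- set of (i, j) pairs with i < j taken from the
                      native structure
    min_separation -- minimum |i - j| for a pair to count as long-range
                      (assumed convention)
    """
    long_range = [p for p in predicted if abs(p[0] - p[1]) >= min_separation]
    ranked = sorted(long_range, key=lambda p: p[2], reverse=True)[:n_top]
    if not ranked:
        return float("nan")
    hits = sum(1 for i, j, _ in ranked
               if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(ranked)


# Toy example: a hypothetical protein of length L = 50, so L/5 = 10.
toy_predictions = [(1, 30, 0.9), (2, 40, 0.8), (5, 10, 0.99), (3, 50, 0.4)]
toy_native = {(1, 30), (3, 50)}
toy_value = top_precision(toy_predictions, toy_native, n_top=50 // 5)
```

For a protein of length L, the three settings in the abstract correspond to `n_top = L // 5`, `n_top = L // 10` and `n_top = 5`.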

Subject(s)

Computational Biology/methods , Machine Learning , Models, Molecular , Protein Conformation , Software , Databases, Protein

8.

LRFragLib: an effective algorithm to identify fragments for de novo protein structure prediction.

Wang, Tong; Yang, Yuedong; Zhou, Yaoqi; Gong, Haipeng.

Bioinformatics ; 33(5): 677-684, 2017 03 01.

Article in English | MEDLINE | ID: mdl-27797773

ABSTRACT

Motivation: The quality of fragment library determines the efficiency of fragment assembly, an approach that is widely used in most de novo protein-structure prediction algorithms. Conventional fragment libraries are constructed mainly based on the identities of amino acids, sometimes facilitated by predicted information including dihedral angles and secondary structures. However, it remains challenging to identify near-native fragment structures with low sequence homology. Results: We introduce a novel fragment-library-construction algorithm, LRFragLib, to improve the detection of near-native low-homology fragments of 7-10 residues, using a multi-stage, flexible selection protocol. Based on logistic regression scoring models, LRFragLib outperforms existing techniques by achieving a significantly higher precision and a comparable coverage on recent CASP protein sets in sampling near-native structures. The method also has a comparable computational efficiency to the fastest existing techniques with substantially reduced memory usage. Availability and Implementation: The source code is available for download at http://166.111.152.91/Downloads.html. Contact: hgong@tsinghua.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology/methods , Proteins/chemistry , Software , Algorithms , Caspases/chemistry , Protein Structure, Secondary

9.

Molecular determinants for the thermodynamic and functional divergence of uniporter GLUT1 and proton symporter XylE.

Ke, Meng; Yuan, Yafei; Jiang, Xin; Yan, Nieng; Gong, Haipeng.

PLoS Comput Biol ; 13(6): e1005603, 2017 Jun.

Article in English | MEDLINE | ID: mdl-28617850

ABSTRACT

GLUT1 facilitates the down-gradient translocation of D-glucose across cell membrane in mammals. XylE, an Escherichia coli homolog of GLUT1, utilizes proton gradient as an energy source to drive uphill D-xylose transport. Previous studies of XylE and GLUT1 suggest that the variation between an acidic residue (Asp27 in XylE) and a neutral one (Asn29 in GLUT1) is a key element for their mechanistic divergence. In this work, we combined computational and biochemical approaches to investigate the mechanism of proton coupling by XylE and the functional divergence between GLUT1 and XylE. Using molecular dynamics simulations, we evaluated the free energy profiles of the transition between inward- and outward-facing conformations for the apo proteins. Our results revealed the correlation between the protonation state and conformational preference in XylE, which is supported by the crystal structures. In addition, our simulations suggested a thermodynamic difference between XylE and GLUT1 that cannot be explained by the single residue variation at the protonation site. To understand the molecular basis, we applied Bayesian network models to analyze the alteration in the architecture of the hydrogen bond networks during conformational transition. The models and subsequent experimental validation suggest that multiple residue substitutions are required to produce the thermodynamic and functional distinction between XylE and GLUT1. Despite the lack of simulation studies with substrates, these computational and biochemical characterizations provide unprecedented insight into the mechanistic difference between proton symporters and uniporters.

Subject(s)

Escherichia coli Proteins/chemistry , Escherichia coli Proteins/ultrastructure , Glucose Transporter Type 1/chemistry , Glucose Transporter Type 1/ultrastructure , Models, Chemical , Molecular Dynamics Simulation , Symporters/chemistry , Symporters/ultrastructure , Energy Transfer , Humans , Protein Binding , Protein Conformation , Structure-Activity Relationship , Thermodynamics

10.

A deep learning framework for modeling structural features of RNA-binding protein targets.

Zhang, Sai; Zhou, Jingtian; Hu, Hailin; Gong, Haipeng; Chen, Ligong; Cheng, Chao; Zeng, Jianyang.

Nucleic Acids Res ; 44(4): e32, 2016 Feb 29.

Article in English | MEDLINE | ID: mdl-26467480

ABSTRACT

RNA-binding proteins (RBPs) play important roles in the post-transcriptional control of RNAs. Identifying RBP binding sites and characterizing RBP binding preferences are key steps toward understanding the basic mechanisms of post-transcriptional gene regulation. Though numerous computational methods have been developed for modeling RBP binding preferences, discovering a complete structural representation of the RBP targets by integrating their available structural features in all three dimensions is still a challenging task. In this paper, we develop a general and flexible deep learning framework for modeling structural binding preferences and predicting binding sites of RBPs, which takes (predicted) RNA tertiary structural information into account for the first time. Our framework constructs a unified representation that characterizes the structural specificities of RBP targets in all three dimensions, which can be further used to predict novel candidate binding sites and discover potential binding motifs. Through testing on real CLIP-seq datasets, we have demonstrated that our deep learning framework can automatically extract effective hidden structural features from the encoded raw sequence and structural profiles, and predict accurate RBP binding sites. In addition, we have conducted the first study to show that integrating the additional RNA tertiary structural features can improve the model performance in predicting RBP binding sites, especially for the polypyrimidine tract-binding protein (PTB), which also provides new evidence to support the view that RBPs may have specific tertiary structural binding preferences. In particular, the tests on the internal ribosome entry site (IRES) segments yield satisfactory results with experimental support from the literature and further demonstrate the necessity of incorporating RNA tertiary structural information into the prediction model.
The source code of our approach can be found in https://github.com/thucombio/deepnet-rbp.

Subject(s)

Polypyrimidine Tract-Binding Protein/chemistry , RNA, Messenger/chemistry , RNA-Binding Proteins/chemistry , Ribosomes/chemistry , Binding Sites , Computational Biology , Gene Expression Regulation , Nucleic Acid Conformation , Polypyrimidine Tract-Binding Protein/genetics , RNA Processing, Post-Transcriptional/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , Ribosomes/genetics

11.

Structural and Dynamic Insights into the Mechanism of Allosteric Signal Transmission in ERK2-Mediated MKP3 Activation.

Lu, Chang; Liu, Xin; Zhang, Chen-Song; Gong, Haipeng; Wu, Jia-Wei; Wang, Zhi-Xin.

Biochemistry ; 56(46): 6165-6175, 2017 11 21.

Article in English | MEDLINE | ID: mdl-29077400

ABSTRACT

The mitogen-activated protein kinases (MAPKs) are key components of cellular signal transduction pathways, which are down-regulated by the MAPK phosphatases (MKPs). Catalytic activity of the MKPs is controlled both by their ability to recognize selective MAPKs and by allosteric activation upon binding to MAPK substrates. Here, we use a combination of experimental and computational techniques to elucidate the molecular mechanism for the ERK2-induced MKP3 activation. Mutational and kinetic study shows that the 334FNFM337 motif in the MKP3 catalytic domain is essential for MKP3-mediated ERK2 inactivation and is responsible for ERK2-mediated MKP3 activation. The long-term molecular dynamics (MD) simulations further reveal a complete dynamic process in which the catalytic domain of MKP3 gradually changes to a conformation that resembles an active MKP catalytic domain over the time scale of the simulation, providing a direct time-dependent observation of allosteric signal transmission in ERK2-induced MKP3 activation.

Subject(s)

Dual Specificity Phosphatase 6/metabolism , Enzyme Activation , Mitogen-Activated Protein Kinase 1/metabolism , Signal Transduction , Allosteric Regulation , Animals , Catalytic Domain , Dual Specificity Phosphatase 6/chemistry , Humans , Mice , Mitogen-Activated Protein Kinase 1/chemistry , Molecular Dynamics Simulation , Protein Binding , Protein Conformation , Rats

12.

Predicting the helix-helix interactions from correlated residue mutations.

Xiong, Dapeng; Mao, Wenzhi; Gong, Haipeng.

Proteins ; 85(12): 2162-2169, 2017 Dec.

Article in English | MEDLINE | ID: mdl-28833538

ABSTRACT

Helix-helix interactions are crucial in the structure assembly, stability and function of helix-rich proteins including many membrane proteins. In spite of remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix-helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from the noisy 2D residue contact maps derived from correlated residue mutations, and utilized ridge detection to identify the characteristic residue contact patterns for helix-helix interactions. The ridge information as well as a few additional features were then fed into a machine learning model HHConPred to predict interactions between helix pairs. In an independent test, our method achieved an F-measure of ~60% for predicting helix-helix interactions. Moreover, although the model was trained mainly using soluble proteins, it could be extended to membrane proteins with at least comparable performance relative to previous approaches that were generated purely using membrane proteins. All data and source codes are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred.

Subject(s)

Computational Biology/methods , Machine Learning , Membrane Proteins/chemistry , Amino Acid Sequence , Binding Sites , Protein Binding , Protein Conformation, alpha-Helical , Protein Interaction Domains and Motifs

13.

Coupling between ATP hydrolysis and protein conformational change in maltose transporter.

Lv, Xiaoying; Liu, Hao; Chen, Haifeng; Gong, Haipeng.

Proteins ; 85(2): 207-220, 2017 02.

Article in English | MEDLINE | ID: mdl-27616441

ABSTRACT

As the intracellular part of maltose transporter, MalK dimer utilizes the energy of ATP hydrolysis to drive protein conformational change, which then facilitates substrate transport. Free energy evaluation of the complete conformational change before and after ATP hydrolysis is helpful to elucidate the mechanism of chemical-to-mechanical energy conversion in MalK dimer, but is lacking in previous studies. In this work, we used molecular dynamics simulations to investigate the structural transition of MalK dimer among closed, semi-open and open states. We observed spontaneous structural transition from closed to open state in the ADP-bound system and partial closure of MalK dimer from the semi-open state in the ATP-bound system. Subsequently, we calculated the reaction pathways connecting the closed and open states for the ATP- and ADP-bound systems and evaluated the free energy profiles along the paths. Our results suggested that the closed state is stable in the presence of ATP but is markedly destabilized when ATP is hydrolyzed to ADP, which thus explains the coupling between ATP hydrolysis and protein conformational change of MalK dimer in thermodynamics.

Subject(s)

ATP-Binding Cassette Transporters/chemistry , Adenosine Triphosphate/chemistry , Escherichia coli Proteins/chemistry , Escherichia coli/genetics , ATP-Binding Cassette Transporters/genetics , ATP-Binding Cassette Transporters/metabolism , Adenosine Triphosphate/metabolism , Binding Sites , Cloning, Molecular , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Gene Expression , Hydrolysis , Molecular Dynamics Simulation , Protein Binding , Protein Interaction Domains and Motifs , Protein Multimerization , Protein Structure, Secondary , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Thermodynamics

14.

Molecular dynamics study of ion transport through an open model of voltage-gated sodium channel.

Li, Yang; Sun, Ruining; Liu, Huihui; Gong, Haipeng.

Biochim Biophys Acta Biomembr ; 1859(5): 879-887, 2017 May.

Article in English | MEDLINE | ID: mdl-28188741

ABSTRACT

Voltage-gated sodium (NaV) channels are critical in the signal transduction of excitable cells. In this work, we modeled the open conformation for the pore domain of a prokaryotic NaV channel (NaVRh), and used molecular dynamics simulations to track the translocation of dozens of Na+ ions through the channel in the presence of a physiological transmembrane ion concentration gradient and a transmembrane electrical field that was closer to the physiological one than in previous studies. Channel conductance was then estimated from simulations on the wild-type and DEKA mutant of NaVRh. Interestingly, the conductivity predicted from the DEKA mutant agrees well with experimental measurement on eukaryotic NaV1.4 channel. Moreover, the wild-type and DEKA mutant of NaVRh exhibited markedly distinct ion permeation patterns, which thus implies the mechanistic difference between prokaryotic and eukaryotic NaV channels.
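
The abstract does not say how conductance was derived from the trajectories. One common estimate, shown here as a sketch under that assumption, counts net ion crossings N over a simulation time t to get a current I = N·z·e / t and divides by the transmembrane voltage V, giving G = I / V. The function name and the numbers in the example are hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs, exact SI value


def conductance_pS(n_ions: int,
                   sim_time_ns: float,
                   voltage_mV: float,
                   valence: int = 1) -> float:
    """Estimate single-channel conductance in picosiemens.

    n_ions      -- net number of ions that crossed the channel
    sim_time_ns -- simulated time in nanoseconds
    voltage_mV  -- magnitude of the transmembrane voltage in millivolts
    valence     -- ionic charge number (1 for Na+)
    """
    if sim_time_ns <= 0 or voltage_mV == 0:
        raise ValueError("simulation time must be positive and voltage non-zero")
    charge_c = n_ions * valence * ELEMENTARY_CHARGE_C
    current_a = charge_c / (sim_time_ns * 1e-9)           # amperes
    conductance_s = current_a / (abs(voltage_mV) * 1e-3)  # siemens
    return conductance_s * 1e12                           # picosiemens


# Hypothetical numbers: 30 Na+ crossings in 1 microsecond at 100 mV.
example_conductance = conductance_pS(n_ions=30, sim_time_ns=1000.0, voltage_mV=100.0)
```

With these illustrative inputs the current is about 4.8 pA and the conductance about 48 pS; the values say nothing about the results reported for NaVRh.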

Subject(s)

Ion Transport , Molecular Dynamics Simulation , Voltage-Gated Sodium Channels/physiology , Binding Sites , Membrane Potentials , Protein Conformation , Voltage-Gated Sodium Channels/chemistry

15.

Data construction for phosphorylation site prediction.

Gong, Haipeng; Liu, Xiaoqing; Wu, Jun; He, Zengyou.

Brief Bioinform ; 15(5): 839-55, 2014 Sep.

Article in English | MEDLINE | ID: mdl-23543354

ABSTRACT

Protein phosphorylation is one of the most pervasive post-translational modifications, regulating diverse cellular processes in various organisms. As mass spectrometry-based experimental approaches for identifying phosphorylation events are resource-intensive, many computational methods have been proposed, in which phosphorylation site prediction is formulated as a classification problem. They differ in several ways, and one crucial issue is the construction of training data and test data for unbiased performance evaluation. In this article, we categorize the existing data construction methods and try to answer three questions: (i) Is it equivalent to use different data construction methods in the assessment of phosphorylation site prediction algorithms? (ii) What kind of test data set is unbiased for assessing the prediction performance of a trained algorithm in different real world scenarios? (iii) Among the summarized training data construction methods, which one(s) has better generalization performance for most scenarios? To answer these questions, we conduct comprehensive experimental studies for both non-kinase-specific and kinase-specific prediction tasks. The experimental results show that: (i) different data construction methods can lead to significantly different prediction performance; (ii) there can be different test data construction methods that are unbiased with respect to different real world scenarios; and (iii) different data construction methods have different generalization performance in different real world scenarios. Therefore, when developing new algorithms in future research, people should concentrate on what kind of scenario their algorithm will work for, what the corresponding unbiased test data are and which training data construction method can generate best generalization performance.

Subject(s)

Proteins/metabolism , Algorithms , Phosphorylation

16.

Structure of a fucose transporter in an outward-open conformation.

Dang, Shangyu; Sun, Linfeng; Huang, Yongjian; Lu, Feiran; Liu, Yufeng; Gong, Haipeng; Wang, Jiawei; Yan, Nieng.

Nature ; 467(7316): 734-8, 2010 Oct 07.

Article in English | MEDLINE | ID: mdl-20877283

ABSTRACT

The major facilitator superfamily (MFS) transporters are an ancient and widespread family of secondary active transporters. In Escherichia coli, the uptake of l-fucose, a source of carbon for microorganisms, is mediated by an MFS proton symporter, FucP. Despite intensive study of the MFS transporters, atomic structure information is only available on three proteins and the outward-open conformation has yet to be captured. Here we report the crystal structure of FucP at 3.1 Å resolution, which shows that it contains an outward-open, amphipathic cavity. The similarly folded amino and carboxyl domains of FucP have contrasting surface features along the transport path, with negative electrostatic potential on the N domain and hydrophobic surface on the C domain. FucP only contains two acidic residues along the transport path, Asp 46 and Glu 135, which can undergo cycles of protonation and deprotonation. Their essential role in active transport is supported by both in vivo and in vitro experiments. Structure-based biochemical analyses provide insights into energy coupling, substrate recognition and the transport mechanism of FucP.

Subject(s)

Escherichia coli Proteins/chemistry , Escherichia coli/chemistry , Monosaccharide Transport Proteins/chemistry , Symporters/chemistry , Crystallography, X-Ray , Escherichia coli Proteins/metabolism , Fucose/metabolism , Hydrophobic and Hydrophilic Interactions , Models, Biological , Models, Molecular , Monosaccharide Transport Proteins/metabolism , Protein Conformation , Protons , Rotation , Static Electricity , Symporters/metabolism

17.

Protonation of Glu(135) Facilitates the Outward-to-Inward Structural Transition of Fucose Transporter.

Liu, Yufeng; Ke, Meng; Gong, Haipeng.

Biophys J ; 109(3): 542-51, 2015 Aug 04.

Article in English | MEDLINE | ID: mdl-26244736

ABSTRACT

Major facilitator superfamily (MFS) transporters typically need to alternately sample the outward-facing and inward-facing conformations in order to transport the substrate across the membrane. To understand the mechanism, in this work, we focused on one MFS member, the L-fucose/H(+) symporter (FucP), whose crystal structure exhibits an outward-open conformation. Previous experiments implicate several residues as critical to the substrate/proton binding and structural transition of FucP, among which Glu(135), located in the periplasm-accessible vestibule, is thought to be involved in both proton translocation and conformational change of the protein. Here, the structural transition of FucP in the presence of substrate was investigated using molecular-dynamics simulations. By combining the equilibrium and accelerated simulations as well as thermodynamic calculations, not only was the large-scale conformational change from the outward-facing to inward-facing state directly observed, but also the free energy change during the structural transition was calculated. The simulations confirm the critical role of Glu(135), whose protonation facilitates the outward-to-inward structural transition both by energetically favoring the inward-facing conformation in thermodynamics and by reducing the free energy barrier along the reaction pathway in kinetics. Our results may help the mechanistic studies of both FucP and other MFS transporters.

Subject(s)

Molecular Dynamics Simulation , Monosaccharide Transport Proteins/chemistry , Protons , Amino Acid Sequence , Glutamic Acid/chemistry , Molecular Sequence Data , Monosaccharide Transport Proteins/metabolism

18.

RBRIdent: An algorithm for improved identification of RNA-binding residues in proteins from primary sequences.

Xiong, Dapeng; Zeng, Jianyang; Gong, Haipeng.

Proteins ; 83(6): 1068-77, 2015 Jun.

Article in English | MEDLINE | ID: mdl-25846271

ABSTRACT

Rapid and correct identification of RNA-binding residues based on the protein primary sequences is of great importance. In most prevalent machine-learning-based identification methods, however, either some features are inefficiently represented, or the redundancy between features is not effectively removed. Both problems may weaken the performance of a classifier system and raise its computational complexity. Here, we addressed the above problems and developed a better classifier (RBRIdent) to identify the RNA-binding residues. In an independent benchmark test, RBRIdent achieved an accuracy of 76.79%, Matthews correlation coefficient of 0.3819 and F-measure of 75.58%, remarkably outperforming all prevalent methods. These results suggest the necessity of proper feature description and the essential role of feature selection in this project. All source data and codes are freely available at http://166.111.152.91/RBRIdent.

Subject(s)

Algorithms , Computational Biology/methods , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , Sequence Analysis, Protein/methods , Software , Binding Sites , Databases, Protein , Machine Learning , Models, Molecular

19.

Increasing βB1-crystallin sensitivity to proteolysis caused by the congenital cataract-microcornea syndrome mutation S129R.

Wang, Sha; Zhao, Wei-Jie; Liu, Huihui; Gong, Haipeng; Yan, Yong-Bin.

Biochim Biophys Acta ; 1832(2): 302-11, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23159606

ABSTRACT

Congenital hereditary cataract, which is mainly caused by the deposition of crystallins in light-scattering particles, is one of the leading causes of newborn blindness in human beings. Recently, an autosomal dominant congenital cataract-microcornea syndrome in a Chinese family has been associated with the S129R mutation in βB1-crystallin. To investigate the underlying molecular mechanism, we examined the effect of the mutation on βB1-crystallin structure and thermal stability. Biophysical experiments indicated that the mutation impaired the oligomerization of βB1-crystallin and shifted the dimer-monomer equilibrium to monomer. Molecular dynamic simulations revealed that the mutation altered the hydrogen-bonding network and hydrophobic interactions in the subunit interface of the dimeric protein, which resulted in the opening of the tightly associated interacting sites to allow the infiltration of the solvent molecules into the interface. Despite the disruption of βB1-crystallin assembly, the thermal stability of βB1-crystallin was increased by the mutation accompanied by the reduction of thermal aggregation at high temperatures. Further analysis indicated that the mutation significantly increased the sensitivity of βB1-crystallin to trypsin hydrolysis. The digested fragments of the mutant were prone to aggregate and unable to protect βA3-crystallin against aggregation. These results indicated that the thermal stability-beneficial mutation S129R in βB1-crystallin provided an excellent model for discovering molecular mechanisms apart from solubility and stability. Our results also highlighted that the increased sensitivity of mutated crystallins towards proteases might play a crucial role in the pathogenesis of congenital hereditary cataract and associated syndrome.

Subject(s)

Cataract/metabolism , Corneal Diseases/metabolism , Mutation , beta-Crystallin B Chain/metabolism , Cataract/genetics , Chromatography, Gel , Corneal Diseases/genetics , Fluorescence Resonance Energy Transfer , Humans , Molecular Dynamics Simulation , Native Polyacrylamide Gel Electrophoresis , Proteolysis , Spectrometry, Fluorescence

20.

Improving the orientation-dependent statistical potential using a reference state.

Liu, Yufeng; Zeng, Jianyang; Gong, Haipeng.

Proteins ; 82(10): 2383-93, 2014 Oct.

Article in English | MEDLINE | ID: mdl-24810843

ABSTRACT

Statistical potentials are frequently used in protein structure prediction and protein folding for conformational evaluation. Theoretically, to describe the many-body effect, pairwise interaction between two atom groups should be corrected by their relative geometric orientation. The potential functions developed by this means are called orientation-dependent statistical potentials and have exhibited substantially improved performance. However, none of the currently available orientation-dependent statistical potentials use any reference state, which has been proven to greatly enhance the power of distance-dependent statistical potentials in numerous previous studies. In this work, we designed a reasonable reference state for the orientation-dependent statistical potentials: using the average geometric relationship between atom pairs in known structures by neglecting their residue identities. The statistical potential developed using this reference state (called ORDER_AVE) prevails over most available rival potentials in a series of tests on the decoy sets, although the information of side chain atoms (except the β-carbon) is absent in its construction.

Subject(s)

Computational Biology/methods , Models, Statistical , Protein Folding , Proteins/chemistry , Humans , Models, Molecular , Protein Conformation , Thermodynamics
