ABSTRACT
The BEN domain is a newly discovered type of DNA-binding domain that exists in a variety of species. There are nine BEN domain-containing proteins in humans, and most have been shown to have chromatin-related functions. NACC1 preferentially binds to CATG motif-containing sequences and functions primarily as a transcriptional coregulator. BANP and BEND3 preferentially bind DNA bearing unmethylated CpG motifs, and they function as CpG island-binding proteins. To date, the DNA recognition mechanism of quite a few of these proteins remains to be determined. In this study, we solved the crystal structures of the BEN domains of NACC1 and BANP in complex with their cognate DNA substrates. We revealed the details of DNA binding by these BEN domain proteins and unexpectedly revealed that oligomerization is required for BANP to select unmethylated CGCG motif-containing DNA substrates. Our study clarifies the controversies surrounding DNA recognition by BANP and demonstrates a new mechanism by which BANP selects unmethylated CpG motifs and functions as a CpG island-binding protein. This understanding will facilitate further exploration of the physiological functions of the BEN domain proteins in the future.
Subject(s)
CpG Islands , DNA-Binding Proteins , DNA , Models, Molecular , Protein Binding , Protein Domains , DNA/metabolism , DNA/chemistry , DNA/genetics , Humans , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/genetics , Protein Multimerization , Crystallography, X-Ray , DNA Methylation , Repressor Proteins/metabolism , Repressor Proteins/chemistry , Repressor Proteins/genetics , Binding Sites
ABSTRACT
Proteins with desired functions and properties are important in fields such as nanotechnology and biomedicine. De novo protein design enables the creation of previously unseen proteins from the ground up and is regarded as a key approach to addressing real-world societal challenges. The recent introduction of deep learning into design methods has had a transformative influence and is expected to be a promising future direction. In this review, we survey the major recent advances in deep-learning-based design procedures and illustrate their novelty relative to conventional knowledge-based approaches through notable cases. We describe deep learning developments in structure-based protein design and direct sequence design, and we highlight recent applications of deep reinforcement learning in protein design. Future perspectives on design goals, challenges and opportunities are also discussed.
Subject(s)
Deep Learning , Knowledge Bases , Proteins
ABSTRACT
BACKGROUND: To explore the diagnostic value of extramural vascular invasion (EMVI) detected by multidetector computed tomography (MDCT) in the preoperative N staging of gastric cancer patients. METHODS: Based on the MR-defined EMVI scoring standard for rectal cancer, we developed a 5-point scoring system to evaluate the status of CT-detected extramural vascular invasion (ctEMVI): scores of 0-2 were ctEMVI-negative and scores of 3-4 were ctEMVI-positive. Patients were divided into a ctEMVI-positive group and a ctEMVI-negative group. The correlation between ctEMVI and clinical features was analyzed. Receiver operating characteristic (ROC) curves were used to evaluate the diagnostic efficacy of ctEMVI for pathological metastatic lymph nodes and N staging. The sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) of ctEMVI and of short-axis diameter for pathological N staging were calculated and compared. RESULTS: The occurrence rate of lymphovascular invasion (LVI) and the proportion of tumors with a greatest diameter > 6 cm were higher in the ctEMVI-positive group than in the ctEMVI-negative group (P < 0.05). Spearman correlation analysis showed a positive correlation between ctEMVI and LVI, N stage, and tumor size (P < 0.05). For ctEMVI scores ≥ 3, the AUCs of ctEMVI for diagnosing lymph node metastasis, N stage ≥ N2, and N3 stage were 0.857, 0.802, and 0.758, respectively. The sensitivity, NPV, and accuracy of ctEMVI for diagnosing N stage ≥ N2 were superior to those of short-axis diameter (P < 0.05), while the sensitivity, specificity, PPV, NPV, and accuracy of ctEMVI for diagnosing N3 stage were superior to those of short-axis diameter (P < 0.05). CONCLUSION: ctEMVI is valuable for diagnosing metastatic lymph nodes and advanced N staging. As an imaging marker, ctEMVI can be included in the preoperative imaging evaluation of patients to support clinical guidance and treatment.
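The five diagnostic measures named in this abstract all derive from a 2×2 table of test result against pathology. The sketch below shows the standard definitions only; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2-table metrics for a binary diagnostic test."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among diseased
        "specificity": tn / (tn + fp),   # true negatives among non-diseased
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts for illustration only (not data from the abstract).
metrics = diagnostic_metrics(tp=40, fp=10, tn=35, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this setting a "positive" test would be a ctEMVI score of 3-4, and the reference standard would be the pathological N stage.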
Subject(s)
Multidetector Computed Tomography , Stomach Neoplasms , Humans , Stomach Neoplasms/diagnostic imaging , Stomach Neoplasms/surgery , Stomach Neoplasms/pathology , Neoplasm Invasiveness/diagnostic imaging , Neoplasm Invasiveness/pathology , Retrospective Studies , Lymph Nodes/pathology , Neoplasm Staging
ABSTRACT
Bolt loosening detection is crucial for ensuring the safe operation of equipment. This paper presents a vision-based real-time detection method that identifies bolt loosening by recognizing anti-loosening line markers at bolt connections. The method employs the YOLOv10-S deep learning model for high-precision, real-time bolt detection, followed by a two-step Fast-SCNN image segmentation technique. This approach effectively isolates the bolt and nut regions, enabling accurate extraction of the anti-loosening line markers. Key intersection points are calculated using ellipse and line fitting techniques, and the loosening angle is determined through spatial projection transformation. The experimental results demonstrate that, for high-resolution images of 2048 × 1024 pixels, the proposed method achieves an average angle detection error of 1.145° with a detection speed of 32 FPS. Compared to traditional methods and other vision-based approaches, this method offers non-contact measurement, real-time detection capabilities, reduced detection error, and general adaptability to various bolt types and configurations, indicating significant application potential.
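As a rough illustration of the last step only, the sketch below computes the angle between the marker line on the bolt and the marker line on the nut. It assumes both directions have already been projected into the frontal plane of the bolt, so it does not reproduce the ellipse fitting or the spatial projection transformation described in the abstract. The function name and the example vectors are hypothetical.

```python
import math


def loosening_angle_deg(bolt_mark, nut_mark):
    """Signed angle in degrees from the bolt marker direction to the nut marker direction.

    Both arguments are 2D direction vectors (x, y) in the same rectified plane.
    """
    ax, ay = bolt_mark
    bx, by = nut_mark
    cross = ax * by - ay * bx   # proportional to sin(angle)
    dot = ax * bx + ay * by     # proportional to cos(angle)
    return math.degrees(math.atan2(cross, dot))


# Hypothetical directions: nut marker rotated a quarter turn from the bolt marker.
angle = loosening_angle_deg(bolt_mark=(1.0, 0.0), nut_mark=(0.0, 1.0))
print(f"loosening angle: {angle:.1f} degrees")
```

Using `atan2` on the cross and dot products keeps the sign of the rotation and avoids the numerical problems of `acos` near 0° and 180°.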
ABSTRACT
MOTIVATION: As one of the most important post-translational modifications (PTMs), protein lysine crotonylation (Kcr) has attracted wide attention; it is involved in important physiological activities such as cell differentiation and metabolism. However, experimental methods for Kcr identification are expensive and time-consuming. Computational methods can instead predict Kcr sites in silico with high efficiency and low cost. RESULTS: In this study, we propose a novel predictor of protein Kcr sites, BERT-Kcr, developed by transfer learning from pre-trained bidirectional encoder representations from transformers (BERT) models. These models were originally used for natural language processing (NLP) tasks such as sentence classification. Here, we converted each amino acid into a word and used the resulting words as input to the pre-trained BERT model. The features encoded by BERT were extracted and fed to a BiLSTM network to build our final model. Compared with models built with other machine learning and deep learning classifiers, BERT-Kcr achieved the best performance, with an AUROC of 0.983 in 10-fold cross-validation. Further evaluation on the independent test set indicates that BERT-Kcr outperforms the state-of-the-art model Deep-Kcr, with an improvement of about 5% in AUROC. Our results indicate that the direct use of sequence information with advanced pre-trained NLP models could be an effective way to identify PTM sites in proteins. AVAILABILITY AND IMPLEMENTATION: The BERT-Kcr model is publicly available at http://zhulab.org.cn/BERT-Kcr_models/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
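The idea of treating each amino acid as a word can be sketched as follows. This is a minimal illustration, not the BERT-Kcr preprocessing: the window half-width, the padding character and the function names are assumptions made for the example.

```python
def lysine_windows(sequence, half_width=3, pad="X"):
    """Return (1-based position, window) for every lysine (K) in the sequence.

    half_width and pad are hypothetical choices for illustration.
    """
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # The window is centred on residue i of the original sequence.
            windows.append((i + 1, padded[i : i + 2 * half_width + 1]))
    return windows


def as_sentence(window):
    """Write a peptide window as space-separated one-letter 'words'."""
    return " ".join(window)


# Toy sequence, not a real protein.
windows = lysine_windows("MAKTK", half_width=1)
sentences = [as_sentence(w) for _, w in windows]
print(windows)
print(sentences)
```

Each resulting sentence could then be passed to a tokenizer whose vocabulary contains the amino acid letters, which is what lets a language model consume a peptide.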
Subject(s)
Lysine , Machine Learning , Lysine/metabolism , Language , Natural Language Processing , Protein Processing, Post-Translational
ABSTRACT
MOTIVATION: Gradient descent-based protein modeling is a popular approach to protein structure prediction that takes predicted inter-residue distances and other necessary constraints as input and folds protein structures by minimizing protein-specific energy potentials. The constraints from multiple predicted protein properties provide redundant and sometimes conflicting information, which can trap the optimization process in local minima and impair modeling efficiency. RESULTS: To address these issues, we developed a self-adaptive protein modeling framework, SAMF. It eliminates redundant constraints and resolves conflicts, folds protein structures iteratively, and selects the best structures with a deep quality analysis system. Without relying on a large amount of complicated domain knowledge or numerous patches, SAMF achieves state-of-the-art performance by exploiting cutting-edge deep learning techniques. SAMF has a modular design and can be easily customized and extended. As the quality of input constraints continues to improve, the advantage of SAMF will grow over time. AVAILABILITY AND IMPLEMENTATION: The source code and data for reproducing the results are available at https://msracb.blob.core.windows.net/pub/psp/SAMF.zip. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Proteins , Software , Proteins/metabolism
ABSTRACT
BACKGROUND: Despite great advances in protein structure prediction, accurate prediction of the structures of mainly β proteins remains highly challenging, but it could be assisted by knowledge of residue-residue pairing in β strands. Previously, we proposed a ridge-detection-based algorithm, RDb2C, which adopted a multi-stage random forest framework to predict β-β pairing from the amino acid sequence of a protein. RESULTS: In this work, we developed a second version of this algorithm, RDb2C2, which employs a residual neural network to further enhance prediction accuracy. In the benchmark test, the new algorithm improves the F1-score by > 10 percentage points, reaching ~ 72% and ~ 73% on the BetaSheet916 and BetaSheet1452 sets, respectively. CONCLUSION: Our new method raises the prediction accuracy of β-β pairing to a new level, and its predictions could better assist the structure modeling of mainly β proteins. An online server for RDb2C2 is available at http://structpred.life.tsinghua.edu.cn/rdb2c2.html.
Subject(s)
Algorithms , Protein Conformation, beta-Strand , Sequence Analysis, Protein/methods , Neural Networks, Computer
ABSTRACT
BACKGROUND: Despite rapid progress in protein residue contact prediction, predicted residue contact maps frequently contain many errors. However, information on residue pairing in β strands can be extracted from a noisy contact map because β-β interactions produce characteristic contact patterns. This information may benefit the tertiary structure prediction of mainly β proteins. In this work, we propose a novel ridge-detection-based β-β contact predictor that identifies residue pairing in β strands from any predicted residue contact map. RESULTS: Our algorithm, RDb2C, adopts ridge detection, a well-developed technique in computer image processing, to capture consecutive residue contacts, and then uses a novel multi-stage random forest framework to integrate the ridge information and additional features for prediction. Starting from the contact map predicted by CCMpred, RDb2C outperforms all state-of-the-art methods on two conventional test sets of β proteins (BetaSheet916 and BetaSheet1452), achieving F1-scores of ~ 62% and ~ 76% at the residue level and strand level, respectively. Taking the prediction of the more advanced RaptorX-Contact as input, RDb2C achieves higher performance, with F1-scores of ~ 76% and ~ 86% at the residue level and strand level, respectively. In a test of structural modeling on 61 mainly β proteins using the top 1 L predicted contacts as constraints, the average TM-score is 0.442 with the raw RaptorX-Contact prediction but increases to 0.506 with the prediction improved by RDb2C. CONCLUSION: Our method can significantly improve the prediction of β-β contacts from any predicted residue contact map. Its predictions could be applied directly to facilitate practical structure prediction of mainly β proteins. AVAILABILITY: All source data and codes are available at http://166.111.152.91/Downloads.html or on GitHub at https://github.com/wzmao/RDb2C.
Subject(s)
Amino Acids/chemistry , Computational Biology/methods , Proteins/chemistry , Algorithms , Models, Molecular , Protein Conformation, beta-Strand , Protein Structure, Tertiary , Reproducibility of Results
ABSTRACT
MOTIVATION: Residue-residue contacts are of great value for protein structure prediction, since contact information, especially for long-range residue pairs, can significantly reduce the complexity of conformational sampling in practice. Despite progress over the past decade on protein targets with abundant homologous sequences, accurate contact prediction for proteins with limited sequence information remains far from satisfactory. Methodologies for these hard targets need further improvement. RESULTS: We present a computational program, DeepConPred, which comprises a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) and a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences. Compared with previous prediction approaches, our framework employs an effective scheme to identify optimal and important features for contact prediction, and it was trained only with coevolutionary information derived from a limited number of homologous sequences to ensure robustness and usefulness for hard targets. Independent tests showed that 59.33%/49.97%, 64.39%/54.01% and 70.00%/59.81% of the top L/5, top L/10 and top 5 predictions were correct for CASP10/CASP11 proteins, respectively. In general, our algorithm ranked as one of the best methods for CASP targets. AVAILABILITY AND IMPLEMENTATION: All source data and codes are available at http://166.111.152.91/Downloads.html. CONTACT: hgong@tsinghua.edu.cn or zengjy321@tsinghua.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
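The "top L/5", "top L/10" and "top 5" figures are precisions over the highest-scoring predicted pairs, where L is the sequence length. The sketch below shows that calculation under stated assumptions: the function name, the toy scores and the sequence-separation cutoff of 24 residues for "long-range" (a commonly used convention) are not taken from the abstract.

```python
def top_k_precision(scores, true_contacts, k, min_separation=24):
    """Fraction of the k highest-scoring long-range pairs that are true contacts.

    scores maps residue pairs (i, j) to predicted contact scores.
    min_separation is an assumed definition of "long-range".
    """
    long_range = [pair for pair in scores if abs(pair[0] - pair[1]) >= min_separation]
    ranked = sorted(long_range, key=lambda pair: scores[pair], reverse=True)[: max(k, 1)]
    if not ranked:
        raise ValueError("no long-range pairs to evaluate")
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy data for illustration only; in practice k would be L // 5, L // 10 or 5.
scores = {(1, 30): 0.9, (2, 40): 0.8, (3, 50): 0.1, (5, 8): 0.95}
true_contacts = {(1, 30), (3, 50), (5, 8)}
precision = top_k_precision(scores, true_contacts, k=2)
print(f"top-2 long-range precision: {precision:.2f}")
```

The pair (5, 8) has the highest score but is excluded because it is a short-range pair, so only (1, 30) and (2, 40) are evaluated.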
Subject(s)
Computational Biology/methods , Machine Learning , Models, Molecular , Protein Conformation , Software , Databases, Protein
ABSTRACT
Motivation: The quality of fragment library determines the efficiency of fragment assembly, an approach that is widely used in most de novo protein-structure prediction algorithms. Conventional fragment libraries are constructed mainly based on the identities of amino acids, sometimes facilitated by predicted information including dihedral angles and secondary structures. However, it remains challenging to identify near-native fragment structures with low sequence homology. Results: We introduce a novel fragment-library-construction algorithm, LRFragLib, to improve the detection of near-native low-homology fragments of 7-10 residues, using a multi-stage, flexible selection protocol. Based on logistic regression scoring models, LRFragLib outperforms existing techniques by achieving a significantly higher precision and a comparable coverage on recent CASP protein sets in sampling near-native structures. The method also has a comparable computational efficiency to the fastest existing techniques with substantially reduced memory usage. Availability and Implementation: The source code is available for download at http://166.111.152.91/Downloads.html. Contact: hgong@tsinghua.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
Computational Biology/methods , Proteins/chemistry , Software , Algorithms , Caspases/chemistry , Protein Structure, Secondary
ABSTRACT
GLUT1 facilitates the down-gradient translocation of D-glucose across cell membrane in mammals. XylE, an Escherichia coli homolog of GLUT1, utilizes proton gradient as an energy source to drive uphill D-xylose transport. Previous studies of XylE and GLUT1 suggest that the variation between an acidic residue (Asp27 in XylE) and a neutral one (Asn29 in GLUT1) is a key element for their mechanistic divergence. In this work, we combined computational and biochemical approaches to investigate the mechanism of proton coupling by XylE and the functional divergence between GLUT1 and XylE. Using molecular dynamics simulations, we evaluated the free energy profiles of the transition between inward- and outward-facing conformations for the apo proteins. Our results revealed the correlation between the protonation state and conformational preference in XylE, which is supported by the crystal structures. In addition, our simulations suggested a thermodynamic difference between XylE and GLUT1 that cannot be explained by the single residue variation at the protonation site. To understand the molecular basis, we applied Bayesian network models to analyze the alteration in the architecture of the hydrogen bond networks during conformational transition. The models and subsequent experimental validation suggest that multiple residue substitutions are required to produce the thermodynamic and functional distinction between XylE and GLUT1. Despite the lack of simulation studies with substrates, these computational and biochemical characterizations provide unprecedented insight into the mechanistic difference between proton symporters and uniporters.
Subject(s)
Escherichia coli Proteins/chemistry , Escherichia coli Proteins/ultrastructure , Glucose Transporter Type 1/chemistry , Glucose Transporter Type 1/ultrastructure , Models, Chemical , Molecular Dynamics Simulation , Symporters/chemistry , Symporters/ultrastructure , Energy Transfer , Humans , Protein Binding , Protein Conformation , Structure-Activity Relationship , Thermodynamics
ABSTRACT
RNA-binding proteins (RBPs) play important roles in the post-transcriptional control of RNAs. Identifying RBP binding sites and characterizing RBP binding preferences are key steps toward understanding the basic mechanisms of post-transcriptional gene regulation. Although numerous computational methods have been developed for modeling RBP binding preferences, discovering a complete structural representation of RBP targets by integrating their available structural features in all three dimensions remains a challenging task. In this paper, we develop a general and flexible deep learning framework for modeling structural binding preferences and predicting binding sites of RBPs, which takes (predicted) RNA tertiary structural information into account for the first time. Our framework constructs a unified representation that characterizes the structural specificities of RBP targets in all three dimensions, which can be further used to predict novel candidate binding sites and discover potential binding motifs. Through tests on real CLIP-seq datasets, we demonstrate that our deep learning framework can automatically extract effective hidden structural features from the encoded raw sequence and structural profiles and accurately predict RBP binding sites. In addition, we conducted the first study to show that integrating additional RNA tertiary structural features can improve model performance in predicting RBP binding sites, especially for the polypyrimidine tract-binding protein (PTB), which provides new evidence that RBPs may have specific tertiary structural binding preferences. In particular, tests on internal ribosome entry site (IRES) segments yield satisfactory results with experimental support from the literature and further demonstrate the need to incorporate RNA tertiary structural information into the prediction model. The source code of our approach is available at https://github.com/thucombio/deepnet-rbp.
Subject(s)
Polypyrimidine Tract-Binding Protein/chemistry , RNA, Messenger/chemistry , RNA-Binding Proteins/chemistry , Ribosomes/chemistry , Binding Sites , Computational Biology , Gene Expression Regulation , Nucleic Acid Conformation , Polypyrimidine Tract-Binding Protein/genetics , RNA Processing, Post-Transcriptional/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , Ribosomes/genetics
ABSTRACT
The mitogen-activated protein kinases (MAPKs) are key components of cellular signal transduction pathways, which are down-regulated by the MAPK phosphatases (MKPs). Catalytic activity of the MKPs is controlled both by their ability to recognize selective MAPKs and by allosteric activation upon binding to MAPK substrates. Here, we use a combination of experimental and computational techniques to elucidate the molecular mechanism for the ERK2-induced MKP3 activation. Mutational and kinetic study shows that the 334FNFM337 motif in the MKP3 catalytic domain is essential for MKP3-mediated ERK2 inactivation and is responsible for ERK2-mediated MKP3 activation. The long-term molecular dynamics (MD) simulations further reveal a complete dynamic process in which the catalytic domain of MKP3 gradually changes to a conformation that resembles an active MKP catalytic domain over the time scale of the simulation, providing a direct time-dependent observation of allosteric signal transmission in ERK2-induced MKP3 activation.
Subject(s)
Dual Specificity Phosphatase 6/metabolism , Enzyme Activation , Mitogen-Activated Protein Kinase 1/metabolism , Signal Transduction , Allosteric Regulation , Animals , Catalytic Domain , Dual Specificity Phosphatase 6/chemistry , Humans , Mice , Mitogen-Activated Protein Kinase 1/chemistry , Molecular Dynamics Simulation , Protein Binding , Protein Conformation , Rats
ABSTRACT
Helix-helix interactions are crucial to the structural assembly, stability and function of helix-rich proteins, including many membrane proteins. Despite remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix-helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from noisy 2D residue contact maps derived from correlated residue mutations and used ridge detection to identify the residue contact patterns characteristic of helix-helix interactions. The ridge information and a few additional features were then fed into a machine learning model, HHConPred, to predict interactions between helix pairs. In an independent test, our method achieved an F-measure of ~60% for predicting helix-helix interactions. Moreover, although the model was trained mainly on soluble proteins, it could be extended to membrane proteins with performance at least comparable to that of previous approaches developed purely on membrane proteins. All data and source codes are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred.
Subject(s)
Computational Biology/methods , Machine Learning , Membrane Proteins/chemistry , Amino Acid Sequence , Binding Sites , Protein Binding , Protein Conformation, alpha-Helical , Protein Interaction Domains and Motifs
ABSTRACT
As the intracellular part of maltose transporter, MalK dimer utilizes the energy of ATP hydrolysis to drive protein conformational change, which then facilitates substrate transport. Free energy evaluation of the complete conformational change before and after ATP hydrolysis is helpful to elucidate the mechanism of chemical-to-mechanical energy conversion in MalK dimer, but is lacking in previous studies. In this work, we used molecular dynamics simulations to investigate the structural transition of MalK dimer among closed, semi-open and open states. We observed spontaneous structural transition from closed to open state in the ADP-bound system and partial closure of MalK dimer from the semi-open state in the ATP-bound system. Subsequently, we calculated the reaction pathways connecting the closed and open states for the ATP- and ADP-bound systems and evaluated the free energy profiles along the paths. Our results suggested that the closed state is stable in the presence of ATP but is markedly destabilized when ATP is hydrolyzed to ADP, which thus explains the coupling between ATP hydrolysis and protein conformational change of MalK dimer in thermodynamics. Proteins 2017; 85:207-220. © 2016 Wiley Periodicals, Inc.
Subject(s)
ATP-Binding Cassette Transporters/chemistry , Adenosine Triphosphate/chemistry , Escherichia coli Proteins/chemistry , Escherichia coli/genetics , ATP-Binding Cassette Transporters/genetics , ATP-Binding Cassette Transporters/metabolism , Adenosine Triphosphate/metabolism , Binding Sites , Cloning, Molecular , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Gene Expression , Hydrolysis , Molecular Dynamics Simulation , Protein Binding , Protein Interaction Domains and Motifs , Protein Multimerization , Protein Structure, Secondary , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Thermodynamics
ABSTRACT
Voltage-gated sodium (NaV) channels are critical to signal transduction in excitable cells. In this work, we modeled the open conformation of the pore domain of a prokaryotic NaV channel (NaVRh) and used molecular dynamics simulations to track the translocation of dozens of Na+ ions through the channel in the presence of a physiological transmembrane ion concentration gradient and a transmembrane electric field closer to the physiological one than in previous studies. Channel conductance was then estimated from simulations of the wild-type and the DEKA mutant of NaVRh. Interestingly, the conductance predicted for the DEKA mutant agrees well with experimental measurements on the eukaryotic NaV1.4 channel. Moreover, the wild-type and the DEKA mutant of NaVRh exhibited markedly distinct ion permeation patterns, which implies a mechanistic difference between prokaryotic and eukaryotic NaV channels.
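One common way to estimate a conductance from a permeation simulation is to convert the net number of ions that cross the channel into a current and divide by the applied voltage. The sketch below shows that arithmetic for monovalent ions; it is not necessarily the procedure used in this study, and the function name and all numbers are hypothetical.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs (exact SI value)


def conductance_pS(n_ions, sim_time_ns, voltage_mV):
    """Conductance in picosiemens from a net count of monovalent ion crossings.

    I = N * e / t   (current from N ions crossing in time t)
    G = I / V       (Ohm's law)
    """
    current_A = n_ions * ELEMENTARY_CHARGE / (sim_time_ns * 1e-9)
    conductance_S = current_A / (voltage_mV * 1e-3)
    return conductance_S * 1e12


# Hypothetical values: 20 net Na+ crossings in 100 ns at 500 mV.
g = conductance_pS(n_ions=20, sim_time_ns=100.0, voltage_mV=500.0)
print(f"estimated conductance: {g:.1f} pS")
```

Because only dozens of ions cross in a typical simulation, an estimate of this kind carries substantial counting uncertainty.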
Subject(s)
Ion Transport , Molecular Dynamics Simulation , Voltage-Gated Sodium Channels/physiology , Binding Sites , Membrane Potentials , Protein Conformation , Voltage-Gated Sodium Channels/chemistry
ABSTRACT
Protein phosphorylation is one of the most pervasive post-translational modifications, regulating diverse cellular processes in various organisms. As mass spectrometry-based experimental approaches for identifying phosphorylation events are resource-intensive, many computational methods have been proposed, in which phosphorylation site prediction is formulated as a classification problem. They differ in several ways, and one crucial issue is the construction of training data and test data for unbiased performance evaluation. In this article, we categorize the existing data construction methods and try to answer three questions: (i) Is it equivalent to use different data construction methods in the assessment of phosphorylation site prediction algorithms? (ii) What kind of test data set is unbiased for assessing the prediction performance of a trained algorithm in different real world scenarios? (iii) Among the summarized training data construction methods, which one(s) has better generalization performance for most scenarios? To answer these questions, we conduct comprehensive experimental studies for both non-kinase-specific and kinase-specific prediction tasks. The experimental results show that: (i) different data construction methods can lead to significantly different prediction performance; (ii) there can be different test data construction methods that are unbiased with respect to different real world scenarios; and (iii) different data construction methods have different generalization performance in different real world scenarios. Therefore, when developing new algorithms in future research, people should concentrate on what kind of scenario their algorithm will work for, what the corresponding unbiased test data are and which training data construction method can generate best generalization performance.
Subject(s)
Proteins/metabolism , Algorithms , Phosphorylation
ABSTRACT
The major facilitator superfamily (MFS) transporters are an ancient and widespread family of secondary active transporters. In Escherichia coli, the uptake of l-fucose, a source of carbon for microorganisms, is mediated by an MFS proton symporter, FucP. Despite intensive study of the MFS transporters, atomic structure information is only available on three proteins and the outward-open conformation has yet to be captured. Here we report the crystal structure of FucP at 3.1 Å resolution, which shows that it contains an outward-open, amphipathic cavity. The similarly folded amino and carboxyl domains of FucP have contrasting surface features along the transport path, with negative electrostatic potential on the N domain and hydrophobic surface on the C domain. FucP only contains two acidic residues along the transport path, Asp 46 and Glu 135, which can undergo cycles of protonation and deprotonation. Their essential role in active transport is supported by both in vivo and in vitro experiments. Structure-based biochemical analyses provide insights into energy coupling, substrate recognition and the transport mechanism of FucP.
Subject(s)
Escherichia coli Proteins/chemistry , Escherichia coli/chemistry , Monosaccharide Transport Proteins/chemistry , Symporters/chemistry , Crystallography, X-Ray , Escherichia coli Proteins/metabolism , Fucose/metabolism , Hydrophobic and Hydrophilic Interactions , Models, Biological , Models, Molecular , Monosaccharide Transport Proteins/metabolism , Protein Conformation , Protons , Rotation , Static Electricity , Symporters/metabolism
ABSTRACT
Major facilitator superfamily (MFS) transporters typically need to alternate between outward-facing and inward-facing conformations in order to transport substrate across the membrane. To understand this mechanism, we focused on one MFS member, the L-fucose/H+ symporter (FucP), whose crystal structure exhibits an outward-open conformation. Previous experiments implicate several residues as critical to substrate/proton binding and the structural transition of FucP, among which Glu135, located in the periplasm-accessible vestibule, is thought to be involved in both proton translocation and the conformational change of the protein. Here, the structural transition of FucP in the presence of substrate was investigated using molecular dynamics simulations. By combining equilibrium and accelerated simulations with thermodynamic calculations, we directly observed the large-scale conformational change from the outward-facing to the inward-facing state and calculated the free energy change during the structural transition. The simulations confirm the critical role of Glu135, whose protonation facilitates the outward-to-inward structural transition both thermodynamically, by energetically favoring the inward-facing conformation, and kinetically, by reducing the free energy barrier along the reaction pathway. Our results may aid mechanistic studies of FucP and other MFS transporters.
Subject(s)
Molecular Dynamics Simulation , Monosaccharide Transport Proteins/chemistry , Protons , Amino Acid Sequence , Glutamic Acid/chemistry , Molecular Sequence Data , Monosaccharide Transport Proteins/metabolism
ABSTRACT
Rapid and correct identification of RNA-binding residues from protein primary sequences is of great importance. In most prevalent machine-learning-based identification methods, however, either some features are inefficiently represented or the redundancy between features is not effectively removed. Both problems may weaken the performance of a classifier and raise its computational complexity. Here, we addressed these problems and developed an improved classifier (RBRIdent) to identify RNA-binding residues. In an independent benchmark test, RBRIdent achieved an accuracy of 76.79%, a Matthews correlation coefficient of 0.3819 and an F-measure of 75.58%, outperforming all prevalent methods. These results suggest the necessity of proper feature description and the essential role of feature selection in this task. All source data and codes are freely available at http://166.111.152.91/RBRIdent.
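The Matthews correlation coefficient and the F-measure quoted here are both functions of the confusion matrix of a binary classifier. The sketch below gives the standard formulas; the function names and the counts are hypothetical and do not reproduce the RBRIdent benchmark.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
mcc = matthews_corrcoef(tp=50, fp=10, tn=30, fn=10)
f1 = f_measure(tp=50, fp=10, fn=10)
print(f"MCC = {mcc:.4f}, F-measure = {f1:.4f}")
```

Unlike accuracy, the MCC stays informative when binding and non-binding residues are present in very different proportions, which is why the two are often reported together.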