ABSTRACT
Antiviral peptides (AVPs) have shown potential in inhibiting viral attachment, preventing viral fusion with host cells and disrupting viral replication due to their unique action mechanisms. They have now become a broad-spectrum, promising antiviral therapy. However, identifying effective AVPs is traditionally slow and costly. This study proposes a new two-stage computational framework for AVP identification. The first stage identifies AVPs from a wide range of peptides, and the second stage recognizes AVPs targeting specific families or viruses. This method integrates contrastive learning and a multi-feature fusion strategy, focusing on sequence information and peptide characteristics, significantly enhancing predictive ability and interpretability. The evaluation results of the model show excellent performance, with accuracy of 0.9240 and Matthews correlation coefficient (MCC) score of 0.8482 on the non-AVP independent dataset, and accuracy of 0.9934 and MCC score of 0.9869 on the non-AMP independent dataset. Furthermore, our model can predict antiviral activities of AVPs against six key viral families (Coronaviridae, Retroviridae, Herpesviridae, Paramyxoviridae, Orthomyxoviridae, Flaviviridae) and eight viruses (FIV, HCV, HIV, HPIV3, HSV1, INFVA, RSV, SARS-CoV). Finally, to facilitate user accessibility, we built a user-friendly web interface deployed at https://awi.cuhk.edu.cn/~dbAMP/AVP/.
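The two headline metrics above, accuracy and MCC, can both be derived from a binary confusion matrix. The following is a minimal sketch of the standard formulas; the function name and the counts are hypothetical and are not taken from the study.

```python
import math

def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc

# Hypothetical counts for illustration only (not data from the study).
acc, mcc = accuracy_and_mcc(tp=90, tn=95, fp=5, fn=10)
print(f"accuracy={acc:.4f}  MCC={mcc:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when class sizes differ.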
Subjects
Antiviral Agents , Computational Biology , Peptides , Antiviral Agents/pharmacology , Peptides/chemistry , Computational Biology/methods , Humans , Viruses , Machine Learning , Algorithms
ABSTRACT
Cancer is a severe illness that significantly threatens human life and health. Anticancer peptides (ACPs) represent a promising therapeutic strategy for combating cancer. In silico methods enable rapid and accurate identification of ACPs without extensive human and material resources. This study proposes a two-stage computational framework called ACP-CapsPred, which can accurately identify ACPs and characterize their functional activities across different cancer types. ACP-CapsPred integrates a protein language model with evolutionary information and physicochemical properties of peptides, constructing a comprehensive profile of peptides. ACP-CapsPred employs a next-generation neural network, specifically capsule networks, to construct predictive models. Experimental results demonstrate that ACP-CapsPred exhibits satisfactory predictive capabilities in both stages, reaching state-of-the-art performance. In the first stage, ACP-CapsPred achieves accuracies of 80.25% and 95.71%, as well as F1-scores of 79.86% and 95.90%, on benchmark datasets Set 1 and Set 2, respectively. In the second stage, tasked with characterizing the functional activities of ACPs across five selected cancer types, ACP-CapsPred attains an average accuracy of 90.75% and an F1-score of 91.38%. Furthermore, ACP-CapsPred demonstrates excellent interpretability, revealing regions and residues associated with anticancer activity. Consequently, ACP-CapsPred presents a promising solution to expedite the development of ACPs and offers a novel perspective for other biological sequence analyses.
Subjects
Antineoplastic Agents , Computational Biology , Neural Networks, Computer , Peptides , Humans , Antineoplastic Agents/chemistry , Antineoplastic Agents/pharmacology , Peptides/chemistry , Computational Biology/methods , Neoplasms/drug therapy , Neoplasms/metabolism , Databases, Protein
ABSTRACT
The emergence of multidrug-resistant bacteria is a critical global crisis that poses a serious threat to public health, particularly with the rise of multidrug-resistant Staphylococcus aureus. Accurate assessment of drug resistance is essential for appropriate treatment and prevention of transmission of these deadly pathogens. Early detection of drug resistance in patients is critical for providing timely treatment and reducing the spread of multidrug-resistant bacteria. This study aims to develop a novel risk assessment framework for S. aureus that can accurately determine the resistance to multiple antibiotics. The comprehensive 7-year study involved ~20 000 isolates with susceptibility testing profiles of six antibiotics. By incorporating mass spectrometry and machine learning, the study was able to predict the susceptibility to four different antibiotics with high accuracy. To validate the accuracy of our models, we externally tested on an independent cohort and achieved impressive results with an area under the receiver operating characteristic curve of 0.94, 0.90, 0.86 and 0.91, and an area under the precision-recall curve of 0.93, 0.87, 0.87 and 0.81, respectively, for oxacillin, clindamycin, erythromycin and trimethoprim-sulfamethoxazole. In addition, the framework evaluated the level of multidrug resistance of the isolates by using the predicted drug resistance probabilities, interpreting them in the context of a multidrug resistance risk score and analyzing the performance contribution of different sample groups. The results of this study provide an efficient method for early antibiotic decision-making and a better understanding of the multidrug resistance risk of S. aureus.
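The area under the receiver operating characteristic curve reported above can be read as the probability that a resistant isolate is scored higher than a susceptible one. A minimal sketch of that rank-based definition follows; the function name, labels and probabilities are hypothetical and do not come from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen positive receives a higher
    score than a randomly chosen negative; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = resistant) and predicted resistance probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.92, 0.81, 0.40, 0.35, 0.10, 0.55, 0.77, 0.20]
print(f"AUROC = {auroc(y_true, y_prob):.4f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a sort-based implementation would be preferable for cohorts of tens of thousands of isolates.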
Subjects
Methicillin-Resistant Staphylococcus aureus , Staphylococcal Infections , Humans , Staphylococcus aureus , Staphylococcal Infections/drug therapy , Staphylococcal Infections/microbiology , Anti-Bacterial Agents/pharmacology , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Machine Learning , Risk Assessment
ABSTRACT
Tuberculosis (TB) is a severe disease caused by Mycobacterium tuberculosis that poses a significant threat to human health. The emergence of drug-resistant strains has made the global fight against TB even more challenging. Antituberculosis peptides (ATPs) have shown promising results as a potential treatment for TB. However, conventional wet lab-based approaches to ATP discovery are time-consuming and costly and often fail to discover peptides with desired properties. To address these challenges, we propose a novel machine learning-based framework called ATPfinder that can significantly accelerate the discovery of ATP. Our approach integrates various efficient peptide descriptors and utilizes the deep forest algorithm to construct the model. This neural network-like cascading structure can effectively process and mine features without complex hyperparameter tuning. Our experimental results show that ATPfinder outperforms existing ATP prediction tools, achieving state-of-the-art performance with an accuracy of 89.3% and an MCC of 0.70. Moreover, our framework exhibits better robustness than baseline algorithms commonly used for other sequence analysis tasks. Additionally, the excellent interpretability of our model can assist researchers in understanding the critical features of ATP. Finally, we developed a downloadable desktop application to simplify the use of our framework for researchers. Therefore, ATPfinder can facilitate the discovery of peptide drugs and provide potential solutions for TB treatment. Our framework is freely available at https://github.com/lantianyao/ATPfinder/ (data sets and code) and https://awi.cuhk.edu.cn/dbAMP/ATPfinder.html (software).
Subjects
Mycobacterium tuberculosis , Tuberculosis , Humans , Peptides/pharmacology , Antitubercular Agents/pharmacology , Algorithms , Tuberculosis/drug therapy , Forests , Adenosine Triphosphate
ABSTRACT
The small molecule epiberberine (EPI) is a natural alkaloid with versatile bioactivities against several diseases including cancer and bacterial infection. EPI can induce the formation of a unique binding pocket at the 5' side of a human telomeric G-quadruplex (HTG) sequence with four telomeric repeats (Q4), resulting in a nanomolar binding affinity (KD approximately 26 nM) with significant fluorescence enhancement upon binding. It is important to understand (1) how EPI binding affects HTG structural stability and (2) how enhanced EPI binding may be achieved through the engineering of the DNA binding pocket. In this work, the EPI-binding-induced HTG structure stabilization effect was probed by a peptide nucleic acid (PNA) invasion assay in combination with a series of biophysical techniques. We show that the PNA invasion-based method may be useful for the characterization of compounds binding to DNA (and RNA) structures under physiological conditions without the need to vary the solution temperature or buffer components, which are typically needed for structural stability characterization. Importantly, the combination of theoretical modeling and experimental quantification allows us to successfully engineer Q4 derivative Q4-ds-A by a simple extension of a duplex structure to Q4 at the 5' end. Q4-ds-A is an excellent EPI binder with a KD of 8 nM, with the binding enhancement achieved through the preformation of a binding pocket and a reduced dissociation rate. The tight binding of Q4 and Q4-ds-A with EPI allows us to develop a novel magnetic bead-based affinity purification system to effectively extract EPI from Rhizoma coptidis (Huang Lian) extracts.
Subjects
Berberine , G-Quadruplexes , Berberine/chemistry , Berberine/analogs & derivatives , Berberine/pharmacology , Humans , DNA/chemistry , Peptide Nucleic Acids/chemistry
ABSTRACT
Enhancers are a class of noncoding DNA, serving as crucial regulatory elements in governing gene expression by binding to transcription factors. The identification of enhancers holds paramount importance in the field of biology. However, traditional experimental methods for enhancer identification demand substantial human and material resources. Consequently, there is a growing interest in employing computational methods for enhancer prediction. In this study, we propose a two-stage framework based on deep learning, termed CapsEnhancer, for the identification of enhancers and their strengths. CapsEnhancer utilizes chaos game representation to encode DNA sequences into unique images and employs a capsule network to extract local and global features from sequence "images". Experimental results demonstrate that CapsEnhancer achieves state-of-the-art performance in both stages. In the first and second stages, the accuracy surpasses the previous best methods by 8 and 3.5%, reaching accuracies of 94.5 and 95%, respectively. Notably, this study represents the pioneering application of computer vision methods to enhancer identification tasks. Our work not only contributes novel insights to enhancer identification but also provides a fresh perspective for other biological sequence analysis tasks.
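The chaos game representation mentioned above turns a DNA sequence into a 2D point pattern by repeatedly moving halfway toward the corner assigned to each nucleotide, and binning those points yields an image. The following is a minimal sketch of that general technique, not of CapsEnhancer itself: the corner assignment, the grid size parameter `k` and the function name are assumptions made for illustration.

```python
# Corner assignment is a common convention, assumed here for illustration.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_image(seq, k=3):
    """Return a 2^k x 2^k count matrix (list of rows) for a DNA sequence."""
    size = 2 ** k
    image = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5                      # start at the centre of the unit square
    seen = 0                             # valid nucleotides since the last reset
    for base in seq.upper():
        if base not in CORNERS:          # e.g. 'N': restart the walk
            x, y, seen = 0.5, 0.5, 0
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2   # move halfway toward the corner
        seen += 1
        if seen >= k:                    # cell now depends only on the last k bases
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            image[row][col] += 1
    return image

# Hypothetical sequence for illustration.
img = cgr_image("ACGTACGTTTGACCA", k=2)
for row in img:
    print(row)
```

With this binning, each cell of the 2^k x 2^k grid corresponds to one k-mer, so the image is a spatially arranged k-mer count table that a vision-style network can consume.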
Subjects
Computational Biology , Enhancer Elements, Genetic , Computational Biology/methods , Humans , Nonlinear Dynamics , Deep Learning
ABSTRACT
The last 18 months, or more, have seen a profound shift in our global experience, with many of us navigating a once-in-100-year pandemic. To date, COVID-19 remains a life-threatening pandemic with little to no targeted therapeutic recourse. The discovery of novel antiviral agents, such as vaccines and drugs, can provide therapeutic solutions to save human beings from severe infections; however, there is no specifically effective antiviral treatment confirmed for now. Thus, great attention has been paid to the use of natural or artificial antimicrobial peptides (AMPs) as these compounds are widely regarded as promising solutions for the treatment of harmful microorganisms. Given the biological significance of AMPs, it was obvious that there was a significant need for a single platform for identifying and engaging with AMP data. This led to the creation of the dbAMP platform that provides comprehensive information about AMPs and facilitates their investigation and analysis. To date, the dbAMP has accumulated 26 447 AMPs and 2262 antimicrobial proteins from 3044 organisms using both database integration and manual curation of >4579 articles. In addition, dbAMP facilitates the evaluation of AMP structures using I-TASSER for automated protein structure prediction and structure-based functional annotation, providing predictive structure information for clinical drug development. Next-generation sequencing (NGS) and third-generation sequencing have been applied to generate large-scale sequencing reads from various environments, enabling greatly improved analysis of genome structure. In this update, we launch an efficient online tool that can effectively identify AMPs from genome/metagenome and proteome data of all species in a short period. In conclusion, these improvements promote the dbAMP as one of the most abundant and comprehensively annotated resources for AMPs. The updated dbAMP is now freely accessible at http://awi.cuhk.edu.cn/dbAMP.
Subjects
Antimicrobial Peptides , Databases, Factual , Software , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/pharmacology , Genomics , Open Reading Frames , Protein Conformation , Proteomics
ABSTRACT
MicroRNAs (miRNAs) are noncoding RNAs with 18-26 nucleotides; they pair with target mRNAs to regulate gene expression and produce significant changes in various physiological and pathological processes. In recent years, the interaction between miRNAs and their target genes has become one of the mainstream directions for drug development. As a large-scale biological database that mainly provides miRNA-target interactions (MTIs) verified by biological experiments, miRTarBase has undergone five revisions and enhancements. The database has accumulated >2 200 449 verified MTIs from 13 389 manually curated articles and CLIP-seq data. An optimized scoring system is adopted to enhance this update's critical recognition of MTI-related articles and corresponding disease information. In addition, single-nucleotide polymorphisms and disease-related variants related to the binding efficiency of miRNA and target were characterized in miRNAs and gene 3' untranslated regions. miRNA expression profiles across extracellular vesicles, blood and different tissues, including exosomal miRNAs and tissue-specific miRNAs, were integrated to explore miRNA functions and biomarkers. For the user interface, we have classified attributes, including RNA expression, specific interaction, protein expression and biological function, for various validation experiments related to the role of miRNA. We also used seed sequence information to evaluate the binding sites of miRNA. In summary, these enhancements render miRTarBase as one of the most research-amicable MTI databases that contain comprehensive and experimentally verified annotations. The newly updated version of miRTarBase is now available at https://miRTarBase.cuhk.edu.cn/.
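As an illustration of how seed sequence information can be used to locate candidate binding sites, the sketch below searches a 3' UTR for the reverse complement of miRNA positions 2-8 (the canonical 7mer-m8 site type). This is an assumed, simplified criterion and not necessarily the procedure used by miRTarBase; the function names and the UTR fragment are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_site(mirna):
    """Reverse complement of miRNA positions 2-8, i.e. the 7mer-m8 target motif."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return "".join(COMPLEMENT[b] for b in reversed(seed))

def find_seed_matches(mirna, utr):
    """Return 0-based start positions of seed-complementary sites in a 3' UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]

mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a-5p, used only as an illustration
utr = "AAGCUACCUCAUUGGCUACCUCGA"   # hypothetical 3' UTR fragment
print(seed_site(mirna), find_seed_matches(mirna, utr))
```

A perfect seed match is only a first filter; experimentally verified interactions, which are the focus of the database, remain the stronger evidence.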
Subjects
3' Untranslated Regions , Databases, Nucleic Acid , Gene Regulatory Networks , MicroRNAs/genetics , Neoplasms/genetics , RNA, Untranslated/genetics , Animals , Binding Sites , Biomarkers/metabolism , Data Mining/statistics & numerical data , Exosomes/chemistry , Exosomes/metabolism , Gene Expression Regulation , Humans , Internet , Mice , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , Neoplasms/metabolism , Neoplasms/pathology , Polymorphism, Single Nucleotide , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Tumor Cells, Cultured , User-Computer Interface
ABSTRACT
Glycerol-3-phosphate acyltransferase (GPAT) catalyzes the first step in triacylglycerol synthesis. Understanding its substrate recognition mechanism may help to design drugs to regulate the production of glycerol lipids in cells. In this work, we investigate how the native substrate, glycerol-3-phosphate (G3P), and palmitoyl-coenzyme A (CoA) bind to the human GPAT isoform GPAT4 via molecular dynamics simulations (MD). As no experimentally resolved GPAT4 structure is available, the AlphaFold model is employed to construct the GPAT4-substrate complex model. Using another isoform, GPAT1, we demonstrate that once the ligand binding is properly addressed, the AlphaFold complex model can deliver similar results to the experimentally resolved structure in MD simulations. Following the validated protocol of complex construction, we perform MD simulations using the GPAT4-substrate complex. Our simulations reveal that R427 is an important residue in recognizing G3P via a stable salt bridge, but its motion can bring the ligand to different binding hotspots on GPAT4. Such high flexibility can be attributed to the flexible region that exists only on GPAT4 and not on GPAT1. Our study reveals the substrate recognition mechanism of GPAT4 and hence paves the way towards designing GPAT4 inhibitors.
Subjects
Glycerol , Glycerophosphates , Molecular Dynamics Simulation , Humans , Ligands , Glycerol-3-Phosphate O-Acyltransferase , Protein Isoforms , Phosphates
ABSTRACT
Inflammation is a biological response to harmful stimuli, aiding in the maintenance of tissue homeostasis. However, excessive or persistent inflammation can precipitate a myriad of pathological conditions. Although current treatments such as NSAIDs, corticosteroids, and immunosuppressants are effective, they can have side effects and resistance issues. In this backdrop, anti-inflammatory peptides (AIPs) have emerged as a promising therapeutic approach against inflammation. Leveraging machine learning methods, we have the opportunity to accelerate the discovery and investigation of these AIPs more effectively. In this study, we proposed an advanced framework by ensemble machine learning and deep learning for AIP prediction. Initially, we constructed three individual models with extremely randomized trees (ET), gated recurrent unit (GRU), and convolutional neural networks (CNNs) with attention mechanism and then used stacking architecture to build the final predictor. By utilizing various sequence encodings and combining the strengths of different algorithms, our predictor demonstrated exemplary performance. On our independent test set, our model achieved an accuracy, MCC, and F1-score of 0.757, 0.500, and 0.707, respectively, clearly outperforming other contemporary AIP prediction methods. Additionally, our model offers profound insights into the feature interpretation of AIPs, establishing a valuable knowledge foundation for the design and development of future anti-inflammatory strategies.
Subjects
Deep Learning , Humans , Anti-Inflammatory Agents/pharmacology , Anti-Inflammatory Agents/therapeutic use , Peptides/pharmacology , Inflammation/drug therapy , Algorithms , Machine Learning
ABSTRACT
Radiation of the plant pyridoxal 5'-phosphate (PLP)-dependent aromatic l-amino acid decarboxylase (AAAD) family has yielded an array of paralogous enzymes exhibiting divergent substrate preferences and catalytic mechanisms. Plant AAADs catalyze either the decarboxylation or decarboxylation-dependent oxidative deamination of aromatic l-amino acids to produce aromatic monoamines or aromatic acetaldehydes, respectively. These compounds serve as key precursors for the biosynthesis of several important classes of plant natural products, including indole alkaloids, benzylisoquinoline alkaloids, hydroxycinnamic acid amides, phenylacetaldehyde-derived floral volatiles, and tyrosol derivatives. Here, we present the crystal structures of four functionally distinct plant AAAD paralogs. Through structural and functional analyses, we identify variable structural features of the substrate-binding pocket that underlie the divergent evolution of substrate selectivity toward indole, phenyl, or hydroxyphenyl amino acids in plant AAADs. Moreover, we describe two mechanistic classes of independently arising mutations in AAAD paralogs leading to the convergent evolution of the derived aldehyde synthase activity. Applying knowledge learned from this study, we successfully engineered a shortened benzylisoquinoline alkaloid pathway to produce (S)-norcoclaurine in yeast. This work highlights the pliability of the AAAD fold that allows change of substrate selectivity and access to alternative catalytic mechanisms with only a few mutations.
Subjects
Aromatic-L-Amino-Acid Decarboxylases/chemistry , Catalytic Domain , Evolution, Molecular , Plant Proteins/chemistry , Amino Acids, Aromatic/chemistry , Amino Acids, Aromatic/metabolism , Aromatic-L-Amino-Acid Decarboxylases/genetics , Aromatic-L-Amino-Acid Decarboxylases/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Substrate Specificity
ABSTRACT
One of the major challenges in cancer therapy lies in the limited targeting specificity exhibited by existing anti-cancer drugs. Tumor-homing peptides (THPs) have emerged as a promising solution to this issue, due to their capability to specifically bind to and accumulate in tumor tissues while minimally impacting healthy tissues. THPs are short oligopeptides that offer a superior biological safety profile, with minimal antigenicity, and faster incorporation rates into target cells/tissues. However, identifying THPs experimentally, using methods such as phage display or in vivo screening, is a complex, time-consuming task, hence the need for computational methods. In this study, we proposed StackTHPred, a novel machine learning-based framework that predicts THPs using optimal features and a stacking architecture. With an effective feature selection algorithm and three tree-based machine learning algorithms, StackTHPred has demonstrated advanced performance, surpassing existing THP prediction methods. It achieved an accuracy of 0.915 and a 0.831 Matthews Correlation Coefficient (MCC) score on the main dataset, and an accuracy of 0.883 and a 0.767 MCC score on the small dataset. StackTHPred also offers favorable interpretability, enabling researchers to better understand the intrinsic characteristics of THPs. Overall, StackTHPred is beneficial for both the exploration and identification of THPs and facilitates the development of innovative cancer therapies.
Subjects
Neoplasms , Peptides , Humans , Peptides/metabolism , Oligopeptides , Algorithms , Machine Learning
ABSTRACT
Cancer is one of the leading diseases threatening human life and health worldwide. Peptide-based therapies have attracted much attention in recent years. Therefore, the precise prediction of anticancer peptides (ACPs) is crucial for discovering and designing novel cancer treatments. In this study, we proposed a novel machine learning framework (GRDF) that incorporates deep graphical representation and deep forest architecture for identifying ACPs. Specifically, GRDF extracts graphical features based on the physicochemical properties of peptides and integrates their evolutionary information along with binary profiles for constructing models. Moreover, we employ the deep forest algorithm, which adopts a layer-by-layer cascade architecture similar to deep neural networks, enabling excellent performance on small datasets but without complicated tuning of hyperparameters. The experiment shows GRDF exhibits state-of-the-art performance on two elaborate datasets (Set 1 and Set 2), achieving 77.12% accuracy and 77.54% F1-score on Set 1, as well as 94.10% accuracy and 94.15% F1-score on Set 2, exceeding existing ACP prediction methods. Our models exhibit greater robustness than the baseline algorithms commonly used for other sequence analysis tasks. In addition, GRDF is well-interpretable, enabling researchers to better understand the features of peptide sequences. The promising results demonstrate that GRDF is remarkably effective in identifying ACPs. Therefore, the framework presented in this study could assist researchers in facilitating the discovery of anticancer peptides and contribute to developing novel cancer treatments.
Subjects
Neoplasms , Peptides , Humans , Peptides/chemistry , Algorithms , Amino Acid Sequence , Neural Networks, Computer
ABSTRACT
Cytochrome P450 17A1 (CYP17A1) is one of the key enzymes in steroidogenesis that produces dehydroepiandrosterone (DHEA) from cholesterol. Abnormal DHEA production may lead to the progression of severe diseases, such as prostatic and breast cancers. Thus, CYP17A1 is a druggable target for anti-cancer molecule development. In this study, cheminformatic analyses and quantitative structure-activity relationship (QSAR) modeling were applied on a set of 962 CYP17A1 inhibitors (i.e., consisting of 279 steroidal and 683 nonsteroidal inhibitors) compiled from the ChEMBL database. For steroidal inhibitors, a QSAR classification model built using the PubChem fingerprint along with the extra trees algorithm achieved the best performance, reflected by the accuracy values of 0.933, 0.818, and 0.833 for the training, cross-validation, and test sets, respectively. For nonsteroidal inhibitors, a systematic cheminformatic analysis was applied for exploring the chemical space, Murcko scaffolds, and structure-activity relationships (SARs) for visualizing distributions, patterns, and representative scaffolds for drug discoveries. Furthermore, seven total QSAR classification models were established based on the nonsteroidal scaffolds, and two activity cliff (AC) generators were identified. The best performing model out of these seven was model VIII, which is built upon the PubChem fingerprint along with the random forest algorithm. It achieved a robust accuracy across the training set, the cross-validation set, and the test set, i.e., 0.96, 0.92, and 0.913, respectively. It is anticipated that the results presented herein would be instrumental for further CYP17A1 inhibitor drug discovery efforts.
Subjects
Cheminformatics , Enzyme Inhibitors , Steroid 17-alpha-Hydroxylase , Dehydroepiandrosterone , Enzyme Inhibitors/pharmacology , Machine Learning , Quantitative Structure-Activity Relationship , Steroids/chemistry , Steroid 17-alpha-Hydroxylase/antagonists & inhibitors
ABSTRACT
When employing molecular dynamics (MD) simulations for computer-aided drug design, the quality of the used force fields is highly important. Here we present reparametrisations of the force fields for the core molecules from 9 different β-lactam classes, for which we utilized the force field Toolkit and Gaussian calculations. We focus on the parametrisation of the dihedral angles, with the goal of reproducing the optimised quantum geometry in MD simulations. Parameters taken from CGenFF turn out to be a good initial guess for the multiplicity of each dihedral angle, but the key to a successful parametrisation is found to lie in the phase shifts. Based on the optimised quantum geometry, we come up with a strategy for predicting the phase shifts prior to the dihedral potential fitting. This allows us to successfully parameterise 8 out of the 11 molecules studied here, while the remaining 3 molecules can also be parameterised with small adjustments. Our work highlights the importance of predicting the dihedral phase shifts in the ligand parametrisation protocol, and provides a simple yet valuable strategy for improving the process of parameterising force fields of drug-like molecules.
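To see why the phase shift matters, recall the CHARMM-style dihedral term K(1 + cos(nφ - δ)), where n is the multiplicity and δ the phase shift: for a fixed n, δ alone decides which angles are energy minima. The sketch below only illustrates that functional form; it is not the authors' prediction strategy, and the function names and parameter values are hypothetical.

```python
import math

def dihedral_energy(phi_deg, k_phi, n, delta_deg):
    """CHARMM-style dihedral term: K * (1 + cos(n*phi - delta))."""
    phi, delta = math.radians(phi_deg), math.radians(delta_deg)
    return k_phi * (1.0 + math.cos(n * phi - delta))

def minima_deg(n, delta_deg):
    """Angles in [-180, 180) where a single term is at its minimum."""
    out = []
    for m in range(n):
        # cos(n*phi - delta) = -1  =>  phi = (delta + 180 + 360*m) / n
        phi = (delta_deg + 180.0 + 360.0 * m) / n
        out.append((phi + 180.0) % 360.0 - 180.0)   # wrap into [-180, 180)
    return sorted(out)

# Hypothetical parameters for illustration, not taken from the paper.
for delta in (0.0, 180.0):
    print(f"n=2, delta={delta:5.1f} -> minima at {minima_deg(2, delta)} deg")
```

With n = 2, switching δ from 0 to 180 degrees moves the preferred geometry from perpendicular to planar, so a correct multiplicity with the wrong phase shift would still drive the molecule away from the optimised quantum geometry.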
Subjects
Lactams , Molecular Dynamics Simulation , Drug Design
ABSTRACT
Flavonoids are important polyphenolic natural products, ubiquitous in land plants, that play diverse functions in plants' survival in their ecological niches, including UV protection, pigmentation for attracting pollinators, symbiotic nitrogen fixation, and defense against herbivores. Chalcone synthase (CHS) catalyzes the first committed step in plant flavonoid biosynthesis and is highly conserved in all land plants. In several previously reported crystal structures of CHSs from flowering plants, the catalytic cysteine is oxidized to sulfinic acid, indicating enhanced nucleophilicity in this residue associated with its increased susceptibility to oxidation. In this study, we report a set of new crystal structures of CHSs representing all five major lineages of land plants (bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms), spanning 500 million years of evolution. We reveal that the structures of CHS from a lycophyte and a moss species preserve the catalytic cysteine in a reduced state, in contrast to the cysteine sulfinic acid seen in all euphyllophyte CHS structures. In vivo complementation, in vitro biochemical and mutagenesis analyses, and molecular dynamics simulations identified a set of residues that differ between basal-plant and euphyllophyte CHSs and modulate catalytic cysteine reactivity. We propose that the CHS active-site environment has evolved in euphyllophytes to further enhance the nucleophilicity of the catalytic cysteine since the divergence of euphyllophytes from other vascular plant lineages 400 million years ago. These changes in CHS could have contributed to the diversification of flavonoid biosynthesis in euphyllophytes, which in turn contributed to their dominance in terrestrial ecosystems.
Subjects
Acyltransferases/metabolism , Biological Evolution , Cysteine/metabolism , Embryophyta/enzymology , Acyltransferases/chemistry , Amino Acid Sequence , Catalysis , Catalytic Domain , Crystallography, X-Ray , Embryophyta/classification , Embryophyta/physiology , Molecular Dynamics Simulation , Phylogeny , Protein Conformation , Sequence Homology, Amino Acid
ABSTRACT
Substrate permissiveness has long been regarded as the raw materials for the evolution of new enzymatic functions. In land plants, hydroxycinnamoyltransferase (HCT) is an essential enzyme of the phenylpropanoid metabolism. Although essential enzymes are normally associated with high substrate specificity, HCT can utilize a variety of non-native substrates. To examine the structural and dynamic basis of substrate permissiveness in this enzyme, we report the crystal structure of HCT from Selaginella moellendorffii and molecular dynamics (MD) simulations performed on five orthologous HCTs from several major lineages of land plants. Through altogether 17-µs MD simulations, we demonstrate the prevalent swing motion of an arginine handle on a submicrosecond timescale across all five HCTs, which plays a key role in native substrate recognition by these intrinsically promiscuous enzymes. Our simulations further reveal how a non-native substrate of HCT engages a binding site different from that of the native substrate and diffuses to reach the catalytic center and its co-substrate. By numerically solving the Smoluchowski equation, we show that the presence of such an alternative binding site, even when it is distant from the catalytic center, always increases the reaction rate of a given substrate. However, this increase is only significant for enzyme-substrate reactions heavily influenced by diffusion. In these cases, binding non-native substrates 'off-center' provides an effective rationale to develop substrate permissiveness while maintaining the native functions of promiscuous enzymes.
Subjects
Acetophenones/chemistry , Acetophenones/metabolism , Acyltransferases/chemistry , Acyltransferases/metabolism , Substrate Specificity/physiology , Computational Biology , Crystallography, X-Ray , Plant Proteins/chemistry , Plant Proteins/metabolism , Selaginellaceae/enzymology
ABSTRACT
Hydroxycinnamoyl-CoA:shikimate hydroxycinnamoyltransferase (HCT) is an essential acyltransferase that mediates flux through plant phenylpropanoid metabolism by catalyzing a reaction between p-coumaroyl-CoA and shikimate, yet it also exhibits broad substrate permissiveness in vitro. How do enzymes like HCT avoid functional derailment by cellular metabolites that qualify as non-native substrates? Here, we combine X-ray crystallography and molecular dynamics to reveal distinct dynamic modes of HCT under native and non-native catalysis. We find that essential electrostatic and hydrogen-bonding interactions between the ligand and active site residues, permitted by active site plasticity, are elicited more effectively by shikimate than by other non-native substrates. This work provides a structural basis for how dynamic conformational states of HCT favor native over non-native catalysis by reducing the number of futile encounters between the enzyme and shikimate.
Subjects
Acyltransferases/chemistry , Arabidopsis Proteins/chemistry , Catalytic Domain , Protein Conformation , Acyltransferases/genetics , Acyltransferases/metabolism , Amino Acid Sequence , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Binding Sites/genetics , Biocatalysis , Crystallography, X-Ray , Hydrogen Bonding , Kinetics , Molecular Dynamics Simulation , Mutation , Sequence Homology, Amino Acid , Shikimic Acid/chemistry , Shikimic Acid/metabolism , Static Electricity , Substrate Specificity
ABSTRACT
With the rapid expansion of our computing power, molecular dynamics (MD) simulations ranging from hundreds of nanoseconds to microseconds or even milliseconds have become increasingly common. The majority of these long trajectories are obtained from plain (vanilla) MD simulations, where no enhanced sampling or free energy calculation method is employed. To promote the "recycling" of these trajectories, we developed the Virtual Substitution Scan (VSS) toolkit as a plugin of the open-source visualization and analysis software VMD. Based on the single-step free energy perturbation (sFEP) method, VSS enables the user to post-process a vanilla MD trajectory for a fast free energy scan of substituting aryl hydrogens by small functional groups. Dihedrals of the functional groups are sampled explicitly in VSS, which improves the performance of the calculation and is found particularly important for certain groups. As a proof-of-concept demonstration, we employ VSS to compute the solvation free energy change upon substituting the hydrogen of a benzene molecule by 12 small functional groups frequently considered in lead optimization. Additionally, VSS is used to compute the relative binding free energy of four selected ligands of the T4 lysozyme. Overall, the computational cost of VSS is only a fraction of the corresponding multi-step FEP (mFEP) calculation, while its results agree reasonably well with those of mFEP, indicating that VSS offers a promising tool for rapid free energy scan of small functional group substitutions. © 2016 Wiley Periodicals, Inc. Biopolymers 105: 324-336, 2016.
ABSTRACT
Single-step free energy perturbation (sFEP) has often been proposed as an efficient tool for a quick free energy scan due to its straightforward protocol and the ability to recycle an existing molecular dynamics trajectory for free energy calculations. Although sFEP is expected to fail when the sampling of a system is inefficient, it is often expected to hold for an alchemical transformation between ligands with a moderate difference in their sizes, e.g., transforming a benzene into an ethylbenzene. Yet, exceptions were observed in calculations for anisole and methylaniline, which have physical sizes similar to that of ethylbenzene. In this study, we show that such exceptions arise from the sampling inefficiency on an unexpected rigid degree of freedom, namely, the bond angle θ. The distributions of θ differ dramatically between the two end states of an sFEP calculation, i.e., the conformation of the ligand changes significantly during the alchemical transformation process. Our investigation also reveals the interrelation between the ligand conformation and the intramolecular nonbonded interactions. This knowledge suggests a best combination of the ghost ligand potential and the dual topology setting, which improves the accuracy in a single reference sFEP calculation by bringing down its error from around 5 kBT to kBT.
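Single-step FEP rests on the Zwanzig relation, ΔF = -kBT ln⟨exp(-ΔU/kBT)⟩, where the average runs over frames sampled in the reference state and ΔU is the target-minus-reference energy of each frame. A minimal sketch of that estimator follows, working in units of kBT; the function name and the per-frame values are hypothetical and are not simulation data.

```python
import math

def sfep_free_energy(delta_u_kt):
    """Zwanzig estimator, in units of kBT.

    delta_u_kt: per-frame energy differences (U_target - U_reference) / kBT,
    evaluated on frames sampled from the reference state.
    """
    if not delta_u_kt:
        raise ValueError("need at least one frame")
    # log-sum-exp with the largest exponent factored out for numerical stability
    a = max(-du for du in delta_u_kt)
    s = sum(math.exp(-du - a) for du in delta_u_kt)
    return -(a + math.log(s / len(delta_u_kt)))

# Hypothetical per-frame energy differences (in kBT), not simulation data.
frames = [1.2, 0.8, 1.5, 0.9, 1.1, 2.3, 0.7]
print(f"dF = {sfep_free_energy(frames):.3f} kBT")
```

Because the exponential average is dominated by the few frames with the lowest ΔU, the estimate degrades when the reference trajectory rarely visits conformations favoured by the target state, which is the kind of sampling inefficiency the abstract attributes to the bond angle θ.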