ABSTRACT
In understanding and redesigning the function of proteins in modern biochemistry, protein engineers are increasingly focusing on exploring regions in proteins called loops. Analyzing various characteristics of these regions helps the experts design the transfer of the desired function from one protein to another. This process is denoted as loop grafting. We designed a set of interactive visualizations that provide experts with visual support through all the loop grafting pipeline steps. The workflow is divided into several phases, reflecting the steps of the pipeline. Each phase is supported by a specific set of abstracted 2D visual representations of proteins and their loops that are interactively linked with the 3D View of proteins. By sequentially passing through the individual phases, the user shapes the list of loops that are potential candidates for loop grafting. Finally, the actual in-silico insertion of the loop candidates from one protein to the other is performed, and the results are visually presented to the user. In this way, the fully computational rational design of proteins and their loops results in newly designed protein structures that can be further assembled and tested through in-vitro experiments. We showcase the contribution of our visual support design on a real case scenario changing the enantiomer selectivity of the engineered enzyme. Moreover, we provide the readers with the experts' feedback.
ABSTRACT
SUMMARY: Protein design requires information about how mutations affect protein stability. Many web-based predictors are available for this purpose, yet comparing them or using them en masse is difficult. Here, we present BenchStab, a console tool/Python package for easy and quick execution of 19 predictors and result collection on a list of mutants. Moreover, the tool is easily extensible with additional predictors. We created an independent dataset derived from the FireProtDB and evaluated 24 different prediction methods. AVAILABILITY AND IMPLEMENTATION: BenchStab is an open-source Python package available at https://github.com/loschmidt/BenchStab with a detailed README and example usage at https://loschmidt.chemi.muni.cz/benchstab. The BenchStab dataset is available on Zenodo: https://zenodo.org/records/10637728.
Subjects
Internet , Software , Protein Stability , Proteins/chemistry , Computational Biology/methods , Databases, Protein
ABSTRACT
The engineering of efficient enzymes for large-scale production of industrially relevant compounds is a challenging task. Utilizing rational protein design, which relies on a comprehensive understanding of mechanistic information, holds significant promise for achieving success in this endeavor. Pre-steady-state kinetic measurements, obtained either through fast-mixing techniques or photoswitchable substrates, provide crucial mechanistic insights. The latter approach not only furnishes mechanistic clarity but also affords real-time structural elucidation of reaction intermediates via time-resolved femtosecond crystallography. Unfortunately, only a limited number of such valuable mechanistic probes are available. To address this gap, we applied a multidisciplinary approach, including computational analysis, chemical synthesis, physicochemical property screening, and enzyme kinetics to identify promising candidates for photoswitchable probes. We demonstrate the approach by designing an azobenzene-based photoswitchable substrate tailored for haloalkane dehalogenases, a prototypic class of enzymes pivotal in developing computational tools for rational protein design. The probe was subjected to steady-state and pre-steady-state kinetic analysis, which revealed new insights about the catalytic behavior of the model biocatalysts. We employed laser-triggered Z-to-E azobenzene photoswitching to generate the productive isomer in situ, opening avenues for advanced mechanistic studies using time-resolved femtosecond crystallography. Our results not only pave the way for the mechanistic understanding of this model enzyme family, incorporating both kinetic and structural dimensions, but also propose a systematic approach to the rational design of photoswitchable enzymatic substrates.
ABSTRACT
Every year, more than 19 million cancer cases are diagnosed, and this number continues to increase annually. Since standard treatment options have varying success rates for different types of cancer, understanding the biology of an individual's tumour becomes crucial, especially for cases that are difficult to treat. Personalised high-throughput profiling, using next-generation sequencing, allows for a comprehensive examination of biopsy specimens. Furthermore, the widespread use of this technology has generated a wealth of information on cancer-specific gene alterations. However, there exists a significant gap between identified alterations and their proven impact on protein function. Here, we present a bioinformatics pipeline that enables fast analysis of a missense mutation's effect on stability and function in known oncogenic proteins. This pipeline is coupled with a predictor that summarises the outputs of different tools used throughout the pipeline, providing a single probability score, achieving a balanced accuracy above 86%. The pipeline incorporates a virtual screening method to suggest potential FDA/EMA-approved drugs to be considered for treatment. We showcase three case studies to demonstrate the timely utility of this pipeline. To facilitate access and analysis of cancer-related mutations, we have packaged the pipeline as a web server, which is freely available at https://loschmidt.chemi.muni.cz/predictonco/.
Scientific contribution: This work presents a novel bioinformatics pipeline that integrates multiple computational tools to predict the effects of missense mutations on proteins of oncological interest. The pipeline uniquely combines fast protein modelling, stability prediction, and evolutionary analysis with virtual drug screening, while offering actionable insights for precision oncology. This comprehensive approach surpasses existing tools by automating the interpretation of mutations and suggesting potential treatments, thereby striving to bridge the gap between sequencing data and clinical application.
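The balanced accuracy quoted in this abstract is the mean of sensitivity and specificity, which keeps the metric meaningful when benign and deleterious mutations are unevenly represented. The sketch below is a minimal Python illustration of that definition only. The function names, the 0.5 decision threshold, and the example scores and labels are assumptions made for illustration and are not taken from the PredictONCO pipeline.

```python
def classify(scores, threshold=0.5):
    """Turn probability scores into binary calls (1 = predicted deleterious).

    The 0.5 threshold is an illustrative assumption.
    """
    return [1 if s >= threshold else 0 for s in scores]


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up example: four deleterious and two benign mutations.
labels = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6]
calls = classify(scores)
print(balanced_accuracy(labels, calls))  # sensitivity 3/4, specificity 1/2
```

In this toy case plain accuracy would be 4/6, while balanced accuracy weights the two classes equally and gives 0.625.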
ABSTRACT
Computational study of the effect of drug candidates on intrinsically disordered biomolecules is challenging due to their vast and complex conformational space. Here, we developed a comparative Markov state analysis (CoVAMPnet) framework to quantify changes in the conformational distribution and dynamics of a disordered biomolecule in the presence and absence of small organic drug candidate molecules. First, molecular dynamics trajectories are generated using enhanced sampling, in the presence and absence of small molecule drug candidates, and ensembles of soft Markov state models (MSMs) are learned for each system using unsupervised machine learning. Second, these ensembles of learned MSMs are aligned across different systems based on a solution to an optimal transport problem. Third, the directional importance of inter-residue distances for the assignment to different conformational states is assessed by a discriminative analysis of aggregated neural network gradients. This final step provides interpretability and biophysical context to the learned MSMs. We applied this novel computational framework to assess the effects of ongoing phase 3 therapeutics tramiprosate (TMP) and its metabolite 3-sulfopropanoic acid (SPA) on the disordered Aβ42 peptide involved in Alzheimer's disease. Based on adaptive sampling molecular dynamics and CoVAMPnet analysis, we observed that both TMP and SPA preserved more structured conformations of Aβ42 by interacting nonspecifically with charged residues. SPA impacted Aβ42 more than TMP, protecting α-helices and suppressing the formation of aggregation-prone β-strands. Experimental biophysical analyses showed only mild effects of TMP/SPA on Aβ42 and activity enhancement by the endogenous metabolization of TMP into SPA. Our data suggest that TMP/SPA may also target biomolecules other than Aβ peptides. The CoVAMPnet method is broadly applicable to study the effects of drug candidates on the conformational behavior of intrinsically disordered biomolecules.
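The alignment step can be illustrated with a deliberately simplified stand-in. When two models have the same number of states and every state carries equal mass, an optimal transport problem reduces to an assignment problem: find the one-to-one matching of states with the lowest total cost. The Python sketch below solves that reduced problem by brute force for a handful of states. The feature vectors, the squared Euclidean cost, and the function names are illustrative assumptions, and this is not the CoVAMPnet implementation, which aligns ensembles of soft MSMs.

```python
from itertools import permutations


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align_states(reference, other):
    """Match each reference state to one state of `other` at minimal total cost.

    Returns (perm, cost), where perm[i] is the index in `other` assigned to
    reference state i. Brute force over all permutations, so only suitable
    for a small number of states.
    """
    n = len(reference)
    if len(other) != n:
        raise ValueError("both models must have the same number of states")
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum(squared_distance(reference[i], other[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Made-up state descriptors (e.g. two averaged inter-residue distances per state).
apo_states = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
drug_states = [(5.1, 4.9), (0.1, 0.0), (0.9, 1.1)]
perm, cost = align_states(apo_states, drug_states)
print(perm, cost)
```

For more than a few states, a polynomial-time assignment solver or a general optimal transport solver would replace the permutation loop.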
ABSTRACT
Recombinant proteins play pivotal roles in numerous applications including industrial biocatalysts or therapeutics. Despite the recent progress in computational protein structure prediction, protein solubility and reduced aggregation propensity remain challenging attributes to design. Identification of aggregation-prone regions is essential for understanding misfolding diseases or designing efficient protein-based technologies, and as such has a great socio-economic impact. Here, we introduce AggreProt, a user-friendly webserver that automatically exploits an ensemble of deep neural networks to predict aggregation-prone regions (APRs) in protein sequences. Trained on experimentally evaluated hexapeptides, AggreProt compares to or outperforms state-of-the-art algorithms on two independent benchmark datasets. The server provides per-residue aggregation profiles along with information on solvent accessibility and transmembrane propensity within an intuitive interface with interactive sequence and structure viewers for comprehensive analysis. We demonstrate AggreProt efficacy in predicting differential aggregation behaviours in proteins on several use cases, which emphasize its potential for guiding protein engineering strategies towards decreased aggregation propensity and improved solubility. The webserver is freely available and accessible at https://loschmidt.chemi.muni.cz/aggreprot/.
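One common way to turn hexapeptide-level predictions into a per-residue profile is to slide a six-residue window along the sequence and average, for each residue, the scores of all windows that cover it. The Python sketch below shows that aggregation under stated assumptions. The toy scoring function (fraction of hydrophobic residues) is a hypothetical placeholder for a trained model, and the abstract does not specify how AggreProt itself combines window scores.

```python
HYDROPHOBIC = set("AVILMFWY")


def toy_score(peptide):
    """Hypothetical stand-in for a trained predictor: fraction of hydrophobic residues."""
    return sum(1 for residue in peptide if residue in HYDROPHOBIC) / len(peptide)


def per_residue_profile(sequence, score_window=toy_score, window=6):
    """Average the scores of all windows covering each residue."""
    n = len(sequence)
    if n < window:
        raise ValueError("sequence is shorter than the window")
    totals = [0.0] * n
    counts = [0] * n
    for start in range(n - window + 1):
        score = score_window(sequence[start:start + window])
        for i in range(start, start + window):
            totals[i] += score
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


# Made-up sequence: a polar stretch followed by a hydrophobic stretch.
profile = per_residue_profile("KKKKKKVVVVVV")
print([round(value, 2) for value in profile])
```

Residues near the termini are covered by fewer windows, so their values rest on less evidence than those in the middle of the sequence.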
Subjects
Internet , Protein Aggregates , Software , Protein Engineering/methods , Algorithms , Proteins/chemistry , Proteins/genetics , Neural Networks, Computer , Protein Folding , Solubility , Protein Conformation
ABSTRACT
FGF21 is an endocrine signaling protein belonging to the family of fibroblast growth factors (FGFs). It has emerged as a molecule of interest for treating various metabolic diseases due to its role in regulating glucogenesis and ketogenesis in the liver. However, FGF21 is prone to heat, proteolytic, and acid-mediated degradation, and its low molecular weight makes it susceptible to kidney clearance, significantly reducing its therapeutic potential. Protein engineering studies addressing these challenges have generally shown that increasing the thermostability of FGF21 leads to improved pharmacokinetics. Here, we describe the computer-aided design and experimental characterization of FGF21 variants with melting temperature enhanced by up to 15 °C, uncompromised efficacy in activating MAPK/ERK signaling in Hep G2 cell culture, and an ability to stimulate proliferation of Hep G2 and NIH 3T3 fibroblast cells comparable with that of FGF21-WT. We propose that stabilizing the FGF21 molecule by rational design should be combined with other reported stabilization strategies to maximize the pharmaceutical potential of FGF21.
ABSTRACT
Molecular docking is a key technique in various fields like structural biology, medicinal chemistry, and biotechnology. It is widely used for virtual screening during drug discovery, computer-assisted drug design, and protein engineering. A general molecular docking process consists of the target and ligand selection, their preparation, and the docking process itself, followed by the evaluation of the results. However, the most commonly used docking software provides no or very basic evaluation possibilities. Scripting and external molecular viewers are often used, which are not designed for an efficient analysis of docking results. Therefore, we developed InVADo, a comprehensive interactive visual analysis tool for large docking data. It consists of multiple linked 2D and 3D views. It filters and spatially clusters the data, and enriches it with post-docking analysis results of protein-ligand interactions and functional groups, to enable well-founded decision-making. In an exemplary case study, domain experts confirmed that InVADo facilitates and accelerates the analysis workflow. They rated it as a convenient, comprehensive, and feature-rich tool, especially useful for virtual screening.
Subjects
Computer Graphics , Software , Molecular Docking Simulation , Ligands , Drug Discovery/methods
ABSTRACT
ChannelsDB 2.0 is an updated database providing structural information about the position, geometry and physicochemical properties of protein channels (tunnels and pores) within deposited biomacromolecular structures from the PDB and AlphaFoldDB databases. The newly deposited information originated from several sources. Firstly, we included data calculated using the popular CAVER tool to complement the data obtained using the original MOLE tool for detection and analysis of protein tunnels and pores. Secondly, we added tunnels starting from cofactors within the AlphaFill database to enlarge the scope of the database to protein models based on UniProt. This has enlarged available channel annotations ~4.6 times as of 1 September 2023. The database stores information about geometrical features, e.g. length and radius, and physicochemical properties based on channel-lining amino acids. The stored data are interlinked with the available UniProt mutation annotation data. ChannelsDB 2.0 provides an excellent resource for deep analysis of the role of biomacromolecular tunnels and pores. The database is available free of charge: https://channelsdb2.biodata.ceitec.cz.
Subjects
Databases, Protein , Proteins , Software , Amino Acids , Proteins/chemistry , Protein Conformation
ABSTRACT
PredictONCO 1.0 is a unique web server that analyzes effects of mutations on proteins frequently altered in various cancer types. The server can assess the impact of mutations on the protein sequential and structural properties and apply a virtual screening to identify potential inhibitors that could be used as a highly individualized therapeutic approach, possibly based on drug repurposing. PredictONCO integrates predictive algorithms and state-of-the-art computational tools combined with information from established databases. The user interface was carefully designed for the target specialists in precision oncology, molecular pathology, clinical genetics and clinical sciences. The tool summarizes the effect of the mutation on protein stability and function and currently covers 44 common oncological targets. The binding affinities of Food and Drug Administration/European Medicines Agency-approved drugs with the wild-type and mutant proteins are calculated to facilitate treatment decisions. The reliability of predictions was confirmed against 108 clinically validated mutations. The server provides a fast and compact output, ideal for the often time-sensitive decision-making process in oncology. Three use cases of missense mutations, (i) K22A in cyclin-dependent kinase 4 identified in melanoma, (ii) E1197K mutation in anaplastic lymphoma kinase 4 identified in lung carcinoma and (iii) V765A mutation in epidermal growth factor receptor in a patient with congenital mismatch repair deficiency, highlight how the tool can increase levels of confidence regarding the pathogenicity of the variants and identify the most effective inhibitors. The server is available at https://loschmidt.chemi.muni.cz/predictonco.
Subjects
Melanoma , Precision Medicine , Humans , Reproducibility of Results , Computational Biology , Mutation , Proteins , Machine Learning
ABSTRACT
The fibroblast growth factors (FGF) family holds significant potential for addressing chronic diseases. Specifically, recombinant FGF18 shows promise in treating osteoarthritis by stimulating cartilage formation. However, recent phase 2 clinical trial results of sprifermin (recombinant FGF18) indicate insufficient efficacy. Leveraging our expertise in rational protein engineering, we conducted a study to enhance the stability of FGF18. As a result, we obtained a stabilized variant called FGF18-E4, which exhibited improved stability with 16 °C higher melting temperature, resistance to trypsin and a 2.5-fold increase in production yields. Moreover, the FGF18-E4 maintained mitogenic activity after 1-week incubation at 37 °C and 1-day at 50 °C. Additionally, the inserted mutations did not affect its binding to the fibroblast growth factor receptors, making FGF18-E4 a promising candidate for advancing FGF-based osteoarthritis treatment.
ABSTRACT
NanoLuc, a superior β-barrel fold luciferase, was engineered 10 years ago but the nature of its catalysis remains puzzling. Here experimental and computational techniques are combined, revealing that imidazopyrazinone luciferins bind to an intra-barrel catalytic site but also to an allosteric site shaped on the enzyme surface. Structurally, binding to the allosteric site prevents simultaneous binding to the catalytic site, and vice versa, through concerted conformational changes. We demonstrate that restructuration of the allosteric site can boost the luminescent reaction in the remote active site. Mechanistically, an intra-barrel arginine coordinates the imidazopyrazinone component of luciferin, which reacts with O2 via a radical charge-transfer mechanism, and then it also protonates the resulting excited amide product to form a light-emitting neutral species. Concomitantly, an aspartate, supported by two tyrosines, fine-tunes the blue color emitter to secure a high emission intensity. This information is critical to engineering the next generation of ultrasensitive bioluminescent reporters.
Subjects
Luminescent Measurements , Luciferases/metabolism , Catalytic Domain
ABSTRACT
Thermostable proteins find their use in numerous biomedical and biotechnological applications. However, the computational design of stable proteins often results in single-point mutations with a limited effect on protein stability, while the construction of stable multiple-point mutants can prove difficult due to the possibility of antagonistic effects between individual mutations. The FireProt protocol enables the automated computational design of highly stable multiple-point mutants. FireProt 2.0 builds on top of the previously published FireProt web, retaining the original functionality and expanding it with several new stabilization strategies. FireProt 2.0 integrates the AlphaFold database and homology modeling for structure prediction, enabling calculations starting from a sequence. Multiple-point designs are constructed using the Bron-Kerbosch algorithm, minimizing the antagonistic effect between the individual mutations. Users can newly limit the FireProt calculation to a set of user-defined mutations, run a saturation mutagenesis of the whole protein or select rigidifying mutations based on B-factors. The evolution-based back-to-consensus strategy is complemented by ancestral sequence reconstruction. FireProt 2.0 is significantly faster, and a reworked graphical user interface broadens the tool's availability even to users with older hardware. FireProt 2.0 is freely available at http://loschmidt.chemi.muni.cz/fireprotweb.
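The Bron-Kerbosch algorithm enumerates the maximal cliques of a graph. One plausible way to apply it here, stated as an assumption rather than as FireProt's actual procedure, is to treat each candidate mutation as a node, connect two mutations when they are judged non-antagonistic, and read every maximal clique as a set of mutually compatible mutations. The Python sketch below implements the pivoting variant on a toy graph. The mutation names and compatibility pairs are invented for illustration.

```python
def bron_kerbosch(r, p, x, neighbors, cliques):
    """Collect all maximal cliques extending r, using candidates p and excluded set x."""
    if not p and not x:
        cliques.append(frozenset(r))
        return
    # Pivot on the vertex with the most neighbors among the candidates
    # to reduce the number of recursive calls.
    pivot = max(p | x, key=lambda v: len(neighbors[v] & p))
    for v in list(p - neighbors[pivot]):
        bron_kerbosch(r | {v}, p & neighbors[v], x & neighbors[v], neighbors, cliques)
        p = p - {v}
        x = x | {v}


def maximal_compatible_sets(mutations, compatible_pairs):
    """Return every maximal set of pairwise compatible mutations."""
    neighbors = {m: set() for m in mutations}
    for a, b in compatible_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    cliques = []
    bron_kerbosch(set(), set(mutations), set(), neighbors, cliques)
    return cliques


# Hypothetical mutations and compatibility edges (illustrative only).
mutations = ["A10L", "G25P", "S40F", "T77I"]
compatible = [("A10L", "G25P"), ("A10L", "S40F"), ("G25P", "S40F"), ("S40F", "T77I")]
for clique in maximal_compatible_sets(mutations, compatible):
    print(sorted(clique))
```

A design procedure built on this idea would then rank the resulting sets, for example by their predicted combined stabilization, which is outside the scope of this sketch.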
Subjects
Algorithms , Proteins , Proteins/genetics , Proteins/chemistry , Mutation , Protein Stability , Internet
ABSTRACT
Haloalkane dehalogenases (HLDs) are a family of α/β-hydrolase fold enzymes that employ SN2 nucleophilic substitution to cleave the carbon-halogen bond in diverse chemical structures, the biological role of which is still poorly understood. Atomic-level knowledge of both the inner organization and supramolecular complexation of HLDs is thus crucial to understand their catalytic and noncatalytic functions. Here, crystallographic structures of the (S)-enantioselective haloalkane dehalogenase DmmarA from the waterborne pathogenic microbe Mycobacterium marinum were determined at 1.6 and 1.85 Å resolution. The structures show a canonical αβα-sandwich HLD fold with several unusual structural features. Mechanistically, the atypical composition of the proton-relay catalytic triad (aspartate-histidine-aspartate) and uncommon active-site pocket reveal the molecular specificities of a catalytic apparatus that exhibits a rare (S)-enantiopreference. Additionally, the structures reveal a previously unobserved mode of symmetric homodimerization, which is predominantly mediated through unusual L5-to-L5 loop interactions. This homodimeric association in solution is confirmed experimentally by data obtained from small-angle X-ray scattering. Utilizing the newly determined structures of DmmarA, molecular modelling techniques were employed to elucidate the underlying mechanism behind its uncommon enantioselectivity. The (S)-preference can be attributed to the presence of a distinct binding pocket and variance in the activation barrier for nucleophilic substitution.
Subjects
Mycobacterium marinum , Mycobacterium marinum/metabolism , Aspartic Acid , Stereoisomerism , Hydrolases/chemistry , Substrate Specificity
ABSTRACT
Thermostability is an essential requirement for the use of enzymes in the bioindustry. Here, we compare different protein stabilization strategies using a challenging target, a stable haloalkane dehalogenase DhaA115. We observe better performance of the automated stabilization platforms FireProt and PROSS in designing multiple-point mutations over the introduction of disulfide bonds and the strengthening of intra- and inter-domain contacts by in silico saturation mutagenesis. We reveal that the performance of the automated stabilization platforms was still compromised by the introduction of some destabilizing mutations. Notably, we show that their prediction accuracy can be improved by applying manual curation or machine learning for the removal of potentially destabilizing mutations, yielding highly stable haloalkane dehalogenases with enhanced catalytic properties. A comparison of crystallographic structures revealed that the current stabilization rounds were not accompanied by the large backbone rearrangements previously observed during the stability engineering of DhaA115. Stabilization was achieved by improving local contacts including protein-water interactions. Our study provides guidance for further improvement of automated structure-based computational tools for protein stabilization.
ABSTRACT
SUMMARY: Access pathways in enzymes are crucial for the passage of substrates and products of catalysed reactions. The process can be studied by computational means with variable degrees of precision. Our in-house approximative method CaverDock provides a fast and easy way to set up and run ligand binding and unbinding calculations through protein tunnels and channels. Here we introduce pyCaverDock, a Python3 API designed to improve user experience with the tool and further facilitate the ligand transport analyses. The API enables users to simplify the steps needed to use CaverDock, from automatizing setup processes to designing screening pipelines. AVAILABILITY AND IMPLEMENTATION: pyCaverDock API is implemented in Python 3 and is freely available with detailed documentation and practical examples at https://loschmidt.chemi.muni.cz/caverdock/.
Subjects
Proteins , Software , Ligands
ABSTRACT
Catalase-peroxidases (KatGs) are unique bifunctional oxidoreductases that contain heme in their active centers, allowing both the peroxidatic and catalatic reaction modes. These originally bacterial enzymes are broadly distributed among various fungi, allowing them to cope with reactive oxygen species present in the environment or inside the cells. We used various biophysical, biochemical, and bioinformatics methods to investigate differences between catalase-peroxidases originating in thermophilic and mesophilic fungi from different habitats. Our results indicate that the architecture of the active center with a specific post-translational modification is highly similar in mesophilic and thermophilic KatG, as is the peroxidatic activity with ABTS, guaiacol, and L-DOPA. However, only the thermophilic variant CthedisKatG reveals increased manganese peroxidase activity at elevated temperatures. The catalatic activity releasing molecular oxygen is comparable between CthedisKatG and mesophilic MagKatG1 over a broad temperature range. Two point mutations were constructed in the active center to selectively block the formation of the described post-translational modification. The mutants exhibited a total loss of catalatic activity and changes in the peroxidatic activity. Our results indicate the capacity of bifunctional heme enzymes for variable reactivity in potential biotech applications.
ABSTRACT
BACKGROUND: Apolipoprotein E (ApoE) ε4 genotype is the most prevalent risk factor for late-onset Alzheimer's Disease (AD). Although ApoE4 differs from its non-pathological ApoE3 isoform only by the C112R mutation, the molecular mechanism of its proteinopathy is unknown. METHODS: Here, we reveal the molecular mechanism of ApoE4 aggregation using a combination of experimental and computational techniques, including X-ray crystallography, site-directed mutagenesis, hydrogen-deuterium mass spectrometry (HDX-MS), static light scattering and molecular dynamics simulations. Treatment of ApoE ε3/ε3 and ε4/ε4 cerebral organoids with tramiprosate was used to compare the effect of tramiprosate on ApoE4 aggregation at the cellular level. RESULTS: We found that C112R substitution in ApoE4 induces long-distance (> 15 Å) conformational changes leading to the formation of a V-shaped dimeric unit that is geometrically different and more aggregation-prone than the ApoE3 structure. AD drug candidate tramiprosate and its metabolite 3-sulfopropanoic acid induce ApoE3-like conformational behavior in ApoE4 and reduce its aggregation propensity. Analysis of ApoE ε4/ε4 cerebral organoids treated with tramiprosate revealed its effect on cholesteryl esters, the storage products of excess cholesterol. CONCLUSIONS: Our results connect the ApoE4 structure with its aggregation propensity, providing a new druggable target for neurodegeneration and ageing.
Subjects
Alzheimer Disease , Apolipoprotein E4 , Humans , Apolipoprotein E4/genetics , Apolipoprotein E4/metabolism , Alzheimer Disease/drug therapy , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Apolipoprotein E3/genetics , Mutation/genetics , Apolipoproteins E/genetics
ABSTRACT
Cardiovascular diseases, such as myocardial infarction, ischemic stroke, and pulmonary embolism, are the most common causes of disability and death worldwide. Blood clot hydrolysis by thrombolytic enzymes and thrombectomy are key clinical interventions. The most widely used thrombolytic enzyme is alteplase, which has been used in clinical practice since 1986. Another clinically used thrombolytic protein is tenecteplase, which has modified epitopes and engineered glycosylation sites, suggesting that carbohydrate modification in thrombolytic enzymes is a viable strategy for their improvement. This comprehensive review summarizes current knowledge on computational and experimental identification of glycosylation sites and glycan identity, together with methods used for their reengineering. Practical examples from previous studies focus on modification of glycosylations in thrombolytics, e.g., alteplase, tenecteplase, reteplase, urokinase, saruplase, and desmoteplase. Collected clinical data on these glycoproteins demonstrate the great potential of this engineering strategy. Outstanding combinatorics originating from multiple glycosylation sites and the vast variety of covalently attached glycan species can be addressed by directed evolution or rational design. Directed evolution pipelines would benefit from more efficient cell-free expression and high-throughput screening assays, while rational design must employ structure prediction by machine learning and in silico characterization by supercomputing. Perspectives on challenges and opportunities for improvement of thrombolytic enzymes by engineering and evolution of protein glycosylation are provided.
Subjects
Myocardial Infarction , Tissue Plasminogen Activator , Humans , Tenecteplase , Glycosylation , Fibrinolytic Agents/therapeutic use , Myocardial Infarction/drug therapy
ABSTRACT
We present sMolBoxes, a dataflow representation for the exploration and analysis of long molecular dynamics (MD) simulations. When MD simulations reach millions of snapshots, frame-by-frame observation is no longer feasible. Thus, biochemists rely to a large extent only on quantitative analysis of geometric and physicochemical properties. However, the use of abstract methods to study inherently spatial data hinders the exploration and poses a considerable workload. sMolBoxes link quantitative analysis of a user-defined set of properties with interactive 3D visualizations. They enable visual explanations of molecular behaviors, which lead to an efficient discovery of biochemically significant parts of the MD simulation. sMolBoxes follow a node-based model for flexible definition, combination, and immediate evaluation of properties to be investigated. Progressive analytics enable fluid switching between multiple properties, which facilitates hypothesis generation. Each sMolBox provides quick insight into an observed property or function, available in more detail in the bigBox View. The case studies illustrate that even with relatively few sMolBoxes, it is possible to express complex analytical tasks, and their use in exploratory analysis is perceived as more efficient than traditional scripting-based methods.