ABSTRACT
Cryptic sites can expand the space of druggable proteins, but the potential usefulness of such sites needs to be investigated before any major effort. Given that the binding pockets are not formed, the druggability of such sites is not well understood. The analysis of proteins and their ligands shows that cryptic sites formed primarily by side chains moving out of the pocket to enable ligand binding generally do not bind drug-sized molecules with sufficient potency. By contrast, sites that are formed by loop or hinge motion are potentially valuable drug targets. Arguments are provided to explain the underlying causes in terms of classical enzyme inhibition theory and the kinetics of side chain motion and ligand binding.
ABSTRACT
The precise prediction of major histocompatibility complex (MHC)-peptide complex structures is pivotal for understanding cellular immune responses and advancing vaccine design. In this study, we enhanced AlphaFold's capabilities by fine-tuning it with a specialized dataset consisting exclusively of high-resolution class I MHC-peptide crystal structures. This tailored approach aimed to address the generalist nature of AlphaFold's original training, which, while broad-ranging, lacked the granularity necessary for the high-precision demands of class I MHC-peptide interaction prediction. A comparative analysis was conducted against the homology-modeling-based method Pandora as well as the AlphaFold multimer model. Our results demonstrate that our fine-tuned model outperforms both in terms of root-mean-square deviation (the median value for Cα atoms of peptides is 0.66 Å) and also provides enhanced predicted local distance difference test scores, offering a more reliable assessment of the predicted structures. These advances have substantial implications for computational immunology, potentially accelerating the development of novel therapeutics and vaccines by providing a more precise computational lens through which to view MHC-peptide interactions.
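As an illustration of the accuracy metric quoted above, the following minimal sketch computes a Cα root-mean-square deviation between a predicted and a reference peptide. It assumes the two structures have already been superposed in a common frame (for example, on the MHC heavy chain); the function name `ca_rmsd` and the coordinates are hypothetical and are not taken from the paper.

```python
import math


def ca_rmsd(predicted, reference):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes both structures are already superposed in a common frame;
    no optimal fitting is performed here.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


# Hypothetical coordinates (in angstroms) for a three-residue peptide fragment.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 0.6, 0.0), (3.8, 0.0, 0.6), (7.0, 0.0, 0.0)]
print(round(ca_rmsd(predicted, reference), 2))  # prints 0.6
```

A median over many such per-peptide values, as reported in the abstract, could then be taken with `statistics.median`.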
Subjects
Molecular Models, Peptides, Peptides/chemistry, Peptides/metabolism, Major Histocompatibility Complex, Histocompatibility Antigens Class I/chemistry, Histocompatibility Antigens Class I/metabolism, Histocompatibility Antigens Class I/immunology, Protein Binding
ABSTRACT
The knowledge of ligand binding hot spots and of the important interactions within such hot spots is crucial for the design of lead compounds in the early stages of structure-based drug discovery. The computational solvent mapping server FTMap can reliably identify binding hot spots as consensus clusters, free energy minima that bind a variety of organic probe molecules. However, in its current implementation, FTMap provides limited information on regions within the hot spots that tend to interact with specific pharmacophoric features of potential ligands. E-FTMap is a new server that expands on the original FTMap protocol. E-FTMap uses 119 organic probes, rather than the 16 in the original FTMap, to exhaustively map binding sites, and identifies pharmacophore features as atomic consensus sites where similar chemical groups bind. We validate E-FTMap against a set of 109 experimentally derived structures of fragment-lead pairs, finding that highly ranked pharmacophore features overlap with the corresponding atoms in both fragments and lead compounds. Additionally, comparisons of mapping results to ensembles of bound ligands reveal that pharmacophores generated with E-FTMap tend to sample highly conserved protein-ligand interactions. E-FTMap is available as a web server at https://eftmap.bu.edu.
Subjects
Drug Discovery, Pharmacophore, Ligands, Binding Sites, Drug Discovery/methods, Protein Binding
ABSTRACT
The neural network-based program AlphaFold2 (AF2) provides high-accuracy structure prediction for a large fraction of globular proteins. An important question is whether these models are accurate enough for reliably docking small ligands. Several recent papers and the results of CASP15 reveal that local conformational errors reduce the success rates of direct ligand docking. Here, we focus on the ability of the models to conserve the location of binding hot spots, regions on the protein surface that significantly contribute to the binding free energy of the protein-ligand interaction. Clusters of hot spots predict the location and even the druggability of binding sites, and hence are important for computational drug discovery. The hot spots are determined by protein mapping that is based on the distribution of small fragment-sized probes on the protein surface and is less sensitive to local conformation than docking. Mapping models taken from the AlphaFold Protein Structure Database shows that identifying binding sites is more reliable than docking, but the success rates are still 5% to 10% lower than those obtained by mapping X-ray structures. The drop in accuracy is particularly large for models of multidomain proteins. However, both the model binding sites and the mapping results can be substantially improved by generating AF2 models for the ligand binding domains of interest rather than the entire proteins, and even more so by using forced sampling with multiple initial seeds. The mapping of such models tends to reach the accuracy of results obtained by mapping the X-ray structures.
Subjects
Furylfuramide, Membrane Proteins, Ligands, Protein Binding, Protein Conformation, Binding Sites
ABSTRACT
The precise prediction of Major Histocompatibility Complex (MHC)-peptide complex structures is pivotal for understanding cellular immune responses and advancing vaccine design. In this study, we enhanced AlphaFold's capabilities by fine-tuning it with a specialized dataset composed exclusively of high-resolution MHC-peptide crystal structures. This tailored approach aimed to address the generalist nature of AlphaFold's original training, which, while broad-ranging, lacked the granularity necessary for the high-precision demands of MHC-peptide interaction prediction. A comparative analysis was conducted against the homology-modeling-based method Pandora [13], as well as the AlphaFold multimer model [8]. Our results demonstrate that our fine-tuned model not only outperforms both in terms of RMSD (median value is 0.65 Å) but also provides enhanced predicted lDDT scores, offering a more reliable assessment of the predicted structures. These advances have substantial implications for computational immunology, potentially accelerating the development of novel therapeutics and vaccines by providing a more precise computational lens through which to view MHC-peptide interactions.
ABSTRACT
Major histocompatibility complex Class I (MHC-I) molecules bind to peptides derived from intracellular antigens and present them on the surface of cells, allowing the immune system (T cells) to detect them. Elucidating the process of this presentation is essential for regulation and potential manipulation of the cellular immune system. Predicting whether a given peptide binds to an MHC molecule is an important step in the above process and has motivated the introduction of many computational approaches to address this problem. NetMHCpan, a pan-specific model for predicting binding of peptides to any MHC molecule, is one of the most widely used methods; it addresses this binary classification problem using shallow neural networks. The recent successful results of Deep Learning (DL) methods, especially Natural Language Processing (NLP)-based pretrained models, in various applications, including protein structure determination, motivated us to explore their use in this problem. Specifically, we consider the application of deep learning models pretrained on large datasets of protein sequences to predict MHC Class I-peptide binding. Using the standard performance metrics in this area, and the same training and test sets, we show that our models outperform NetMHCpan4.1, currently considered the state of the art.
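The abstract does not name its performance metrics. As one common choice for binary binder/non-binder classification, the sketch below computes the area under the ROC curve (AUC) directly from its pairwise definition. The use of AUC here, the function name `roc_auc`, and the example labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random binder outscores a random non-binder.

    labels: 1 for binder, 0 for non-binder; scores: predicted binding scores.
    Ties between a binder and a non-binder count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical peptides: two binders and two non-binders with model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # prints 0.75
```

The pairwise form is quadratic in the number of peptides, which is acceptable for a small illustration; a rank-based computation would be preferable for large test sets.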
ABSTRACT
In the ligand prediction category of CASP15, the challenge was to predict the positions and conformations of small molecules binding to proteins that were provided as amino acid sequences or as models generated by the AlphaFold2 program. For most targets, we used our template-based ligand docking program ClusPro ligTBM, also implemented as a public server available at https://ligtbm.cluspro.org/. Since many targets had multiple chains and a number of ligands, several templates and some manual interventions were required. In a few cases, no templates were found, and we had to use direct docking using the Glide program. Nevertheless, ligTBM was shown to be a very useful tool, and by any ranking criteria, our group was ranked among the top five teams. In fact, all the best groups used template-based docking methods. Thus, it appears that the AlphaFold2-generated models, despite the high accuracy of the predicted backbone, have local differences from the X-ray structure that make the use of direct docking methods more challenging. The results of CASP15 confirm that this limitation can frequently be overcome by homology-based docking.
Subjects
Proteins, Software, Protein Conformation, Molecular Docking Simulation, Ligands, Proteins/chemistry, Protein Binding, Binding Sites
ABSTRACT
Antibodies play an important role in the immune system by binding to molecules called antigens at their respective epitopes. These interfaces or epitopes are structural entities determined by the interactions between an antibody and an antigen, making them ideal systems to analyze by using docking programs. Since the advent of high-throughput antibody sequencing, the ability to perform epitope mapping using only the sequence of the antibody has become a high priority. ClusPro, a leading protein-protein docking server, together with its template-based modeling version, ClusPro-TBM, have been re-purposed to map epitopes for specific antibody-antigen interactions by using the Antibody Epitope Mapping server (AbEMap). ClusPro-AbEMap offers three different modes for users depending on the information available on the antibody as follows: (i) X-ray structure, (ii) computational/predicted model of the structure or (iii) only the amino acid sequence. The AbEMap server presents a likelihood score for each antigen residue of being part of the epitope. We provide detailed information on the server's capabilities for the three options and discuss how to obtain the best results. In light of the recent introduction of AlphaFold2 (AF2), we also show how one of the modes allows users to use their AF2-generated antibody models as input. The protocol describes the relative advantages of the server compared to other epitope-mapping tools, its limitations and potential areas of improvement. The server may take 45-90 min depending on the size of the proteins.
Subjects
Furylfuramide, Proteins, Epitopes, Proteins/chemistry, Antigens, Antibodies, Epitope Mapping
ABSTRACT
The design of PROteolysis-TArgeting Chimeras (PROTACs) requires bringing an E3 ligase into proximity with a target protein to modulate the concentration of the latter through its ubiquitination and degradation. Here, we present a method for generating high-accuracy structural models of E3 ligase-PROTAC-target protein ternary complexes. The method is dependent on two computational innovations: adding a "silent" convolution term to an efficient protein-protein docking program to eliminate protein poses that do not have acceptable linker conformations and clustering models of multiple PROTACs that use the same E3 ligase and target the same protein. Results show that the largest consensus clusters always have high predictive accuracy and that the ensemble of models can be used to predict the dissociation rate and cooperativity of the ternary complex that relate to the degrading activity of the PROTAC. The method is demonstrated by applications to known PROTAC structures and a blind test involving PROTACs against BRAF mutant V600E. The results confirm that PROTACs function by stabilizing a favorable interaction between the E3 ligase and the target protein but do not necessarily exploit the most energetically favorable geometry for interaction between the proteins.
Subjects
Proteins, Ubiquitin-Protein Ligases, Proteolysis, Ubiquitin-Protein Ligases/metabolism, Proteins/metabolism, Ubiquitination
ABSTRACT
Advances in a scientific discipline are often measured by small, incremental steps. In this review, we report on two intertwined disciplines in the protein structure prediction field, modeling of single chains and modeling of complexes, that have over decades emulated this pattern, as monitored by the community-wide blind prediction experiments CASP and CAPRI. However, over the past few years, dramatic advances were observed for the accurate prediction of single protein chains, driven by a surge of deep learning methodologies entering the prediction field. We review the main scientific developments that enabled these recent breakthroughs and feature the important role of blind prediction experiments in building up and nurturing the structure prediction field. We discuss how the new wave of artificial intelligence-based methods is impacting the fields of computational and experimental structural biology and highlight areas in which deep learning methods are likely to lead to future developments, provided that major challenges are overcome.
Subjects
Artificial Intelligence, Protein Conformation
ABSTRACT
Antibodies are key proteins produced by the immune system to target pathogen proteins termed antigens via specific binding to surface regions called epitopes. Given an antigen and the sequence of an antibody, knowledge of the epitope is critical for the discovery and development of antibody-based therapeutics. In this work, we present a computational protocol that uses template-based modeling and docking to predict epitope residues. This protocol is implemented in three major steps. First, a template-based modeling approach is used to build the antibody structures. We tested several options, including generation of models using AlphaFold2. Second, each antibody model is docked to the antigen using the fast Fourier transform (FFT) based docking program PIPER. Attention is given to optimally selecting the docking energy parameters depending on the input data. In particular, the van der Waals energy terms are reduced for modeled antibodies relative to X-ray structures. Finally, a ranking of antigen surface residues is produced. The ranking relies on the docking results, that is, how often the residue appears in the interface of the docking poses, and also on the energy favorability of the docking pose in question. The method, called PIPER-Map, has been tested on a widely used antibody-antigen docking benchmark. The results show that PIPER-Map improves upon the existing epitope prediction methods. An interesting observation is that epitope prediction accuracy starting from antibody sequence alone does not significantly differ from that of starting from unbound (i.e., separately crystallized) antibody structure.
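The final ranking step can be sketched as follows: each antigen residue accumulates a score from every docking pose whose interface contains it, with lower-energy poses contributing more. The Boltzmann-style weight, the `temperature` parameter, the function name `rank_epitope_residues`, and the example poses are illustrative assumptions; the abstract does not give the actual PIPER-Map scoring formula.

```python
import math
from collections import defaultdict


def rank_epitope_residues(poses, temperature=1.0):
    """Score antigen residues by energy-weighted interface frequency.

    poses: list of (energy, interface_residues) tuples; lower energy is better.
    The weight exp(-(E - E_min) / temperature) is an illustrative assumption,
    not the published PIPER-Map formula.
    """
    if not poses:
        return []
    e_min = min(energy for energy, _ in poses)
    weights = [math.exp(-(energy - e_min) / temperature) for energy, _ in poses]
    total = sum(weights)
    scores = defaultdict(float)
    for weight, (_, residues) in zip(weights, poses):
        for residue in set(residues):  # count each residue once per pose
            scores[residue] += weight / total
    # Highest score first; ties broken by residue label for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical docking poses: (energy, antigen residues found in the interface).
poses = [
    (-10.0, ["A45", "A46", "A50"]),
    (-10.0, ["A45", "A46"]),
    (-8.0, ["A45", "A90"]),
]
ranking = rank_epitope_residues(poses, temperature=2.0)
for residue, score in ranking:
    print(residue, round(score, 3))
```

With these inputs, residue A45 appears in every pose and ranks first, while A90 appears only in the least favorable pose and ranks last.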
Subjects
Antibodies, Antigens, Epitopes/metabolism, Antibodies/chemistry, Antigens/chemistry, Molecular Dynamics Simulation, Proteins/chemistry, Protein Binding
ABSTRACT
Within the last few decades, increases in computational resources have contributed enormously to the progress of science and engineering (S & E). To continue making rapid advancements, the S & E community must be able to access computing resources. One way to provide such resources is through High-Performance Computing (HPC) centers. Many academic research institutions offer their own HPC Centers but struggle to make the computing resources easily accessible and user-friendly. Here we present SHABU, a RESTful Web API framework that enables S & E communities to access resources from Boston University's Shared Computing Center (SCC). The SHABU requirements are derived from the use cases described in this work.
ABSTRACT
Inborn mutations in the digestive protease carboxypeptidase A1 (CPA1) gene may be associated with hereditary and idiopathic chronic pancreatitis (CP). Pathogenic mutations, such as p.N256K, cause intracellular retention and reduced secretion of CPA1, accompanied by endoplasmic reticulum (ER) stress, suggesting that mutation-induced misfolding underlies the phenotype. Here, we report the novel p.G250A CPA1 mutation found in a young patient with CP. Functional properties of the p.G250A mutation were identical to those of the p.N256K mutation, confirming its pathogenic nature. We noted that both mutations are in a catalytically important loop of CPA1 that is stabilized by the Cys248-Cys271 disulfide bond. Mutation of either or both Cys residues to Ala resulted in misfolding, as judged by the loss of CPA1 secretion and intracellular retention. We re-analyzed seven previously reported CPA1 mutations that affect this loop and found that all exhibited reduced secretion and caused ER stress of varying degrees. The magnitude of ER stress was proportional to the secretion defect. Replacing the naturally occurring mutations with Ala (e.g., p.V251A for p.V251M) restored secretion, with the notable exception of p.N256A. We conclude that the disulfide-stabilized loop of CPA1 is prone to mutation-induced misfolding, in most cases due to the disruptive nature of the newly introduced side chain. We propose that disease-causing CPA1 mutations exhibit abolished or markedly reduced secretion with pronounced ER stress, whereas CPA1 mutations with milder misfolding phenotypes may be associated with lower disease risk or may not be pathogenic at all.
Subjects
Carboxypeptidases A, Genetic Predisposition to Disease, Chronic Pancreatitis, Humans, Carboxypeptidases A/genetics, Mutation, Chronic Pancreatitis/genetics, Phenotype
ABSTRACT
Despite the growing number of G protein-coupled receptor (GPCR) structures, only 39 structures have been cocrystallized with allosteric inhibitors. These structures have been studied by protein mapping using the FTMap server, which determines the clustering of small organic probe molecules distributed on the protein surface. The method has found druggable sites overlapping with the cocrystallized allosteric ligands in 21 GPCR structures. Mapping of AlphaFold2-generated models of these proteins confirms that the same sites can be identified without the presence of bound ligands. We then mapped the 394 GPCR X-ray structures available at the time of the analysis (September 2020). Results show that for each of the 21 structures with bound ligands there exist many other GPCRs that have a strong binding hot spot at the same location, suggesting potential allosteric sites in a large variety of GPCRs. These sites cluster at nine distinct locations, and each can be found in many different proteins. However, ligands binding at the same location generally show little or no similarity, and the amino acid residues interacting with these ligands also differ. Results confirm the possibility of specifically targeting these sites across GPCRs for allosteric modulation and help to identify the most likely binding sites among the limited number of potential locations. The FTMap server is available free of charge for academic and governmental use at https://ftmap.bu.edu/.
Subjects
Amino Acids, G-Protein-Coupled Receptors, Allosteric Site, Ligands, Binding Sites, G-Protein-Coupled Receptors/chemistry, Allosteric Regulation
ABSTRACT
Pancreatitis, the inflammatory disorder of the pancreas, has no specific therapy. Genetic, biochemical, and animal model studies revealed that trypsin plays a central role in the onset and progression of pancreatitis. Here, we performed biochemical and preclinical mouse experiments to offer proof of concept that orally administered dabigatran etexilate can inhibit pancreatic trypsins and shows therapeutic efficacy in trypsin-dependent pancreatitis. We found that dabigatran competitively inhibited all human and mouse trypsin isoforms (Ki range 10-79 nM) and dabigatran plasma concentrations in mice given oral dabigatran etexilate well exceeded the Ki of trypsin inhibition. In the T7K24R trypsinogen mutant mouse model, a single oral gavage of dabigatran etexilate was effective against cerulein-induced progressive pancreatitis, with a high degree of histological normalization. In contrast, spontaneous pancreatitis in T7D23A mice, which carry a more aggressive trypsinogen mutation, was not ameliorated by dabigatran etexilate, given either as daily gavages or by mixing it with solid chow. Taken together, our observations showed that benzamidine derivatives such as dabigatran are potent trypsin inhibitors and show therapeutic activity against trypsin-dependent pancreatitis in T7K24R mice. Lack of efficacy in T7D23A mice is probably related to the more severe pathology and insufficient drug concentrations in the pancreas.
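To make the comparison between inhibitor concentration and Ki concrete, the sketch below evaluates the remaining enzyme activity under the classical competitive inhibition model, v/v0 = (1 + s) / (1 + s + [I]/Ki) with s = [S]/Km. The two Ki values are the ends of the range reported above; the inhibitor concentration of 1000 nM and the substrate level [S] = Km are hypothetical and are not measurements from the study.

```python
def fractional_activity(inhibitor_nM, ki_nM, substrate_over_km=1.0):
    """Remaining activity v/v0 under classical competitive inhibition.

    v/v0 = (1 + s) / (1 + s + I/Ki), where s = [S]/Km.
    """
    if ki_nM <= 0:
        raise ValueError("Ki must be positive")
    s = substrate_over_km
    return (1.0 + s) / (1.0 + s + inhibitor_nM / ki_nM)


# Ki values at the ends of the reported range (10-79 nM).
# The 1000 nM inhibitor concentration is a hypothetical example value.
for ki in (10.0, 79.0):
    remaining = fractional_activity(inhibitor_nM=1000.0, ki_nM=ki)
    print(f"Ki = {ki:g} nM -> {remaining:.1%} activity remaining")
```

Under these assumed conditions, an inhibitor concentration far above Ki leaves only a small fraction of trypsin activity, which is the reasoning behind comparing plasma concentrations with Ki.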
Subjects
Dabigatran, Pancreatitis, Animals, Humans, Mice, Animal Disease Models, Pancreas, Pancreatitis/chemically induced, Pancreatitis/drug therapy, Pancreatitis/genetics, Trypsin/genetics, Trypsinogen/genetics
ABSTRACT
Molecular dynamics was used to optimize the droperidol-hERG complex obtained from docking. During the simulation, residues T623, S624, V625, G648, Y652, and F656 did not move significantly, whereas F627 moved significantly to accommodate the inhibitor. Binding sites in cryo-EM structures and in structures obtained from molecular dynamics simulations were characterized using solvent mapping, and Atlas ligands, which are negative images of the binding site, were generated. Atlas ligands were found to be useful for identifying human ether-à-go-go-related potassium channel (hERG) inhibitors by aligning compounds to them or by guiding the docking of compounds in the binding site. A molecular dynamics optimized structure of hERG led to improved predictions using either compound alignment to the Atlas ligand or docking. The structure was also found to be suitable to define a strategy for lowering inhibition based on the proposed binding mode of compounds in the channel.
Subjects
Ether-A-Go-Go Potassium Channels, Ether, Binding Sites, ERG1 Potassium Channel/metabolism, Ether-A-Go-Go Potassium Channels/chemistry, Ether-A-Go-Go Potassium Channels/metabolism, Humans, Ligands, Solvents
ABSTRACT
Protein mapping distributes many copies of different molecular probes on the surface of a target protein in order to determine binding hot spots, regions that are highly preferable for ligand binding. While mapping of X-ray structures by the FTMap server is inherently static, this limitation can be overcome by the simultaneous analysis of multiple structures of the protein. FTMove is an automated web server that implements this approach. From the input of a target protein, by PDB code, the server identifies all structures of the protein available in the PDB, runs mapping on them, and combines the results to form binding hot spots and binding sites. The user may also upload their own protein structures, bypassing the PDB search for similar structures. Output of the server consists of the consensus binding sites and the individual mapping results for each structure - including the number of probes located in each binding site, for each structure. This level of detail allows the users to investigate how the strength of a binding site relates to the protein conformation, other binding sites, and the presence of ligands or mutations. In addition, the structures are clustered on the basis of their binding properties. The use of FTMove is demonstrated by application to 22 proteins with known allosteric binding sites; the orthosteric and allosteric binding sites were identified in all but one case, and the sites were typically ranked among the top five. The FTMove server is publicly available at https://ftmove.bu.edu.
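The per-structure output described above (the number of probes in each binding site for each structure) lends itself to a simple summary table. The sketch below ranks sites by their mean probe count across structures. The nested-dictionary data layout, the function name `summarize_sites`, and the counts are hypothetical and do not reflect the FTMove server's actual output format.

```python
def summarize_sites(probe_counts):
    """Rank binding sites by their mean probe count across structures.

    probe_counts: {structure_id: {site_id: number_of_probes}}.
    A site missing from a structure is counted as 0 probes there.
    """
    structures = sorted(probe_counts)
    sites = sorted({site for counts in probe_counts.values() for site in counts})
    summary = []
    for site in sites:
        per_structure = [probe_counts[s].get(site, 0) for s in structures]
        summary.append({
            "site": site,
            "mean": sum(per_structure) / len(per_structure),
            "max": max(per_structure),
            "found_in": sum(1 for n in per_structure if n > 0),
        })
    # Strongest site first; ties broken by site name for determinism.
    return sorted(summary, key=lambda row: (-row["mean"], row["site"]))


# Hypothetical probe counts for three structures of one protein.
probe_counts = {
    "apo": {"orthosteric": 16, "allosteric": 4},
    "holo": {"orthosteric": 14, "allosteric": 12},
    "mutant": {"orthosteric": 15},
}
summary = summarize_sites(probe_counts)
for row in summary:
    print(row)
```

Keeping the per-structure counts alongside the mean makes it possible to see, for instance, that the hypothetical allosteric site is strong only in the ligand-bound structure.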
Subjects
Internet Use, Protein Conformation, Proteins, Software, Allosteric Site, Ligands, Proteins/chemistry
ABSTRACT
Starting with a crystal structure of a macromolecule, computational structural modeling can help to understand the associated biological processes, structure and function, as well as to reduce the number of further experiments required to characterize a given molecular entity. In the past decade, two classes of powerful automated tools for investigating the binding properties of proteins have been developed: the protein-protein docking program ClusPro and the FTMap and FTSite programs for protein hotspot identification. These methods have been widely used by the research community by means of publicly available online servers, and models built using these automated tools have been reported in a large number of publications. Importantly, additional experimental information can be leveraged to further improve the predictive power of these approaches. Here, an overview of the methods and their biological applications is provided together with a brief interpretation of the results.
Subjects
Proteins, Computer Simulation, Molecular Docking Simulation, Protein Conformation, Proteins/chemistry
ABSTRACT
An increasing number of medically important proteins are challenging drug targets because their binding sites are too shallow or too polar, are cryptic and thus not detectable without a bound ligand, or are located in a protein-protein interface. While such proteins may not bind druglike small molecules with sufficiently high affinity, they are frequently druggable using novel therapeutic modalities. The need for such modalities can be determined by experimental or computational fragment-based methods. Computational mapping by mixed solvent molecular dynamics simulations or the FTMap server can be used to determine binding hot spots. The strength and location of the hot spots provide very useful information for selecting potentially successful approaches to drug discovery.
Subjects
Molecular Dynamics Simulation, Proteins, Binding Sites, Drug Discovery, Ligands, Protein Binding, Proteins/chemistry
ABSTRACT
Predicting protein side-chains is important for both protein structure prediction and protein design. Side-chain modeling programs such as SCWRL4 have become some of the most widely used tools of their type owing to their fast and highly accurate predictions. Motivated by the recent success of AlphaFold2 in CASP14, our group adapted a 3D equivariant neural network architecture to predict protein side-chain conformations, specifically within a protein-protein interface, a problem that has not been fully addressed by AlphaFold2.