Results 1 - 20 of 77
1.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37418278

ABSTRACT

Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for representative protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package, and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein-ligand ensemble docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.


Subjects
Molecular Dynamics Simulation; Proteins; Protein Conformation; Proteins/chemistry
2.
J Chem Inf Model ; 64(5): 1730-1750, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38415656

ABSTRACT

The recognition of peptides bound to class I major histocompatibility complex (MHC-I) receptors by T-cell receptors (TCRs) is a determinant of triggering the adaptive immune response. While the exact molecular features that drive the TCR recognition are still unknown, studies have suggested that the geometry of the joint peptide-MHC (pMHC) structure plays an important role. As such, there is a definite need for methods and tools that accurately predict the structure of the peptide bound to the MHC-I receptor. In the past few years, many pMHC structural modeling tools have emerged that provide high-quality modeled structures in the general case. However, there are numerous instances of non-canonical cases in the immunopeptidome that the majority of pMHC modeling tools do not attend to, most notably, peptides that exhibit non-standard amino acids and post-translational modifications (PTMs) or peptides that assume non-canonical geometries in the MHC binding cleft. Such chemical and structural properties have been shown to be present in neoantigens; therefore, accurate structural modeling of these instances can be vital for cancer immunotherapy. To this end, we have developed APE-Gen2.0, a tool that improves upon its predecessor and other pMHC modeling tools, both in terms of modeling accuracy and the available modeling range of non-canonical peptide cases. Some of the improvements include (i) the ability to model peptides that have different types of PTMs such as phosphorylation, nitration, and citrullination; (ii) a new and improved anchor identification routine in order to identify and model peptides that exhibit a non-canonical anchor conformation; and (iii) a web server that provides a platform for easy and accessible pMHC modeling. We further show that structures predicted by APE-Gen2.0 can be used to assess the effects that PTMs have in binding affinity in a more accurate manner than just using solely the sequence of the peptide. 
APE-Gen2.0 is freely available at https://apegen.kavrakilab.org.


Subjects
Hominidae; Peptides; Animals; Peptides/chemistry; Major Histocompatibility Complex; Receptors, Antigen, T-Cell/genetics; Receptors, Antigen, T-Cell/metabolism; Protein Processing, Post-Translational; Hominidae/metabolism; Protein Binding
3.
Proc Natl Acad Sci U S A ; 117(48): 30610-30618, 2020 Dec 01.
Article in English | MEDLINE | ID: mdl-33184174

ABSTRACT

Peptide binding to major histocompatibility complexes (MHCs) is a central component of the immune system, and understanding the mechanism behind stable peptide-MHC binding will aid the development of immunotherapies. While MHC binding is mostly influenced by the identity of the so-called anchor positions of the peptide, secondary interactions from nonanchor positions are known to play a role in complex stability. However, current MHC-binding prediction methods lack an analysis of the major conformational states and might underestimate the impact of secondary interactions. In this work, we present an atomically detailed analysis of peptide-MHC binding that can reveal the contributions of any interaction toward stability. We propose a simulation framework that uses both umbrella sampling and adaptive sampling to generate a Markov state model (MSM) for a coronavirus-derived peptide (QFKDNVILL), bound to one of the most prevalent MHC receptors in humans (HLA-A*24:02). While our model reaffirms the importance of the anchor positions of the peptide in establishing stable interactions, it also reveals the underestimated importance of position 4 (p4), a nonanchor position. We confirmed our results by simulating the impact of specific peptide mutations and validated these predictions through competitive binding assays. By comparing the MSM of the wild-type system with those of the D4A and D4P mutations, our modeling reveals stark differences in unbinding pathways. The analysis presented here can be applied to any peptide-MHC complex of interest with a structural model as input, representing an important step toward comprehensive modeling of the MHC class I pathway.


Subjects
Major Histocompatibility Complex; Markov Chains; Models, Molecular; Peptides/metabolism; Alanine/genetics; Binding, Competitive; Computer Simulation; DNA Mutational Analysis; Mutation/genetics; Proline/metabolism; Protein Binding
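
As a rough illustration of what the Markov state model (MSM) in the abstract above consists of, the sketch below estimates a row-stochastic transition matrix by counting transitions in a discretized trajectory. The function name, the `lag` parameter, the state labels, and the toy trajectory are all assumptions made for illustration; the umbrella-sampling and adaptive-sampling stages of the published framework are not reproduced here.

```python
from collections import Counter


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts from one discrete trajectory."""
    counts = Counter(zip(dtraj[:-lag], dtraj[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # No outgoing transition observed: keep the row stochastic
            # by treating the state as absorbing.
            row = [1.0 if i == j else 0.0 for j in range(n_states)]
        else:
            row = [counts[(i, j)] / row_total for j in range(n_states)]
        matrix.append(row)
    return matrix


# Hypothetical state labels: 0 = bound, 1 = intermediate, 2 = unbound.
dtraj = [0, 0, 1, 0, 0, 1, 2, 2, 1, 0]
T = estimate_transition_matrix(dtraj, n_states=3)
for row in T:
    print([round(p, 3) for p in row])
```

Comparing such matrices for a wild-type and a mutant trajectory is one simple way to see how transition probabilities between states shift, which is the kind of comparison the abstract describes for the D4A and D4P mutations.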
4.
BMC Bioinformatics ; 21(1): 13, 2020 Jan 10.
Article in English | MEDLINE | ID: mdl-31924164

ABSTRACT

BACKGROUND: The rapid growth of available knowledge on metabolic processes across thousands of species continues to expand the possibilities of producing chemicals by combining pathways found in different species. Several computational search algorithms have been developed for automating the identification of possible heterologous pathways; however, these searches may return thousands of pathway results. Although the large number of results is in part due to the large number of possible compounds and reactions, a subset of core reaction modules is repeatedly observed in pathway results across multiple searches, suggesting that some subpaths between common compounds were more consistently explored than others. To reduce the resources spent on searching the same metabolic space, a new meta-algorithm for metabolic pathfinding, Hub Pathway search with Atom Tracking (HPAT), was developed to take advantage of a precomputed network of subpath modules. To investigate the efficacy of this method, we created a table describing a network of common hub metabolites and how they are biochemically connected and only offloaded searches to and from this hub network onto an interactive webserver capable of visualizing the resulting pathways. RESULTS: A test set of nineteen known pathways taken from literature and metabolic databases was used to evaluate whether HPAT was capable of identifying known pathways. HPAT found the exact pathway for eleven of the nineteen test cases using a diverse set of precomputed subpaths, whereas a comparable pathfinding search algorithm that does not use precomputed subpaths found only seven of the nineteen test cases. The capability of HPAT to find novel pathways was demonstrated by its ability to identify novel 3-hydroxypropanoate (3-HP) synthesis pathways. As for pathway visualization, the new interactive pathway filters enable a reduction of the number of displayed pathways from hundreds down to fewer than ten pathways in several test cases, illustrating their utility in reducing the amount of presented information while retaining pathways of interest. CONCLUSIONS: This work presents the first step in incorporating a precomputed subpath network into metabolic pathfinding and demonstrates how this leads to a concise, interactive visualization of pathway results. The modular nature of metabolic pathways is exploited to facilitate efficient discovery of alternate pathways.


Subjects
Algorithms; Metabolic Networks and Pathways; Lactic Acid/analogs & derivatives; Lactic Acid/chemistry; Lactic Acid/metabolism; Pyruvic Acid/metabolism
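
The idea of searching only to and from a precomputed hub network, as described in the abstract above, can be sketched roughly as follows. This is a minimal toy under stated assumptions: the graph, the compound names (`S`, `a`, `H1`, and so on), the `hub_paths` table, and both function names are hypothetical, plain breadth-first search stands in for the actual search algorithm, and atom tracking is omitted entirely.

```python
from collections import deque


def bfs_path(graph, start, goals):
    """Shortest path (in reaction steps) from start to the first goal reached, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] in goals:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def hub_search(graph, reverse_graph, hub_paths, source, target):
    """Stitch source->hub, precomputed hub->hub, and hub->target segments."""
    hubs = {h for pair in hub_paths for h in pair}
    head = bfs_path(graph, source, hubs)          # source to nearest hub
    tail = bfs_path(reverse_graph, target, hubs)  # target back to nearest hub
    if head is None or tail is None:
        return None
    tail = tail[::-1]                             # orient as hub -> target
    if head[-1] == tail[0]:
        return head + tail[1:]
    middle = hub_paths.get((head[-1], tail[0]))   # table lookup, no search
    if middle is None:
        return None
    return head + middle[1:] + tail[1:]


# Hypothetical toy network; the hub-to-hub interior is stored only in the table.
graph = {"S": ["a"], "a": ["H1"], "H2": ["b"], "b": ["T"]}
reverse_graph = {}
for u, successors in graph.items():
    for v in successors:
        reverse_graph.setdefault(v, []).append(u)
hub_paths = {("H1", "H2"): ["H1", "m1", "m2", "H2"]}

route = hub_search(graph, reverse_graph, hub_paths, "S", "T")
print(route)
```

The design point is that the middle segment costs a dictionary lookup rather than a fresh search, so repeated queries do not re-explore the same part of the metabolic space.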
5.
J Chem Inf Model ; 60(3): 1302-1316, 2020 Mar 23.
Article in English | MEDLINE | ID: mdl-32130862

ABSTRACT

We define a molecular caging complex as a pair of molecules in which one molecule (the "host" or "cage") possesses a cavity that can encapsulate the other molecule (the "guest") and prevent it from escaping. Molecular caging complexes can be useful in applications such as molecular shape sorting, drug delivery, and molecular immobilization in materials science, to name just a few. However, the design and computational discovery of new caging complexes is a challenging task, as it is hard to predict whether one molecule can encapsulate another because their shapes can be quite complex. In this paper, we propose a computational screening method that predicts whether a given pair of molecules form a caging complex. Our method is based on a caging verification algorithm that was designed by our group for applications in robotic manipulation. We tested our algorithm on three pairs of molecules that were previously described in a pioneering work on molecular caging complexes and found that our results are fully consistent with the previously reported ones. Furthermore, we performed a screening experiment on a data set consisting of 46 hosts and four guests and used our algorithm to predict which pairs are likely to form caging complexes. Our method is computationally efficient and can be integrated into a screening pipeline to complement experimental techniques.


Subjects
Algorithms
6.
J Chem Inf Model ; 59(3): 1121-1135, 2019 Mar 25.
Article in English | MEDLINE | ID: mdl-30500191

ABSTRACT

Atom mapping of a chemical reaction is a mapping between the atoms in the reactant molecules and the atoms in the product molecules. It encodes the underlying reaction mechanism and, as such, constitutes essential information in computational studies in drug design. Various techniques have been investigated for the automatic computation of the atom mapping of a chemical reaction, approaching the problem as a graph matching problem. The graph abstraction of the chemical problem, though, eliminates crucial chemical information. There have been efforts for enhancing the graph representation by introducing the bond stabilities as edge weights, as they are estimated based on experimental evidence. Here, we present a fully automated optimization-based approach, named AMLGAM (Automated Machine Learning Guided Atom Mapping), that uses machine learning techniques for the estimation of the bond stabilities based on the chemical environment of each bond. The optimization method finds the reaction mechanism which favors the breakage/formation of the less stable bonds. We evaluated our method on a manually curated data set of 382 chemical reactions and ran our method on a much larger and diverse data set of 7400 chemical reactions. We show that the proposed method improves the accuracy over existing techniques based on results published by earlier studies on a common data set and is capable of handling unbalanced reactions.


Subjects
Cheminformatics/methods; Machine Learning
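
To make the optimization idea in the abstract above concrete, the sketch below searches for the atom mapping that minimizes the total cost of broken and formed bonds, where less stable bond types are cheaper to change. It is a toy under stated assumptions: exhaustive enumeration replaces the paper's optimization method, the cost table holds made-up numbers rather than machine-learned bond stabilities, and the four-atom "reaction" and all names are hypothetical.

```python
from itertools import permutations


def best_atom_mapping(elements_r, bonds_r, elements_p, bonds_p, change_cost):
    """Exhaustively search element-preserving atom mappings; return (cost, mapping).

    bonds_* are sets of frozenset({i, j}) atom-index pairs.
    change_cost(elem_a, elem_b) is the penalty for breaking or forming that bond.
    """
    n = len(elements_r)
    best = (float("inf"), None)
    for perm in permutations(range(n)):
        # perm[i] is the product atom assigned to reactant atom i.
        if any(elements_r[i] != elements_p[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset(perm[i] for i in bond) for bond in bonds_r}
        changed = mapped ^ bonds_p  # broken bonds plus formed bonds
        cost = sum(change_cost(*sorted(elements_p[k] for k in bond)) for bond in changed)
        if cost < best[0]:
            best = (cost, perm)
    return best


# Made-up costs: a lower value means a less stable bond that is cheaper to change.
toy_costs = {("C", "N"): 1.0, ("C", "O"): 2.0, ("C", "C"): 3.0}

elements = ["C", "C", "O", "N"]
reactant_bonds = {frozenset({0, 2}), frozenset({1, 3}), frozenset({0, 1})}
product_bonds = {frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 3})}

cost, mapping = best_atom_mapping(
    elements, reactant_bonds, elements, product_bonds,
    lambda a, b: toy_costs[(a, b)],
)
print(cost, mapping)
```

In this toy case the two element-preserving mappings differ only in whether a C-N or a C-O bond is rearranged, so the cost table alone decides which mechanism is preferred. Exhaustive enumeration scales factorially and is only usable for a handful of atoms.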
7.
Molecules ; 24(5), 2019 Mar 02.
Article in English | MEDLINE | ID: mdl-30832312

ABSTRACT

The Class I Major Histocompatibility Complex (MHC) is a central protein in immunology as it binds to intracellular peptides and displays them at the cell surface for recognition by T-cells. The structural analysis of bound peptide-MHC complexes (pMHCs) holds the promise of interpretable and general binding prediction (i.e., testing whether a given peptide binds to a given MHC). However, structural analysis is limited in part by the difficulty in modelling pMHCs given the size and flexibility of the peptides that can be presented by MHCs. This article describes APE-Gen (Anchored Peptide-MHC Ensemble Generator), a fast method for generating ensembles of bound pMHC conformations. APE-Gen generates an ensemble of bound conformations by iterated rounds of (i) anchoring the ends of a given peptide near known pockets in the binding site of the MHC, (ii) sampling peptide backbone conformations with loop modelling, and then (iii) performing energy minimization to fix steric clashes, accumulating conformations at each round. APE-Gen takes only minutes on a standard desktop to generate tens of bound conformations, and we show the ability of APE-Gen to sample conformations found in X-ray crystallography even when only sequence information is used as input. APE-Gen has the potential to be useful for its scalability (i.e., modelling thousands of pMHCs or even non-canonical longer peptides) and for its use as a flexible search tool. We demonstrate an example for studying cross-reactivity.


Subjects
Histocompatibility Antigens Class I/chemistry; Multiprotein Complexes/chemistry; Peptides/chemistry; T-Lymphocytes/chemistry; Binding Sites; Crystallography, X-Ray; Histocompatibility Antigens Class I/immunology; Models, Molecular; Multiprotein Complexes/immunology; Peptides/immunology; Protein Binding; Protein Conformation; T-Lymphocytes/immunology
8.
J Chem Phys ; 149(24): 244119, 2018 Dec 28.
Article in English | MEDLINE | ID: mdl-30599712

ABSTRACT

Adaptive sampling methods, often used in combination with Markov state models, are becoming increasingly popular for speeding up rare events in simulation such as molecular dynamics (MD) without biasing the system dynamics. Several adaptive sampling strategies have been proposed, but it is not clear which methods perform better for different physical systems. In this work, we present a systematic evaluation of selected adaptive sampling strategies on a wide selection of fast folding proteins. The adaptive sampling strategies were emulated using models constructed on already existing MD trajectories. We provide theoretical limits for the sampling speed-up and compare the performance of different strategies with and without using some a priori knowledge of the system. The results show that for different goals, different adaptive sampling strategies are optimal. In order to sample slow dynamical processes such as protein folding without a priori knowledge of the system, a strategy based on the identification of a set of metastable regions is consistently the most efficient, while a strategy based on the identification of microstates performs better if the goal is to explore newer regions of the conformational space. Interestingly, the maximum speed-up achievable for the adaptive sampling of slow processes increases for proteins with longer folding times, encouraging the application of these methods for the characterization of slower processes, beyond the fast-folding proteins considered here.


Subjects
Molecular Dynamics Simulation; Proteins/chemistry; Protein Conformation; Protein Folding
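
One simple exploration-oriented strategy of the kind the abstract above compares can be sketched as a "least counts" restart rule: run many short unbiased trajectories and start each new one from the least-visited state found so far. The one-dimensional random walk, the function names, and the parameter values below are assumptions chosen for illustration, and this is not claimed to be one of the specific strategies evaluated in the paper.

```python
import random
from collections import Counter


def short_trajectory(start, n_steps, n_states, rng):
    """Unbiased nearest-neighbour random walk on states 0..n_states-1."""
    state, visited = start, []
    for _ in range(n_steps):
        state = min(n_states - 1, max(0, state + rng.choice((-1, 1))))
        visited.append(state)
    return visited


def least_counts_sampling(n_rounds, n_steps, n_states, seed=0):
    """Restart every round from the least-visited state discovered so far."""
    rng = random.Random(seed)
    counts = Counter({0: 1})  # all sampling starts from state 0
    for _ in range(n_rounds):
        # Ties between equally rare states are broken by the lower state index.
        start = min(counts, key=lambda s: (counts[s], s))
        counts.update(short_trajectory(start, n_steps, n_states, rng))
    return counts


counts = least_counts_sampling(n_rounds=20, n_steps=10, n_states=50)
print(len(counts), "distinct states visited")
```

Only the choice of restart point is adaptive here; the dynamics inside each short trajectory are left untouched, which is the sense in which such methods avoid biasing the system dynamics.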
9.
Int J Mol Sci ; 19(11), 2018 Oct 31.
Article in English | MEDLINE | ID: mdl-30384411

ABSTRACT

Both experimental and computational methods are available to gather information about a protein's conformational space and interpret changes in protein structure. However, experimentally observing and computationally modeling large proteins remain critical challenges for structural biology. Our work aims at addressing these challenges by combining computational and experimental techniques relying on each other to overcome their respective limitations. Indeed, despite its advantages, an experimental technique such as hydrogen-exchange monitoring cannot produce structural models because of its low resolution. Additionally, the computational methods that can generate such models suffer from the curse of dimensionality when applied to large proteins. Adopting a common solution to this issue, we have recently proposed a framework in which our computational method for protein conformational sampling is biased by experimental hydrogen-exchange data. In this paper, we present our latest application of this computational framework: generating an atomic-resolution structural model for an unknown protein state. For that, starting from an available protein structure, we explore the conformational space of this protein, using hydrogen-exchange data on this unknown state as a guide. We have successfully used our computational framework to generate models for three proteins of increasing size, the biggest one undergoing large-scale conformational changes.


Subjects
Complement C3b/chemistry; Deuterium Exchange Measurement; Interleukin-8/chemistry; Models, Molecular; Humans; Protein Conformation
10.
BMC Genomics ; 17 Suppl 4: 431, 2016 Aug 18.
Article in English | MEDLINE | ID: mdl-27556159

ABSTRACT

BACKGROUND: The human kinome contains many important drug targets. It is well-known that inhibitors of protein kinases bind with very different selectivity profiles. This is also the case for inhibitors of many other protein families. The increased availability of protein 3D structures has provided much information on the structural variation within a given protein family. However, the relationship between structural variations and binding specificity is complex and incompletely understood. We have developed a structural bioinformatics approach which provides an analysis of key determinants of binding selectivity as a tool to enhance the rational design of drugs with a specific selectivity profile. RESULTS: We propose a greedy algorithm that computes a subset of residue positions in a multiple sequence alignment such that structural and chemical variation in those positions helps explain known binding affinities. By providing this information, the main purpose of the algorithm is to provide experimentalists with possible insights into how the selectivity profile of certain inhibitors is achieved, which is useful for lead optimization. In addition, the algorithm can also be used to predict binding affinities for structures whose affinity for a given inhibitor is unknown. The algorithm's performance is demonstrated using an extensive dataset for the human kinome. CONCLUSION: We show that the binding affinity of 38 different kinase inhibitors can be explained with consistently high precision and accuracy using the variation of at most six residue positions in the kinome binding site. We show for several inhibitors that we are able to identify residues that are known to be functionally important.


Subjects
Computational Biology/methods; Protein Kinase Inhibitors/chemistry; Protein Kinases/genetics; Sequence Alignment/methods; Algorithms; Amino Acid Sequence; Binding Sites; Genome, Human; Humans; Protein Binding; Protein Kinases/chemistry; Structure-Activity Relationship
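
The greedy selection of residue positions described in the abstract above can be sketched as forward selection over alignment columns. Everything below is an assumption made for illustration: the scoring function (how well the residue pattern at the chosen columns separates binders from non-binders), the function names, and the six toy sequences. The published method scores structural and chemical variation, which this letter-based toy does not attempt.

```python
from collections import Counter, defaultdict


def pattern_accuracy(sequences, labels, positions):
    """Fraction of sequences whose label matches the majority label of their residue pattern."""
    groups = defaultdict(Counter)
    for seq, label in zip(sequences, labels):
        groups[tuple(seq[p] for p in positions)][label] += 1
    return sum(max(c.values()) for c in groups.values()) / len(sequences)


def greedy_positions(sequences, labels, max_positions=6):
    """Add, one at a time, the alignment column that most improves pattern_accuracy."""
    chosen = []
    best = pattern_accuracy(sequences, labels, chosen)
    while len(chosen) < max_positions:
        candidates = [p for p in range(len(sequences[0])) if p not in chosen]
        # Negating p makes max() prefer the lowest column index on ties.
        scored = [(pattern_accuracy(sequences, labels, chosen + [p]), -p) for p in candidates]
        if not scored:
            break
        score, neg_p = max(scored)
        if score <= best:
            break  # no remaining column improves the score
        chosen.append(-neg_p)
        best = score
    return chosen, best


# Hypothetical aligned binding-site residues and binder / non-binder labels.
sequences = ["ATGK", "ATGE", "SVGK", "SVGE", "ATAK", "SVAE"]
labels = [True, False, True, False, False, False]

chosen, score = greedy_positions(sequences, labels)
print(chosen, score)
```

The default cap of six positions mirrors the "at most six residue positions" reported in the abstract. Like any greedy forward selection, this stops early when no single column helps, even if a pair of columns together would.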
11.
PLoS Comput Biol ; 9(6): e1003087, 2013.
Article in English | MEDLINE | ID: mdl-23754939

ABSTRACT

The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while only having overall poor predictive ability for 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown to not be rare among kinases. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.


Subjects
Protein Kinases/chemistry; Cluster Analysis; Humans; Models, Theoretical; Proteome; Support Vector Machine
12.
iScience ; 27(1): 108613, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38188519

ABSTRACT

Peptide-HLA (pHLA) binding prediction is essential in screening peptide candidates for personalized peptide vaccines. Machine learning (ML) pHLA binding prediction tools are trained on vast amounts of data and are effective in screening peptide candidates. Most ML models report the ability to generalize to HLA alleles unseen during training ("pan-allele" models). However, the use of datasets with imbalanced allele content raises concerns about biased model performance. First, we examine the data bias of two ML-based pan-allele pHLA binding predictors. We find that the pHLA datasets overrepresent alleles from geographic populations of high-income countries. Second, we show that the identified data bias is perpetuated within ML models, leading to algorithmic bias and subpar performance for alleles expressed in low-income geographic populations. We draw attention to the potential therapeutic consequences of this bias, and we challenge the use of the term "pan-allele" to describe models trained with currently available public datasets.

13.
Article in English | MEDLINE | ID: mdl-38577265

ABSTRACT

The cellular immune response comprises several processes, with the most notable ones being the binding of the peptide to the Major Histocompatibility Complex (MHC), the peptide-MHC (pMHC) presentation to the surface of the cell, and the recognition of the pMHC by the T-Cell Receptor. Identifying the most potent peptide targets for MHC binding, presentation and T-cell recognition is vital for developing peptide-based vaccines and T-cell-based immunotherapies. Data-driven tools that predict each of these steps have been developed, and the availability of mass spectrometry (MS) datasets has facilitated the development of accurate Machine Learning (ML) methods for class-I pMHC binding prediction. However, the accuracy of ML-based tools for pMHC kinetic stability prediction and peptide immunogenicity prediction is uncertain, as stability and immunogenicity datasets are not abundant. Here, we use transfer learning techniques to improve stability and immunogenicity predictions, by taking advantage of a large number of binding affinity and MS datasets. The resulting models, TLStab and TLImm, exhibit comparable or better performance than state-of-the-art approaches on different stability and immunogenicity test sets, respectively. Our approach demonstrates the promise of learning from the task of peptide binding to improve predictions on downstream tasks. The source code of TLStab and TLImm is publicly available at https://github.com/KavrakiLab/TL-MHC.

14.
BMC Struct Biol ; 13 Suppl 1: S11, 2013.
Article in English | MEDLINE | ID: mdl-24564952

ABSTRACT

BACKGROUND: Using the popular program AutoDock, computer-aided docking of small ligands with 6 or fewer rotatable bonds is reasonably fast and accurate. However, docking large ligands using AutoDock's recommended standard docking protocol is less accurate and computationally slow. RESULTS: In our earlier work, we presented a novel AutoDock-based incremental protocol (DINC) that addresses the limitations of AutoDock's standard protocol by enabling improved docking of large ligands. Instead of docking a large ligand to a target protein in one single step as done in the standard protocol, our protocol docks the large ligand in increments. In this paper, we present three detailed examples of docking using DINC and compare the docking results with those obtained using AutoDock's standard protocol. We summarize the docking results from an extended docking study that was done on 73 protein-ligand complexes comprised of large ligands. We demonstrate not only that DINC is up to 2 orders of magnitude faster than AutoDock's standard protocol, but that it also achieves the speed-up without sacrificing docking accuracy. We also show that positional restraints can be applied to the large ligand using DINC: this is useful when computing a docked conformation of the ligand. Finally, we introduce a webserver for docking large ligands using DINC. CONCLUSIONS: Docking large ligands using DINC is significantly faster than AutoDock's standard protocol without any loss of accuracy. Therefore, DINC could be used as an alternative protocol for docking large ligands. DINC has been implemented as a webserver and is available at http://dinc.kavrakilab.org. Applications such as therapeutic drug design, rational vaccine design, and others involving large ligands could benefit from DINC and its webserver implementation.


Subjects
Ligands; Proteins/metabolism; Algorithms; Molecular Conformation; Molecular Docking Simulation; Proteins/chemistry; Software; User-Computer Interface
15.
Commun Chem ; 6(1): 132, 2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37353554

ABSTRACT

Elucidating the structure of a chemical compound is a fundamental task in chemistry with applications in multiple domains including drug discovery, precision medicine, and biomarker discovery. The common practice for elucidating the structure of a compound is to obtain a mass spectrum and subsequently retrieve its structure from spectral databases. However, these methods fail for novel molecules that are not present in the reference database. We propose Spec2Mol, a deep learning architecture for molecular structure recommendation given mass spectra alone. Spec2Mol is inspired by the Speech2Text deep learning architectures for translating audio signals into text. Our approach is based on an encoder-decoder architecture. The encoder learns the spectra embeddings, while the decoder, pre-trained on a massive dataset of chemical structures for translating between different molecular representations, reconstructs SMILES sequences of the recommended chemical structures. We have evaluated Spec2Mol by assessing the molecular similarity between the recommended structures and the original structure. Our analysis showed that Spec2Mol is able to identify the presence of key molecular substructures from the mass spectrum, and performs on par with existing fragmentation tree methods, particularly when test structure information is not available during training or present in the reference database.

16.
Front Immunol ; 14: 1108303, 2023.
Article in English | MEDLINE | ID: mdl-37187737

ABSTRACT

Introduction: Peptide-HLA class I (pHLA) complexes on the surface of tumor cells can be targeted by cytotoxic T-cells to eliminate tumors, and this is one of the bases for T-cell-based immunotherapies. However, there exist cases where therapeutic T-cells directed towards tumor pHLA complexes may also recognize pHLAs from healthy normal cells. The process where the same T-cell clone recognizes more than one pHLA is referred to as T-cell cross-reactivity and this process is driven mainly by features that make pHLAs similar to each other. T-cell cross-reactivity prediction is critical for designing T-cell-based cancer immunotherapies that are both effective and safe. Methods: Here we present PepSim, a novel score to predict T-cell cross-reactivity based on the structural and biochemical similarity of pHLAs. Results and discussion: We show our method can accurately separate cross-reactive from non-cross-reactive pHLAs in a diverse set of datasets including cancer, viral, and self-peptides. PepSim can be generalized to work on any dataset of class I peptide-HLAs and is freely available as a web server at pepsim.kavrakilab.org.


Subjects
Peptides; T-Lymphocytes, Cytotoxic; Amino Acid Sequence; Clone Cells
17.
Bioinformatics ; 27(15): 2161-2, 2011 Aug 01.
Article in English | MEDLINE | ID: mdl-21659320

ABSTRACT

SUMMARY: The LabelHash server and tools are designed for large-scale substructure comparison. The main use is to predict the function of unknown proteins. Given a set of (putative) functional residues, LabelHash finds all occurrences of matching substructures in the entire Protein Data Bank, along with a statistical significance estimate and known functional annotations for each match. The results can be downloaded for further analysis in any molecular viewer. For Chimera, there is a plugin to facilitate this process. AVAILABILITY: The web site is free and open to all users with no login requirements at http://labelhash.kavrakilab.org


Subjects
Databases, Protein; Internet; Molecular Sequence Annotation/methods; Proteins/metabolism; Algorithms; Amino Acid Motifs; Computational Biology/methods; Structure-Activity Relationship; User-Computer Interface
18.
Sci Rep ; 12(1): 10749, 2022 Jun 24.
Article in English | MEDLINE | ID: mdl-35750701

ABSTRACT

Binding of peptides to Human Leukocyte Antigen (HLA) receptors is a prerequisite for triggering immune response. Estimating peptide-HLA (pHLA) binding is crucial for peptide vaccine target identification and epitope discovery pipelines. Computational methods for binding affinity prediction can accelerate these pipelines. Currently, most of those computational methods rely exclusively on sequence-based data, which leads to inherent limitations. Recent studies have shown that structure-based data can address some of these limitations. In this work we propose a novel machine learning (ML) structure-based protocol to predict binding affinity of peptides to HLA receptors. For that, we engineer the input features for ML models by decoupling energy contributions at different residue positions in peptides, which leads to our novel per-peptide-position protocol. Using Rosetta's ref2015 scoring function as a baseline we use this protocol to develop 3pHLA-score. Our per-peptide-position protocol outperforms the standard training protocol and leads to an increase from 0.82 to 0.99 of the area under the precision-recall curve. 3pHLA-score outperforms widely used scoring functions (AutoDock4, Vina, Dope, Vinardo, FoldX, GradDock) in a structural virtual screening task. Overall, this work brings structure-based methods one step closer to epitope discovery pipelines and could help advance the development of cancer and viral vaccines.


Assuntos
Antígenos de Histocompatibilidade Classe II , Peptídeos , Epitopos/química , Antígenos HLA/metabolismo , Antígenos de Histocompatibilidade Classe I/metabolismo , Antígenos de Histocompatibilidade Classe II/metabolismo , Humanos , Peptídeos/química , Ligação Proteica
19.
Front Immunol ; 13: 931155, 2022.
Article in English | MEDLINE | ID: mdl-35903104

ABSTRACT

The pandemic caused by the SARS-CoV-2 virus, the agent responsible for the COVID-19 disease, has affected millions of people worldwide. There is a constant search for new therapies to either prevent or mitigate the disease. Fortunately, we have observed the successful development of multiple vaccines. Most of them are focused on one viral envelope protein, the spike protein. However, such focused approaches may contribute to the rise of new variants, fueled by the constant selection pressure on envelope proteins and the widespread dispersion of coronaviruses in nature. Therefore, it is important to examine other proteins, preferably those that are less susceptible to selection pressure, such as the nucleocapsid (N) protein. Even though the N protein is less accessible to the humoral response, peptides from its conserved regions can be presented by class I Human Leukocyte Antigen (HLA) molecules, eliciting an immune response mediated by T-cells. Given the increasing number of protein sequences deposited in biological databases daily and the N protein conservation among viral strains, computational methods can be leveraged to discover potential new targets for SARS-CoV-2 and SARS-CoV-related viruses. Here we developed SARS-Arena, a user-friendly computational pipeline that can be used by practitioners of different levels of expertise for novel vaccine development. SARS-Arena combines sequence-based methods and structure-based analyses to (i) perform multiple sequence alignment (MSA) of SARS-CoV-related N protein sequences, (ii) recover candidate peptides of different lengths from conserved protein regions, and (iii) model the 3D structure of the conserved peptides in the context of different HLAs. We present two main Jupyter Notebook workflows that can help in the identification of new T-cell targets against SARS-CoV viruses. In fact, in a cross-reactive case study, our workflows identified a conserved N protein peptide (SPRWYFYYL) recognized by CD8+ T-cells in the context of HLA-B7+. SARS-Arena is available at https://github.com/KavrakiLab/SARS-Arena.
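Step (ii), recovering candidate peptides of different lengths from conserved regions of an alignment, can be sketched as follows. This is an illustrative sketch and not the SARS-Arena code: the function names, the conservation measure (fraction of sequences sharing the most common residue in a column), the threshold, and the toy alignment are all assumptions. Only the peptide SPRWYFYYL comes from the abstract; the flanking residues are invented.

```python
from collections import Counter
from typing import List, Sequence


def column_conservation(alignment: List[str]) -> List[float]:
    """Fraction of sequences sharing the most common non-gap residue per column."""
    n_sequences = len(alignment)
    scores = []
    for column in zip(*alignment):  # assumes all aligned rows have equal length
        counts = Counter(residue for residue in column if residue != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n_sequences)
    return scores


def conserved_peptides(
    alignment: List[str],
    lengths: Sequence[int] = (8, 9, 10, 11),
    min_conservation: float = 1.0,
) -> List[str]:
    """Windows of the first (reference) row whose columns all pass the threshold."""
    reference = alignment[0]
    scores = column_conservation(alignment)
    peptides = []
    for k in lengths:
        for start in range(len(reference) - k + 1):
            window = reference[start:start + k]
            if "-" in window:
                continue  # skip windows that span a gap in the reference
            if all(s >= min_conservation for s in scores[start:start + k]):
                peptides.append(window)
    return peptides


# Toy alignment: invented flanks around the peptide reported in the abstract.
toy_alignment = [
    "AKSPRWYFYYLGTQ",
    "ARSPRWYFYYLGTE",
    "AKSPRWYFYYLGSQ",
]

print(conserved_peptides(toy_alignment, lengths=(9, 10)))
```

Candidates produced this way would then be passed to a structural modeling step such as (iii); lowering `min_conservation` trades strict identity for tolerance of rare substitutions.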


Subjects
COVID-19 , SARS-CoV-2 , CD8-Positive T-Lymphocytes , COVID-19/prevention & control , COVID-19 Vaccines , Epitopes, T-Lymphocyte , Humans , Peptides , Vaccine Development
20.
PNAS Nexus ; 1(3): pgac124, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36003074

ABSTRACT

Human leukocyte antigen class I (HLA-I) molecules bind and present peptides at the cell surface to facilitate the induction of appropriate CD8+ T cell-mediated immune responses to pathogen- and self-derived proteins. The HLA-I peptide-binding cleft contains dominant anchor sites in the B and F pockets that interact primarily with amino acids at peptide position 2 and the C-terminus, respectively. Nonpocket peptide-HLA interactions also contribute to peptide binding and stability, but these secondary interactions are thought to be unique to individual HLA allotypes or to specific peptide antigens. Here, we show that two positively charged residues located near the top of the peptide-binding cleft facilitate interactions with negatively charged residues at position 4 of presented peptides, which occur at elevated frequencies across most HLA-I allotypes. Loss of these interactions was shown to impair HLA-I/peptide binding and complex stability, as demonstrated by both in vitro and in silico experiments. Furthermore, mutation of the arginine-65 (R65) and/or lysine-66 (K66) residues in HLA-A*02:01 and A*24:02 significantly reduced HLA-I cell surface expression while also reducing the diversity of the presented peptide repertoire by up to 5-fold. The impact of the R65 mutation demonstrates that nonpocket HLA-I/peptide interactions can constitute anchor motifs that exert an unexpectedly broad influence on HLA-I-mediated antigen presentation. These findings provide fundamental insights into peptide antigen binding that could broadly inform epitope discovery in the context of viral vaccine development and cancer immunotherapy.
