Search | Virtual Health Library (Biblioteca Virtual em Saúde)

1.

HLA-Arena: A Customizable Environment for the Structural Modeling and Analysis of Peptide-HLA Complexes for Cancer Immunotherapy.

Antunes, Dinler A; Abella, Jayvee R; Hall-Swan, Sarah; Devaurs, Didier; Conev, Anja; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.

JCO Clin Cancer Inform ; 4: 623-636, 2020 07.

Article in English | MEDLINE | ID: mdl-32667823

ABSTRACT

PURPOSE: HLA protein receptors play a key role in cellular immunity. They bind intracellular peptides and display them for recognition by T-cell lymphocytes. Because T-cell activation is partially driven by structural features of these peptide-HLA complexes, their structural modeling and analysis are becoming central components of cancer immunotherapy projects. Unfortunately, this kind of analysis is limited by the small number of experimentally determined structures of peptide-HLA complexes. Overcoming this limitation requires developing novel computational methods to model and analyze peptide-HLA structures. METHODS: Here we describe a new platform for the structural modeling and analysis of peptide-HLA complexes, called HLA-Arena, which we have implemented using Jupyter Notebook and Docker. It is a customizable environment that facilitates the use of computational tools, such as APE-Gen and DINC, which we have previously applied to peptide-HLA complexes. By integrating other commonly used tools, such as MODELLER and MHCflurry, this environment includes support for diverse tasks in structural modeling, analysis, and visualization. RESULTS: To illustrate the capabilities of HLA-Arena, we describe 3 example workflows applied to peptide-HLA complexes. Leveraging the strengths of our tools, DINC and APE-Gen, the first 2 workflows show how to perform geometry prediction for peptide-HLA complexes and structure-based binding prediction, respectively. The third workflow presents an example of large-scale virtual screening of peptides for multiple HLA alleles. CONCLUSION: These workflows illustrate the potential benefits of HLA-Arena for the structural modeling and analysis of peptide-HLA complexes. Because HLA-Arena can easily be integrated within larger computational pipelines, we expect its potential impact to vastly increase. 
For instance, it could be used to conduct structural analyses for personalized cancer immunotherapy, neoantigen discovery, or vaccine development.

Subjects

Neoplasms , Peptides , Humans , Immunotherapy , Neoplasms/therapy , T-Lymphocytes

2.

Improving the organization and interactivity of metabolic pathfinding with precomputed pathways.

Kim, Sarah M; Peña, Matthew I; Moll, Mark; Bennett, George N; Kavraki, Lydia E.

BMC Bioinformatics ; 21(1): 13, 2020 Jan 10.

Article in English | MEDLINE | ID: mdl-31924164

ABSTRACT

BACKGROUND: The rapid growth of available knowledge on metabolic processes across thousands of species continues to expand the possibilities of producing chemicals by combining pathways found in different species. Several computational search algorithms have been developed for automating the identification of possible heterologous pathways; however, these searches may return thousands of pathway results. Although the large number of results is in part due to the large number of possible compounds and reactions, a subset of core reaction modules is repeatedly observed in pathway results across multiple searches, suggesting that some subpaths between common compounds were more consistently explored than others. To reduce the resources spent on searching the same metabolic space, a new meta-algorithm for metabolic pathfinding, Hub Pathway search with Atom Tracking (HPAT), was developed to take advantage of a precomputed network of subpath modules. To investigate the efficacy of this method, we created a table describing a network of common hub metabolites and how they are biochemically connected and only offloaded searches to and from this hub network onto an interactive webserver capable of visualizing the resulting pathways. RESULTS: A test set of nineteen known pathways taken from literature and metabolic databases was used to evaluate whether HPAT was capable of identifying known pathways. HPAT found the exact pathway for eleven of the nineteen test cases using a diverse set of precomputed subpaths, whereas a comparable pathfinding search algorithm that does not use precomputed subpaths found only seven of the nineteen test cases. The capability of HPAT to find novel pathways was demonstrated by its ability to identify novel 3-hydroxypropanoate (3-HP) synthesis pathways. As for pathway visualization, the new interactive pathway filters enable a reduction of the number of displayed pathways from hundreds down to fewer than ten pathways in several test cases, illustrating their utility in reducing the amount of presented information while retaining pathways of interest. CONCLUSIONS: This work presents the first step in incorporating a precomputed subpath network into metabolic pathfinding and demonstrates how this leads to a concise, interactive visualization of pathway results. The modular nature of metabolic pathways is exploited to facilitate efficient discovery of alternate pathways.

Subjects

Algorithms , Metabolic Networks and Pathways , Lactic Acid/analogs & derivatives , Lactic Acid/chemistry , Lactic Acid/metabolism , Pyruvic Acid/metabolism

3.

Using parallelized incremental meta-docking can solve the conformational sampling issue when docking large ligands to proteins.

Devaurs, Didier; Antunes, Dinler A; Hall-Swan, Sarah; Mitchell, Nicole; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.

BMC Mol Cell Biol ; 20(1): 42, 2019 09 05.

Article in English | MEDLINE | ID: mdl-31488048

ABSTRACT

BACKGROUND: Docking large ligands, and especially peptides, to protein receptors is still considered a challenge in computational structural biology. Besides the issue of accurately scoring the binding modes of a protein-ligand complex produced by a molecular docking tool, the conformational sampling of a large ligand is also often considered a challenge because of its underlying combinatorial complexity. In this study, we evaluate the impact of using parallelized and incremental paradigms on the accuracy and performance of conformational sampling when docking large ligands. We use five datasets of protein-ligand complexes involving ligands that could not be accurately docked by classical protein-ligand docking tools in previous similar studies. RESULTS: Our computational evaluation shows that simply increasing the amount of conformational sampling performed by a protein-ligand docking tool, such as Vina, by running it for longer is rarely beneficial. Instead, it is more efficient and advantageous to run several short instances of this docking tool in parallel and group their results together, in a straightforward parallelized docking protocol. Even greater accuracy and efficiency are achieved by our parallelized incremental meta-docking tool, DINC, showing the additional benefits of its incremental paradigm. Using DINC, we could accurately reproduce the vast majority of the protein-ligand complexes we considered. CONCLUSIONS: Our study suggests that, even when trying to dock large ligands to proteins, the conformational sampling of the ligand should no longer be considered an issue, as simple docking protocols using existing tools can solve it. Therefore, scoring should currently be regarded as the biggest unmet challenge in molecular docking.
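The "several short runs in parallel, then pool the results" protocol described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not call Vina or DINC: `toy_score`, `short_run`, and `parallel_runs` are hypothetical names, the "pose" is a single number, and threads stand in for what would in practice be separate docking processes.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_score(x):
    # Hypothetical stand-in for a docking score (lower is better).
    return (x - 3.0) ** 2


def short_run(seed, n_samples=200):
    # One short, independently seeded random search over a toy 1-D "pose".
    rng = random.Random(seed)
    samples = (rng.uniform(-10.0, 10.0) for _ in range(n_samples))
    best = min(samples, key=toy_score)
    return toy_score(best), best


def parallel_runs(seeds):
    # Launch the short runs concurrently, then pool and rank their results.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(short_run, seeds))
    return sorted(results)


ranked = parallel_runs(range(8))
best_score, best_pose = ranked[0]
```

Each run uses its own seed, so the pooled ranking is reproducible, and the best pooled result is by construction at least as good as the best result of any single short run.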

Subjects

Algorithms , Molecular Docking Simulation , Proteins/chemistry , Databases, Protein , Ligands , Peptides/chemistry , Protein Conformation

4.

Machine Learning Guided Atom Mapping of Metabolic Reactions.

Litsa, Eleni E; Peña, Matthew I; Moll, Mark; Giannakopoulos, George; Bennett, George N; Kavraki, Lydia E.

J Chem Inf Model ; 59(3): 1121-1135, 2019 03 25.

Article in English | MEDLINE | ID: mdl-30500191

ABSTRACT

Atom mapping of a chemical reaction is a mapping between the atoms in the reactant molecules and the atoms in the product molecules. It encodes the underlying reaction mechanism and, as such, constitutes essential information in computational studies in drug design. Various techniques have been investigated for the automatic computation of the atom mapping of a chemical reaction, approaching the problem as a graph matching problem. The graph abstraction of the chemical problem, though, eliminates crucial chemical information. There have been efforts for enhancing the graph representation by introducing the bond stabilities as edge weights, as they are estimated based on experimental evidence. Here, we present a fully automated optimization-based approach, named AMLGAM (Automated Machine Learning Guided Atom Mapping), that uses machine learning techniques for the estimation of the bond stabilities based on the chemical environment of each bond. The optimization method finds the reaction mechanism which favors the breakage/formation of the less stable bonds. We evaluated our method on a manually curated data set of 382 chemical reactions and ran our method on a much larger and diverse data set of 7400 chemical reactions. We show that the proposed method improves the accuracy over existing techniques based on results published by earlier studies on a common data set and is capable of handling unbalanced reactions.

Subjects

Cheminformatics/methods , Machine Learning

5.

General Prediction of Peptide-MHC Binding Modes Using Incremental Docking: A Proof of Concept.

Antunes, Dinler A; Devaurs, Didier; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.

Sci Rep ; 8(1): 4327, 2018 03 12.

Article in English | MEDLINE | ID: mdl-29531253

ABSTRACT

The class I major histocompatibility complex (MHC) is capable of binding peptides derived from intracellular proteins and displaying them at the cell surface. The recognition of these peptide-MHC (pMHC) complexes by T-cells is the cornerstone of cellular immunity, enabling the elimination of infected or tumoral cells. T-cell-based immunotherapies against cancer, which leverage this mechanism, can greatly benefit from structural analyses of pMHC complexes. Several attempts have been made to use molecular docking for such analyses, but pMHC structure remains too challenging for even state-of-the-art docking tools. To overcome these limitations, we describe the use of an incremental meta-docking approach for structural prediction of pMHC complexes. Previous methods applied in this context used specific constraints to reduce the complexity of this prediction problem, at the expense of generality. Our strategy makes no assumption and can potentially be used to predict binding modes for any pMHC complex. Our method has been tested in a re-docking experiment, reproducing the binding modes of 25 pMHC complexes whose crystal structures are available. This study is a proof of concept that incremental docking strategies can lead to general geometry prediction of pMHC complexes, with potential applications for immunotherapy against cancer or infectious diseases.

Subjects

Histocompatibility Antigens Class I/metabolism , Molecular Docking Simulation , Peptides/chemistry , Peptides/pharmacology , Databases, Protein , Drug Discovery , Histocompatibility Antigens Class I/chemistry , Humans , Neoplasms/therapy , Proof of Concept Study , Protein Binding

6.

Maintaining and Enhancing Diversity of Sampled Protein Conformations in Robotics-Inspired Methods.

Abella, Jayvee R; Moll, Mark; Kavraki, Lydia E.

J Comput Biol ; 25(1): 3-20, 2018 01.

Article in English | MEDLINE | ID: mdl-29035572

ABSTRACT

The ability to efficiently sample structurally diverse protein conformations allows one to gain a high-level view of a protein's energy landscape. Algorithms from robot motion planning have been used for conformational sampling, and several of these algorithms promote diversity by keeping track of "coverage" in conformational space based on the local sampling density. However, large proteins present special challenges. In particular, larger systems require running many concurrent instances of these algorithms, but these algorithms can quickly become memory intensive because they typically keep previously sampled conformations in memory to maintain coverage estimates. In addition, robotics-inspired algorithms depend on defining useful perturbation strategies for exploring the conformational space, which is a difficult task for large proteins because such systems are typically more constrained and exhibit complex motions. In this article, we introduce two methodologies for maintaining and enhancing diversity in robotics-inspired conformational sampling. The first method addresses algorithms based on coverage estimates and leverages the use of a low-dimensional projection to define a global coverage grid that maintains coverage across concurrent runs of sampling. The second method is an automatic definition of a perturbation strategy through readily available flexibility information derived from B-factors, secondary structure, and rigidity analysis. Our results show a significant increase in the diversity of the conformations sampled for proteins consisting of up to 500 residues when applied to a specific robotics-inspired algorithm for conformational sampling. The methodologies presented in this article may be vital components for the scalability of robotics-inspired approaches.
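The idea of a coverage grid defined over a low-dimensional projection can be sketched as follows. This is only an illustration under stated assumptions, not the method from the article: the `project` function (which simply keeps the first two coordinates), the cell size, and the `CoverageGrid` class are all hypothetical, and in this sketch the grid stores per-cell counts rather than the conformations themselves.

```python
from collections import Counter


def project(conformation):
    # Hypothetical 2-D projection: keep the first two coordinates.
    return conformation[0], conformation[1]


def cell_of(conformation, cell_size=1.0):
    # Map a conformation to the grid cell containing its projection.
    x, y = project(conformation)
    return int(x // cell_size), int(y // cell_size)


class CoverageGrid:
    """Per-cell sample counts that several sampling runs could share."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.counts = Counter()

    def add(self, conformation):
        self.counts[cell_of(conformation, self.cell_size)] += 1

    def least_covered(self, candidates):
        # Prefer the candidate whose cell has been sampled least often.
        return min(candidates,
                   key=lambda c: self.counts[cell_of(c, self.cell_size)])


grid = CoverageGrid()
for conf in [(0.2, 0.3, 5.0), (0.7, 0.1, 2.0), (3.5, 3.5, 0.0)]:
    grid.add(conf)
candidates = [(0.5, 0.5, 1.0), (3.1, 3.9, 4.0), (8.0, 8.0, 0.0)]
chosen = grid.least_covered(candidates)
```

In the toy data, the candidate at `(8.0, 8.0, 0.0)` falls in a cell that no earlier sample has visited, so it is the one selected for further exploration.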

Subjects

Computational Biology/methods , Protein Conformation , Sequence Analysis, Protein/methods , Software , Animals , Humans , Robotics/methods

7.

Native State of Complement Protein C3d Analysed via Hydrogen Exchange and Conformational Sampling.

Devaurs, Didier; Papanastasiou, Malvina; Antunes, Dinler A; Abella, Jayvee R; Moll, Mark; Ricklin, Daniel; Lambris, John D; Kavraki, Lydia E.

Int J Comput Biol Drug Des ; 11(1-2): 90-113, 2018.

Article in English | MEDLINE | ID: mdl-30700993

ABSTRACT

Hydrogen/deuterium exchange detected by mass spectrometry (HDX-MS) provides valuable information on protein structure and dynamics. Although HDX-MS data is often interpreted using crystal structures, it was suggested that conformational ensembles produced by molecular dynamics simulations yield more accurate interpretations. In this paper, we analyse the complement protein C3d by performing an HDX-MS experiment, and evaluate several interpretation methodologies using an existing prediction model to derive HDX-MS data from protein structure. To interpret and refine C3d's HDX-MS data, we look for a conformation (or conformational ensemble) of C3d that allows computationally replicating this data. We confirm that crystal structures are not a good choice and suggest that conformational ensembles produced by molecular dynamics simulations might not always be satisfactory either. Finally, we show that coarse-grained conformational sampling of C3d produces a conformation from which its HDX-MS data can be replicated and refined.

8.

DINC 2.0: A New Protein-Peptide Docking Webserver Using an Incremental Approach.

Antunes, Dinler A; Moll, Mark; Devaurs, Didier; Jackson, Kyle R; Lizée, Gregory; Kavraki, Lydia E.

Cancer Res ; 77(21): e55-e57, 2017 11 01.

Article in English | MEDLINE | ID: mdl-29092940

ABSTRACT

Molecular docking is a standard computational approach to predict binding modes of protein-ligand complexes by exploring alternative orientations and conformations of the ligand (i.e., by exploring ligand flexibility). Docking tools are largely used for virtual screening of small drug-like molecules, but their accuracy and efficiency decay greatly for ligands with more than 10 flexible bonds. This prevents a broader use of these tools to dock larger ligands, such as peptides, which are molecules of growing interest in cancer research. To overcome this limitation, our group has previously proposed a meta-docking strategy, called DINC, to predict binding modes of large ligands. By incrementally docking overlapping fragments of a ligand, DINC allowed predicting binding modes of peptide-based inhibitors of transcription factors involved in cancer. Here, we describe DINC 2.0, a revamped version of the DINC webserver with enhanced capabilities and a more user-friendly interface. DINC 2.0 allows docking ligands that were previously too challenging for DINC, such as peptides with more than 25 flexible bonds. The webserver is freely accessible at http://dinc.kavrakilab.org, together with additional documentation and video tutorials. Our team will provide continuous support for this tool and is working on extending its applicability to other challenging fields, such as personalized immunotherapy against cancer. Cancer Res; 77(21); e55-57. ©2017 AACR.

Subjects

Molecular Docking Simulation/methods , Neoplasms/genetics , Peptides/chemistry , Proteins/chemistry , Software , Algorithms , Binding Sites , Humans , Ligands , Models, Molecular , Neoplasms/chemistry , Peptides/genetics , Protein Binding , Protein Conformation , Proteins/genetics

9.

A review of parameters and heuristics for guiding metabolic pathfinding.

Kim, Sarah M; Peña, Matthew I; Moll, Mark; Bennett, George N; Kavraki, Lydia E.

J Cheminform ; 9(1): 51, 2017 Sep 15.

Article in English | MEDLINE | ID: mdl-29086092

ABSTRACT

Recent developments in metabolic engineering have led to the successful biosynthesis of valuable products, such as the precursor of the antimalarial compound, artemisinin, and opioid precursor, thebaine. Synthesizing these traditionally plant-derived compounds in genetically modified yeast cells introduces the possibility of significantly reducing the total time and resources required for their production, and in turn, allows these valuable compounds to become cheaper and more readily available. Most biosynthesis pathways used in metabolic engineering applications have been discovered manually, requiring a tedious search of existing literature and metabolic databases. However, the recent rapid development of available metabolic information has enabled the development of automated approaches for identifying novel pathways. Computer-assisted pathfinding has the potential to save biochemists time in the initial discovery steps of metabolic engineering. In this paper, we review the parameters and heuristics used to guide the search in recent pathfinding algorithms. These parameters and heuristics capture information on the metabolic network structure, compound structures, reaction features, and organism-specificity of pathways. No one metabolic pathfinding algorithm or search parameter stands out as the best to use broadly for solving the pathfinding problem, as each method and parameter has its own strengths and shortcomings. As assisted pathfinding approaches continue to become more sophisticated, the development of better methods for visualizing pathway results and integrating these results into existing metabolic engineering practices is also important for encouraging wider use of these pathfinding methods.

10.

Coarse-Grained Conformational Sampling of Protein Structure Improves the Fit to Experimental Hydrogen-Exchange Data.

Devaurs, Didier; Antunes, Dinler A; Papanastasiou, Malvina; Moll, Mark; Ricklin, Daniel; Lambris, John D; Kavraki, Lydia E.

Front Mol Biosci ; 4: 13, 2017.

Article in English | MEDLINE | ID: mdl-28344973

ABSTRACT

Monitoring hydrogen/deuterium exchange (HDX) undergone by a protein in solution produces experimental data that translates into valuable information about the protein's structure. Data produced by HDX experiments is often interpreted using a crystal structure of the protein, when available. However, it has been shown that the correspondence between experimental HDX data and crystal structures is often not satisfactory. This creates difficulties when trying to perform a structural analysis of the HDX data. In this paper, we evaluate several strategies to obtain a conformation providing a good fit to the experimental HDX data, which is a premise of an accurate structural analysis. We show that performing molecular dynamics simulations can be inadequate to obtain such conformations, and we propose a novel methodology involving a coarse-grained conformational sampling approach instead. By extensively exploring the intrinsic flexibility of a protein with this approach, we produce a conformational ensemble from which we extract a single conformation providing a good fit to the experimental HDX data. We successfully demonstrate the applicability of our method to four small and medium-sized proteins.

11.

Defining Low-Dimensional Projections to Guide Protein Conformational Sampling.

Novinskaya, Anastasia; Devaurs, Didier; Moll, Mark; Kavraki, Lydia E.

J Comput Biol ; 24(1): 79-89, 2017 Jan.

Article in English | MEDLINE | ID: mdl-27892695

ABSTRACT

Exploring the conformational space of proteins is critical to characterize their functions. Numerous methods have been proposed to sample a protein's conformational space, including techniques developed in the field of robotics and known as sampling-based motion-planning algorithms (or sampling-based planners). However, these algorithms suffer from the curse of dimensionality when applied to large proteins. Many sampling-based planners attempt to mitigate this issue by keeping track of sampling density to guide conformational sampling toward unexplored regions of the conformational space. This is often done using low-dimensional projections as an indirect way to reduce the dimensionality of the exploration problem. However, how to choose an appropriate projection and how much it influences the planner's performance are still poorly understood issues. In this article, we introduce two methodologies defining low-dimensional projections that can be used by sampling-based planners for protein conformational sampling. The first method leverages information about a protein's flexibility to construct projections that can efficiently guide conformational sampling, when expert knowledge is available. The second method builds similar projections automatically, without expert intervention. We evaluate the projections produced by both methodologies on two conformational search problems involving three middle-size proteins. Our experiments demonstrate that (i) defining projections based on expert knowledge can benefit conformational sampling and (ii) automatically constructing such projections is a reasonable alternative.

Subjects

Algorithms , Bacterial Proteins/chemistry , Calmodulin/chemistry , Carrier Proteins/chemistry , Escherichia coli Proteins/chemistry , Periplasmic Binding Proteins/chemistry , Cyanobacteria/chemistry , Escherichia coli/chemistry , Humans , Models, Molecular , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand

12.

Structure-guided selection of specificity determining positions in the human Kinome.

Moll, Mark; Finn, Paul W; Kavraki, Lydia E.

BMC Genomics ; 17 Suppl 4: 431, 2016 08 18.

Article in English | MEDLINE | ID: mdl-27556159

ABSTRACT

BACKGROUND: The human kinome contains many important drug targets. It is well-known that inhibitors of protein kinases bind with very different selectivity profiles. This is also the case for inhibitors of many other protein families. The increased availability of protein 3D structures has provided much information on the structural variation within a given protein family. However, the relationship between structural variations and binding specificity is complex and incompletely understood. We have developed a structural bioinformatics approach which provides an analysis of key determinants of binding selectivity as a tool to enhance the rational design of drugs with a specific selectivity profile. RESULTS: We propose a greedy algorithm that computes a subset of residue positions in a multiple sequence alignment such that structural and chemical variation in those positions helps explain known binding affinities. By providing this information, the main purpose of the algorithm is to provide experimentalists with possible insights into how the selectivity profile of certain inhibitors is achieved, which is useful for lead optimization. In addition, the algorithm can also be used to predict binding affinities for structures whose affinity for a given inhibitor is unknown. The algorithm's performance is demonstrated using an extensive dataset for the human kinome. CONCLUSION: We show that the binding affinity of 38 different kinase inhibitors can be explained with consistently high precision and accuracy using the variation of at most six residue positions in the kinome binding site. We show for several inhibitors that we are able to identify residues that are known to be functionally important.
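A generic greedy forward selection over alignment columns can illustrate the kind of procedure this abstract describes. The sketch below is an assumption-laden toy, not the published algorithm: the scoring function `explained` (how many sequences carry the majority label of their residue-pattern group), the four-residue sequences, and the binary "binds / does not bind" labels are all invented for illustration.

```python
from collections import Counter, defaultdict


def explained(rows, labels, positions):
    # Hypothetical score: group sequences by their residues at the chosen
    # positions and count those carrying the majority label of their group.
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[tuple(row[p] for p in positions)].append(label)
    return sum(max(Counter(g).values()) for g in groups.values())


def greedy_positions(rows, labels, max_positions):
    # Repeatedly add the single position that most improves the score.
    selected = []
    best = explained(rows, labels, selected)
    n_columns = len(rows[0])
    while len(selected) < max_positions:
        gains = [(explained(rows, labels, selected + [p]), p)
                 for p in range(n_columns) if p not in selected]
        if not gains:
            break
        score, pos = max(gains, key=lambda g: (g[0], -g[1]))
        if score <= best:
            break  # no remaining position improves the fit
        selected.append(pos)
        best = score
    return selected, best


# Toy aligned binding-site sequences; label 1 means "binds the inhibitor".
rows = ["AKLE", "ARLD", "AKME", "GKLE", "GRME", "GRMD"]
labels = [1, 1, 0, 0, 0, 0]
selected, score = greedy_positions(rows, labels, max_positions=3)
```

In this toy data the label depends only on columns 0 and 2, and the greedy search stops after selecting exactly those two columns because no third column improves the score.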

Subjects

Computational Biology/methods , Protein Kinase Inhibitors/chemistry , Protein Kinases/genetics , Sequence Alignment/methods , Algorithms , Amino Acid Sequence , Binding Sites , Genome, Human , Humans , Protein Binding , Protein Kinases/chemistry , Structure-Activity Relationship

13.

SIMS: a hybrid method for rapid conformational analysis.

Gipson, Bryant; Moll, Mark; Kavraki, Lydia E.

PLoS One ; 8(7): e68826, 2013.

Article in English | MEDLINE | ID: mdl-23935893

ABSTRACT

Proteins are at the root of many biological functions, often performing complex tasks as the result of large changes in their structure. Describing the exact details of these conformational changes, however, remains a central challenge for computational biology due to the enormous computational requirements of the problem. This has engendered the development of a rich variety of useful methods designed to answer specific questions at different levels of spatial, temporal, and energetic resolution. These methods fall largely into two classes: physically accurate, but computationally demanding methods and fast, approximate methods. We introduce here a new hybrid modeling tool, the Structured Intuitive Move Selector (SIMS), designed to bridge the divide between these two classes, while allowing the benefits of both to be seamlessly integrated into a single framework. This is achieved by applying a modern motion planning algorithm, borrowed from the field of robotics, in tandem with a well-established protein modeling library. SIMS can combine precise energy calculations with approximate or specialized conformational sampling routines to produce rapid, yet accurate, analysis of the large-scale conformational variability of protein systems. Several key advancements are shown, including the abstract use of generically defined moves (conformational sampling methods) and an expansive probabilistic conformational exploration. We present three example problems to which SIMS is applied and demonstrate a rapid solution for each. These include the automatic determination of "active" residues for the hinge-based system Cyanovirin-N, exploring conformational changes involving long-range coordinated motion between non-sequential residues in Ribose-Binding Protein, and the rapid discovery of a transient conformational state of Maltose-Binding Protein, previously only determined by Molecular Dynamics. For all cases we provide energetic validations using well-established energy fields, demonstrating this framework as a fast and accurate tool for the analysis of a wide range of protein flexibility problems.

Subjects

Algorithms , Computational Biology/methods , Proteins/chemistry , Amino Acids/chemistry , Bacterial Proteins/chemistry , Carrier Proteins/chemistry , Magnetic Resonance Spectroscopy , Maltose-Binding Proteins/chemistry , Models, Molecular , Principal Component Analysis , Protein Conformation , Thermodynamics

14.

Combinatorial clustering of residue position subsets predicts inhibitor affinity across the human kinome.

Bryant, Drew H; Moll, Mark; Finn, Paul W; Kavraki, Lydia E.

PLoS Comput Biol ; 9(6): e1003087, 2013.

Article in English | MEDLINE | ID: mdl-23754939

ABSTRACT

The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while only having overall poor predictive ability for 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare among kinases. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.

Subjects

Protein Kinases/chemistry , Cluster Analysis , Humans , Models, Theoretical , Proteome , Support Vector Machine

15.

The LabelHash server and tools for substructure-based functional annotation.

Moll, Mark; Bryant, Drew H; Kavraki, Lydia E.

Bioinformatics ; 27(15): 2161-2, 2011 Aug 01.

Article in English | MEDLINE | ID: mdl-21659320

ABSTRACT

SUMMARY: The LabelHash server and tools are designed for large-scale substructure comparison. The main use is to predict the function of unknown proteins. Given a set of (putative) functional residues, LabelHash finds all occurrences of matching substructures in the entire Protein Data Bank, along with a statistical significance estimate and known functional annotations for each match. The results can be downloaded for further analysis in any molecular viewer. For Chimera, there is a plugin to facilitate this process. AVAILABILITY: The web site is free and open to all users with no login requirements at http://labelhash.kavrakilab.org

Subjects

Databases, Protein , Internet , Molecular Sequence Annotation/methods , Proteins/metabolism , Algorithms , Amino Acid Motifs , Computational Biology/methods , Structure-Activity Relationship , User-Computer Interface

16.

The LabelHash algorithm for substructure matching.

Moll, Mark; Bryant, Drew H; Kavraki, Lydia E.

BMC Bioinformatics ; 11: 555, 2010 Nov 11.

Article in English | MEDLINE | ID: mdl-21070651

ABSTRACT

BACKGROUND: There is an increasing number of proteins with known structure but unknown function. Determining their function would have a significant impact on understanding diseases and designing new therapeutics. However, experimental protein function determination is expensive and very time-consuming. Computational methods can facilitate function determination by identifying proteins that have high structural and chemical similarity. RESULTS: We present LabelHash, a novel algorithm for matching substructural motifs to large collections of protein structures. The algorithm consists of two phases. In the first phase the proteins are preprocessed in a fashion that allows for instant lookup of partial matches to any motif. In the second phase, partial matches for a given motif are expanded to complete matches. The general applicability of the algorithm is demonstrated with three different case studies. First, we show that we can accurately identify members of the enolase superfamily with a single motif. Next, we demonstrate how LabelHash can complement SOIPPA, an algorithm for motif identification and pairwise substructure alignment. Finally, a large collection of Catalytic Site Atlas motifs is used to benchmark the performance of the algorithm. LabelHash runs very efficiently in parallel; matching a motif against all proteins in the 95% sequence identity filtered non-redundant Protein Data Bank typically takes no more than a few minutes. The LabelHash algorithm is available through a web server and as a suite of standalone programs at http://labelhash.kavrakilab.org. The output of the LabelHash algorithm can be further analyzed with Chimera through a plugin that we developed for this purpose. CONCLUSIONS: LabelHash is an efficient, versatile algorithm for large-scale substructure matching. When LabelHash is running in parallel, motifs can typically be matched against the entire PDB on the order of minutes. The algorithm is able to identify functional homologs beyond the twilight zone of sequence identity and even beyond fold similarity. The three case studies presented in this paper illustrate the versatility of the algorithm.

Subjects

Algorithms, Proteins/chemistry, Computational Biology/methods, Protein Databases, Protein Conformation, Proteins/metabolism, Structure-Activity Relationship

17.

Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction.

Bryant, Drew H; Moll, Mark; Chen, Brian Y; Fofanov, Viacheslav Y; Kavraki, Lydia E.

BMC Bioinformatics ; 11: 242, 2010 May 11.

Article in English | MEDLINE | ID: mdl-20459833

ABSTRACT

BACKGROUND: Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels. RESULTS: This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs. CONCLUSIONS: FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated, statistically rigorous procedure for incorporating structural variation data into protein function prediction pipelines. Our work provides an unbiased, automated assessment of the structural variability of identified binding site substructures among protein structure families and a technique for exploring the relation of substructural variation to protein function. As available proteomic data continues to expand, the techniques proposed will be indispensable for the large-scale analysis and interpretation of structural data.
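The idea of all-against-all substructure comparison followed by clustering can be illustrated with a minimal sketch. The abstract does not state which dissimilarity measure or clustering procedure FASST uses, so everything below is an assumption made for illustration: binding sites are taken to be equal-length, already superposed coordinate lists, the dissimilarity is a plain coordinate RMSD, and clusters are formed by threshold single linkage. The function names (`rmsd`, `all_against_all`, `cluster`) are hypothetical.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two equal-length, pre-superposed coordinate lists (assumed input)."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def all_against_all(sites):
    """Compute the dissimilarity for every unordered pair of binding sites."""
    names = sorted(sites)
    return {(u, v): rmsd(sites[u], sites[v]) for u, v in combinations(names, 2)}


def cluster(sites, cutoff):
    """Threshold single linkage: sites closer than `cutoff` end up in one cluster."""
    parent = {name: name for name in sites}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), d in all_against_all(sites).items():
        if d <= cutoff:
            parent[find(u)] = find(v)

    groups = {}
    for name in sites:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Toy family of three two-residue "binding sites" (made-up coordinates).
example_sites = {
    "a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "b": [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    "c": [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
}
print(cluster(example_sites, cutoff=0.5))
```

In this toy input, sites `a` and `b` differ by a 0.1 shift and fall into one cluster, while `c` forms its own. A real analysis would need a superposition step and a chemically aware comparison, which this sketch leaves out.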

Subjects

Enzymes/chemistry, Proteins/chemistry, Proteomics/methods, Amino Acid Motifs, Binding Sites, Protein Databases, Enzymes/metabolism, Molecular Models, Protein Conformation, Protein Folding, Proteins/metabolism, Protein Sequence Analysis

18.

Tracing conformational changes in proteins.

Haspel, Nurit; Moll, Mark; Baker, Matthew L; Chiu, Wah; Kavraki, Lydia E.

BMC Struct Biol ; 10 Suppl 1: S1, 2010 May 17.

Article in English | MEDLINE | ID: mdl-20487508

ABSTRACT

BACKGROUND: Many proteins undergo extensive conformational changes as part of their functionality. Tracing these changes is important for understanding the way these proteins function. Traditional biophysics-based conformational search methods require a large number of calculations and are hard to apply to large-scale conformational motions. RESULTS: In this work we investigate the application of a robotics-inspired method, using backbone and limited side chain representation and a coarse grained energy function to trace large-scale conformational motions. We tested the algorithm on four well known medium to large proteins and we show that even with relatively little information we are able to trace low-energy conformational pathways efficiently. The conformational pathways produced by our methods can be further filtered and refined to produce more useful information on the way proteins function under physiological conditions. CONCLUSIONS: The proposed method effectively captures large-scale conformational changes and produces pathways that are consistent with experimental data and other computational studies. The method represents an important first step towards a larger scale modeling of more complex biological systems.

Subjects

Proteins/chemistry, Adenylate Kinase/chemistry, Algorithms, Bacteria/chemistry, Bacterial Proteins/chemistry, Carrier Proteins/chemistry, Chaperonin 60/chemistry, Computer Simulation, Escherichia coli/chemistry, Escherichia coli Proteins/chemistry, Molecular Models, Periplasmic Binding Proteins/chemistry, Protein Conformation, Thermodynamics

19.

Roadmap methods for protein folding.

Moll, Mark; Schwarz, David; Kavraki, Lydia E.

Methods Mol Biol ; 413: 219-39, 2008.

Article in English | MEDLINE | ID: mdl-18075168

ABSTRACT

Protein folding refers to the process whereby a protein assumes its intricate three-dimensional shape. This chapter reviews a class of methods for studying the folding process called roadmap methods. The goal of these methods is not to predict the folded structure of a protein, but rather to analyze the folding kinetics. It is assumed that the folded state is known. Roadmap methods maintain a graph representation of sampled conformations. By analyzing this graph one can predict structure formation order, the probability of folding, and get a coarse view of the energy landscape.
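The graph representation described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not any specific published roadmap method: a conformation is reduced to a short tuple of numbers, the energy function is supplied by the caller, two sampled conformations are connected when they lie within a fixed radius, and a path is scored by its summed uphill energy steps plus a small per-edge penalty. The names `build_roadmap` and `lowest_cost_path` are hypothetical.

```python
import heapq
import math


def build_roadmap(samples, radius):
    """Connect every pair of sampled conformations that lie within `radius`."""
    graph = {i: [] for i in range(len(samples))}
    for i, p in enumerate(samples):
        for j, q in enumerate(samples):
            if i != j and math.dist(p, q) <= radius:
                graph[i].append(j)
    return graph


def lowest_cost_path(samples, graph, energy, start, goal):
    """Dijkstra search; an edge costs its uphill energy climb plus a small step penalty."""
    best = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v in graph[u]:
            climb = max(0.0, energy(samples[v]) - energy(samples[u]))
            nd = d + climb + 1e-3
            if nd < best.get(v, math.inf):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in best:
        return None  # the roadmap does not connect start and goal
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy one-dimensional landscape: the known folded state sits at 0.0, the unfolded state at 1.0.
samples = [(x / 10,) for x in range(11)]
roadmap = build_roadmap(samples, radius=0.15)
path = lowest_cost_path(samples, roadmap, lambda c: c[0] ** 2, start=10, goal=0)
print(path)
```

In practice the samples would be drawn at random from a high-dimensional conformation space and the edges weighted by a physical energy model. Quantities such as folding probability require further analysis of the graph that this sketch does not attempt.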

Subjects

Protein Conformation, Protein Folding, Animals, Computational Biology/methods, Humans, Kinetics, Proteins/chemistry, Thermodynamics

20.

Matching of structural motifs using hashing on residue labels and geometric filtering for protein function prediction.

Moll, Mark; Kavraki, Lydia E.

Comput Syst Bioinformatics Conf ; 7: 157-68, 2008.

Article in English | MEDLINE | ID: mdl-19642277

ABSTRACT

There is an increasing number of proteins with known structure but unknown function. Determining their function would have a significant impact on understanding diseases and designing new therapeutics. However, experimental protein function determination is expensive and very time-consuming. Computational methods can facilitate function determination by identifying proteins that have high structural and chemical similarity. Our focus is on methods that determine binding site similarity. Although several such methods exist, it still remains a challenging problem to quickly find all functionally-related matches for structural motifs in large data sets with high specificity. In this context, a structural motif is a set of 3D points annotated with physicochemical information that characterize a molecular function. We propose a new method called LabelHash that creates hash tables of n-tuples of residues for a set of targets. Using these hash tables, we can quickly look up partial matches to a motif and expand those matches to complete matches. We show that by applying only very mild geometric constraints we can find statistically significant matches with extremely high specificity in very large data sets and for very general structural motifs. We demonstrate that our method requires a reasonable amount of storage when employing a simple geometric filter and further improves on the specificity of our previous work while maintaining very high sensitivity. Our algorithm is evaluated on 20 homolog classes and a non-redundant version of the Protein Data Bank as our background data set. We use cluster analysis to analyze why certain classes of homologs are more difficult to classify than others. The LabelHash algorithm is implemented on a web server at http://kavrakilab.org/labelhash/.
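The two steps named in the abstract, looking up partial matches in a hash table keyed on residue labels and then expanding them to complete matches, can be sketched as follows. This is a simplified illustration and not the LabelHash implementation. The data layout (each residue as a `(label, (x, y, z))` pair), the diameter filter used when building the table, the pairwise distance tolerance used as the geometric constraint, and the names `build_index`, `expand` and `match` are all assumptions made for this example.

```python
import math
from collections import defaultdict
from itertools import combinations, permutations


def build_index(targets, n=2, max_diameter=12.0):
    """Hash every ordered n-tuple of residues by its labels, keeping only compact tuples."""
    index = defaultdict(list)
    for name, residues in targets.items():
        for idx in permutations(range(len(residues)), n):
            pts = [residues[i][1] for i in idx]
            if all(math.dist(p, q) <= max_diameter for p, q in combinations(pts, 2)):
                index[tuple(residues[i][0] for i in idx)].append((name, idx))
    return index


def _fits(motif, residues, assigned, t_new, tol):
    """Check the next motif residue against all assigned ones by pairwise distance."""
    m_new = len(assigned)
    return all(
        abs(math.dist(motif[m_new][1], motif[m][1])
            - math.dist(residues[t_new][1], residues[t][1])) <= tol
        for m, t in enumerate(assigned)
    )


def expand(motif, residues, assigned, tol):
    """Extend a partial match one motif residue at a time until it is complete."""
    if len(assigned) == len(motif):
        yield tuple(assigned)
        return
    wanted = motif[len(assigned)][0]
    for t, (label, _) in enumerate(residues):
        if label == wanted and t not in assigned and _fits(motif, residues, assigned, t, tol):
            yield from expand(motif, residues, assigned + [t], tol)


def match(motif, targets, index, n=2, tol=1.0):
    """Look up partial matches for the first n motif residues, then expand each one."""
    key = tuple(label for label, _ in motif[:n])
    results = []
    for name, idx in index.get(key, []):
        residues = targets[name]
        partial = []
        # Re-check the geometry of the partial match before expanding it.
        if all(_fits(motif, residues, partial, t, tol) and not partial.append(t) for t in idx):
            results.extend((name, full) for full in expand(motif, residues, partial, tol))
    return results


# Made-up example: target "A" contains a translated copy of the motif, target "B" does not.
motif = [("H", (0, 0, 0)), ("D", (3, 0, 0)), ("S", (0, 4, 0))]
targets = {
    "A": [("G", (10, 10, 10)), ("H", (1, 1, 1)), ("D", (4, 1, 1)), ("S", (1, 5, 1))],
    "B": [("H", (0, 0, 0)), ("D", (9, 0, 0)), ("S", (0, 4, 0))],
}
print(match(motif, targets, build_index(targets)))
```

The table is built once for a set of targets and then reused for any motif whose first `n` labels serve as the lookup key. Enumerating all ordered tuples is quadratic or worse in the number of residues per target, so a realistic version would restrict which tuples are stored far more aggressively than this diameter filter does.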

Subjects

Algorithms, Proteins/chemistry, Proteins/metabolism, Sequence Alignment/methods, Protein Sequence Analysis/methods, Amino Acid Motifs, Amino Acid Sequence, Molecular Sequence Data, Structure-Activity Relationship
