ABSTRACT
In this study, we present the discovery and pharmacological characterization of a new series of 6-piperazinyl-7-azaindoles. These compounds demonstrate potent antagonism and selectivity against the 5-HT6 receptor. Our research primarily focuses on optimizing the lead structure and investigating the structure-activity relationship (SAR) of these compounds. Our main objective is to improve their activity and selectivity against off-target receptors. Overall, our findings contribute to the advancement of novel compounds targeting the 5-HT6 receptor. Compound 29 exhibits significant promise in terms of pharmacological, physicochemical, and ADME (Absorption, Distribution, Metabolism, and Excretion) properties. Consequently, it merits thorough exploration as a potential drug candidate due to its favorable activity profile and successful outcomes in a range of in vivo experiments.
Subjects
Pyridines, Serotonin Antagonists, Pyridines/chemistry, Serotonin Antagonists/chemistry, Structure-Activity Relationship
ABSTRACT
The high similarity between certain sub-pockets of serine proteases may lead to low selectivity of protease inhibitors. Therefore, the application of proteochemometrics (PCM), which quantifies the relationship between protein/ligand descriptors and affinity for multiple ligands and targets simultaneously, is useful to understand and improve the selectivity profiles of potential inhibitors. In this study, protein field-based PCM that uses knowledge-based and WaterMap-derived fields to describe proteins, in combination with 2D (RDKit and MOE fingerprints) and 3D (4-point pharmacophoric fingerprints and GRIND) ligand descriptors, was used to model the bioactivities of 24 homologous serine proteases and 5863 inhibitors in an integrated fashion. Of the multiple field-based PCM models generated with different ligand descriptors, the one based on RDKit fingerprints showed the best external prediction performance, with an R2(test) of 0.72 and an RMSEP of 0.81. Further, visual interpretation of the models highlights sub-pocket-specific regions that influence the affinity and selectivity of serine protease inhibitors.
ABSTRACT
A series of 1-sulfonyl-6-piperazinyl-7-azaindoles showing strong antagonistic activity at the 5-HT6 receptor (5-HT6R) was synthesized and characterized. The series was optimized to reduce activity at the D2 receptor. Based on the selectivity against this off-target and the analysis of the ADME-tox profile, compound 1c was selected for in vivo efficacy assessment, which demonstrated procognitive effects, as shown by the reversal of scopolamine-induced amnesia in an elevated plus maze test in mice. Compound 3, the demethylated version of compound 1c, was profiled against a panel of 106 receptors, channels and transporters, indicating only the D3 receptor as a major off-target. Compound 3 has been selected for this study over compound 1c because of its higher 5-HT6R/D2R binding ratio. These results have defined a new direction for the design of our pseudo-selective 5-HT6R antagonists.
Subjects
Amnesia/drug therapy, Indoles/pharmacology, Piperazines/pharmacology, Serotonin Receptors/metabolism, Serotonin Antagonists/pharmacology, Sulfones/pharmacology, Amnesia/chemically induced, Animals, Animal Behavior/drug effects, Drug Dose-Response Relationship, Humans, Indoles/chemical synthesis, Indoles/chemistry, Maze Learning/drug effects, Mice, Molecular Models, Molecular Structure, Piperazines/chemical synthesis, Piperazines/chemistry, Scopolamine, Serotonin Antagonists/chemical synthesis, Serotonin Antagonists/chemistry, Structure-Activity Relationship, Sulfones/chemical synthesis, Sulfones/chemistry
ABSTRACT
Achieving selectivity for small organic molecules toward biological targets is a main focus of drug discovery but has proven difficult, for example, for kinases because of the high similarity of their ATP binding pockets. To support the design of more selective inhibitors with fewer side effects or with altered target profiles for improved efficacy, we developed a method combining ligand- and receptor-based information. Conventional QSAR models enable one to study the interactions of multiple ligands with a single protein target, but in order to understand the interactions between multiple ligands and multiple proteins, we have used proteochemometrics, a multivariate statistics method that aims to combine and correlate both ligand and protein descriptions with affinity to receptors. The superimposed binding sites of 50 unique kinases were described by molecular interaction fields derived from knowledge-based potentials and Schrödinger's WaterMap software. Eighty ligands were described by Mold(2), Open Babel, and Volsurf descriptors. Partial least-squares regression including cross-terms, which describe the selectivity, was used for model building. This combination of methods allows interpretation and easy visualization of the models within the context of ligand binding pockets, which can be translated readily into the design of novel inhibitors.
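As a rough illustration of the cross-terms mentioned in this abstract, the sketch below forms ligand-protein cross-terms as pairwise products of ligand and protein descriptor values and appends them to the descriptor row that a regression such as PLS would then consume. The function name, the toy descriptor values, and the choice of plain pairwise products are assumptions made for illustration, not details taken from the study.

```python
def pcm_row(ligand_desc, protein_desc):
    """Build one descriptor row: ligand block, protein block, cross-term block."""
    # Cross-terms: every ligand descriptor multiplied by every protein descriptor.
    cross = [l * p for l in ligand_desc for p in protein_desc]
    return list(ligand_desc) + list(protein_desc) + cross


# Hypothetical, already-scaled descriptor values for one ligand-kinase pair.
row = pcm_row([0.5, -1.0], [2.0, 4.0, 1.0])
print(row)
```

With 2 ligand and 3 protein descriptors the row has 2 + 3 + 6 = 11 columns; the cross-term block is the part that lets a linear model express ligand effects that differ from one protein to another.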
Subjects
Adenosine Triphosphate/chemistry, Drug Discovery, Molecular Docking Simulation, Protein Kinase Inhibitors/chemistry, Protein Kinases/chemistry, Binding Sites, Humans, Least-Squares Analysis, Ligands, Multivariate Analysis, Protein Binding, Protein Conformation, Quantitative Structure-Activity Relationship
ABSTRACT
A series of 45 peptide inhibitors was designed, synthesized, and evaluated against the NS2B-NS3 proteases of the four subtypes of dengue virus, DEN-1-4. The design was based on proteochemometric models for Michaelis (Km) and cleavage rate constants (kcat) of protease substrates. This led first to octapeptides showing submicromolar or low micromolar inhibitory activities on the four proteases. Stepwise removal of cationic substrate non-prime side residues and variations in the prime side sequence resulted finally in an uncharged tetrapeptide, WYCW-NH2, with inhibitory Ki values of 4.2, 4.8, 24.4, and 11.2 µM for the DEN-1-4 proteases, respectively. Analysis of the inhibition data by proteochemometric modeling suggested the possibility for different binding poses of the shortened peptides compared to the octapeptides, which was supported by results of docking of WYCW-NH2 into the X-ray structure of DEN-3 protease.
Subjects
Oligopeptides/pharmacology, Protease Inhibitors/pharmacology, Serine Endopeptidases/metabolism, Viral Proteins/antagonists & inhibitors, Amino Acid Sequence, X-Ray Crystallography, Drug Design, Molecular Models, Oligopeptides/chemistry, Oligopeptides/metabolism, Protease Inhibitors/chemistry, Protease Inhibitors/metabolism, Protein Binding, Protein Conformation, Tertiary Protein Structure, Serine Endopeptidases/chemistry, Substrate Specificity, Viral Proteins/chemistry, Viral Proteins/metabolism
ABSTRACT
The prime side specificity of dengue protease substrates was investigated by use of proteochemometrics, a technology for drug target interaction analysis. A set of 48 internally quenched peptides was designed using statistical molecular design (SMD) and assayed with proteases of four subtypes of dengue virus (DEN-1-4) for Michaelis (K(m)) and cleavage rate constants (k(cat)). The data were subjected to proteochemometrics modeling, concomitantly modeling all peptides on all four dengue proteases, which yielded highly predictive models for both activities. Detailed analysis of the models then showed that considerably differing physico-chemical properties of amino acids contribute independently to the K(m) and k(cat) activities. For k(cat), only the P1' and P2' prime side residues were important, while for K(m) all four prime side residues, P1'-P4', were important. The models could be used to identify amino acids for each P' substrate position that are favorable for high substrate affinity and high cleavage rate, respectively.
Subjects
Serine Endopeptidases/chemistry, Serine Endopeptidases/metabolism, Combinatorial Chemistry Techniques, Dengue Virus/enzymology, Kinetics, Biological Models, Protein Binding, Proteomics, Serine Endopeptidases/genetics, Substrate Specificity
ABSTRACT
BACKGROUND: A major obstacle in the treatment of HIV is the ability of the virus to mutate rapidly into drug-resistant variants. A method for predicting the susceptibility of mutated HIV strains to antiviral agents would provide substantial clinical benefit as well as facilitate the development of new candidate drugs. Therefore, we used proteochemometrics to model the susceptibility of HIV to protease inhibitors in current use, utilizing descriptions of the physico-chemical properties of mutated HIV proteases and 3D structural property descriptions for the protease inhibitors. The descriptions were correlated to the susceptibility data of 828 unique HIV protease variants for seven protease inhibitors in current use; the data set comprised 4792 protease-inhibitor combinations. RESULTS: The model provided excellent predictability (R2 = 0.92, Q2 = 0.87) and identified general and specific features of drug resistance. The model's predictive ability was verified by external prediction in which the susceptibilities to each one of the seven inhibitors were omitted from the data set, one inhibitor at a time, and the data for the six remaining compounds were used to create new models. This analysis showed that the overall predictive ability for the omitted inhibitors was Q2(inhibitors) = 0.72. CONCLUSION: Our results show that a proteochemometric approach can provide generalized susceptibility predictions for new inhibitors. Our proteochemometric model can directly analyze inhibitor-protease interactions and facilitate treatment selection based on viral genotype. The model is available for public use, and is located at HIV Drug Research Centre.
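The one-inhibitor-at-a-time validation described in this abstract can be sketched as a leave-one-group-out split combined with a Q2 statistic. The code below is a minimal sketch under stated assumptions: the record layout, the function names, and the toy values are hypothetical, and Q2 is computed with one common definition (1 minus PRESS over the total sum of squares about the observed mean), which may differ in detail from the definition used by the authors.

```python
def q2(observed, predicted):
    """Predictive Q2 = 1 - PRESS / total sum of squares about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total


def leave_one_group_out(records, key):
    """Yield (held_out_group, train, test) with every record of one group held out."""
    groups = sorted({r[key] for r in records})
    for g in groups:
        train = [r for r in records if r[key] != g]
        test = [r for r in records if r[key] == g]
        yield g, train, test


# Hypothetical records: protease variant, inhibitor, susceptibility value.
records = [
    {"variant": "v1", "inhibitor": "A", "y": 1.0},
    {"variant": "v2", "inhibitor": "A", "y": 2.0},
    {"variant": "v1", "inhibitor": "B", "y": 1.5},
    {"variant": "v2", "inhibitor": "B", "y": 2.5},
]
splits = list(leave_one_group_out(records, "inhibitor"))
```

In use, a model would be fitted on each `train` set, its predictions for the held-out inhibitor collected, and `q2` applied to the pooled observed and predicted values.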
Subjects
Combinatorial Chemistry Techniques/methods, Drug Delivery Systems/methods, Viral Drug Resistance, HIV Protease Inhibitors/chemistry, HIV Protease/chemistry, Chemical Models, Protein Interaction Mapping/methods, Binding Sites, Computer Simulation, Protein Binding
ABSTRACT
We demonstrate the use of statistical molecular design (SMD) in the selection of peptide libraries aimed at systematically investigating antigen-antibody binding spaces. Earlier, we derived two novel antibodies by mutating the complementarity-determining region of the anti-p24 (HIV-1) single chain Fv antibody CB4-1; their affinity for a p24 epitope-homologous peptide was reduced 8- and 60-fold. The present study was devoted to exploring how peptide libraries can be designed under experimental design criteria for effective screening of peptide antigens. Several small peptide-antigen libraries were selected using SMD principles and their activities were evaluated by their binding to SPOT-synthesized peptide membranes and by fluorescence polarization (FP). The approach was able to reveal the most critical residues required for antigen binding, and finally to increase the binding activity by proper modifications of amino acids in the peptide antigen. A model of the active peptide binding pocket formed by the mutated scFv and the antigen was compatible with the information gained from the experimental data. Our results suggest that SMD approaches can be used to explore peptide antigen features essential for their interactions with antibodies.
Subjects
Antibodies/chemistry, Antibodies/immunology, Antigens/immunology, Peptide Library, Amino Acid Sequence, Amino Acid Substitution, Antibodies/genetics, Molecular Models, Molecular Sequence Data, Mutation/genetics, Protein Binding, Tertiary Protein Structure
ABSTRACT
The melanocortin (MC) system comprises unique G-protein coupled receptor pathways, which include the MC(1-5) receptors and their endogenous agonists and antagonists, the MCs and the agouti and agouti-related proteins. The MC4 receptor is an important target for development of drugs for treatment of obesity and cachexia. While natural MC peptides are selective for the MC1 receptor, some cyclic pentapeptides, such as the HS-129 peptide, show high selectivity for the MC4 receptor. Here we gained insight into the mechanisms for its recognition by MC receptors. To this end we correlated the interaction data of four HS peptide analogues with four wild-type and 14 multiple chimeric MC receptors to the binary and physicochemical descriptions of the studied entities by use of partial least squares regression, which resulted in highly valid proteochemometric models. Analysis of the models revealed that the recognition sites of the HS peptides are different from the earlier proteochemometrically mapped recognition sites of linear MSH peptides, although they overlap partially. The analysis also revealed important amino acids that explain the selectivity of the HS-129 peptide for the MC4 receptor.
Subjects
Cyclic Peptides/chemistry, Cyclic Peptides/metabolism, Melanocortin Receptors/chemistry, Melanocortin Receptors/metabolism, Agouti Signaling Protein, Amino Acid Sequence, Binding Sites, Computational Biology/methods, Intercellular Signaling Peptides and Proteins/metabolism, Melanocortins, Molecular Sequence Data, Recombinant Fusion Proteins/chemistry, Reproducibility of Results
ABSTRACT
Proteochemometrics is a technology for the study of molecular recognition based on chemometric techniques. Here we applied it to analyse the amino acids and amino acid physico-chemical properties that are involved in antibodies' recognition of peptide antigens. To this end, we used a study system comprising a diverse single chain antibody library derived from the murine anti-p24 (HIV-1) monoclonal antibody CB4-1, evaluated on peptide arrays manufactured by SPOT synthesis. The binding pattern obtained was correlated to physico-chemical descriptors (z-scales) of the antibody and peptide amino acids using partial least-squares projections to latent structures. Cross terms derived from antibody and antigen descriptors were included, which substantially improved the proteochemometric model. The final model was statistically highly satisfactory with a correlation coefficient R(2) = 0.73 and predictive ability Q(2) = 0.68. The physico-chemical properties of each interacting amino acid residue of both the peptides and the antibodies that are essential for antigen-antibody recognition could be retrieved from the model. The study shows for the first time the feasibility of using proteochemometrics to analyse the molecular recognition of antigens by antibodies.
Subjects
Monoclonal Antibodies/chemistry, Antigen-Antibody Reactions, HIV Core Protein p24/immunology, Immunoglobulin Variable Region/chemistry, Peptides/chemistry, Amino Acids/chemistry, Animals, Monoclonal Antibodies/genetics, Combinatorial Chemistry Techniques, Feasibility Studies, Immunoglobulin Variable Region/genetics, Mice, Chemical Models, Mutation, Peptides/chemical synthesis, Protein Array Analysis/methods
ABSTRACT
The aim of this study was to develop predictive quantitative structure-activity relationship (QSAR) modeling for antibody-peptide interactions. A small single chain antibody library was designed and manufactured around the murine anti-p24 (HIV-1) monoclonal antibody CB4-1 by use of statistical molecular design (SMD) principles and site directed mutagenesis, and its affinity for a p24 derived antigen was determined by fluorescence polarization. A satisfactory QSAR model (Q(2) = 0.74, R(2) = 0.88) was derived by correlating the affinity data to physicochemical property scales of the amino acids varied in the library. The model explains most of the antibody-antigen interactions of the studied set, and provides insights into the molecular mechanism involved in antigen binding.
Subjects
Antibodies/analysis, Antibodies/genetics, Mutant Proteins/analysis, Quantitative Structure-Activity Relationship, Amino Acid Motifs/immunology, Animals, Antibodies/metabolism, HIV Core Protein p24/chemistry, HIV Core Protein p24/immunology, Immunoglobulin Variable Region/chemistry, Immunoglobulin Variable Region/genetics, Mice, Molecular Models, Theoretical Models, Mutation, Peptide Library, Single-Chain Antibodies
ABSTRACT
Retroviruses affect a large number of species, from fish and birds to mammals and humans, with negative global socioeconomic impacts. Here the authors report and experimentally validate a novel approach for the analysis of the molecular networks that are involved in the recognition of substrates by retroviral proteases. Using multivariate analysis of the sequence-based physicochemical descriptions of 61 retroviral proteases comprising wild-type proteases, natural mutants, and drug-resistant forms of proteases from nine different viral species in relation to their ability to cleave 299 substrates, the authors mapped the physicochemical properties and cross-dependencies of the amino acids of the proteases and their substrates, which revealed a complex molecular interaction network of substrate recognition and cleavage. The approach allowed a detailed analysis of the molecular-chemical mechanisms involved in substrate cleavage by retroviral proteases.
Subjects
Viral Drug Resistance/physiology, Biological Models, Peptide Hydrolases/chemistry, Peptide Hydrolases/metabolism, Retroviridae Proteins/chemistry, Retroviridae/enzymology, Protein Sequence Analysis/methods, Amino Acid Sequence, Binding Sites, Computer Simulation, HIV-1/enzymology, Molecular Sequence Data, Protein Binding, Protein Interaction Mapping/methods, Signal Transduction/physiology, Structure-Activity Relationship
ABSTRACT
The interactions of alpha-MSH peptides with melanocortin receptors (MCRs) were located by proteochemometric modeling. Nine alpha-MSH peptide analogues were constructed by exchanging the Trp9 residue in the alpha-MSH core with the natural or artificial amino acids Arg, Asp, Cys, Gly, Leu, Nal, d-Nal, Pro, or d-Trp. The nine peptides created, and alpha-MSH itself, were evaluated for their interactions with the 4 wild-type MC(1,3-5)Rs and 15 multichimeric MCRs, each of the latter being constructed from three sequence segments, each taken from a different wild-type MC(1,3-5)R. The segments of the chimeric MCRs were selected according to the principles of statistical molecular design and were arranged so as to divide the receptors into five parts. By this approach, a set of 19 maximally diverse MC receptor proteins was obtained, for which the interaction activity with the 10 peptides was measured by radioligand binding, thus creating data for 190 ligand-protein pairs, which were subsequently analyzed by use of proteochemometric modeling. In proteochemometrics, the structural or physicochemical properties of both interaction partners, which represent the complementarity of the interacting entities, are used to create multivariate mathematical descriptions. (Here, physicochemical property descriptors of the receptors' and peptides' amino acids were used.) A valid, highly predictive (Q2 = 0.74) and easily interpretable model was then obtained. The model was further validated by its ability to correctly predict the affinity of alpha-MSH for new point and cassette-mutated MC4/MC1Rs, and it was then used to identify the receptor residues that are important for affording the high affinity and selectivity of alpha-MSH for the MC1R. It was revealed that these residues are located in several quite distant parts of the receptors' transmembrane cavity and must therefore exert their influence at various stages of the dynamic ligand-binding process, such as by affecting the conformation of the ligand in the vicinity of the receptor and taking part in the path of the ligand's entry into its binding pocket. Our study can be used as a template for creating high-resolution proteochemometric models when only a limited number of natural proteins and ligands are available.
Subjects
Peptides/chemistry, Melanocortin Receptors/chemistry, Tryptophan/chemistry, alpha-MSH/chemistry, Binding Sites, Humans, Molecular Models, Peptides/metabolism, Protein Binding, Melanocortin Receptors/metabolism, Recombinant Fusion Proteins/chemistry, Recombinant Fusion Proteins/genetics, Recombinant Fusion Proteins/metabolism, Structure-Activity Relationship, alpha-MSH/metabolism
ABSTRACT
Modeling and understanding protein-ligand interactions is one of the most important goals in computational drug discovery. To this end, proteochemometrics uses structural and chemical descriptors from several proteins and several ligands to induce interaction models. Here, we present a new and generalized approach in which proteins varying greatly in terms of sequence and structure are represented by a library of local substructures. Using linear regression and rule-based learning, we combine such local substructures with chemical descriptors from the ligands to model binding affinity for a training set of hydrolase and lyase enzymes. We evaluate the predictive performance of these models using cross validation and sets of unseen ligands with unknown three-dimensional structure. The models are shown to generalize by outperforming models using descriptors from only proteins or only ligands, or models using global structure similarities rather than local similarities. Thus, we demonstrate that this approach is capable of describing dependencies between local structural properties and ligands in otherwise dissimilar protein structures. These dependencies are often, but not always, associated with local substructures that are in contact with the ligands. Finally, we show that strongly bound enzyme-ligand complexes require the presence of particular local substructures, while weakly bound complexes may be described by the absence of certain properties. The results demonstrate that the alignment-independent approach using local substructures is capable of describing protein-ligand interaction for largely different proteins and hence opens the way to proteochemometric analysis of the interaction space of entire proteomes. Current approaches are limited to families of closely related proteins.
Subjects
Computational Biology/methods, Drug Design, Enzyme Inhibitors/chemistry, Enzymes/chemistry, Molecular Models, Proteomics, Algorithms, Animals, Binding Sites, Protein Databases, Humans, Ligands, Protein Binding, Protein Conformation, Proteins/chemistry
ABSTRACT
BACKGROUND: Both direct and indirect interactions determine molecular recognition of ligands by proteins. Indirect interactions can be defined as effects on recognition controlled from distant sites in the proteins, e.g. by changes in protein conformation and mobility, whereas direct interactions occur in close proximity of the protein's amino acids and the ligand. Molecular recognition is traditionally studied using three-dimensional methods, but with such techniques it is difficult to predict the effects caused by mutational changes of amino acids located far away from the ligand-binding site. We recently developed an approach, proteochemometrics, to the study of molecular recognition that models the chemical effects involved in the recognition of ligands by proteins using statistical sampling and mathematical modelling. RESULTS: A proteochemometric model was built, based on the interaction of a statistically designed protein library (melanocortin receptors) with three peptides, and used to predict which amino acids and sequence fragments are involved in direct and indirect ligand interactions. The model predictions were confirmed by directed mutagenesis. The predicted presumed direct interactions were in good agreement with previous three-dimensional studies of ligand recognition. In addition, the model could correctly predict the location of indirect effects on ligand recognition arising from distant sites in the receptors, something that three-dimensional modelling could not afford. CONCLUSION: We demonstrate experimentally that proteochemometric modelling can be used with high accuracy to predict the site of origin of direct and indirect effects on ligand recognition by proteins.
Subjects
Chemical Models, Peptides/chemistry, Protein Interaction Mapping/methods, Melanocortin Receptors/chemistry, Protein Sequence Analysis/methods, Amino Acid Sequence, Binding Sites, Computer Simulation, Molecular Sequence Data, Protein Binding
ABSTRACT
The NS3 (dengue virus non-structural protein 3) serine protease of dengue virus is an essential component for virus maturation, thus representing an attractive target for the development of antiviral drugs directed at the inhibition of polyprotein processing. In the present study, we have investigated determinants of substrate specificity of the dengue virus NS3 protease by using internally quenched fluorogenic peptides containing Abz (o-aminobenzoic acid; synonymous with anthranilic acid) and 3-nitrotyrosine (nY) representing both native and chimaeric polyprotein cleavage site sequences. By using this combinatorial approach, we were able to describe the substrate preferences and determinants of specificity for the dengue virus NS2B(H)-NS3pro protease. Kinetic parameters (kcat/K(m)) for the hydrolysis of peptide substrates with systematic truncations at the prime and non-prime side revealed a length preference for peptides spanning the P4-P3' residues, and the peptide Abz-RRRRSAGnY-amide based on the dengue virus capsid protein processing site was discovered as a novel and efficient substrate of the NS3 protease (kcat/K(m) = 11087 M(-1) s(-1)). Thus, while having confirmed the exclusive preference of the NS3 protease for basic residues at the P1 and P2 positions, we have also shown that the presence of basic amino acids at the P3 and P4 positions is a major specificity-determining feature of the dengue virus NS3 protease. Investigation of the substrate peptide Abz-KKQRAGVLnY-amide based on the NS2B/NS3 polyprotein cleavage site demonstrated an unexpectedly high degree of cleavage efficiency. Chimaeric peptides with combinations of prime and non-prime sequences spanning the P4-P4' positions of all five native polyprotein cleavage sites revealed a preponderant effect of non-prime side residues on the K(m) values, whereas variations at the prime side sequences had a higher impact on kcat.
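For readers less familiar with the kinetic parameters quoted in this abstract, the following minimal sketch shows how kcat and K(m) enter the Michaelis-Menten rate law and why kcat/K(m) is used as a measure of cleavage efficiency at low substrate concentration. The function names and all numeric values are hypothetical; the abstract reports only the ratio kcat/K(m), not the individual constants.

```python
def mm_rate(kcat, km, enzyme, substrate):
    """Michaelis-Menten initial rate v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def specificity_constant(kcat, km):
    """Catalytic efficiency kcat/Km, in M^-1 s^-1 when kcat is in s^-1 and Km in M."""
    return kcat / km


# Hypothetical values, not taken from the study: kcat = 0.1 s^-1, Km = 10 uM.
kcat, km = 0.1, 10e-6
eff = specificity_constant(kcat, km)

# At [S] << Km the rate approaches (kcat/Km) * [E] * [S].
enzyme, low_s = 1e-9, 1e-9
approx = eff * enzyme * low_s
exact = mm_rate(kcat, km, enzyme, low_s)
```

Under this rate law, a change in K(m) alone or in kcat alone shifts kcat/K(m) in the same proportion, so the ratio by itself does not show which of the two constants a sequence variation has affected.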
Subjects
Dengue Virus/enzymology, Viral Nonstructural Proteins/metabolism, Escherichia coli, Fluorescence, Peptide Library, RNA Helicases/metabolism, Serine, Serine Endopeptidases/metabolism, Substrate Specificity, Tyrosine/analogs & derivatives, Tyrosine/chemistry, ortho-Aminobenzoates/chemistry
ABSTRACT
G-Protein-coupled receptors (GPCRs) are among the most important drug targets. Because of a shortage of 3D crystal structures, most of the drug design for GPCRs has been ligand-based. We propose a novel, rough set-based proteochemometric approach to the study of receptor and ligand recognition. The approach is validated on three datasets containing GPCRs. In proteochemometrics, properties of receptors and ligands are used in conjunction and modeled to predict binding affinity. The rough set (RS) rule-based models presented herein consist of minimal decision rules that associate properties of receptors and ligands with high or low binding affinity. The information provided by the rules is then used to develop a mechanistic interpretation of interactions between the ligands and receptors included in the datasets. The first two datasets contained descriptors of melanocortin receptors and peptide ligands. The third set contained descriptors of adrenergic receptors and ligands. All the rule models induced from these datasets have a high predictive quality. An example of a decision rule is "If R1_ligand(Ethyl) and TM helix 2 position 27(Methionine) then Binding(High)." The easily interpretable rule sets are able to identify determinative receptor and ligand parts. For instance, all three models suggest that transmembrane helix 2 is determinative for high and low binding affinity. RS models show that it is possible to use rule-based models to predict ligand-binding affinities. The models may be used to gain a deeper biological understanding of the combinatorial nature of receptor-ligand interactions.
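The decision rule quoted in this abstract can be read as a conjunction of attribute-value conditions with a class label as its conclusion. The sketch below encodes that one rule and applies it to a ligand-receptor pair. The attribute keys, the extra attribute in the example pair, and the first-match classification strategy are assumptions for illustration; rough set classifiers typically combine many rules, for instance by voting, which is not shown here.

```python
def rule_fires(rule, example):
    """A rule fires when every condition (attribute == value) holds for the example."""
    return all(example.get(attr) == val for attr, val in rule["if"].items())


def classify(rules, example, default="Unknown"):
    """Return the conclusion of the first rule that fires, else a default label."""
    for rule in rules:
        if rule_fires(rule, example):
            return rule["then"]
    return default


# The rule quoted in the abstract, encoded as attribute-value conditions.
rules = [
    {"if": {"R1_ligand": "Ethyl", "TM2_pos27": "Methionine"}, "then": "High"},
]

# Hypothetical ligand-receptor pair; TM3_pos5 is an invented extra attribute.
pair = {"R1_ligand": "Ethyl", "TM2_pos27": "Methionine", "TM3_pos5": "Valine"}
label = classify(rules, pair)
```

Because each rule mentions only a few attributes, the conditions of the rules that fire point directly at the ligand and receptor positions associated with high or low binding.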
Subjects
Computational Biology/methods, Proteomics/methods, G-Protein-Coupled Receptors/chemistry, Algorithms, Animals, Area Under Curve, Protein Databases, Humans, Hydrogen-Ion Concentration, Ligands, Biological Models, Chemical Models, Molecular Models, Molecular Conformation, Peptides/chemistry, Protein Binding, Protein Conformation, Tertiary Protein Structure, alpha-MSH/chemistry
ABSTRACT
MOTIVATION: Proteochemometrics is a novel technology for the analysis of interactions of series of proteins with series of ligands. We have here customized it for analysis of large datasets and evaluated it for the modeling of the interaction of psychoactive organic amines with all five known families of amine G protein-coupled receptors (GPCRs). RESULTS: The model exploited data for the binding of 22 compounds to 31 amine GPCRs, correlating chemical descriptions and cross-descriptions of compounds and receptors to binding affinity using a novel strategy. A highly valid model (q2 = 0.76) was obtained, which was further validated by external predictions using data for 10 other entirely independent compounds, yielding a high q2ext = 0.67. Interpretation of the model reveals molecular interactions that govern the overall affinity of psychoactive organic amines for amine GPCRs, as well as their selectivity for particular amine GPCRs. The new modeling procedure allows us to obtain fully interpretable proteochemometrics models using an essentially unlimited number of ligand and protein descriptors.
Subjects
Organic Chemistry/methods, Proteomics/methods, G-Protein-Coupled Receptors/chemistry, Amines/chemistry, Binding Sites, Cluster Analysis, Factual Databases, Drug Interactions, Hydrogen-Ion Concentration, Least-Squares Analysis, Ligands, Biological Models, Chemical Models, Molecular Models, Statistical Models, Theoretical Models, Mutagenesis, Pharmacology/methods, Protein Binding
ABSTRACT
BACKGROUND: Proteochemometrics is a new methodology that allows prediction of protein function directly from real interaction measurement data without the need of 3D structure information. Several reported proteochemometric models of ligand-receptor interactions have already yielded significant insights into various forms of bio-molecular interactions. The proteochemometric models are multivariate regression models that predict binding affinity for a particular combination of features of the ligand and protein. Although proteochemometric models have already offered interesting results in various studies, no detailed statistical evaluation of their average predictive power has been performed. In particular, variable subset selection performed to date has always relied on using all available examples, a situation also encountered in microarray gene expression data analysis. RESULTS: A methodology for an unbiased evaluation of the predictive power of proteochemometric models was implemented and results from applying it to two of the largest proteochemometric data sets yet reported are presented. A double cross-validation loop procedure is used to estimate the expected performance of a given design method. The unbiased performance estimates (P2) obtained for the data sets that we consider confirm that properly designed single proteochemometric models have useful predictive power, but that a standard design based on cross validation may yield models with quite limited performance. The results also show that different commercial software packages employed for the design of proteochemometric models may yield very different and therefore misleading performance estimates. In addition, the differences in the models obtained in the double CV loop indicate that detailed chemical interpretation of a single proteochemometric model is uncertain when data sets are small. 
CONCLUSION: The double CV loop employed offers unbiased performance estimates for a given proteochemometric modelling procedure, making it possible to identify cases where the proteochemometric design does not result in useful predictive models. Chemical interpretations of single proteochemometric models are uncertain and should instead be based on all the models selected in the double CV loop employed here.
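The double cross-validation loop described in this abstract can be sketched as follows: an inner loop selects the model settings using only the outer training data, and the outer loop scores the selected model on data that played no part in that selection. This is a minimal sketch under stated assumptions: a one-dimensional k-nearest-neighbour regressor stands in for the proteochemometric model, the number of neighbours k stands in for the design choices being tuned, and the toy data and function names are hypothetical.

```python
def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def folds(data, n):
    """Split data into n interleaved folds; yield (train, test) pairs."""
    for i in range(n):
        test = data[i::n]
        train = [p for j, p in enumerate(data) if j % n != i]
        yield train, test


def press(train, test, k):
    """Sum of squared prediction errors on test for a model built on train."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in test)


def select_k(data, candidates, n_inner):
    """Inner loop: pick the k with the lowest cross-validated PRESS."""
    def cv_press(k):
        return sum(press(tr, te, k) for tr, te in folds(data, n_inner))
    return min(candidates, key=cv_press)


def double_cv(data, candidates, n_outer=4, n_inner=3):
    """Outer loop: model selection is redone on each outer training set only."""
    total_press, chosen = 0.0, []
    for outer_train, outer_test in folds(data, n_outer):
        k = select_k(outer_train, candidates, n_inner)
        chosen.append(k)
        total_press += press(outer_train, outer_test, k)
    mean_y = sum(y for _, y in data) / len(data)
    ss_total = sum((y - mean_y) ** 2 for _, y in data)
    return 1.0 - total_press / ss_total, chosen


# Hypothetical toy data: y = 2x on twelve points.
data = [(float(x), 2.0 * x) for x in range(12)]
p2, chosen_k = double_cv(data, candidates=[1, 2, 3])
```

The list of settings chosen in each outer fold is returned alongside the performance estimate because, as the abstract notes, differences between the models selected in the loop indicate how far the interpretation of any single model can be trusted.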
Subjects
Computational Biology/methods, Oligonucleotide Array Sequence Analysis/methods, Algorithms, Animals, Computer Simulation, Statistical Data Interpretation, Humans, Ligands, Biological Models, Chemical Models, Molecular Models, Statistical Models, Theoretical Models, Predictive Value of Tests, Programming Languages, Protein Binding, Protein Conformation, Rats, Alpha-1 Adrenergic Receptors/chemistry, G-Protein-Coupled Receptors/chemistry, Regression Analysis, Reproducibility of Results, Genetic Selection, Software
ABSTRACT
Proteochemometrics was applied in the analysis of the binding of organic compounds to wild-type and chimeric melanocortin receptors. Thirteen chimeric melanocortin receptors were designed based on statistical molecular design; each chimera contained parts from three of the MC(1,3-5) receptors. The binding affinities of 18 compounds were determined for these chimeric melanocortin receptors and the four wild-type melanocortin receptors. The data for 14 of these compounds were correlated to the physicochemical and structural descriptors of compounds, binary descriptors of receptor sequences, and cross-terms derived from ligand and receptor descriptors to obtain a proteochemometric model (correlation was performed using partial least-squares projections to latent structures; PLS). A well fitted mathematical model (R(2) = 0.92) with high predictive ability (Q(2) = 0.79) was obtained. In a further validation of the model, the predictive ability for ligands (Q(2)lig = 0.68) and receptors (Q(2)rec = 0.76) was estimated. The model was moreover validated by external prediction by using the data for the four additional compounds that had not at all been included in the proteochemometric model; the analysis yielded a Q(2)ext = 0.73. An interpretation of the results using PLS coefficients revealed the influence of particular properties of organic compounds on their affinity to melanocortin receptors. Three-dimensional models of melanocortin receptors were also created, and physicochemical properties of the amino acids inside the receptors' transmembrane cavity were correlated to the PLS modeling results. The importance of particular amino acids for selective binding of organic compounds was estimated and used to outline the ligand recognition site in the melanocortin receptors.