ABSTRACT
In the realm of medicinal chemistry, the primary objective is to swiftly optimize a multitude of chemical properties of a set of compounds to yield a clinical candidate poised for clinical trials. In recent years, two computational techniques, machine learning (ML) and physics-based methods, have evolved substantially and are now frequently incorporated into the medicinal chemist's toolbox to enhance the efficiency of both hit optimization and candidate design. Both computational methods come with their own set of limitations, and they are often used independently of each other. ML's capability to screen extensive compound libraries expediently is tempered by its reliance on quality data, which can be scarce, especially during early-stage optimization. Conversely, physics-based approaches like free energy perturbation (FEP) are frequently constrained by low throughput and high cost by comparison; however, physics-based methods are capable of making highly accurate binding affinity predictions. In this study, we harnessed the strength of FEP to overcome data paucity in ML by generating virtual activity data sets which then inform the training of algorithms. Here, we show that ML algorithms trained with an FEP-augmented data set can achieve predictive accuracy comparable to that of algorithms trained on experimental data from biological assays. Throughout the paper, we emphasize key mechanistic considerations that must be taken into account when aiming to augment data sets and lay the groundwork for successful implementation. Ultimately, the study advocates for the synergy of physics-based methods and ML to expedite the lead optimization process. We believe that the physics-based augmentation of ML will significantly benefit drug discovery, as these techniques continue to evolve.
Subjects
Machine Learning , Thermodynamics , Drug Discovery/methods , Algorithms , Humans
ABSTRACT
High-resolution mass spectrometry (HRMS) enables rapid chemical annotation via accurate mass measurements and matching of experimentally derived spectra with reference spectra. Reference libraries are generated from chemical standards and are therefore limited in size relative to known chemical space. To address this limitation, in silico spectra (i.e., MS/MS or MS2 spectra), predicted via Competitive Fragmentation Modeling-ID (CFM-ID) algorithms, were generated for compounds within the U.S. Environmental Protection Agency's (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database (totaling, at the time of analysis, ~765,000 substances). Experimental spectra from EPA's Non-Targeted Analysis Collaborative Trial (ENTACT) mixtures (n = 10) were then used to evaluate the performance of the in silico spectra. Overall, MS2 spectra were acquired for 377 unique compounds from the ENTACT mixtures. Approximately 53% of these compounds were correctly identified using a commercial reference library, whereas up to 50% were correctly identified as the top hit using the in silico library. Together, the reference and in silico libraries were able to correctly identify 73% of the 377 ENTACT substances. When using the in silico spectra for candidate filtering, an examination of binary classifiers showed a true positive rate (TPR) of 0.90 associated with false positive rates (FPRs) of 0.10 to 0.85, depending on the sample and method of candidate filtering. Taken together, these findings show the abilities of in silico spectra to correctly identify true positives in complex samples (at rates comparable to those observed with reference spectra), and efficiently filter large numbers of potential false positives from further consideration.
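The true and false positive rates quoted in this abstract follow the standard binary-classifier definitions. The sketch below shows how they could be computed for a candidate filter; the function name `filter_rates`, its arguments, and the example identifiers are hypothetical and are not taken from the study.

```python
def filter_rates(true_ids, candidate_ids, retained_ids):
    """Return (TPR, FPR) for a candidate filter.

    true_ids      -- compounds actually present in the sample (assumed known)
    candidate_ids -- every candidate proposed before filtering
    retained_ids  -- candidates that survive the filter
    """
    true_ids = set(true_ids)
    candidates = set(candidate_ids)
    retained = set(retained_ids)

    positives = candidates & true_ids   # candidates that are really present
    negatives = candidates - true_ids   # candidates that are not present

    true_positives = len(retained & positives)
    false_positives = len(retained & negatives)

    tpr = true_positives / len(positives) if positives else float("nan")
    fpr = false_positives / len(negatives) if negatives else float("nan")
    return tpr, fpr


# Illustrative data only: 10 candidates, 2 of them truly present,
# and a filter that keeps both true compounds plus one wrong one.
tpr, fpr = filter_rates(
    true_ids={"a", "b"},
    candidate_ids=set("abcdefghij"),
    retained_ids={"a", "b", "c"},
)
print(tpr, fpr)  # 1.0 0.125
```

A filter is useful when it keeps the TPR high while pushing the FPR down, which is the trade-off the reported TPR of 0.90 against FPRs of 0.10 to 0.85 describes.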
ABSTRACT
We describe the new Pathways plugin for the molecular visualization program Visual Molecular Dynamics (VMD). The plugin identifies and visualizes tunneling pathways and pathway families in biomolecules, and calculates relative electronic couplings. The plugin includes unique features to estimate the importance of individual atoms for mediating the coupling, to analyze the coupling sensitivity to thermal motion, and to visualize pathway fluctuations. The Pathways plugin is open source software distributed under the terms of the GNU public license.
Subjects
Azurin/chemistry , Bacterial Proteins/chemistry , Pseudomonas aeruginosa/chemistry , Software , Computer Simulation , Electron Transport , Models, Molecular , Signal Transduction
ABSTRACT
Allosteric regulation provides highly specific ligand recognition and signaling by transmembrane protein receptors. Unlike functions of protein molecular machines that rely on large-scale conformational transitions, signal transduction in receptors appears to be mediated by more subtle structural motions that are difficult to identify. We describe a theoretical model for allosteric regulation in receptors that addresses a fundamental riddle of signaling: What are the structural origins of the receptor agonism (specific signaling response to ligand binding)? The model suggests that different signaling pathways in bovine rhodopsin or human beta(2)-adrenergic receptor can be mediated by specific structural motions in the receptors. We discuss implications for understanding the receptor agonism, particularly the recently observed "biased agonism" (selected activation of specific signaling pathways), and for developing rational structure-based drug-design strategies.
Subjects
Models, Theoretical , Receptors, Adrenergic, beta-2/metabolism , Rhodopsin/metabolism , Adrenergic beta-2 Receptor Agonists , Adrenergic beta-Agonists/chemistry , Adrenergic beta-Agonists/metabolism , Adrenergic beta-Agonists/pharmacology , Algorithms , Allosteric Regulation , Animals , Binding Sites , Cattle , Humans , Ligands , Models, Molecular , Protein Conformation , Protein Structure, Secondary , Protein Structure, Tertiary , Receptors, Adrenergic, beta-2/chemistry , Rhodopsin/agonists , Rhodopsin/chemistry , Signal Transduction
ABSTRACT
Electron transfer (ET) reactions provide a nexus among chemistry, biochemistry, and physics. These reactions underpin the "power plants" and "power grids" of bioenergetics, and they challenge us to understand how evolution manipulates structure to control ET kinetics. Ball-and-stick models for the machinery of electron transfer, however, fail to capture the rich electronic and nuclear dynamics of ET molecules: these static representations disguise, for example, the range of thermally accessible molecular conformations. The influence of structural fluctuations on electron-transfer kinetics is amplified by the exponential decay of electron tunneling probabilities with distance, as well as the delicate interference among coupling pathways. Fluctuations in the surrounding medium can also switch transport between coherent and incoherent ET mechanisms--and may gate ET so that its kinetics is limited by conformational interconversion times, rather than by the intrinsic ET time scale. Moreover, preparation of a charge-polarized donor state or of a donor state with linear or angular momentum can have profound dynamical and kinetic consequences. In this Account, we establish a vocabulary to describe how the conformational ensemble and the prepared donor state influence ET kinetics in macromolecules. This framework is helping to unravel the richness of functional biological ET pathways, which have evolved within fluctuating macromolecular structures. The conceptual framework for describing nonadiabatic ET seems disarmingly simple: compute the ensemble-averaged (mean-squared) donor-acceptor (DA) tunneling interaction,
Subjects
Electron Transport , Kinetics , Macromolecular Substances/chemistry , Macromolecular Substances/metabolism , Nucleic Acids/chemistry , Nucleic Acids/metabolism , Temperature , Water/chemistry , Water/metabolism
ABSTRACT
BACKGROUND: Endocrine disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling. OBJECTIVES: In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) effort, which follows the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP). METHODS: The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays. RESULTS: The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided an average predictive accuracy of approximately 80% for the evaluation set.
DISCUSSION: The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ~875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.
Subjects
Computer Simulation , Endocrine Disruptors , Androgens , Databases, Factual , High-Throughput Screening Assays , Humans , Receptors, Androgen , United States , United States Environmental Protection Agency
ABSTRACT
Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS2) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA's DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA's CompTox Chemicals Dashboard.
ABSTRACT
BACKGROUND: Quantitative structure-activity relationship (QSAR) models are important tools used in discovering new drug candidates and identifying potentially harmful environmental chemicals. These models often face two fundamental challenges: limited amount of available biological activity data and noise or uncertainty in the activity data themselves. To address these challenges, we introduce and explore a QSAR model based on custom distance metrics in the structure-activity space. METHODS: The model is built on top of the k-nearest neighbor model, incorporating non-linearity not only in the chemical structure space, but also in the biological activity space. The model is tuned and evaluated using activity data for human estrogen receptor from the US EPA ToxCast and Tox21 databases. RESULTS: The model closely trails the CERAPP consensus model (built on top of 48 individual human estrogen receptor activity models) in agonist activity predictions and consistently outperforms the CERAPP consensus model in antagonist activity predictions. DISCUSSION: We suggest that incorporating non-linear distance metrics may significantly improve QSAR model performance when the available biological activity data are limited.
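As a rough illustration of the k-nearest neighbor baseline this abstract builds on, the sketch below predicts activity from the most similar training chemicals. It is not the custom structure-activity metric the authors describe: the Tanimoto distance, the inverse-distance weighting, the set-of-feature-indices fingerprint representation, and all names here are assumptions chosen for the example.

```python
def tanimoto_distance(fp_a, fp_b):
    """Tanimoto (Jaccard) distance between two fingerprints given as sets of feature indices."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union


def knn_predict(query_fp, training, k=3):
    """Predict activity as the inverse-distance-weighted mean over the k nearest neighbors.

    training -- list of (fingerprint, activity) pairs; activity is a number, e.g. 0.0 or 1.0
    """
    nearest = sorted(training, key=lambda item: tanimoto_distance(query_fp, item[0]))[:k]
    # A small constant keeps the weight finite when the distance is exactly zero.
    weights = [1.0 / (tanimoto_distance(query_fp, fp) + 1e-6) for fp, _ in nearest]
    weighted_sum = sum(w * activity for w, (_, activity) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Toy data: two similar actives and one dissimilar inactive.
training_set = [({1, 2, 3}, 1.0), ({1, 2, 4}, 1.0), ({7, 8, 9}, 0.0)]
print(knn_predict({1, 2, 3}, training_set, k=3))
```

A model in the spirit of the abstract would swap `tanimoto_distance` for a non-linear metric and apply a transformation in the activity space as well; those details are not given in the text above.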
Subjects
Copper/chemistry , Cytochromes/chemistry , Heme/chemistry , Nanotechnology/methods , Oxidoreductases/chemistry , Cytochrome b Group , Electron Transport , Electrons , Escherichia coli/metabolism , Escherichia coli/ultrastructure , Escherichia coli Proteins , Light , Molecular Conformation , Oxygen Consumption , Photochemistry/methods , Photolysis , Spectrophotometry/methods
ABSTRACT
BACKGROUND: Humans are exposed to thousands of man-made chemicals in the environment. Some chemicals mimic natural endocrine hormones and, thus, have the potential to be endocrine disruptors. Most of these chemicals have never been tested for their ability to interact with the estrogen receptor (ER). Risk assessors need tools to prioritize chemicals for evaluation in costly in vivo tests, for instance, within the U.S. EPA Endocrine Disruptor Screening Program. OBJECTIVES: We describe a large-scale modeling project called CERAPP (Collaborative Estrogen Receptor Activity Prediction Project) and demonstrate the efficacy of using predictive computational models trained on high-throughput screening data to evaluate thousands of chemicals for ER-related activity and prioritize them for further testing. METHODS: CERAPP combined multiple models developed in collaboration with 17 groups in the United States and Europe to predict ER activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were employed, mostly using a common training set of 1,677 chemical structures provided by the U.S. EPA, to build a total of 40 categorical and 8 continuous models for binding, agonist, and antagonist ER activity. All predictions were evaluated on a set of 7,522 chemicals curated from the literature. To overcome the limitations of single models, a consensus was built by weighting models on scores based on their evaluated accuracies. RESULTS: Individual model scores ranged from 0.69 to 0.85, showing high prediction reliabilities. Out of the 32,464 chemicals, the consensus model predicted 4,001 chemicals (12.3%) as high priority actives and 6,742 potential actives (20.8%) to be considered for further testing. CONCLUSION: This project demonstrated the possibility to screen large libraries of chemicals using a consensus of different in silico approaches. 
This concept will be applied in future projects related to other end points. CITATION: Mansouri K, Abdelaziz A, Rybacka A, Roncaglioni A, Tropsha A, Varnek A, Zakharov A, Worth A, Richard AM, Grulke CM, Trisciuzzi D, Fourches D, Horvath D, Benfenati E, Muratov E, Wedebye EB, Grisoni F, Mangiatordi GF, Incisivo GM, Hong H, Ng HW, Tetko IV, Balabin I, Kancherla J, Shen J, Burton J, Nicklaus M, Cassotti M, Nikolov NG, Nicolotti O, Andersson PL, Zang Q, Politi R, Beger RD, Todeschini R, Huang R, Farag S, Rosenberg SA, Slavov S, Hu X, Judson RS. 2016. CERAPP: Collaborative Estrogen Receptor Activity Prediction Project. Environ Health Perspect 124:1023-1033; http://dx.doi.org/10.1289/ehp.1510267.
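The consensus step described in this abstract, weighting models by scores based on their evaluated accuracies, can be sketched as a weighted average of individual predictions. The exact weighting scheme is not given here, so the linear weighting below, the function name `weighted_consensus`, and the example model names are assumptions for illustration only.

```python
def weighted_consensus(predictions, scores):
    """Combine per-model activity calls into one consensus value in [0, 1].

    predictions -- dict: model name -> 1 (active), 0 (inactive), or None (no prediction)
    scores      -- dict: model name -> evaluation score used as the model's weight
    Returns None when no model made a prediction.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for model, call in predictions.items():
        if call is None:
            continue  # model did not predict this chemical
        weight = scores[model]
        weighted_sum += weight * call
        total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


# Hypothetical models with scores inside the 0.69 to 0.85 range reported above.
model_scores = {"model_A": 0.85, "model_B": 0.69, "model_C": 0.80}
calls = {"model_A": 1, "model_B": 0, "model_C": 1}
print(weighted_consensus(calls, model_scores))
```

With this scheme a chemical called active by the better-scoring models receives a consensus value closer to 1, and a threshold on that value could separate high-priority actives from potential actives.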
Subjects
Endocrine Disruptors/toxicity , Receptors, Estrogen/metabolism , Toxicity Tests , Computer Simulation , Endocrine Disruptors/classification , Environmental Policy , Quantitative Structure-Activity Relationship , United States
Subjects
Aquaporins/chemistry , Proteins/chemistry , Proton-Translocating ATPases/chemistry , Water/chemistry , Amino Acid Motifs , Catalysis , Cell Membrane/metabolism , Computer Simulation , Fibronectins/chemistry , Hydrogen Bonding , Models, Molecular , Protein Conformation , Protein Denaturation , Protein Folding , Protein Structure, Secondary , Substrate Specificity , Time Factors
ABSTRACT
In the soft-wet environment of biomolecular electron transfer, it is possible that structural fluctuations could wash out medium-specific electronic effects on electron tunneling rates. We show that beyond a transition distance (2-3 Å in water and 6-7 Å in proteins), fluctuation contributions to the mean-squared donor-to-acceptor tunneling matrix element are likely to dominate over the average matrix element. Even though fluctuations dominate the tunneling mechanism at larger distances, we find that the protein fold is "remembered" by the electronic coupling, and structure remains a key determinant of electron transfer kinetics.
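The comparison made in this abstract rests on the identity that the mean-squared matrix element equals the squared average plus the variance of the fluctuations. The sketch below splits a series of coupling values into those two contributions; the function name, the returned ratio, and the sample numbers are illustrative assumptions and do not come from the study.

```python
def coupling_statistics(h_da_samples):
    """Split the mean-squared coupling into average and fluctuation parts.

    h_da_samples -- donor-acceptor matrix elements sampled over conformations
                    (arbitrary units)
    Returns (mean_squared, squared_mean, variance, ratio), where
    mean_squared = squared_mean + variance and ratio = squared_mean / mean_squared.
    """
    n = len(h_da_samples)
    mean = sum(h_da_samples) / n
    mean_squared = sum(h * h for h in h_da_samples) / n
    squared_mean = mean * mean
    variance = mean_squared - squared_mean
    ratio = squared_mean / mean_squared if mean_squared > 0 else float("nan")
    return mean_squared, squared_mean, variance, ratio


# A coupling that flips sign between conformations: the average vanishes,
# so the mean-squared value comes entirely from fluctuations.
print(coupling_statistics([1.0, -1.0, 1.0, -1.0]))  # (1.0, 0.0, 1.0, 0.0)

# A rigid medium with constant coupling: fluctuations contribute nothing.
print(coupling_statistics([2.0, 2.0]))  # (4.0, 4.0, 0.0, 1.0)
```

A ratio near 1 means the average matrix element dominates, while a ratio near 0 corresponds to the fluctuation-dominated regime the abstract describes beyond the transition distance.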
Subjects
Models, Biological , Models, Chemical , Proteins/chemistry , Azurin/chemistry , Azurin/metabolism , Cytochrome b Group/chemistry , Cytochrome b Group/metabolism , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/metabolism , Models, Molecular , Myoglobin/chemistry , Myoglobin/metabolism , Protein Structure, Secondary , Proteins/metabolism , Thermodynamics
ABSTRACT
Structured water molecules near redox cofactors were found recently to accelerate electron-transfer (ET) kinetics in several systems. Theoretical study of interprotein electron transfer across an aqueous interface reveals three distinctive electronic coupling mechanisms that we describe here: (i) a protein-mediated regime when the two proteins are in van der Waals contact; (ii) a structured water-mediated regime featuring anomalously weak distance decay at relatively close protein-protein contact distances; and (iii) a bulk water-mediated regime at large distances. Our analysis explains a range of otherwise puzzling biological ET kinetic data and provides a framework for including explicit water-mediated tunneling effects on ET kinetics.
Subjects
Cytochromes b5/metabolism , Electron Transport , Water/chemistry , Animals , Cattle , Chemical Phenomena , Chemistry, Physical , Cytochromes b5/chemistry , Kinetics , Models, Chemical , Porphyrins/chemistry , Protein Conformation , Thermodynamics
ABSTRACT
We compute the autocorrelation function of the donor-acceptor tunneling matrix element
Subjects
Azurin/chemistry , Electron Transport , Models, Molecular
ABSTRACT
Cytochrome c oxidase mediates the final step of electron transfer reactions in the respiratory chain, catalyzing the transfer between cytochrome c and molecular oxygen and concomitantly pumping protons across the inner mitochondrial membrane. We investigate the electron transfer reactions in cytochrome c oxidase, particularly the control of the effective electronic coupling by the nuclear thermal motion. The effective coupling is calculated using the Green's function technique with an extended Hückel-level electronic Hamiltonian, combined with all-atom molecular dynamics of the protein in a native (membrane and solvent) environment. The effective coupling between Cu(A) and heme a is found to be dominated by the pathway that starts from His(B204). The coupling between heme a and heme a(3) is dominated by a through-space jump between the two heme rings rather than by covalent pathways. In both steps, the effective electronic coupling is robust to the thermal nuclear vibrations, thereby providing fast and efficient electron transfer.
Subjects
Copper/chemistry , Electron Transport Complex IV/chemistry , Energy Transfer , Heme/chemistry , Models, Chemical , Models, Molecular , Computer Simulation , Electron Transport , Enzyme Activation , Kinetics , Oxidation-Reduction
ABSTRACT
F(1)F(o)-ATP synthase is a ubiquitous membrane protein complex that efficiently converts a cell's transmembrane proton gradient into chemical energy stored as ATP. The protein is made of two molecular motors, F(o) and F(1), which are coupled by a central stalk. The membrane unit, F(o), converts the transmembrane electrochemical potential into mechanical rotation of a rotor in F(o) and the physically connected central stalk. Based on available data of individual components, we have built an all-atom model of F(o) and investigated through molecular dynamics simulations and mathematical modeling the mechanism of torque generation in F(o). The mechanism that emerged generates the torque at the interface of the a- and c-subunits of F(o) through side groups aSer-206, aArg-210, and aAsn-214 of the a-subunit and side groups cAsp-61 of the c-subunits. The mechanism couples protonation/deprotonation of two cAsp-61 side groups, juxtaposed to the a-subunit at any moment in time, to rotations of individual c-subunit helices as well as rotation of the entire c-subunit. The aArg-210 side group orients the cAsp-61 side groups and, thereby, establishes proton transfer via aSer-206 and aAsn-214 to proton half-channels, while preventing direct proton transfer between the half-channels. A mathematical model proves the feasibility of torque generation by the stated mechanism against loads typical during ATP synthesis; the essential model characteristics, e.g., helix and subunit rotation and associated friction constants, have been tested and furnished by steered molecular dynamics simulations.