Results 1 - 18 of 18
1.
Drug Metab Dispos ; 52(2): 69-79, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-37973374

ABSTRACT

Lung cancer is the leading cause of cancer deaths worldwide. We found that the cytochrome P450 isoform CYP4F11 is significantly overexpressed in patients with lung squamous cell carcinoma. CYP4F11 is a fatty acid ω-hydroxylase and catalyzes the production of the lipid mediator 20-hydroxyeicosatetraenoic acid (20-HETE) from arachidonic acid. 20-HETE promotes cell proliferation and migration in cancer. Inhibition of 20-HETE-generating cytochrome P450 enzymes has been implicated as novel cancer therapy for more than a decade. However, the exact role of CYP4F11 and its potential as drug target for lung cancer therapy has not been established yet. Thus, we performed a transient knockdown of CYP4F11 in the lung cancer cell line NCI-H460. Knockdown of CYP4F11 significantly inhibits lung cancer cell proliferation and migration while the 20-HETE production is significantly reduced. For biochemical characterization of CYP4F11-inhibitor interactions, we generated recombinant human CYP4F11. Spectroscopic ligand binding assays were conducted to evaluate CYP4F11 binding to the unselective CYP4A/F inhibitor HET0016. HET0016 shows high affinity to recombinant CYP4F11 and inhibits CYP4F11-mediated 20-HETE production in vitro with a nanomolar IC50. Cross evaluation of HET0016 in NCI-H460 cells shows that lung cancer cell proliferation is significantly reduced together with 20-HETE production. However, HET0016 also displays antiproliferative effects that are not 20-HETE mediated. Future studies aim to establish the role of CYP4F11 in lung cancer and the underlying mechanism and investigate the potential of CYP4F11 as a therapeutic target for lung cancer. SIGNIFICANCE STATEMENT: Lung cancer is a deadly cancer with limited treatment options. Cytochrome P450 4F11 (CYP4F11) is significantly upregulated in lung squamous cell carcinoma. 
Knockdown of CYP4F11 in a lung cancer cell line significantly attenuates cell proliferation and migration with reduced production of the lipid mediator 20-hydroxyeicosatetraenoic acid (20-HETE). Studies with the unselective inhibitor HET0016 show a high inhibitory potency of CYP4F11-mediated 20-HETE production using recombinant enzyme. Overall, our studies demonstrate the potential of targeting CYP4F11 for new transformative lung cancer treatment.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Carcinoma, Squamous Cell , Lung Neoplasms , Humans , Lung Neoplasms/drug therapy , Fatty Acids , Cytochrome P-450 Enzyme System/metabolism , Cytochrome P-450 CYP4A , Eicosanoids , Hydroxyeicosatetraenoic Acids/metabolism , Cytochrome P450 Family 4/genetics
2.
J Chem Inf Model ; 63(23): 7401-7411, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38000780

ABSTRACT

We performed exhaustive torsion sampling on more than 3 million compounds using the GFN2-xTB method and performed a comparison of experimental crystallographic and gas-phase conformers. Many conformer sampling methods derive torsional angle distributions from experimental crystallographic data, limiting the torsion preferences to molecules that must be stable, synthetically accessible, and able to be crystallized. In this work, we evaluate the differences in torsional preferences of experimental crystallographic geometries and gas-phase computed conformers from a broad selection of compounds to determine whether torsional angle distributions obtained from semiempirical methods are suitable priors for conformer sampling. We find that differences in torsion preferences can be mostly attributed to a lack of available experimental crystallographic data with small deviations derived from gas-phase geometry differences. GFN2 demonstrates the ability to provide accurate and reliable torsional preferences that can provide a basis for new methods free from the limitations of experimental data collection. We provide Gaussian-based fits and sampling distributions suitable for torsion sampling and propose an alternative to the widely used "experimental torsion and knowledge distance geometry" (ETKDG) method using quantum torsion-derived distance geometry (QTDG) methods.
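The abstract above mentions Gaussian-based fits used as sampling distributions for torsion angles. As a rough illustration only, the sketch below draws torsion angles from a mixture of Gaussians and wraps them onto the periodic range. The function name, the three-well profile, and all weights, means, and widths are hypothetical and are not taken from the paper's fits.

```python
import random


def sample_torsion(components, rng):
    """Draw one torsion angle (degrees) from a Gaussian mixture.

    components: list of (weight, mean_deg, sigma_deg) tuples.
    These values are illustrative assumptions, not published fits.
    """
    weights = [w for w, _, _ in components]
    # Pick one mixture component in proportion to its weight.
    _, mean, sigma = rng.choices(components, weights=weights, k=1)[0]
    angle = rng.gauss(mean, sigma)
    # Torsions are periodic, so wrap the draw back toward [-180, 180).
    return (angle + 180.0) % 360.0 - 180.0


# Hypothetical three-well profile (gauche-, anti, gauche+).
THREE_WELL = [(0.2, -60.0, 10.0), (0.6, 180.0, 10.0), (0.2, 60.0, 10.0)]

rng = random.Random(0)
samples = [sample_torsion(THREE_WELL, rng) for _ in range(1000)]
```

Wrapping after sampling keeps the anti well near ±180° contiguous, which a plain unwrapped Gaussian would not.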

3.
J Chem Inf Model ; 61(6): 2530-2536, 2021 Jun 28.
Article in English | MEDLINE | ID: mdl-34038123

ABSTRACT

While accurate prediction of aqueous solubility remains a challenge in drug discovery, machine learning (ML) approaches have become increasingly popular for this task. For instance, in the Second Challenge to Predict Aqueous Solubility (SC2), all groups utilized machine learning methods in their submissions. We present SolTranNet, a molecule attention transformer to predict aqueous solubility from a molecule's SMILES representation. Atypically, we demonstrate that larger models perform worse at this task, with SolTranNet's final architecture having 3,393 parameters while outperforming linear ML approaches. SolTranNet has a 3-fold scaffold split cross-validation root-mean-square error (RMSE) of 1.459 on AqSolDB and an RMSE of 1.711 on a withheld test set. We also demonstrate that, when used as a classifier to filter out insoluble compounds, SolTranNet achieves a sensitivity of 94.8% on the SC2 data set and is competitive with the other methods submitted to the competition. SolTranNet is distributed via pip, and its source code is available at https://github.com/gnina/SolTranNet.


Subject(s)
Machine Learning , Water , Software , Solubility
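The SolTranNet abstract above reports a root-mean-square error (RMSE) for regression and a sensitivity for the classification use case. A minimal sketch of both metrics follows. The log-solubility values, the cutoff of -4.0, and the choice of "soluble" as the positive class are assumptions made for illustration and are not taken from the paper.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def sensitivity(pred_labels, true_labels):
    """True-positive rate: TP / (TP + FN)."""
    tp = sum(1 for p, t in zip(pred_labels, true_labels) if p and t)
    fn = sum(1 for p, t in zip(pred_labels, true_labels) if not p and t)
    return tp / (tp + fn)


# Hypothetical predicted and measured log-solubility values.
predicted = [-1.0, -4.5, -5.0, -2.0]
observed = [-1.5, -2.5, -4.0, -4.5]

CUTOFF = -4.0  # assumed threshold: values above it count as soluble
pred_soluble = [p > CUTOFF for p in predicted]
true_soluble = [o > CUTOFF for o in observed]

error = rmse(predicted, observed)
recall = sensitivity(pred_soluble, true_soluble)
```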
4.
J Phys Chem A ; 125(9): 1987-1993, 2021 Mar 11.
Article in English | MEDLINE | ID: mdl-33630611

ABSTRACT

While many machine learning (ML) methods, particularly deep neural networks, have been trained for density functional and quantum chemical energies and properties, the vast majority of these methods focus on single-point energies. In principle, such ML methods, once trained, offer thermochemical accuracy on par with density functional and wave function methods but at speeds comparable to traditional force fields or approximate semiempirical methods. So far, most efforts have focused on optimized equilibrium single-point energies and properties. In this work, we evaluate the accuracy of several leading ML methods across a range of bond potential energy curves and torsional potentials. The methods were trained on the existing ANI-1 training set, calculated using the ωB97X/6-31G(d) single points at nonequilibrium geometries. We find that across a range of small molecules, several methods offer both qualitative accuracy (e.g., correct minima, both repulsive and attractive bond regions, anharmonic shape, and single minima) and quantitative accuracy in terms of the mean absolute percent error near the minima. At the moment, ANI-2x, FCHL, and a new libmolgrid-based convolutional neural net, the Colorful CNN, show good performance.

5.
J Chem Inf Model ; 60(3): 1079-1084, 2020 Mar 23.
Article in English | MEDLINE | ID: mdl-32049525

ABSTRACT

We describe libmolgrid, a general-purpose library for representing three-dimensional molecules using multidimensional arrays of voxelized molecular data. libmolgrid provides functionality for sampling batches of data suited to machine learning workflows, and it also supports temporal and spatial recurrences over that data to facilitate work with convolutional and recurrent neural networks. It was designed for seamless integration with popular deep learning frameworks and features optimized performance by leveraging graphics processing units (GPUs). libmolgrid is a free and open source project (GPLv2) that aims to democratize grid-based modeling in computational chemistry.


Subject(s)
Deep Learning , Machine Learning , Neural Networks, Computer
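The libmolgrid abstract above describes representing three-dimensional molecules as arrays of voxelized data. The sketch below shows the general idea in plain Python by summing a Gaussian density per atom onto a cubic grid. It does not use or reproduce libmolgrid's API: the function name, parameters, and defaults are hypothetical, and it uses a single channel where a real library would typically separate atom types.

```python
import math


def voxelize(atoms, center, dimension=4.0, resolution=1.0, sigma=1.0):
    """Return a cubic grid (nested lists) of summed Gaussian atom densities.

    atoms: list of (x, y, z) coordinates in angstroms.
    dimension: edge length of the cube; resolution: grid spacing.
    All names and defaults are illustrative assumptions.
    """
    n = int(round(dimension / resolution)) + 1
    origin = [c - dimension / 2.0 for c in center]
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                point = (origin[0] + i * resolution,
                         origin[1] + j * resolution,
                         origin[2] + k * resolution)
                for atom in atoms:
                    # Squared distance from this grid point to the atom.
                    d2 = sum((p - a) ** 2 for p, a in zip(point, atom))
                    grid[i][j][k] += math.exp(-d2 / (2.0 * sigma ** 2))
    return grid


# One atom at the grid center gives a 5 x 5 x 5 grid peaked in the middle.
grid = voxelize([(0.0, 0.0, 0.0)], center=(0.0, 0.0, 0.0))
```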
6.
J Chem Inf Model ; 60(9): 4200-4215, 2020 Sep 28.
Article in English | MEDLINE | ID: mdl-32865404

ABSTRACT

One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard data set of sufficient size to compare performance between models. We present a new data set for structure-based machine learning, the CrossDocked2020 set, with 22.5 million poses of ligands docked into multiple similar binding pockets across the Protein Data Bank, and perform a comprehensive evaluation of grid-based convolutional neural network (CNN) models on this data set. We also demonstrate how the partitioning of the training data and test data can impact the results of models trained with the PDBbind data set, how performance improves by adding more lower-quality training data, and how training with docked poses imparts pose sensitivity to the predicted affinity of a complex. Our best performing model, an ensemble of five densely connected CNNs, achieves a root mean squared error of 1.42 and Pearson R of 0.612 on the affinity prediction task, an AUC of 0.956 at binding pose classification, and a 68.4% accuracy at pose selection on the CrossDocked2020 set. By providing data splits for clustered cross-validation and the raw data for the CrossDocked2020 set, we establish the first standardized data set for training machine learning models to recognize ligands in noncognate target structures while also greatly expanding the number of poses available for training. In order to facilitate community adoption of this data set for benchmarking protein-ligand binding affinity prediction, we provide our models, weights, and the CrossDocked2020 set at https://github.com/gnina/models.


Subject(s)
Drug Design , Neural Networks, Computer , Databases, Protein , Ligands , Protein Binding
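The CrossDocked2020 abstract above refers to data splits for clustered cross-validation, where similar targets are kept together so that evaluation measures generalization to new targets. A minimal sketch of that idea follows: whole clusters are assigned to folds, so no cluster spans two folds. The identifiers, cluster labels, and the greedy balancing rule are hypothetical; the abstract does not state how the clusters were defined or distributed.

```python
def clustered_folds(cluster_of, n_folds=3):
    """Assign items to folds so that a cluster never spans two folds.

    cluster_of: dict mapping item id -> cluster label.
    """
    clusters = {}
    for item, cluster in cluster_of.items():
        clusters.setdefault(cluster, []).append(item)
    folds = [[] for _ in range(n_folds)]
    # Place the largest clusters first, each into the currently smallest fold.
    for cluster in sorted(clusters, key=lambda c: (-len(clusters[c]), c)):
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(clusters[cluster])
    return folds


# Hypothetical complexes labeled by a hypothetical target cluster.
cluster_of = {
    "1abc": "kinase", "2def": "kinase", "3ghi": "protease",
    "4jkl": "gpcr", "5mno": "gpcr", "6pqr": "gpcr",
}
folds = clustered_folds(cluster_of, n_folds=2)
```

Holding out one fold at a time then tests a model only on clusters it never saw during training.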
7.
Proteins ; 85(10): 1944-1956, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28688107

ABSTRACT

NMR chemical shifts can be computed from molecular dynamics (MD) simulations using a template matching approach and a library of conformers containing chemical shifts generated from ab initio quantum calculations. This approach has potential utility for evaluating the force fields that underlie these simulations. Imperfections in force fields generate flawed atomic coordinates. Chemical shifts obtained from flawed coordinates have errors that can be traced back to these imperfections. We use this approach to evaluate a series of AMBER force fields that have been refined over the course of two decades (ff94, ff96, ff99SB, ff14SB, ff14ipq, and ff15ipq). For each force field a series of MD simulations are carried out for eight model proteins. The calculated chemical shifts for the 1H, 15N, and 13Cα atoms are compared with experimental values. Initial evaluations are based on root mean squared (RMS) errors at the protein level. These results are further refined based on secondary structure and the types of atoms involved in nonbonded interactions. The best chemical shift for identifying force field differences is the shift associated with peptide protons. Examination of the model proteins on a residue by residue basis reveals that force field performance is highly dependent on residue position. Examination of the time course of nonbonded interactions at these sites provides explanations for chemical shift differences at the atomic coordinate level. Results show that the newer ff14ipq and ff15ipq force fields developed with the implicitly polarized charge method perform better than the older force fields.


Subject(s)
Peptides/chemistry , Protein Conformation , Proteins/chemistry , Molecular Dynamics Simulation , Nuclear Magnetic Resonance, Biomolecular , Protein Structure, Secondary , Quantum Theory , Static Electricity
9.
Res Sq ; 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39399689

ABSTRACT

Molecular interactions between proteins and their ligands are important for drug design. A pharmacophore consists of favorable molecular interactions in a protein binding site and can be utilized for virtual screening. Pharmacophores are easiest to identify from co-crystal structures of a bound protein-ligand complex. In this work, however, we develop a deep learning method that can identify pharmacophores in the absence of a ligand. Specifically, we train a CNN model to identify potential favorable interactions in the binding site, and develop a deep geometric Q-learning algorithm that attempts to select an optimal subset of these interaction points to form a pharmacophore. With this algorithm, we show better prospective virtual screening performance, in terms of F1 scores, on the DUD-E dataset than random selection of ligand-identified features from co-crystal structures. We also conduct experiments on the LIT-PCBA dataset and show that it provides efficient solutions for identifying active molecules. Finally, we test our method by screening the COVID moonshot dataset and show that it would be effective in identifying prospective lead molecules even in the absence of fragment screening experiments. Alongside, we provide a Google Colab notebook for ease of use of the developed method.

10.
bioRxiv ; 2024 May 04.
Article in English | MEDLINE | ID: mdl-38746274

ABSTRACT

The explosion of sequence data has allowed the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), corresponding sequences are either co-embedded followed by post-hoc integration or the sequences are concatenated prior to embedding. Interestingly, no method utilizes a language representation of the interaction itself. We developed an interaction LM (iLM), which uses a novel language to represent interactions between protein/peptide sequences. Sliding Window Interaction Grammar (SWING) leverages differences in amino acid properties to generate an interaction vocabulary. This vocabulary is the input into a LM followed by a supervised prediction step where the LM's representations are used as features. SWING was first applied to predicting peptide:MHC (pMHC) interactions. SWING was not only successful at generating Class I and Class II models that have comparable prediction to state-of-the-art approaches, but the unique Mixed Class model was also successful at jointly predicting both classes. Further, the SWING model trained only on Class I alleles was predictive for Class II, a complex prediction task not attempted by any existing approach. For de novo data, using only Class I or Class II data, SWING also accurately predicted Class II pMHC interactions in murine models of SLE (MRL/lpr model) and T1D (NOD model), that were validated experimentally. To further evaluate SWING's generalizability, we tested its ability to predict the disruption of specific protein-protein interactions by missense mutations. Although modern methods like AlphaMissense and ESM1b can predict interfaces and variant effects/pathogenicity per mutation, they are unable to predict interaction-specific disruptions. 
SWING was successful at accurately predicting the impact of both Mendelian mutations and population variants on PPIs. This is the first generalizable approach that can accurately predict interaction-specific disruptions by missense mutations with only sequence information. Overall, SWING is a first-in-class generalizable zero-shot iLM that learns the language of PPIs.

11.
ACS Omega ; 8(44): 41680-41688, 2023 Nov 07.
Article in English | MEDLINE | ID: mdl-37970017

ABSTRACT

The success of machine learning is, in part, due to a large volume of data available to train models. However, the amount of training data for structure-based molecular property prediction remains limited. The previously described CrossDocked2020 data set expanded the available training data for binding pose classification in a molecular docking setting but did not address expanding the amount of receptor-ligand binding affinity data. We present experiments demonstrating that imputing binding affinity labels for complexes without experimentally determined binding affinities is a viable approach to expanding training data for structure-based models of receptor-ligand binding affinity. In particular, we demonstrate that utilizing imputed labels from a convolutional neural network trained only on the affinity data present in CrossDocked2020 results in a small improvement in the binding affinity regression performance, despite the additional sources of noise that such imputed labels add to the training data. The code, data splits, and imputation labels utilized in this paper are freely available at https://github.com/francoep/ImputationPaper.

12.
Chem Sci ; 12(23): 8036-8047, 2021 May 08.
Article in English | MEDLINE | ID: mdl-34194693

ABSTRACT

Machine learning has been increasingly applied to the field of computer-aided drug discovery in recent years, leading to notable advances in binding-affinity prediction, virtual screening, and QSAR. Surprisingly, it is less often applied to lead optimization, the process of identifying chemical fragments that might be added to a known ligand to improve its binding affinity. We here describe a deep convolutional neural network that predicts appropriate fragments given the structure of a receptor/ligand complex. In an independent benchmark of known ligands with missing (deleted) fragments, our DeepFrag model selected the known (correct) fragment from a set of over 6500 about 58% of the time. Even when the known/correct fragment was not selected, the top fragment was often chemically similar and may well represent a valid substitution. We release our trained DeepFrag model and associated software under the terms of the Apache License, Version 2.0.

13.
PLoS One ; 14(8): e0220113, 2019.
Article in English | MEDLINE | ID: mdl-31430292

ABSTRACT

Recently much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand x-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely-used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine learning based methodology development.


Subject(s)
Databases, Pharmaceutical , Deep Learning , Drug Evaluation, Preclinical/methods , Pharmaceutical Preparations/chemistry , Ligands , Pharmaceutical Preparations/metabolism , Proteins/metabolism , User-Computer Interface
14.
Protein Sci ; 27(1): 229-232, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28921842

ABSTRACT

AnchorQuery (http://anchorquery.csb.pitt.edu) is a web application for rational structure-based design of protein-protein interaction (PPI) inhibitors. A specialized variant of pharmacophore search is used to rapidly screen libraries consisting of more than 31 million synthesizable compounds biased by design to preferentially target PPIs. Every library compound is accessible through one-step multi-component reaction (MCR) chemistry and contains an anchor motif that is bioisosteric to an amino acid residue. The inclusion of this anchor not only biases the compounds to interact with proteins, it also enables a rapid, sublinear time pharmacophore search algorithm. AnchorQuery provides all the tools necessary for users to perform online interactive virtual screens of millions of compounds, including pharmacophore elucidation and search, and enrichment analysis. Accessibility: AnchorQuery is freely accessible at http://anchorquery.csb.pitt.edu.


Subject(s)
Algorithms , Databases, Protein , Molecular Docking Simulation , Software , Amino Acid Motifs
15.
Protein Sci ; 27(1): 112-128, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28836357

ABSTRACT

The Adaptive Poisson-Boltzmann Solver (APBS) software was developed to solve the equations of continuum electrostatics for large biomolecular assemblages that have provided impact in the study of a broad range of chemical, biological, and biomedical applications. APBS addresses the three key technology challenges for understanding solvation and electrostatics in biomedical applications: accurate and efficient models for biomolecular solvation and electrostatics, robust and scalable software for applying those theories to biomolecular systems, and mechanisms for sharing and analyzing biomolecular electrostatics data in the scientific community. To address new research applications and advancing computational capabilities, we have continually updated APBS and its suite of accompanying software since its release in 2001. In this article, we discuss the models and capabilities that have recently been implemented within the APBS software package including a Poisson-Boltzmann analytical and a semi-analytical solver, an optimized boundary element solver, a geometry-based geometric flow solvation model, a graph theory-based algorithm for determining pKa values, and an improved web-based visualization tool for viewing electrostatics.


Subject(s)
Models, Molecular , Software , Static Electricity
16.
Comput Theor Chem ; 1099: 152-166, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-29109930

ABSTRACT

Accurate chemical shifts for the atoms in molecular dynamics (MD) trajectories can be obtained from quantum mechanical (QM) calculations that depend solely on the coordinates of the atoms in the localized regions surrounding atoms of interest. If these coordinates are correct and the sample size is adequate, the ensemble average of these chemical shifts should be equal to the chemical shifts obtained from NMR spectroscopy. If this is not the case, the coordinates must be incorrect. We have utilized this fact to quantify the errors associated with the backbone atoms in MD simulations of proteins. A library of regional conformers containing 169,499 members was constructed from 6 model proteins. The chemical shifts associated with the backbone atoms in each of these conformers were obtained from QM calculations using density functional theory at the B3LYP level with a 6-311+G(2d,p) basis set. Chemical shifts were assigned to each backbone atom in each MD simulation frame using a template matching approach. The ensemble average of these chemical shifts was compared to chemical shifts from NMR spectroscopy. A large systematic error was identified that affected the 1H atoms of the peptide bonds involved in hydrogen bonding with water molecules or peptide backbone atoms. This error was highly sensitive to changes in electrostatic parameters. Smaller errors affecting the 13Cα and 15N atoms were also detected. We believe these errors could be useful as metrics for comparing the force-fields and parameter sets used in MD simulation because they are directly tied to errors in atomic coordinates.

17.
PLoS One ; 11(5): e0156313, 2016.
Article in English | MEDLINE | ID: mdl-27228149

ABSTRACT

OBJECTIVE: Dynamic regulation of actin cytoskeleton is at the heart of all actin-based cellular events. In this study, we sought to identify novel post-translational modifications of Profilin-1 (Pfn1), an important regulator of actin polymerization in cells. METHODOLOGY: We performed in vitro protein kinase assay followed by mass-spectrometry to identify Protein Kinase A (PKA) phosphorylation sites of Pfn1. By two-dimensional gel electrophoresis (2D-GE) analysis, we further examined the changes in the isoelectric profile of ectopically expressed Pfn1 in HEK-293 cells in response to forskolin (FSK), an activator of cAMP/PKA pathway. Finally, we combined molecular dynamics simulations (MDS), GST pull-down assay and F-actin analyses of mammalian cells expressing site-specific phosphomimetic variants of Pfn1 to predict the potential consequences of phosphorylation of Pfn1. RESULTS AND SIGNIFICANCE: We identified several PKA phosphorylation sites of Pfn1 including Threonine 89 (T89), a novel site. Consistent with PKA's ability to phosphorylate Pfn1 in vitro, FSK stimulation increased the pool of the most negatively charged form of Pfn1 in HEK-293 cells which can be attenuated by PKA inhibitor H89. MDS predicted that T89 phosphorylation destabilizes an intramolecular interaction of Pfn1, potentially increasing its affinity for actin. The T89D phosphomimetic mutation of Pfn1 elicits several changes that are hallmarks of proteins folded into alternative three-dimensional conformations including detergent insolubility, protein aggregation and accelerated proteolysis, suggesting that T89 is a structurally important residue of Pfn1. Expression of T89D-Pfn1 induces actin:T89D-Pfn1 co-clusters and dramatically reduces overall actin polymerization in cells, indicating an actin-sequestering action of T89D-Pfn1. 
Finally, rendering T89 non-phosphorylatable causes a positive charge shift in the isoelectric profile of Pfn1 in a 2D gel electrophoresis analysis of cell extracts, a finding that is consistent with phosphorylation of a certain pool of intracellular Pfn1 on the T89 residue. In summary, we propose that T89 phosphorylation could have major functional consequences on Pfn1. This study paves the way for further investigation of the potential role of Pfn1 phosphorylation in PKA-mediated regulation of actin-dependent biological processes.


Subject(s)
Cyclic AMP-Dependent Protein Kinases/metabolism , Profilins/metabolism , Threonine/metabolism , Actin Cytoskeleton/metabolism , Animals , Cattle , Electrophoresis, Gel, Two-Dimensional , HEK293 Cells , Humans , Mice , Models, Molecular , Molecular Dynamics Simulation , Phosphorylation , Profilins/chemistry , Protein Binding , Protein Conformation , Protein Processing, Post-Translational , Tandem Mass Spectrometry , Threonine/chemistry
18.
ChemMedChem ; 10(3): 490-7, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25677305

ABSTRACT

Metabolic reprogramming of tumor cells toward serine catabolism is now recognized as a hallmark of cancer. Serine hydroxymethyltransferase (SHMT), the enzyme providing one-carbon units by converting serine and tetrahydrofolate (H4PteGlu) to glycine and 5,10-CH2-H4PteGlu, therefore represents a target of interest in developing new chemotherapeutic drugs. In this study, 13 folate analogues under clinical evaluation or in therapeutic use were in silico screened against SHMT, ultimately identifying four antifolate agents worthy of closer evaluation. The interaction mode of SHMT with these four antifolate drugs (lometrexol, nolatrexed, raltitrexed, and methotrexate) was assessed. The mechanism of SHMT inhibition by the selected antifolate agents was investigated in vitro using the human cytosolic isozyme. The results of this study showed that lometrexol competitively inhibits SHMT with inhibition constant (Ki) values in the low micromolar range. The binding mode of lometrexol to SHMT was further investigated by molecular docking. These results thus provide insights into the mechanism of action of antifolate drugs and constitute the basis for the rational design of novel and more potent inhibitors of SHMT.


Subject(s)
Folic Acid Antagonists/chemistry , Folic Acid Antagonists/pharmacology , Glycine Hydroxymethyltransferase/antagonists & inhibitors , Glycine Hydroxymethyltransferase/metabolism , Humans , Methotrexate/chemistry , Methotrexate/pharmacology , Molecular Docking Simulation , Quinazolines/chemistry , Quinazolines/pharmacology , Tetrahydrofolates/chemistry , Tetrahydrofolates/pharmacology , Thiophenes/chemistry , Thiophenes/pharmacology