ABSTRACT
Drug discovery and development constitute a laborious and costly undertaking. The success of a drug hinges not only on good efficacy but also on acceptable absorption, distribution, metabolism, elimination, and toxicity (ADMET) properties. Overall, up to 50% of drug development failures have been attributed to undesirable ADMET profiles. As a multi-parameter objective, the optimization of ADMET properties is extremely challenging owing to the vast chemical space and limited human expert knowledge. In this study, a freely available platform called Chemical Molecular Optimization, Representation and Translation (ChemMORT) is developed for the optimization of multiple ADMET endpoints without the loss of potency (https://cadd.nscc-tj.cn/deploy/chemmort/). ChemMORT contains three modules: Simplified Molecular Input Line Entry System (SMILES) Encoder, Descriptor Decoder and Molecular Optimizer. The SMILES Encoder generates a molecular representation as a 512-dimensional vector, and the Descriptor Decoder translates that representation back to the corresponding molecular structure with high accuracy. Based on this reversible molecular representation and a particle swarm optimization strategy, the Molecular Optimizer can be used to effectively optimize undesirable ADMET properties without the loss of bioactivity, which essentially accomplishes inverse QSAR design. The constrained multi-objective optimization of a poly (ADP-ribose) polymerase-1 inhibitor is provided as a case study to explore the utility of ChemMORT.
Subject(s)
Deep Learning , Humans , Drug Development , Drug Discovery , Poly(ADP-ribose) Polymerase Inhibitors
ABSTRACT
Owing to its promising capacity to improve drug efficacy, polypharmacology has emerged as a new theme in drug discovery for complex diseases. In the discovery of novel multi-target drugs (MTDs), in silico strategies have become essential because of their high throughput and low cost. However, current research mostly addresses typical, closely related target pairs. Because of the intricate pathogenesis networks of complex diseases, many distantly related targets are found to play crucial roles in synergistic treatment. Therefore, an innovative method for developing drugs that simultaneously act on distantly related target pairs is of utmost importance. At the same time, reducing the false discovery rate in the design of MTDs remains a daunting technical difficulty. In this research, effective small-molecule clustering in the positive dataset, together with a putative negative dataset generation strategy, was adopted during model construction. In a comprehensive assessment on 10 target pairs with hierarchical similarity levels, the proposed strategy successfully reduced the false discovery rate. Model types constructed with much smaller numbers of inhibitor molecules gained considerable yields and showed better false-hit controllability than before. To further evaluate the generalization ability, an in-depth assessment of high-throughput virtual screening on the ChEMBL database was conducted. As a result, this novel strategy could hierarchically improve the enrichment factors for each target pair (especially for distantly related or unrelated target pairs), corresponding to the target-pair similarity levels.
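The enrichment factor mentioned for the virtual screening assessment can be illustrated with a minimal sketch. It assumes the common definition (hit rate in the top-ranked fraction divided by the hit rate in the whole library); the function name, scores, and labels are hypothetical and are not taken from the study.

```python
def enrichment_factor(scores, labels, top_fraction):
    """Enrichment factor of a ranked screen (hypothetical helper).

    scores: model scores, higher means more likely active.
    labels: 1 for a known active, 0 otherwise.
    top_fraction: fraction of the library inspected, e.g. 0.01 for the top 1%.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    total_hits = sum(labels)
    if total_hits == 0:
        raise ValueError("the library contains no actives")

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(top_fraction * len(ranked)))
    top_hits = sum(label for _, label in ranked[:n_top])

    hit_rate_top = top_hits / n_top
    hit_rate_all = total_hits / len(ranked)
    return hit_rate_top / hit_rate_all


# Made-up example: 10 compounds, 2 actives, both ranked in the top 20%.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
example_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
example_ef = enrichment_factor(example_scores, example_labels, 0.2)
```

An enrichment factor of 1 corresponds to random selection, so values above 1 indicate that the model concentrates actives near the top of the ranking.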
Subject(s)
Drug Discovery , Polypharmacology , Drug Discovery/methods , High-Throughput Screening Assays
ABSTRACT
The rapid development of engineered nanomaterials (ENMs) causes humans to become increasingly exposed to them. Therefore, a better understanding of the health impact of ENMs is in high demand. Considering the 3Rs (Replacement, Reduction, and Refinement) principle, in vitro and computational methods are excellent alternatives to testing on animals. Among computational methods, nano-quantitative structure-activity relationship (nano-QSAR), which links the physicochemical and structural properties of ENMs with biological activities, is one of the leading methods. The nature of toxicological experiments has evolved over the last decades; currently, one experiment can provide thousands of measurements of the organism's functioning at the molecular level. At the same time, the capacity of in vitro systems to mimic the human organism is also improving significantly. Hence, the authors would like to discuss whether the nano-QSAR approach follows modern toxicological studies and takes full advantage of the opportunities offered by modern toxicological platforms. Challenges and possibilities for improving data integration are underlined narratively, including the need for a consensus built between the in vitro and the QSAR domains.
Subject(s)
Nanostructures , Quantitative Structure-Activity Relationship , Humans , Animals , Nanostructures/toxicity , Nanostructures/chemistry
ABSTRACT
With the consolidation of deep learning in drug discovery, several novel algorithms for learning molecular representations have been proposed. Despite the interest of the community in developing new methods for learning molecular embeddings and their theoretical benefits, comparing molecular embeddings with each other and with traditional representations is not straightforward, which in turn hinders the process of choosing a suitable representation for Quantitative Structure-Activity Relationship (QSAR) modeling. A reason behind this issue is the difficulty of conducting a fair and thorough comparison of the different existing embedding approaches, which requires numerous experiments on various datasets and training scenarios. To close this gap, we reviewed the literature on methods for molecular embeddings and reproduced three unsupervised and two supervised molecular embedding techniques recently proposed in the literature. We compared these five methods concerning their performance in QSAR scenarios using different classification and regression datasets. We also compared these representations to traditional molecular representations, namely molecular descriptors and fingerprints. As opposed to the expected outcome, our experimental setup consisting of over 25,000 trained models and statistical tests revealed that the predictive performance using molecular embeddings did not significantly surpass that of traditional representations. Although supervised embeddings yielded competitive results compared with those using traditional molecular representations, unsupervised embeddings tended to perform worse than traditional representations. Our results highlight the need for conducting a careful comparison and analysis of the different embedding techniques prior to using them in drug design tasks and motivate a discussion about the potential of molecular embeddings in computer-aided drug design.
Subject(s)
Algorithms , Quantitative Structure-Activity Relationship
ABSTRACT
Artificial intelligence (AI)-based computational techniques allow rapid exploration of the chemical space. However, representing compounds as computationally compatible and detailed features is one of the crucial steps in quantitative structure-activity relationship (QSAR) analysis. Recently, graph-based methods have emerged as a powerful alternative to chemistry-restricted fingerprints or descriptors for modeling. Although graph-based modeling offers multiple advantages, its implementation demands in-depth domain knowledge and programming skills. Here we introduce deepGraphh, an end-to-end web service featuring a conglomerate of established graph-based methods for model generation for classification or regression tasks. The graphical user interface of deepGraphh offers highly configurable support for model parameter tuning, model generation, cross-validation and testing of user-supplied query molecules. deepGraphh supports four widely adopted methods for QSAR analysis, namely, graph convolution network, graph attention network, directed acyclic graph and Attentive FP. Comparative analysis revealed that deepGraphh-supported methods are comparable to descriptor-based machine learning techniques. Finally, we used deepGraphh models to predict the blood-brain barrier permeability of human and microbiome-generated metabolites. In summary, deepGraphh offers a one-stop web service for graph-based methods for chemoinformatics.
Subject(s)
Artificial Intelligence , Quantitative Structure-Activity Relationship , Humans , Machine Learning
ABSTRACT
The quinolizidine alkaloids matrine and its N-oxide oxymatrine occur in plants of the genus Sophora. Recently, matrine was sporadically detected in liquorice products. Morphological similarity of the liquorice plant Glycyrrhiza glabra with Sophora species and resulting confusion during harvesting may explain this contamination, but use of matrine as pesticide has also been reported. The detection of matrine in liquorice products raised concern as some studies suggested a genotoxic activity of matrine and oxymatrine. However, these studies are fraught with uncertainties, putting the reliability and robustness into question. Another issue was that Sophora root extracts were usually tested instead of pure matrine and oxymatrine. The aim of this work was therefore to determine whether matrine and oxymatrine have potential for causing gene mutations. In a first step and to support a weight-of-evidence analysis, in silico predictions were performed to improve the database using expert and statistical systems by VEGA, Leadscope (Instem®), and Nexus (Lhasa Limited). Unfortunately, the confidence levels of the predictions were insufficient to either identify or exclude a mutagenic potential. Thus, in order to obtain reliable results, the bacterial reverse mutation assay (Ames test) was carried out in accordance with OECD Test Guideline 471. The test set included the plate incorporation and the preincubation assay. It was performed with five different bacterial strains in the presence or absence of metabolic activation. Neither matrine nor oxymatrine induced a significant increase in the number of revertants under any of the selected experimental conditions. Overall, it can be concluded that matrine and oxymatrine are unlikely to have a gene mutation potential. Any positive findings with Sophora extracts in the Ames test may be related to other components. 
Notably, the results also indicated a need to extend the application domain of respective (Q)SAR tools to secondary plant metabolites.
Subject(s)
Alkaloids , Sophora , Matrines , Reproducibility of Results , Alkaloids/toxicity , Alkaloids/analysis , Quinolizines/toxicity , Quinolizines/analysis , Mutation
ABSTRACT
This article aims to provide a comprehensive critical, yet readable, review of general interest to the chemistry community on molecular similarity as applied to chemical informatics and predictive modeling with a special focus on read-across (RA) and read-across structure-activity relationships (RASAR). Molecular similarity-based computational tools, such as quantitative structure-activity relationships (QSARs) and RA, are routinely used to fill the data gaps for a wide range of properties including toxicity endpoints for regulatory purposes. This review will explore the background of RA starting from how structural information has been used through to how other similarity contexts such as physicochemical, absorption, distribution, metabolism, and elimination (ADME) properties, and biological aspects are being characterized. More recent developments of RA's integration with QSAR have resulted in the emergence of novel models such as ToxRead, generalized read-across (GenRA), and quantitative RASAR (q-RASAR). Conventional QSAR techniques have been excluded from this review except where necessary for context.
Subject(s)
Machine Learning , Quantitative Structure-Activity Relationship , Humans , Cheminformatics/methods , Structure-Activity Relationship , Animals
ABSTRACT
Understanding protein sequence and structure is essential for understanding protein-protein interactions (PPIs), which are essential for many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for drug reactivity, toxicology, drug screening, and quantitative structure-activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of artificial-intelligence-based rational drug design technologies that target proteins and protein-protein interaction (PPI) binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are deliberated.
Subject(s)
Artificial Intelligence , Drug Discovery , Drug Discovery/methods , Drug Design , Machine Learning , Quantitative Structure-Activity Relationship
ABSTRACT
Given the aging populations in advanced countries globally, many pharmaceutical companies have focused on developing central nervous system (CNS) drugs. However, due to the blood-brain barrier, drugs do not easily reach the target area in the brain. Although conventional screening methods for drug discovery involve the measurement of (unbound fraction of drug) brain-to-plasma partition coefficients, it is difficult to consider nonequilibrium between plasma and brain compound concentration-time profiles. To truly understand the pharmacokinetics/pharmacodynamics of CNS drugs, compound concentration-time profiles in the brain are necessary; however, such analyses are costly and time-consuming and require a significant number of animals. Therefore, in this study, we attempted to develop an in silico prediction method that does not require a large amount of experimental data by combining modeling and simulation (M&S) with machine learning (ML). First, we constructed a hybrid model linking plasma concentration-time profile to the brain compartment that takes into account the transit time and brain distribution of each compound. Using mouse plasma and brain time experimental values for 103 compounds, we determined the brain kinetic parameters of the hybrid model for each compound; this case was defined as scenario I (a positive control experiment) and included the full brain concentration-time profile data. Next, we built an ML model using chemical structure descriptors as explanatory variables and rate parameters as the target variable, and we then input the predicted values from 5-fold cross-validation (CV) into the hybrid model; this case was defined as scenario II, in which no brain compound concentration-time profile data exist. 
Finally, for scenario III, assuming that the brain concentration is obtained at only one time point, we used the brain kinetic parameters from the result of the 5-fold CV in scenario II as the initial values for the hybrid model and performed parameter refitting against the observed brain concentration at that time point. As a result, the RMSE/R2-values of the brain compound concentration-time profiles over time were 0.445/0.517 in scenario II and 0.246/0.805 in scenario III, indicating the method provides high accuracy and suggesting that it is a practical method for predicting brain compound concentration-time profiles.
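The RMSE and R2 figures quoted for scenarios II and III can be read against a minimal sketch of how such metrics are commonly computed. This assumes R2 is the coefficient of determination (the abstract does not say which definition was used); the function names and the numbers in the example are hypothetical, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up log-concentration values at four time points (illustration only).
observed_log_conc = [1.0, 2.0, 3.0, 4.0]
predicted_log_conc = [1.1, 1.9, 3.2, 3.8]
example_rmse = rmse(observed_log_conc, predicted_log_conc)
example_r2 = r_squared(observed_log_conc, predicted_log_conc)
```

A lower RMSE together with a higher R2, as reported for scenario III relative to scenario II, indicates predictions that lie closer to the observed profile.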
Subject(s)
Blood-Brain Barrier , Brain , Computer Simulation , Machine Learning , Animals , Brain/metabolism , Mice , Blood-Brain Barrier/metabolism , Models, Biological , Central Nervous System Agents/pharmacokinetics , Central Nervous System Agents/administration & dosage , Tissue Distribution , Drug Discovery/methods
ABSTRACT
Bioactive peptides (BPs) are short amino acid sequences that are known to exhibit physiological characteristics such as antioxidant, antimicrobial, antihypertensive and antidiabetic properties, suggesting that they could be exploited as functional foods in the nutraceutical industry. These BPs can be derived from a variety of food sources, including milk, meat, marine, and plant proteins. In the past decade, various methods including in silico, in vitro, and in vivo techniques have been explored to unravel the underlying mechanisms of BPs. To forecast interactions between peptides and their targets, in silico methods such as BIOPEP, molecular docking and Quantitative Structure-Activity Relationship modeling have been employed. Additionally, in vitro research has examined how BPs affect enzyme activities, protein expression, and cell cultures. In vivo studies, in contrast, have appraised the impact of BPs on animal models and human subjects. Hence, in the light of recent literature, this review examines the multifaceted aspects of BP production from milk, meat, marine, and plant proteins and their potential bioactivities. We envisage that the various concepts discussed will contribute to a better understanding of food-derived BP production, which could pave the way for their potential applications in the nutraceutical industry.
ABSTRACT
Neuropathic pain (NP) is characterized by hyperalgesia, allodynia, and spontaneous pain. The hyperpolarization-activated cyclic nucleotide-gated (HCN) channel, which is involved in neuronal hyperexcitability, has emerged as an important target for NP drug development. HCN channels exist in four different isoforms, of which HCN1 is predominantly expressed in the dorsal root ganglion and has an important role in NP pathophysiology. A specific HCN1 channel inhibitor would hold better potential to treat NP without disturbing the physiological roles of other HCN isoforms. The main objective is to identify and analyze the chemical properties of scaffolds with higher HCN1 channel specificity. The 3D-QSAR studies highlight that hydrophobic and hydrogen bond donor groups enhance specificity towards the HCN1 channel. Further, the molecular interaction of the scaffolds with the HCN1 pore was studied by generating an open-pore model of the HCN1 channel using homology modelling and then docking the molecules with it. In addition, the important residues involved in the interaction between the HCN1 pore and the scaffolds were identified. Moreover, ADME predictions revealed that the compounds had good oral bioavailability and solubility characteristics. Subsequently, molecular dynamics simulation studies revealed the better stability of the lead molecules A7 and A9 during interactions and identified them as potential drug candidates. Taken together, these studies provide the important structural features for enhancing HCN1 channel-specific inhibition, paving the way to design and develop novel specific HCN1 channel inhibitors.
ABSTRACT
PURPOSE: To ensure that drug administration is safe during pregnancy, it is crucial to be able to predict the placental permeability of drugs in humans. The experimental method most widely used for this purpose is in vitro human placental perfusion, though the approach is highly expensive and time consuming. Quantitative structure-activity relationship (QSAR) modeling represents a powerful tool for the assessment of drug placental transfer and can be successfully employed as an alternative to in vitro experiments. METHODS: The conformation-independent QSAR models covered in the present study were developed using SMILES notation descriptors and local molecular graph invariants. The Monte Carlo optimization method was used to develop the models, with three independent splits of the molecules into training and test sets. RESULTS: A range of different statistical parameters was used to validate the developed QSAR model, including the standard error of estimation, mean absolute error, root-mean-square error (RMSE), correlation coefficient, cross-validated correlation coefficient, Fisher ratio, MAE-based metrics and the correlation ideality index. These statistical methods demonstrated an excellent predictive potential and robustness of the developed QSAR model. In addition, the molecular fragments derived from the SMILES notation descriptors that account for the decrease or increase in the investigated activity were revealed. CONCLUSION: The presented QSAR modeling can be an invaluable tool for the high-throughput screening of the placental permeability of drugs.
Subject(s)
Placenta , Quantitative Structure-Activity Relationship , Female , Pregnancy , Humans , Models, Molecular , Monte Carlo Method , Permeability
ABSTRACT
Matrine and indole have antibacterial, anticancer, and other biological activities. To develop new antibiotics that address the problem of multi-drug-resistant bacteria, we synthesized a series of 29 novel matrine derivatives as potential drug candidates by combining indole analogs and matrine. The antibacterial activity of these compounds was evaluated through minimum inhibitory concentration (MIC) assays against five bacterial strains (S. aureus, C. albicans, P. acnes, P. aeruginosa, and E. coli). The obtained results demonstrated promising antibacterial efficacy, particularly for compounds A20 and A18, which exhibited MIC values of 0.021 and 0.031 mg/ml, respectively, against S. aureus. Moreover, compounds A20 and A27 displayed remarkable MIC values of 2.806 and 4.519 mg/ml, respectively, against C. albicans, surpassing the performance of the clinical antibiotic penicillin G sodium (0.0368 mg/ml) and fluconazole (4.849 mg/ml). These findings underscore the significant bacteriostatic activity of the matrine derivatives. Furthermore, to gain a deeper understanding, 3D-QSAR modeling was employed, revealing the critical influence of steric structure, charge distribution, hydrophobic interactions, and hydrogen bonding within the molecular structure on the bacteriostatic activity of the compounds. Additionally, molecular docking simulations shed light on the interaction between compound A20 and bacterial proteins, highlighting the involvement of hydrogen bonding, hydrophobic interactions, and π-π conjugation in the formation of stable complexes that inhibit the normal functioning of the proteins. This comprehensive analysis provided valuable insights into the antibacterial mechanism of the novel matrine derivatives, offering theoretical support for their potential application as antibiotics.
Subject(s)
Anti-Bacterial Agents , Matrines , Anti-Bacterial Agents/chemistry , Staphylococcus aureus , Escherichia coli , Molecular Docking Simulation , Microbial Sensitivity Tests , Indoles/pharmacology
ABSTRACT
The global outbreak of the COVID-19 pandemic caused by the SARS-CoV-2 virus has led to profound respiratory health implications. This study focused on designing organoselenium-based inhibitors targeting the SARS-CoV-2 main protease (Mpro). The ligand-binding pathway sampling method based on parallel cascade selection molecular dynamics (LB-PaCS-MD) simulations was employed to elucidate plausible paths and conformations of ebselen, a synthetic organoselenium drug, within the Mpro catalytic site. Ebselen effectively engaged the active site, adopting proximity to H41 and interacting through the benzoisoselenazole ring in a π-π T-shaped arrangement, with an additional π-sulfur interaction with C145. In addition, ligand-based drug design using QSAR with GFA-MLR, RF, and ANN models was employed for biological activity prediction. The QSAR-ANN model showed robust statistical performance, with a training r2 exceeding 0.98 and a test RMSE of 0.21, indicating its suitability for predicting biological activities. Integrating the ANN model with the LB-PaCS-MD insights enabled the rational design of novel compounds anchored in the ebselen core structure, identifying promising candidates with favorable predicted IC50 values. The designed compounds exhibited suitable drug-like characteristics and adopted an active conformation similar to ebselen, inhibiting Mpro function. These findings represent a synergistic approach merging ligand- and structure-based drug design, with the potential to guide experimental synthesis and enzyme assay testing.
Subject(s)
Antiviral Agents , Coronavirus 3C Proteases , Drug Design , Isoindoles , Machine Learning , Molecular Dynamics Simulation , Organoselenium Compounds , Protease Inhibitors , Quantitative Structure-Activity Relationship , SARS-CoV-2 , SARS-CoV-2/drug effects , SARS-CoV-2/enzymology , Organoselenium Compounds/chemistry , Organoselenium Compounds/pharmacology , Organoselenium Compounds/chemical synthesis , Isoindoles/chemistry , Isoindoles/pharmacology , Isoindoles/chemical synthesis , Coronavirus 3C Proteases/antagonists & inhibitors , Coronavirus 3C Proteases/metabolism , Protease Inhibitors/chemistry , Protease Inhibitors/pharmacology , Protease Inhibitors/chemical synthesis , Antiviral Agents/pharmacology , Antiviral Agents/chemistry , Antiviral Agents/chemical synthesis , Humans , Azoles/chemistry , Azoles/pharmacology , Azoles/chemical synthesis , COVID-19/virology , Catalytic Domain
ABSTRACT
Notwithstanding the wide adoption of the OECD principles (or best practices) for QSAR modeling, disparities between in silico predictions and experimental results are frequent, suggesting that model predictions are often too optimistic. Of these OECD principles, the applicability domain (AD) estimation has been recognized in several reports in the literature to be one of the most challenging, implying that the actual reliability measures of model predictions are often unreliable. Applying tree-based error analysis workflows on 5 QSAR models reported in the literature and available in the QsarDB repository, i.e., androgen receptor bioactivity (agonists, antagonists, and binders, respectively) and membrane permeability (highest membrane permeability and the intrinsic permeability), we demonstrate that predictions erroneously tagged as reliable (AD prediction errors) overwhelmingly correspond to instances in subspaces (cohorts) with the highest prediction error rates, highlighting the inhomogeneity of the AD space. In this sense, we call for more stringent AD analysis guidelines which require the incorporation of model error analysis schemes, to provide critical insight on the reliability of underlying AD algorithms. Additionally, any selected AD method should be rigorously validated to demonstrate its suitability for the model space over which it is applied. These steps will ultimately contribute to more accurate estimations of the reliability of model predictions. Finally, error analysis may also be useful in "rational" model refinement in that data expansion efforts and model retraining are focused on cohorts with the highest error rates.
Subject(s)
Algorithms , Quantitative Structure-Activity Relationship , Reproducibility of Results
ABSTRACT
Antioxidant agents play an essential role in the food industry for improving the oxidative stability of food products. In recent years, the search for new natural antioxidants has increased due to the potential high toxicity of chemical additives. Therefore, the synthesis and evaluation of the antioxidant activity of peptides is a field of current research. In this study, we performed a Quantitative Structure-Activity Relationship (QSAR) analysis of 19 cysteine-containing dipeptides and 19 cysteine-containing tripeptides. The main objective is to provide information on the relationship between the structure of peptides and their antioxidant activity. For this purpose, 1D and 2D molecular descriptors were calculated using the PaDEL software, which provides information about the structure, shape, size, charge, polarity, solubility and other aspects of the compounds. Different QSAR models for di- and tripeptides were developed. The statistical parameters for the dipeptide model (training R2 = 0.947 and test R2 = 0.804) and for the tripeptide model (training R2 = 0.923 and test R2 = 0.847) indicate that the generated models have high predictive capacity. Then, the influence of the cysteine position was analyzed by predicting the antioxidant activity of new di- and tripeptides and comparing them with glutathione. In dipeptides, except for SC, TC and VC, the activity increases when cysteine is at the N-terminal position. For tripeptides, we observed a notable increase in activity when cysteine is placed in the N-terminal position.
Subject(s)
Antioxidants , Cysteine , Dipeptides , Oligopeptides , Quantitative Structure-Activity Relationship , Cysteine/chemistry , Antioxidants/chemistry , Antioxidants/pharmacology , Dipeptides/chemistry , Dipeptides/pharmacology , Oligopeptides/chemistry , Oligopeptides/pharmacology , Models, Molecular , Software
ABSTRACT
Computer-aided drug design has advanced rapidly in recent years, and multiple instances of in silico designed molecules advancing to the clinic have demonstrated the contribution of this field to medicine. Properly designed and implemented platforms can drastically reduce drug development timelines and costs. While such efforts were initially focused primarily on target affinity/activity, it is now appreciated that other parameters are equally important in the successful development of a drug and its progression to the clinic, including pharmacokinetic properties as well as absorption, distribution, metabolic, excretion and toxicological (ADMET) properties. In the last decade, several programs have been developed that incorporate these properties into the drug design and optimization process and to varying degrees, allowing for multi-parameter optimization. Here, we introduce the Artificial Intelligence-driven Drug Design (AIDD) platform, which automates the drug design process by integrating high-throughput physiologically-based pharmacokinetic simulations (powered by GastroPlus) and ADMET predictions (powered by ADMET Predictor) with an advanced evolutionary algorithm that is quite different than current generative models. AIDD uses these and other estimates in iteratively performing multi-objective optimizations to produce novel molecules that are active and lead-like. Here we describe the AIDD workflow and details of the methodologies involved therein. We use a dataset of triazolopyrimidine inhibitors of the dihydroorotate dehydrogenase from Plasmodium falciparum to illustrate how AIDD generates novel sets of molecules.
Subject(s)
Artificial Intelligence , Drug Design , Algorithms , Evolution, Molecular
ABSTRACT
Metal-free carbon material-mediated nonradical oxidation processes (C-NOPs) have emerged as a research hotspot due to their excellent performance in selectively eliminating organic pollutants in aqueous environments. However, the selective oxidation mechanisms of C-NOPs remain obscure due to the diversity of organic pollutants and nonradical active species. Herein, quantitative structure-activity relationship (QSAR) models were employed to unveil the origins of C-NOP selectivity toward organic pollutants in different oxidant systems. QSAR analysis based on adsorption and oxidation descriptors revealed that C-NOP selectivity depends on the oxidation potentials of organic pollutants rather than on adsorption interactions. However, the dominance of electronic effects in selective oxidation decreases with increasing structural complexity of organic pollutants. Moreover, the oxidation threshold solely depends on the inherent electronic nature of organic pollutants and not on the reactivity of nonradical active species. Notably, the accuracy of substituent descriptors (Hammett constants) and theoretical descriptors (e.g., highest occupied molecular orbital energy, ionization potential, and single-electron oxidation potential) is significantly influenced by the complexity and molecular state of organic pollutants. Overall, the study findings reveal the origins of organic pollutant-oriented selective oxidation and provide insight into the application of descriptors in QSAR analysis.
Subject(s)
Environmental Pollutants , Water Pollutants, Chemical , Carbon , Quantitative Structure-Activity Relationship , Oxidation-Reduction , Oxidants/chemistry , Water Pollutants, Chemical/chemistry
ABSTRACT
Per- and polyfluoroalkyl substances (PFAS) are widely employed anthropogenic fluorinated chemicals known to disrupt hepatic lipid metabolism by binding to human peroxisome proliferator-activated receptor alpha (PPARα). Therefore, screening for PFAS that bind to PPARα is of critical importance. Machine learning approaches are promising techniques for rapid screening of PFAS. However, traditional machine learning approaches lack interpretability, posing challenges in investigating the relationship between molecular descriptors and PPARα binding. In this study, we aimed to develop a novel, explainable machine learning approach to rapidly screen for PFAS that bind to PPARα. We calculated the PPARα-PFAS binding score and 206 molecular descriptors for PFAS. Through systematic and objective selection of important molecular descriptors, we developed a machine learning model with good predictive performance using only three descriptors. The molecular size (b_single) and electrostatic properties (BCUT_PEOE_3 and PEOE_VSA_PPOS) are important for PPARα-PFAS binding. Alternative PFAS are considered safer than their legacy predecessors. However, we found that alternative PFAS with many carbon atoms and ether groups exhibited a higher affinity for PPARα. Therefore, confirming the toxicity of these alternative PFAS compounds with such characteristics through biological experiments is important.
Subject(s)
Fluorocarbons , PPAR alpha , Humans , PPAR alpha/metabolism , Liver/metabolism
ABSTRACT
The heterogeneous photodegradation behavior of liquid crystal monomers (LCMs) in standard dust (standard reference material, SRM 2583) and environmental dust was investigated. The measured photodegradation ratios for 23 LCMs in SRM and environmental dust in 12 h were 11.1 ± 1.8 to 23.2 ± 1.1% and 8.7 ± 0.5 to 24.0 ± 2.8%, respectively. The degradation behavior of different LCM compounds varied depending on their structural properties. A quantitative structure-activity relationship model for predicting the degradation ratio of LCMs in SRM dust was established, which revealed that the molecular descriptors related to molecular polarizability, electronegativity, and molecular mass were closely associated with LCMs' photodegradation. The photodegradation products of the LCM compound 4'-propoxy-4-biphenylcarbonitrile (PBIPHCN) in dust, including •OH oxidation, C-O bond cleavage, and ring-opening products, were identified by nontarget analysis, and the corresponding degradation pathways were suggested. Some of the identified products, such as 4'-hydroxyethoxy-4-biphenylcarbonitrile, showed predicted toxicity (with an oral rat lethal dose of 50%) comparable to that of PBIPHCN. The half-lives of the studied LCMs in SRM dust were estimated at 32.2-82.5 h by fitting an exponential decay curve to the observed photodegradation data. The photodegradation mechanisms of LCMs in dust were revealed for the first time, enhancing the understanding of LCMs' environmental behavior and risks.
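Estimating a half-life by fitting an exponential decay curve can be sketched as follows. This is a minimal illustration that assumes first-order kinetics, C(t) = C0 * exp(-k * t), fitted by least squares on the log-transformed remaining fraction with the line forced through the origin; the abstract does not specify the fitting procedure, and the function names and data points are hypothetical.

```python
import math


def fit_first_order_rate(times_h, fraction_remaining):
    """Fit ln(C/C0) = -k * t through the origin; returns k in 1/h."""
    numerator = sum(t * math.log(f) for t, f in zip(times_h, fraction_remaining))
    denominator = sum(t * t for t in times_h)
    return -numerator / denominator


def half_life_h(rate_constant):
    """Half-life of a first-order process: t_1/2 = ln(2) / k."""
    return math.log(2) / rate_constant


# Synthetic data generated with k = 0.02 1/h (illustration only, not measured values).
sample_times_h = [2.0, 4.0, 8.0, 12.0]
sample_fractions = [math.exp(-0.02 * t) for t in sample_times_h]
fitted_k = fit_first_order_rate(sample_times_h, sample_fractions)
fitted_half_life = half_life_h(fitted_k)
```

Under this assumption, a compound that loses a larger fraction within the 12 h irradiation window has a larger rate constant and therefore a shorter half-life.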