1.
Chem Res Toxicol ; 37(4): 580-589, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38501392

ABSTRACT

Desirable pharmacological properties and a broad range of therapeutic activities have made peptides promising drugs compared with small organic molecules and antibody drugs. Nevertheless, toxic effects, such as hemolysis, have hampered the development of such promising drugs. Hence, a reliable computational tool to predict peptide hemolytic toxicity is enormously useful before synthesis and experimental evaluation. Currently, four web servers that predict hemolytic activity using machine learning (ML) algorithms are available; however, they exhibit some limitations, such as the need for a reliable negative set and a limited application domain. Hence, we developed a robust model based on a novel theoretical approach that combines network science and a multiquery similarity searching (MQSS) method. A total of 1152 initial models were constructed from 144 scaffolds generated in a previous report. These were evaluated on external data sets, and the best models were fused and improved. Our best MQSS model I1 outperformed all state-of-the-art ML-based models and was used to characterize the prevalence of hemolytic toxicity in therapeutic peptides. Based on our model's estimation, the number of hemolytic peptides might be 3.9-fold higher than reported.
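
The abstract does not spell out how the multiquery similarity search scores a peptide, so the following is only a minimal sketch of the general idea: a target is compared against several known hemolytic query sequences and the best match decides the call. The similarity measure (`difflib.SequenceMatcher`), the 0.8 threshold, the function names and the toy sequences are all assumptions made for illustration, not the published model.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in sequence similarity in [0, 1]; the published measure is not given here."""
    return SequenceMatcher(None, a, b).ratio()


def max_fusion_score(target: str, queries: list[str]) -> float:
    """Fuse the per-query similarities by taking the maximum (one common fusion rule)."""
    return max(similarity(target, q) for q in queries)


def predict_hemolytic(target: str, queries: list[str], threshold: float = 0.8) -> bool:
    """Flag the target as hemolytic if it is close enough to any query peptide."""
    return max_fusion_score(target, queries) >= threshold


# Hypothetical query set of peptides assumed to be hemolytic (toy data).
QUERIES = ["GIGAVLKVLTTGLPALISWIKRKRQQ", "KWKLFKKIGAVLKVL"]
```

A scheme of this kind needs only positive examples as queries, which is one way to sidestep the requirement for a reliable negative set mentioned above.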


Subject(s)
Hemolysis , Peptides , Humans , Amino Acid Sequence , Peptides/pharmacology , Peptides/chemistry , Algorithms , Machine Learning
2.
J Comput Aided Mol Des ; 38(1): 9, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38351144

ABSTRACT

Notwithstanding the wide adoption of the OECD principles (or best practices) for QSAR modeling, disparities between in silico predictions and experimental results are frequent, suggesting that model predictions are often too optimistic. Of these OECD principles, the applicability domain (AD) estimation has been recognized in several reports in the literature to be one of the most challenging, implying that the actual reliability measures of model predictions are often unreliable. Applying tree-based error analysis workflows on 5 QSAR models reported in the literature and available in the QsarDB repository, i.e., androgen receptor bioactivity (agonists, antagonists, and binders, respectively) and membrane permeability (highest membrane permeability and the intrinsic permeability), we demonstrate that predictions erroneously tagged as reliable (AD prediction errors) overwhelmingly correspond to instances in subspaces (cohorts) with the highest prediction error rates, highlighting the inhomogeneity of the AD space. In this sense, we call for more stringent AD analysis guidelines which require the incorporation of model error analysis schemes, to provide critical insight on the reliability of underlying AD algorithms. Additionally, any selected AD method should be rigorously validated to demonstrate its suitability for the model space over which it is applied. These steps will ultimately contribute to more accurate estimations of the reliability of model predictions. Finally, error analysis may also be useful in "rational" model refinement in that data expansion efforts and model retraining are focused on cohorts with the highest error rates.


Subject(s)
Algorithms , Quantitative Structure-Activity Relationship , Reproducibility of Results
3.
Int J Mol Sci ; 25(11)2024 May 25.
Article in English | MEDLINE | ID: mdl-38891933

ABSTRACT

The role of the gut microbiota and its interplay with host metabolic health, particularly in the context of type 2 diabetes mellitus (T2DM) management, is garnering increasing attention. Dipeptidyl peptidase 4 (DPP4) inhibitors, commonly known as gliptins, constitute a class of drugs extensively used in T2DM treatment. However, their potential interactions with gut microbiota remain poorly understood. In this study, we employed computational methodologies to investigate the binding affinities of various gliptins to DPP4-like homologs produced by intestinal bacteria. The 3D structures of DPP4 homologs from gut microbiota species, including Segatella copri, Phocaeicola vulgatus, Bacteroides uniformis, Parabacteroides merdae, and Alistipes sp., were predicted using computational modeling techniques. Subsequently, molecular dynamics simulations were conducted for 200 ns to ensure the stability of the predicted structures. Stable structures were then utilized to predict the binding interactions with known gliptins through molecular docking algorithms. Our results revealed binding similarities of gliptins toward bacterial DPP4 homologs compared to human DPP4. Specifically, certain gliptins exhibited similar binding scores to bacterial DPP4 homologs as they did with human DPP4, suggesting a potential interaction of these drugs with gut microbiota. These findings could help in understanding the interplay between gliptins and gut microbiota DPP4 homologs, considering the intricate relationship between the host metabolism and microbial communities in the gut.


Subject(s)
Diabetes Mellitus, Type 2 , Dipeptidyl Peptidase 4 , Dipeptidyl-Peptidase IV Inhibitors , Gastrointestinal Microbiome , Humans , Bacteria/metabolism , Bacterial Proteins/metabolism , Bacterial Proteins/chemistry , Binding Sites , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/drug therapy , Dipeptidyl Peptidase 4/metabolism , Dipeptidyl Peptidase 4/chemistry , Dipeptidyl-Peptidase IV Inhibitors/pharmacology , Molecular Docking Simulation , Molecular Dynamics Simulation , Protein Binding
4.
Mol Divers ; 2023 Apr 05.
Article in English | MEDLINE | ID: mdl-37017875

ABSTRACT

The ubiquitin-proteasome system (UPS) is a highly regulated mechanism of intracellular protein degradation and turnover. The UPS is involved in different biological activities, such as the regulation of gene transcription and the cell cycle. Several researchers have applied cheminformatics and artificial intelligence methods to study the inhibition of proteasomes, including the prediction of UPP inhibitors. Following this idea, we applied a new tool for obtaining molecular descriptors (MDs) for modeling proteasome inhibition in terms of EC50 (µmol/L), in which a set of new MDs called atomic weighted vectors (AWV) and several prediction algorithms were used in cheminformatics studies. In the manuscript, a set of descriptors based on AWV is presented as datasets for training different machine learning techniques, such as linear regression, multiple linear regression (MLR), random forest (RF), K-nearest neighbors (IBK), multi-layer perceptron, best-first search, and genetic algorithm. The results suggest that these atomic descriptors allow adequate modeling of proteasome inhibitors regardless of the artificial intelligence technique used, as a variant to build efficient models for the prediction of inhibitory activity.

5.
Int J Mol Sci ; 23(3)2022 Jan 31.
Article in English | MEDLINE | ID: mdl-35163573

ABSTRACT

Inflammasomes are multiprotein complexes that represent critical elements of the inflammatory response. The dysregulation of the best-characterized complex, the NLRP3 inflammasome, has been linked to the pathogenesis of diseases such as multiple sclerosis, type 2 diabetes mellitus, Alzheimer's disease, and cancer. While there exist molecular inhibitors specific for the various components of inflammasome complexes, no currently reported inhibitors specifically target NLRP3PYD homo-oligomerization. In the present study, we describe the identification of QM380 and QM381 as NLRP3PYD homo-oligomerization inhibitors after screening small molecules from the MyriaScreen library using a split-luciferase complementation assay. Our results demonstrate that these NLRP3PYD inhibitors interfere with ASC speck formation, inhibit pro-inflammatory cytokine IL-1β release, and decrease pyroptotic cell death. We employed spectroscopic techniques and computational docking analyses with QM380 and QM381 and the PYD domain to confirm the experimental results and predict possible mechanisms underlying the inhibition of NLRP3PYD homo-interactions.


Subject(s)
Anti-Inflammatory Agents , NLR Family, Pyrin Domain-Containing 3 Protein , Protein Multimerization/drug effects , Pyroptosis/drug effects , Anti-Inflammatory Agents/chemistry , Anti-Inflammatory Agents/pharmacology , HEK293 Cells , Humans , NLR Family, Pyrin Domain-Containing 3 Protein/antagonists & inhibitors , NLR Family, Pyrin Domain-Containing 3 Protein/chemistry , NLR Family, Pyrin Domain-Containing 3 Protein/genetics , NLR Family, Pyrin Domain-Containing 3 Protein/metabolism
6.
Proteins ; 89(2): 174-184, 2021 02.
Article in English | MEDLINE | ID: mdl-32881068

ABSTRACT

We present a novel Java-based program named PeptiDesCalculator for computing peptide descriptors. These descriptors include: redefinitions of known protein parameters to suit the peptide domain, generalization schemes for the global descriptions of peptide characteristics, as well as empirical descriptors based on experimental evidence on peptide stability and interaction propensity. The PeptiDesCalculator software provides a user-friendly Graphical User Interface (GUI) and is parallelized to maximize the use of computational resources available in current workstations. The PeptiDesCalculator indices are employed in modeling 8 peptide bioactivity endpoints, demonstrating satisfactory behavior. Moreover, we compare the performance of a support vector machine (SVM) classifier built using 15 PeptiDesCalculator indices with that of a recently reported deep neural network (DNN) antimicrobial activity classifier, demonstrating comparable test set performance notwithstanding the remarkably lower degree of freedom for the former. This software will facilitate the development of in silico models for the prediction of peptide properties.


Subject(s)
Peptides/chemistry , Peptides/pharmacology , Software , Support Vector Machine , Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/pharmacology , Antifungal Agents/chemistry , Antifungal Agents/pharmacology , Antineoplastic Agents/chemistry , Antineoplastic Agents/pharmacology , Antiviral Agents/chemistry , Antiviral Agents/pharmacology , Candida albicans/drug effects , HIV Infections/drug therapy , Hepatitis C/drug therapy , Humans , Listeria monocytogenes/drug effects , Neoplasms/drug therapy , Neural Networks, Computer , Peptide Mapping , Peptides/genetics , Peptides/metabolism , Protein Stability , Pseudomonas aeruginosa/drug effects
7.
Mol Divers ; 25(3): 1425-1438, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34258685

ABSTRACT

Scientific and consumer interest in healthy foods (also known as functional foods), nutraceuticals and cosmeceuticals has increased in recent years, leading to an increased presence of these products in the market. However, the regulations across different countries that define the type of claims that may be made, and the degree of evidence required to support these claims, are rather inconsistent. Moreover, there is also controversy over the effectiveness and biological mode of action of many of these products, which should undergo an exhaustive approval process to guarantee consumer rights. Computational approaches constitute invaluable tools to facilitate the discovery of bioactive molecules and provide biological plausibility on the mode of action of these products. Indeed, methodologies like QSAR, docking or molecular dynamics have been used in drug discovery protocols for decades and can now aid in the discovery of bioactive food components. Thanks to these approaches, it is possible to search for new functions in food constituents, which may be part of our daily diet, and help to prevent disorders like diabetes, hypercholesterolemia or obesity. In the present manuscript, computational studies applied to this field are reviewed to illustrate the potential of these approaches to guide the first screening steps and the mechanistic studies of nutraceutical, cosmeceutical and functional foods.


Subject(s)
Cheminformatics/methods , Cosmeceuticals/chemistry , Dietary Supplements/analysis , Functional Food/analysis , Models, Molecular , Quantitative Structure-Activity Relationship , Algorithms , Cosmeceuticals/pharmacology , Databases, Chemical , Humans , Machine Learning , Molecular Docking Simulation , Molecular Dynamics Simulation
8.
J Chem Inf Model ; 60(7): 3534-3545, 2020 07 27.
Article in English | MEDLINE | ID: mdl-32589419

ABSTRACT

Over the past few decades, virtual high-throughput screening (vHTS) and molecular dynamics simulations have become effective and widely used tools in the initial stages of drug discovery efforts. These methods allow a great number of druglike molecules to be screened quickly and inexpensively. Unfortunately, however, the accuracies of both these methods rely on the quality of the underlying molecular mechanics force fields (FFs), which are often poor. This major weakness originates from the reliance of FFs on a finite list of specific parameters, called atom types, which have low transferability between molecules. In particular, the torsional energy barriers of druglike molecules are notoriously difficult to predict. Continuing our endeavor to understand factors affecting the torsional energy barriers of small molecules and quantify them, we showed that descriptors calculated using the extended-Hückel method could be used to rapidly assign accurate torsion parameters for conjugated molecules. This method, called H-TEQ 4.5, was developed using a set of 684 conjugated molecules. It was subsequently validated on a test set of 200 diverse molecules and produced an average root-mean-square error (rmse) of 1.01 kcal·mol⁻¹, with respect to the reference quantum mechanical torsional profiles. For comparison, GAFF2, MMFF94, and MAB produced average rmse's of 3.49, 1.50, and 1.77 kcal·mol⁻¹, respectively. H-TEQ 4.5 is also computationally inexpensive, running just under 0.25 ms for a biphenyl molecule on a home computer, allowing it to be used for vHTS of large libraries of compounds. Overall, H-TEQ 4.5 solved the problems associated with the transferability of torsion parameters for conjugated molecules. This method was incorporated into the Molecular Operating Environment and will be available for a wide variety of applications.
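
The rmse figures quoted above compare a predicted torsional energy profile with a quantum mechanical reference, point by point over the scanned dihedral angles. A minimal sketch of such a comparison follows; the function name is hypothetical, and shifting both profiles so that their minima sit at zero is an assumption (relative energies are what matter), not a detail confirmed by the abstract.

```python
import math


def profile_rmse(predicted, reference):
    """RMSE (same units as the inputs, e.g. kcal/mol) between two torsional profiles.

    Both sequences must hold energies sampled at the same dihedral angles.
    Each profile is shifted so its minimum is zero before comparison (assumption).
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("profiles must be non-empty and of equal length")
    p0, r0 = min(predicted), min(reference)
    squared = [((p - p0) - (r - r0)) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

Averaging this value over every molecule in a test set gives a single per-method figure of the kind reported for H-TEQ 4.5, GAFF2, MMFF94, and MAB.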


Subject(s)
Molecular Dynamics Simulation , Quantum Theory , Physical Phenomena , Static Electricity , Thermodynamics
9.
Mol Divers ; 24(4): 913-932, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31659696

ABSTRACT

In this report, we introduce a set of aggregation operators (AOs) to calculate global and local (group and atom type) molecular descriptors (MDs) as a generalization of the classical approach of molecular encoding using the sum of the atomic (or fragment) contributions. These AOs are implemented in a new and free software named MD-LOVIs ( http://tomocomd.com/md-lovis ), which allows for the calculation of MDs from atomic weight vectors and LOVIs (local vertex invariants). This software was developed in the Java programming language and employs the Chemistry Development Kit (CDK) library for handling chemical structures and the calculation of atomic weights. An analysis of the complexities of the algorithms presented herein demonstrates that these aspects were efficiently implemented. The calculation speed experiments show that the MD-LOVIs software has satisfactory behavior when compared to software such as Padel, CDKDescriptor, DRAGON and Bluecal. Shannon's entropy (SE)-based variability studies demonstrate that MD-LOVIs yields indices with greater information content when compared to those of popular academic and commercial software. A principal component analysis reveals that our approach captures chemical information orthogonal to that codified by the DRAGON, Padel and Mold2 software, as a result of the several generalizations in MD-LOVIs not used in other programs. Lastly, three QSARs were built using multiple linear regression with genetic algorithms, and the statistical parameters of these models demonstrate that the MD-LOVIs indices obtained with AOs yield better performance than those obtained when the summation operator is used exclusively. Moreover, it is also revealed that the MD-LOVIs indices yield models with comparable to superior performance when compared to other QSAR methodologies reported in the literature, despite their simplicity. The studies performed herein collectively demonstrate that the MD-LOVIs software generates indices as simple as possible, but not simpler, and that the use of AOs enhances the diversity of the chemical information codified, which consequently improves the performance of traditional MDs.
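
To illustrate what replacing the plain sum with other aggregation operators means, here is a minimal sketch. The operator set and the names `AGGREGATORS` and `aggregate` are assumptions chosen for illustration; the abstract does not list which operators MD-LOVIs actually implements.

```python
import statistics

# Illustrative operators only; the real MD-LOVIs operator set is not given in the abstract.
AGGREGATORS = {
    "sum": sum,                                      # the classical approach
    "arithmetic_mean": statistics.fmean,
    "geometric_mean": statistics.geometric_mean,     # requires positive values
    "std_dev": statistics.pstdev,
    "max": max,
    "min": min,
}


def aggregate(atom_values, operator="sum"):
    """Collapse per-atom contributions (e.g. LOVIs) into one molecular descriptor."""
    return AGGREGATORS[operator](atom_values)
```

Two hypothetical molecules with atomic contributions `[2.0, 2.0, 2.0]` and `[1.0, 2.0, 3.0]` have the same sum but different standard deviations, which is the kind of extra information a non-summation operator can capture.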


Subject(s)
Models, Chemical , Small Molecule Libraries/chemistry , Algorithms , Linear Models , Multivariate Analysis , Quantitative Structure-Activity Relationship , Software
10.
J Chem Inf Model ; 59(11): 4750-4763, 2019 11 25.
Article in English | MEDLINE | ID: mdl-31589815

ABSTRACT

Applications of computational methods to predict binding affinities for protein/drug complexes are routinely used in structure-based drug discovery. Applications of these methods often rely on empirical force fields (FFs) and their associated parameter sets and atom types. However, it is widely accepted that FFs cannot accurately cover the entire chemical space of drug-like molecules, due to the restrictive cost of parametrization and the poor transferability of existing parameters. To address these limitations, initiatives have been carried out to develop more transferable methods, in order to allow for more rigorous descriptions of any drug-like molecule. We have previously reported H-TEQ, a method which does not rely on atom types and incorporates well established chemical principles to assign parameters to organic molecules. The previous implementation of H-TEQ (a torsional barrier prediction method) only covered saturated and lone pair containing molecules; here, we report our efforts to incorporate conjugated systems into our model. The next step was the evaluation of the introduction of unsaturations. The developed model (H-TEQ3.0) has been validated on a wide variety of molecules containing heteroaromatic groups, alkyls, and fused ring systems. Our method performs on par with one of the most commonly used FFs (GAFF2), without relying on atom types or any prior parametrization.


Subject(s)
Allyl Compounds/chemistry , Benzene Derivatives/chemistry , Drug Discovery , Molecular Conformation , Molecular Dynamics Simulation , Pharmaceutical Preparations/chemistry , Quantum Theory , Thermodynamics
11.
J Chem Inf Model ; 59(11): 4764-4777, 2019 11 25.
Article in English | MEDLINE | ID: mdl-31430147

ABSTRACT

Biaryl molecules are ubiquitous pharmacophores found in natural products and pharmaceuticals. In spite of this, existing molecular mechanics force fields are unable to accurately reproduce their torsional energy profiles, except for a few well-parametrized cases. This effectively limits the ability of structure-based drug design methods to correctly identify hits involving biaryls with confidence (e.g., during virtual screening, employing docking and/or molecular dynamics simulations). Continuing in our endeavor to quantify organic chemistry principles, we showed that the torsional energy profile of biaryl compounds could be computed on-the-fly based on the electron richness/deficiency of the aromatic rings. This method, called H-TEQ 4.0, was developed using a set of 131 biaryls. It was subsequently validated on a separate set of 100 diverse biaryls, including multisubstituted, bicyclic and tricyclic druglike molecules, and produced an average root-mean-square error (RMSE) of 0.95 kcal·mol⁻¹. For comparison, GAFF2 produced an RMSE of 3.88 kcal·mol⁻¹, owing to problems associated with the transferability of torsion parameters. The success of H-TEQ 4.0 provided further evidence that force fields could transition to become atom-type independent, provided that the correct chemical principles are used. Overall, this method solved the problem of transferability of biaryl torsion parameters, while simultaneously improving the overall accuracy of the force field.


Subject(s)
Hydrocarbons, Aromatic/chemistry , Pharmaceutical Preparations/chemistry , Drug Design , Electrons , Models, Chemical , Quantum Theory , Static Electricity , Thermodynamics
12.
J Comput Aided Mol Des ; 33(11): 997-1008, 2019 11.
Article in English | MEDLINE | ID: mdl-31773464

ABSTRACT

Imbalanced datasets, comprising more inactive compounds relative to the active ones, are a common challenge in ligand-based model building workflows for drug discovery. This is particularly true for neglected tropical diseases since efforts to identify therapeutics for these diseases are often limited. In this report, we analyze the performance of several undersampling strategies in modeling the Dengue Virus 2 (DENV2) inhibitory activity, as well as the anti-flaviviral activities for the West Nile (WNV) and Zika (ZIKV) viruses. To this end, we build datasets comprising 1218 (159 actives and 1059 inactives), 1044 (132 actives and 912 inactives) and 302 (75 actives and 227 inactives) molecules with known DENV2, WNV and ZIKV inhibitory activity profiles, respectively. We develop ensemble classifiers for these endpoints and compare the performance of the different undersampling algorithms on external sets. It is observed that data pruning algorithms yield superior performance relative to data selection algorithms. The best overall performance is provided by the one-sided selection algorithm with test set balanced accuracy (BACC) values of 0.84, 0.74 and 0.77 for the DENV2, WNV and ZIKV inhibitory activities, respectively. For the model building, we use the recently proposed GT-STAF information indices, and compare the predictivity of 3 molecular fragmentation approaches: connected subgraphs, substructure and alogp atom types, which are observed to show comparable performance. On the other hand, a combination of indices based on these fragmentation strategies enhances the predictivity of the built ensembles. The built models could be useful for screening new molecules with possible DENV, WNV and ZIKV inhibitory activities. ADMET modelers are encouraged to adopt undersampling algorithms in their workflows when dealing with imbalanced datasets.
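
As a rough illustration of why balanced accuracy and undersampling matter for data like these, the sketch below computes BACC and performs plain random undersampling. Random undersampling is a deliberately simple stand-in, not the one-sided selection algorithm reported as best above, and the function names are hypothetical.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on actives, 1) and specificity (recall on inactives, 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def random_undersample(labels, seed=0):
    """Indices of all actives plus an equally sized random subset of inactives."""
    rng = random.Random(seed)
    actives = [i for i, y in enumerate(labels) if y == 1]
    inactives = [i for i, y in enumerate(labels) if y == 0]
    kept = rng.sample(inactives, k=min(len(actives), len(inactives)))
    return sorted(actives + kept)


# Class sizes taken from the DENV2 dataset described above.
denv2_labels = [1] * 159 + [0] * 1059
balanced_indices = random_undersample(denv2_labels)
```

With these class sizes, a model that labels every compound inactive would reach about 87% plain accuracy but only 0.5 balanced accuracy, which is why BACC is the more informative figure here.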


Subject(s)
Antiviral Agents/pharmacology , Drug Discovery/methods , Flaviviridae/drug effects , Support Vector Machine , Antiviral Agents/chemistry , Dengue Virus/drug effects , Flaviviridae Infections/drug therapy , Humans , West Nile virus/drug effects , Zika Virus/drug effects
13.
J Chem Inf Model ; 58(1): 194-205, 2018 01 22.
Article in English | MEDLINE | ID: mdl-29253333

ABSTRACT

We previously implemented a well-known qualitative chemical principle into an accurate quantitative model computing relative potential energies of conformers. According to this principle, hyperconjugation strength correlates with electronegativity of donors and acceptors. While this earlier version of our model applies to σ bonds, lone pairs, disregarded in this earlier version, also have a major impact on the conformational preferences of molecules. Among the well-established principles used by organic chemists to rationalize some organic chemical behaviors are the anomeric effect, the alpha effect, basicity, and nucleophilicity. These effects are directly related to the presence of lone pairs. We report herein our effort to incorporate lone pairs into our model to extend its applicability domain to any saturated small molecules. The developed model H-TEQ 2 has been validated on a wide variety of molecules from polyaromatic molecules to carbohydrates and molecules with high heteroatoms/carbon ratios. Interestingly, this method, in contrast to common force field-based methods, does not rely on atom types and is virtually applicable to any organic molecules.


Subject(s)
Molecular Dynamics Simulation , Small Molecule Libraries/chemistry , Hydrogen Bonding , Models, Chemical , Molecular Conformation , Quantum Theory , Thermodynamics
14.
Ecotoxicol Environ Saf ; 135: 130-136, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27723465

ABSTRACT

Biomagnification of organic pollutants in food webs has usually been associated with hydrophobicity and other molecular descriptors. However, direct information on atoms and substituent positions in a molecular scaffold that most affect this biological property is not straightforward using traditional QSPR techniques. This work reports the QSPR modeling of biomagnification factors (logBMF) of a series of aromatic organochlorine compounds using three MIA-QSPR (multivariate image analysis applied to QSPR) approaches. The MIA-QSPR model based on augmented molecular images (described with atoms represented as circles with sizes proportional to the respective van der Waals radii and having colors numerically proportional to Pauling's electronegativity) encoded the logBMF data better. The average results for the main statistical parameters used to attest the model's predictability were $r^2 = 0.85$, $q^2 = 0.72$ and $r^2_{\text{test}} = 0.85$. In addition, chemical insights on substituents and respective positions at the biphenyl rings A and B, and dibenzo-p-dioxin and dibenzofuran motifs are given to aid the design of more ecofriendly derivatives.


Subject(s)
Hydrocarbons, Chlorinated/chemistry , Hydrophobic and Hydrophilic Interactions , Multivariate Analysis , Quantitative Structure-Activity Relationship
15.
Bioinformatics ; 31(15): 2553-9, 2015 Aug 01.
Article in English | MEDLINE | ID: mdl-25819673

ABSTRACT

MOTIVATION: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. RESULTS: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are included in CAMP_Patent. However, the majority of databases have their own set of unique sequences, as well as some overlap with other databases. The complete set of non-duplicate sequences comprises 16 990 cases, which is almost half of the total number of reported peptides. On the other hand, the diversity analysis identifies the most and least diverse databases and proves that all databases exhibit some level of redundancy. Finally, we present a new parallel-free software, named Dover Analyzer, developed to compute the overlap and diversity between any number of databases and compile a set of non-redundant sequences. These results are useful for selecting or building a suitable representative set of AMPs, according to specific needs.
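
The overlap part of such an analysis reduces to set operations on sequences. The sketch below is an assumption-laden illustration, not the Dover Analyzer implementation: it treats two entries as duplicates only when their sequences match exactly after upper-casing, and the database names and sequences in the usage example are toy data.

```python
def overlap_report(databases):
    """Return (non_redundant, unique_per_db) for a dict of name -> iterable of sequences.

    Duplicates are exact matches after stripping whitespace and upper-casing (assumption).
    """
    sets = {name: {s.strip().upper() for s in seqs} for name, seqs in databases.items()}
    # Union of everything: each distinct sequence counted once.
    non_redundant = set().union(*sets.values())
    unique_per_db = {}
    for name, seqs in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        unique_per_db[name] = seqs - others  # sequences found in no other database
    return non_redundant, unique_per_db


# Toy databases with one shared sequence.
toy = {
    "db_a": ["GIGKFLHSAKKFGKAFVGEIMNS", "KWKLFKKI"],
    "db_b": ["kwklfkki", "ILPWKWPWWPWRR"],
}
non_redundant, unique_per_db = overlap_report(toy)
```

A database whose `unique_per_db` entry is empty is fully contained in the others, as reported above for LAMP_Patent with respect to CAMP_Patent. Measuring diversity among similar but non-identical sequences would need a sequence-similarity step that this sketch omits.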


Subject(s)
Antimicrobial Cationic Peptides/chemistry , Databases, Nucleic Acid , Databases, Protein , Sequence Analysis, Protein/methods , Software , Algorithms , Humans
16.
Int J Mol Sci ; 17(6)2016 Jun 07.
Article in English | MEDLINE | ID: mdl-27338348

ABSTRACT

A quantitative structure-activity relationship (QSAR) study of the 2,2-diphenyl-1-picrylhydrazyl (DPPH•) radical scavenging ability of 1373 chemical compounds, using DRAGON molecular descriptors (MD) and a neural network technique based on the multilayer perceptron (MLP), was developed. The built model demonstrated a satisfactory performance for the training ($R^2 = 0.713$) and test set ($Q^2_{\text{ext}} = 0.654$), respectively. To gain greater insight on the relevance of the MD contained in the MLP model, sensitivity and principal component analyses were performed. Moreover, structural and mechanistic interpretation was carried out to comprehend the relationship of the variables in the model with the modeled property. The constructed MLP model was employed to predict the radical scavenging ability for a group of coumarin-type compounds. Finally, in order to validate the model's predictions, an in vitro assay for one of the compounds (4-hydroxycoumarin) was performed, showing a satisfactory proximity between the experimental and predicted pIC50 values.


Subject(s)
Biphenyl Compounds/chemistry , Computer Simulation , Free Radical Scavengers/chemistry , Models, Theoretical , Picrates/chemistry , Antioxidants/chemistry , Antioxidants/pharmacology , Biphenyl Compounds/antagonists & inhibitors , Coumarins/chemistry , Coumarins/pharmacology , Free Radical Scavengers/pharmacology , Picrates/antagonists & inhibitors , Quantitative Structure-Activity Relationship
17.
Int J Mol Sci ; 17(6)2016 May 27.
Article in English | MEDLINE | ID: mdl-27240357

ABSTRACT

This report examines the interpretation of the Graph Derivative Indices (GDIs) from three different perspectives (i.e., in structural, steric and electronic terms). It is found that the individual vertex frequencies may be expressed in terms of the geometrical and electronic reactivity of the atoms and bonds, respectively. On the other hand, it is demonstrated that the GDIs are sensitive to progressive structural modifications in terms of: size, ramifications, electronic richness, conjugation effects and molecular symmetry. Moreover, it is observed that the GDIs quantify the interaction capacity among molecules and codify information on the activation entropy. A structure-property relationship study reveals that there exists a direct correspondence between the individual frequencies of atoms and Hückel's Free Valence, as well as between the atomic GDIs and the chemical shift in NMR, which collectively validates the theory that these indices codify steric and electronic information of the atoms in a molecule. Taking into consideration the regularity and coherence found in experiments performed with the GDIs, it is possible to say that GDIs possess a plausible interpretation in structural and physicochemical terms.


Subject(s)
Pharmaceutical Preparations/chemistry , Algorithms , Computer Graphics , Drug Design , Entropy
18.
J Comput Chem ; 36(23): 1748-55, 2015 Sep 05.
Article in English | MEDLINE | ID: mdl-26119527

ABSTRACT

For a decade, the multivariate image analysis applied to quantitative structure-activity relationship (MIA-QSAR) approach has been successfully used in the modeling of several chemical and biological properties of chemical compounds. However, the key pitfall of this method has been its exclusive applicability to congeneric datasets due to the prerequisite of aligning the chemical images with respect to the basic molecular scaffold. The present report aims to explore the use of the 2D-discrete Fourier transform (2D-DFT) as a means of opening the way to the modeling, for the first time, of structurally diverse noncongruent chemical images. The usability of the 2D-DFT in QSAR modeling of noncongruent chemical compounds is assessed using a structurally diverse dataset of 100 compounds with reported inhibitory activity against the MCF-7 human breast cancer cell line. An analysis of the statistical parameters of the built regression models validates their robustness and high predictive power. Additionally, a comparison of the results obtained with the 2D-DFT MIA-QSAR approach with those of the DRAGON molecular descriptors is performed, revealing superior performance for the former. This result represents a milestone in the MIA-QSAR context, as it opens the way for the possibility of screening for new molecular entities with the desired chemical or therapeutic utility.
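
One property that may explain why a Fourier representation relaxes the alignment requirement is that the magnitude of the 2D-DFT does not change when an image is translated circularly. The abstract does not say which part of the transform the authors use as descriptors, so the sketch below is only a demonstration of that general property on a toy 4 x 4 "image"; all names and data are illustrative.

```python
import cmath


def dft2_magnitude(image):
    """Magnitude of the 2D discrete Fourier transform of a small grayscale image."""
    rows, cols = len(image), len(image[0])
    spectrum = []
    for u in range(rows):
        row = []
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = -2 * cmath.pi * (u * x / rows + v * y / cols)
                    total += image[x][y] * cmath.exp(1j * angle)
            row.append(abs(total))
        spectrum.append(row)
    return spectrum


def circular_shift(image, dx, dy):
    """Translate the image by (dx, dy) pixels, wrapping around the edges."""
    rows, cols = len(image), len(image[0])
    return [[image[(x - dx) % rows][(y - dy) % cols] for y in range(cols)]
            for x in range(rows)]


# Toy image: the same small "structure" drawn at two different positions.
original = [[1, 2, 0, 0],
            [3, 4, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
shifted = circular_shift(original, 1, 2)
```

The two magnitude spectra agree to within floating-point error, even though the pixel values of `original` and `shifted` differ position by position. The direct quadruple loop is used for clarity; a fast Fourier transform would be the practical choice for real images.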

19.
J Theor Biol ; 374: 125-37, 2015 Jun 07.
Article in English | MEDLINE | ID: mdl-25843214

ABSTRACT

In the present study, we introduce novel 3D protein descriptors based on the bilinear algebraic form in the ℝⁿ space on the coulombic matrix. For the calculation of these descriptors, macromolecular vectors belonging to the ℝⁿ space, whose components represent certain amino acid side-chain properties, were used as weighting schemes. Generalization approaches for the calculation of inter-amino acid residue spatial distances based on Minkowski metrics are proposed. The simple- and double-stochastic schemes were defined as approaches to normalize the coulombic matrix. Local-fragment indices for both amino acid types and amino acid groups are presented in order to characterize fragments of interest in proteins. In addition, geometric and topological cut-offs are defined so that specific interactions among amino acids can be taken into account in global or local indices. To assess the utility of the global and local indices, a classification model for the prediction of the four major protein structural classes was built with the Linear Discriminant Analysis (LDA) technique. The developed LDA model correctly classifies 92.6% and 92.7% of the proteins in the training and test sets, respectively. The model showed high values of the generalized square correlation coefficient (GC²) on both the training and test series. The statistical parameters derived from the internal and external validation procedures demonstrate the robustness, stability and high predictive power of the proposed model. The performance of the LDA model demonstrates the capability of the proposed indices not only to codify relevant biochemical information related to the structural classes of proteins, but also to yield suitable interpretability. It is anticipated that the current method will benefit the prediction of other protein attributes or functions.


Subject(s)
Computational Biology/methods , Macromolecular Substances/chemistry , Protein Conformation , Proteins/chemistry , Algorithms , Amino Acids/chemistry , Computer Simulation , Linear Models , Models, Biological , Models, Molecular , Quantitative Structure-Activity Relationship , Reproducibility of Results , Stochastic Processes
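
The distance generalization and the two normalization schemes mentioned in this abstract can be sketched as follows, under stated assumptions. The Minkowski distance is the standard formula. The "simple-stochastic" scheme is taken here to mean row normalization and the "double-stochastic" scheme is approximated with alternating row/column scaling (Sinkhorn-Knopp); both readings are assumptions, as are the toy coordinates and the plain inverse-distance matrix used as a stand-in for the coulombic matrix.

```python
def minkowski_distance(a, b, p=2.0):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def inverse_distance_matrix(points, p=2.0):
    """Toy stand-in for a coulombic matrix: 1/d off the diagonal, 0 on it."""
    n = len(points)
    return [[0.0 if i == j else 1.0 / minkowski_distance(points[i], points[j], p)
             for j in range(n)] for i in range(n)]


def simple_stochastic(matrix):
    """Divide every row by its sum so that each row adds up to 1."""
    return [[value / sum(row) for value in row] for row in matrix]


def double_stochastic(matrix, iterations=100):
    """Alternate row and column scaling until rows and columns both sum to ~1."""
    n = len(matrix)
    scaled = [row[:] for row in matrix]
    for _ in range(iterations):
        scaled = simple_stochastic(scaled)
        col_sums = [sum(scaled[i][j] for i in range(n)) for j in range(n)]
        scaled = [[scaled[i][j] / col_sums[j] for j in range(n)]
                  for i in range(n)]
    return scaled


# Hypothetical 3D coordinates for four residues (arbitrary units).
residues = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0)]

matrix = inverse_distance_matrix(residues, p=2.0)
simple = simple_stochastic(matrix)
double = double_stochastic(matrix)

print([round(sum(row), 6) for row in simple])
print([round(sum(double[i][j] for i in range(4)), 6) for j in range(4)])
```

Changing `p` swaps the metric without touching the normalization step, which is the practical appeal of separating the two stages.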
20.
Mol Divers ; 19(2): 305-19, 2015 May.
Article in English | MEDLINE | ID: mdl-25620721

ABSTRACT

The features and theoretical background of a new and free computational program for chemometric analysis named IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches each. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. In addition, a generalization scheme for the previously defined differential Shannon's entropy is discussed, and the Jeffreys information measure is introduced for supervised feature selection. Moreover, well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty, are incorporated into the IMMAN software ( http://mobiosd-hub.com/imman-soft/ ), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing. Moreover, single-parameter or ensemble (multi-criteria) ranking options are provided. Consequently, this software is suitable for tasks like dimensionality reduction, feature ranking, and comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between the IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA supervised algorithms.
Graphical abstract: Graphic representation for Shannon's distribution of MD calculating software.


Subject(s)
Models, Theoretical , Software , Algorithms
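
The three supervised parameters named in this abstract (information gain, gain ratio, symmetrical uncertainty) have standard textbook definitions, and a minimal sketch of them after equal-interval discretization is given below. This is not IMMAN's code (IMMAN is a Java program); the function names, the choice of two bins, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(values).values())


def equal_interval_bins(values, n_bins):
    """Map numeric values to bin indices using bins of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        return [0] * len(values)
    # The maximum value would fall into bin n_bins, so clamp it to the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def information_gain(bins, labels):
    """H(class) - H(class | feature)."""
    total = len(labels)
    conditional = 0.0
    for b in set(bins):
        subset = [y for x, y in zip(bins, labels) if x == b]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(bins, labels):
    """Information gain divided by the entropy of the feature itself."""
    split = entropy(bins)
    return information_gain(bins, labels) / split if split > 0 else 0.0


def symmetrical_uncertainty(bins, labels):
    """2 * IG / (H(feature) + H(class)), which lies in [0, 1]."""
    denominator = entropy(bins) + entropy(labels)
    return 2.0 * information_gain(bins, labels) / denominator if denominator > 0 else 0.0


# Hypothetical data: one informative and one uninformative descriptor.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
descriptors = {
    "informative": [1, 2, 3, 4, 6, 7, 8, 9],
    "noise": [1, 9, 2, 8, 3, 7, 4, 6],
}

scores = {}
for name, values in descriptors.items():
    bins = equal_interval_bins(values, n_bins=2)
    scores[name] = (information_gain(bins, labels),
                    gain_ratio(bins, labels),
                    symmetrical_uncertainty(bins, labels))

# Rank descriptors by symmetrical uncertainty, highest first.
ranking = sorted(scores, key=lambda name: scores[name][2], reverse=True)
print(ranking)
```

With these toy values the informative descriptor separates the two classes perfectly after binning, while the noise descriptor carries no class information, so the ranking puts the former first.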