Results 1 - 20 of 57
1.
Mol Inform ; 43(7): e202400018, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38803302

ABSTRACT

The growing interest in chemoinformatic model uncertainty calls for a summary of the most widely used regression techniques and how to estimate their reliability. Regression models learn a mapping from the space of explanatory variables to the space of continuous output values. Among other limitations, the predictive performance of the model is restricted by the training data used for model fitting. Identification of unusual objects by outlier detection methods can improve model performance. Additionally, proper model evaluation necessitates defining the limitations of the model, often called the applicability domain. Comparable to certain classifiers, some regression techniques come with built-in methods or augmentations to quantify their (un)certainty, while others rely on generic procedures. The theoretical background of their working principles and how to deduce specific and general definitions for their domain of applicability shall be explained.


Subject(s)
Cheminformatics, Cheminformatics/methods, Regression Analysis
2.
J Pharm Sci ; 112(9): 2404-2411, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37295605

ABSTRACT

Understanding binding related changes in antibody conformations is important for epitope prediction and antibody refinement. The increase of available data in the PDB allowed a more detailed investigation of the conformational landscape for free and bound antibodies. A dataset containing a total of 835 unique PDB entries of antibodies that were crystallized in complex with their antigen and in a free state was constructed. It was examined for binding related conformation changes. We present further evidence supporting the theory of a pre-existing-equilibrium in experimental data. Multiple sequence alignments did not show binding induced tendencies in the solvent accessibility of residues in any specific position. Evaluating the changes in solvent accessibility per residue revealed a certain binding induced increase for several amino acids. Antibody-antigen interaction statistics were established and quantify a significant directional asymmetry between many interacting antibody and antigen residue pairs, especially a richness in tyrosine in the antibody epitope compared to its paratope. This asymmetry could potentially facilitate an increase in the success rate of computationally guided antibody refinement.


Subject(s)
Antibodies, Antigens, Epitopes, Antibody Binding Sites, Molecular Conformation, Protein Conformation
3.
J Cheminform ; 15(1): 49, 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37118768

ABSTRACT

It is insightful to report, in addition to the prediction itself, an estimator that describes how certain a model is in that prediction. For regression tasks, most approaches implement a variation of the ensemble method, apart from a few exceptions. Instead of a single estimator, a group of estimators yields several predictions for an input. The uncertainty can then be quantified by measuring the disagreement between the predictions, for example by the standard deviation. In theory, ensembles should not only provide uncertainties, they should also boost the predictive performance by reducing errors arising from variance. Despite the development of novel methods, ensembles are still considered the "gold standard" for quantifying the uncertainty of regression models. Subsampling-based methods to obtain ensembles can be applied to all models, regardless of whether they are related to deep learning or traditional machine learning. However, little attention has been given to the question of whether the ensemble method is applicable to virtually all scenarios occurring in the field of cheminformatics. In a broad and diversified study, ensembles are evaluated for 32 datasets of different sizes and modeling difficulty, ranging from physicochemical properties to biological activities. For increasing ensemble sizes with up to 200 members, the predictive performance as well as the applicability as uncertainty estimator are shown for all combinations of five modeling techniques and four molecular featurizations. Useful recommendations were derived for practitioners regarding the success and minimum size of ensembles, depending on whether predictive performance or uncertainty quantification is of more importance for the task at hand.
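
Illustrative note (not part of the record): the ensemble idea described in this abstract can be sketched in a few lines. The sketch below assumes a bootstrap-resampled ensemble of one-dimensional least-squares lines; the function names `fit_line`, `bootstrap_ensemble` and `predict_with_uncertainty` are hypothetical and chosen only for this example, and the study itself evaluates other model types and featurizations.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bootstrap_ensemble(xs, ys, n_members=50, seed=0):
    """Fit one line per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return members


def predict_with_uncertainty(members, x):
    """Ensemble mean as prediction, member standard deviation as uncertainty."""
    preds = [a + b * x for a, b in members]
    return statistics.fmean(preds), statistics.stdev(preds)
```

Under these assumptions, a query far outside the training range would be expected to receive a larger standard deviation than a query near the centre of the data, because the resampled slopes disagree more strongly when extrapolated.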

4.
Molecules ; 26(21)2021 Oct 28.
Article in English | MEDLINE | ID: mdl-34770921

ABSTRACT

Uncertainty measures estimate the reliability of a predictive model. Especially in the field of molecular property prediction as part of drug design, model reliability is crucial. Besides other techniques, Random Forests have a long tradition in machine learning related to chemoinformatics and are widely used. Random Forests consist of an ensemble of individual regression models, namely decision trees, and therefore provide an uncertainty measure by construction. Regarding the disagreement of single-model predictions, a narrower distribution of predictions is interpreted as a higher reliability. The standard deviation of the decision tree ensemble predictions is the default uncertainty measure for Random Forests. Due to the increasing application of machine learning in drug design, there is a constant search for novel uncertainty measures that, ideally, outperform classical uncertainty criteria. When analyzing Random Forests, it appears obvious to consider the variance of the dependent variables within each terminal decision tree leaf to obtain predictive uncertainties. Here, predictions that arise from more leaves of high variance are considered less reliable. As expected, the number of such high-variance leaves yields a reasonable uncertainty measure. Depending on the dataset, it can also outperform ensemble uncertainties. However, small-scale comparisons, i.e., considering only a few datasets, are insufficient, since they are more prone to chance correlations. Therefore, large-scale estimations are required to make general claims about the performance of uncertainty measures. On several chemoinformatic regression datasets, high-variance leaves are compared to the standard deviation of ensemble predictions. It turns out that high-variance leaf uncertainty is meaningful but not superior to the default ensemble standard deviation. A brief possible explanation is offered.
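
Illustrative note (not part of the record): the two uncertainty measures compared in this abstract can be contrasted with a small sketch. It assumes that, for one query compound, the training targets of the terminal leaf reached in each tree have already been extracted into a list of lists; the function names and the variance threshold are hypothetical choices for this example, not the authors' definitions.

```python
import statistics


def tree_predictions(leaves):
    """Each tree predicts the mean target of the leaf the query falls into."""
    return [statistics.fmean(leaf) for leaf in leaves]


def ensemble_std(leaves):
    """Default measure: standard deviation of the per-tree predictions."""
    return statistics.pstdev(tree_predictions(leaves))


def high_variance_leaf_count(leaves, threshold):
    """Alternative measure: number of reached leaves whose target variance
    exceeds an (assumed, user-chosen) threshold."""
    return sum(
        1
        for leaf in leaves
        if len(leaf) > 1 and statistics.pvariance(leaf) > threshold
    )


# One inner list per tree: training targets in the leaf reached by the query.
homogeneous = [[1.0, 1.1], [0.9, 1.0], [1.0, 1.0]]
heterogeneous = [[0.0, 2.0], [1.0, 3.0], [0.5, 0.5]]
print(ensemble_std(homogeneous), high_variance_leaf_count(homogeneous, 0.5))
print(ensemble_std(heterogeneous), high_variance_leaf_count(heterogeneous, 0.5))
```

The two measures look at different things: the ensemble standard deviation captures disagreement between trees, while the leaf count captures heterogeneity inside each reached leaf.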

5.
Molecules ; 26(5)2021 Mar 08.
Article in English | MEDLINE | ID: mdl-33800445

ABSTRACT

In this study, the general processability of cannabidiol (CBD) in colloidal lipid carriers was investigated. Due to its many pharmacological effects, the pharmaceutical use of this poorly water-soluble drug is currently under intensive research and colloidal lipid emulsions are a well-established formulation option for such lipophilic substances. To obtain a better understanding of the formulability of CBD in lipid emulsions, different aspects of CBD loading and its interaction with the emulsion droplets were investigated. Very high drug loads (>40% related to lipid content) could be achieved in emulsions of medium chain triglycerides, rapeseed oil, soybean oil and trimyristin. The maximum CBD load depended on the type of lipid matrix. CBD loading increased the particle size and the density of the lipid matrix. The loading capacity of a trimyristin emulsion for CBD was superior to that of a suspension of solid lipid nanoparticles based on trimyristin (69% vs. 30% related to the lipid matrix). In addition to its localization within the lipid core of the emulsion droplets, cannabidiol was associated with the droplet interface to a remarkable extent. According to a stress test, CBD destabilized the emulsions, with phospholipid-stabilized emulsions being more stable than poloxamer-stabilized ones. Furthermore, it was possible to produce emulsions with pure CBD as the dispersed phase, since CBD demonstrated such a pronounced supercooling tendency that it did not recrystallize, even if cooled to -60 °C.


Subject(s)
Cannabidiol/chemistry, Drug Delivery Systems/methods, Lipid Droplets/chemistry, Cannabidiol/isolation & purification, Drug Carriers/chemistry, Emulsifying Agents/chemistry, Emulsions/chemistry, Nanoparticles/chemistry, Particle Size, Phospholipids/chemistry, Rapeseed Oil/chemistry, Soybean Oil/chemistry, Water
6.
Metallomics ; 11(3): 533-545, 2019 Mar 20.
Article in English | MEDLINE | ID: mdl-30516775

ABSTRACT

Gold complexes with N-heterocyclic carbene (NHC) ligands have been attracting major attention in medicinal inorganic chemistry based on their favorable antiproliferative effects and the structural versatility of the coordinated NHC ligands. Here we present a novel complex of the type (NHC)2Au+, which represents a substantially improved and selective TrxR inhibitor compared to close structural analogues. The complex is highly stable in various solutions over 96 hours; however, comparative cellular uptake studies indicate metabolic transformations inside cells over time. A portfolio of other gold complexes (e.g. Auranofin) has been used as references in key biological assays, showing that the novel (NHC)2Au+ complex exhibits substantially lower protein binding in combination with a strongly enhanced cytotoxic activity.


Subject(s)
Antineoplastic Agents, Enzyme Inhibitors, Gold/chemistry, Methane/analogs & derivatives, Thioredoxin-Disulfide Reductase/antagonists & inhibitors, Antineoplastic Agents/chemistry, Antineoplastic Agents/pharmacology, Apoptosis/drug effects, Tumor Cell Line, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, Heterocyclic Compounds/chemistry, Humans, Methane/chemistry, Molecular Models, Protein Binding
7.
J Chem Inf Model ; 58(1): 165-181, 2018 Jan 22.
Article in English | MEDLINE | ID: mdl-29172519

ABSTRACT

A novel alignment-free molecular descriptor called xMaP (flexible MaP descriptor) is introduced. The descriptor is the advancement of the previously published translationally and rotationally invariant three-dimensional (3D) descriptor MaP (mapping property distributions onto the molecular surface) to the fourth dimension (4D). Unlike MaP, xMaP is also independent of the chosen starting conformation of the encoded molecules and is therefore entirely alignment-free. This is achieved by using ensembles of conformers, which are generated by conformational searches. This step of the procedure is similar to Hopfinger's 4D quantitative structure-activity relationship (QSAR). A five-step procedure is used to compute the xMaP descriptor. First, a conformational search for each molecule is carried out. Next, for each of the conformers an approximation to the molecular surface with equally distributed surface points is computed. Third, molecular properties are projected onto this surface. Fourth, areas of identical properties are clustered into so-called patches. Fifth, the spatial distribution of the patches is converted into an alignment-free descriptor that is based on the entire conformer ensemble. The resulting descriptor can be interpreted by superimposing the most important descriptor variables and the molecules of the data set. The most important descriptor variables are identified with chemometric regression tools. The novel descriptor was applied to several benchmark data sets and was compared to other descriptors and QSAR techniques comprising a binary fingerprint, a topological pharmacophore descriptor (Cats2D), and the field-based 3D-QSAR technique GRID/PLS, which is alignment-dependent. The use of conformer ensembles renders xMaP very robust. It turns out that xMaP performs very well on (almost) all data sets and that the statistical results are comparable to GRID/PLS. In addition, xMaP can be used to efficiently visualize the derived quantitative structure-activity relationships.


Subject(s)
Quantitative Structure-Activity Relationship, Algorithms, Hydrophobic and Hydrophilic Interactions, Molecular Models, Molecular Structure, Reproducibility of Results, Surface Properties
8.
Eur J Pharm Biopharm ; 126: 40-56, 2018 May.
Article in English | MEDLINE | ID: mdl-28532676

ABSTRACT

Low aqueous solubility of active pharmaceutical ingredients presents a serious challenge in the development process of new drug products. This article provides an overview on some of the current approaches for the formulation of poorly water-soluble drugs with a special focus on strategies pursued at the Center of Pharmaceutical Engineering of the TU Braunschweig. These comprise formulation in lipid-based colloidal drug delivery systems and experimental as well as computational approaches towards the efficient identification of the most suitable carrier systems. For less lipophilic substances the preparation of drug nanoparticles by milling and precipitation is investigated for instance by means of microsystem-based manufacturing techniques and with special regard to the preparation of individualized dosage forms. Another option to overcome issues with poor drug solubility is the incorporation into nanospun fibers.


Subject(s)
Pharmaceutical Chemistry/methods, Drug Compounding/methods, Pharmaceutical Preparations/chemical synthesis, Water/chemistry, Pharmaceutical Chemistry/trends, Drug Compounding/trends, Drug Delivery Systems/methods, Drug Delivery Systems/trends, Solubility
9.
ACS Omega ; 3(5): 5704-5714, 2018 May 31.
Article in English | MEDLINE | ID: mdl-31458770

ABSTRACT

The prediction of protein-ligand interactions and their corresponding binding free energy is a challenging task in structure-based drug design and related applications. Docking and scoring is broadly used to propose the binding mode and underlying interactions as well as to provide a measure for ligand affinity or differentiate between active and inactive ligands. Various studies have revealed that most docking software packages reliably predict the binding mode, although scoring remains a challenge. Here, a diverse benchmark data set of 99 matched molecular pairs (3D-MMPs) with experimentally determined X-ray structures and corresponding binding affinities is introduced. This data set was used to study the predictive power of 13 commonly used scoring functions to demonstrate the applicability of the 3D-MMP data set as a valuable tool for benchmarking scoring functions.

10.
J Cheminform ; 9(1): 44, 2017 Aug 03.
Article in English | MEDLINE | ID: mdl-29086213

ABSTRACT

The goal of defining an applicability domain for a predictive classification model is to identify the region in chemical space where the model's predictions are reliable. The boundary of the applicability domain is defined with the help of a measure that reflects the reliability of an individual prediction. Here, the available measures are differentiated into those that flag unusual objects and are independent of the original classifier and those that use information from the trained classifier. The former set of techniques is referred to as novelty detection while the latter is designated as confidence estimation. A review of the available confidence estimators shows that most of these measures estimate the probability of class membership of the predicted objects, which is inversely related to the error probability. Thus, class probability estimates are natural candidates for defining the applicability domain but were not comprehensively included in previous benchmark studies. The focus of the present study is to find the best measure for defining the applicability domain for a given binary classification technique and to determine the performance of novelty detection versus confidence estimation. Six different binary classification techniques in combination with ten data sets were studied to benchmark the various measures. The area under the receiver operating characteristic curve (AUC ROC) was employed as the main benchmark criterion. It is shown that class probability estimates consistently perform best at differentiating between reliable and unreliable predictions. Previously proposed alternatives to class probability estimates do not perform better than the latter and are inferior in most cases. Interestingly, the impact of defining an applicability domain depends on the observed area under the receiver operating characteristic curve. That is, it depends on the level of difficulty of the classification problem (expressed as AUC ROC) and will be largest for intermediately difficult problems (range AUC ROC 0.7-0.9). In the ranking of classifiers, classification random forests performed best on average. Hence, classification random forests in combination with the respective class probability estimate are a good starting point for predictive binary chemoinformatic classifiers with applicability domain.
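
Illustrative note (not part of the record): one way to read the benchmark criterion in this abstract is to score each prediction with a confidence value and then ask how well that score separates correct from incorrect predictions, measured as AUC ROC. The sketch below assumes a binary classifier that outputs a probability for the positive class and a 0.5 decision threshold; the function names are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
def confidence(p_positive):
    """Class probability estimate of the predicted class (0.5 to 1.0)."""
    return max(p_positive, 1.0 - p_positive)


def auc(scores, labels):
    """AUC ROC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half a win."""
    pos = [s for s, flag in zip(scores, labels) if flag]
    neg = [s for s, flag in zip(scores, labels) if not flag]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def reliability_auc(p_positive, y_true):
    """How well confidence separates correct from incorrect predictions."""
    scores = [confidence(p) for p in p_positive]
    correct = [(p >= 0.5) == bool(y) for p, y in zip(p_positive, y_true)]
    return auc(scores, correct)
```

A value near 1.0 would indicate that low-confidence predictions are the ones that tend to be wrong, so an applicability-domain cut-off on the confidence score could remove errors efficiently; a value near 0.5 would indicate that the score carries little reliability information.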

12.
Mol Inform ; 35(5): 160-80, 2016 May.
Article in English | MEDLINE | ID: mdl-27492083

ABSTRACT

Classification rules are often used in chemoinformatics to predict categorical properties of drug candidates related to bioactivity from explanatory variables, which encode the respective molecular structures (i.e. molecular descriptors). To avoid predictions with an unduly large error probability, the domain the classifier is applied to should be restricted to the domain covered by the training set objects. This latter domain is commonly referred to as applicability domain in chemoinformatics. Conceptually, the applicability domain defines the region in space where the "normal" objects are located. Defining the border of the applicability domain may then be viewed as detecting anomalous or novel objects or as detecting outliers. Currently two different types of measures are in use. The first one defines the applicability domain solely in terms of the molecular descriptor space, which is referred to as novelty detection. The second type defines the applicability domain in terms of the expected reliability of the predictions which is referred to as confidence estimation. Both types are systematically differentiated here and the most popular measures are reviewed. It will be shown that all common chemoinformatic classifiers have built-in confidence scores. Since confidence estimation uses information of the class labels for computing the confidence scores, it is expected to be more efficient in reducing the error rate than novelty detection, which solely uses the information of the explanatory variables.


Subject(s)
Pharmaceutical Databases, Algorithms, Chemical Models, Molecular Structure, Pharmaceutical Preparations/chemistry, Pharmaceutical Preparations/classification, Quantitative Structure-Activity Relationship
14.
J Comput Aided Mol Des ; 29(9): 847-65, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26070362

ABSTRACT

Despite its importance and the considerable efforts made, progress in drug discovery is limited. One main reason for this is the partly questionable data quality. Models relating biological activity to structure and in silico predictions rely on precisely and accurately measured binding data. However, these data vary so strongly that only variations by orders of magnitude are considered unreliable. This can certainly be improved, considering the high analytical performance in pharmaceutical quality control. Thus the principles, properties and performances of biochemical and cell-based assays are revisited and evaluated. Among biochemical assays, immunoassays, fluorescence assays, surface plasmon resonance, isothermal calorimetry, nuclear magnetic resonance and affinity capillary electrophoresis are discussed in detail; in addition, radiation-based ligand binding assays, mass spectrometry, atomic force microscopy and microscale thermophoresis are briefly evaluated. General sources of error, such as solvent, dilution, sample pretreatment and the quality of reagents and reference materials, are also discussed. Biochemical assays can be optimized to provide good accuracy and precision (e.g. percent relative standard deviation <10%). Cell-based assays are often considered superior with respect to biological significance; however, they typically still cannot be considered truly quantitative, in particular when results are compared over longer periods of time or between laboratories. A very careful choice of assays is therefore recommended. Strategies to further optimize assays are outlined, considering the evaluation and reduction of the relevant error sources. Analytical performance and data quality are still advancing and will further advance the progress in drug development.


Subject(s)
Biological Assay/standards, Data Accuracy, Drug Discovery, Calorimetry/standards, Factual Databases, Capillary Electrophoresis/standards, Fluorescence, Immunoassay/standards, Ligands, Magnetic Resonance Spectroscopy/standards, Pharmaceutical Preparations/metabolism, Sensitivity and Specificity, Surface Plasmon Resonance/standards
15.
J Med Chem ; 58(7): 3131-43, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25730262

ABSTRACT

The protein kinase DYRK1A has been suggested to act as one of the intracellular regulators contributing to neurological alterations found in individuals with Down syndrome. For an assessment of the role of DYRK1A, selective synthetic inhibitors are valuable pharmacological tools. However, the DYRK1A inhibitors described in the literature so far either are not sufficiently selective or have not been tested against closely related kinases from the DYRK and the CLK protein kinase families. The aim of this study was the identification of DYRK1A inhibitors exhibiting selectivity versus the structurally and functionally closely related DYRK and CLK isoforms. Structure modification of the screening hit 11H-indolo[3,2-c]quinoline-6-carboxylic acid revealed structure-activity relationships for kinase inhibition and enabled the design of 10-iodo-substituted derivatives as very potent DYRK1A inhibitors with considerable selectivity against CLKs. X-ray structure determination of three 11H-indolo[3,2-c]quinoline-6-carboxylic acids cocrystallized with DYRK1A confirmed the predicted binding mode within the ATP binding site.


Subject(s)
Protein Kinase Inhibitors/chemistry, Protein Kinase Inhibitors/pharmacology, Protein Serine-Threonine Kinases/antagonists & inhibitors, Protein Serine-Threonine Kinases/chemistry, Protein-Tyrosine Kinases/antagonists & inhibitors, Protein-Tyrosine Kinases/chemistry, Adenosine Triphosphate/metabolism, Binding Sites, Carboxylic Acids/chemistry, Synthetic Chemistry Techniques, X-Ray Crystallography, Drug Dose-Response Relationship, HEK293 Cells/drug effects, Humans, Indoles/chemistry, Molecular Docking Simulation, Protein Conformation, Protein Kinase Inhibitors/metabolism, Protein Serine-Threonine Kinases/metabolism, Protein-Tyrosine Kinases/metabolism, Quinolones/chemistry, Structure-Activity Relationship, Dyrk Kinases
16.
Traffic ; 16(5): 493-509, 2015 May.
Article in English | MEDLINE | ID: mdl-25615411

ABSTRACT

The pre-exocytotic behavior of insulin granules was studied against the background of the entirety of submembrane granules in MIN6 cells, and the characteristics were compared with the macroscopic secretion pattern and the cytosolic Ca(2+) concentration of MIN6 pseudo-islets at 22°C, 32°C and 37°C. The mobility of granules labeled by insulin-EGFP and the fusion events were assessed by TIRF microscopy utilizing an observer-independent algorithm. In the z-dimension, 40 mM K(+) or 30 mM glucose increased the granule turnover. The effect of high K(+) was quickly reversible. The increase by glucose was more sustained and modified the efficacy of a subsequent K(+) stimulus. The effect size of glucose increased with physiological temperature whereas that of high K(+) did not. The mobility in the x/y-dimension and the fusion rates were little affected by the stimuli, in contrast to secretion. Fusion and secretion, however, had the same temperature dependence. Granules that appeared and fused within one image sequence had significantly larger caging diameters than pre-existent granules that underwent fusion. These in turn had a different mobility than residence-matched non-fusing granules. In conclusion, delivery to the membrane, tethering and fusion of granules are differently affected by insulinotropic stimuli. Fusion rates and secretion do not appear to be tightly coupled.


Subject(s)
Cell Membrane/metabolism, Exocytosis, Insulin-Secreting Cells/drug effects, Insulin/metabolism, Membrane Fusion/drug effects, Secretory Vesicles/metabolism, Animals, Calcium/metabolism, Cell Line, Cell Membrane/drug effects, Cytosol/metabolism, Exocytosis/drug effects, Glucose/pharmacology, Insulin Secretion, Insulin-Secreting Cells/metabolism, Mice, Fluorescence Microscopy, Potassium Chloride/pharmacology, Secretory Vesicles/drug effects
18.
J Cheminform ; 6(1): 47, 2014.
Article in English | MEDLINE | ID: mdl-25506400

ABSTRACT

BACKGROUND: Generally, QSAR modelling requires both model selection and validation since there is no a priori knowledge about the optimal QSAR model. Prediction errors (PE) are frequently used to select and to assess the models under study. Reliable estimation of prediction errors is challenging, especially under model uncertainty, and requires independent test objects. These test objects must be involved neither in model building nor in model selection. Double cross-validation, sometimes also termed nested cross-validation, offers an attractive possibility to generate test data and to select QSAR models since it uses the data very efficiently. Nevertheless, there is a controversy in the literature with respect to the reliability of double cross-validation under model uncertainty. Moreover, systematic studies investigating the adequate parameterization of double cross-validation are still missing. Here, the cross-validation design in the inner loop and the influence of the test set size in the outer loop are systematically studied for regression models in combination with variable selection. METHODS: Simulated and real data are analysed with double cross-validation to identify important factors for the resulting model quality. For the simulated data, a bias-variance decomposition is provided. RESULTS: The prediction errors of QSAR/QSPR regression models in combination with variable selection depend to a large degree on the parameterization of double cross-validation. While the parameters for the inner loop of double cross-validation mainly influence bias and variance of the resulting models, the parameters for the outer loop mainly influence the variability of the resulting prediction error estimate. CONCLUSIONS: Double cross-validation reliably and unbiasedly estimates prediction errors under model uncertainty for regression models. Compared to a single test set, double cross-validation provided a more realistic picture of model quality and should be preferred over a single test set.
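
Illustrative note (not part of the record): the inner and outer loops of double (nested) cross-validation can be sketched as follows. The sketch assumes a toy one-dimensional k-nearest-neighbour regressor in which the number of neighbours stands in for the model selection step (the study itself concerns variable selection for QSAR regression models); all function names, fold counts and candidate values are hypothetical choices for this example.

```python
import random
import statistics


def k_folds(n_items, n_folds):
    """Split the indices 0..n_items-1 into n_folds interleaved folds."""
    indices = list(range(n_items))
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train, x, k):
    """Mean target of the k training points closest to x."""
    nearest = sorted(train, key=lambda xy: abs(xy[0] - x))[:k]
    return statistics.fmean(y for _, y in nearest)


def mse(train, test, k):
    """Mean squared prediction error on the test objects."""
    return statistics.fmean((knn_predict(train, x, k) - y) ** 2 for x, y in test)


def cv_error(data, k, n_folds):
    """Plain cross-validated error, used here only for model selection."""
    errors = []
    for fold in k_folds(len(data), n_folds):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        errors.append(mse(train, [data[i] for i in fold], k))
    return statistics.fmean(errors)


def double_cv(data, candidates, outer_folds=5, inner_folds=4, seed=0):
    """Outer loop estimates the prediction error; the inner loop selects k
    using only the outer training part, so test objects stay independent."""
    data = list(data)
    random.Random(seed).shuffle(data)
    outer_errors = []
    for fold in k_folds(len(data), outer_folds):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        best_k = min(candidates, key=lambda k: cv_error(train, k, inner_folds))
        outer_errors.append(mse(train, test, best_k))
    return statistics.fmean(outer_errors)
```

The point of the design is that the objects in each outer test fold take no part in fitting or in selecting the model, which is the independence requirement stated in the abstract.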

19.
J Chem Inf Model ; 54(6): 1578-95, 2014 Jun 23.
Article in English | MEDLINE | ID: mdl-24850242

ABSTRACT

The analysis of Structure-Activity-Relationships (SAR) of small molecules is a fundamental task in drug discovery. Although a large number of methods are already published, there is still a strong need for novel intuitive approaches. The inSARa (intuitive networks for Structure-Activity Relationships analysis) method introduced herein takes advantage of the synergistic combination of reduced graphs (RG) and the intuitive maximum common substructure (MCS) concept. The main feature of the inSARa concept is a hierarchical network structure of clearly defined substructure relationships based on common pharmacophoric features. Thus, straightforward SAR interpretation is possible by interactive network navigation. When focusing on a set of active molecules at one single target, the resulting inSARa networks are shown to be valuable for various essential tasks in SAR analysis, such as the identification of activity cliffs or "activity switches", bioisosteric exchanges, common pharmacophoric features, or "SAR hotspots".


Subject(s)
Computer Graphics, Drug Discovery/methods, Structure-Activity Relationship, Pharmaceutical Databases, Pharmaceutical Preparations/chemistry