1.
Altern Lab Anim ; 51(1): 39-54, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36572567

ABSTRACT

There is an ongoing aim to replace animal and in vitro laboratory models with in silico methods. Such replacement requires the successful validation and comparably good performance of the alternative methods. We have developed an in silico prediction system for human clinical pharmacokinetics, based on machine learning, conformal prediction and a new physiologically-based pharmacokinetic model, i.e. ANDROMEDA. The objectives of this study were: a) to evaluate how well ANDROMEDA predicts the human clinical pharmacokinetics of a previously proposed benchmarking data set comprising 24 physicochemically diverse drugs and 28 small drug molecules new to the market in 2021; b) to compare its predictive performance with that of laboratory methods; and c) to investigate and describe the pharmacokinetic characteristics of the modern drugs. Median and maximum prediction errors for the selected major parameters were ca 1.2 to 2.5-fold and 16-fold for both data sets, respectively. Prediction accuracy was on par with, or better than, the best laboratory-based prediction methods (superior performance for a vast majority of the comparisons), and the prediction range was considerably broader. The modern drugs have higher average molecular weight than those in the benchmarking set from 15 years earlier (ca 200 g/mol higher), and were predicted to (generally) have relatively complex pharmacokinetics, including permeability and dissolution limitations and significant renal, biliary and/or gut-wall elimination. In conclusion, the results were overall better than those obtained with laboratory methods, and thus serve to further validate the ANDROMEDA in silico system for the prediction of human clinical pharmacokinetics of modern and physicochemically diverse drugs.


Subjects
Benchmarking , Biological Models , Animals , Humans , Permeability , Pharmacokinetics , Pharmaceutical Preparations , Computer Simulation
2.
J Pharm Sci ; 111(9): 2614-2619, 2022 09.
Article in English | MEDLINE | ID: mdl-35605685

ABSTRACT

The gastrointestinal uptake of macrocyclic compounds is not fully understood. Here we applied our previously validated integrated system based on machine learning and conformal prediction to predict the passive fraction absorbed (fa), maximum fraction dissolved (fdiss), substrate specificities for major efflux transporters and total fraction absorbed (fa,tot) for a selected set of designed macrocyclic compounds (n = 37; MW 407-889 g/mol) and macrocyclic drugs (n = 16; MW 734-1203 g/mol) in vivo in man. Major aims were to increase the understanding of oral absorption of macrocycles and further validate our methodology. We predicted designed macrocycles to have high fa and low to high fdiss and fa,tot, and average estimates were higher than for the larger macrocyclic drugs. With few exceptions, compounds were predicted to be effluxed and well absorbed. A 2-fold median prediction error for fa,tot was achieved for macrocycles (validation set). Advantages of our methodology are that it enables predictions for macrocycles with low permeability, Caco-2 recovery and solubility (BCS IV), provides prediction intervals, and guides optimization of absorption. The understanding of oral absorption of macrocycles was increased and the methodology was validated for prediction of the uptake of macrocycles in man.


Subjects
Intestinal Absorption , Biological Models , Oral Administration , Caco-2 Cells , Computer Simulation , Humans , Permeability , Pharmaceutical Preparations , Solubility
3.
Xenobiotica ; 52(2): 113-118, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35238270

ABSTRACT

Pharmacokinetic/toxicokinetic (PK/TK) information for chemicals in humans is generally lacking. Here we applied machine learning, conformal prediction and a new physiologically-based PK/TK model for prediction of the human PK/TK of 65 chemicals from different classes, including carcinogens, food constituents and preservatives, vitamins, sweeteners, dyes and colours, pesticides, alternative medicines, flame retardants, psychoactive drugs, dioxins, poisons, UV-absorbents, surfactants, solvents and cosmetics. About 80% of the main human PK/TK information (fraction absorbed, oral bioavailability, half-life, unbound fraction in plasma, clearance, volume of distribution, fraction excreted) for the selected chemicals was missing in the literature. This information was now added (from in silico predictions). Median and mean prediction errors for these parameters were 1.3- to 2.7-fold and 1.4- to 4.8-fold, respectively. In total, 59 and 86% of predictions had errors <2- and <5-fold, respectively. Predicted and observed PK/TK for the chemicals were generally within the range for pharmaceutical drugs. The results validated the new integrated system for prediction of the human PK/TK for different chemicals and added important missing information. No general difference in PK/TK-characteristics was found between the selected chemicals and pharmaceutical drugs.
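The fold-error statistics quoted in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how fold error was computed; the sketch assumes the common convention of taking the larger of predicted/observed and observed/predicted, so that the value is always at least 1. The function names are hypothetical.

```python
from statistics import mean, median


def fold_error(predicted: float, observed: float) -> float:
    """Fold error, assumed here to be max(pred/obs, obs/pred), so it is >= 1."""
    if predicted <= 0 or observed <= 0:
        raise ValueError("fold error is only defined for positive values")
    ratio = predicted / observed
    return max(ratio, 1.0 / ratio)


def summarize(predicted, observed):
    """Median and mean fold error plus the share of predictions within 2- and 5-fold."""
    errors = [fold_error(p, o) for p, o in zip(predicted, observed)]
    n = len(errors)
    return {
        "median_fold_error": median(errors),
        "mean_fold_error": mean(errors),
        "pct_within_2_fold": 100.0 * sum(e < 2 for e in errors) / n,
        "pct_within_5_fold": 100.0 * sum(e < 5 for e in errors) / n,
    }
```

Because a single large miss inflates the mean but not the median, the two can differ considerably, which is consistent with the abstract reporting both.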


Subjects
Biological Models , Pharmacokinetics , Biological Availability , Computer Simulation , Humans , Kinetics , Pharmaceutical Preparations , Toxicokinetics
4.
Xenobiotica ; 51(12): 1366-1371, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34845977

ABSTRACT

Volume of distribution at steady state (Vss) is an important pharmacokinetic endpoint. In this study we apply machine learning and conformal prediction for human Vss prediction, and make a head-to-head comparison with rat-to-man scaling, allometric scaling and the Rodgers-Lukova method on combined in silico and in vitro data, using a test set of 105 compounds with experimentally observed Vss. The mean prediction error and % with <2-fold prediction error for our method were 2.4-fold and 64%, respectively. 69% of test compounds had an observed Vss within the prediction interval at a 70% confidence level. In comparison, 2.2-, 2.9- and 3.1-fold mean errors and 69, 64 and 61% of predictions with <2-fold error were reached with rat-to-man scaling, allometric scaling and the Rodgers-Lukova method, respectively. We conclude that our method has theoretically proven validity that was empirically confirmed, and shows predictive accuracy on par with animal models and superior to an alternative widely used in silico-based method. The option for the user to select the level of confidence in predictions offers better guidance on how to optimise Vss in drug discovery applications.


Subjects
Biological Models , Pharmaceutical Preparations , Animals , Drug Discovery , Animal Models , Pharmacokinetics , Rats
5.
J Pharm Sci ; 110(1): 42-49, 2021 01.
Article in English | MEDLINE | ID: mdl-33075380

ABSTRACT

One of the challenges with predictive modeling is how to quantify the reliability of the models' predictions on new objects. In this work we give an introduction to conformal prediction, a framework that sits on top of traditional machine learning algorithms and which outputs valid confidence estimates to predictions from QSAR models in the form of prediction intervals that are specific to each predicted object. For regression, a prediction interval consists of an upper and a lower bound. For classification, a prediction interval is a set that contains none, one, or many of the potential classes. The size of the prediction interval is affected by a user-specified confidence/significance level, and by the nonconformity of the predicted object; i.e., the strangeness as defined by a nonconformity function. Conformal prediction provides a rigorous and mathematically proven framework for in silico modeling with guarantees on error rates as well as a consistent handling of the models' applicability domain intrinsically linked to the underlying machine learning model. Apart from introducing the concepts and types of conformal prediction, we also provide an example application for modeling ABC transporters using conformal prediction, as well as a discussion on general implications for drug discovery.
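As a rough illustration of the regression case described above, the following Python sketch builds a prediction interval from a held-out calibration set (split, or inductive, conformal prediction). It is a sketch under assumptions, not the procedure of the paper: the nonconformity score is taken to be the absolute residual of an arbitrary underlying model, and the function name is hypothetical.

```python
import math


def conformal_interval(calibration_residuals, point_prediction, confidence):
    """Split conformal interval using absolute residuals as nonconformity scores.

    calibration_residuals: |observed - predicted| on a held-out calibration set.
    confidence: user-specified confidence level, strictly between 0 and 1.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    scores = sorted(calibration_residuals)
    n = len(scores)
    # Rank of the calibration score that serves as the interval half-width.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration examples to support this confidence level.
        return (-math.inf, math.inf)
    half_width = scores[k - 1]
    return (point_prediction - half_width, point_prediction + half_width)
```

Raising the confidence level selects a larger calibration score and therefore a wider interval. In this simplified form every object gets the same width; normalized nonconformity functions, which this sketch omits, scale the width by a per-object difficulty estimate and so yield the object-specific intervals the abstract mentions.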


Subjects
Drug Discovery , Machine Learning , Algorithms , Molecular Conformation , Quantitative Structure-Activity Relationship , Reproducibility of Results
6.
Gigascience ; 8(5)2019 05 01.
Article in English | MEDLINE | ID: mdl-31029061

ABSTRACT

BACKGROUND: The complex nature of biological data has driven the development of specialized software tools. Scientific workflow management systems simplify the assembly of such tools into pipelines, assist with job automation, and aid reproducibility of analyses. Many contemporary workflow tools are specialized or not designed for highly complex workflows, such as those with nested loops, dynamic scheduling, and parametrization, which are common in, e.g., machine learning. FINDINGS: SciPipe is a workflow programming library implemented in the programming language Go, for managing complex and dynamic pipelines in bioinformatics, cheminformatics, and other fields. SciPipe helps in particular with workflow constructs common in machine learning, such as extensive branching, parameter sweeps, and dynamic scheduling and parametrization of downstream tasks. SciPipe builds on flow-based programming principles to support agile development of workflows based on a library of self-contained, reusable components. It supports running subsets of workflows for improved iterative development and provides a data-centric audit logging feature that saves a full audit trace for every output file of a workflow, which can be converted to other formats such as HTML, TeX, and PDF on demand. The utility of SciPipe is demonstrated with a machine learning pipeline, a genomics pipeline, and a transcriptomics pipeline. CONCLUSIONS: SciPipe provides a solution for agile development of complex and dynamic pipelines, especially in machine learning, through a flexible application programming interface suitable for scientists used to programming or scripting.


Subjects
Computational Biology , Genomics , Software , Gene Library , Machine Learning , Programming Languages , Workflow
7.
Front Pharmacol ; 9: 1256, 2018.
Article in English | MEDLINE | ID: mdl-30459617

ABSTRACT

Ligand-based models can be used in drug discovery to obtain an early indication of potential off-target interactions that could be linked to adverse effects. Another application is to combine such models into a panel, making it possible to compare and search for compounds with similar profiles. Most contemporary methods and implementations, however, lack valid measures of confidence in their predictions and only provide point predictions. We here describe a methodology that uses Conformal Prediction for predicting off-target interactions, with models trained on data from 31 targets in the ExCAPE-DB dataset selected for their utility in broad early hazard assessment. Chemicals were represented by the signature molecular descriptor and support vector machines were used as the underlying machine learning method. By using conformal prediction, the results from predictions come in the form of confidence p-values for each class. The full pre-processing and model training process is openly available as scientific workflows on GitHub, rendering it fully reproducible. We illustrate the usefulness of the developed methodology on a set of compounds extracted from DrugBank. The resulting models are published online and are available via a graphical web interface and an OpenAPI interface for programmatic access.
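The per-class p-values mentioned in this abstract can be sketched as follows. This is a minimal Python illustration, assuming class-conditional (Mondrian) calibration and nonconformity scores that have already been computed by some underlying model; the abstract does not specify these details, and the function names and labels are hypothetical.

```python
def class_p_values(calibration_scores_by_class, test_scores_by_class):
    """Conformal p-value per class for one test compound.

    For each class, the p-value is the share of calibration scores from that
    class that are at least as nonconforming as the test score, counting the
    test compound itself (hence the +1 in numerator and denominator).
    """
    p_values = {}
    for label, test_score in test_scores_by_class.items():
        calibration = calibration_scores_by_class[label]
        at_least_as_strange = sum(1 for s in calibration if s >= test_score)
        p_values[label] = (at_least_as_strange + 1) / (len(calibration) + 1)
    return p_values


def prediction_set(p_values, significance):
    """Classes that cannot be rejected at the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}
```

Depending on the significance level, the resulting set may contain both classes, one class, or none, which is how a conformal classifier signals uncertainty instead of forcing a point prediction.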

8.
J Cheminform ; 10(1): 49, 2018 Oct 11.
Article in English | MEDLINE | ID: mdl-30306349

ABSTRACT

Ligand-based predictive modeling is widely used to generate predictive models aiding decision making in e.g. drug discovery projects. With growing data sets and requirements on low modeling time comes the necessity to analyze data sets efficiently to support rapid and robust modeling. In this study we analyzed four data sets and studied the efficiency of machine learning methods on sparse data structures, utilizing Morgan fingerprints of different radii and hash sizes, and compared them with the molecular signatures descriptor at different heights. We specifically evaluated the effect these parameters had on modeling time, predictive performance, and memory requirements using two implementations of random forest: Scikit-learn and FEST. We also compared with a support vector machine implementation. Our results showed that unhashed fingerprints yield significantly better accuracy than hashed fingerprints ([Formula: see text]), with no pronounced deterioration in modeling time and memory usage. Furthermore, the fast execution and low memory usage of the FEST algorithm suggest that it is a good alternative for large, high dimensional sparse data. Both support vector machines and random forest performed equally well, but results indicate that the support vector machine was better at using the extra information from larger values of the Morgan fingerprint's radius.
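The difference between hashed and unhashed fingerprints can be illustrated with a small Python sketch. It does not compute real Morgan fingerprints (that would normally be done with a cheminformatics toolkit); the fragment strings are placeholders and the function names are hypothetical. The point is only that folding fragment identifiers into a fixed number of bits can make distinct fragments collide, whereas a sparse unhashed representation keeps one dimension per fragment.

```python
import zlib


def unhashed_fingerprint(fragments):
    """Sparse representation: one dimension per distinct fragment identifier."""
    return set(fragments)


def hashed_fingerprint(fragments, n_bits):
    """Fold fragment identifiers into n_bits positions; distinct fragments may collide."""
    return {zlib.crc32(f.encode("utf-8")) % n_bits for f in fragments}


# Placeholder fragment identifiers standing in for atom environments.
fragments = [f"fragment_{i}" for i in range(10)]

sparse = unhashed_fingerprint(fragments)
folded = hashed_fingerprint(fragments, n_bits=4)

# Ten distinct fragments cannot fit into four bit positions without collisions.
print(len(sparse), "distinct features unhashed;", len(folded), "bits set after hashing")
```

With realistic hash sizes collisions are rarer than in this deliberately tiny example, but the information lost in this way is one plausible reason for the accuracy gap the abstract reports.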

9.
J Cheminform ; 10(1): 17, 2018 Apr 03.
Article in English | MEDLINE | ID: mdl-29616425

ABSTRACT

Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], with the best performing nonconformity measure having a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor, and we also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format for download and as a URI resolver service.

10.
J Cheminform ; 9(1): 33, 2017 Jun 06.
Article in English | MEDLINE | ID: mdl-29086040

ABSTRACT

BACKGROUND: The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminformatics algorithms ranging from chemical structure canonicalization to molecular descriptor calculations and pharmacophore perception. It is used in drug discovery, metabolomics, and toxicology. Over the last 10 years, however, the code base has grown significantly, resulting in many complex interdependencies among components and poor performance of many algorithms. RESULTS: We report improvements to the CDK v2.0 since the v1.2 release series, specifically addressing the increased functional complexity and poor performance. We first summarize the addition of new functionality, such as atom typing and molecular formula handling, and improvement to existing functionality that has led to significantly better performance for substructure searching, molecular fingerprints, and rendering of molecules. Second, we outline how the CDK has evolved with respect to quality control and the approaches we have adopted to ensure stability, including a code review mechanism. CONCLUSIONS: This paper highlights our continued efforts to provide a community driven, open source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library. By taking advantage of community support and contributions, we show that an open source cheminformatics project can act as a peer reviewed publishing platform for scientific computing software. Graphical abstract: CDK 2.0 provides new features and improved performance.

12.
J Cheminform ; 8: 67, 2016.
Article in English | MEDLINE | ID: mdl-27942268

ABSTRACT

Predictive modelling in drug discovery is challenging to automate as it often contains multiple analysis steps and might involve cross-validation and parameter tuning that create complex dependencies between tasks. With large-scale data or when using computationally demanding modelling methods, e-infrastructures such as high-performance or cloud computing are required, adding to the existing challenges of fault-tolerant automation. Workflow management systems can aid in many of these challenges, but the currently available systems are lacking in the functionality needed to enable agile and flexible predictive modelling. We here present an approach inspired by elements of the flow-based programming paradigm, implemented as an extension of the Luigi system which we name SciLuigi. We also discuss the experiences from using the approach when modelling a large set of biochemical interactions using a shared computer cluster.

13.
J Cheminform ; 8: 39, 2016.
Article in English | MEDLINE | ID: mdl-27516811

ABSTRACT

The increasing size of datasets in drug discovery makes it challenging to build robust and accurate predictive models within a reasonable amount of time. In order to investigate the effect of dataset sizes on predictive performance and modelling time, ligand-based regression models were trained on open datasets of varying sizes of up to 1.2 million chemical structures. For modelling, two implementations of support vector machines (SVM) were used. Chemical structures were described by the signatures molecular descriptor. Results showed that for the larger datasets, the LIBLINEAR SVM implementation performed on par with the well-established libsvm with a radial basis function kernel, but with dramatically less time for model building even on modest computer resources. Using a non-linear kernel proved to be infeasible for large data sizes, even with substantial computational resources on a computer cluster. To deploy the resulting models, we extended the Bioclipse decision support framework to support models from LIBLINEAR and made our models of logD and solubility available from within Bioclipse.

14.
J Lab Autom ; 21(1): 178-87, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26246423

ABSTRACT

Although medical cancer treatment has improved during the past decades, it is difficult to choose between several first-line treatments supposed to be equally active in the diagnostic group. It is even more difficult to select a treatment after the standard protocols have failed. Any guidance for selection of the most effective treatment is valuable at these critical stages. We describe the principles and procedures for ex vivo assessment of drug activity in tumor cells from patients as a basis for tailored cancer treatment. Patient tumor cells are assayed for cytotoxicity with a panel of drugs. Acoustic drug dispensing provides great flexibility in the selection of drugs for testing; currently, up to 80 compounds and/or combinations thereof may be tested for each patient. Drug response predictions are obtained by classification using an empirical model based on historical responses for the diagnosis. The laboratory workflow is supported by an integrated system that enables rapid analysis and automatic generation of the clinical referral response.


Subjects
Antineoplastic Agents/pharmacology , Cytological Techniques/methods , Antitumor Drug Screening Assays/methods , Acoustics , Cell Survival/drug effects , Cultured Cells , Humans , Neoplasms
15.
J Chem Inf Model ; 55(1): 19-25, 2015 Jan 26.
Article in English | MEDLINE | ID: mdl-25493610

ABSTRACT

Growing data sets and the resulting increase in analysis time are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.


Subjects
Computational Biology/methods , Computing Methodologies , Chemical Databases , Drug Discovery/methods , Quantitative Structure-Activity Relationship , Factual Databases , Internet , Ligands , Software
16.
J Chem Inf Model ; 54(11): 3211-7, 2014 Nov 24.
Article in English | MEDLINE | ID: mdl-25318024

ABSTRACT

QSAR modeling using molecular signatures and support vector machines with a radial basis function is increasingly used for virtual screening in the drug discovery field. This method has three free parameters: C, γ, and signature height. C is a penalty parameter that limits overfitting, γ controls the width of the radial basis function kernel, and the signature height determines how much of the molecule is described by each atom signature. Determination of optimal values for these parameters is time-consuming. Good default values could therefore save considerable computational cost. The goal of this project was to investigate whether such default values could be found by using seven public QSAR data sets spanning a wide range of end points and using both a bit version and a count version of the molecular signatures. On the basis of the experiments performed, we recommend a parameter set of heights 0 to 2 for the count version of the signature fingerprints and heights 0 to 3 for the bit version. These are in combination with a support vector machine using C in the range of 1 to 100 and γ in the range of 0.001 to 0.1. When data sets are small or longer run times are not a problem, then there is reason to consider the addition of height 3 to the count fingerprint and a wider grid search. However, marked improvements should not be expected.
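The roles of γ and of the recommended search ranges can be made concrete with a short Python sketch. The kernel form exp(-γ·||x - y||²) is the usual radial basis function definition; the specific grid points (powers of ten inside the stated ranges) and the function names are assumptions for illustration, since the abstract gives only the ranges.

```python
import math
from itertools import product


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def parameter_grid(c_values, gamma_values, height_ranges):
    """All combinations of C, gamma and signature height range to evaluate."""
    return [
        {"C": c, "gamma": g, "heights": h}
        for c, g, h in product(c_values, gamma_values, height_ranges)
    ]


# Ranges taken from the abstract; the individual grid points are an assumption.
C_VALUES = [1, 10, 100]
GAMMA_VALUES = [0.001, 0.01, 0.1]
COUNT_FINGERPRINT_HEIGHTS = [(0, 2)]  # heights 0 to 2 for the count version

grid = parameter_grid(C_VALUES, GAMMA_VALUES, COUNT_FINGERPRINT_HEIGHTS)
print(len(grid), "parameter combinations to evaluate")
```

A larger γ makes the kernel value fall off faster with distance, so each training compound influences a narrower neighbourhood. Restricting the search to a small grid like this is what saves computation compared with a wide grid search.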


Subjects
Preclinical Drug Evaluation/methods , Support Vector Machine , Benchmarking , Quantitative Structure-Activity Relationship
17.
J Chem Inf Model ; 54(10): 2647-53, 2014 Oct 27.
Article in English | MEDLINE | ID: mdl-25230336

ABSTRACT

When evaluating a potential drug candidate it is desirable to predict target interactions in silico prior to synthesis in order to assess, e.g., secondary pharmacology. This can be done by looking at known target binding profiles of similar compounds using chemical similarity searching. The purpose of this study was to construct and evaluate the performance of chemical fingerprints based on the molecular signature descriptor for performing target binding predictions. For the comparison we used the area under the receiver operating characteristics curve (AUC) complemented with net reclassification improvement (NRI). We created two open source signature fingerprints, a bit and a count version, and evaluated their performance compared to a set of established fingerprints with regards to predictions of binding targets using Tanimoto-based similarity searching on publicly available data sets extracted from ChEMBL. The results showed that the count version of the signature fingerprint performed on par with well-established fingerprints such as ECFP. The count version outperformed the bit version slightly; however, the count version is more complex and takes more computing time and memory to run so its usage should probably be evaluated on a case-by-case basis. The NRI based tests complemented the AUC based ones and showed signs of higher power.
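To illustrate the bit and count fingerprint variants in a Tanimoto-based similarity search, here is a minimal Python sketch. The bit version is the standard set-based Tanimoto coefficient. For the count version the abstract does not give a formula, so the sketch assumes the common min/max generalization; the signature strings in the usage note are placeholders and the function names are hypothetical.

```python
from collections import Counter


def tanimoto_bits(a: set, b: set) -> float:
    """Tanimoto coefficient for bit fingerprints: shared features / all features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def tanimoto_counts(a: Counter, b: Counter) -> float:
    """Count-based generalization (assumed): sum of minima / sum of maxima."""
    keys = set(a) | set(b)
    minimum = sum(min(a[k], b[k]) for k in keys)
    maximum = sum(max(a[k], b[k]) for k in keys)
    return minimum / maximum if maximum else 0.0
```

For example, with `Counter({"C": 3, "N": 1})` and `Counter({"C": 1, "N": 1, "O": 1})` the count version gives 2/5, while the corresponding bit sets `{"C", "N"}` and `{"C", "N", "O"}` give 2/3. The count version distinguishes molecules that contain the same substructures in different numbers, at the cost of storing and comparing counts rather than bits.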


Subjects
Drug Design , Chemical Models , Molecular Imprinting/methods , Software , Area Under Curve , Computer Simulation , Chemical Databases , Ligands , Molecular Structure , ROC Curve
18.
Bioinformatics ; 29(2): 286-9, 2013 Jan 15.
Article in English | MEDLINE | ID: mdl-23178637

ABSTRACT

SUMMARY: Bioclipse, a graphical workbench for the life sciences, provides functionality for managing and visualizing life science data. We introduce Bioclipse-R, which integrates Bioclipse and the statistical programming language R. The synergy between Bioclipse and R is demonstrated by the construction of a decision support system for anticancer drug screening and mutagenicity prediction, which shows how Bioclipse-R can be used to perform complex tasks from within a single software system. AVAILABILITY AND IMPLEMENTATION: Bioclipse-R is implemented as a set of Java plug-ins for Bioclipse based on the R-package rj. Source code and binary packages are available from https://github.com/bioclipse and http://www.bioclipse.net/bioclipse-r, respectively. CONTACT: martin.eklund@farmbio.uu.se SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Biological Science Disciplines , Computer Graphics , Software , Antineoplastic Agents/chemistry , Antineoplastic Agents/pharmacology , Antineoplastic Agents/toxicity , Statistical Data Interpretation , Mutagenesis , Programming Languages , Quantitative Structure-Activity Relationship , Systems Integration
19.
Curr Top Med Chem ; 12(18): 1980-6, 2012.
Article in English | MEDLINE | ID: mdl-23110533

ABSTRACT

We present the open source components for drug discovery that have been developed and integrated into the graphical workbench Bioclipse. Building on a solid open source cheminformatics core, Bioclipse has advanced functionality for managing and visualizing chemical structures and related information. The features presented here include QSAR/QSPR modeling, various predictive solutions such as decision support for chemical liability assessment, site-of-metabolism prediction, virtual screening, and knowledge discovery and integration. We demonstrate the utility of the described tools with examples from computational pharmacology, toxicology, and ADME. Bioclipse is used in both academia and industry, and is a good example of open source leading to new solutions for drug discovery.


Subjects
Drug Discovery , Software , Absorption , Algorithms , Decision Support Techniques , Preclinical Drug Evaluation , Pharmacokinetics , Toxicology/methods
20.
J Cheminform ; 3(1): 37, 2011 Oct 14.
Article in English | MEDLINE | ID: mdl-21999342

ABSTRACT

BACKGROUND: The Blue Obelisk movement was established in 2005 as a response to the lack of Open Data, Open Standards and Open Source (ODOSOS) in chemistry. It aims to make it easier to carry out chemistry research by promoting interoperability between chemistry software, encouraging cooperation between Open Source developers, and developing community resources and Open Standards. RESULTS: This contribution looks back on the work carried out by the Blue Obelisk in the past 5 years and surveys progress and remaining challenges in the areas of Open Data, Open Standards, and Open Source in chemistry. CONCLUSIONS: We show that the Blue Obelisk has been very successful in bringing together researchers and developers with common interests in ODOSOS, leading to development of many useful resources freely available to the chemistry community.
