Results 1 - 20 of 74
1.
J Cheminform ; 16(1): 75, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38943219

ABSTRACT

Conformal prediction has seen many applications in pharmaceutical science, being able to calibrate outputs of machine learning models and produce valid prediction intervals. We here present the open source software CPSign, a complete implementation of conformal prediction for cheminformatics modeling. CPSign implements inductive and transductive conformal prediction for classification and regression, and probabilistic prediction with the Venn-ABERS methodology. The main chemical representation is signatures, but other types of descriptors are also supported. The main modeling methodology is support vector machines (SVMs), but additional modeling methods are supported via an extension mechanism, e.g. DeepLearning4J models. We also describe features for visualizing results from conformal models, including calibration and efficiency plots, as well as features to publish predictive models as REST services. We compare CPSign against other common cheminformatics modeling approaches, including random forest and a directed message-passing neural network. The results show that CPSign produces robust predictive performance with comparable predictive efficiency, with superior runtime and lower hardware requirements compared to neural network based models. CPSign has been used in several studies and is in production use in multiple organizations. The ability to work directly with chemical input files and to perform descriptor calculation and modeling with SVM in the conformal prediction framework, in a single software package with a low footprint and fast execution time, makes CPSign a convenient yet flexible package for training, deploying, and predicting on chemical data. CPSign can be downloaded from GitHub at https://github.com/arosbio/cpsign .
Scientific contribution: CPSign provides a single software package that allows users to perform data preprocessing, modeling, and prediction directly on chemical structures, using conformal and probabilistic prediction. Building and evaluating new models can be achieved at a high abstraction level without sacrificing flexibility or predictive performance, as showcased by a method evaluation against contemporary modeling approaches, where CPSign performs on par with a state-of-the-art deep learning based model.
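
The calibration and efficiency measures mentioned in the abstract above can be illustrated with a minimal sketch. It does not use CPSign or its API; the function names, class labels, and p-values are hypothetical and chosen only for illustration. It assumes a conformal classifier has already produced one p-value per label for each test compound. The prediction set at a given significance level holds every label whose p-value exceeds that level. Calibration is read from the observed error rate, and efficiency is summarized here as the fraction of single-label sets, which is one common choice among several.

```python
def prediction_set(p_values, significance):
    """Labels whose p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}


def calibration_and_efficiency(all_p_values, true_labels, significances):
    """Return (significance, observed error rate, fraction of single-label sets)."""
    n = len(true_labels)
    rows = []
    for eps in significances:
        sets = [prediction_set(p, eps) for p in all_p_values]
        # An error occurs when the true label is missing from the prediction set.
        errors = sum(1 for s, y in zip(sets, true_labels) if y not in s)
        # Single-label sets are the most informative predictions.
        singles = sum(1 for s in sets if len(s) == 1)
        rows.append((eps, errors / n, singles / n))
    return rows


# Hypothetical p-values for four test compounds (not taken from the paper).
p_vals = [
    {"active": 0.62, "inactive": 0.08},
    {"active": 0.15, "inactive": 0.45},
    {"active": 0.30, "inactive": 0.25},
    {"active": 0.04, "inactive": 0.71},
]
truth = ["active", "inactive", "inactive", "inactive"]

for eps, error_rate, single_rate in calibration_and_efficiency(p_vals, truth, [0.1, 0.2, 0.3]):
    print(f"significance={eps:.1f} error={error_rate:.2f} single-label={single_rate:.2f}")
```

For a well-calibrated conformal predictor, the observed error rate should stay at or below the significance level on exchangeable data; plotting the two against each other gives a calibration plot.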

2.
J Chem Inf Model ; 59(3): 1230-1237, 2019 03 25.
Article in English | MEDLINE | ID: mdl-30726080

ABSTRACT

Iterative screening has emerged as a promising approach to increase the efficiency of high-throughput screening (HTS) campaigns in drug discovery. By learning from a subset of the compound library, inferences on what compounds to screen next can be made by predictive models. One of the challenges of iterative screening is to decide how many iterations to perform. This is mainly related to difficulties in estimating the prospective hit rate in any given iteration. In this article, a novel method based on Venn-ABERS predictors is proposed. The method provides accurate estimates of the number of hits retrieved in any given iteration during an HTS campaign. The estimates provide the necessary information to support the decision on the number of iterations needed to maximize the screening outcome. Thus, this method offers a prospective screening strategy for early-stage drug discovery.


Subject(s)
Computational biology/methods , Preclinical drug evaluation/methods , High-throughput screening assays , Machine learning , Quantitative structure-activity relationship
3.
Disabil Rehabil ; 41(25): 3061-3070, 2019 12.
Article in English | MEDLINE | ID: mdl-30039717

ABSTRACT

Purpose: The purpose of this study was to investigate associations between motivation for return to work and actual return to work, or increased employability, among people on long-term sick leave. Materials and methods: Questionnaire data were collected from 227 people on long-term sick leave (mean = 7.9 years) due to pain syndrome or mild to moderate mental health conditions who had participated in a vocational rehabilitation intervention. The participants' motivation for return to work was measured at baseline. At 12-month follow-up, change in the type of reimbursement between baseline and at present was assessed and used to categorise outcomes as: "decreased work and employability", "unchanged", "increased employability", and "increased work". Associations between baseline motivation and return to work outcome were analysed using logistic and multinomial regression models. Results: Motivation for return to work at baseline was associated with return to work or increased employability at 12-month follow-up in the logistic regression model adjusting for potential confounders (OR 2.44, 95% CI 1.25-4.78). Conclusions: The results suggest that motivation for return to work at baseline was associated with actual chances of return to work or increased employability in people on long-term sick leave due to pain syndrome or mild to moderate mental health conditions.
Implications for rehabilitation: High motivation for return to work seems to increase the chances of actual return to work or increased employability in people on sick leave due to pain syndrome or mild to moderate mental health conditions. The potential impact of motivation for return to work should be highlighted in vocational rehabilitation. Rehabilitation professionals are recommended to recognise and take into consideration the patient's stated motivation for return to work. Rehabilitation professionals should be aware that the patient's motivation for return to work might have an impact on the outcome of vocational rehabilitation.


Subject(s)
Mental disorders/rehabilitation , Motivation , Pain/rehabilitation , Return to work , Sick leave , Adult , Female , Follow-up studies , Humans , Male , Middle aged , Vocational rehabilitation , Surveys and questionnaires , Sweden
4.
J Chem Inf Model ; 59(3): 962-972, 2019 03 25.
Article in English | MEDLINE | ID: mdl-30408959

ABSTRACT

The volume of high throughput screening data has considerably increased since the beginning of the automated biochemical and cell-based assays era. This information-rich data source provides tremendous repurposing opportunities for data mining. It was recently shown that biochemical or cell-based assay results can be compiled into so-called high-throughput fingerprints (HTSFPs) as a new type of descriptor describing molecular bioactivity profiles which can be applied in virtual screening, iterative screening, and target deconvolution. However, so far, studies around HTSFPs and machine learning have mainly focused on predicting the outcome of molecules in single high-throughput assays, and no one has reported the modeling of compounds' biochemical assay activities toward a panel of target proteins. In this article, we aim to compare how our in-house HTSFPs perform at this task when combined with multitask deep learning versus the single task support vector machine method, both in terms of hit identification and of scaffold hopping potential. Performances obtained from the two HTSFP models were reported with respect to the performances of multitask deep learning and support vector machine models built with the structural descriptors ECFP. Moreover, we investigated the effect of high throughput screening false positives and negatives on the performance of the generated models. Our results showed that the two fingerprints yielded similar performance and diverse hits with very little overlap, thus demonstrating the orthogonality of bioactivity profile-based descriptors with structural descriptors. Therefore, modeling compound activity data using ECFPs together with HTSFPs increases the scaffold hopping potential of the predictive models.


Subject(s)
Preclinical drug evaluation/methods , High-throughput screening assays/methods , Machine learning
5.
Mutagenesis ; 34(1): 33-40, 2019 03 06.
Article in English | MEDLINE | ID: mdl-30541036

ABSTRACT

Valid and predictive models for classifying Ames mutagenicity have been developed using conformal prediction. The models are Random Forest models using signature molecular descriptors. The investigation indicates, on excluding not-strongly mutagenic compounds (class B), that the validity for mutagenic compounds is increased for the predictions based on both public and the Division of Genetics and Mutagenesis, National Institute of Health Sciences of Japan (DGM/NIHS) data while less so when using only the latter data source. The former models only result in valid predictions for the majority, non-mutagenic, class whereas the latter models are valid for both classes, i.e. mutagenic and non-mutagenic compounds. These results demonstrate the importance of data consistency manifested through the superior predictive quality and validity of the models based only on DGM/NIHS generated data compared to a combination of this data with public data sources.


Subject(s)
Mutagenicity tests/trends , Mutagens/toxicity , Quantitative structure-activity relationship , Computer simulation , Japan , Mutagenesis/genetics
6.
Article in English | MEDLINE | ID: mdl-30384498

ABSTRACT

BACKGROUND: People on long-term sick leave often have a long-lasting process back to work, where the individuals may be in multiple and recurrent states; i.e., receiving different social security benefits or working, and over time they may shift between these states. The purpose of this study was to evaluate the effects of two vocational rehabilitation programs, compared to a control, on return-to-work (RTW) or increased employability in patients on long-term sick leave due to mental illness and/or chronic pain. METHODS: In this randomized controlled study, 427 women and men were allocated to either (1) multidisciplinary team management, i.e., multidisciplinary assessments and individual rehabilitation management, (2) acceptance and commitment therapy (ACT), or (3) control. A positive outcome was defined as RTW or increased employability. The outcome was considered negative if the (part-time) wage was reduced or ceased, or if there was an indication of decreased employability. The outcome was measured one year after entry in the project and analyzed using binary and multinomial logistic regressions. RESULTS: Participants in the multidisciplinary team group reported having RTW odds ratio (OR) 3.31 (95% CI 1.39-7.87) compared to the control group in adjusted models. Participants in the ACT group reported having increased employability OR 3.22 (95% CI 1.13-9.15) compared to the control group in adjusted models. CONCLUSIONS: This study of vocational rehabilitation in mainly female patients on long-term sick leave due to mental illness and/or chronic pain suggests that multidisciplinary team assessments and individually adapted rehabilitation interventions increased RTW and employability. Solely receiving the ACT intervention also increased employability.


Subject(s)
Acceptance and commitment therapy , Chronic pain/rehabilitation , Mental disorders/rehabilitation , Occupational health services/methods , Vocational rehabilitation/methods , Return to work/psychology , Adult , Female , Humans , Male , Middle aged , Time factors , Treatment outcome
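
The odds ratios and 95% confidence intervals reported in the abstract above come from adjusted logistic regression models, which cannot be reproduced from the abstract alone. As a rough illustration of how such figures are read, the sketch below computes an unadjusted odds ratio with a Wald confidence interval from a 2x2 table. The counts are hypothetical and are not the study's data; the function name is likewise an assumption made for this example.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: intervention group with the outcome    b: intervention group without it
    c: control group with the outcome         d: control group without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_, lo, hi = odds_ratio_wald_ci(a=20, b=80, c=10, d=90)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1, as both intervals in the abstract do, indicates an association that is statistically significant at the corresponding level.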
7.
J Chem Inf Model ; 58(5): 1132-1140, 2018 05 29.
Article in English | MEDLINE | ID: mdl-29701973

ABSTRACT

Making predictions with an associated confidence is highly desirable as it facilitates decision making and resource prioritization. Conformal regression is a machine learning framework that allows the user to define the required confidence and delivers predictions that are guaranteed to be correct to the selected extent. In this study, we apply conformal regression to model molecular properties and bioactivity values and investigate different ways to scale the resultant prediction intervals to create as efficient (i.e., narrow) regressors as possible. Different algorithms to estimate the prediction uncertainty were used to normalize the prediction ranges, and the different approaches were evaluated on 29 publicly available data sets. Our results show that the most efficient conformal regressors are obtained when using the natural exponential of the ensemble standard deviation from the underlying random forest to scale the prediction intervals, but other approaches were almost as efficient. This approach afforded an average prediction range of 1.65 pIC50 units at the 80% confidence level when applied to bioactivity modeling. The choice of nonconformity function has a pronounced impact on the average prediction range with a difference of close to one log unit in bioactivity between the tightest and widest prediction range. Overall, conformal regression is a robust approach to generate bioactivity predictions with associated confidence.


Subject(s)
Informatics/methods , Machine learning , Quantitative structure-activity relationship , Uncertainty , Decision making
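
The scaling approach described in the abstract above can be sketched as a normalized inductive (split) conformal regressor. This is a minimal illustration, not the authors' code: it assumes that point predictions and an uncertainty estimate (for example, the standard deviation across the trees of a random forest) are already available for a calibration set and for the new compound. All names and numbers are hypothetical. The nonconformity score is the absolute error divided by exp(sigma), following the abstract's description of using the natural exponential of the ensemble standard deviation.

```python
import math


def conformal_quantile(scores, confidence):
    """Score at rank ceil((n + 1) * confidence) among the sorted calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration examples for this confidence level.
        return math.inf
    return sorted(scores)[k - 1]


def normalized_interval(y_cal, pred_cal, sigma_cal, pred_new, sigma_new, confidence=0.8):
    """Prediction interval scaled by exp(sigma) of the new example."""
    # Normalized nonconformity scores on the calibration set.
    scores = [abs(y - p) / math.exp(s) for y, p, s in zip(y_cal, pred_cal, sigma_cal)]
    q = conformal_quantile(scores, confidence)
    half_width = q * math.exp(sigma_new)
    return pred_new - half_width, pred_new + half_width


# Hypothetical calibration data in pIC50 units.
y_cal = [5.0, 6.0, 7.0, 8.0]
pred_cal = [5.5, 6.2, 6.0, 8.1]
sigma_cal = [0.0, 0.0, 0.0, 0.0]

print(normalized_interval(y_cal, pred_cal, sigma_cal, pred_new=6.5, sigma_new=0.0))
print(normalized_interval(y_cal, pred_cal, sigma_cal, pred_new=6.5, sigma_new=0.5))
```

A compound with a larger uncertainty estimate receives a wider interval at the same confidence level, which is what lets the regressor be tight where the underlying model is reliable.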
8.
J Arthroplasty ; 33(1): 51-54, 2018 01.
Article in English | MEDLINE | ID: mdl-28844765

ABSTRACT

BACKGROUND: Considerable blood loss which requires transfusion is frequently reported after total hip and knee arthroplasties (THA and TKA). The purpose of this study is to review the transfusion rates in contemporary THA and TKA with optimized perioperative protocols including minimized surgical trauma and optimal perioperative patient care. METHODS: This retrospective study included 1442 consecutive patients receiving either a primary THA or a TKA from the same high-volume surgeon between January 2008 and December 2015. Demographics and surgical data were collected from patients' journals. Estimated blood loss, decline in hemoglobin, and use of transfusion were registered. RESULTS: One (0.0013%) THA and 3 (0.0044%) TKAs required blood transfusion postoperatively. Average measured bleeding was 253 mL ± 142 and 207 mL ± 169 in THA and TKA, respectively. Average decline in hemoglobin was 23.5 g/L ± 11.4 and 22.9 g/L ± 11.6 for THA and TKA, respectively. CONCLUSION: In contemporary THA and TKA, perioperative protocols and patient optimization can decrease the rate of blood transfusion to near zero.


Subject(s)
Hip replacement arthroplasty/statistics and numerical data , Knee replacement arthroplasty/statistics and numerical data , Blood transfusion/statistics and numerical data , Adult , Aged , Aged, 80 and over , Checklist , Female , Hemoglobins/analysis , Hemorrhage , Humans , Male , Middle aged , Perioperative care , Postoperative period , Retrospective studies
9.
J Cheminform ; 9(1): 33, 2017 Jun 06.
Article in English | MEDLINE | ID: mdl-29086040

ABSTRACT

BACKGROUND: The Chemistry Development Kit (CDK) is a widely used open source cheminformatics toolkit, providing data structures to represent chemical concepts along with methods to manipulate such structures and perform computations on them. The library implements a wide variety of cheminformatics algorithms ranging from chemical structure canonicalization to molecular descriptor calculations and pharmacophore perception. It is used in drug discovery, metabolomics, and toxicology. Over the last 10 years, the code base has grown significantly, however, resulting in many complex interdependencies among components and poor performance of many algorithms. RESULTS: We report improvements to the CDK v2.0 since the v1.2 release series, specifically addressing the increased functional complexity and poor performance. We first summarize the addition of new functionality, such as atom typing and molecular formula handling, and improvement to existing functionality that has led to significantly better performance for substructure searching, molecular fingerprints, and rendering of molecules. Second, we outline how the CDK has evolved with respect to quality control and the approaches we have adopted to ensure stability, including a code review mechanism. CONCLUSIONS: This paper highlights our continued efforts to provide a community driven, open source cheminformatics library, and shows that such collaborative projects can thrive over extended periods of time, resulting in a high-quality and performant library. By taking advantage of community support and contributions, we show that an open source cheminformatics project can act as a peer reviewed publishing platform for scientific computing software. Graphical abstract: CDK 2.0 provides new features and improved performance.

12.
J Chem Inf Model ; 57(7): 1591-1598, 2017 07 24.
Article in English | MEDLINE | ID: mdl-28628322

ABSTRACT

Conformal prediction has been proposed as a more rigorous way to define prediction confidence compared to other application domain concepts that have earlier been used for QSAR modeling. One main advantage of such a method is that it provides a prediction region potentially with multiple predicted labels, which contrasts to the single valued (regression) or single label (classification) output predictions by standard QSAR modeling algorithms. Standard conformal prediction might not be suitable for imbalanced data sets. Therefore, Mondrian cross-conformal prediction (MCCP) which combines the Mondrian inductive conformal prediction with cross-fold calibration sets has been introduced. In this study, the MCCP method was applied to 18 publicly available data sets that have various imbalance levels varying from 1:10 to 1:1000 (ratio of active/inactive compounds). Our results show that MCCP in general performed well on bioactivity data sets with various imbalance levels. More importantly, the method not only provides confidence of prediction and prediction regions compared to standard machine learning methods but also produces valid predictions for the minority class. In addition, a compound similarity based nonconformity measure was investigated. Our results demonstrate that although it gives valid predictions, its efficiency is much worse than that of model dependent metrics.


Subject(s)
Informatics/methods , Quantitative structure-activity relationship , Algorithms , Molecular conformation
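
The Mondrian idea behind the method in the abstract above can be sketched briefly: each label's p-value is computed against calibration examples of that same class only, so the rare class is calibrated separately from the majority class. The sketch below is a single-split illustration under stated assumptions, not the authors' implementation. It assumes nonconformity scores are already available (higher means less typical), the names and numbers are hypothetical, and the cross-fold aggregation step of MCCP is omitted.

```python
def mondrian_p_values(cal_scores, cal_labels, test_scores):
    """Class-conditional conformal p-values.

    cal_scores:  nonconformity score of each calibration example for its true label
    cal_labels:  true label of each calibration example
    test_scores: dict mapping each candidate label to the test object's
                 nonconformity score under that label
    """
    p_values = {}
    for label, alpha in test_scores.items():
        # Compare only against calibration examples of the same class.
        same_class = [s for s, y in zip(cal_scores, cal_labels) if y == label]
        at_least_as_strange = sum(1 for s in same_class if s >= alpha)
        p_values[label] = (at_least_as_strange + 1) / (len(same_class) + 1)
    return p_values


# Hypothetical imbalanced calibration set: 3 actives, 6 inactives.
cal_scores = [0.1, 0.4, 0.9, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8]
cal_labels = ["active"] * 3 + ["inactive"] * 6

p = mondrian_p_values(cal_scores, cal_labels, {"active": 0.5, "inactive": 0.65})
print(p)
```

Because each class has its own calibration scores, the validity guarantee holds per class, which is why the abstract reports valid predictions for the minority class as well.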
13.
Toxicol Sci ; 158(1): 213-226, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28453775

ABSTRACT

Many drugs designed to inhibit kinases have their clinical utility limited by cardiotoxicity-related label warnings or prescribing restrictions. While this liability is widely recognized, designing safer kinase inhibitors (KI) requires knowledge of the causative kinase(s). Efforts to unravel the kinases have encountered pharmacology with nearly prohibitive complexity. At therapeutically relevant concentrations, KIs show promiscuity distributed across the kinome. Here, to overcome this complexity, 65 KIs with known kinome-scale polypharmacology profiles were assessed for effects on cardiomyocyte (CM) beating. Changes in human iPSC-CM beat rate and amplitude were measured using label-free cellular impedance. Correlations between beat effects and kinase inhibition profiles were mined by computation analysis (Matthews Correlation Coefficient) to identify associated kinases. Thirty kinases met criteria of having (1) pharmacological inhibition correlated with CM beat changes, (2) expression in both human-induced pluripotent stem cell-derived cardiomyocytes and adult heart tissue, and (3) effects on CM beating following single gene knockdown. A subset of these 30 kinases were selected for mechanistic follow up. Examples of kinases regulating processes spanning the excitation-contraction cascade were identified, including calcium flux (RPS6KA3, IKBKE) and action potential duration (MAP4K2). Finally, a simple model was created to predict functional cardiotoxicity whereby inactivity at three sentinel kinases (RPS6KB1, FAK, STK35) showed exceptional accuracy in vitro and translated to clinical KI safety data. For drug discovery, identifying causative kinases and introducing a predictive model should transform the ability to design safer KI medicines. For cardiovascular biology, discovering kinases previously unrecognized as influencing cardiovascular biology should stimulate investigation of underappreciated signaling pathways.


Subject(s)
Heart/drug effects , Protein kinase inhibitors/toxicity , Calcium/metabolism , Humans , Induced pluripotent stem cells/drug effects , Induced pluripotent stem cells/enzymology , Cardiac myocytes/cytology , Cardiac myocytes/drug effects , Cardiac myocytes/enzymology , Cardiac myocytes/metabolism , Protein kinases/metabolism , Reverse transcriptase polymerase chain reaction
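
The Matthews correlation coefficient named in the abstract above measures agreement between two binary variables. As a minimal sketch, assume that for one kinase each inhibitor is marked as inhibiting it or not, and as changing cardiomyocyte beating or not; how the study actually binarized its data is not stated in the abstract, so this framing and the counts below are hypothetical.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from the four cells of a 2x2 table.

    tp: inhibits the kinase and changes beating
    fp: inhibits the kinase, no beat change
    fn: does not inhibit the kinase, beat change observed
    tn: does not inhibit the kinase, no beat change
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: return 0 when a row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts over 65 compounds for a single kinase.
print(round(matthews_corrcoef(tp=20, fp=5, fn=10, tn=30), 3))
```

The coefficient ranges from -1 (perfect disagreement) through 0 (no association) to +1 (perfect agreement), and unlike plain accuracy it remains informative when the two classes are of very different sizes.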
14.
J Cheminform ; 9: 17, 2017.
Article in English | MEDLINE | ID: mdl-28316655

ABSTRACT

Chemogenomics data generally refers to the activity data of chemical compounds on an array of protein targets and represents an important source of information for building in silico target prediction models. The increasing volume of chemogenomics data offers exciting opportunities to build models based on Big Data. Preparing a high quality data set is a vital step in realizing this goal and this work aims to compile such a comprehensive chemogenomics dataset. This dataset comprises over 70 million SAR data points from publicly available databases (PubChem and ChEMBL) including structure, target information and activity annotations. Our aspiration is to create a useful chemogenomics resource reflecting industry-scale data not only for building predictive models of in silico polypharmacology and off-target effects but also for the validation of cheminformatics approaches in general.

15.
J Rehabil Med ; 49(2): 170-177, 2017 Jan 31.
Article in English | MEDLINE | ID: mdl-28101560

ABSTRACT

OBJECTIVE: Mental illness and chronic pain are common reasons for long-term sick leave, typically more so for women. This study investigated the effects on return to work of 2 vocational rehabilitation programmes. METHODS: In this randomized controlled study, 308 women were allocated to treatment with acceptance and commitment therapy, to multidisciplinary assessment and individualized rehabilitation interventions, or to a control group. Return-to-work at 12 months was assessed as: (i) returning to health insurance; (ii) number of reimbursed health insurance days during follow-up; (iii) self-reported change in working hours; (iv) a composite measure of self-reported change in work-related engagement. RESULTS: The mean age of the Swedish study population was 48.5 years (standard deviation (SD) 6.3 years) and the mean time on sick leave 7.5 years (SD 3.2 years). There were no significant differences in reimbursed days or returning to the health insurance at 12 months. The multidisciplinary assessment and individualized rehabilitation interventions group, compared with control, reported a significant increase in working hours per week, as well as a significant increase in work-related engagement. CONCLUSION: Multidisciplinary assessments and individual rehabilitation interventions may improve the chance of return-to-work in women with long-term sick leave due to pain condition or mental illness.


Subject(s)
Chronic pain/rehabilitation , Mental disorders/rehabilitation , Vocational rehabilitation/methods , Sick leave/trends , Adult , Female , Follow-up studies , Humans , Long-term care , Middle aged , Return to work , Time factors
16.
Eat Weight Disord ; 21(4): 607-616, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27170194

ABSTRACT

PURPOSE: The main aim of this clinical study was to explore how adolescent patients with eating disorders and their parents report their perceived self-image, using Structural Analysis of Social Behavior (SASB), before and after treatment at an intensive outpatient program. Another aim was to relate the self-image of the young patients to the outcome measures body mass index (BMI) and Children's Global Assessment Scale (C-GAS) score. METHODS: A total of 93 individuals (32 adolescents, 34 mothers, and 27 fathers) completed the SASB self-report questionnaire before and after family-based treatment combined with an individual approach at a child and youth psychiatry day care unit. The patients were also assessed using the C-GAS, and their BMI was calculated. RESULTS: The self-image (SASB) of the adolescent patients was negative before treatment and changed to positive after treatment, especially regarding the clusters self-love (higher) and self-blame (lower). A positive correlation was found between change in self-love and change in C-GAS score, and the C-GAS score rose significantly. Increased self-love was an important factor, explaining 26% of the variance. BMI also increased significantly, but without any correlation to change in SASB. The patients' fathers scored low on the cluster self-protection. Mothers' profiles were in line with a non-clinical group. CONCLUSIONS: Results indicate that the self-image of adolescent patients changes from negative to positive alongside a mainly positive outcome of the ED after treatment. Low self-protection according to SASB among fathers suggests the need for greater focus on their involvement.


Subject(s)
Body image/psychology , Family therapy , Feeding and eating disorders/psychology , Feeding and eating disorders/therapy , Self concept , Adolescent , Child , Female , Humans , Male , Outpatients , Parents , Treatment outcome
17.
BMC Fam Pract ; 16: 21, 2015 Feb 21.
Article in English | MEDLINE | ID: mdl-25888369

ABSTRACT

BACKGROUND: Many physicians in Sweden, as well as in other countries, find the matter of certification of sickness absence (COSA) particularly burdensome. The issuing of COSAs has also been perceived as a work-environment problem among physicians. General practitioners (GPs) are the group of physicians in Sweden with the highest proportion experiencing difficulties with COSA. Swedish authorities have created several initiatives, by changing the social security system, to improve the rehabilitation of people who are ill and decrease the number of days of sick leave used. The aim of this study was to describe how GPs in Sweden perceive their work with COSA after these changes. METHODS: A descriptive design with a qualitative, inductive focus-group discussion (FGD) approach was used. RESULTS: Four categories emerged from the analysis of FGDs with GPs in Sweden: 1) Physicians' difficulties in their professional role; 2) Collaboration with other professionals facilitates the COSA; 3) Physicians' approach in relation to the patient; 4) An easier COSA process. CONCLUSIONS: Swedish GPs still perceived COSA to be a burdensome task. However, system changes in recent years have facilitated work related to COSA. Cooperation with other professionals on COSA was perceived positively.


Subject(s)
Absenteeism , General practitioners , Social security , Attitude of health personnel , Documentation , Focus groups , General practitioners/psychology , Humans , Medical records , Professional role , Qualitative research , Sick leave/legislation and jurisprudence , Sick leave/statistics and numerical data , Social security/legislation and jurisprudence , Sweden
18.
Regul Toxicol Pharmacol ; 71(2): 279-84, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25559551

ABSTRACT

Conformal prediction is presented as a framework which fulfills the OECD principles on (Q)SAR. It offers an intuitive extension to the application of machine-learning methods to structure-activity data where focus is on predictions with pre-defined confidence levels. A conformal predictor will make correct predictions on new compounds corresponding to a user-defined confidence level. The confidence level can be altered depending on the situation the predictor is being used in, which allows for flexibility and adaptation to risks that the user is willing to take. We demonstrate the usefulness of conformal prediction by applying it to 2 publicly available CAESAR binary classification datasets.


Subject(s)
Factual databases , Drug and narcotic control/legislation and jurisprudence , Theoretical models , Molecular conformation , Drug and narcotic control/methods , Forecasting , Quantitative structure-activity relationship
19.
J Chem Inf Model ; 55(1): 125-34, 2015 Jan 26.
Article in English | MEDLINE | ID: mdl-25406036

ABSTRACT

We consider the impact of gross, systematic, and random experimental errors in relation to their impact on the predictive ability of QSAR/QSPR DMPK models used within early drug discovery. Models whose training sets contain fewer but repeatedly measured data points, with a defined threshold for the random error, resulted in prediction improvements ranging from 3.3% to 23.0% for an external test set, compared to models built from training sets in which the molecules were defined by single measurements. Similarly, models built on data with low experimental uncertainty, compared to those built on data with higher experimental uncertainty, gave prediction improvements ranging from 3.3% to 27.5%.


Subject(s)
Pharmaceutical preparations/metabolism , Quantitative structure-activity relationship , Animals , Drug discovery , Preclinical drug evaluation/methods , Humans , Pharmacokinetics , Research design
20.
J Chem Inf Model ; 55(1): 19-25, 2015 Jan 26.
Article in English | MEDLINE | ID: mdl-25493610

ABSTRACT

Growing data sets with increased time for analysis are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared them with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.


Subject(s)
Computational biology/methods , Computing methodologies , Chemical databases , Drug discovery/methods , Quantitative structure-activity relationship , Factual databases , Internet , Ligands , Software