Search | BVS Doenças Infecciosas e Parasitárias (Infectious and Parasitic Diseases)

Quantifying the Benefits of Imputation over QSAR Methods in Toxicology Data Modeling.

Whitehead, Thomas M; Strickland, Joel; Conduit, Gareth J; Borrel, Alexandre; Mucs, Daniel; Baskerville-Abraham, Irene.

J Chem Inf Model ; 64(7): 2624-2636, 2024 Apr 08.

Article in English | MEDLINE | ID: mdl-38091381

ABSTRACT

Imputation machine learning (ML) surpasses traditional approaches in modeling toxicity data. The method was tested on an open-source data set comprising approximately 2500 ingredients with limited in vitro and in vivo data obtained from the OECD QSAR Toolbox. By leveraging the relationships between different toxicological end points, imputation extracts more valuable information from each data point compared to well-established single end point methods, such as ML-based Quantitative Structure Activity Relationship (QSAR) approaches, providing a final improvement of up to around 0.2 in the coefficient of determination. A significant aspect of this methodology is its resilience to the inclusion of extraneous chemical or experimental data. While additional data typically introduces a considerable level of noise and can hinder performance of single end point QSAR modeling, imputation models remain unaffected. This implies a reduction in the need for laborious manual preprocessing tasks such as feature selection, thereby making data preparation for ML analysis more efficient. This successful test, conducted on open-source data, validates the efficacy of imputation approaches in toxicity data analysis. This work opens the way for applying similar methods to other types of sparse toxicological data matrices, and so we discuss the development of regulatory authority guidelines to accept imputation models, a key aspect for the wider adoption of these methods.

Subjects

Quantitative Structure-Activity Relationship, Toxicology, Toxicology/methods
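
The "improvement of up to around 0.2 in the coefficient of determination" quoted in the abstract above refers to R², which compares a model's squared prediction error with the variance of the measured values. As a minimal sketch of that metric only (not the authors' evaluation code; the function name `r_squared` and the sample values are illustrative assumptions), it could be computed as follows:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Made-up values: a perfect model and a model that always predicts the mean.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

An R² of 1 means perfect prediction and 0 means no better than predicting the mean, so a gain of 0.2 is a sizeable share of that range.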

Machine learning to predict mesenchymal stem cell efficacy for cartilage repair.

Liu, Yu Yang Fredrik; Lu, Yin; Oh, Steve; Conduit, Gareth J.

PLoS Comput Biol ; 16(10): e1008275, 2020 Oct.

Article in English | MEDLINE | ID: mdl-33027251

ABSTRACT

Inconsistent therapeutic efficacy of mesenchymal stem cells (MSCs) in regenerative medicine has been documented in many clinical trials. Precise prediction of the therapeutic outcome of an MSC therapy based on the patient's condition would provide a valuable reference for clinicians deciding on treatment strategies. In this article, we performed a meta-analysis of MSC therapies for cartilage repair using machine learning. A small database was generated from published in vivo and clinical studies. The unique features of our neural network model in handling missing data and calculating prediction uncertainty enabled precise prediction of post-treatment cartilage repair scores with a coefficient of determination of 0.637 ± 0.005. From this model, we identified defect area percentage, defect depth percentage, implantation cell number, body weight, tissue source, and the type of cartilage damage as critical properties that significantly impact cartilage repair. A dosage of 17-25 million MSCs was found to achieve optimal cartilage repair. Further, critical thresholds at 6% and 64% of cartilage damage in area, and 22% and 56% in depth, were predicted to significantly compromise the efficacy of MSC therapy. This study, for the first time, demonstrated machine learning of patient-specific cartilage repair post MSC therapy. This approach can be applied to identify and investigate more critical properties involved in MSC-induced cartilage repair, and adapted for other clinical indications.

Subjects

Cartilage, Machine Learning, Mesenchymal Stem Cell Transplantation, Tissue Engineering/methods, Animals, Cartilage/cytology, Cartilage/injuries, Cartilage/surgery, Computational Biology, Humans, Mesenchymal Stem Cells/cytology, Mesenchymal Stem Cells/physiology, Biological Models, Rabbits, Rats, Swine

Practical Applications of Deep Learning To Impute Heterogeneous Drug Discovery Data.

Irwin, Benedict W J; Levell, Julian R; Whitehead, Thomas M; Segall, Matthew D; Conduit, Gareth J.

J Chem Inf Model ; 60(6): 2848-2857, 2020 Jun 22.

Article in English | MEDLINE | ID: mdl-32478517

ABSTRACT

Contemporary deep learning approaches still struggle to bring a useful improvement in the field of drug discovery because of the challenges of sparse, noisy, and heterogeneous data that are typically encountered in this context. We use a state-of-the-art deep learning method, Alchemite, to impute data from drug discovery projects, including multitarget biochemical activities, phenotypic activities in cell-based assays, and a variety of absorption, distribution, metabolism, and excretion (ADME) endpoints. The resulting model gives excellent predictions for activity and ADME endpoints, offering an average increase in R² of 0.22 versus quantitative structure-activity relationship methods. The model accuracy is robust to combining data across uncorrelated endpoints and projects with different chemical spaces, enabling a single model to be trained for all compounds and endpoints. We demonstrate improvements in accuracy on the latest chemistry and data when updating models with new data as an ongoing medicinal chemistry project progresses.

Subjects

Deep Learning, Drug Discovery, Pharmaceutical Chemistry, Quantitative Structure-Activity Relationship

An Open Drug Discovery Competition: Experimental Validation of Predictive Models in a Series of Novel Antimalarials.

Tse, Edwin G; Aithani, Laksh; Anderson, Mark; Cardoso-Silva, Jonathan; Cincilla, Giovanni; Conduit, Gareth J; Galushka, Mykola; Guan, Davy; Hallyburton, Irene; Irwin, Benedict W J; Kirk, Kiaran; Lehane, Adele M; Lindblom, Julia C R; Lui, Raymond; Matthews, Slade; McCulloch, James; Motion, Alice; Ng, Ho Leung; Öeren, Mario; Robertson, Murray N; Spadavecchio, Vito; Tatsis, Vasileios A; van Hoorn, Willem P; Wade, Alexander D; Whitehead, Thomas M; Willis, Paul; Todd, Matthew H.

J Med Chem ; 64(22): 16450-16463, 2021 Nov 25.

Article in English | MEDLINE | ID: mdl-34748707

ABSTRACT

The Open Source Malaria (OSM) consortium is developing compounds that kill the human malaria parasite, Plasmodium falciparum, by targeting PfATP4, an essential ion pump on the parasite surface. The structure of PfATP4 has not been determined. Here, we describe a public competition created to develop a predictive model for the identification of PfATP4 inhibitors, thereby reducing project costs associated with the synthesis of inactive compounds. Competition participants could see all entries as they were submitted. In the final round, featuring private sector entrants specializing in machine learning methods, the best-performing models were used to predict novel inhibitors, of which several were synthesized and evaluated against the parasite. Half possessed biological activity, with one featuring a motif that the human chemists familiar with this series would have dismissed as "ill-advised". Since all data and participant interactions remain in the public domain, this research project "lives" and may be improved by others.

Subjects

Antimalarials/chemistry, Antimalarials/pharmacology, Calcium-Transporting ATPases/antagonists & inhibitors, Drug Discovery, Enzyme Inhibitors/chemistry, Enzyme Inhibitors/pharmacology, Biological Models, Humans, Plasmodium falciparum/drug effects, Plasmodium falciparum/enzymology, Structure-Activity Relationship

OPTIMADE, an API for exchanging materials data.

Andersen, Casper W; Armiento, Rickard; Blokhin, Evgeny; Conduit, Gareth J; Dwaraknath, Shyam; Evans, Matthew L; Fekete, Ádám; Gopakumar, Abhijith; Grazulis, Saulius; Merkys, Andrius; Mohamed, Fawzi; Oses, Corey; Pizzi, Giovanni; Rignanese, Gian-Marco; Scheidgen, Markus; Talirz, Leopold; Toher, Cormac; Winston, Donald; Aversa, Rossella; Choudhary, Kamal; Colinet, Pauline; Curtarolo, Stefano; Di Stefano, Davide; Draxl, Claudia; Er, Suleyman; Esters, Marco; Fornari, Marco; Giantomassi, Matteo; Govoni, Marco; Hautier, Geoffroy; Hegde, Vinay; Horton, Matthew K; Huck, Patrick; Huhs, Georg; Hummelshøj, Jens; Kariryaa, Ankit; Kozinsky, Boris; Kumbhar, Snehal; Liu, Mohan; Marzari, Nicola; Morris, Andrew J; Mostofi, Arash A; Persson, Kristin A; Petretto, Guido; Purcell, Thomas; Ricci, Francesco; Rose, Frisco; Scheffler, Matthias; Speckhard, Daniel; Uhrin, Martin.

Sci Data ; 8(1): 217, 2021 Aug 12.

Article in English | MEDLINE | ID: mdl-34385453

ABSTRACT

The Open Databases Integration for Materials Design (OPTIMADE) consortium has designed a universal application programming interface (API) to make materials databases accessible and interoperable. We outline the first stable release of the specification, v1.0, which is already supported by many leading databases and several software packages. We illustrate the advantages of the OPTIMADE API through worked examples on each of the public materials databases that support the full API specification.
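
To make the idea of a common API concrete, the sketch below builds a query URL for the `structures` endpoint using the OPTIMADE filter syntax. It is an illustration only, not one of the paper's worked examples: the base URL `https://example.org/optimade` is a placeholder rather than a real provider, and the helper name `build_structures_url` is hypothetical. No network request is made.

```python
from urllib.parse import urlencode


def build_structures_url(base_url: str, filter_expr: str, page_limit: int = 10) -> str:
    """Return a URL for an OPTIMADE v1 `structures` query.

    Hypothetical helper; `base_url` must point at a provider's OPTIMADE root.
    """
    # `filter` and `page_limit` are standard OPTIMADE query parameters;
    # urlencode percent-encodes the quotes and commas in the filter string.
    query = urlencode({"filter": filter_expr, "page_limit": page_limit})
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# Binary compounds containing both Si and O, five results per page.
# The base URL is a placeholder, not a real OPTIMADE provider.
url = build_structures_url(
    "https://example.org/optimade",
    'elements HAS ALL "Si","O" AND nelements=2',
    page_limit=5,
)
print(url)
```

Because every compliant database accepts the same endpoint and filter grammar, only the base URL should need to change to send the same query to a different provider.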

Tail-regression estimator for heavy-tailed distributions of known tail indices and its application to continuum quantum Monte Carlo data.

Ríos, Pablo López; Conduit, Gareth J.

Phys Rev E ; 99(6-1): 063312, 2019 Jun.

Article in English | MEDLINE | ID: mdl-31330629

ABSTRACT

Standard statistical analysis is unable to provide reliable confidence intervals on expectation values of probability distributions that do not satisfy the conditions of the central limit theorem. We present a regression-based estimator of an arbitrary moment of a probability distribution with power-law heavy tails that exploits knowledge of the exponents of its asymptotic decay to bypass this issue entirely. Our method is applied to synthetic data and to energy and atomic force data from variational and diffusion quantum Monte Carlo calculations, whose distributions have known asymptotic forms [J. R. Trail, Phys. Rev. E 77, 016703 (2008), doi:10.1103/PhysRevE.77.016703; A. Badinski et al., J. Phys.: Condens. Matter 22, 074202 (2010), doi:10.1088/0953-8984/22/7/074202]. We obtain convergent, accurate confidence intervals on the variance of the local energy of an electron gas and on the Hellmann-Feynman force on an atom in the all-electron carbon dimer. In these two cases the uncertainty on our estimator is 45% smaller and 60 times smaller, respectively, than the nominal (ill-defined) standard error.
