Systematic Approaches for the Encoding of Chemical Groups: A Case Study.

Karamertzanis, Panagiotis G; Patlewicz, Grace; Sannicola, Marta; Paul-Friedman, Katie; Shah, Imran.

Chem Res Toxicol; 37(4): 600-619, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38498310

ABSTRACT

Regulatory authorities aim to organize substances into groups to facilitate prioritization within hazard and risk assessment processes. Often, such chemical groupings are not explicitly defined by structural rules or physicochemical property information. This is largely because the groupings are developed through manual expert curation, which in turn makes updating and refining them, as new substances are evaluated, a practical challenge. Herein, machine learning methods were leveraged to build models that could preliminarily assign substances to predefined groups. A set of 86 groupings containing 2,184 substances, as published on the European Chemicals Agency (ECHA) website, was mapped to the U.S. Environmental Protection Agency (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database content to extract chemical and structural information. Substances were represented using Morgan fingerprints, and two machine learning approaches, k-nearest neighbor (kNN) and random forest (RF), were used to classify test substances into the 56 groups containing at least 10 substances with a structural representation in the data set. The two approaches led to mean 5-fold cross-validation test accuracies (average F1 scores) of 0.781 and 0.853, respectively. With a relative improvement of about 9%, the RF classifier was significantly more accurate than kNN (p-value = 0.001). The approach offers promise as a means of initially profiling new substances into predefined groups, to facilitate prioritization efforts and streamline the assessment of new substances when earlier groupings are available. The algorithm to fit and use these models has been made available in the accompanying repository, thereby enabling both use of the produced models and refitting of these models as new groupings become available from regulatory authorities or industry.
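As a rough illustration of the kNN side of the workflow described in the abstract, the sketch below assigns a substance to the majority group among its most similar neighbors and scores predictions with a macro-averaged F1. It is not the authors' implementation: the fingerprints are hypothetical sets of on-bit indices (in practice Morgan fingerprints would come from a cheminformatics toolkit), the Tanimoto similarity and `k=3` are assumptions not stated in the abstract, and the function names `tanimoto`, `knn_predict`, and `macro_f1` are invented for this example. The random forest model and the 5-fold cross-validation are not reproduced here.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(query, train, k=3):
    """Return the majority group among the k training substances most similar to `query`.

    `train` is a list of (fingerprint, group) pairs.
    """
    ranked = sorted(train, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(group for _, group in ranked[:k])
    return votes.most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-group F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two groups, fingerprints as sets of on-bit indices.
train = [
    ({1, 2, 3, 4}, "A"), ({1, 2, 3, 5}, "A"), ({2, 3, 4, 6}, "A"),
    ({10, 11, 12}, "B"), ({10, 11, 13}, "B"), ({11, 12, 14}, "B"),
]
test = [({1, 2, 3}, "A"), ({10, 11, 12, 13}, "B")]

predictions = [knn_predict(fp, train) for fp, _ in test]
truth = [group for _, group in test]
print(predictions, macro_f1(truth, predictions))
```

Macro averaging weights every group equally regardless of size, which is one plausible reading of "average F1 scores" when groups range from 10 substances upward; the abstract does not say which averaging the authors used.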

Subjects

Algorithms, Machine Learning, United States, United States Environmental Protection Agency, Databases, Factual
