ABSTRACT
The concept of similarity is central to many in silico prediction approaches. Most of these approaches follow the similarity property principle, which states that two or more compounds with a high level of similarity are expected to exert similar biological activity or show similar physicochemical properties. Although this principle fails to predict the activity or property of certain compounds efficiently, it applies to most compounds in a given dataset. With the emerging need to fill data gaps efficiently in the regulatory context, Read-Across (RA), a similarity-based approach, has gained popularity because, unlike QSAR, it is not a statistical approach that requires a sizeable number of data points to train a meaningful model. The basic idea behind Read-Across is to identify the close source neighbors and, based on similarity considerations, to make predictions for the query compound. Although RA is originally an unsupervised prediction method, recent efforts toward quantitative Read-Across (q-RA) have introduced supervised similarity-based weighting for quantitative predictions. RA is a useful tool in predictive toxicology, but one of its important drawbacks is the lack of interpretability of the features (especially for q-RA) used to generate the Read-Across-based predictions. To bridge this gap, a novel quantitative Read-Across Structure-Activity Relationship (q-RASAR) approach has recently been proposed, which combines the concepts of QSAR and Read-Across, generating statistically reliable and predictive models using similarity and error-based descriptors. The q-RASAR models are simple and interpretable and can be efficiently used to identify not only the essential features but also the nature of the source and query compounds. In this chapter, we discuss the concepts and various studies on RA, q-RA, and q-RASAR along with some of the tools available from different research groups.
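The similarity-weighted prediction idea described in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel on Euclidean distance as the similarity measure; the function name `read_across_predict` and the parameters `k` and `sigma` are hypothetical choices made for illustration, not the settings of any published Read-Across tool.

```python
import numpy as np

def read_across_predict(X_source, y_source, x_query, k=3, sigma=1.0):
    """Predict a query response as the similarity-weighted mean of the
    k closest source compounds (illustrative assumption, not a reference method)."""
    X_source = np.asarray(X_source, dtype=float)
    y_source = np.asarray(y_source, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Euclidean distance from the query to every source compound
    distances = np.linalg.norm(X_source - x_query, axis=1)
    # Gaussian kernel turns distance into a similarity in (0, 1]
    similarities = np.exp(-distances**2 / (2.0 * sigma**2))

    # Indices of the k close source neighbors (stable sort keeps ties in order)
    nearest = np.argsort(distances, kind="stable")[:k]
    weights = similarities[nearest]
    # Note: very distant neighbors can underflow to zero weight
    prediction = np.sum(weights * y_source[nearest]) / np.sum(weights)
    return float(prediction), nearest

# Two equally close neighbors with responses 1 and 3; the far compound is ignored
pred, idx = read_across_predict([[0, 0], [1, 0], [10, 10]], [1, 3, 100], [0.5, 0], k=2)
print(pred, idx)
```

In practice the descriptors would be scaled before distances are computed, and `k` and `sigma` would be tuned on the source set.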
Subjects
Quantitative Structure-Activity Relationship, Computer Simulation, Toxicology/methods, Algorithms, Humans, Computational Biology/methods, Software
ABSTRACT
Oncorhynchus clarkii, Salvelinus fontinalis, and Salvelinus namaycush are vital trout species in North America, crucial for maintaining ecological balance, economic stability, and human health. These species thrive in cold, unpolluted waters and are highly vulnerable to contaminants. Given the rapid proliferation of industrial organic chemicals, traditional in vivo toxicity testing methods are inadequate to ensure timely and comprehensive risk assessments. Therefore, we employed in silico tools, namely Quantitative Structure-Activity Relationship (QSAR) and Quantitative Read-Across Structure-Activity Relationship (q-RASAR), to efficiently predict the aquatic toxicity of chemicals. Utilizing acute median lethal concentration (LC50) data from the US EPA's ToxValDB, we developed the first-ever species-specific QSAR and q-RASAR models. The q-RASAR models outperformed traditional QSAR models by achieving higher internal and external statistical quality for each species. Key toxicity-determining descriptors included electrotopological state indices, autocorrelation descriptors, and similarity-based RASAR descriptors. For O. clarkii, the presence of chlorine atoms and rotatable bonds significantly influenced toxicity. S. fontinalis toxicity was strongly affected by polarizability and van der Waals volumes, while S. namaycush showed sensitivity to weak hydrogen bond acceptors and topological complexity. The models predicted the toxicity of 1172 external compounds, identifying the most and least toxic chemicals for each species. This study not only offers the first comprehensive q-RASAR models for predicting trout species-specific toxicity but also provides novel insights into species-specific toxicological modes of action. The results contribute significantly to chemical screening and prioritization in aquatic risk assessments, effectively filling critical data gaps and advancing predictive modeling techniques.
ABSTRACT
Per- and polyfluoroalkyl substances (PFASs) are widely used in modern industry, causing many adverse effects on both the environment and human health. In this study, for the first time, we followed OECD guidelines to systematically investigate the quantitative structure-activity relationship (QSAR) of the oral acute toxicity of PFASs to rat and mouse using simple 2D descriptors. The Read-Across similarity descriptors and 2D descriptors were also combined to develop the quantitative read-across structure-activity relationship (q-RASAR) models. Interspecies toxicity (iST) correlation was also explored between the two rodent species. All developed QSAR, q-RASAR, and iST models met the state-of-the-art validation criteria and were applied for toxicity predictions of hundreds of untested PFASs in true external sets. Subsequently, we performed the priority ranking of the untested PFASs based on the model predictions, with the mechanistic interpretation of the top 20 most toxic PFASs predicted by both QSAR and q-RASAR models. The two univariate iST models were also used for filling the interspecies toxicity data gap. Overall, the developed QSAR, q-RASAR, and iST models can be used as effective tools for predicting the oral acute toxicity of untested PFASs to rat and mouse, and are thus important for the risk assessment of PFASs in the ecological environment.
ABSTRACT
This study utilized available oral acute toxicity data in rat and mouse for polychlorinated persistent organic pollutants (PC-POPs) to construct data fusion-driven machine learning (ML) global models. Based on atom-centered fragments (ACFs), the collected high-throughput data overcame the applicability limitations, enabling accurate toxicity prediction for a wide range of PC-POPs series compounds using only single models. The data variances in the rat training and test sets were 1.52 and 1.34, respectively, while for the mouse, the values were 1.48 and 1.36, respectively. A genetic algorithm (GA) was used to build multiple linear regression (MLR) models and pre-screen descriptors, addressing the "black-box" problem prevalent in ML and enhancing model interpretability. The best ML models for rat and mouse achieved approximately 90% prediction reliability for over 100,000 true untested compounds. Ultimately, a warning list of highly toxic compounds for eight categories of polychlorinated atom-centered fragments (PCACFs) was generated based on the prediction results. The analysis of descriptors revealed that dioxin analogs generally exhibited higher toxicity, because the heteroatoms and ring systems increased structural complexity and formed larger conjugated systems, contributing to greater oral acute toxicity. The present study provides valuable insights for guiding subsequent in vivo tests, environmental risk assessment, and the improvement of the global governance system for pollutants.
ABSTRACT
Salmon are crucial to ecosystems and economic activities like commercial fishing and aquaculture, while also serving as an important source of nutrients, underscoring their ecological significance and the need for sustainable management. To better understand the toxicity and biological interactions between salmon and industrial chemicals in the aquatic environment, we utilized the ToxValDB database to develop the first-ever computational toxicity models for six salmon subspecies (covering Atlantic and Pacific salmon) across two genera, employing Quantitative Structure-Activity Relationship (QSAR) and quantitative Read-Across Structure-Activity Relationship (q-RASAR) methods. For the three smaller datasets (Oncorhynchus nerka, Oncorhynchus keta, and Oncorhynchus gorbuscha), we created mathematical models using the entire datasets, and the QSAR models demonstrated superior statistical quality compared to q-RASAR. Conversely, the three larger datasets (Oncorhynchus kisutch, Oncorhynchus tshawytscha, and Salmo salar) were divided into training and test sets, and the q-RASAR models yielded better results than the QSAR models. Mechanistic interpretations of these models revealed that descriptors such as Burden eigenvalues (BCUT), autocorrelation of topological structure (ATSC), and molecular polarizability were significant predictors of toxicity. For instance, higher polarizability and certain topological features were associated with increased toxicity as per the developed models. The statistically superior model for each subspecies was used to predict the aquatic toxicity of 1085 untested organic chemicals for toxicity data gap filling and risk assessment, taking the applicability domain (AD) into account. These insights are pivotal for designing safer chemicals and emphasize the need for sustainable management of salmon populations.
Subjects
Computer Simulation, Quantitative Structure-Activity Relationship, Salmon, Chemical Water Pollutants, Animals, Chemical Water Pollutants/toxicity
ABSTRACT
The escalating introduction of pesticides and veterinary drugs into the environment has necessitated a rapid evaluation of their potential risks to ecosystems and human health. The developmental toxicity of pesticides and veterinary drugs has been little explored, and large-scale predictions for untested pesticides, veterinary drugs, and bio-pesticides even less so. Alternative methods like quantitative structure-activity relationship (QSAR) are promising because of their potential to ensure the sustainable and safe use of these chemicals. We collected 133 pesticides and veterinary drugs with half-maximal active concentration (AC50) as the zebrafish embryo developmental toxicity endpoint. The QSAR model development adhered to rigorous OECD principles, ensuring that the model possessed good internal robustness (R2 > 0.6 and Q2LOO > 0.6) and external predictivity (R2test > 0.7, Q2Fn > 0.7, and CCCtest > 0.85). To further enhance the predictive performance of the model, a quantitative read-across structure-activity relationship (q-RASAR) model was established using the combined set of RASAR and 2D descriptors. Mechanistic interpretation revealed that dipole moment, the presence of a C-O fragment at topological distance 10, molecular size, lipophilicity, and the Euclidean distance (ED)-based RA function were the main factors influencing toxicity. For the first time, the established QSAR and q-RASAR models were combined to prioritize the developmental toxicity of a vast array of true external compounds (pesticides, veterinary drugs, and bio-pesticides) lacking experimental values. The prediction reliability of each query molecule was evaluated by the leverage approach and the prediction reliability indicator. Overall, the dual computational toxicology models can inform decision-making and guide the design of new pesticides and veterinary drugs with improved safety profiles.
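The validation thresholds quoted in this abstract refer to standard regression metrics. The sketch below shows one common way to compute R2, Q2F1, and the concordance correlation coefficient (CCC) from observed and predicted values. The function names are hypothetical, and only the F1 variant of the Q2Fn family is shown; the abstract does not say which variants the authors reported.

```python
import numpy as np

def r2(y_obs, y_pred):
    """Coefficient of determination on a single data set."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def q2_f1(y_test, y_test_pred, y_train):
    """External Q2F1: test-set residuals scaled by deviation from the TRAINING mean."""
    y_test, y_test_pred = np.asarray(y_test, float), np.asarray(y_test_pred, float)
    y_train_mean = np.mean(np.asarray(y_train, float))
    ss_res = np.sum((y_test - y_test_pred) ** 2)
    ss_ref = np.sum((y_test - y_train_mean) ** 2)
    return 1.0 - ss_res / ss_ref

def ccc(y_obs, y_pred):
    """Concordance correlation coefficient (population variances)."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    cov = np.mean((y_obs - y_obs.mean()) * (y_pred - y_pred.mean()))
    return 2.0 * cov / (y_obs.var() + y_pred.var() + (y_obs.mean() - y_pred.mean()) ** 2)

# Toy numbers, invented purely for illustration
y_train = [1.0, 2.0, 3.0]
y_test = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.7]
print(r2(y_test, y_hat), q2_f1(y_test, y_hat, y_train), ccc(y_test, y_hat))
```

A model that simply predicts the training mean for every test compound scores Q2F1 = 0, which is why values above the quoted thresholds indicate genuine external predictivity.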
Subjects
Nonmammalian Embryo, Pesticides, Quantitative Structure-Activity Relationship, Zebrafish, Animals, Pesticides/toxicity, Pesticides/chemistry, Nonmammalian Embryo/drug effects, Embryonic Development/drug effects
ABSTRACT
We have developed a quantitative safety prediction model for subchronic repeated doses of diverse organic chemicals on rats using the novel quantitative read-across structure-activity relationship (q-RASAR) approach, which uses similarity-based descriptors for predictive model generation. The experimental -Log (NOAEL) values have been used here as a potential indicator of oral subchronic safety on rats as it determines the maximum dose level for which no observed adverse effects of chemicals are found. A total of 186 data points of diverse organic chemicals have been used for the model generation using structural and physicochemical (0D-2D) descriptors. The read-across-derived similarity, error, and concordance measures (RASAR descriptors) have been extracted from the preliminary 0D-2D descriptors. Then, the combined pool of RASAR and the identified 0D-2D descriptors of the training set were employed to develop the final models by using the partial least squares (PLS) algorithm. The developed PLS model was rigorously validated by various internal and external validation metrics as suggested by the Organization for Economic Co-operation and Development (OECD). The final q-RASAR model is proven to be statistically sound, robust and externally predictive (R2 = 0.85, Q2LOO = 0.82 and Q2F1 = 0.94), superseding the internal as well as external predictivity of the corresponding quantitative structure-activity relationship (QSAR) model as well as previously reported subchronic repeated dose toxicity model found in the literature. In a nutshell, the q-RASAR is an effective approach that has the potential to be used as a good alternative way to improve external predictivity, interpretability, and transferability for subchronic oral safety prediction as well as ecotoxicity risk identification.
Subjects
No-Observed-Adverse-Effect Level, Organic Chemicals, Quantitative Structure-Activity Relationship, Animals, Rats, Organic Chemicals/toxicity, Organic Chemicals/chemistry, Oral Administration, Subchronic Toxicity Tests/methods, Male, Drug Dose-Response Relationship, Risk Assessment, Female
ABSTRACT
To analyze the persistence and ecotoxicological impact of veterinary pharmaceuticals on different terrestrial species, different classes of veterinary pharmaceuticals (n = 37) with soil degradation data (DT50) were gathered and subjected to QSAR and q-RASAR model development. The models were developed from 2D descriptors under Organization for Economic Co-operation and Development (OECD) guidelines using multiple linear regression with a genetic algorithm. All developed QSAR and q-RASAR models were statistically significant (internal: R2adj = 0.721-0.861, Q2LOO = 0.609-0.757; external: Q2Fn = 0.597-0.933, MAEext = 0.174-0.260). Further, the leverage approach to the applicability domain confirmed the models' reliability. The veterinary pharmaceuticals with no experimental values were classified based on their persistence level. The terrestrial toxicity of the persistent veterinary pharmaceuticals was then analyzed using toxicity prediction by computer assisted technology and in-house quantitative structure-toxicity relationship models to prioritize the toxic and persistent veterinary pharmaceuticals. This study will be helpful in estimating the persistence and toxicity of existing and upcoming veterinary pharmaceuticals.
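The leverage approach to the applicability domain mentioned in this abstract is usually computed from the hat matrix of the training descriptors. The sketch below is one common formulation, assuming an intercept column is added and the warning threshold is h* = 3(p + 1)/n for p descriptors and n training compounds. The function name `leverage_ad` is hypothetical, and the abstract does not state which threshold the authors used.

```python
import numpy as np

def leverage_ad(X_train, X_query):
    """Return query leverages, the warning threshold h*, and an inside-AD mask."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, p = X_train.shape

    # Add an intercept column, as in an ordinary MLR model (assumption)
    A = np.hstack([np.ones((n, 1)), X_train])
    Q = np.hstack([np.ones((X_query.shape[0], 1)), X_query])

    # Pseudo-inverse guards against a singular X'X for collinear descriptors
    xtx_inv = np.linalg.pinv(A.T @ A)
    # h_i = x_i' (X'X)^-1 x_i for every query row
    h = np.einsum("ij,jk,ik->i", Q, xtx_inv, Q)

    h_star = 3.0 * (p + 1) / n
    return h, h_star, h <= h_star

# One descriptor, six training compounds; the second query is far outside the range
h, h_star, inside = leverage_ad([[0], [1], [2], [3], [4], [5]], [[2.5], [100.0]])
print(h, h_star, inside)
```

A query whose leverage exceeds h* is structurally distant from the training set, so its prediction is an extrapolation and should be treated as less reliable.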
ABSTRACT
In the modern fast-paced lifestyle, time-efficient and nutritionally rich foods like corn and oat have gained popularity for their amino acid and antioxidant contents. The increasing demand for these cereals necessitates higher production, which leads to dependency on agrochemicals that can pose health risks through residues present in the plant products. To provide the first report of phytotoxicity for corn and oat, our study employs QSAR, quantitative Read-Across, and quantitative RASAR (q-RASAR). All developed QSAR and q-RASAR models were equally robust (R2 = 0.680-0.762, Q2LOO = 0.593-0.693, Q2F1 = 0.680-0.860), and each was superior for either the oat or the corn model, respectively, based on MAE criteria. Applicability domain (AD) and prediction reliability indicator (PRI) analyses were performed and confirm the reliability and predictability of the models. The mechanistic interpretation reveals that the symmetrical arrangement of electronegative atoms and polar groups directly influences the toxicity of compounds. The final phytotoxicity prediction and prioritization were performed by a consensus approach, which resulted in the selection of the 15 most toxic compounds for both species.
Subjects
Quantitative Structure-Activity Relationship, Zea mays, Avena, Agrochemicals/toxicity, Consensus, Reproducibility of Results, Risk Assessment
ABSTRACT
Labeo rohita, a fish species within the Carp family, holds significant dietary and aquacultural importance in South Asian countries. However, the habitats of L. rohita often face exposure to various harmful pesticides and organic compounds originating from industrial and agricultural runoff. It is challenging to individually investigate the effects of each potentially harmful compound. In such cases, in silico techniques like Quantitative Structure-Activity Relationship (QSAR) and quantitative Read-Across Structure-Activity Relationship (q-RASAR) can be employed to construct algorithmic models capable of simultaneously assessing the toxicity of numerous compounds. We utilized the US EPA's ToxValDB database to curate data regarding acute median lethal concentration (LC50) toxicity for L. rohita. The experimental variables included study type (mortality), study duration (ranging from 0.25 h to 4 h), exposure route (static, flowthrough, and renewal), exposure method (drinking water), and types of chemicals (industrial chemicals and pharmaceuticals). Using this dataset, we developed regression-based QSAR and q-RASAR models to predict chemical toxicity to L. rohita based on chemical descriptors. The key descriptors for predicting toxicity to L. rohita in the regression-based QSAR model include F05[S-Cl], SpMax_EA(ri), s4_relPathLength_2, and SpDiam_AEA(ed). These descriptors can be employed to estimate the toxicity of untested compounds and aid in the development of compounds with lower toxicity based on the presence or absence of these descriptors. Both the QSAR and q-RASAR models serve as valuable tools for understanding the structural features of chemicals responsible for toxicity and for filling gaps in aquatic toxicity data by predicting the toxicity of new untested compounds to L. rohita. Finally, the best model developed was employed to predict the toxicity of 297 external chemicals; the most toxic substances to L. rohita were identified as cyhalothrin, isobornyl thiocyanatoacetate, and paclobutrazol, while the least toxic ones included ethyl acetate, ethylthiourea, and n-butyric acid.
Subjects
Cyprinidae, Biological Toxins, Animals, Quantitative Structure-Activity Relationship, Computer Simulation, Lethal Dose 50, Organic Chemicals/toxicity
ABSTRACT
We have developed quantitative toxicity prediction models for organic pesticides of agricultural importance considering different fish species using a novel quantitative Read-across structure-activity relationship (q-RASAR) approach. The current study uses experimental (Log 1/LC50) data of organic pesticides to various fish species, including Rainbow trout (RT: Oncorhynchus mykiss: 715 data points), Lepomis (LP: Lepomis macrochirus: 136 data points), and Miscellaneous (Pimephales promelas, Brachydanio rerio: 226 data points). This study has also discussed the validation of the developed models and the analysis of structural features that are important for aquatic toxicity towards fishes. The read-across-derived similarity, error, and concordance measures (RASAR descriptors) have been extracted from the preliminary 0D-2D descriptors; the combined pool of RASAR and selected 0D-2D descriptors have been used to develop the final models by employing partial least squares algorithm. All the q-RASAR models are acceptable in terms of goodness of fit, robustness, and external predictivity, superseding the quality of the respective QSAR models, as seen from the computed validation metrics. The q-RASAR is an effective approach that has the potential to be used as a good alternative way to enhance external predictivity, interpretability, and transferability for aquatic toxicity prediction as well as ecotoxicity potential identification.
Subjects
Cyprinidae, Oncorhynchus mykiss, Pesticides, Biological Toxins, Chemical Water Pollutants, Animals, Pesticides/toxicity, Pesticides/chemistry, Quantitative Structure-Activity Relationship, Chemical Water Pollutants/toxicity, Zebrafish
ABSTRACT
We report here a quantitative read-across structure-activity relationship (q-RASAR) model for the prediction of binary mixture toxicity (acute contact toxicity) in honey bees. Both the quantitative structure-activity relationship (QSAR) and the similarity-based read-across algorithms are used simultaneously to enhance the predictability of the model. Several similarity and error-based parameters, obtained from the read-across prediction tool, have been put together with the structural and physicochemical descriptors to develop the final q-RASAR model. The calculated statistical and validation metrics indicate the goodness-of-fit, robustness, and good predictability of the partial least squares (PLS) regression model. Machine learning algorithms like ridge regression, linear support vector machine (SVM), and non-linear SVM have been used to further enhance the predictability of the q-RASAR model. The prediction quality of the q-RASAR models outperforms the previously reported quasi-SMILES-based QSAR model in terms of external correlation coefficient (Q2F1 SVM q-RASAR: 0.935 vs. Q2VLD QSAR: 0.89). In this research, the toxicity values of several new untested binary mixtures have been predicted with the new models, and the reliability of the PLS predictions has been validated by the prediction reliability indicator tool. The q-RASAR approach can serve as a reliable, complementary, and integrative addition to the conventional experimental approaches of pesticide mixture risk assessment.
Subjects
Pesticides, Quantitative Structure-Activity Relationship, Bees, Animals, Reproducibility of Results, Algorithms, Machine Learning, Pesticides/toxicity
ABSTRACT
The availability of experimental nanotoxicity data is in general limited which warrants both the use of in silico methods for data gap filling and exploring novel methods for effective modeling. Read-Across Structure-Activity Relationship (RASAR) is an emerging cheminformatic approach that combines the usefulness of a QSAR model and similarity-based Read-Across predictions. In this work, we have generated simple, interpretable, and transferable quantitative-RASAR (q-RASAR) models which can efficiently predict the cytotoxicity of TiO2-based multi-component nanoparticles. A data set of 29 TiO2-based nanoparticles with specific amounts of noble metal precursors was rationally divided into training and test sets, and the Read-Across-based predictions for the test set were generated. The optimized hyperparameters and the similarity approach, which yield the best predictions, were used to calculate the similarity and error-based RASAR descriptors. A data fusion of the RASAR descriptors with the chemical descriptors was done followed by the best subset feature selection. The final set of selected descriptors was used to develop the q-RASAR models, which were validated using the stringent OECD criteria. Finally, a random forest model was also developed with the selected descriptors, which could efficiently predict the cytotoxicity of TiO2-based multi-component nanoparticles superseding previously reported models in the prediction quality thus showing the merits of the q-RASAR approach. To further evaluate the usefulness of the approach, we have applied the q-RASAR approach also to a second cytotoxicity data set of 34 heterogeneous TiO2-based nanoparticles which further confirmed the enhancement of external prediction quality of QSAR models after incorporation of RASAR descriptors.
Subjects
Nanoparticles, Quantitative Structure-Activity Relationship, Titanium/toxicity, Machine Learning, Nanoparticles/toxicity
ABSTRACT
Endocrine Disruptor Chemicals are synthetic or natural molecules in the environment that promote adverse modifications of endogenous hormone regulation in humans and/or in animals. In the present research, we have applied two-dimensional quantitative structure-activity relationship (2D-QSAR) modeling to analyze the structural features of these chemicals responsible for binding to the androgen receptors (logRBA) in rats. We have collected the receptor binding data from the EDKB database (https://www.fda.gov/science-research/endocrine-disruptor-knowledge-base/accessing-edkb-database) and then employed the DTC-QSAR tool, available from https://dtclab.webs.com/software-tools, for dataset division, feature selection, and model development. The final partial least squares model was evaluated using various stringent validation criteria. From the model, we interpreted that hydrophobicity, steroidal nucleus, bulkiness and a hydrogen bond donor at an appropriate position contribute to the receptor binding affinity, while presence of electron rich features like aromaticity and polar groups decrease the receptor binding affinity. Additionally we have also performed chemical Read-Across predictions using Read-Across-v3.1 available from https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home, and the results for the external validation metrics were found to be better than the QSAR-derived predictions. The best quality of external predictions emerged from the q-RASAR approach which combines both read-across and QSAR. To explore the essential features responsible for the receptor binding, pharmacophore mapping, molecular docking along with molecular dynamics simulation were also performed, and the results are in accordance with the QSAR/q-RASAR findings.
Subjects
Endocrine Disruptors, Quantitative Structure-Activity Relationship, Rats, Humans, Animals, Endocrine Disruptors/toxicity, Endocrine Disruptors/chemistry, Androgen Receptors/metabolism, Molecular Docking Simulation, Hormones
ABSTRACT
Quantitative structure-activity relationship (QSAR) and read-across techniques have recently been merged into a new emerging field of read-across structure-activity relationship (RASAR) that uses the chemical similarity concepts of read-across (an unsupervised step) and finally develops a supervised learning model (like QSAR). The RASAR method has so far been used only in case of graded predictions or classification modeling. In this work, we attempt, for the first time, to apply RASAR for quantitative predictions (q-RASAR) using a case study of androgen receptor binding affinity data. We have computed a number of error-based and similarity-based measures such as weighted standard deviation of the predicted values, coefficient of variation of the computed predictions, average similarity level of close training compounds for each query molecule, standard deviation and coefficient of variation of similarity levels, maximum similarity levels to positive and negative close training compounds, a concordance measure indicating similarity to positive, negative or both classes of close training compounds, etc. We have combined these additional measures with the selected chemical descriptors from the previously developed QSAR model and redeveloped new partial least squares models from the training set, and predicted the endpoint using the query data set. Interestingly, these new models outperform the internal and external validation quality of the original QSAR model. In this study, we have also introduced a new similarity-based concordance measure (Banerjee-Roy coefficient) that can significantly contribute to the model quality. A q-RASAR model also has the advantage over read-across predictions in providing easy interpretation and indicating quantitative contributions of important chemical features. The strategy described here should be applicable to other biological/toxicological/property data modeling for enhanced quality of predictions, easy interpretability, and efficient transferability.
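The similarity- and error-based measures listed in this abstract can be sketched for a single query compound as follows. This is an illustration under stated assumptions rather than the authors' implementation: a Gaussian kernel on Euclidean distance stands in for the similarity measure, "positive" close compounds are taken to be those at or above the training-set mean response, and the function name, dictionary keys, `k`, and `sigma` are hypothetical. The Banerjee-Roy concordance coefficient is deliberately omitted because its formula is not given in the abstract.

```python
import numpy as np

def rasar_descriptors(X_train, y_train, x_query, k=5, sigma=1.0):
    """Compute illustrative RASAR-style descriptors for one query compound."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    d = np.linalg.norm(X_train - x_query, axis=1)
    sim = np.exp(-d**2 / (2.0 * sigma**2))        # assumed similarity kernel
    close = np.argsort(d, kind="stable")[:k]      # k close training compounds
    s, y = sim[close], y_train[close]

    w = s / s.sum()                               # normalized similarity weights
    ra_pred = float(np.sum(w * y))                # weighted read-across prediction
    sd_w = float(np.sqrt(np.sum(w * (y - ra_pred) ** 2)))  # weighted SD of responses
    cv = sd_w / abs(ra_pred) if ra_pred != 0 else float("nan")

    threshold = y_train.mean()                    # assumed positive/negative cut-off
    pos, neg = s[y >= threshold], s[y < threshold]
    return {
        "ra_pred": ra_pred,
        "sd_weighted": sd_w,
        "cv_pred": cv,
        "avg_sim": float(s.mean()),
        "sd_sim": float(s.std()),
        "cv_sim": float(s.std() / s.mean()),
        "max_sim_pos": float(pos.max()) if pos.size else 0.0,
        "max_sim_neg": float(neg.max()) if neg.size else 0.0,
    }

desc = rasar_descriptors([[0.0], [1.0], [2.0], [10.0]], [1.0, 2.0, 3.0, 10.0], [1.0], k=3)
print(desc)
```

When such descriptors are computed for the training compounds themselves, each compound would need to be excluded from its own neighbor list so that it does not predict itself. The resulting columns can then be appended to the selected chemical descriptors before the final regression model is fitted.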