ABSTRACT
Features governing the impact of nanoparticles on cells, such as the structure of the core, presence or absence of doping, quality of the surface, diameter, and dose, were used to define quasi-SMILES, a line of symbols encoding these physicochemical features. The correlation weight for each code in the quasi-SMILES was calculated by the Monte Carlo method. The descriptor, which is the sum of the correlation weights, is the basis for a one-variable model of the biological activity of nano-inhibitors of the human lung carcinoma cell line A549. The system of models obtained by this scheme was checked for self-consistency, i.e., whether the statistical quality of the models is reproduced across different distributions of the available nanomaterials into training and validation sets. The computational experiments confirm the excellent potential of the approach as a tool to predict the impact of nanomaterials under different experimental conditions. In conclusion, the self-consistent model system allows a user to assess the reliability of the statistical quality of the approach.
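The descriptor and the one-variable model described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the code symbols, the weight values, and the endpoint values are hypothetical, the Monte Carlo optimization of the weights is not shown, and ordinary least squares is assumed for the one-variable fit.

```python
from typing import Dict, List, Tuple


def descriptor(codes: List[str], weights: Dict[str, float]) -> float:
    """Sum the correlation weights of the codes in one quasi-SMILES."""
    # Codes without a stored weight contribute 0.0 (an assumption of this sketch).
    return sum(weights.get(code, 0.0) for code in codes)


def fit_one_variable(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical codes, weights, and endpoint values, for illustration only.
weights = {"[core:A]": 0.8, "[doped]": -0.3, "[dose:1]": 0.5}
samples = [["[core:A]", "[dose:1]"], ["[core:A]", "[doped]"], ["[doped]"]]
activity = [2.1, 1.4, 0.2]

x = [descriptor(codes, weights) for codes in samples]
intercept, slope = fit_one_variable(x, activity)
predicted = [intercept + slope * xi for xi in x]
```

In a real workflow the weights would come from the Monte Carlo optimization mentioned above, and the fit would use only the training set.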
ABSTRACT
Data on Henry's law constants make it possible to systematize geochemical conditions that affect the state of the atmosphere and consequently trigger climate change. Henry's law constants are needed to assess processes related to atmospheric contamination by pollutants. The most important pollutants are those capable of long-term movement over long distances, an ability closely related to the values of their Henry's law constants. Chemical changes in gaseous mixtures affect the fate of atmospheric pollutants as well as ecology, climate, and human health. Since the number of organic compounds present in the atmosphere is extremely large, it is desirable to develop models suitable for predictions across the large pool of organic molecules that may be present in the atmosphere. Here, we report the development of such a model for predicting the Henry's law constants of 29,439 compounds using the CORAL software (2023). The statistical quality of the model is characterized by a coefficient of determination of about 0.81, on average, for the training and validation sets.
ABSTRACT
Quantitative structure-property/activity relationships (QSPRs/QSARs) are a tool of modern theoretical and computational chemistry. The self-consistent model system is both a method to build a group of QSPR/QSAR models and an approach to checking the reliability of these models. Here, a group of models of pesticide toxicity toward Daphnia magna, built for different distributions into training and test subsets, is compared. This comparison is the basis for formulating the system of self-consistent models. The so-called index of ideality of correlation (IIC) has been used to improve the predictive potential of these models of pesticide toxicity. The predictive potential of the suggested models should be classified as high, since the average determination coefficient for the validation sets is 0.841 with a dispersion of 0.033 (over all five models). The best model (number 4) has an average determination coefficient of 0.89 for the external validation sets (over all five splits).
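One way to picture the comparison described here is a matrix in which each model is scored on the validation set of every split. The sketch below is an assumption-laden illustration, not the published procedure: models are represented as plain callables on a single descriptor value, the determination coefficient is taken as the squared Pearson correlation, and "dispersion" is taken as the sample standard deviation.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # (descriptor value, observed endpoint)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def consistency_matrix(
    models: List[Callable[[float], float]],
    validation_sets: List[List[Sample]],
) -> List[List[float]]:
    """matrix[i][j] is the R^2 of model i on the validation set of split j."""
    return [
        [r_squared([y for _, y in vs], [model(x) for x, _ in vs])
         for vs in validation_sets]
        for model in models
    ]


def summarize(matrix: List[List[float]]) -> Tuple[float, float, List[float]]:
    """Mean and dispersion of each model on its own split, plus row averages."""
    own = [matrix[i][i] for i in range(len(matrix))]
    per_model = [mean(row) for row in matrix]
    return mean(own), stdev(own), per_model
```

Here the row averages correspond to the "average over all splits" quoted for a single model, while the mean and dispersion of the diagonal summarize the whole group of models.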
Subject(s)
Daphnia, Pesticides, Animals, Reproducibility of Results, Software, Monte Carlo Method, Quantitative Structure-Activity Relationship, Pesticides/toxicity
ABSTRACT
Simplified molecular input-line entry system (SMILES) strings are a representation of molecular structure that can be used to establish quantitative structure-property/activity relationships (QSPRs/QSARs) for various endpoints expressed as mathematical functions of the molecular architecture. Quasi-SMILES extend traditional SMILES with additional symbols that reflect experimental conditions. Using quasi-SMILES makes it possible to build models of toxicity to tadpoles that take the time of exposure into account. Toxic effects for experimental situations expressed via 188 quasi-SMILES (the negative logarithm of the molar concentration lethal to 50% of tadpoles exposed for 12 h, 24 h, 48 h, 72 h, and 96 h) were modelled with good results (the average determination coefficient for the validation sets is about 0.97). In this way, we developed new models for this amphibian endpoint, which is poorly studied.
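The idea of extending a SMILES string with a symbol for the exposure time can be sketched as below. The bracketed code symbols are hypothetical and chosen only for readability; the abstract does not state which symbols the authors used.

```python
# Hypothetical codes for the five exposure times named in the abstract.
EXPOSURE_CODES = {12: "[12h]", 24: "[24h]", 48: "[48h]", 72: "[72h]", 96: "[96h]"}


def quasi_smiles(smiles: str, exposure_hours: int) -> str:
    """Append the exposure-time code to a SMILES string."""
    try:
        code = EXPOSURE_CODES[exposure_hours]
    except KeyError:
        raise ValueError(f"unsupported exposure time: {exposure_hours} h") from None
    return smiles + code


# One compound (ethanol, used purely as an example) yields one
# quasi-SMILES per exposure time, each paired with its own endpoint value.
records = [quasi_smiles("CCO", hours) for hours in sorted(EXPOSURE_CODES)]
```

Because each exposure time yields a separate quasi-SMILES, a single model can cover all five durations instead of requiring one model per duration.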
Subject(s)
Organic Chemicals, Quantitative Structure-Activity Relationship, Animals, Monte Carlo Method, Larva, Molecular Structure, Organic Chemicals/toxicity, Software
ABSTRACT
CONTEXT: Applying the quantitative "structure-endpoint" relationship approach requires reliable predictions, which are sometimes challenging to achieve. In this work, an attempt is made to achieve reliable forecasts by creating a set of random partitions of the data into training and validation sets and then constructing random models. To be useful, a system of random models should be self-consistent, giving similar or at least comparable statistical quality of predictions for models obtained using different splits of the available data into training and validation sets. METHOD: Computer experiments aimed at obtaining blood-brain barrier permeation models showed that, in principle, such an approach (Monte Carlo optimization of the correlation weights for different molecular features) can be used for this purpose, taking advantage of specific algorithms that optimize the modelling steps by applying new statistical criteria such as the index of ideality of correlation (IIC) and the correlation intensity index (CII). The results obtained are good and better than those reported previously. The suggested approach to model validation differs from traditional ways of checking models. The concept of validation can be applied to arbitrary models, not only to models of the blood-brain barrier.
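The first step described here, creating several random partitions of the data into training and validation sets, can be sketched as follows. The function name, the 75% training fraction, and the fixed seed are assumptions for illustration; the abstract does not give the proportions used.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def random_splits(
    items: Sequence[T],
    n_splits: int,
    train_fraction: float = 0.75,  # assumed proportion, not from the source
    seed: int = 0,
) -> List[Tuple[List[T], List[T]]]:
    """Return n_splits independent random (training, validation) partitions."""
    rng = random.Random(seed)  # seeded so the partitions are reproducible
    n_train = int(round(train_fraction * len(items)))
    splits = []
    for _ in range(n_splits):
        shuffled = list(items)
        rng.shuffle(shuffled)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits
```

A model would then be built on each training set and scored on each validation set; comparable scores across the splits are what the abstract calls self-consistency.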
Subject(s)
Blood-Brain Barrier, Organic Chemicals, Reproducibility of Results, Computer Simulation, Algorithms
ABSTRACT
Mutagenicity is one of the most dangerous properties from the point of view of medicine and ecology. Experimental determination of mutagenicity remains a costly process, which makes it attractive to identify new hazardous compounds from available experimental data through in silico methods such as quantitative structure-activity relationships (QSAR). A system for constructing groups of random models is proposed for comparing various molecular features extracted from SMILES and graphs. For models of mutagenicity (expressed as the logarithm of the number of revertants per nanomole, assayed by Salmonella typhimurium TA98-S9 microsomal preparation), the Morgan connectivity values are more informative than the comparison of quality for different rings in molecules. The resulting models were tested with the previously proposed model self-consistency system. The average determination coefficient for the validation set is 0.8737 ± 0.0312.
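For readers unfamiliar with Morgan connectivity, the standard iteration on a hydrogen-suppressed molecular graph can be sketched as below: order 0 is the vertex degree, and each higher order is the sum of the previous-order values of the neighbours. How these values are turned into model features is not described in the abstract, so the sketch covers only the graph invariant itself; the adjacency-list input format is an assumption.

```python
from typing import Dict, List


def morgan_connectivity(adjacency: Dict[int, List[int]], order: int) -> Dict[int, int]:
    """Morgan extended connectivity of the given order for every atom."""
    # Order 0: number of neighbours (vertex degree).
    values = {atom: len(neighbours) for atom, neighbours in adjacency.items()}
    # Each further order sums the previous-order values over the neighbours.
    for _ in range(order):
        values = {
            atom: sum(values[n] for n in neighbours)
            for atom, neighbours in adjacency.items()
        }
    return values


# Carbon skeleton of propane: atoms 0-1-2 in a chain.
propane = {0: [1], 1: [0, 2], 2: [1]}
ec0 = morgan_connectivity(propane, 0)  # {0: 1, 1: 2, 2: 1}
ec2 = morgan_connectivity(propane, 2)  # {0: 2, 1: 4, 2: 2}
```

Higher orders let each atom's value reflect a progressively wider neighbourhood, which is why such values can distinguish atoms that have the same degree.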