ABSTRACT
Transdermal drug administration has gained popularity because it is convenient and bypasses first-pass metabolism. Accurate prediction of skin permeability is crucial for successful transdermal drug delivery (TDD), and this study addresses that need. A dataset comprising 441 records for 140 molecules with diverse LogKp values was characterized, and descriptor calculation yielded 145 relevant descriptors. Machine learning models, including MLR, RF, XGBoost, CatBoost, LGBM, and ANN, were employed for regression analysis. Notably, the LGBM, XGBoost, and gradient boosting models outperformed the others in predictive accuracy. Key descriptors influencing skin permeability, such as hydrophobicity, hydrogen bond donors, hydrogen bond acceptors, and topological polar surface area, were identified and visualized. Cluster analysis applied to a dataset of FDA-approved drugs (2326 compounds) revealed four distinct clusters with significant differences in molecular characteristics. Predicted LogKp values for these clusters offered insight into the permeability variations among FDA-approved drugs. Furthermore, an investigation of skin permeability patterns across 83 classes of FDA-approved drugs, grouped by ATC code, revealed significant differences, providing valuable information for drug development strategies. The study underscores the importance of accurate skin permeability prediction for TDD and highlights the superior performance of nonlinear machine learning models. The identified key descriptors and clusters contribute to a nuanced understanding of permeability characteristics among FDA-approved drugs. These findings offer actionable insights for drug design, formulation, and prioritization of molecules with optimal properties, potentially reducing reliance on costly experimental testing.
Future research could build on these results, which offer promising applications in pharmaceutical research and formulation within the burgeoning field of computer-aided drug design.
ABSTRACT
The primary goal of this article is to infer genetic interactions from gene expression data. A new multitask-learning method for multiorganism Bayesian gene network estimation is presented. When the input datasets are sparse, as is the case with microarray gene expression data, it becomes difficult to separate random correlations from the true correlations that correspond to actual edges when gene interactions are modeled as a Bayesian network. Multitask learning takes advantage of the similarity between related tasks to construct a more accurate model of the underlying relationships represented by the Bayesian networks. The proposed method is first tested on synthetic data to illustrate its validity. It is then applied iteratively to real gene expression data to learn the genetic regulatory networks of two organisms with homologous genes.