Results 1 - 20 of 258
1.
ACS Chem Neurosci ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38990780

ABSTRACT

Opioids are small-molecule agonists of the µ-opioid receptor (µOR), while reversal agents such as naloxone are antagonists of µOR. Here, we developed machine learning (ML) models to classify the intrinsic activities of ligands at the human µOR based on the SMILES strings and two-dimensional molecular descriptors. We first manually curated a database of 983 small molecules with measured Emax values at the human µOR. Analysis of the chemical space allowed identification of dominant scaffolds and structurally similar agonists and antagonists. Decision tree models and directed message passing neural networks (MPNNs) were then trained to classify agonistic and antagonistic ligands. The hold-out test AUCs (areas under the receiver operating characteristic curves) of the extra-tree (ET) and MPNN models are 91.5 ± 3.9% and 91.8 ± 4.4%, respectively. To overcome the challenge of a small data set, a student-teacher learning method called tritraining with disagreement was tested using an unlabeled data set comprising 15,816 ligands of human, mouse, and rat µOR, κOR, and δOR. We found that the tritraining scheme was able to increase the hold-out AUC of MPNN models to as high as 95.7%. Our work demonstrates the feasibility of developing ML models to accurately predict the intrinsic activities of µOR ligands, even with limited data. We envisage potential applications of these models in evaluating uncharacterized substances for public safety risks and discovering new therapeutic agents to counteract opioid overdoses.
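
The hold-out AUC quoted in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' code: the function name, the 0/1 label convention, and the toy data are assumptions made for illustration. It uses the standard identity that the ROC AUC equals the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    labels: 0/1 class labels (1 could stand for "agonist"; an assumed convention).
    scores: predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a data set of roughly a thousand molecules; a rank-based implementation would be the usual choice for larger sets.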

2.
Int J Biol Macromol ; 276(Pt 2): 133825, 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39002900

ABSTRACT

Predicting compound-induced inhibition of cardiac ion channels is crucial and challenging, significantly impacting cardiac drug efficacy and safety assessments. Despite the development of various computational methods for compound-induced inhibition prediction in cardiac ion channels, their performance remains limited. Most methods struggle to fuse multi-source data, relying solely on specific dataset training, leading to poor accuracy and generalization. We introduce MultiCBlo, a model that fuses multimodal information through a progressive learning approach, designed to predict compound-induced inhibition of cardiac ion channels with high accuracy. MultiCBlo employs progressive multimodal information fusion technology to integrate the compound's SMILES sequence, graph structure, and fingerprint, enhancing its representation. This is the first application of progressive multimodal learning for predicting compound-induced inhibition of cardiac ion channels, to our knowledge. The objective of this study was to predict the compound-induced inhibition of three major cardiac ion channels: hERG, Cav1.2, and Nav1.5. The results indicate that MultiCBlo significantly outperforms current models in predicting compound-induced inhibition of cardiac ion channels. We hope that MultiCBlo will facilitate cardiac drug development and reduce compound toxicity risks. Code and data are accessible at: https://github.com/taowang11/MultiCBlo. The online prediction platform is freely accessible at: https://huggingface.co/spaces/wtttt/PCICB.

3.
BMC Bioinformatics ; 25(1): 225, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38926641

ABSTRACT

PURPOSE: Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) from OpenAI and LLaMA (Large Language Model Meta AI) from Meta AI are increasingly recognized for their potential in the field of cheminformatics, particularly in understanding Simplified Molecular Input Line Entry System (SMILES), a standard method for representing chemical structures. These LLMs also have the ability to decode SMILES strings into vector representations. METHOD: We investigate the performance of GPT and LLaMA compared to pre-trained models on SMILES in embedding SMILES strings on downstream tasks, focusing on two key applications: molecular property prediction and drug-drug interaction prediction. RESULTS: We find that SMILES embeddings generated using LLaMA outperform those from GPT in both molecular property and DDI prediction tasks. Notably, LLaMA-based SMILES embeddings show results comparable to pre-trained models on SMILES in molecular prediction tasks and outperform the pre-trained models for the DDI prediction tasks. CONCLUSION: The performance of LLMs in generating SMILES embeddings shows great potential for further investigation of these models for molecular embedding. We hope our study bridges the gap between LLMs and molecular embedding, motivating additional research into the potential of LLMs in the molecular representation field. GitHub: https://github.com/sshaghayeghs/LLaMA-VS-GPT .


Subject(s)
Cheminformatics , Cheminformatics/methods , Drug Interactions , Molecular Structure
4.
J Comput Chem ; 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38850166

ABSTRACT

Here, TS-tools is presented, a Python package facilitating the automated localization of transition states (TS) based on a textual reaction SMILES input. TS searches can either be performed at xTB or DFT level of theory, with the former yielding guesses at marginal computational cost, and the latter directly yielding accurate structures at greater expense. On a benchmarking dataset of mono- and bimolecular reactions, TS-tools reaches an excellent success rate of 95% already at xTB level of theory. For tri- and multimolecular reaction pathways - which are typically not benchmarked when developing new automated TS search approaches, yet are relevant for various types of reactivity, cf. solvent- and autocatalysis and enzymatic reactivity - TS-tools retains its ability to identify TS geometries, though a DFT treatment becomes essential in many cases. Throughout the presented applications, a particular emphasis is placed on solvation-induced mechanistic changes, another issue that received limited attention in the automated TS search literature so far.

5.
Angew Chem Int Ed Engl ; : e202408154, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38887967

ABSTRACT

The radical Truce-Smiles rearrangement is a straightforward strategy for incorporating aryl groups into organic molecules, for which asymmetric processes remain rare. By employing a readily available and inexpensive chiral auxiliary, we developed a highly efficient asymmetric photocatalytic acyl and alkyl radical Truce-Smiles rearrangement of α-substituted acrylamides using tetrabutylammonium decatungstate (TBADT) as a hydrogen atom-transfer photocatalyst, along with aldehydes or C-H containing precursors. The rearranged products exhibited excellent diastereoselectivities (7:1 to >98:2 d.r.), and the chiral auxiliary was easily removed. Mechanistic studies clarified the transformation, and density functional theory (DFT) calculations provided insights into the stereochemistry-determining step.

6.
J Cheminform ; 16(1): 71, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898528

ABSTRACT

Obtaining desired molecular properties, alone or in combination, through theory or experiment is a costly process. Using machine learning to analyze molecular structure features and to predict molecular properties is a potentially efficient alternative for accelerating the prediction of molecular properties. In this study, we analyze molecular properties through the molecular structure from the perspective of machine learning. We use SMILES sequences as inputs to an artificial neural network in extracting molecular structural features and predicting molecular properties. A SMILES sequence comprises symbols representing molecular structures. To address the problem that a SMILES sequence is different from actual molecular structural data, we propose a pretraining model for a SMILES sequence based on the BERT model, which is widely used in natural language processing, such that the model learns to extract the molecular structural information contained in the SMILES sequence. In an experiment, we first pretrain the proposed model with 100,000 SMILES sequences and then use the pretrained model to predict molecular properties on 22 data sets and the odor characteristics of molecules (98 types of odor descriptor). The experimental results show that our proposed pretraining model effectively improves the performance of molecular property prediction. SCIENTIFIC CONTRIBUTION: The 2-encoder pretraining is proposed by focusing on the lower dependency of symbols on the contextual environment in a SMILES sequence than in a natural language sentence, and on the correspondence of one compound to multiple SMILES sequences. The model pretrained with the 2-encoder approach shows higher robustness in molecular property prediction tasks than BERT, which is adept at natural language.
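
Before a BERT-style model can consume a SMILES sequence, the string has to be split into symbols. The abstract does not describe the tokenizer used, so the following is only a minimal sketch under assumptions: the regular expression, the `tokenize` name, and the example molecules are illustrative choices, and the pattern covers common SMILES symbols rather than the full grammar.

```python
import re

# Assumed token pattern; order matters (longer alternatives come first).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atoms such as [NH3+] or [C@@H]
    r"|Br|Cl"                # two-letter atoms of the organic subset
    r"|%\d{2}"               # two-digit ring-closure labels such as %12
    r"|[A-Za-z]"             # one-letter atoms (lowercase means aromatic)
    r"|\d"                   # single-digit ring-closure labels
    r"|[=#$:/\\\-+().~*@]"   # bonds, branches, dots, and other symbols
)


def tokenize(smiles):
    """Split a SMILES string into symbols, rejecting unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


# "CCO" and "OCC" are two valid SMILES strings for the same compound (ethanol),
# which illustrates the one-compound-to-many-sequences point in the abstract.
ethanol_a = tokenize("CCO")
ethanol_b = tokenize("OCC")
alanine = tokenize("C[C@@H](N)C(=O)O")
```

The round-trip check in `tokenize` is a deliberate design choice: `findall` silently skips characters it cannot match, so comparing the joined tokens with the input is a cheap way to surface malformed strings instead of feeding them to a model.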

7.
Cureus ; 16(4): e57889, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38725786

ABSTRACT

In order to effectively address challenges related to anterior teeth restoration and achieve natural-looking results, considerations such as shape, size, gingival contour, and color are crucial. Due to an increasing desire for visually appealing alternatives that are free of metal, materials such as dental zirconia have become popular because of their superior aesthetics and mechanical characteristics. This case report presents clinical insights into anterior teeth rehabilitation with the use of layered zirconia fixed dental prostheses. It delves into the experiences associated with zirconia dental restorations on both endodontically treated and vital abutments, aiming to discern how various factors influence treatment outcomes. Beginning with the design of the restoration, its intricacies significantly impact its fit, strength, and overall durability. Moreover, the composition of zirconia used plays a pivotal role, as different formulations offer varying degrees of mechanical properties, influencing factors such as resilience and wear resistance. The shade selection is also scrutinized, as it directly affects the restoration's aesthetic integration with surrounding natural teeth, contributing to a more harmonious smile. Furthermore, the layering technique employed, particularly when additional porcelain or ceramic layers are applied, is essential for both cosmetic enhancement and structural integrity. Lastly, considerations of occlusion are paramount, ensuring proper alignment and contact between teeth to prevent premature wear and discomfort. By exploring these facets in zirconia restorations across different abutment types, this inquiry seeks to illuminate best practices for achieving favorable treatment outcomes in dental restoration procedures. The choice of zirconia composition, framework design, and shade must be carefully tailored to suit the characteristics of each individual abutment. 
This emphasizes the significance of adopting a tailored approach to tackle the distinct challenges posed by every clinical scenario. The manuscript provides detailed observations from a clinical case involving the restoration of anterior teeth utilizing monolithic zirconia-fixed dental prostheses. Through a combination of root canal treatment and composite buildup, successful restoration was achieved, with meticulous attention paid to aesthetic considerations. The utilization of computer-aided designing/computer-aided manufacturing (CAD/CAM) technology in crafting zirconia restorations ensured precise fit and superior biocompatibility, contributing to the overall success of the treatment. The study underscores the importance of personalized treatment strategies in achieving optimal outcomes in anterior teeth restoration, emphasizing the need for careful consideration of various factors such as design, composition, and shade selection. Overall, the findings shed light on the potential of zirconia-based restorations in addressing the unique challenges associated with anterior teeth rehabilitation, offering valuable insights for dental practitioners striving to deliver aesthetically pleasing and functionally sound outcomes for their patients.

8.
Toxicol Mech Methods ; : 1-6, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38572596

ABSTRACT

Models of toxicity to tadpoles have been developed as single parameters based on special descriptors which are sums of correlation weights, molecular features, and experimental conditions. This information is presented by quasi-SMILES. Fragments of local symmetry (FLS) are involved in the development of the model and the use of FLS correlation weights improves their predictive potential. In addition, the index of ideality correlation (IIC) and correlation intensity index (CII) are compared. These two potential predictive criteria were tested in models built through Monte Carlo optimization. The CII was more effective than IIC for the models considered here.

9.
Arch Toxicol ; 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38619593

ABSTRACT

Cytochrome P450 enzymes are a superfamily of enzymes responsible for the metabolism of a variety of medicines and xenobiotics. Among the Cytochrome P450 family, five isozymes (1A2, 2C9, 2C19, 2D6, and 3A4) are most important for the metabolism of xenobiotics. Inhibition of any of these five CYP isozymes causes drug-drug interactions with high pharmacological and toxicological effects, so predicting whether these isozymes are inhibited is of great importance. Many techniques based on machine learning and deep learning algorithms are currently being used to predict whether these isozymes will be inhibited or not. In this study, three different molecular or substructural feature sets, namely Morgan, MACCS and Morgan (combined), and RDKit, of the various molecules are used to train a distinct SVM model against each isozyme (1A2, 2C9, 2C19, 2D6, and 3A4). On the independent dataset, Morgan fingerprints provided the best results, while MACCS and Morgan (combined) achieved comparable results in terms of balanced accuracy (BA), sensitivity (Sn), and Matthews correlation coefficient (MCC). For the Morgan fingerprints, balanced accuracies (BA), Matthews correlation coefficients (MCC), and sensitivities (Sn) against each CYP isozyme (1A2, 2C9, 2C19, 2D6, and 3A4) on an independent dataset ranged between 0.81 and 0.85, 0.61 and 0.70, and 0.72 and 0.83, respectively. Similarly, on the independent dataset, MACCS and Morgan (combined) fingerprints achieved competitive results in terms of balanced accuracies (BA), Matthews correlation coefficients (MCC), and sensitivities (Sn) against each CYP isozyme (1A2, 2C9, 2C19, 2D6, and 3A4), which ranged between 0.79 and 0.85, 0.59 and 0.69, and 0.69 and 0.82, respectively.
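
The three metrics reported in this abstract (Sn, BA, MCC) all follow from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name and the counts are invented for illustration and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, balanced accuracy, and Matthews correlation coefficient."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sn = tp / (tp + fn)                  # sensitivity: recall on inhibitors
    sp = tn / (tn + fp)                  # specificity: recall on non-inhibitors
    ba = (sn + sp) / 2                   # balanced accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when a predicted class is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "BA": ba, "MCC": mcc}


# Hypothetical counts: 100 inhibitors and 100 non-inhibitors.
toy = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

Balanced accuracy and MCC are commonly preferred over plain accuracy when inhibitors and non-inhibitors are unevenly represented, because both take the errors on each class into account.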

10.
J Cheminform ; 16(1): 42, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622746

ABSTRACT

PURPOSE: Wiswesser Line Notation (WLN) is an old line notation for encoding chemical compounds for storage and processing by computers. Whilst the notation itself has long since been surpassed by SMILES and InChI, distribution of WLN during its active years was extensive. In the context of modernising chemical data, we present a comprehensive WLN parser developed using the OpenBabel toolkit, capable of translating WLN strings into various formats supported by the library. Furthermore, we have devised a specialised Finite State Machine, constructed from the rules of WLN, enabling the recognition and extraction of chemical strings out of large bodies of text. Available open-access WLN data with corresponding SMILES or InChI notation is rare; however, ChEMBL, ChemSpider and PubChem all contain WLN records, which were used for conversion scoring. Our investigation revealed a notable proportion of inaccuracies within the database entries, and we have taken steps to rectify these errors whenever feasible. SCIENTIFIC CONTRIBUTION: Tools for both the extraction and conversion of WLN from chemical documents have been successfully developed. Both the Deterministic Finite Automaton (DFA) and parser handle the majority of WLN rules officially endorsed in the three major WLN manuals, with the parser showing a clear jump in accuracy and chemical coverage over previous submissions. The GitHub repository can be found here: https://github.com/Mblakey/wiswesser .

11.
Sci Rep ; 14(1): 9262, 2024 04 22.
Article in English | MEDLINE | ID: mdl-38649402

ABSTRACT

Hepatitis B and C viruses (HBV and HCV) are significant causes of chronic liver diseases, with approximately 350 million infections globally. To accelerate the finding of effective treatment options, we introduce HBCVTr, a novel ligand-based drug design (LBDD) method for predicting the inhibitory activity of small molecules against HBV and HCV. HBCVTr employs a hybrid model consisting of double encoders of transformers and a deep neural network to learn the relationship between small molecules' simplified molecular-input line-entry system (SMILES) and their antiviral activity against HBV or HCV. The prediction accuracy of HBCVTr has surpassed baseline machine learning models and existing methods, with R-squared values of 0.641 and 0.721 for the HBV and HCV test sets, respectively. The trained models were successfully applied to virtual screening against 10 million compounds within 240 h, leading to the discovery of the top novel inhibitor candidates, including IJN04 for HBV and IJN12 and IJN19 for HCV. Molecular docking and dynamics simulations identified IJN04, IJN12, and IJN19 target proteins as the HBV core antigen, HCV NS5B RNA-dependent RNA polymerase, and HCV NS3/4A serine protease, respectively. Overall, HBCVTr offers a new and rapid drug discovery and development screening method targeting HBV and HCV.


Subject(s)
Antiviral Agents , Hepacivirus , Hepatitis B virus , Molecular Docking Simulation , Neural Networks, Computer , Antiviral Agents/pharmacology , Antiviral Agents/chemistry , Hepatitis B virus/drug effects , Hepacivirus/drug effects , Humans , Drug Design , Viral Nonstructural Proteins/metabolism , Viral Nonstructural Proteins/antagonists & inhibitors , Hepatitis B/virology , Hepatitis B/drug therapy , Ligands , Molecular Dynamics Simulation , Hepatitis C/drug therapy , Hepatitis C/virology
12.
ACS Appl Mater Interfaces ; 16(13): 16853-16860, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38501934

ABSTRACT

In this work, we designed a multimodal transformer that combines both the Simplified Molecular Input Line Entry System (SMILES) and molecular graph representations to enhance the prediction of polymer properties. Three models with different embeddings (SMILES, SMILES + monomer, and SMILES + dimer) were employed to assess the performance of incorporating multimodal features into transformer architectures. Fine-tuning results across five properties (i.e., density, glass-transition temperature (Tg), melting temperature (Tm), volume resistivity, and conductivity) demonstrated that the multimodal transformer with both the SMILES and the dimer configuration as inputs outperformed the transformer using only SMILES across all five properties. Furthermore, our model facilitates in-depth analysis by examining attention scores, providing deeper insights into the relationship between the deep learning model and the polymer attributes. We believe that our work, shedding light on the potential of multimodal transformers in predicting polymer properties, paves a new direction for understanding and refining polymer properties.

13.
Artif Intell Med ; 150: 102820, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38553160

ABSTRACT

Due to the constant increase in cancer rates, the disease has become a leading cause of death worldwide, increasing the need for its detection and treatment. In the era of personalized medicine, the main goal is to incorporate individual variability in order to choose more precisely which therapy and prevention strategies suit each person. However, predicting the sensitivity of tumors to anticancer treatments remains a challenge. In this work, we propose two deep neural network models to predict the impact of anticancer drugs in tumors through the half-maximal inhibitory concentration (IC50). These models join biological and chemical data to apprehend relevant features of the genetic profile and the drug compounds, respectively. In order to predict the drug response in cancer cell lines, this study employed different DL methods, resorting to Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). In the first stage, two autoencoders were pre-trained with high-dimensional gene expression and mutation data of tumors. Afterward, this genetic background is transferred to the prediction models that return the IC50 value that portrays the potency of a substance in inhibiting a cancer cell line. When comparing RSEM expected counts and TPM as methods for displaying gene expression data, RSEM has been shown to perform better in deep models, and CNN models can obtain better insight into these types of data. Moreover, the obtained results reflect the effectiveness of the extracted deep representations in the prediction of the IC50 value that portrays the potency of a substance in inhibiting a tumor, achieving a mean squared error of 1.06 and surpassing previous state-of-the-art models.


Subject(s)
Genetic Profile , Neoplasms , Humans , Neural Networks, Computer , Neoplasms/drug therapy , Neoplasms/genetics , Cell Line , Genomics
14.
Angew Chem Int Ed Engl ; 63(17): e202319158, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38506603

ABSTRACT

An efficient asymmetric remote arylation of C(sp3)-H bonds under photoredox conditions is described here. The reaction features the addition of radicals to a double bond followed by a site-selective radical translocation (1,n-hydrogen atom transfer) as well as a stereocontrolled aryl migration via sulfinyl-Smiles rearrangement, furnishing a wide range of chiral α-arylated amides with up to >99 : 1 er. Mechanistic studies indicate that the sulfinamide group governs the stereochemistry of the product, with the aryl migration being the rate-determining step preceded by a kinetically favored 1,n-HAT process.

15.
Pharm Res ; 41(3): 493-500, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38337105

ABSTRACT

PURPOSE: To ensure that drug administration is safe during pregnancy, it is crucial to be able to predict the placental permeability of drugs in humans. The most widely used experimental method for this purpose is in vitro human placental perfusion, though the approach is highly expensive and time consuming. Quantitative structure-activity relationship (QSAR) modeling represents a powerful tool for the assessment of drug placental transfer and can be successfully employed as an alternative to in vitro experiments. METHODS: The conformation-independent QSAR models covered in the present study were developed using SMILES notation descriptors and local molecular graph invariants. The Monte Carlo optimization method was used as the model developer on the training and test sets, with three independent molecular splits. RESULTS: A range of different statistical parameters was used to validate the developed QSAR model, including the standard error of estimation, mean absolute error, root-mean-square error (RMSE), correlation coefficient, cross-validated correlation coefficient, Fisher ratio, MAE-based metrics and the correlation ideality index. These statistical measures demonstrated the excellent predictive potential and robustness of the developed QSAR model. In addition, the molecular fragments derived from the SMILES notation descriptors that account for the decrease or increase in the investigated activity were revealed. CONCLUSION: The presented QSAR modeling can be an invaluable tool for the high-throughput screening of the placental permeability of drugs.


Subject(s)
Placenta , Quantitative Structure-Activity Relationship , Female , Pregnancy , Humans , Models, Molecular , Monte Carlo Method , Permeability
16.
Cortex ; 173: 150-160, 2024 04.
Article in English | MEDLINE | ID: mdl-38402659

ABSTRACT

Autistic adults struggle to reliably differentiate genuine and posed smiles. Intergroup bias is a promising factor that may modulate smile discrimination performance, which has been shown in neurotypical adults, and which could highlight ways to make social interactions easier. However, it is not clear whether this bias also exists in autistic people. Thus, the current study aimed to investigate this in autism using a minimal group paradigm. Seventy-five autistic and sixty-one non-autistic adults viewed videos of people making genuine or posed smiles and were informed (falsely) that some of the actors were from an in-group and others were from an out-group. The ability to identify smile authenticity of in-group and out-group members and group identification were assessed. Our results revealed that both groups seemed equally susceptible to ingroup favouritism, rating ingroup members as more genuine, but autistic adults also generally rated smiles as less genuine and were less likely to identify with ingroup members. Autistic adults showed reduced sensitivity to the different smile types but the absence of an intergroup bias in smile discrimination in both groups seems to indicate that membership can only modulate social judgements but not social abilities. These findings suggest a reconsideration of past findings that might have misrepresented the social judgements of autistic people through introducing an outgroup disadvantage, but also a need for tailored support for autistic social differences that emphasizes similarity and inclusion between diverse people.


Subject(s)
Autistic Disorder , Adult , Humans , Social Skills , Social Perception , Smiling , Group Processes
17.
Chemosphere ; 350: 141086, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38163464

ABSTRACT

The rising demand from the consumer goods and pharmaceutical industries is driving a fast expansion of newly developed chemicals. The conventional toxicity testing of unknown chemicals is expensive, time-consuming, and raises ethical concerns. The quantitative structure-property relationship (QSPR) is an efficient computational method because it saves time, resources, and animal experimentation. Advances in machine learning have improved chemical analysis in QSPR studies, but the real-world application of machine learning-based QSPR studies has been limited by the unexplainable 'black box' nature of machine learning models. In this study, a multi-encoder structure-to-toxicity (S2T)-transformer based QSPR model was developed to estimate the properties of polychlorinated biphenyls (PCBs) and endocrine disrupting chemicals (EDCs). Simplified molecular input line entry system (SMILES) strings and molecular descriptors calculated by the Dragon 6 software were simultaneously considered as inputs to the QSPR model. Furthermore, an attention-based framework is proposed to describe the relationship between the molecular structure and toxicity of hazardous chemicals. The S2T-transformer model achieved the highest R2 scores of 0.918, 0.856, and 0.907 for logarithm of octanol-water partition coefficient (Log KOW), octanol-air partition coefficient (Log KOA), and bioconcentration factor (Log BCF) estimation of PCBs, respectively. Moreover, the attention weights were able to properly interpret the lateral (meta, para) chlorination associated with PCBs toxicity and environmental impact.


Subject(s)
Polychlorinated Biphenyls , Animals , Polychlorinated Biphenyls/analysis , Octanols/chemistry , Water/chemistry , Software , Quantitative Structure-Activity Relationship , Environment
18.
Comput Biol Med ; 169: 107880, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38211383

ABSTRACT

It is challenging to model the toxicity of nitroaromatic compounds due to limited experimental data. Nitrobenzene derivatives are commonly used in industry and can lead to environmental contamination. Extensive research, including several QSPR studies, has been conducted to understand their toxicity. Predictive QSPR models can help improve chemical safety, but their limitations must be considered, and the molecular factors affecting toxicity should be carefully investigated. The latest QSPR methods, molecular modeling techniques, machine learning algorithms, and computational chemistry tools are essential for developing accurate and robust models. In this work, we used these methods to study a series of fifty compounds derived from nitrobenzene. The Monte Carlo approach was used for QSPR modeling by applying the SMILES molecular structure representation and optimal molecular descriptors. The correlation ideality index (CII) and correlation contradiction index (CCI) were further introduced as validation parameters to estimate the developed models' predictive ability. The statistical quality of the CII models was better than those without CII. The best QSPR model with the following statistical parameters (Split-3): (R2 = 0.968, CCC = 0.984, IIC = 0.861, CII = 0.979, Q2 = 0.954, QF12 = 0.946, QF22 = 0.938, QF32 = 0.947, Rm2 = 0.878, RMSE = 0.187, MAE = 0.151, FTraining = 390, FInvisible = 218, FCalibration = 240, RTest2 = 0.905) was selected to generate the studied promoters with increasing and decreasing activity.
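
Several of the statistics listed in this abstract (R2, RMSE, MAE) can be computed directly from observed and predicted values. The sketch below is illustrative only: the function name and the toy values are assumptions, and because the abstract does not say which definition of R2 it reports, the coefficient of determination is used here.

```python
import math


def regression_metrics(y_true, y_pred):
    """Coefficient of determination, RMSE, and MAE for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Toy observed and predicted values, invented for illustration only.
toy_metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

The remaining indices in the abstract (IIC, CII, CCC, the QF variants, and Rm2) have their own definitions and are not covered by this sketch.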


Subject(s)
Tetrahymena pyriformis , Models, Molecular , Nitrobenzenes , Monte Carlo Method , Quantitative Structure-Activity Relationship
19.
BMC Bioinformatics ; 25(1): 47, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38291362

ABSTRACT

Drug-drug interactions (DDI) are a critical concern in healthcare due to their potential to cause adverse effects and compromise patient safety. Supervised machine learning models for DDI prediction need to be optimized to learn abstract, transferable features, and generalize to larger chemical spaces, primarily due to the scarcity of high-quality labeled DDI data. Inspired by recent advances in computer vision, we present SMR-DDI, a self-supervised framework that leverages contrastive learning to embed drugs into a scaffold-based feature space. Molecular scaffolds represent the core structural motifs that drive pharmacological activities, making them valuable for learning informative representations. Specifically, we pre-trained SMR-DDI on a large-scale unlabeled molecular dataset. We generated augmented views for each molecule via SMILES enumeration and optimized the embedding process through contrastive loss minimization between views. This enables the model to capture relevant and robust molecular features while reducing noise. We then transfer the learned representations for the downstream prediction of DDI. Experiments show that the new feature space has comparable expressivity to state-of-the-art molecular representations and achieved competitive DDI prediction results while training on less data. Additional investigations also revealed that pre-training on more extensive and diverse unlabeled molecular datasets improved the model's capability to embed molecules more effectively. Our results highlight contrastive learning as a promising approach for DDI prediction that can identify potentially hazardous drug combinations using only structural information.
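
The idea of minimizing a contrastive loss between two views of the same molecule can be sketched without any deep learning library. The abstract does not state which loss SMR-DDI uses, so the InfoNCE-style form below, the temperature value, the function names, and the two-dimensional toy embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor.

    anchor, positive: embeddings of two views of the same molecule, for example
    two different SMILES strings such as "CCO" and "OCC" for ethanol.
    negatives: embeddings of other molecules.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-D embeddings: the loss is small when the two views agree
# and large when the anchor is closer to another molecule.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In training, a loss of this kind would be averaged over a batch and minimized with respect to the encoder parameters, pulling views of the same molecule together and pushing different molecules apart.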


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Interactions , Supervised Machine Learning
20.
J Mol Graph Model ; 128: 108703, 2024 05.
Article in English | MEDLINE | ID: mdl-38228013

ABSTRACT

Molecular property prediction plays an essential role in drug discovery for identifying the candidate molecules with target properties. Deep learning models usually require sufficient labeled data to train good prediction models. However, the size of labeled data is usually small for molecular property prediction, which brings great challenges to deep learning-based molecular property prediction methods. Furthermore, the global information of molecules is critical for predicting molecular properties. Therefore, we propose INTransformer for molecular property prediction, which is a data augmentation method via contrastive learning to alleviate the limitations of the labeled molecular data while enhancing the ability to capture global information. Specifically, INTransformer consists of two identical Transformer sub-encoders to extract the molecular representation from the original SMILES and noisy SMILES respectively, while achieving the goal of data augmentation. To reduce the influence of noise, we use contrastive learning to ensure the molecular encoding of noisy SMILES is consistent with that of the original input so that the molecular representation information can be better extracted by INTransformer. Experiments on various benchmark datasets show that INTransformer achieved competitive performance for molecular property prediction tasks compared with the baselines and state-of-the-art methods.


Subject(s)
Drug Discovery , Electric Power Supplies , Databases, Factual