Results 1 - 20 of 259
1.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36681902

ABSTRACT

Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, these TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting the same query molecules. This can be confusing for practitioners in practical applications. Therefore, this study systematically evaluated nine popular ligand-based TF methods based on target and ligand-target pair statistical strategies, which will help practitioners make choices among multiple TF methods. The evaluation results showed that SwissTargetPrediction was the best method to produce the most reliable predictions while enriching more targets. High-recall similarity ensemble approach (SEA) was able to find real targets for more compounds compared with other TF methods. Therefore, SwissTargetPrediction and SEA can be considered as primary selection methods in future studies. In addition, the results showed that k = 5 was the optimal number of experimental candidate targets. Finally, a novel ensemble TF method based on consensus voting is proposed to improve the prediction performance. The precision of the ensemble TF method outperforms the individual TF method, indicating that the ensemble TF method can more effectively identify real targets within a given top-k threshold. The results of this study can be used as a reference to guide practitioners in selecting the most effective methods in computational drug discovery.
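
The abstract does not spell out the voting scheme, so the following is only a minimal sketch of one plausible form of consensus voting: each method votes for the targets in its own top-k list, and targets are ranked by vote count. The function name, the tie-breaking rule, and the method and target names are all assumptions made for illustration.

```python
from collections import Counter


def consensus_top_k(rankings, k=5):
    """Rank targets by how many methods place them in their own top-k list."""
    votes = Counter()
    best_rank = {}
    for ranked_targets in rankings.values():
        for rank, target in enumerate(ranked_targets[:k]):
            votes[target] += 1
            best_rank[target] = min(rank, best_rank.get(target, rank))
    # More votes first; ties broken by best individual rank, then by name.
    ordered = sorted(votes, key=lambda t: (-votes[t], best_rank[t], t))
    return ordered[:k]


# Hypothetical ranked predictions from three methods (names are made up).
rankings = {
    "method_a": ["T1", "T2", "T3"],
    "method_b": ["T2", "T4", "T1"],
    "method_c": ["T2", "T5", "T6"],
}
consensus = consensus_top_k(rankings, k=3)
print(consensus)
```

The default of k = 5 mirrors the number of candidate targets the abstract reports as optimal; any other cutoff can be passed explicitly.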


Subject(s)
Algorithms; Ligands
2.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642412

ABSTRACT

Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potential superior screening performance. However, the information of negative data used to construct MLSFs was rarely reported in the literature, and meanwhile the putative inactive molecules recorded in existing databases usually have obvious bias from active molecules. Here we proposed an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with MLSF, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of the four control models. The results illustrate that the number of active molecules in the top 1000 molecules identified by AMLSF was significantly higher than those identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by the AMLSF, null model and control models based on DUD-E also proved that more active molecules can be identified, and the false positive rate can be reduced by AMLSF.


Subject(s)
Proteins; Proteins/metabolism; Databases, Factual; Ligands; Molecular Docking Simulation; Protein Binding
3.
Mod Pathol ; 37(4): 100451, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38369190

ABSTRACT

MET amplification (METamp) represents a promising therapeutic target in non-small cell lung cancer, but no consensus has been established to identify METamp-dependent tumors that could potentially benefit from MET inhibitors. In this study, an analysis of MET amplification/overexpression status was performed in a retrospectively recruited cohort comprising 231 patients with non-small cell lung cancer from Shanghai Chest Hospital (SCH cohort) using 3 methods: fluorescence in situ hybridization (FISH), hybrid capture-based next-generation sequencing, and immunohistochemistry for c-MET and phospho-MET. The SCH cohort included 130 cases known to be METamp positive by FISH and 101 negative controls. The clinical relevance of these approaches in predicting the efficacy of MET inhibitors was evaluated. Additionally, next-generation sequencing data from another 2 cohorts including 22,010 lung cancer cases were utilized to examine the biological characteristics of different METamp subtypes. Of the 231 cases, 145 showed MET amplification/overexpression using at least 1 method, whereas only half of them could be identified by all 3 methods. METamp can occur as focal amplification or polysomy. Our study revealed that the inconsistency between next-generation sequencing and FISH primarily occurred in the polysomy subtype. Further investigations indicated that compared with polysomy, focal amplification correlated with fewer co-occurring driver mutations, higher protein expressions of c-MET and phospho-MET, and higher incidence in acquired resistance than in de novo setting. Moreover, patients with focal amplification presented a more robust response to MET inhibitors compared with those with polysomy. Notably, a strong correlation was observed between focal amplification and programmed cell death ligand-1 expression, indicating potential therapeutic implications with combined MET inhibitor and immunotherapy for patients with both alterations. 
Our findings provide insights into the molecular complexity and clinical relevance of METamp in lung cancer, highlighting the role of MET focal amplification as an oncogenic driver and its feasibility as a primary biomarker to further investigate the clinical activity of MET inhibitors in future studies.


Subject(s)
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Humans; Lung Neoplasms/pathology; Carcinoma, Non-Small-Cell Lung/genetics; Retrospective Studies; In Situ Hybridization, Fluorescence; Mutation; China; Proto-Oncogene Proteins c-met/genetics; Proto-Oncogene Proteins c-met/metabolism; Chromosome Aberrations; Gene Amplification
4.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35212357

ABSTRACT

Structural information for chemical compounds is often described by pictorial images in most scientific documents, which cannot be easily understood and manipulated by computers. This dilemma makes optical chemical structure recognition (OCSR) an essential tool for automatically mining knowledge from an enormous amount of literature. However, existing OCSR methods fall far short of our expectations for realistic requirements due to their poor recovery accuracy. In this paper, we developed a deep neural network model named ABC-Net (Atom and Bond Center Network) to predict graph structures directly. Based on the divide-and-conquer principle, we propose to model an atom or a bond as a single point in the center. In this way, we can leverage a fully convolutional neural network (CNN) to generate a series of heat-maps to identify these points and predict relevant properties, such as atom types, atom charges, bond types and other properties. Thus, the molecular structure can be recovered by assembling the detected atoms and bonds. Our approach integrates all the detection and property prediction tasks into a single fully CNN, which is scalable and capable of processing molecular images quite efficiently. Experimental results demonstrate that our method could achieve a significant improvement in recognition performance compared with publicly available tools. The proposed method could be considered as a promising solution to OCSR problems and a starting point for the acquisition of molecular information in the literature.
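
To make the "atom or bond as a single center point" idea concrete, here is a minimal sketch of how center points might be read off a predicted heat-map by keeping cells that exceed a score threshold and are local maxima. It is not the ABC-Net implementation; the function name, the 8-neighbour rule, the threshold, and the toy heat-map are assumptions for illustration.

```python
def find_peaks(heatmap, threshold=0.5):
    """Return (row, col) cells above threshold that beat all 8 neighbours."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Toy 4x4 heat-map with two well-separated maxima (values are made up).
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.8],
    [0.0, 0.0, 0.3, 0.2],
]
centers = find_peaks(heatmap)
print(centers)
```

In a full pipeline, each detected center would then be paired with the property predictions (atom type, charge, bond type) at the same location before the graph is assembled.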


Subject(s)
Deep Learning; Molecular Structure; Neural Networks, Computer
5.
Acta Pharmacol Sin ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38750073

ABSTRACT

Prostate cancer (PCa) is the second most prevalent malignancy among men worldwide. The aberrant activation of androgen receptor (AR) signaling has been recognized as a crucial oncogenic driver for PCa, and AR antagonists are widely used in PCa therapy. To develop novel AR antagonists, a machine-learning MIEC-SVM model was established for virtual screening, and 51 candidates were selected and submitted for bioactivity evaluation. To our surprise, a new-scaffold AR antagonist, C2, with bioactivity comparable to that of Enz was identified in the initial round of screening. C2 showed pronounced inhibition of the transcriptional function (IC50 = 0.63 µM) and nuclear translocation of AR, and significant antiproliferative and antimetastatic activity in the PCa cell line LNCaP. In addition, C2 exhibited a stronger ability than Enz to block the cell cycle of LNCaP at a lower dose, as well as superior AR specificity. Our study highlights the success of MIEC-SVM in discovering AR antagonists, and compound C2 presents a promising new scaffold for the development of AR-targeted therapeutics.

6.
Sensors (Basel) ; 24(6)2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38544071

ABSTRACT

The micro-deformation monitoring radar is usually based on Permanent Scatterer (PS) technology to realize deformation inversion. When the region is continuously monitored for a long time, the radar image amplitude and pixel variance will change significantly with time. Therefore, it is difficult to select phase-stable scatterers by conventional amplitude deviation methods, as they can seriously affect the accuracy of deformation inversion. For different regions studied within the same scenario, using a PS selection method based on the same threshold often increases the size of the deformation error. Therefore, this paper proposes a new PS selection method based on the Gaussian Mixture Model (GMM). Firstly, PS candidates (PSCs) are selected based on the pixels' amplitude information. Then, the amplitude deviation index of each PSC is calculated, and each pixel's probability values in different Gaussian distributions are acquired through iterations. Subsequently, the cluster types of pixels with larger probability values are designated as low-amplitude deviation pixels. Finally, the coherence coefficient and phase stability of low-amplitude deviation pixels are calculated. By comparing the probability values of each of the pixels in different Gaussian distributions, the cluster type with the larger probability, such as high-coherence pixels and high-phase stability pixels, is selected and designated as the final PS. Our analysis of the measured data revealed that the proposed method not only increased the number of PSs in the group, but also improved the stability of the number of PSs between groups.
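
The abstract does not define its amplitude deviation index, so the sketch below assumes the commonly used form, the standard deviation of a pixel's amplitude time series divided by its mean. It covers only that per-pixel statistic; the Gaussian mixture clustering step is not shown. The function name and the sample series are hypothetical.

```python
from statistics import mean, pstdev


def amplitude_deviation_index(amplitudes):
    """Assumed definition: population standard deviation over mean amplitude."""
    mu = mean(amplitudes)
    if mu == 0:
        raise ValueError("mean amplitude is zero; index is undefined")
    return pstdev(amplitudes) / mu


# Made-up amplitude time series for two pixels.
series = {
    "stable_pixel": [10.0, 10.0, 10.0, 10.0],
    "noisy_pixel": [2.0, 6.0, 2.0, 6.0],
}
indices = {name: amplitude_deviation_index(a) for name, a in series.items()}
print(indices)
```

A lower index indicates a more amplitude-stable pixel. The method in the abstract clusters these values with a mixture model rather than applying one fixed threshold to every region.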

7.
Medicina (Kaunas) ; 60(5)2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38792893

ABSTRACT

Background and Objectives: The risks of uveitis development among pediatric patients with Down syndrome (DS) remain unclear. Therefore, we aimed to determine the risk of uveitis following a diagnosis of DS. Materials and Methods: This multi-institutional retrospective cohort study utilized the TriNetX database to identify individuals aged 18 years and younger with and without a diagnosis of DS between 1 January 2000 and 31 December 2023. The non-DS cohort consisted of randomly selected control patients matched by selected variables. This included gender, age, ethnicity, and certain comorbidities. The main outcome is the incidence of new-onset uveitis. Statistical analysis of the uveitis risk was reported using hazard ratios (HRs) and 95% confidence intervals (CIs). Separate analyses of the uveitis risk among DS patients based on age groups and gender were also performed. Results: A total of 53,993 individuals with DS (46.83% female, 58.26% white, mean age at index 5.21 ± 5.76 years) and 53,993 non-DS individuals (45.56% female, 58.28% white, mean age at index 5.21 ± 5.76 years) were recruited from the TriNetX database. Our analysis also showed no overall increased risk of uveitis among DS patients (HR: 1.33 [CI: 0.89-1.99]) compared to the non-DS cohort across the 23-year study period. Subgroup analyses based on different age groups showed that those aged 0-1 year (HR: 1.36 [CI: 0.68-2.72]), 0-5 years (HR: 1.34 [CI: 0.75-2.39]), and 6-18 years (HR: 1.15 [CI: 0.67-1.96]) were found to have no association with uveitis risk compared to their respective non-DS comparators. There was also no increased risk of uveitis among females (HR: 1.49 [CI: 0.87-2.56]) or males (HR: 0.82 [CI: 0.48-1.41]) with DS compared to their respective non-DS comparators. Conclusions: Our study found no overall increased risk of uveitis following a diagnosis of DS compared to a matched control population.
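
The "no increased risk" reading follows from every reported 95% confidence interval containing the null hazard ratio of 1. The short check below makes that step explicit using the values quoted in the abstract; the helper name and the dictionary keys are assumptions for illustration.

```python
def excludes_null(ci_low, ci_high, null_value=1.0):
    """True when the interval lies entirely above or below the null value."""
    return ci_low > null_value or ci_high < null_value


# (hazard ratio, CI lower bound, CI upper bound) as quoted in the abstract.
reported = {
    "overall": (1.33, 0.89, 1.99),
    "age 0-1": (1.36, 0.68, 2.72),
    "age 0-5": (1.34, 0.75, 2.39),
    "age 6-18": (1.15, 0.67, 1.96),
    "female": (1.49, 0.87, 2.56),
    "male": (0.82, 0.48, 1.41),
}
significant = {
    group: excludes_null(low, high) for group, (_, low, high) in reported.items()
}
print(significant)
```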


Subject(s)
Down Syndrome; Uveitis; Humans; Down Syndrome/complications; Male; Female; Uveitis/epidemiology; Uveitis/diagnosis; Uveitis/etiology; Child; Retrospective Studies; Child, Preschool; Adolescent; Infant; Databases, Factual; Incidence; Cohort Studies; Risk Factors; Risk Assessment/methods; Risk Assessment/statistics & numerical data
8.
Angew Chem Int Ed Engl ; 63(10): e202318372, 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38205971

ABSTRACT

The site-specific activation of bioorthogonal prodrugs has provided great opportunities for reducing the severe side effects of chemotherapy. However, the precise control of activation location, sustained drug production at the target site, and high bioorthogonal reaction efficiency in vivo remain great challenges. Here, we propose the construction of tumor cell membrane reactors in vivo to solve the above problems. Specifically, tumor-targeted liposomes with efficient membrane fusion capabilities are generated to install the bioorthogonal trigger, the amphiphilic tetrazine derivative, on the surface of tumor cells. These predecorated tumor cells act as many living reactors, transforming the tumor into a "drug factory" that in situ activates an externally delivered bioorthogonal prodrug, for example intratumorally injected transcyclooctene-caged doxorubicin. In contrast to the rapid elimination of cargo that is encapsulated and delivered by liposomes, these reactors permit stable retention of bioorthogonal triggers in tumor for 96 h after a single dose of liposomes via intravenous injection, allowing sustained generation of doxorubicin. Interestingly, an additional supplement of liposomes will compensate for the trigger consumed by the reaction and significantly improve the efficiency of the local reaction. This strategy provides a solution to the efficacy versus safety dilemma of tumor chemotherapy.


Subject(s)
Heterocyclic Compounds; Neoplasms; Prodrugs; Humans; Prodrugs/therapeutic use; Liposomes; Neoplasms/drug therapy; Neoplasms/pathology; Doxorubicin/therapeutic use
9.
J Biol Chem ; 298(9): 102372, 2022 09.
Article in English | MEDLINE | ID: mdl-35970391

ABSTRACT

Nitrogen (N2) gas in the atmosphere is partially replenished by microbial denitrification of ammonia. A recent study has shown that Alcaligenes ammonioxydans oxidizes ammonia to dinitrogen via a process featuring the intermediate hydroxylamine, termed "Dirammox" (direct ammonia oxidation). However, the unique biochemistry of this process remains unknown. Here, we report an enzyme involved in Dirammox that catalyzes the conversion of hydroxylamine to N2. We tested previously annotated proteins involved in redox reactions, DnfA, DnfB, and DnfC, to determine their ability to catalyze the oxidation of ammonia or hydroxylamine. Our results showed that none of these proteins bound to ammonia or catalyzed its oxidation; however, we did find that DnfA bound to hydroxylamine. Further experiments demonstrated that, in the presence of NADH and FAD, DnfA catalyzed the conversion of 15N-labeled hydroxylamine to 15N2. This conversion did not happen under oxygen (O2)-free conditions. Thus, we concluded that DnfA encodes a hydroxylamine oxidase. We demonstrate that DnfA is not homologous to any known hydroxylamine oxidoreductases and contains a diiron center, which was shown to be involved in catalysis via electron paramagnetic resonance experiments. Furthermore, enzyme kinetics of DnfA were assayed, revealing a Km of 92.9 ± 3.0 µM for hydroxylamine and a kcat of 0.028 ± 0.001 s⁻¹. Finally, we show that DnfA was localized in the cytoplasm and periplasm as well as in tubular membrane invaginations in HO-1 cells. To the best of our knowledge, DnfA is the first enzyme discovered that catalyzes oxidation of hydroxylamine to N2.
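
To show what the reported kinetic constants imply, the sketch below evaluates the per-enzyme turnover rate at a given hydroxylamine concentration. It assumes simple Michaelis-Menten behaviour, which the abstract suggests by reporting Km and kcat but does not state outright. The function and constant names are hypothetical, and the reported uncertainties are ignored.

```python
KM_UM = 92.9        # Km for hydroxylamine in µM, from the abstract
KCAT_PER_S = 0.028  # kcat in s^-1, from the abstract


def turnover_rate(substrate_um, km_um=KM_UM, kcat_per_s=KCAT_PER_S):
    """Rate per enzyme molecule (s^-1), assuming Michaelis-Menten kinetics."""
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return kcat_per_s * substrate_um / (km_um + substrate_um)


# At [S] = Km the rate is half of kcat by definition.
rate_at_km = turnover_rate(KM_UM)
# Catalytic efficiency kcat/Km, converted from µM^-1 s^-1 to M^-1 s^-1.
efficiency_per_m_per_s = KCAT_PER_S / (KM_UM * 1e-6)
print(rate_at_km, efficiency_per_m_per_s)
```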


Subject(s)
Alcaligenes; Ammonia; Hydroxylamines; Oxidoreductases; Alcaligenes/enzymology; Ammonia/metabolism; Bacterial Proteins/metabolism; Flavin-Adenine Dinucleotide/metabolism; Hydroxylamines/metabolism; NAD/metabolism; Nitrogen/metabolism; Oxidation-Reduction; Oxidoreductases/metabolism; Oxygen
10.
Anal Chem ; 95(9): 4479-4485, 2023 03 07.
Article in English | MEDLINE | ID: mdl-36802539

ABSTRACT

Most organophosphorus pesticide (OP) sensors reported in the literature rely on the inhibition effect of OPs on the activity of acetylcholinesterase (AChE), which suffer from the drawbacks of lack of selective recognition of OPs, high cost, and poor stability. Herein, we proposed a novel chemiluminescence (CL) strategy for the direct detection of glyphosate (an organophosphorus herbicide) with high sensitivity and specificity, which is based on the porous hydroxy zirconium oxide nanozyme (ZrOX-OH) obtained via a facile alkali solution treatment of UIO-66. ZrOX-OH displayed excellent phosphatase-like activity, which could catalyze the dephosphorylation of 3-(2'-spiroadamantyl)-4-methoxy-4-(3'-phosphoryloxyphenyl)-1,2-dioxetane (AMPPD) to generate strong CL. The experimental results showed that the phosphatase-like activity of ZrOX-OH is closely related to the content of hydroxyl groups on their surface. Interestingly, ZrOX-OH with phosphatase-like properties exhibited a unique response to glyphosate because of the consumption of the surface hydroxyl group by the unique carboxyl group of glyphosates and was thus employed to develop a CL sensor for direct and selective detection of glyphosate without using bio-enzymes. The recovery for glyphosate detection of cabbage juice ranged from 96.8 to 103.0%. We believe that the as-proposed CL sensor based on ZrOX-OH with phosphatase-like properties supplies a simpler and more highly selective approach for OP assay and provides a new method for the development of CL sensors for the direct analysis of OPs in real samples.


Subject(s)
Acetylcholinesterase; Pesticides; Acetylcholinesterase/analysis; Pesticides/analysis; Organophosphorus Compounds/analysis; Luminescence; Phosphoric Monoester Hydrolases; Glyphosate
11.
Anal Chem ; 95(48): 17834-17842, 2023 12 05.
Article in English | MEDLINE | ID: mdl-37988125

ABSTRACT

Precise and sensitive analysis of exosomal microRNA (miRNA) is of great importance for noninvasive early disease diagnosis, but it remains a great challenge to detect exosomal miRNA in human blood samples because of their small size, high sequence homology, and low abundance. Herein, we integrated reliable Pt-S bond-mediated three-dimensional (3D) DNA nanomachine and magnetic separation in a homogeneous electrochemical strategy for the detection of exosomal miRNA with low background and high sensitivity. The 3D DNA nanomachine was easily prepared via a facile and rapid freezing method, and it was capable of resisting the influence of biothiols, thus endowing it with high stability. Notably, the as-developed magnetic 3D DNA nanomachine not only enabled the detection system to have a low background but also coupled with liposome nanocarriers to synergistically amplify the current signal. Consequently, by ingeniously combining the low background and multiple signal-amplification strategies in homogeneous electrochemical biosensing, highly sensitive detection of exosomal miRNA was successfully achieved. More significantly, with good anti-interference ability, the as-proposed method could effectively discriminate plasma samples from cancer patients and healthy subjects, thus showing a high potential for application in the nondestructive early clinical diagnosis of disease.


Subject(s)
Biosensing Techniques; MicroRNAs; Humans; MicroRNAs/analysis; DNA/analysis; Liposomes; Physical Phenomena; Magnetic Phenomena; Biosensing Techniques/methods; Electrochemical Techniques/methods; Limit of Detection
12.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32496540

ABSTRACT

Scoring functions (SFs) based on complex machine learning (ML) algorithms have gradually emerged as a promising alternative to overcome the weaknesses of classical SFs. However, extensive efforts have been devoted to the development of SFs based on new protein-ligand interaction representations and advanced alternative ML algorithms instead of the energy components obtained by the decomposition of existing SFs. Here, we propose a new method named energy auxiliary terms learning (EATL), in which the scoring components are extracted and used as the input for the development of three levels of ML SFs including EATL SFs, docking-EATL SFs and comprehensive SFs with ascending VS performance. The EATL approach not only outperforms classical SFs for the absolute performance (ROC) and initial enrichment (BEDROC) but also yields comparable performance compared with other advanced ML-based methods on the diverse subset of Directory of Useful Decoys: Enhanced (DUD-E). The test on the relatively unbiased actives as decoys (AD) dataset also proved the effectiveness of EATL. Furthermore, the idea of learning from SF components to yield improved screening power can also be extended to other docking programs and SFs available.
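
For readers unfamiliar with the ROC metric mentioned above, this is a minimal sketch of the area under the ROC curve, computed as the fraction of active/decoy pairs in which the active receives the higher score. The abstract does not describe how its ROC values were computed, and BEDROC is not shown. The function name, scores, and labels are made up for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the share of (active, decoy) pairs ranked correctly; ties count half."""
    actives = [s for s, label in zip(scores, labels) if label == 1]
    decoys = [s for s, label in zip(scores, labels) if label == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Hypothetical screening scores (higher = predicted more likely active).
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [1, 0, 1, 0, 0]  # 1 = active, 0 = decoy
auc = roc_auc(scores, labels)
print(auc)
```

The pairwise form is quadratic in the number of compounds, which is fine for a small illustration but too slow for a full screening library.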


Subject(s)
Drug Discovery; Machine Learning; Molecular Docking Simulation; Proteins/chemistry; Protein Binding
13.
Brief Bioinform ; 22(1): 474-484, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31885044

ABSTRACT

BACKGROUND: With the increasing development of biotechnology and information technology, publicly available data in chemistry and biology are undergoing explosive growth. The wealth of information in these resources needs to be extracted and then transformed into useful knowledge by various data mining methods. However, a main computational challenge is how to effectively represent or encode the molecular objects under investigation, such as chemicals, proteins, DNAs and even complicated interactions, when data mining methods are employed. To further explore these complicated data, an integrated toolkit to represent different types of molecular objects and support various data mining algorithms is urgently needed. RESULTS: We developed a freely available R/CRAN package, called BioMedR, for molecular representations of chemicals, proteins, DNAs and pairwise samples of their interactions. The current version of BioMedR can calculate 293 molecular descriptors and 13 kinds of molecular fingerprints for small molecules, 9920 protein descriptors based on protein sequences and six types of generalized scale-based descriptors for proteochemometric modeling, more than 6000 DNA descriptors from nucleotide sequences and six types of interaction descriptors using three different combining strategies. Moreover, this package implements five similarity calculation methods and four powerful clustering algorithms as well as several useful auxiliary tools, with the aim of building an integrated analysis pipeline for data acquisition, data checking, descriptor calculation and data modeling. CONCLUSION: BioMedR provides a comprehensive and uniform R package to link up different representations of molecular objects with each other and will benefit cheminformatics/bioinformatics and other biomedical users. It is available at: https://CRAN.R-project.org/package=BioMedR and https://github.com/wind22zhu/BioMedR/.


Subject(s)
Computational Biology/methods; Database Management Systems; Data Management/methods; Databases, Chemical; Databases, Genetic; Humans
14.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32892221

ABSTRACT

BACKGROUND: High-throughput screening (HTS) and virtual screening (VS) have been widely used to identify potential hits from large chemical libraries. However, the frequent occurrence of 'noisy compounds' in the screened libraries, such as compounds with poor drug-likeness, poor selectivity or potential toxicity, has greatly weakened the enrichment capability of HTS and VS campaigns. Therefore, the development of comprehensive and credible tools to detect noisy compounds from chemical libraries is urgently needed in early stages of drug discovery. RESULTS: In this study, we developed a freely available integrated python library for negative design, called Scopy, which supports the functions of data preparation, calculation of descriptors, scaffolds and screening filters, and data visualization. The current version of Scopy can calculate 39 basic molecular properties, 3 comprehensive molecular evaluation scores, 2 types of molecular scaffolds, 6 types of substructure descriptors and 2 types of fingerprints. A number of important screening rules are also provided by Scopy, including 15 drug-likeness rules (13 drug-likeness rules and 2 building block rules), 8 frequent hitter rules (four assay interference substructure filters and four promiscuous compound substructure filters), and 11 toxicophore filters (five human-related toxicity substructure filters, three environment-related toxicity substructure filters and three comprehensive toxicity substructure filters). Moreover, this library supports four different visualization functions to help users to gain a better understanding of the screened data, including basic feature radar chart, feature-feature-related scatter diagram, functional group marker gram and cloud gram. CONCLUSION: Scopy provides a comprehensive Python package to filter out compounds with undesirable properties or substructures, which will benefit the design of high-quality chemical libraries for drug design and discovery. 
It is freely available at https://github.com/kotori-y/Scopy.


Subject(s)
Databases, Pharmaceutical/statistics & numerical data; Drug Design; Drug Development/methods; High-Throughput Screening Assays/methods; Small Molecule Libraries; Biological Products/chemistry; Computational Biology/methods; Drug Discovery/methods; Drug Stability; Humans; Molecular Structure; Pharmaceutical Preparations/chemistry; Reproducibility of Results; Research Design
15.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34427296

ABSTRACT

Computational methods have become indispensable tools to accelerate the drug discovery process and alleviate the excessive dependence on time-consuming and labor-intensive experiments. Traditional feature-engineering approaches heavily rely on expert knowledge to devise useful features, which could be costly and sometimes biased. The emerging deep learning (DL) methods deliver a data-driven method to automatically learn expressive representations from complex raw data. Inspired by this, researchers have attempted to apply various deep neural network models to simplified molecular input line entry specification (SMILES) strings, which contain all the composition and structure information of molecules. However, current models usually suffer from the scarcity of labeled data. This results in a low generalization ability of SMILES-based DL models, which prevents them from competing with the state-of-the-art computational methods. In this study, we utilized the BiLSTM (bidirectional long short-term memory) attention network (BAN), in which we employed a novel multi-step attention mechanism to facilitate the extraction of key features from the SMILES strings. Meanwhile, SMILES enumeration was utilized as a data augmentation method in the training phase to substantially increase the amount of labeled data and increase the probability of mining more patterns from complex SMILES. We again took advantage of SMILES enumeration in the prediction phase to rectify model prediction bias and provide a more accurate prediction. Combined with the BAN model, our strategies can greatly improve the performance of latent features learned from SMILES strings. In 11 canonical absorption, distribution, metabolism, excretion and toxicity-related tasks, our method outperformed the state-of-the-art approaches.


Subject(s)
Cheminformatics/methods; Deep Learning; Drug Discovery/methods; Software; Algorithms; Drug Development; Research Design
16.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33418563

ABSTRACT

Matched molecular pairs analysis (MMPA) has become a powerful tool for automatically and systematically identifying medicinal chemistry transformations from compound/property datasets. However, accurate determination of matched molecular pair (MMP) transformations largely depends on the size and quality of existing experimental data. Lack of high-quality experimental data heavily hampers the extraction of more effective medicinal chemistry knowledge. Here, we developed a new strategy called quantitative structure-activity relationship (QSAR)-assisted-MMPA to expand the number of chemical transformations and took the logD7.4 property endpoint as an example to demonstrate the reliability of the new method. A reliable logD7.4 consensus prediction model was first established, and its applicability domain was strictly assessed. By applying the reliable logD7.4 prediction model to screen two chemical databases, we obtained more high-quality logD7.4 data by defining a strict applicability domain threshold. Then, MMPA was performed on the predicted data and experimental data to derive more chemical rules. To validate the reliability of the chemical rules, we compared the magnitude and directionality of the property changes of the predicted rules with those of the measured rules. Then, we compared the novel chemical rules generated by our proposed approach with the published chemical rules, and found that the magnitude and directionality of the property changes were consistent, indicating that the proposed QSAR-assisted-MMPA approach has the potential to enrich the collection of rule types or even identify completely novel rules. Finally, we found that the number of MMP rules derived from the experimental data could be amplified by the predicted data, which helps in analyzing medicinal chemistry rules in a local chemical environment. In summary, the proposed QSAR-assisted-MMPA approach could be regarded as a very promising strategy to expand the chemical transformation space for lead optimization, especially when there are not enough experimental data to support MMPA.
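
The comparison of "magnitude and directionality" between predicted and measured rules can be sketched as follows: group matched pairs by transformation, average the property change for each, and check whether the sign agrees across the two data sources. This is only an illustration under assumed data structures; the transformation labels and logD7.4 values are made up and do not come from the paper.

```python
from collections import defaultdict
from statistics import mean


def mean_change_by_transformation(pairs):
    """Average property change (after - before) for each transformation."""
    grouped = defaultdict(list)
    for transformation, value_before, value_after in pairs:
        grouped[transformation].append(value_after - value_before)
    return {t: mean(deltas) for t, deltas in grouped.items()}


def same_direction(rules_a, rules_b):
    """For shared transformations, does the mean change have the same sign?

    A mean change of exactly zero is treated as non-positive.
    """
    shared = rules_a.keys() & rules_b.keys()
    return {t: (rules_a[t] > 0) == (rules_b[t] > 0) for t in shared}


# Hypothetical matched pairs: (transformation, logD before, logD after).
measured_pairs = [("H>>F", 1.0, 1.25), ("H>>F", 2.0, 2.25), ("H>>OH", 2.0, 1.0)]
predicted_pairs = [("H>>F", 1.5, 1.75), ("H>>OH", 3.0, 2.5)]

measured_rules = mean_change_by_transformation(measured_pairs)
predicted_rules = mean_change_by_transformation(predicted_pairs)
agreement = same_direction(measured_rules, predicted_rules)
print(measured_rules, predicted_rules, agreement)
```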


Subject(s)
Chemistry Techniques, Synthetic/methods; Chemistry, Pharmaceutical/methods; Drug Discovery/methods; Drugs, Investigational/chemical synthesis; Models, Statistical; Biotransformation; Databases, Chemical; Datasets as Topic; Drug Discovery/statistics & numerical data; Drugs, Investigational/metabolism; Humans; Molecular Structure; Quantitative Structure-Activity Relationship; Reproducibility of Results
17.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33709154

ABSTRACT

BACKGROUND: Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of experimental data, data-driven computational systems that can derive representative substructures from large chemical libraries are attracting increasing attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed. RESULTS: In this study, PySmash, a user-friendly and powerful tool to generate different types of representative substructures, was developed. The current version of PySmash provides both a Python package and a standalone executable program, which makes it easy to operate and to integrate into pipelines. Three types of substructure generation algorithms, including circular, path-based and functional group-based algorithms, are provided. Users can conveniently customize their own requirements for substructure size, accuracy and coverage, statistical significance and parallel computation during execution. In addition, PySmash provides a function for screening external data. CONCLUSION: PySmash, a user-friendly and integrated tool for the automatic generation and implementation of representative substructures, is presented. Three screening examples, including toxicophore derivation, privileged motif detection and the integration of substructures with machine learning (ML) models, are provided to illustrate the utility of PySmash in safety profile evaluation, therapeutic activity exploration and molecular optimization, respectively. Its executable program and Python package are available at https://github.com/kotori-y/pySmash.
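
As a rough illustration of the accuracy and coverage criteria mentioned in this abstract, the sketch below scores one candidate substructure against a labeled set. It does not use the PySmash API, and the definitions are assumptions: accuracy is taken as the fraction of compounds containing the substructure that are positive, and coverage as the fraction of positive compounds that contain it. Molecules are represented as hypothetical sets of fragment names rather than real structures.

```python
def substructure_stats(molecules, fragment):
    """Return (accuracy, coverage) of one fragment over labeled molecules.

    molecules: list of (set_of_fragment_names, is_positive) tuples.
    accuracy: share of fragment-containing molecules that are positive.
    coverage: share of positive molecules that contain the fragment.
    """
    hits = [label for frags, label in molecules if fragment in frags]
    positives = sum(1 for _, label in molecules if label)
    true_hits = sum(hits)
    accuracy = true_hits / len(hits) if hits else 0.0
    coverage = true_hits / positives if positives else 0.0
    return accuracy, coverage


# Hypothetical toy data: (fragments present, labeled as toxic?)
toy_set = [
    ({"nitro", "phenyl"}, True),
    ({"nitro"}, True),
    ({"nitro", "amide"}, False),
    ({"phenyl"}, True),
    ({"amide"}, False),
    ({"nitro", "ester"}, True),
]

nitro_accuracy, nitro_coverage = substructure_stats(toy_set, "nitro")
```

A user-defined threshold on both values (for example, keeping only fragments above a chosen accuracy and coverage) would then decide which substructures count as representative.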


Subject(s)
Computational Biology/methods , Drug Discovery/methods , Machine Learning , Software , Carcinogenicity Tests/methods , Carcinogens , Drug Screening Assays, Antitumor/methods , Humans
18.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33951729

ABSTRACT

MOTIVATION: Accurate and efficient prediction of molecular properties is one of the fundamental issues in drug design and discovery pipelines. Traditional feature engineering-based approaches require extensive expertise in the feature design and selection process. With the development of artificial intelligence (AI) technologies, data-driven methods exhibit unparalleled advantages over feature engineering-based methods in various domains. Nevertheless, when applied to molecular property prediction, AI models usually suffer from the scarcity of labeled data and show poor generalization ability. RESULTS: In this study, we proposed molecular graph BERT (MG-BERT), which integrates the local message passing mechanism of graph neural networks (GNNs) into the powerful BERT model to facilitate learning from molecular graphs. Furthermore, an effective self-supervised learning strategy named masked atoms prediction was proposed to pretrain the MG-BERT model on a large amount of unlabeled data to mine context information in molecules. We found that the MG-BERT model can generate context-sensitive atomic representations after pretraining and transfer the learned knowledge to the prediction of a variety of molecular properties. The experimental results show that the pretrained MG-BERT model, with a little extra fine-tuning, consistently outperforms the state-of-the-art methods on all 11 ADMET datasets. Moreover, the MG-BERT model leverages attention mechanisms to focus on atomic features essential to the target property, providing excellent interpretability for the trained model. The MG-BERT model does not require any hand-crafted features as input and is more reliable due to its excellent interpretability, providing a novel framework to develop state-of-the-art models for a wide range of drug discovery tasks.
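
The masked atoms prediction task named in this abstract can be sketched, at the data-preparation level only, as follows. This is an assumption-laden illustration rather than the MG-BERT code: the 15% mask rate is borrowed from the original BERT recipe, the `[MASK]` token and the function name are hypothetical, and the model that would recover the hidden atoms is omitted.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder token


def mask_atoms(atoms, mask_rate=0.15, seed=None):
    """Hide a random subset of atoms and return (masked_atoms, targets).

    targets maps each masked position to the original atom symbol,
    which is what a pretraining model would be asked to recover.
    """
    if not atoms:
        return [], {}
    rng = random.Random(seed)
    # Always mask at least one atom so every molecule yields a training signal.
    n_mask = max(1, round(len(atoms) * mask_rate))
    positions = rng.sample(range(len(atoms)), n_mask)
    masked = list(atoms)
    targets = {}
    for i in positions:
        targets[i] = atoms[i]
        masked[i] = MASK_TOKEN
    return masked, targets


# Heavy atoms of aspirin, CC(=O)Oc1ccccc1C(=O)O, in SMILES order.
aspirin_atoms = ["C", "C", "O", "O", "C", "C", "C", "C", "C", "C", "C", "O", "O"]
masked_atoms, targets = mask_atoms(aspirin_atoms, seed=0)
```

Because the labels come from the molecule itself, no experimental annotation is needed, which is what allows pretraining on large unlabeled collections.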


Subject(s)
Models, Theoretical , Neural Networks, Computer
19.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33201188

ABSTRACT

BACKGROUND: Fluorescent detection methods are indispensable tools for chemical biology. However, the frequent appearance of potentially fluorescent compounds has greatly interfered with the recognition of compounds with genuine activity. Such fluorescence interference is especially difficult to identify because it is reproducible and concentration-dependent. Therefore, a credible screening tool to detect fluorescent compounds in chemical libraries is urgently needed in the early stages of drug discovery. RESULTS: In this study, we developed a webserver, ChemFLuo, for fluorescent compound detection, based on two large and high-quality training datasets containing 4906 blue and 8632 green fluorescent compounds. These molecules were used to construct a group of prediction models based on the combination of three machine learning algorithms and seven types of molecular representations. The best blue fluorescence prediction model achieved balanced accuracy (BA) = 0.858 and area under the receiver operating characteristic curve (AUC) = 0.931 for the validation set, and BA = 0.823 and AUC = 0.903 for the test set. The best green fluorescence prediction model achieved BA = 0.810 and AUC = 0.887 for the validation set, and BA = 0.771 and AUC = 0.852 for the test set. In addition to the prediction models, 22 blue and 16 green representative fluorescent substructures were summarized for the screening of potential fluorescent compounds. The comparison with other fluorescence detection tools and the application to external validation sets and large molecule libraries have demonstrated the reliability of the prediction models for fluorescent compound detection. CONCLUSION: ChemFLuo is a public webserver to filter out compounds with undesirable fluorescent properties, which will benefit the design of high-quality chemical libraries for drug discovery. It is freely available at http://admet.scbdd.com/chemfluo/index/.
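
Balanced accuracy, the metric reported above, is the mean of sensitivity and specificity. The short sketch below shows why it is informative on imbalanced screening sets. The labels are made up for illustration and have no connection to the ChemFLuo datasets.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = fluorescent)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced set: 2 fluorescent compounds among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial model that calls everything non-fluorescent.
y_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
ba = balanced_accuracy(y_true, y_pred)
```

Here the trivial model scores 0.8 on plain accuracy but only 0.5 on balanced accuracy, because it finds none of the fluorescent compounds.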


Subject(s)
Drug Discovery , Fluorescent Dyes/chemistry , Machine Learning , Models, Chemical , Small Molecule Libraries , Fluorescence
20.
Anticancer Drugs ; 34(3): 431-438, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36730496

ABSTRACT

Apatinib is a selective inhibitor of vascular endothelial growth factor receptor-2. Despite encouraging anticancer activity in different cancer types, some patients may not benefit from apatinib treatment. Herein, we characterized genomic profiles in colorectal cancer (CRC) patients to explore predictive biomarkers of apatinib at the molecular level. We retrospectively recruited 19 CRC patients receiving apatinib as third-line treatment. Tissue samples before apatinib treatment were collected and subjected to genomic profiling using a targeted sequencing panel covering 520 cancer-related genes. After apatinib treatment, the patients achieved an objective response rate of 21% (4/19) and a disease control rate of 57.9% (11/19). The median progression-free survival (PFS) and overall survival were 5 and 8.7 months, respectively. Genetic alterations were frequently identified in TP53 (95%), APC (53%), KRAS (53%) and PIK3CA (26%). Significantly higher tumor mutation burden levels were observed in patients harboring alterations in ERBB and RAS signaling pathways. Patients harboring FLT1 amplifications (n = 3) showed significantly worse PFS than wild-type patients. Our study described molecular profiles in CRC patients receiving apatinib treatment and identified FLT1 amplification as a potential predictive biomarker for poor efficacy of apatinib. Further studies are warranted to validate the use of FLT1 amplification during apatinib treatment.


Subject(s)
Antineoplastic Agents , Colorectal Neoplasms , Humans , Antineoplastic Agents/pharmacology , Retrospective Studies , Vascular Endothelial Growth Factor A , Colorectal Neoplasms/pathology , Biomarkers , Genomics