Search | VHL Regional Portal

High-throughput prediction of enzyme promiscuity based on substrate-product pairs.

Xing, Huadong; Cai, Pengli; Liu, Dongliang; Han, Mengying; Liu, Juan; Le, Yingying; Zhang, Dachuan; Hu, Qian-Nan.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38487850

ABSTRACT

The screening of enzymes for catalyzing specific substrate-product pairs is often constrained in the realms of metabolic engineering and synthetic biology. Existing tools based on substrate and reaction similarity predominantly rely on prior knowledge, demonstrating limited extrapolative capabilities and an inability to incorporate custom candidate-enzyme libraries. Addressing these limitations, we have developed the Substrate-product Pair-based Enzyme Promiscuity Prediction (SPEPP) model. This innovative approach utilizes transfer learning and transformer architecture to predict enzyme promiscuity, thereby elucidating the intricate interplay between enzymes and substrate-product pairs. SPEPP exhibited robust predictive ability, eliminating the need for prior knowledge of reactions and allowing users to define their own candidate-enzyme libraries. It can be seamlessly integrated into various applications, including metabolic engineering, de novo pathway design, and hazardous material degradation. To better assist metabolic engineers in designing and refining biochemical pathways, particularly those without programming skills, we also designed EnzyPick, an easy-to-use web server for enzyme screening based on SPEPP. EnzyPick is accessible at http://www.biosynther.com/enzypick/.

RDBridge: a knowledge graph of rare diseases based on large-scale text mining.

Xing, Huadong; Zhang, Dachuan; Cai, Pengli; Zhang, Rui; Hu, Qian-Nan.

Bioinformatics ; 39(7), 2023 Jul 01.

Article in English | MEDLINE | ID: mdl-37458501

ABSTRACT

MOTIVATION: Despite low prevalence, rare diseases affect 300 million people worldwide. Research on pathogenesis and drug development lags due to limited commercial potential, insufficient epidemiological data, and a dearth of publications. The unique characteristics of rare diseases, including limited annotated data, intricate processes for extracting pertinent entity relationships, and difficulties in standardizing data, represent challenges for text mining. RESULTS: We developed a rare disease data acquisition framework using text mining and knowledge graphs and constructed the most comprehensive rare disease knowledge graph to date, Rare Disease Bridge (RDBridge). RDBridge offers search functions for genes, potential drugs, pathways, literature, and medical imaging data that will support mechanistic research, drug development, diagnosis, and treatment for rare diseases. AVAILABILITY AND IMPLEMENTATION: RDBridge is freely available at http://rdb.lifesynther.com/.

Subject(s)

Pattern Recognition, Automated , Rare Diseases , Humans , Rare Diseases/diagnosis , Rare Diseases/epidemiology , Rare Diseases/genetics , Data Mining/methods

SynBioTools: a one-stop facility for searching and selecting synthetic biology tools.

Cai, Pengli; Liu, Sheng; Zhang, Dachuan; Xing, Huadong; Han, Mengying; Liu, Dongliang; Gong, Linlin; Hu, Qian-Nan.

BMC Bioinformatics ; 24(1): 152, 2023 Apr 17.

Article in English | MEDLINE | ID: mdl-37069545

ABSTRACT

BACKGROUND: The rapid development of synthetic biology relies heavily on the use of databases and computational tools, which are also developing rapidly. While many tool registries have been created to facilitate tool retrieval, sharing, and reuse, no relatively comprehensive tool registry or catalog addresses all aspects of synthetic biology. RESULTS: We constructed SynBioTools, a comprehensive collection of synthetic biology databases, computational tools, and experimental methods, as a one-stop facility for searching and selecting synthetic biology tools. SynBioTools includes databases, computational tools, and methods extracted from reviews via SCIentific Table Extraction, a scientific table-extraction tool that we built. Approximately 57% of the resources that we located and included in SynBioTools are not mentioned in bio.tools, the dominant tool registry. To improve users' understanding of the tools and to enable them to make better choices, the tools are grouped into nine modules (each with subdivisions) based on their potential biosynthetic applications. Detailed comparisons of similar tools in every classification are included. The URLs, descriptions, source references, and the number of citations of the tools are also integrated into the system. CONCLUSIONS: SynBioTools is freely available at https://synbiotools.lifesynther.com/ . It provides end-users and developers with a useful resource of categorized synthetic biology databases, tools, and methods to facilitate tool retrieval and selection.

Subject(s)

Computational Biology , Synthetic Biology , Computational Biology/methods , Registries , Databases, Factual , Software

Data-Driven Elucidation of Flavor Chemistry.

Kou, Xingran; Shi, Peiqin; Gao, Chukun; Ma, Peihua; Xing, Huadong; Ke, Qinfei; Zhang, Dachuan.

J Agric Food Chem ; 71(18): 6789-6802, 2023 May 10.

Article in English | MEDLINE | ID: mdl-37102791

ABSTRACT

Flavor molecules are commonly used in the food industry to enhance product quality and consumer experiences but are associated with potential human health risks, highlighting the need for safer alternatives. To address these health-associated challenges and promote reasonable application, several databases for flavor molecules have been constructed. However, no existing studies have comprehensively summarized these data resources according to quality, focused fields, and potential gaps. Here, we systematically summarized 25 flavor molecule databases published within the last 20 years and revealed that data inaccessibility, untimely updates, and nonstandard flavor descriptions are the main limitations of current studies. We examined the development of computational approaches (e.g., machine learning and molecular simulation) for the identification of novel flavor molecules and discussed their major challenges regarding throughput, model interpretability, and the lack of gold-standard data sets for equitable model evaluation. Additionally, we discussed future strategies for the mining and designing of novel flavor molecules based on multi-omics and artificial intelligence to provide a new foundation for flavor science research.

Subject(s)

Artificial Intelligence , Machine Learning , Humans , Computer Simulation , Databases, Chemical , Databases, Factual

AddictedChem: A Data-Driven Integrated Platform for New Psychoactive Substance Identification.

Han, Mengying; Liu, Sheng; Zhang, Dachuan; Zhang, Rui; Liu, Dongliang; Xing, Huadong; Sun, Dandan; Gong, Linlin; Cai, Pengli; Tu, Weizhong; Chen, Junni; Hu, Qian-Nan.

Molecules ; 27(12), 2022 Jun 19.

Article in English | MEDLINE | ID: mdl-35745053

ABSTRACT

The mechanisms underlying drug addiction remain nebulous. Furthermore, new psychoactive substances (NPS) are being developed to circumvent legal control; hence, rapid NPS identification is urgently needed. Here, we present the construction of the comprehensive database of controlled substances, AddictedChem. This database integrates the following information on controlled substances from the US Drug Enforcement Administration: physical and chemical characteristics; classified literature by Medical Subject Headings terms and target binding data; absorption, distribution, metabolism, excretion, and toxicity; and related genes, pathways, and bioassays. We created 29 predictive models for NPS identification using five machine learning algorithms and seven molecular descriptors. The best performing models achieved a balanced accuracy (BA) of 0.940 with an area under the curve (AUC) of 0.986 for the test set and a BA of 0.919 and an AUC of 0.968 for the external validation set, which were subsequently used to identify potential NPS with a consensus strategy. Concurrently, a chemical space that included the properties of vectorised addictive compounds was constructed and integrated with AddictedChem, illustrating the principle of diversely existing NPS from a macro perspective. Based on these potential applications, AddictedChem could be considered a highly promising tool for NPS identification and evaluation.

Subject(s)

Psychotropic Drugs , Substance-Related Disorders , Controlled Substances , Databases, Factual , Humans , Psychotropic Drugs/adverse effects , Substance-Related Disorders/diagnosis
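
Note on the metric: the AddictedChem abstract above reports balanced accuracy (BA) for its classifiers. As a minimal sketch of how BA is commonly computed for a binary classifier (the mean of sensitivity and specificity), the following may help; the function name and the toy labels are illustrative assumptions and are not taken from the paper or its data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative).

    Illustrative helper only; not the authors' implementation.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Toy labels (made up): 4 positives, 2 negatives.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))  # sensitivity 0.75, specificity 0.5 -> 0.625
```

Unlike plain accuracy, this measure weights both classes equally, which matters when one class (for example, known controlled substances) is much rarer than the other.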

A data-driven integrative platform for computational prediction of toxin biotransformation with a case study.

Zhang, Dachuan; Tian, Ye; Tian, Yu; Xing, Huadong; Liu, Sheng; Zhang, Haoyang; Ding, Shaozhen; Cai, Pengli; Sun, Dandan; Zhang, Tong; Hong, Yanhong; Dai, Hongkun; Tu, Weizhong; Chen, Junni; Wu, Aibo; Hu, Qian-Nan.

J Hazard Mater ; 408: 124810, 2021 Apr 15.

Article in English | MEDLINE | ID: mdl-33360695

ABSTRACT

Recently, biogenic toxins have received increasing attention owing to their high contamination levels in feed and food as well as in the environment. However, there is a lack of an integrative platform for seamless linking of data-driven computational methods with 'wet' experimental validations. To this end, we constructed a novel platform that integrates the technical aspects of toxin biotransformation methods. First, a biogenic toxin database termed ToxinDB (http://www.rxnfinder.org/toxindb/), containing multifaceted data on more than 4836 toxins, was built. Next, more than 8000 biotransformation reaction rules were extracted from over 300,000 biochemical reactions extracted from ~580,000 literature reports curated by more than 100 people over the past decade. Based on these reaction rules, a toxin biotransformation prediction model was constructed. Finally, the global chemical space of biogenic toxins was constructed, comprising ~550,000 toxins and putative toxin metabolites, of which 94.7% of the metabolites have not been previously reported. Additionally, we performed a case study to investigate citrinin metabolism in Trichoderma, and a novel metabolite was identified with the assistance of the biotransformation prediction tool of ToxinDB. This unique integrative platform will assist exploration of the 'dark matter' of a toxin's metabolome and promote the discovery of detoxification enzymes.

Subject(s)

Computational Biology , Metabolome , Biotransformation , Databases, Factual , Humans

SARS2020: an integrated platform for identification of novel coronavirus by a consensus sequence-function model.

Zhang, Dachuan; Zhang, Tong; Liu, Sheng; Sun, Dandan; Ding, Shaozhen; Cheng, Xingxiang; Cai, Pengli; Ren, Ailin; Han, Mengying; Liu, Dongliang; Jia, Cancan; Gong, Linlin; Zhang, Rui; Xing, Huadong; Tu, Weizhong; Chen, Junni; Hu, Qian-Nan.

Bioinformatics ; 37(8): 1182-1183, 2021 May 23.

Article in English | MEDLINE | ID: mdl-32871007

ABSTRACT

MOTIVATION: The 2019 novel coronavirus outbreak has significantly affected global health and society. Thus, predicting biological function from pathogen sequence is crucial and urgently needed. However, little work has been conducted to identify viruses by the enzymes that they encode, and which are key to pathogen propagation. RESULTS: We built a comprehensive scientific resource, SARS2020, which integrates coronavirus-related research, genomic sequences and results of anti-viral drug trials. In addition, we built a consensus sequence-catalytic function model from which we identified the novel coronavirus as encoding the same proteinase as the severe acute respiratory syndrome virus. This data-driven sequence-based strategy will enable rapid identification of agents responsible for future epidemics. AVAILABILITY AND IMPLEMENTATION: SARS2020 is available at http://design.rxnfinder.org/sars2020/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

COVID-19 , Severe acute respiratory syndrome-related coronavirus , Consensus Sequence , Genome , Humans , Severe acute respiratory syndrome-related coronavirus/genetics , SARS-CoV-2
