1.
Mol Inform ; 40(4): e2000225, 2021 04.
Article in English | MEDLINE | ID: mdl-33237627

ABSTRACT

The development of novel organic compounds with desired properties is time-consuming and costly. Thus, quantitative structure-property relationship (QSPR) models are widely used to discover compounds with the desired properties efficiently. Novel structures can be generated in silico from a variety of input structures by structure generators. We previously developed the structure generator DAECS to yield highly active drug-like structures. However, the structural diversity of the structures generated by DAECS was still too small for practical applications such as drug discovery. In this paper, we present structure modification rules and an algorithm that output more diverse structures through the DAECS workflow. Two new types of structural modification rules, bond contraction and ring mergence, were added. The new algorithm, which restricts the search area and subsequently clusters structures on a two-dimensional map generated by generative topographic mapping, was implemented for the repetitive selection of seed structures. A case study was conducted to evaluate our method using ligand structures for the histamine H1 receptor. The results showed greater structural diversity than the previous method.


Subject(s)
Algorithms; Organic Chemicals/chemistry; Quantitative Structure-Activity Relationship; Molecular Structure; Organic Chemicals/chemical synthesis
2.
F1000Res ; 9: 136, 2020.
Article in English | MEDLINE | ID: mdl-32308977

ABSTRACT

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.


Subject(s)
Biological Science Disciplines; Computational Biology; Semantic Web; Data Mining; Metadata; Reproducibility of Results
3.
J Cheminform ; 12(1): 19, 2020 Mar 30.
Article in English | MEDLINE | ID: mdl-33430997

ABSTRACT

Ensemble learning improves machine learning results by combining several models, giving better predictive performance than a single model. It also benefits and accelerates research in quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) modeling. With the growing number of ensemble learning models such as random forest, the effectiveness of QSAR/QSPR will be limited by the machine's inability to explain its predictions to researchers. In fact, many implementations of ensemble learning models are able to quantify the overall magnitude of each feature. For example, feature importance allows us to assess the relative importance of features and to interpret the predictions. However, different ensemble learning methods or implementations may lead to different feature selections for interpretation. In this paper, we compared the predictability and interpretability of four well-established ensemble learning models (random forest, extremely randomized trees, adaptive boosting and gradient boosting) for regression and binary classification modeling tasks. Then, blending methods were built by summarizing the four different ensemble learning methods. The blending method led to better performance and a unified interpretation by summarizing individual predictions from the different learning models. The important features in two case studies, which gave valuable information about compound properties, are discussed in detail in this report. QSPR modeling with interpretable machine learning techniques can help chemical design work more efficiently, confirm hypotheses and establish knowledge for better results.
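
The abstract above does not spell out how the blending was computed. As a minimal sketch, assuming the blend is a plain average of per-model predictions and of per-model feature importances rescaled to sum to one, the idea could look like this. The model names, descriptor names and scores are hypothetical, and the averaging rule is an assumption rather than the authors' procedure.

```python
from statistics import mean

# Hypothetical importance scores; model and descriptor names are illustrative only.
PER_MODEL_IMPORTANCES = {
    "random_forest": {"logP": 3.0, "mol_weight": 1.0},
    "gradient_boosting": {"logP": 1.0, "mol_weight": 1.0},
}


def normalize(importances):
    """Rescale one model's importance scores so that they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    return {name: value / total for name, value in importances.items()}


def blend_importances(per_model):
    """Average the normalized importances over all models (assumed blending rule)."""
    normalized = [normalize(scores) for scores in per_model.values()]
    features = sorted(set().union(*normalized))
    # A feature that a model does not report counts as 0 for that model.
    return {f: mean(scores.get(f, 0.0) for scores in normalized) for f in features}


def blend_predictions(per_model_predictions):
    """Average the predictions of all models, sample by sample."""
    return [mean(values) for values in zip(*per_model_predictions.values())]


if __name__ == "__main__":
    print(blend_importances(PER_MODEL_IMPORTANCES))
```

Rescaling before averaging keeps one model with large raw scores from dominating the blended ranking, which is one way to obtain a single interpretation from models that weight features differently.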

4.
BMC Bioinformatics ; 20(1): 728, 2019 Dec 23.
Article in English | MEDLINE | ID: mdl-31870296

ABSTRACT

BACKGROUND: Natural products are the source of various functional materials such as medicines, and understanding their biosynthetic pathways can provide information that is helpful for their effective production through the synthetic biology approach. A number of studies have aimed to predict biosynthetic pathways from their chemical structures in a retrosynthetic manner; however, sometimes the calculation finishes without reaching the starting material from the target molecule. To address this problem, a method to find suitable starting materials is required. RESULTS: In this study, we developed a predictive workflow named the Metabolic Disassembler that automatically disassembles the target molecule structure into relevant biosynthetic units (BUs), which are the substructures that correspond to the starting materials in the biosynthesis pathway. This workflow uses a biosynthetic unit library (BUL), which contains starting materials, key intermediates, and their derivatives. We obtained the starting materials from the KEGG PATHWAY database, and 765 BUs were registered in the BUL. We then examined the proposed workflow to optimize the combination of the BUs. To evaluate the performance of the proposed Metabolic Disassembler workflow, we used 943 molecules that are included in the secondary metabolism maps of KEGG PATHWAY. About 95.8% of them (903 molecules) were correctly disassembled by our proposed workflow. For comparison, we also implemented a genetic algorithm-based workflow, and found that the accuracy was only about 52.0%. In addition, for 90.7% of molecules, our workflow finished the calculation within one minute. CONCLUSIONS: The Metabolic Disassembler enabled the effective disassembly of natural products in terms of both correctness and computational time. It also automatically outputs highlighted, color-coded substructures corresponding to the BUs to help users understand the calculation results. Users do not have to specify starting molecules in advance, and can input any target molecule, even if it is not in databases. Our workflow will be very useful for understanding and predicting the biosynthesis of natural products.


Subject(s)
Biological Products/chemistry; Biosynthetic Pathways/genetics; Synthetic Biology/methods; Humans
5.
J Biol Chem ; 294(49): 18662-18673, 2019 12 06.
Article in English | MEDLINE | ID: mdl-31656227

ABSTRACT

Cucurbitacins are highly oxygenated triterpenoids characteristic of plants in the family Cucurbitaceae and responsible for the bitter taste of these plants. Fruits of bitter melon (Momordica charantia) contain various cucurbitacins possessing an unusual ether bridge between C5 and C19, not observed in other Cucurbitaceae members. Using a combination of next-generation sequencing and RNA-Seq analysis and gene-to-gene co-expression analysis with the ConfeitoGUIplus software, we identified three P450 genes, CYP81AQ19, CYP88L7, and CYP88L8, expected to be involved in cucurbitacin biosynthesis. CYP81AQ19 co-expression with cucurbitadienol synthase in yeast resulted in the production of cucurbita-5,24-diene-3β,23α-diol. A mild acid treatment of this compound resulted in an isomerization of the C23-OH group to C25-OH with the concomitant migration of a double bond, suggesting that a nonenzymatic transformation may account for the observed C25-OH in the majority of cucurbitacins found in plants. The functional expression of CYP88L7 resulted in the production of hydroxylated C19 as well as C5-C19 ether-bridged products. A plausible mechanism for the formation of the C5-C19 ether bridge involves C7 and C19 hydroxylations, indicating a multifunctional nature of this P450. On the other hand, functional CYP88L8 expression gave a single product, a triterpene diol, indicating a monofunctional P450 catalyzing the C7 hydroxylation. Our findings of the roles of several plant P450s in cucurbitacin biosynthesis reveal that an allylic hydroxylation is a key enzymatic transformation that triggers subsequent processes to produce structurally diverse products.


Subject(s)
Cytochrome P-450 Enzyme System/metabolism; Momordica/chemistry; Plant Proteins/metabolism; Triterpenes/metabolism; Hydroxylation; Protein Isoforms; Software
6.
Mol Inform ; 38(10): e1900010, 2019 10.
Article in English | MEDLINE | ID: mdl-31187601

ABSTRACT

Cytochrome P450 (CYP) is an enzyme family that plays a crucial role in metabolism, mainly metabolizing xenobiotics to produce non-toxic structures; however, some metabolized products can cause hepatotoxicity. Hence, predicting the structures of CYP products is an important task in designing non-hepatotoxic drugs. Here, we have developed novel atomic descriptors to predict the sites of metabolism (SoM) in CYP substrates. We proposed descriptors that describe topological and electrostatic characteristics of CYP substrates using Gasteiger charges. The proposed descriptors were applied to CYP3A4 data analysis as a case study. As a result of the descriptor selection, we obtained a gradient boosting decision tree-based SoM classification model that used 139 existing descriptors and the 45 proposed descriptors, and the model performed well in terms of the Matthews correlation coefficient. We also developed a structure converter to predict CYP products. This converter correctly generated 51 structural formulas of experimentally observed CYP3A4 products according to a manual evaluation.


Subject(s)
Cytochrome P-450 Enzyme System/metabolism; Xenobiotics/chemistry; Xenobiotics/metabolism; Molecular Structure; Static Electricity
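
The abstract for this entry reports model quality as the Matthews correlation coefficient (MCC) without giving the formula. For reference, a minimal sketch of the standard definition from binary confusion-matrix counts follows; the function name `mcc` and the zero-denominator convention are choices made here, not details taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention for the
    otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: atoms labeled as metabolized (positive) or not (negative).
    print(mcc(tp=5, tn=5, fp=0, fn=0))
```

MCC is a reasonable choice for a task of this kind because it uses all four cells of the confusion matrix, so a classifier that labels every atom as a non-site scores 0 rather than a misleadingly high accuracy.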
7.
BMC Syst Biol ; 13(Suppl 2): 39, 2019 04 05.
Article in English | MEDLINE | ID: mdl-30953486

ABSTRACT

BACKGROUND: Characterization of drug-protein interaction networks with biological features has recently become a challenge in pharmaceutical science toward a better understanding of polypharmacology. RESULTS: We present a novel method for systematic analyses of the underlying features characteristic of drug-protein interaction networks, which we call "drug-protein interaction signatures", from the integration of large-scale heterogeneous data of drugs and proteins. We develop a new efficient algorithm for extracting informative drug-protein interaction signatures from the integration of large-scale heterogeneous data of drugs and proteins, which is made possible by space-efficient representations for fingerprints of drug-protein pairs and sparsity-induced classifiers. CONCLUSIONS: Our method infers a set of drug-protein interaction signatures consisting of the associations between drug chemical substructures, adverse drug reactions, protein domains, biological pathways, and pathway modules. We argue that these signatures are biologically meaningful and useful for predicting unknown drug-protein interactions and are expected to contribute to rational drug design.


Subject(s)
Computational Biology/methods; Pharmaceutical Preparations/metabolism; Proteins/metabolism; Logistic Models; Protein Binding
8.
J Mol Model ; 25(5): 112, 2019 Apr 05.
Article in English | MEDLINE | ID: mdl-30953170

ABSTRACT

Membranolytic anticancer peptides (ACPs) are drawing increasing attention as potential future therapeutics against cancer, due to their ability to hinder the development of cellular resistance and their potential to overcome common hurdles of chemotherapy, e.g., side effects and cytotoxicity. In this work, we present an ensemble machine learning model to design potent ACPs. Four counter-propagation artificial neural networks were trained to identify peptides that kill breast and/or lung cancer cells. For prospective application of the ensemble model, we selected 14 peptides from a total of 1000 de novo designs for synthesis and testing in vitro on breast cancer (MCF7) and lung cancer (A549) cell lines. Six de novo designs showed anticancer activity in vitro, five of them against both MCF7 and A549 cell lines. The novel active peptides populate uncharted regions of ACP sequence space.


Subject(s)
Antineoplastic Agents/chemistry; Models, Molecular; Neoplasms/drug therapy; Peptides/chemistry; A549 Cells; Antineoplastic Agents/therapeutic use; Cell Proliferation/drug effects; Humans; MCF-7 Cells; Machine Learning; Neoplasms/genetics; Neural Networks, Computer; Peptides/genetics; Peptides/therapeutic use
9.
Methods Mol Biol ; 1825: 211-225, 2018.
Article in English | MEDLINE | ID: mdl-30334207

ABSTRACT

Small molecules can be represented in various file formats: (1) one-line systems such as SMILES (Simplified Molecular Input Line Entry System) and InChI (International Chemical Identifier) and (2) table systems such as molfiles, SDF (Structure Data File), and KCF (KEGG Chemical Function). KCF and KCF-S (KEGG Chemical Function-and-Substructures) apply physicochemical property labels to the representations of small molecules, and contribute to improved analysis of compound-protein networks including drug-target interaction, and compound-compound networks including metabolic pathways. In this chapter, the main concepts, usage, and some example applications of the KCFCO and KCF-S packages are explained.


Subject(s)
Computational Biology/methods; Databases, Factual; Metabolic Networks and Pathways; Pharmaceutical Preparations/metabolism; Proteins/metabolism; Humans; Software; Structure-Activity Relationship
10.
Sci Rep ; 7: 43368, 2017 03 06.
Article in English | MEDLINE | ID: mdl-28262809

ABSTRACT

Although host-plant selection is a central topic in ecology, its general underpinnings are poorly understood. Here, we performed a case study focusing on the publicly available data on Japanese butterflies. A combined statistical analysis of plant-herbivore relationships and taxonomy revealed that some butterfly subfamilies in different families feed on the same plant families, and that this phenomenon occurs more often than expected by chance, thus indicating the independent acquisition of adaptive phenotypes to the same hosts. We consequently integrated plant-herbivore and plant-compound relationship data and conducted a statistical analysis to identify compounds unique to host plants of specific butterfly families. Some of the identified plant compounds are known to attract certain butterfly groups while repelling others. The additional incorporation of insect-compound relationship data revealed potential metabolic processes that are related to host plant selection. Our results demonstrate that data integration enables the computational detection of compounds putatively involved in particular interspecies interactions and that further data enrichment and integration of genomic and transcriptomic data facilitate the unveiling of the molecular mechanisms involved in host plant selection.


Subject(s)
Butterflies/physiology; Computational Biology/methods; Feeding Behavior; Plants/parasitology; Animals; Chemotactic Factors/analysis; Insect Repellents/analysis; Phytochemicals/analysis; Plants/chemistry
11.
Sci Rep ; 7: 40164, 2017 01 10.
Article in English | MEDLINE | ID: mdl-28071740

ABSTRACT

The identification of the modes of action of bioactive compounds is a major challenge in chemical systems biology of diseases. Genome-wide expression profiling of transcriptional responses to compound treatment for human cell lines is a promising unbiased approach for the mode-of-action analysis. Here we developed a novel approach to elucidate the modes of action of bioactive compounds in a cell-specific manner using large-scale chemically-induced transcriptome data acquired from the Library of Integrated Network-based Cellular Signatures (LINCS), and analyzed 16,268 compounds and 68 human cell lines. First, we performed pathway enrichment analyses of regulated genes to reveal active pathways among 163 biological pathways. Next, we explored potential target proteins (including primary targets and off-targets) with cell-specific transcriptional similarity using chemical-protein interactome. Finally, we predicted new therapeutic indications for 461 diseases based on the target proteins. We showed the usefulness of the proposed approach in terms of prediction coverage, interpretation, and large-scale applicability, and validated the new prediction results experimentally by an in vitro cellular assay. The approach has a high potential for advancing drug discovery and repositioning.


Subject(s)
Biological Products/pharmacology; Computational Biology/methods; Gene Expression Profiling; Cell Line; Humans; Metabolic Networks and Pathways/genetics; Protein Binding; Systems Biology/methods
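
The abstract for this entry mentions pathway enrichment analysis of regulated genes but does not name the statistical test. A common choice is the one-sided hypergeometric test, sketched below under that assumption. The function name, parameter names and the gene counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(universe, pathway, regulated, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe:  number of genes measured in total
    pathway:   number of those genes annotated to the pathway
    regulated: number of genes regulated by the compound
    overlap:   number of regulated genes that belong to the pathway
    """
    upper = min(pathway, regulated)
    total = comb(universe, regulated)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, regulated - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts for one compound, one cell line and one pathway.
    print(enrichment_p_value(universe=10, pathway=5, regulated=5, overlap=5))
```

When many pathways are tested for each compound and cell line, the resulting p-values would normally also need a multiple-testing correction, which this sketch leaves out.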
12.
Biophys Physicobiol ; 13: 195-205, 2016.
Article in English | MEDLINE | ID: mdl-27924274

ABSTRACT

Metabolic pathway reconstruction presents a challenge for understanding metabolic pathways in organisms of interest. Different strategies, i.e., reference-based vs. de novo, must be used for pathway reconstruction depending on the availability of well-characterized enzymatic reactions. If at least one enzyme is already known to catalyze a reaction, its amino acid sequence can be used as a reference for identifying homologous enzymes in the genome of an organism of interest. Where there is no known enzyme able to catalyze a corresponding reaction, however, the reaction and the corresponding enzyme must be predicted de novo from chemical transformations of the putative substrate-product pair. This review summarizes studies involving reference-based and de novo metabolic pathway reconstruction and discusses the importance of the classification and structure-function relationships of enzymes.

13.
Bioinformatics ; 32(12): i278-i287, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27307627

ABSTRACT

MOTIVATION: Metabolic pathways are an important class of molecular networks consisting of compounds, enzymes and their interactions. The understanding of global metabolic pathways is extremely important for various applications in ecology and pharmacology. However, large parts of metabolic pathways remain unknown, and most organism-specific pathways contain many missing enzymes. RESULTS: In this study we propose a novel method to predict the enzyme orthologs that catalyze the putative reactions to facilitate the de novo reconstruction of metabolic pathways from metabolome-scale compound sets. The algorithm detects the chemical transformation patterns of substrate-product pairs using chemical graph alignments, and constructs a set of enzyme-specific classifiers to simultaneously predict all the enzyme orthologs that could catalyze the putative reactions of the substrate-product pairs in the joint learning framework. The originality of the method lies in its ability to make predictions for thousands of enzyme orthologs simultaneously, as well as its extraction of enzyme-specific chemical transformation patterns of substrate-product pairs. We demonstrate the usefulness of the proposed method by applying it to some ten thousands of metabolic compounds, and analyze the extracted chemical transformation patterns that provide insights into the characteristics and specificities of enzymes. The proposed method will open the door to both primary (central) and secondary metabolism in genomics research, increasing research productivity to tackle a wide variety of environmental and public health matters. CONTACT: maskot@bio.titech.ac.jp.


Subject(s)
Metabolic Networks and Pathways; Algorithms; Catalysis; Genomics; Metabolome
14.
J Chem Inf Model ; 56(3): 510-6, 2016 Mar 28.
Article in English | MEDLINE | ID: mdl-26822930

ABSTRACT

Although several databases contain data on many metabolites and reactions in biochemical pathways, there is still a large gap between the numbers of experimentally identified enzymes and metabolites. It is supposed that many catalytic enzyme genes are still unknown. Although previous studies have estimated the number of candidate enzyme genes, these studies required additional information aside from the structures of metabolites, such as gene expression and gene order in the genome. In this study, we developed a novel method to identify a candidate enzyme gene of a reaction using the chemical structures of the substrate-product pair (reactant pair). The proposed method is based on a search for similar reactant pairs in a reference database and offers ortholog groups that possibly mediate the given reaction. We applied the proposed method to two experimentally validated reactions. As a result, we confirmed that the histidine transaminase was correctly identified. Although our method could not directly identify the asparagine oxo-acid transaminase, we successfully found the paralog gene most similar to the correct enzyme gene. We also applied our method to infer candidate enzyme genes in the mesaconate pathway. The advantage of our method lies in the prediction of possible genes for orphan enzyme reactions for which no associated gene sequences have yet been determined. We believe that this approach will facilitate experimental identification of genes for orphan enzymes.


Subject(s)
Enzymes/genetics; Databases, Protein; Enzymes/metabolism; Substrate Specificity
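
The abstract for this entry describes searching a reference database for reactant pairs similar to a query pair and returning the associated ortholog groups, without stating the similarity measure. The sketch below illustrates the general idea under loud assumptions: compounds are represented as sets of substructure labels, similarity is the Tanimoto coefficient, and a pair scores as the weaker of its substrate and product matches. None of these choices, nor the labels and group identifiers, come from the paper.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pair_similarity(query, reference):
    """Score a reactant pair by its weaker side (an assumed scoring rule)."""
    (q_substrate, q_product), (r_substrate, r_product) = query, reference
    return min(tanimoto(q_substrate, r_substrate), tanimoto(q_product, r_product))


def rank_ortholog_groups(query, database, top_n=3):
    """Return the top_n (score, ortholog_group) hits for a query reactant pair."""
    scored = [
        (pair_similarity(query, (substrate, product)), group)
        for substrate, product, group in database
    ]
    scored.sort(key=lambda hit: hit[0], reverse=True)
    return scored[:top_n]


# Hypothetical reference entries: (substrate features, product features, ortholog group).
REFERENCE_DB = [
    (frozenset({"NH2", "COOH", "ring"}), frozenset({"C=O", "COOH", "ring"}), "OG_A"),
    (frozenset({"OH", "CH3"}), frozenset({"C=O", "CH3"}), "OG_B"),
]

if __name__ == "__main__":
    query_pair = ({"NH2", "COOH", "ring"}, {"C=O", "COOH", "ring"})
    print(rank_ortholog_groups(query_pair, REFERENCE_DB))
```

Taking the minimum of the two sides means a reference pair ranks highly only if both its substrate and its product resemble the query, which matches the intuition that the same transformation is being compared.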
15.
J Chem Inf Model ; 55(12): 2705-16, 2015 Dec 28.
Article in English | MEDLINE | ID: mdl-26624799

ABSTRACT

The identification of beneficial drug combinations is a challenging issue in pharmaceutical and clinical research toward combinatorial drug therapy. In the present study, we developed a novel computational method for large-scale prediction of beneficial drug combinations using drug efficacy and target profiles. We designed an informative descriptor for each drug-drug pair based on multiple drug profiles representing drug-targeted proteins and Anatomical Therapeutic Chemical Classification System codes. Then, we constructed a predictive model by learning a sparsity-induced classifier based on known drug combinations from the Orange Book and KEGG DRUG databases. Our results show that the proposed method outperforms the previous methods in terms of the accuracy of high-confidence predictions, and the extracted features are biologically meaningful. Finally, we performed a comprehensive prediction of novel drug combinations for 2,639 approved drugs, which predicted 142,988 new potentially beneficial drug-drug pairs. We showed several examples of successfully predicted drug combinations for a variety of diseases.


Subject(s)
Computational Biology; Drug Combinations; Drug Delivery Systems; Drug Repositioning; Databases, Pharmaceutical; Drug Interactions; Humans; Regression Analysis
16.
Bioinformatics ; 31(12): i161-70, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26072478

ABSTRACT

MOTIVATION: Recent advances in mass spectrometry and related metabolomics technologies have enabled the rapid and comprehensive analysis of numerous metabolites. However, biosynthetic and biodegradation pathways are only known for a small portion of metabolites, with most metabolic pathways remaining uncharacterized. RESULTS: In this study, we developed a novel method for supervised de novo metabolic pathway reconstruction with an improved graph alignment-based approach in the reaction-filling framework. We proposed a novel chemical graph alignment algorithm, which we called PACHA (Pairwise Chemical Aligner), to detect the regioisomer-sensitive connectivities between the aligned substructures of two compounds. Unlike other existing graph alignment methods, PACHA can efficiently detect only one common subgraph between two compounds. Our results show that the proposed method outperforms previous descriptor-based methods or existing graph alignment-based methods in the enzymatic reaction-likeness prediction for isomer-enriched reactions. It is also useful for reaction annotation that assigns potential reaction characteristics such as EC (Enzyme Commission) numbers and PIERO (Enzymatic Reaction Ontology for Partial Information) terms to substrate-product pairs. Finally, we conducted a comprehensive enzymatic reaction-likeness prediction for all possible uncharacterized compound pairs, suggesting potential metabolic pathways for newly predicted substrate-product pairs.


Subject(s)
Algorithms; Metabolic Networks and Pathways; Metabolomics/methods; Metabolome
17.
J Bioinform Comput Biol ; 12(6): 1442001, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25385078

ABSTRACT

Genomics is faced with the issue of many partially annotated putative enzyme-encoding genes for which activities have not yet been verified, while metabolomics is faced with the issue of many putative enzyme reactions for which full equations have not been verified. Knowledge of enzymes has been collected by IUBMB, and has been made public as the Enzyme List. To date, however, the terminology of the Enzyme List has not been assessed comprehensively by bioinformatics studies. Instead, most of the bioinformatics studies simply use the identifiers of the enzymes, i.e. the Enzyme Commission (EC) numbers. We investigated the actual usage of terminology throughout the Enzyme List, and demonstrated that the partial characteristics of reactions cannot be retrieved by simply using EC numbers. Thus, we developed a novel ontology, named PIERO, for annotating biochemical transformations as follows. First, the terminology describing enzymatic reactions was retrieved from the Enzyme List, and was grouped into those related to overall reactions and biochemical transformations. Consequently, these terms were mapped onto the actual transformations taken from enzymatic reaction equations. This ontology was linked to Gene Ontology (GO) and EC numbers, allowing the extraction of common partial reaction characteristics from given sets of orthologous genes and the elucidation of possible enzymes from the given transformations. Further future development of the PIERO ontology should enhance the Enzyme List to promote the integration of genomics and metabolomics.


Subject(s)
Biological Ontologies; Databases, Protein; Enzymes/chemistry; Enzymes/classification; Information Storage and Retrieval/methods; Terminology as Topic; Enzymes/genetics; Natural Language Processing
18.
J Chem Inf Model ; 54(6): 1558-66, 2014 Jun 23.
Article in English | MEDLINE | ID: mdl-24897372

ABSTRACT

In recent years, the Semantic Web has become the focus of life science database development as a means to link life science data in an effective and efficient manner. In order for carbohydrate data to be applied to this new technology, there are two requirements for carbohydrate data representations: (1) a linear notation which can be used as a URI (Uniform Resource Identifier) if needed and (2) a unique notation such that any published glycan structure can be represented distinctively. This latter requirement includes the possible representation of nonstandard monosaccharide units as a part of the glycan structure, as well as compositions, repeating units, and ambiguous structures where linkages/linkage positions are unidentified. Therefore, we have developed the Web3 Unique Representation of Carbohydrate Structures (WURCS) as a new linear notation for representing carbohydrates for the Semantic Web.


Subject(s)
Carbohydrates/chemistry; Databases, Chemical; Carbohydrate Sequence; Internet; Models, Molecular; Molecular Sequence Data; Software
19.
Bioinformatics ; 30(12): i165-74, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-24931980

ABSTRACT

MOTIVATION: Metabolic pathway analysis is crucial not only in metabolic engineering but also in rational drug design. However, the biosynthetic/biodegradation pathways are known only for a small portion of metabolites, and a vast number of pathways remain uncharacterized. Therefore, an important challenge in metabolomics is the de novo reconstruction of potential reaction networks on a metabolome-scale. RESULTS: In this article, we develop a novel method to predict the multistep reaction sequences for de novo reconstruction of metabolic pathways in the reaction-filling framework. We propose a supervised approach to learn what we refer to as 'multistep reaction sequence likeness', i.e. whether the two compounds of a compound-compound pair can possibly be converted into each other by a sequence of enzymatic reactions. In the algorithm, we propose a recursive procedure of using step-specific classifiers to predict the intermediate compounds in the multistep reaction sequences, based on chemical substructure fingerprints/descriptors of compounds. We further demonstrate the usefulness of our proposed method on the prediction of enzymatic reaction networks from a metabolome-scale compound set and discuss characteristic features of the extracted chemical substructure transformation patterns in multistep reaction sequences. Our comprehensively predicted reaction networks help to fill the metabolic gap and to infer new reaction sequences in metabolic pathways. AVAILABILITY AND IMPLEMENTATION: Materials are available for free at http://web.kuicr.kyoto-u.ac.jp/supp/kot/ismb2014/


Subject(s)
Metabolic Networks and Pathways; Metabolome; Metabolomics/methods; Algorithms; Support Vector Machine
20.
Nucleic Acids Res ; 42(Web Server issue): W39-45, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24838565

ABSTRACT

DINIES (drug-target interaction network inference engine based on supervised analysis) is a web server for predicting unknown drug-target interaction networks from various types of biological data (e.g. chemical structures, drug side effects, amino acid sequences and protein domains) in the framework of supervised network inference. The originality of DINIES lies in prediction with state-of-the-art machine learning methods, in the integration of heterogeneous biological data and in compatibility with the KEGG database. The DINIES server accepts any 'profiles' or precalculated similarity matrices (or 'kernels') of drugs and target proteins in tab-delimited file format. When a training data set is submitted to learn a predictive model, users can select either known interaction information in the KEGG DRUG database or their own interaction data. The user can also select an algorithm for supervised network inference, select various parameters in the method and specify weights for heterogeneous data integration. The server can provide integrative analyses with useful components in KEGG, such as biological pathways, functional hierarchy and human diseases. DINIES (http://www.genome.jp/tools/dinies/) is publicly available as one of the genome analysis tools in GenomeNet.


Subject(s)
Artificial Intelligence; Drug Discovery; Proteins/chemistry; Software; Algorithms; Humans; Internet; Pharmaceutical Preparations/chemistry; Protein Structure, Tertiary; Proteins/drug effects; Sequence Analysis, Protein
...