Results 1 - 6 of 6
1.
BMC Bioinformatics ; 20(1): 380, 2019 Jul 09.
Article in English | MEDLINE | ID: mdl-31288752

ABSTRACT

BACKGROUND: Alkaloids, a class of organic compounds that contain nitrogen bases, are mainly synthesized as secondary metabolites in plants and fungi, and they have a wide range of bioactivities. Although there are thousands of compounds in this class, few of their biosynthesis pathways are fully identified. In this study, we constructed a model to predict their precursors based on a novel kind of neural network called the molecular graph convolutional neural network. Molecular similarity is a crucial metric in the analysis of qualitative structure-activity relationships. However, it is sometimes difficult for current fingerprint representations to emphasize specific features for the target problems efficiently. It is advantageous to allow the model to select the appropriate features according to data-driven decisions for extracting more useful information, which influences a classification or regression problem substantially.
RESULTS: In this study, we applied a neural network architecture for undirected graph representation of molecules. By encoding a molecule as an abstract graph and applying "convolution" on the graph and training the weight of the neural network framework, the neural network can optimize feature selection for the training problem. By incorporating the effects from adjacent atoms recursively, graph convolutional neural networks can extract the features of latent atoms that represent chemical features of a molecule efficiently. In order to investigate alkaloid biosynthesis, we trained the network to distinguish the precursors of 566 alkaloids, which are almost all of the alkaloids whose biosynthesis pathways are known, and showed that the model could predict starting substances with an averaged accuracy of 97.5%.
CONCLUSION: We have shown that our model predicts more accurately than the random forest and general neural network when the variables and fingerprints are not selected, while the performance is comparable when we carefully select 507 variables from 18000 dimensions of descriptors. The prediction of pathways contributes to the understanding of alkaloid synthesis mechanisms, and the application of graph-based neural network models to similar problems in bioinformatics would therefore be beneficial. We applied our model to evaluate the biosynthetic precursors of 12000 alkaloids found in various organisms and found a power-law-like distribution.
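The neighbor aggregation described in the abstract above can be illustrated with a minimal sketch. It is not the authors' architecture: the toy molecule, the one-hot atom features, the identity weight matrix, and the function name `graph_conv_step` are all assumptions made for illustration. Each atom's new feature vector is the ReLU of a shared linear map applied to the sum of its own features and those of its bonded neighbors.

```python
# Illustrative sketch only: one neighbor-aggregation ("graph convolution") step
# on an undirected molecular graph. This is NOT the architecture from the paper.

def graph_conv_step(features, bonds, weight):
    """Return updated per-atom feature vectors.

    features: list of per-atom feature vectors (lists of floats)
    bonds:    list of undirected (i, j) atom-index pairs
    weight:   d_in x d_out matrix (list of rows), shared by all atoms
    """
    n = len(features)
    d_in = len(features[0])
    d_out = len(weight[0])

    # Build neighbor lists; each atom also includes itself (self-loop).
    neighbors = [[i] for i in range(n)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    updated = []
    for i in range(n):
        # Sum feature vectors over the atom and its bonded neighbors.
        summed = [sum(features[k][c] for k in neighbors[i]) for c in range(d_in)]
        # Apply the shared linear map followed by ReLU.
        out = [max(0.0, sum(summed[c] * weight[c][o] for c in range(d_in)))
               for o in range(d_out)]
        updated.append(out)
    return updated


# Toy molecule (hypothetical input): heavy atoms of ethylamine, C-C-N,
# with one-hot [is_carbon, is_nitrogen] features.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 2)]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights; a real model learns these

h1 = graph_conv_step(features, bonds, identity)  # information from 1 bond away
h2 = graph_conv_step(h1, bonds, identity)        # information from 2 bonds away
```

After one step the terminal carbon (atom 0) carries no nitrogen signal; after the second step it does, which is the recursive effect of adjacent atoms that the abstract refers to. In a trained network the weight matrix would be learned rather than fixed.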


Subject(s)
Alkaloids/classification , Biosynthetic Pathways , Neural Networks, Computer , Algorithms , Alkaloids/chemistry , Metabolome , Models, Theoretical
2.
Biomed Res Int ; 2015: 540297, 2015.
Article in English | MEDLINE | ID: mdl-26491677

ABSTRACT

Recently, biology has become a data-intensive science because of the huge datasets produced by high-throughput molecular biological experiments in diverse areas, including genomics, transcriptomics, proteomics, and metabolomics. These huge datasets have paved the way for system-level analysis of the processes and subprocesses of the cell. For system-level understanding, the elements of a system are first connected based on their mutual relations to form a network. Among omics researchers, the construction and analysis of biological networks have become highly popular. In this review, we briefly discuss both the biological background and the topological properties of the major types of omics networks to facilitate a comprehensive understanding and to conceptualize the foundation of network biology.


Subject(s)
Gene Regulatory Networks , Models, Biological
3.
Biomed Res Int ; 2015: 139254, 2015.
Article in English | MEDLINE | ID: mdl-26495281

ABSTRACT

Volatile organic compounds (VOCs) are small molecules that exhibit high vapor pressure under ambient conditions and have low boiling points. Although VOCs contribute only a small proportion of the total metabolites produced by living organisms, they play an important role in chemical ecology, specifically in the biological interactions between organisms and ecosystems. VOCs are also important in the health care field, where they are currently used as biomarkers to detect various human diseases. Information on VOCs has so far been scattered across the literature, and no database describing VOCs and their biological activities has been available. To fill this gap, we have developed the KNApSAcK Metabolite Ecology Database, which contains information on the relationships between VOCs and their emitting organisms. KNApSAcK Metabolite Ecology is also linked with the KNApSAcK Core and KNApSAcK Metabolite Activity Database to provide further information on the metabolites and their biological activities. The VOC database can be accessed online.


Subject(s)
Data Mining/methods , Database Management Systems , Databases, Chemical , Periodicals as Topic , Volatile Organic Compounds/chemistry , Volatile Organic Compounds/metabolism , Natural Language Processing , Pattern Recognition, Automated/methods , Volatile Organic Compounds/classification
4.
Biomed Res Int ; 2014: 154594, 2014.
Article in English | MEDLINE | ID: mdl-24800208

ABSTRACT

This work presents a novel approach to predicting functional relations between genes using gene expression data. Genes may be related in various ways: for example, through regulatory relations, or through involvement in the same protein complex or the same metabolic or signaling pathway. Gene expression data should contain some clues to such relations. The present approach first digitizes the log-ratio type gene expression data of S. cerevisiae into a matrix of 1, 0, and -1, indicating highly expressed, no major change, and highly suppressed conditions for genes, respectively. For each gene pair, a probability mass function table is constructed indicating nine joint probabilities. Gene pairs are then selected based on linear and probabilistic relations between their profiles, indicated by the sum of probability masses at selected points. The selected gene pairs share many Gene Ontology terms. Furthermore, a network is constructed by selecting a large number of gene pairs based on FDR analysis, and clustering of the network generates many modules rich in genes of similar function. Also, the promoters of the gene sets in many modules are rich in binding sites of known transcription factors, indicating the effectiveness of the proposed approach in predicting regulatory relations.
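The digitization and the nine-entry joint probability table described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the cutoff of 1.0, the toy expression values, the function names `digitize` and `joint_table`, and the choice of the two same-direction cells as the "selected points" are all hypothetical.

```python
from collections import Counter

def digitize(log_ratios, threshold=1.0):
    """Map each log-ratio to 1 (highly expressed), -1 (highly suppressed),
    or 0 (no major change). The threshold is an assumed value."""
    return [1 if x >= threshold else -1 if x <= -threshold else 0
            for x in log_ratios]

def joint_table(profile_a, profile_b):
    """Return the 3x3 table of joint probabilities for two digitized profiles,
    keyed by (state_of_a, state_of_b)."""
    counts = Counter(zip(profile_a, profile_b))
    n = len(profile_a)
    states = (-1, 0, 1)
    return {(a, b): counts[(a, b)] / n for a in states for b in states}

# Toy log-ratio profiles for two genes across eight conditions (made-up numbers).
gene_a = digitize([2.1, 1.4, 0.2, -1.8, -0.1, 1.2, -2.5, 0.3])
gene_b = digitize([1.7, 1.1, -0.3, -1.2, 0.4, 0.2, -1.9, 0.1])

table = joint_table(gene_a, gene_b)

# Probability mass where both genes change in the same direction; which cells
# the paper actually sums is not stated in the abstract.
co_regulated_mass = table[(1, 1)] + table[(-1, -1)]
```

A gene pair would then be kept or discarded by comparing a sum such as `co_regulated_mass` against a cutoff; the abstract states that the final selection is based on FDR analysis, which this sketch does not attempt.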


Subject(s)
Artificial Intelligence , Gene Expression Regulation/physiology , Models, Biological , Protein Interaction Mapping/methods , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Transcription Factors/metabolism , Computer Simulation , Multigene Family/physiology , Pattern Recognition, Automated/methods , Signal Transduction/physiology
5.
Mol Inform ; 33(11-12): 790-801, 2014 Dec.
Article in English | MEDLINE | ID: mdl-27485425

ABSTRACT

Developing database systems that connect diverse species based on omics is the most important theme in big data biology. To this end, we have developed the KNApSAcK Family Databases, which have been used in a number of metabolomics studies. In the present study, we developed a network-based approach to analyzing relationships between the 3D structure and biological activity of metabolites. It consists of four steps: construction of a network of metabolites based on structural similarity (Step 1), classification of metabolites into structure groups (Step 2), assessment of statistically significant relations between structure groups and biological activities (Step 3), and 2-dimensional clustering of the constructed data matrix based on statistically significant relations between structure groups and biological activities (Step 4). Applying this method to a data set consisting of 2072 secondary metabolites and 140 biological activities reported in KNApSAcK Metabolite Activity DB, we obtained 983 statistically significant structure group-biological activity pairs. Overall, we systematically analyzed the relationship between the 3D chemical structures of metabolites and their biological activities.
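Steps 1 to 3 of the approach above can be sketched on toy data. Everything specific here is an assumption, since the abstract does not name the similarity measure, the grouping method, or the statistical test: the sketch uses Tanimoto similarity on made-up feature sets, connected components as structure groups, and a one-sided hypergeometric test. The names `tanimoto`, `structure_groups`, and `enrichment_p` are hypothetical.

```python
from itertools import combinations
from math import comb

def tanimoto(a, b):
    """Tanimoto similarity between two non-empty feature sets (assumed measure)."""
    return len(a & b) / len(a | b)

def structure_groups(fingerprints, threshold):
    """Steps 1-2: link metabolites whose similarity reaches the threshold,
    then return the connected components as structure groups."""
    parent = {name: name for name in fingerprints}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=sorted)

def enrichment_p(group, active, population):
    """Step 3: probability of seeing at least the observed overlap between a
    group and the active set under a hypergeometric null (assumed test)."""
    N, K, n = len(population), len(active), len(group)
    k = len(group & active)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)

# Made-up fingerprints for six hypothetical metabolites.
fingerprints = {
    "m1": {1, 2, 3}, "m2": {1, 2, 3, 4}, "m3": {2, 3, 4},
    "m4": {7, 8},    "m5": {7, 8, 9},    "m6": {8, 9},
}
groups = structure_groups(fingerprints, threshold=0.6)

# Hypothetical annotation: metabolites reported with some activity.
active = {"m1", "m2", "m3", "m4"}
p_values = [enrichment_p(g, active, set(fingerprints)) for g in groups]
```

Group-activity pairs whose p-value falls below a chosen cutoff would fill the data matrix that Step 4 clusters; a toy set of six metabolites is too small for any pair to reach conventional significance.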

6.
Plant Cell Physiol ; 54(5): 728-39, 2013 May.
Article in English | MEDLINE | ID: mdl-23574698

ABSTRACT

Metabolomics analysis tools can provide quantitative information on the concentration of metabolites in an organism. In this paper, we propose the minimum pathway model generator tool for simulating the dynamics of metabolite concentrations (SS-mPMG) and a tool for parameter estimation by genetic algorithm (SS-GA). SS-mPMG can extract a subsystem of the metabolic network from the genome-scale pathway maps to reduce the complexity of the simulation model and automatically construct a dynamic simulator to evaluate the experimentally observed behavior of metabolites. Using this tool, we show that stochastic simulation can reproduce experimentally observed dynamics of amino acid biosynthesis in Arabidopsis thaliana. In this simulation, SS-mPMG extracts the metabolic network subsystem from published databases. The parameters needed for the simulation are determined using a genetic algorithm to fit the simulation results to the experimental data. We expect that SS-mPMG and SS-GA will help researchers to create relevant metabolic networks and carry out simulations of metabolic reactions derived from metabolomics data.
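To give a sense of what stochastic simulation of metabolite dynamics involves, here is a minimal sketch of a Gillespie-style simulation of a two-step linear pathway. It is not SS-mPMG: the abstract does not say which stochastic algorithm the tool uses, and the pathway A -> B -> C, the rate constants, the initial counts, and the function name `gillespie` are all assumptions. Parameter fitting by genetic algorithm (the role of SS-GA) is not shown.

```python
import random

def gillespie(counts, reactions, t_end, rng):
    """Simulate first-order reactions until t_end.

    counts:    dict of initial molecule counts per metabolite
    reactions: list of (rate_constant, substrate, product) tuples
    Returns a list of (time, counts) snapshots, one per reaction event.
    """
    t = 0.0
    state = dict(counts)
    trajectory = [(t, dict(state))]
    while True:
        # Propensity of each reaction: rate constant times substrate count.
        propensities = [k * state[s] for k, s, _ in reactions]
        total = sum(propensities)
        if total == 0.0:
            break  # no substrate left for any reaction
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick the reaction that fires, weighted by propensity.
        _, substrate, product = rng.choices(reactions, weights=propensities)[0]
        state[substrate] -= 1
        state[product] += 1
        trajectory.append((t, dict(state)))
    return trajectory

# Hypothetical pathway A -> B -> C with made-up rate constants.
trajectory = gillespie(
    counts={"A": 100, "B": 0, "C": 0},
    reactions=[(1.0, "A", "B"), (0.5, "B", "C")],
    t_end=5.0,
    rng=random.Random(0),  # seeded so repeated runs give the same trajectory
)
```

Averaging many such trajectories and comparing them with measured concentrations is the kind of objective a genetic algorithm could minimize over the rate constants.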


Subject(s)
Algorithms , Arabidopsis/metabolism , Computer Simulation , Metabolic Networks and Pathways , Metabolomics , Kinetics , Models, Biological , Principal Component Analysis , Stochastic Processes
...