Results 1 - 3 of 3
1.
J Comput Aided Mol Des ; 31(4): 379-391, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28281211

ABSTRACT

The aim of computational molecular design is to identify promising hypothetical molecules with a predefined set of desired properties. We address the problem of accelerating materials discovery with state-of-the-art machine learning techniques. The method involves two types of prediction: forward and backward. The objective of the forward prediction is to create a set of machine learning models that predict various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned on a desired property requirement. By exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can be created computationally. One major difficulty in the computational creation of molecules is excluding chemically unfavorable structures. To address this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are removed. The method is demonstrated through the design of small organic molecules with property requirements on the HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
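
The inversion described in this abstract can be written compactly. The notation below is an assumption made for illustration and does not appear in the abstract: S denotes a SMILES string, Y its vector of properties, and U the desired property region.

```latex
% Backward prediction as Bayesian inversion of the forward models.
% S, Y and U are illustrative symbols, not taken from the abstract.
p(S \mid Y \in U) \;\propto\; p(Y \in U \mid S)\, p(S)
```

Under this reading, the likelihood p(Y ∈ U | S) comes from the trained forward models, and the prior p(S) is supplied by the chemical language model, which assigns low probability to chemically unfavorable strings. Sequential Monte Carlo is then used to explore the high-probability regions of the left-hand side.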


Subject(s)
Computer-Aided Design , Machine Learning , Small Molecule Libraries/chemistry , Bayes Theorem , Computer Simulation , Drug Design , Chemical Models , Molecular Structure , Monte Carlo Method , Pharmaceutical Preparations/chemistry , Software
2.
Bioinformatics ; 31(10): 1561-8, 2015 May 15.
Article in English | MEDLINE | ID: mdl-25583120

ABSTRACT

MOTIVATION: The motif discovery problem consists of finding recurring patterns of short strings in a set of nucleotide sequences. This classical problem is receiving renewed attention because most early motif discovery methods cannot handle the large datasets produced by recent genome-wide ChIP studies. New ChIP-tailored methods focus on reducing computation time and pay little regard to the accuracy of motif detection. Unlike such methods, our method focuses on increasing the detection accuracy while keeping the computational efficiency at an acceptable level. The major advantage of our method is that it can mine diverse multiple motifs that are undetectable by current methods. RESULTS: The repulsive parallel Markov chain Monte Carlo (RPMCMC) algorithm that we propose is a parallel version of the widely used Gibbs motif sampler. RPMCMC runs on parallel interacting motif samplers. A repulsive force is generated when motifs produced by different samplers come close to each other. Thus, different samplers explore different motifs. In this way, we can detect far more diverse motifs than conventional methods can. Through application to 228 transcription factor ChIP-seq datasets of the ENCODE project, we show that the RPMCMC algorithm can find many reliable cofactor interacting motifs that existing methods are unable to discover.
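
The repulsion idea can be illustrated with a small sketch. The abstract does not give the functional form of the repulsive force, so everything below is an assumption: motifs are represented as position weight matrices of equal shape, closeness is measured by Euclidean distance, and the hypothetical `repulsive_log_penalty` uses a Gaussian kernel. It is not the RPMCMC algorithm itself.

```python
import numpy as np


def motif_distance(pwm_a, pwm_b):
    """Euclidean distance between two position weight matrices of equal shape."""
    diff = np.asarray(pwm_a, dtype=float) - np.asarray(pwm_b, dtype=float)
    return float(np.linalg.norm(diff))


def repulsive_log_penalty(pwm, other_pwms, strength=1.0, scale=1.0):
    """Hypothetical log-scale penalty (illustrative, not from the paper).

    Each other sampler's motif contributes a Gaussian-kernel term that is
    largest when the two motifs coincide and decays toward zero as they
    move apart, so the penalty is most negative for near-duplicate motifs.
    """
    total = 0.0
    for other in other_pwms:
        d = motif_distance(pwm, other)
        total += np.exp(-(d ** 2) / (2.0 * scale ** 2))
    return -strength * total
```

In a sampler built this way, the penalty would be added to a chain's log-likelihood before its update, so that a chain is discouraged from settling on a motif another chain already holds.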


Subject(s)
Algorithms , Nucleotide Motifs/genetics , Transcriptional Regulatory Elements , DNA Sequence Analysis/methods , Transcription Factors/metabolism , Chromatin Immunoprecipitation , Humans , Markov Chains , Monte Carlo Method , Genetic Promoter Regions
3.
Bioinformatics ; 30(12): i43-51, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-24932004

ABSTRACT

MOTIVATION: Automated fluorescence microscopes produce massive numbers of images of cells, often in four dimensions of space and time. This study addresses two tasks of time-lapse imaging analysis: detection and tracking of the many imaged cells. It is especially intended for 4D live-cell imaging of neuronal nuclei of Caenorhabditis elegans. The cells of interest appear as slightly deformed ellipsoids. They are densely distributed and move rapidly in a series of 3D images. Thus, existing tracking methods often fail because more than one tracker follows the same target, or a tracker switches from one target to another during rapid moves. RESULTS: The present method begins by performing kernel density estimation to convert each 3D image into a smooth, continuous function. The cell bodies in the image are assumed to lie in the regions near the multiple local maxima of the density function. The tasks of detecting and tracking the cells are then addressed with two hill-climbing algorithms. The positions of the trackers are initialized by applying the cell-detection method to the image in the first frame. The tracking method keeps attracting them toward the local maxima in each subsequent image. To prevent a tracker from following multiple cells, we use a Markov random field (MRF) to model the spatial and temporal covariation of the cells and to maximize the image forces and the MRF-induced constraint on the trackers. The tracking procedure is demonstrated with dynamic 3D images that each contain >100 neurons of C. elegans. AVAILABILITY: http://daweb.ism.ac.jp/yoshidalab/crest/ismb2014 SUPPLEMENTARY INFORMATION: Supplementary data are available at http://daweb.ism.ac.jp/yoshidalab/crest/ismb2014
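
Hill climbing on a kernel density estimate can be sketched as follows. The abstract does not specify which hill-climbing algorithms are used, so this example substitutes the standard mean-shift update with a Gaussian kernel. The function names, the `bandwidth` parameter, and the use of per-point `weights` (for example voxel intensities) are assumptions made for illustration, and the MRF constraint is not modeled here.

```python
import numpy as np


def mean_shift_step(x, points, weights, bandwidth):
    """One mean-shift update: move x to the kernel-weighted mean of the points.

    With a Gaussian kernel this step never decreases the value of the
    weighted kernel density estimate, so repeating it climbs toward a
    local maximum.
    """
    d2 = np.sum((points - x) ** 2, axis=1)
    k = weights * np.exp(-0.5 * d2 / bandwidth ** 2)
    return (k[:, None] * points).sum(axis=0) / k.sum()


def climb(x0, points, weights, bandwidth, tol=1e-6, max_iter=500):
    """Iterate mean-shift steps from x0 until the move is smaller than tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, points, weights, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

For detection, one could start `climb` from many seed positions in the first frame and keep the distinct end points. For tracking, each tracker's position in the previous frame would serve as `x0` for the next frame.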


Subject(s)
Cell Tracking/methods , Three-Dimensional Imaging/methods , Algorithms , Animals , Caenorhabditis elegans/cytology , Confocal Microscopy , Fluorescence Microscopy , Neurons/cytology , Time-Lapse Imaging