1.
BMC Genomics ; 18(1): 749, 2017 Sep 22.
Article in English | MEDLINE | ID: mdl-28938868

ABSTRACT

BACKGROUND: A genomic signal track is a set of genomic intervals associated with values of various types, such as measurements from high-throughput experiments. Analysis of signal tracks requires complex computational methods, which often make analysts focus on the detailed computational steps rather than on their biological questions.

RESULTS: Here we propose Signal Track Query Language (STQL) for simple analysis of signal tracks. It is a Structured Query Language (SQL)-like declarative language: one specifies only what computations need to be done, not how they are to be carried out. STQL provides a rich set of constructs for manipulating genomic intervals and their values. To run STQL queries, we have developed the Signal Track Analytical Research Tool (START, http://yiplab.cse.cuhk.edu.hk/start/), a system that includes a Web-based user interface and a back-end execution system. The user interface helps users select data from our database of around 10,000 commonly used public signal tracks, manage their own tracks, and construct, store and share STQL queries. The back-end system automatically translates STQL queries into optimized low-level programs and runs them in parallel on a computer cluster. We use STQL to perform 14 representative analytical tasks. By repeating these analyses using bedtools, Galaxy and custom Python scripts, we show that the STQL solution is usually the simplest, and that parallel execution achieves significant speed-up with large data files. Finally, we describe how a biologist with minimal formal training in computer programming self-learned STQL to analyze DNA methylation data we produced from 60 pairs of hepatocellular carcinoma (HCC) samples.

CONCLUSIONS: Overall, STQL and START provide a generic and easy way to analyze a large number of genomic signal tracks in parallel.
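As a rough illustration of the low-level work that a declarative query is meant to hide, the following Python sketch averages a signal track over a set of regions, weighting each signal interval by its overlap length. This is not STQL and not the START back end: the `Interval` type, the function name `mean_signal_over`, and the half-open coordinate convention are assumptions made only for this example, and the quadratic scan is chosen for readability rather than speed.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical signal-track record (not a START data structure)."""
    chrom: str
    start: int    # assumed 0-based, inclusive
    end: int      # assumed exclusive
    value: float


def mean_signal_over(regions: list[Interval],
                     signal: list[Interval]) -> list[Optional[float]]:
    """Overlap-weighted mean signal value for each region.

    Returns None for a region that no signal interval overlaps.
    Assumes the signal intervals do not overlap one another.
    """
    means: list[Optional[float]] = []
    for region in regions:
        covered = 0      # number of region bases covered by signal
        total = 0.0      # sum of value * overlap length
        for s in signal:
            if s.chrom != region.chrom:
                continue
            overlap = min(region.end, s.end) - max(region.start, s.start)
            if overlap > 0:
                covered += overlap
                total += s.value * overlap
        means.append(total / covered if covered else None)
    return means
```

Every region is compared against every signal interval here; a production tool would typically sort or index the intervals first, which is the kind of optimization the abstract says the back end performs automatically.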


Subject(s)
Genomics/methods; Programming Languages; Carcinoma, Hepatocellular/genetics; Humans; Liver Neoplasms/genetics
2.
Int J Bioinform Res Appl ; 3(1): 42-64, 2007.
Article in English | MEDLINE | ID: mdl-18048172

ABSTRACT

We study the problem of pattern-based subspace clustering. Clustering by pattern similarity finds objects that exhibit a coherent pattern of rises and falls in subspaces. Applications of pattern-based subspace clustering include DNA micro-array data analysis. Our goal is to devise pattern-based clustering methods that are capable of discovering useful patterns of various shapes and discovering all significant patterns. Our approach is to extend the idea of the Order-Preserving Submatrix (OPSM). We devise a novel algorithm for mining OPSMs, show that OPSM can be generalised to cover most existing pattern-based clustering models, and propose a number of extensions to the original OPSM model.
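For readers unfamiliar with the term, a submatrix is order-preserving when some ordering of its columns makes the values in every selected row strictly increasing. The Python sketch below only checks that property for a given candidate set of rows and columns; it is not the mining algorithm the abstract refers to, and the function names and the assumption that values within a row are distinct are ours.

```python
def column_order(row: list[float], cols: list[int]) -> tuple[int, ...]:
    """Column indices from `cols`, sorted by the row's value in each column."""
    return tuple(sorted(cols, key=lambda c: row[c]))


def is_order_preserving(matrix: list[list[float]],
                        rows: list[int],
                        cols: list[int]) -> bool:
    """True if all selected rows rank the selected columns identically.

    Assumes values within each row are distinct on `cols`; ties would be
    broken by position in `cols`, which may not be the intended semantics.
    """
    if not rows:
        return True
    reference = column_order(matrix[rows[0]], cols)
    return all(column_order(matrix[r], cols) == reference for r in rows[1:])
```

In an expression matrix with genes as rows and conditions as columns, a `True` result means the chosen genes rise and fall in the same order across the chosen conditions, even if their absolute levels differ.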


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Gene Expression Regulation; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Artificial Intelligence; Cluster Analysis; DNA/chemistry; Data Interpretation, Statistical; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Models, Genetic; Models, Statistical; Pattern Recognition, Automated