Your browser doesn't support javascript.
loading
Exploration and retrieval of whole-metagenome sequencing samples.
Seth, Sohan; Välimäki, Niko; Kaski, Samuel; Honkela, Antti.
Afiliação
  • Seth S; Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, De
  • Välimäki N; Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, De
  • Kaski S; Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, De
  • Honkela A; Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, De
Bioinformatics ; 30(17): 2471-9, 2014 Sep 01.
Article em En | MEDLINE | ID: mdl-24845653
ABSTRACT
MOTIVATION Over the recent years, the field of whole-metagenome shotgun sequencing has witnessed significant growth owing to the high-throughput sequencing technologies that allow sequencing genomic samples cheaper, faster and with better coverage than before. This technical advancement has initiated the trend of sequencing multiple samples in different conditions or environments to explore the similarities and dissimilarities of the microbial communities. Examples include the human microbiome project and various studies of the human intestinal tract. With the availability of ever larger databases of such measurements, finding samples similar to a given query sample is becoming a central operation.

RESULTS:

In this article, we develop a content-based exploration and retrieval method for whole-metagenome sequencing samples. We apply a distributed string mining framework to efficiently extract all informative sequence k-mers from a pool of metagenomic samples and use them to measure the dissimilarity between two samples. We evaluate the performance of the proposed approach on two human gut metagenome datasets as well as human microbiome project metagenomic samples. We observe significant enrichment for diseased gut samples in results of queries with another diseased sample and high accuracy in discriminating between different body sites even though the method is unsupervised. AVAILABILITY AND IMPLEMENTATION A software implementation of the DSM framework is available at https//github.com/HIITMetagenomics/dsm-framework. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Assuntos

Texto completo: 1 Base de dados: MEDLINE Assunto principal: Metagenômica Limite: Humans Idioma: En Ano de publicação: 2014 Tipo de documento: Article

Texto completo: 1 Base de dados: MEDLINE Assunto principal: Metagenômica Limite: Humans Idioma: En Ano de publicação: 2014 Tipo de documento: Article