Kmerator Suite: design of specific k-mer signatures and automatic metadata discovery in large RNA-seq datasets.
Riquier, Sébastien; Bessiere, Chloé; Guibert, Benoit; Bouge, Anne-Laure; Boureux, Anthony; Ruffle, Florence; Audoux, Jérôme; Gilbert, Nicolas; Xue, Haoliang; Gautheret, Daniel; Commes, Thérèse.
Affiliation
  • Riquier S; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
  • Bessiere C; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
  • Guibert B; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
  • Bouge AL; SeqOne, 34000, Montpellier, France.
  • Boureux A; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
  • Ruffle F; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
  • Audoux J; SeqOne, 34000, Montpellier, France.
  • Gilbert N; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
  • Xue H; Institute for Integrative Biology of the Cell, CEA, CNRS, Université Paris-Saclay, 91198, Gif sur Yvette, France.
  • Gautheret D; Institute for Integrative Biology of the Cell, CEA, CNRS, Université Paris-Saclay, 91198, Gif sur Yvette, France.
  • Commes T; IRMB, University of Montpellier, INSERM, 80 rue Augustin Fliche, 34295, Montpellier, France.
NAR Genom Bioinform ; 3(3): lqab058, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34179780
ABSTRACT
The huge body of publicly available RNA-sequencing (RNA-seq) libraries is a treasure trove of functional information that allows the expression of known or novel transcripts to be quantified in tissues. However, transcript quantification commonly relies on alignment methods that require substantial computational resources and processing time and do not scale easily to large datasets. K-mer decomposition constitutes a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to quantify gene expression accurately in a less resource-consuming way. We present the Kmerator Suite, a set of three tools designed to extract specific k-mer signatures, quantify these k-mers in RNA-seq datasets and quickly visualize large dataset characteristics. The core tool, Kmerator, produces specific k-mers for 97% of human genes, enabling the measurement of gene expression with high accuracy in simulated datasets. KmerExploR, a direct application of Kmerator, uses a set of predictor gene-specific k-mers to infer metadata, including library protocol, sample features or contaminations, from RNA-seq datasets. KmerExploR results are visualized through a user-friendly interface. Moreover, we demonstrate that the Kmerator Suite can be used for advanced queries targeting known or new biomarkers such as mutations, gene fusions or long non-coding RNAs for human health applications.
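The idea of a gene-specific k-mer signature can be illustrated with a toy sketch. The code below is not the Kmerator implementation: the function names (`kmers`, `specific_kmers`, `count_signature`), the toy sequences and the choice of k = 4 are all assumptions made for illustration, and details that a real tool would need (reverse complements, a whole genome and transcriptome as background, sequencing errors) are deliberately ignored. It decomposes sequences into overlapping k-mers, keeps those found in a target sequence but in no background sequence, and counts them in a set of reads without any alignment step.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order of appearance."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def specific_kmers(target, others, k):
    """Return k-mers of `target` absent from every sequence in `others`.

    Hypothetical helper: the background here is a plain list of
    sequences, a stand-in for a real reference.
    """
    background = set()
    for seq in others:
        background.update(kmers(seq, k))

    seen = set()
    signature = []
    for km in kmers(target, k):
        # Keep first occurrences only, preserving order.
        if km not in background and km not in seen:
            seen.add(km)
            signature.append(km)
    return signature


def count_signature(reads, signature, k):
    """Count how often each signature k-mer occurs across the reads."""
    wanted = set(signature)
    counts = Counter()
    for read in reads:
        for km in kmers(read, k):
            if km in wanted:
                counts[km] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences (assumed, not real genes) sharing a common prefix.
    gene_a = "ACGTACGGA"
    gene_b = "ACGTACCTT"
    k = 4  # far shorter than any realistic choice; illustration only

    signature = specific_kmers(gene_a, [gene_b], k)
    reads = ["GTACGGA", "ACGTACC"]
    print(signature)
    print(dict(count_signature(reads, signature, k)))
```

In this sketch, k-mers from the prefix shared by the two toy sequences are discarded, so only reads covering the distinguishing region of `gene_a` contribute to its counts.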

Full text: 1 Collection: 01-international Database: MEDLINE Study type: Prognostic_studies Language: En Journal: NAR Genom Bioinform Year: 2021 Document type: Article Affiliation country: France
