2-kupl: mapping-free variant detection from DNA-seq data of matched samples.
Wang, Yunfeng; Xue, Haoliang; Pourcel, Christine; Du, Yang; Gautheret, Daniel.
  • Wang Y; Institute of Integrative Cell Biology (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France.
  • Xue H; Annoroad Gene Technology Co., Ltd, Beijing, 100176, China.
  • Pourcel C; Institute of Integrative Cell Biology (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France.
  • Du Y; Institute of Integrative Cell Biology (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France.
  • Gautheret D; Annoroad Gene Technology Co., Ltd, Beijing, 100176, China.
BMC Bioinformatics ; 22(1): 304, 2021 Jun 05.
Article in English | MEDLINE | ID: mdl-34090332
ABSTRACT

BACKGROUND:

The detection of genome variants, including point mutations, indels and structural variants, is a fundamental and challenging computational problem. Here we address the problem of variant detection between two deep-sequencing (DNA-seq) samples, such as two human samples from an individual patient, or two samples from distinct bacterial strains. The preferred strategy in such a case is to align each sample to a common reference genome, collect all variants and compare these variants between samples. Such mapping-based protocols have several limitations. DNA sequences with large indels, aggregated mutations and structural variants are hard to map to the reference. Furthermore, DNA sequences cannot be mapped reliably to low-complexity genomic regions and repeats.
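
As an illustration of the final step of the mapping-based strategy described above, the comparison of per-sample variant calls can be sketched as a set operation. This is only a minimal sketch: the function name `compare_variant_calls` and the `(chromosome, position, ref, alt)` tuple representation are assumptions made for this example, and the variant calls themselves would come from a separate alignment and calling step not shown here.

```python
def compare_variant_calls(calls_a, calls_b):
    """Split two collections of variant calls into sample-specific and shared sets.

    Each call is a hypothetical (chromosome, position, ref, alt) tuple,
    assumed to be expressed against the same reference genome.
    """
    a, b = set(calls_a), set(calls_b)
    return {
        "only_a": a - b,   # variants called only in sample A
        "only_b": b - a,   # variants called only in sample B
        "shared": a & b,   # variants called in both samples
    }


# Toy usage with invented calls for two matched samples.
normal = [("chr1", 100, "A", "G"), ("chr2", 250, "C", "T")]
tumor = [("chr1", 100, "A", "G"), ("chr3", 75, "G", "GA")]
result = compare_variant_calls(normal, tumor)
```

Any read that fails to align in the upstream step never contributes a call, which is why variants in hard-to-map regions are invisible to this kind of comparison.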

RESULTS:

We introduce 2-kupl, a k-mer-based, mapping-free protocol to detect variants between two DNA-seq samples. On simulated and actual data, 2-kupl achieves higher accuracy than other mapping-free protocols. Applying 2-kupl to prostate cancer whole-exome sequencing data, we identify a number of candidate variants in hard-to-map regions and propose potential novel recurrent variants in this disease.
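
The general idea behind k-mer-based, mapping-free comparison can be sketched as follows: count k-mers in each sample and keep those present in one sample but absent from the other. This is a toy illustration only, not the 2-kupl implementation; the function names, the small `k`, and the `min_count` threshold are assumptions chosen for the example, and the abstract does not specify how 2-kupl itself processes k-mers.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def sample_specific_kmers(case_reads, control_reads, k=5, min_count=2):
    """Return k-mers seen at least min_count times in the case sample and never in the control."""
    case = count_kmers(case_reads, k)
    control = count_kmers(control_reads, k)
    return {kmer for kmer, n in case.items() if n >= min_count and kmer not in control}


# Toy usage: the case reads differ from the control reads by one substitution (A -> T).
control_reads = ["ACGTTGCAAGGCT"] * 3
case_reads = ["ACGTTGCTAGGCT"] * 3
specific = sample_specific_kmers(case_reads, control_reads, k=5)
```

In this toy case the single substitution yields five case-specific 5-mers, one for each window overlapping the changed base. No reference genome is involved at any point, which is what makes this kind of comparison applicable to unmappable regions.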

CONCLUSIONS:

We developed a mapping-free protocol for variant calling between matched DNA-seq samples. Our protocol is suitable for variant detection in unmappable genome regions or in the absence of a reference genome.

Full text: 1 Database: MEDLINE Main subject: Genomics / High-Throughput Nucleotide Sequencing Study type: Diagnostic_studies Limits: Humans Language: En Year: 2021 Document type: Article