UMI-Gen: A UMI-based read simulator for variant calling evaluation in paired-end sequencing NGS libraries.
Sater, Vincent; Viailly, Pierre-Julien; Lecroq, Thierry; Ruminy, Philippe; Bérard, Caroline; Prieur-Gaston, Élise; Jardin, Fabrice.
Affiliation
  • Sater V; University of Rouen Normandy UNIROUEN, LITIS EA 4108, 76000 Rouen, France.
  • Viailly PJ; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France.
  • Lecroq T; Department of Pathology, Centre Henri Becquerel, 76000 Rouen, France.
  • Ruminy P; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France.
  • Bérard C; University of Rouen Normandy UNIROUEN, LITIS EA 4108, 76000 Rouen, France.
  • Prieur-Gaston É; Department of Pathology, Centre Henri Becquerel, 76000 Rouen, France.
  • Jardin F; INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France.
Comput Struct Biotechnol J ; 18: 2270-2280, 2020.
Article in En | MEDLINE | ID: mdl-32952940
MOTIVATION: With Next Generation Sequencing (NGS) becoming more affordable every year, NGS technologies have established themselves as the fastest and most reliable way to detect Single Nucleotide Variants (SNV) and Copy Number Variations (CNV) in cancer patients. These technologies can sequence DNA at very high depths, which allows abnormalities present at very low frequencies in tumor cells to be detected. Multiple variant callers are publicly available and are usually efficient at calling variants. However, when frequencies drop below 1%, the specificity of these tools suffers greatly, as true variants at very low frequencies can easily be confused with sequencing or PCR artifacts. The recent use of Unique Molecular Identifiers (UMI) in NGS experiments has offered a way to accurately separate true variants from artifacts. UMI-based variant callers are slowly replacing raw-read-based variant callers as the standard method for accurate detection of variants at very low frequencies. However, the benchmarks reported in the tools' publications are usually performed on real biological data in which the true variants are not known, making it difficult to assess their accuracy. RESULTS: We present UMI-Gen, a UMI-based read simulator for targeted paired-end sequencing data. UMI-Gen generates reference reads covering the targeted regions at a user-customizable depth. Then, using a number of control files, it estimates the background error rate at each position and modifies the generated reads to mimic real biological data. Finally, it inserts real variants into the reads from a list provided by the user. AVAILABILITY: The entire pipeline is available at https://gitlab.com/vincent-sater/umigen under the MIT license.
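The three steps named in the abstract (generate reference reads at a chosen depth, apply a per-position background error rate estimated from controls, insert variants from a user list) can be illustrated with a minimal Python sketch. This is not the UMI-Gen implementation: the function names, the input formats (per-position depth and mismatch counts for controls, a position-to-variant dictionary) and the 12-base UMI length are all assumptions made for illustration. The sketch also simplifies heavily, giving each read its own random UMI and ignoring paired-end structure and PCR duplication.

```python
import random

BASES = "ACGT"


def estimate_error_rates(controls):
    """Pool control samples into one background error rate per position.

    `controls` is a hypothetical input format: one list per control sample,
    each holding a (depth, mismatches) tuple for every targeted position.
    """
    rates = []
    for pos in range(len(controls[0])):
        depth = sum(sample[pos][0] for sample in controls)
        mismatches = sum(sample[pos][1] for sample in controls)
        rates.append(mismatches / depth if depth else 0.0)
    return rates


def random_umi(rng, length=12):
    """Draw a random UMI tag (the length is an assumption)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def simulate_reads(reference, depth, error_rates, variants, rng):
    """Return `depth` simulated (umi, sequence) pairs for one target region.

    `variants` maps a 0-based position to (alt_base, frequency).
    """
    reads = []
    for _ in range(depth):
        bases = list(reference)
        # Step 2: add background noise at the estimated per-position rate.
        for pos, rate in enumerate(error_rates):
            if rng.random() < rate:
                bases[pos] = rng.choice([b for b in BASES if b != reference[pos]])
        # Step 3: insert each listed variant at its requested frequency.
        for pos, (alt, freq) in variants.items():
            if rng.random() < freq:
                bases[pos] = alt
        reads.append((random_umi(rng), "".join(bases)))
    return reads


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are reproducible
    controls = [[(1000, 2)] * 8, [(1000, 1)] * 8]
    rates = estimate_error_rates(controls)
    reads = simulate_reads("ACGTACGT", 2000, rates, {3: ("A", 0.005)}, rng)
    alt_reads = sum(1 for _, seq in reads if seq[3] == "A")
    print(f"reads carrying A at position 3: {alt_reads} of {len(reads)}")
```

Because the inserted variants are known in advance, the output of a variant caller on such reads can be compared directly against the input list, which is the kind of ground truth the abstract says real biological data lacks.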

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Publication year: 2020 Document type: Article