Long-read mapping to repetitive reference sequences using Winnowmap2.
Jain, Chirag; Rhie, Arang; Hansen, Nancy F; Koren, Sergey; Phillippy, Adam M.
  • Jain C; Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India. chirag@iisc.ac.in.
  • Rhie A; Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.
  • Hansen NF; Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.
  • Koren S; Comparative Genomics Analysis Unit, National Human Genome Research Institute, Bethesda, MD, USA.
  • Phillippy AM; Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.
Nat Methods; 19(6): 705-710, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35365778
Approximately 5-10% of the human genome remains inaccessible due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. We show that existing long-read mappers often yield incorrect alignments and variant calls within long, near-identical repeats, as they remain vulnerable to allelic bias. In the presence of a nonreference allele within a repeat, a read sampled from that region could be mapped to an incorrect repeat copy. To address this limitation, we developed a new long-read mapping method, Winnowmap2, by using minimal confidently alignable substrings. Winnowmap2 computes each read mapping through a collection of confident subalignments. This approach is more tolerant of structural variation and more sensitive to paralog-specific variants within repeats. Our experiments highlight that Winnowmap2 successfully addresses the issue of allelic bias, enabling more accurate downstream variant calls in repetitive sequences.
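The idea of a "minimal confidently alignable substring" can be illustrated with a toy sketch. This is not the Winnowmap2 implementation: it assumes ungapped matching, uses Hamming distance as a stand-in for alignment score, and treats a substring as "confident" once its best repeat copy beats the runner-up by a chosen margin. The function names, the `margin` parameter, and the example sequences are all hypothetical.

```python
from typing import List, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_distance(sub: str, copy: str) -> Optional[int]:
    """Smallest Hamming distance of `sub` over all ungapped placements in `copy`.

    Returns None when `sub` is longer than `copy` (no placement exists).
    """
    n = len(sub)
    if n > len(copy):
        return None
    return min(hamming(sub, copy[s:s + n]) for s in range(len(copy) - n + 1))


def minimal_confident_substring(
    read: str, start: int, copies: List[str], margin: int = 1
) -> Optional[Tuple[str, int]]:
    """Toy stand-in for a minimal confidently alignable substring.

    Grows read[start:end] one base at a time and returns the shortest
    substring whose best-matching repeat copy beats the second-best copy
    by at least `margin` mismatches, together with that copy's index.
    Returns None if no prefix from `start` ever separates the copies.
    """
    for end in range(start + 1, len(read) + 1):
        sub = read[start:end]
        scored = []
        for idx, copy in enumerate(copies):
            d = best_distance(sub, copy)
            if d is not None:
                scored.append((d, idx))
        if len(scored) < 2:
            return None  # cannot compare against a runner-up
        scored.sort()
        (d1, c1), (d2, _) = scored[0], scored[1]
        if d2 - d1 >= margin:
            return sub, c1
    return None


# Two hypothetical repeat copies that differ at a single paralog-specific base.
copy_a = "ACGTACGTTGCA"
copy_b = "ACGTACGATGCA"

# A read spanning the distinguishing base becomes confident after 5 bases.
print(minimal_confident_substring("TACGATG", 0, [copy_a, copy_b]))  # ('TACGA', 1)

# A read confined to the identical portion never becomes confident.
print(minimal_confident_substring("ACGTACG", 0, [copy_a, copy_b]))  # None
```

In this toy setting, only substrings that cover a paralog-specific base can be placed confidently, which is the intuition behind assembling a read mapping from a collection of confident subalignments rather than from a single end-to-end alignment.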
Full text: 1 | Database: MEDLINE | Main subject: Repetitive Sequences, Nucleic Acid / Genome, Human | Limit: Humans | Language: En | Year: 2022 | Document type: Article
