Sequence Alignment on Directed Graphs.
Kavya, Vaddadi Naga Sai; Tayal, Kshitij; Srinivasan, Rajgopal; Sivadasan, Naveen.
Affiliation
  • Kavya VNS; TCS Research, Hyderabad, India.
  • Tayal K; TCS Research, Hyderabad, India.
  • Srinivasan R; TCS Research, Hyderabad, India.
  • Sivadasan N; TCS Research, Hyderabad, India.
J Comput Biol; 26(1): 53-67, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30204489
ABSTRACT
Genomic variations in a reference collection are naturally represented as genome variation graphs. Such graphs encode common subsequences as vertices, and the variations are captured using additional vertices and directed edges. The resulting graphs are directed graphs that may contain cycles. Existing algorithms for aligning sequences on such graphs use partial order alignment (POA) techniques, which work only on directed acyclic graphs (DAGs). To apply them, acyclic extensions of the input graphs are first constructed through expensive loop-unrolling steps (DAGification). These extensions can be considerably larger than the input graph; in the worst case, the blow-up factor is proportional to the input sequence length. We provide a novel alignment algorithm, V-ALIGN, that aligns the input sequence directly on the input graph and avoids such expensive DAGification steps. V-ALIGN is based on a novel dynamic programming (DP) formulation that allows gapped alignment directly on the input graph. It supports both affine and linear gap penalties. We also propose refinements to V-ALIGN for better performance in practice. With the proposed refinements, the time to fill the DP table depends linearly on the sizes of the sequence, the graph, and the graph's feedback vertex set. We conducted experiments to compare the proposed algorithm against existing POA-based techniques. We also performed alignment experiments on genome variation graphs constructed from the 1000 Genomes data. For aligning short sequences, standard approaches restrict the expensive gapped alignment to small filtered subgraphs that have high similarity to the input sequence. In such cases, the performance of V-ALIGN for gapped alignment on a filtered subgraph depends on the size of that subgraph.
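To make concrete the idea of aligning directly on a graph with cycles, here is a minimal Python sketch. It is not V-ALIGN: the abstract does not give the DP recurrence, so this sketch substitutes a plain shortest-path (Dijkstra) search over pairs of sequence position and graph vertex, with linear gap costs only. The function name `align_to_graph`, the single-character vertex labels, the integer vertex ids, and the unit costs are all assumptions made for illustration.

```python
import heapq

def align_to_graph(seq, labels, edges, mismatch=1, gap=1):
    """Minimum edit cost of aligning seq to some path in a directed graph.

    Illustrative baseline only (not the V-ALIGN recurrence).
    labels: dict mapping integer vertex id -> single character (assumed model).
    edges:  iterable of (u, w) pairs; cycles are allowed.
    """
    START = -1  # hypothetical virtual source, so a path may begin at any vertex
    succ = {v: [] for v in labels}
    for u, w in edges:
        succ[u].append(w)
    succ[START] = list(labels)

    m = len(seq)
    dist = {(0, START): 0}
    heap = [(0, 0, START)]  # (cost, characters consumed, last vertex on path)
    while heap:
        cost, i, u = heapq.heappop(heap)
        if cost > dist.get((i, u), float("inf")):
            continue  # stale heap entry
        if i == m:
            return cost  # first settled state that consumes the whole sequence
        # Sequence character aligned against a gap: stay on the same vertex.
        moves = [(i + 1, u, gap)]
        for w in succ[u]:
            # Match or substitution: consume one character and step to w.
            moves.append((i + 1, w, 0 if labels[w] == seq[i] else mismatch))
            # Graph vertex aligned against a gap: step to w, consume nothing.
            moves.append((i, w, gap))
        for j, w, step in moves:
            new_cost = cost + step
            if new_cost < dist.get((j, w), float("inf")):
                dist[(j, w)] = new_cost
                heapq.heappush(heap, (new_cost, j, w))
    return None  # not reached: the all-gaps alignment always exists

# A three-vertex cycle A -> C -> G -> A; the read wraps around it twice.
labels = {0: "A", 1: "C", 2: "G"}
edges = [(0, 1), (1, 2), (2, 0)]
print(align_to_graph("ACGACG", labels, edges))
```

Because the search runs on the original graph, the cycle is traversed as many times as the sequence requires without any unrolling. The price is a priority queue over all (position, vertex) states, which is the kind of overhead a specialised DP formulation aims to avoid.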

Full text: 1 | Database: MEDLINE | Main subject: Sequence Alignment | Language: En | Year of publication: 2019 | Document type: Article