Please Mind the Gap: Indel-Aware Parsimony for Fast and Accurate Ancestral Sequence Reconstruction and Multiple Sequence Alignment Including Long Indels.
Iglhaut, Clara; Pecerska, Julija; Gil, Manuel; Anisimova, Maria.
Affiliation
  • Iglhaut C; Institute of Computational Life Science, Zurich University of Applied Science, Wädenswil, Switzerland.
  • Pecerska J; Faculty of Mathematics and Science, University of Zurich, Zürich, Switzerland.
  • Gil M; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
  • Anisimova M; Institute of Computational Life Science, Zurich University of Applied Science, Wädenswil, Switzerland.
Mol Biol Evol; 41(7), 2024 Jul 03.
Article in En | MEDLINE | ID: mdl-38842253
ABSTRACT
Despite having important biological implications, insertion and deletion (indel) events are often disregarded or mishandled during phylogenetic inference. In multiple sequence alignment, indels are represented as gaps and are estimated without considering the distinct evolutionary history of insertions and deletions. Consequently, indels are usually excluded from subsequent inference steps, such as ancestral sequence reconstruction and phylogenetic tree search. Here, we introduce indel-aware parsimony (indelMaP), a novel way to treat gaps under the parsimony criterion by considering insertions and deletions as separate evolutionary events and accounting for long indels. By identifying the precise location of an evolutionary event on the tree, we can separate overlapping indel events and use affine gap penalties for long indel modeling. Our indel-aware approach harnesses the phylogenetic signal from indels, incorporating them into all inference stages. Validation and comparison to state-of-the-art inference tools on simulated data show that indelMaP is most suitable for densely sampled datasets with closely to moderately related sequences, where it can reach alignment quality comparable to probabilistic methods and accurately infer ancestral sequences, including indel patterns. Due to its remarkable speed, our method is well suited for epidemiological datasets, eliminating the need for downsampling and enabling the exploitation of the additional information provided by dense taxonomic sampling. Moreover, indelMaP offers new insights into the indel patterns of biologically significant sequences and advances our understanding of genetic variability by considering gaps as crucial evolutionary signals rather than mere artefacts.
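The abstract mentions affine gap penalties for modeling long indels. As a minimal illustration of that general idea only (not indelMaP's implementation, which places events on the tree), the sketch below scores the gap runs in a single aligned row. The function names, the penalty values, and the convention that a run of length k costs gap_open + (k - 1) * gap_extend are all assumptions chosen for illustration; the abstract does not give them.

```python
from itertools import groupby


def gap_runs(row, gap_char="-"):
    """Return the lengths of maximal runs of gap characters in an aligned row."""
    return [
        sum(1 for _ in group)
        for is_gap, group in groupby(row, key=lambda c: c == gap_char)
        if is_gap
    ]


def affine_gap_cost(row, gap_open=2.5, gap_extend=0.5):
    """Affine cost: each gap run pays gap_open once, then gap_extend per extra site.

    The default penalty values are hypothetical, not taken from the paper.
    """
    return sum(gap_open + (k - 1) * gap_extend for k in gap_runs(row))


def linear_gap_cost(row, per_site=2.5):
    """Linear cost for comparison: every gapped site pays the same penalty."""
    return sum(per_site * k for k in gap_runs(row))


if __name__ == "__main__":
    row = "AC---GT-A"  # toy aligned row with gap runs of length 3 and 1
    print(gap_runs(row))
    print(affine_gap_cost(row))
    print(linear_gap_cost(row))
```

Under an affine scheme, one long gap is cheaper than the same number of gapped sites scattered as separate events, which is why such schemes are commonly used to favor a single long indel over many short ones.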

Full text: 1; Collection: 01-internacional; Database: MEDLINE; Main subject: Phylogeny / Sequence Alignment / INDEL Mutation; Limits: Humans; Language: En; Journal: Mol Biol Evol; Journal subject: Molecular Biology; Year: 2024; Document type: Article; Affiliation country: Switzerland
