The optimal metric for viral genome space.
Yu, Hongyu; Yau, Stephen S-T.
Affiliation
  • Yu H; Department of Mathematical Sciences, Tsinghua University, Beijing, 100084, People's Republic of China.
  • Yau SS; Department of Mathematical Sciences, Tsinghua University, Beijing, 100084, People's Republic of China.
Comput Struct Biotechnol J; 23: 2083-2096, 2024 Dec.
Article in En | MEDLINE | ID: mdl-38803517
ABSTRACT
Understanding the structural similarity between genomes is pivotal in classification and phylogenetic analysis. As the number of known genomes grows rapidly, alignment-free methods have gained considerable attention. Among these methods, the natural vector method stands out because it represents sequences as vectors of statistical moments, enabling effective clustering by family in biological taxonomy. However, determining an optimal metric that combines the different elements of natural vectors remains challenging, because there is no rigorous theoretical framework for weighting different k-mers and orders. In this study, we address this challenge by recasting the determination of optimal weights as an optimization problem and solving it with gradient-based techniques. Our experimental results show that these optimal weights substantially improve classification accuracy, which reaches 92.73% on the testing set and surpasses other alignment-free methods. Our method thus offers an effective metric for virus classification and provides insight into feature integration within alignment-free methods.
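As a rough illustration of the idea described in the abstract, the sketch below computes the classical 12-dimensional natural vector of a nucleotide sequence (per-base count, mean position, and second normalized central moment) and compares two such vectors with a weighted Euclidean distance. The function names `natural_vector` and `weighted_distance` are hypothetical, positions are assumed to be 1-indexed, and the weights are placeholders: the abstract does not give the optimized weights, the k-mer extension, or the gradient-based procedure used to obtain them, so none of those are reproduced here.

```python
import math
from typing import List, Sequence

NUCLEOTIDES = "ACGT"


def natural_vector(seq: str) -> List[float]:
    """Return [n_A..n_T, mu_A..mu_T, D2_A..D2_T] for a nucleotide sequence.

    Assumption: positions are 1-indexed and characters other than A, C, G, T
    are ignored when collecting positions (but still count toward the length).
    """
    seq = seq.upper()
    n = len(seq)
    positions = {base: [] for base in NUCLEOTIDES}
    for i, base in enumerate(seq, start=1):
        if base in positions:
            positions[base].append(i)

    counts, means, moments = [], [], []
    for base in NUCLEOTIDES:
        pos = positions[base]
        n_k = len(pos)
        # Mean position of this base; 0.0 when the base is absent.
        mu = sum(pos) / n_k if n_k else 0.0
        # Second central moment, normalized by n_k * n.
        d2 = sum((p - mu) ** 2 for p in pos) / (n_k * n) if n_k else 0.0
        counts.append(float(n_k))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def weighted_distance(u: Sequence[float], v: Sequence[float],
                      w: Sequence[float]) -> float:
    """Weighted Euclidean distance; the weights w are illustrative placeholders."""
    return math.sqrt(sum(wi * (a - b) ** 2 for a, b, wi in zip(u, v, w)))
```

With all weights set to 1 this reduces to the ordinary Euclidean distance between natural vectors; a learned metric would replace those uniform weights with values fitted to labeled training data.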
Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Comput Struct Biotechnol J Year: 2024 Document type: Article

...