Your browser doesn't support javascript.
loading
A path recorder algorithm for Multiple Longest Common Subsequences (MLCS) problems.
Wei, Shiwei; Wang, Yuping; Yang, Yuanchao; Liu, Sen.
Afiliação
  • Wei S; School of Computer Science and Technology, Xidian University, Xian, Shaanxi 710071, China.
  • Wang Y; School of Computer Science and Engineering, Guilin University of Aerospace Technology, Guilin, Guangxi 541004, China.
  • Yang Y; School of Computer Science and Technology, Xidian University, Xian, Shaanxi 710071, China.
  • Liu S; School of Computer Science and Technology, Xidian University, Xian, Shaanxi 710071, China.
Bioinformatics ; 36(10): 3035-3042, 2020 05 01.
Article em En | MEDLINE | ID: mdl-32119070
ABSTRACT
MOTIVATION Searching the Longest Common Subsequences of many sequences is called a Multiple Longest Common Subsequence (MLCS) problem which is a very fundamental and challenging problem in many fields of data mining. The existing algorithms cannot be applicable to problems with long and large-scale sequences due to their huge time and space consumption. To efficiently handle large-scale MLCS problems, a Path Recorder Directed Acyclic Graph (PRDAG) model and a novel Path Recorder Algorithm (PRA) are proposed.

RESULTS:

In PRDAG, we transform the MLCS problem into searching the longest path from the Directed Acyclic Graph (DAG), where each longest path in DAG corresponds to an MLCS. To tackle the problem efficiently, we eliminate all redundant and repeated nodes during the construction of DAG, and for each node, we only maintain the longest paths from the source node to it but ignore all non-longest paths. As a result, the size of the DAG becomes very small, and the memory space and search time will be greatly saved. Empirical experiments have been performed on a standard benchmark set of both DNA sequences and protein sequences. The experimental results demonstrate that our model and algorithm outperform the related leading algorithms, especially for large-scale MLCS problems. AVAILABILITY AND IMPLEMENTATION This program code is written by the first author and can be available at https//www.ncbi.nlm.nih.gov/nuccore and https//blog.csdn.net/wswguilin. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Assuntos

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Assunto principal: Algoritmos / Mineração de Dados Tipo de estudo: Prognostic_studies Idioma: En Revista: Bioinformatics Ano de publicação: 2020 Tipo de documento: Article

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Assunto principal: Algoritmos / Mineração de Dados Tipo de estudo: Prognostic_studies Idioma: En Revista: Bioinformatics Ano de publicação: 2020 Tipo de documento: Article