COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly.
Liu, Binghang; Yuan, Jianying; Yiu, Siu-Ming; Li, Zhenyu; Xie, Yinlong; Chen, Yanxiang; Shi, Yujian; Zhang, Hao; Li, Yingrui; Lam, Tak-Wah; Luo, Ruibang.
Affiliation
  • Liu B; HKU-BGI BAL-Bioinformatics Algorithms and Core Technology Research Laboratory, The University of Hong Kong, Hong Kong.
Bioinformatics ; 28(22): 2870-4, 2012 Nov 15.
Article in En | MEDLINE | ID: mdl-23044551
ABSTRACT
MOTIVATION:

The boost of next-generation sequencing technologies provides us with an unprecedented opportunity for elucidating genetic mysteries, yet the short-read length hinders us from better assembling the genome from scratch. New protocols now exist that can generate overlapping pair-end reads. By joining the 3' ends of each read pair, one is able to construct longer reads for assembling. However, effectively joining two overlapped pair-end reads remains a challenging task.
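To make the idea of joining a read pair at its 3' ends concrete, the following is a minimal Python sketch of a naive overlap-based join. It is an illustration only and is not COPE's k-mer-frequency method: it simply reverse-complements the second read and accepts the longest suffix/prefix overlap that stays under a mismatch threshold. The function names `reverse_complement` and `join_pair` and the parameters `min_overlap` and `max_mismatch_rate` are hypothetical and chosen for this example.

```python
from typing import Optional

# Complement table for DNA bases; "N" (unknown base) maps to itself.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def join_pair(
    read1: str,
    read2: str,
    min_overlap: int = 10,          # hypothetical default, not from the paper
    max_mismatch_rate: float = 0.1  # hypothetical default, not from the paper
) -> Optional[str]:
    """Naively join an overlapping read pair into one longer read.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented first. Candidate overlaps are tried from the
    longest to the shortest, and the first one whose mismatch fraction
    is within max_mismatch_rate is accepted. Returns None if no
    acceptable overlap of at least min_overlap bases exists.
    """
    r2 = reverse_complement(read2)
    longest = min(len(read1), len(r2))
    for olen in range(longest, min_overlap - 1, -1):
        suffix = read1[-olen:]   # 3' end of read 1
        prefix = r2[:olen]       # 3' end of read 2, on read 1's strand
        mismatches = sum(a != b for a, b in zip(suffix, prefix))
        if mismatches <= max_mismatch_rate * olen:
            # Keep read 1's bases in the overlap, then append the rest of r2.
            return read1 + r2[olen:]
    return None
```

A scan like this cannot tell a true overlap from a spurious one in repetitive sequence, and it resolves mismatched bases arbitrarily in favour of the first read. Those limitations are one way to see why joining overlapped reads accurately is described here as a challenging task.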

RESULT:

In this article, we present an efficient tool called Connecting Overlapped Pair-End (COPE) reads, to connect overlapping pair-end reads using k-mer frequencies. We evaluated our tool on 30× simulated pair-end reads from Arabidopsis thaliana with 1% base error. COPE connected over 99% of reads with 98.8% accuracy, which is, respectively, 10 and 2% higher than the recently published tool FLASH. When COPE is applied to real reads for genome assembly, the resulting contigs are found to have fewer errors and give a 14-fold improvement in the N50 measurement when compared with the contigs produced using unconnected reads.

AVAILABILITY AND IMPLEMENTATION:

COPE is implemented in C++ and is freely available as open-source code at ftp://ftp.genomics.org.cn/pub/cope.

CONTACT:

twlam@cs.hku.hk or luoruibang@genomics.org.cn
Subjects

Full text: 1 Database: MEDLINE Main subject: Algorithms / Chromosome Mapping / Sequence Analysis, DNA / Arabidopsis / Genomics Language: En Publication year: 2012 Document type: Article
