Integrative Meta-Assembly Pipeline (IMAP): Chromosome-level genome assembler combining multiple de novo assemblies.
Song, Giltae; Lee, Jongin; Kim, Juyeon; Kang, Seokwoo; Lee, Hoyong; Kwon, Daehong; Lee, Daehwan; Lang, Gregory I; Cherry, J Michael; Kim, Jaebum.
Affiliation
  • Song G; School of Computer Science and Engineering, Pusan National University, Busan, South Korea.
  • Lee J; Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea.
  • Kim J; Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea.
  • Kang S; School of Computer Science and Engineering, Pusan National University, Busan, South Korea.
  • Lee H; School of Computer Science and Engineering, Pusan National University, Busan, South Korea.
  • Kwon D; Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea.
  • Lee D; Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea.
  • Lang GI; Department of Biological Sciences, Lehigh University, Bethlehem, PA, United States of America.
  • Cherry JM; Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America.
  • Kim J; Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea.
PLoS One; 14(8): e0221858, 2019.
Article in English | MEDLINE | ID: mdl-31454399
ABSTRACT

BACKGROUND:

Genomic data have become major resources to understand complex mechanisms at fine-scale temporal and spatial resolution in functional and evolutionary genetic studies, including human diseases, such as cancers. Recently, a large number of whole genomes of evolving populations of yeast (Saccharomyces cerevisiae W303 strain) were sequenced in a time-dependent manner to identify temporal evolutionary patterns. For this type of study, a chromosome-level sequence assembly of the strain or population at time zero is required to compare with the genomes derived later. However, there is no fully automated computational approach in experimental evolution studies to establish the chromosome-level genome assembly using unique features of sequencing data.

METHODS AND RESULTS:

In this study, we developed a new software pipeline, the integrative meta-assembly pipeline (IMAP), to build chromosome-level genome sequence assemblies by generating and combining multiple initial assemblies using three de novo assemblers from short-read sequencing data. We significantly improved the continuity and accuracy of the genome assembly using a large collection of sequencing data and hybrid assembly approaches. We validated our pipeline by generating chromosome-level assemblies of yeast strains W303 and SK1, and compared our results with assemblies built using long-read sequencing and various assembly evaluation metrics. We also constructed chromosome-level sequence assemblies of S. cerevisiae strain Sigma1278b, and three commonly used fungal strains Aspergillus nidulans A713, Neurospora crassa 73, and Thielavia terrestris CBS 492.74, for which long-read sequencing data are not yet available. Finally, we examined the effect of IMAP parameters, such as reference and resolution, on the quality of the final assembly of the yeast strains W303 and SK1.
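
The abstract does not name the evaluation metrics that were used. As one illustration of how assembly continuity is commonly quantified, the following is a minimal sketch of the N50 statistic. The function name and the example lengths are hypothetical and are not taken from IMAP or from the paper's results.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    # Sort longest first, ignoring empty or invalid entries.
    ordered = sorted((n for n in lengths if n > 0), reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that pushes coverage to >= 50%.
        if 2 * running >= total:
            return length
    return 0  # empty input


# Hypothetical lengths (not real data): the same 1000 bases,
# first split into four pieces, then merged into two.
fragmented = [400, 300, 200, 100]
merged = [700, 300]

print(n50(fragmented), n50(merged))
```

With the same total length, the merged set yields a larger N50 than the fragmented one, which is the sense in which a higher N50 indicates a more continuous assembly.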

CONCLUSIONS:

We developed a cost-effective pipeline to generate chromosome-level sequence assemblies using only short-read sequencing data. The pipeline combines the strengths of reference-guided and meta-assembly approaches. It is available online at http://github.com/jkimlab/IMAP, together with a Docker image and a Perl script that help users install the IMAP package and several prerequisite programs. Users can apply IMAP to build a chromosome-level assembly for a genome of interest.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Software / DNA Sequence Analysis Language: En Journal: PLoS One Journal subject: SCIENCE / MEDICINE Year of publication: 2019 Document type: Article Country of affiliation: South Korea