A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms.
Sze, Sing-Hoi; Parrott, Jonathan J; Tarone, Aaron M.
Affiliation
  • Sze SH; Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA. shsze@cse.tamu.edu.
  • Parrott JJ; Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843, USA. shsze@cse.tamu.edu.
  • Tarone AM; Department of Entomology, Texas A&M University, College Station, TX 77843, USA.
BMC Genomics ; 18(Suppl 10): 895, 2017 Dec 06.
Article in English | MEDLINE | ID: mdl-29244008
ABSTRACT

BACKGROUND:

While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing number of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies.

RESULTS:

We develop a divide-and-conquer strategy that makes these algorithms usable on large data sets by subdividing a large RNA-Seq data set into small libraries. Each library is assembled independently by an existing algorithm, and a merging algorithm combines these assemblies by picking a subset of high-quality transcripts to form a large transcriptome. Compared to existing algorithms that return a single assembly directly, this strategy achieves accuracy comparable to or higher than that of memory-efficient algorithms that can process a large amount of RNA-Seq data, and accuracy comparable to or lower than that of memory-intensive algorithms that can only be used to construct small assemblies.
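
As an illustration only, the workflow described above (subdivide, assemble each library independently, then merge) can be sketched in Python. The abstract does not state the merging criteria, so everything here is a hypothetical placeholder and not the authors' method: the `assemble` callable stands in for any existing assembler, and `score`, `min_score`, and the exact-substring redundancy check stand in for the unspecified notion of a "high-quality" transcript.

```python
from typing import Callable, Iterable, List, Sequence

Transcript = str


def split_into_libraries(reads: Sequence[str], library_size: int) -> List[List[str]]:
    """Divide step: cut the read set into consecutive chunks of at most library_size reads."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return [list(reads[i:i + library_size]) for i in range(0, len(reads), library_size)]


def merge_assemblies(
    assemblies: Iterable[Iterable[Transcript]],
    score: Callable[[Transcript], float],
    min_score: float,
) -> List[Transcript]:
    """Combine step (placeholder rule, an assumption of this sketch):
    keep transcripts whose score is at least min_score, then drop any
    transcript that is an exact substring of an already kept one."""
    candidates = {t for assembly in assemblies for t in assembly if score(t) >= min_score}
    kept: List[Transcript] = []
    # Visit longest transcripts first so shorter contained ones are discarded.
    for transcript in sorted(candidates, key=lambda t: (-len(t), t)):
        if not any(transcript in longer for longer in kept):
            kept.append(transcript)
    return kept


def divide_and_conquer_assembly(
    reads: Sequence[str],
    assemble: Callable[[List[str]], List[Transcript]],
    score: Callable[[Transcript], float],
    library_size: int,
    min_score: float,
) -> List[Transcript]:
    """Assemble each small library independently, then merge the results."""
    libraries = split_into_libraries(reads, library_size)
    assemblies = [assemble(library) for library in libraries]
    return merge_assemblies(assemblies, score, min_score)
```

In this sketch each call to `assemble` sees only one small library, so peak memory is bounded by the library size rather than by the full data set, which is the point of the strategy. A real merging step would need a biologically meaningful quality measure in place of `score`.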

CONCLUSIONS:

Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Algorithms / Gene Expression Profiling Type of study: Prognostic_studies Limit: Animals Language: En Journal: BMC Genomics Journal subject: GENETICS Year: 2017 Document type: Article Affiliation country: United States
