On the (im)possibility of reconstructing plasmids from whole-genome short-read sequencing data.
Arredondo-Alonso, Sergio; Willems, Rob J; van Schaik, Willem; Schürch, Anita C.
Affiliations
  • Arredondo-Alonso S; 1 Department of Medical Microbiology, Universitair Medisch Centrum Utrecht, Utrecht, The Netherlands.
  • Willems RJ; 1 Department of Medical Microbiology, Universitair Medisch Centrum Utrecht, Utrecht, The Netherlands.
  • van Schaik W; 1 Department of Medical Microbiology, Universitair Medisch Centrum Utrecht, Utrecht, The Netherlands.
  • Schürch AC; 2 Institute of Microbiology and Infection, University of Birmingham, Birmingham, England, UK.
Microb Genom; 3(10): e000128, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29177087
ABSTRACT
To benchmark algorithms for automated plasmid sequence reconstruction from short-read sequencing data, we selected 42 publicly available complete bacterial genome sequences spanning 12 genera, containing 148 plasmids. We predicted plasmids from short-read data with four programs (PlasmidSPAdes, Recycler, cBar and PlasmidFinder) and compared the outcome to the reference sequences. PlasmidSPAdes reconstructs plasmids based on coverage differences in the assembly graph. It reconstructed most of the reference plasmids (recall=0.82), but approximately a quarter of the predicted plasmid contigs were false positives (precision=0.75). PlasmidSPAdes merged 84 % of the predictions from genomes with multiple plasmids into a single bin. Recycler searches the assembly graph for sub-graphs corresponding to circular sequences and correctly predicted small plasmids, but failed with long plasmids (recall=0.12, precision=0.30). cBar, which applies pentamer frequency analysis to detect plasmid-derived contigs, showed a recall and precision of 0.76 and 0.62, respectively. However, cBar categorizes contigs as plasmid-derived and does not bin the different plasmids. PlasmidFinder, which searches for replicons, had the highest precision (1.0), but was restricted by the contents of its database and the contig length obtained from de novo assembly (recall=0.36). PlasmidSPAdes and Recycler detected putative small plasmids (<10 kbp), which were also predicted as plasmids by cBar, but were absent in the original assembly. This study shows that it is possible to automatically predict small plasmids. Prediction of large plasmids (>50 kbp) containing repeated sequences remains challenging and limits the high-throughput analysis of plasmids from short-read whole-genome sequencing data.
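The recall and precision values quoted above can be read with the following minimal sketch. It assumes a contig-level comparison between a set of predicted plasmid contigs and a set of reference plasmid contigs; the abstract does not state the exact unit of comparison used in the study, and the function name `precision_recall` and the contig identifiers are hypothetical.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted vs. reference plasmid contigs.

    Assumption: both arguments are collections of contig identifiers.
    The unit of comparison used in the study itself may differ.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predicted plasmid contigs that are truly plasmid.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference plasmid contigs that were recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data: three of four predictions are correct,
# and three of four reference contigs are recovered.
predicted = {"contig_1", "contig_2", "contig_3", "contig_4"}
reference = {"contig_1", "contig_2", "contig_3", "contig_5"}
precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, a tool with precision 1.0 and recall 0.36 reports no false positives but misses most of the reference plasmid content.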

Full text: 1 | Database: MEDLINE | Main subject: Plasmids / Genome, Bacterial | Study type: Prognostic_studies | Language: En | Journal: Microb Genom | Publication year: 2017 | Document type: Article | Affiliation country: Netherlands
