1.
Proc Natl Acad Sci U S A ; 110(26): E2390-9, 2013 Jun 25.
Article in English | MEDLINE | ID: mdl-23754396

ABSTRACT

The "dark matter of life" describes microbes and even entire divisions of bacterial phyla that have evaded cultivation and have yet to be sequenced. We present a genome from the globally distributed but elusive candidate phylum TM6 and uncover its metabolic potential. TM6 was detected in a biofilm from a sink drain within a hospital restroom by analyzing cells using a highly automated single-cell genomics platform. We developed an approach for increasing throughput and effectively improving the likelihood of sampling rare events based on forming small random pools of single-flow-sorted cells, amplifying their DNA by multiple displacement amplification and sequencing all cells in the pool, creating a "mini-metagenome." A recently developed single-cell assembler, SPAdes, in combination with contig binning methods, allowed the reconstruction of genomes from these mini-metagenomes. A total of 1.07 Mb was recovered in seven contigs for this member of TM6 (JCVI TM6SC1), estimated to represent 90% of its genome. High nucleotide identity between a total of three TM6 genome drafts generated from pools that were independently captured, amplified, and assembled provided strong confirmation of a correct genomic sequence. TM6 is likely a Gram-negative organism and possibly a symbiont of an unknown host (nonfree living) in part based on its small genome, low-GC content, and lack of biosynthesis pathways for most amino acids and vitamins. Phylogenomic analysis of conserved single-copy genes confirms that TM6SC1 is a deeply branching phylum.
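The pooling argument in this abstract can be illustrated with a short calculation. This is a minimal sketch, not the authors' method: it assumes cells are sorted independently and that the target taxon makes up a fixed fraction of the sorted cells. The function names and the example numbers (a taxon at 1% abundance, pools of 20 cells, 96 wells) are hypothetical and do not come from the paper.

```python
def prob_pool_contains(freq: float, pool_size: int) -> float:
    """Probability that a random pool holds at least one cell of the target taxon.

    Assumes each sorted cell is an independent draw with probability `freq`
    of belonging to the target taxon (an illustrative simplification).
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if pool_size < 0:
        raise ValueError("pool_size must be non-negative")
    # Complement of "no cell in the pool belongs to the taxon".
    return 1.0 - (1.0 - freq) ** pool_size


def expected_positive_wells(freq: float, pool_size: int, n_wells: int) -> float:
    """Expected number of wells containing the target taxon at least once."""
    return n_wells * prob_pool_contains(freq, pool_size)


if __name__ == "__main__":
    # Hypothetical numbers: compare single-cell wells with pools of 20 cells.
    for size in (1, 20):
        hits = expected_positive_wells(freq=0.01, pool_size=size, n_wells=96)
        print(f"pool size {size}: about {hits:.1f} of 96 wells expected to contain the taxon")
```

Under these assumptions, larger pools raise the chance that a given well captures a rare organism, at the cost of having to separate several genomes from each well afterwards, which is where the contig binning step mentioned in the abstract comes in.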


Subject(s)
Biofilms; Hospitals; Metagenome; Sanitary Engineering; Water Microbiology; Bacteria/classification; Bacteria/genetics; Bacteria/isolation & purification; DNA, Bacterial/genetics; DNA, Bacterial/isolation & purification; DNA, Bacterial/metabolism; Evolution, Molecular; Genome, Bacterial; Humans; Metabolic Networks and Pathways; Metagenomics/methods; Molecular Sequence Data; Phylogeny; Water Supply
2.
Brief Bioinform ; 14(5): 548-55, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23793381

ABSTRACT

Next-generation sequencing (NGS) is increasingly being adopted as the backbone of biomedical research. With the commercialization of various affordable desktop sequencers, NGS will be reached by increasing numbers of cellular and molecular biologists, necessitating community consensus on bioinformatics protocols to tackle the exponential increase in quantity of sequence data. The current resources for NGS informatics are extremely fragmented. Finding a centralized synthesis is difficult. A multitude of tools exist for NGS data analysis; however, none of these satisfies all possible uses and needs. This gap in functionality could be filled by integrating different methods in customized pipelines, an approach helped by the open-source nature of many NGS programmes. Drawing from community spirit and with the use of the Wikipedia framework, we have initiated a collaborative NGS resource: The NGS WikiBook. We have collected a sufficient amount of text to incentivize a broader community to contribute to it. Users can search, browse, edit and create new content, so as to facilitate self-learning and feedback to the community. The overall structure and style for this dynamic material is designed for the bench biologists and non-bioinformaticians. The flexibility of online material allows the readers to ignore details in a first read, yet have immediate access to the information they need. Each chapter comes with practical exercises so readers may familiarize themselves with each step. The NGS WikiBook aims to create a collective laboratory book and protocol that explains the key concepts and describes best practices in this fast-evolving field.


Subject(s)
Computational Biology/education; Computer-Assisted Instruction/methods; High-Throughput Nucleotide Sequencing/statistics & numerical data; Cooperative Behavior; Internet; Teaching
3.
Bioinformatics ; 29(8): 1072-5, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23422339

ABSTRACT

SUMMARY: Limitations of genome sequencing techniques have led to dozens of assembly algorithms, none of which is perfect. A number of methods for comparing assemblers have been developed, but none is yet a recognized benchmark. Further, most existing methods for comparing assemblies are only applicable to new assemblies of finished genomes; the problem of evaluating assemblies of previously unsequenced species has not been adequately considered. Here, we present QUAST, a quality assessment tool for evaluating and comparing genome assemblies. This tool improves on leading assembly comparison software with new ideas and quality metrics. QUAST can evaluate assemblies both with and without a reference genome. QUAST produces many reports, summary tables and plots to help scientists in their research and in their publications. In this study, we used QUAST to compare several genome assemblers on three datasets. QUAST tables and plots for all of them are available in the Supplementary Material, and interactive versions of these reports are on the QUAST website. AVAILABILITY: http://bioinf.spbau.ru/quast. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
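As an illustration of the kind of reference-free quality metric the abstract refers to, the sketch below computes N50, a contiguity statistic commonly reported by assembly evaluation tools. It is a minimal, independent example and not QUAST's own code; the function name and the contig lengths are made up for illustration.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together cover
    at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the cumulative sum always reaches half the total")


if __name__ == "__main__":
    # Hypothetical contig lengths in base pairs.
    example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50 =", n50(example))
```

N50 needs no reference genome, so it applies to previously unsequenced species, but it says nothing about correctness: a misassembled contig can inflate it. That limitation is one reason reference-based checks are useful whenever a reference exists.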


Subject(s)
Genomics/standards; Sequence Analysis, DNA/standards; Software; Algorithms; Animals; Chromosome Mapping; Contig Mapping; Escherichia coli/genetics; Genome; Genomic Structural Variation; Humans; Quality Control; Sequence Alignment
4.
F1000Res ; 2: 258, 2013.
Article in English | MEDLINE | ID: mdl-25110578

ABSTRACT

We present C-Sibelia, a highly accurate and easy-to-use software tool for comparing two closely related bacterial genomes, which can be presented as either finished sequences or fragmented assemblies. C-Sibelia takes as input two FASTA files and produces: (1) a VCF file containing all identified single nucleotide variations and indels; (2) an XMFA file containing alignment information. The software also produces Circos diagrams visualizing high level genomic architecture for rearrangement analyses. C-Sibelia is a part of the Sibelia comparative genomics suite, which is freely available under the GNU GPL v.2 license at http://sourceforge.net/projects/sibelia-bio. C-Sibelia is compatible with Unix-like operating systems. A web-based version of the software is available at http://etool.me/software/csibelia.
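The VCF output described in this abstract can be summarized with a few lines of code. The sketch below counts single nucleotide variations and indels in VCF text by comparing the lengths of the REF and ALT alleles. It relies only on the generic tab-separated VCF column layout and makes no assumption about C-Sibelia's specific header or INFO fields; the function name and the sample records are hypothetical.

```python
from typing import Dict


def count_variants(vcf_text: str) -> Dict[str, int]:
    """Count SNVs and indels in VCF-formatted text.

    Classification is by allele length only: REF and ALT both one base
    long is an SNV, differing lengths is an indel, anything else
    (symbolic or missing alleles, multi-base substitutions) is "other".
    """
    counts = {"snv": 0, "indel": 0, "other": 0}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header row
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete record
        ref = fields[3]
        for alt in fields[4].split(","):  # one record may list several ALT alleles
            if alt.startswith("<") or alt in (".", "*"):
                counts["other"] += 1
            elif len(ref) == 1 and len(alt) == 1:
                counts["snv"] += 1
            elif len(ref) != len(alt):
                counts["indel"] += 1
            else:
                counts["other"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical records, not real C-Sibelia output.
    sample = "\n".join([
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t10\t.\tA\tG\t.\t.\t.",
        "chr1\t20\t.\tAT\tA\t.\t.\t.",
        "chr1\t30\t.\tC\tCGG,T\t.\t.\t.",
    ])
    print(count_variants(sample))
```

Counting per ALT allele rather than per record is a design choice: a site listing both an insertion and a substitution contributes to both categories.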

5.
J Comput Biol ; 19(5): 455-77, 2012 May.
Article in English | MEDLINE | ID: mdl-22506599

ABSTRACT

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.


Subject(s)
Algorithms; Bacteria/genetics; Genome, Bacterial; Metagenomics/methods; Single-Cell Analysis/methods; Sequence Analysis, DNA/methods