Probabilistic base calling of Solexa sequencing data.
Rougemont, Jacques; Amzallag, Arnaud; Iseli, Christian; Farinelli, Laurent; Xenarios, Ioannis; Naef, Felix.
Affiliation
  • Rougemont J; School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland. jacques.rougemont@epfl.ch
BMC Bioinformatics ; 9: 431, 2008 Oct 13.
Article in English | MEDLINE | ID: mdl-18851737
ABSTRACT

BACKGROUND:

Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data pose new challenges; currently a fair proportion of the tags are routinely discarded because they cannot be matched to a reference sequence, which reduces the effective throughput of the technology.

RESULTS:

We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads.
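
As a rough illustration of the two ideas in this paragraph (coding ambiguous bases with IUPAC symbols, and trimming a read to its most informative sub-tag), the following Python sketch starts from per-position base probabilities. It is not the authors' R implementation: the function names, the cumulative-probability `threshold`, and the per-base `penalty` are hypothetical choices made for this example, and the model-based clustering that would produce the probabilities from fluorescence intensities is not shown.

```python
import math

# Standard IUPAC nucleotide codes, keyed by the set of bases they denote.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def iupac_call(probs, threshold=0.9):
    """Return the IUPAC symbol for the smallest set of most probable bases
    whose total probability reaches `threshold` (an assumed rule)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for base, p in ranked:
        chosen.append(base)
        total += p
        if total >= threshold:
            break
    return IUPAC[frozenset(chosen)]


def information_content(probs):
    """Information content in bits: 2 minus the Shannon entropy of the
    four-base distribution (2 = certain call, 0 = uniform)."""
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return 2.0 - entropy


def best_subtag(read_probs, penalty=1.0):
    """Return the half-open (start, end) span of the contiguous sub-tag that
    maximizes sum(information_content - penalty), via a maximum-subarray scan.
    The penalty is a hypothetical tuning parameter."""
    best_score, best_span = 0.0, (0, 0)
    score, start = 0.0, 0
    for i, probs in enumerate(read_probs):
        if score <= 0:
            # A non-positive running score cannot help; restart the window here.
            score, start = 0.0, i
        score += information_content(probs) - penalty
        if score > best_score:
            best_score, best_span = score, (start, i + 1)
    return best_span


# Toy read: two confident positions followed by two uninformative ones.
good = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
two_way = {"A": 0.50, "G": 0.48, "C": 0.01, "T": 0.01}
flat = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
read = [good, good, flat, flat]

start, end = best_subtag(read)
print("".join(iupac_call(p) for p in read[start:end]))
```

Under these assumptions a position split almost evenly between A and G (`two_way`) is reported as the two-base symbol `R` rather than being forced to a single base, and the uninformative tail of the toy read is trimmed away.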

CONCLUSION:

We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Software / DNA, Viral / Sequence Analysis, DNA Study type: Prognostic_studies Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2008 Document type: Article Country of affiliation: Switzerland