A support vector machine for identification of single-nucleotide polymorphisms from next-generation sequencing data.
O'Fallon, Brendan D; Wooderchak-Donahue, Whitney; Crockett, David K.
Affiliation
  • O'Fallon BD; ARUP Institute of Clinical and Experimental Pathology, 500 Chipeta Way, Salt Lake City, UT 84102, USA. brendan.d.ofallon@aruplab.com
Bioinformatics ; 29(11): 1361-6, 2013 Jun 01.
Article in English | MEDLINE | ID: mdl-23620357
ABSTRACT
MOTIVATION:

Accurate determination of single-nucleotide polymorphisms (SNPs) from next-generation sequencing data is a significant challenge facing bioinformatics researchers. Most current methods use mechanistic models that assume nucleotides aligning to a given reference position are sampled from a binomial distribution. While such methods are sensitive, they are often unable to distinguish true variants from errors caused by misaligned reads, sequencing errors or platform artifacts.
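To make the binomial assumption concrete, here is a minimal sketch of a count-based test in that spirit. It is not taken from any of the tools named in this abstract. The function names, the 1% per-base error rate and the significance threshold are all assumptions chosen for illustration.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def looks_like_variant(depth: int, alt_count: int,
                       error_rate: float = 0.01, alpha: float = 1e-6) -> bool:
    """Flag a site when the non-reference bases are too many to blame on error.

    error_rate and alpha are hypothetical values, not parameters from the paper.
    """
    return binomial_tail(depth, alt_count, error_rate) < alpha


# A site covered by 30 reads, 12 of which carry a non-reference base.
print(looks_like_variant(30, 12))
# A site covered by 30 reads with a single non-reference base.
print(looks_like_variant(30, 1))
```

A test like this sees only counts. Twelve mismatches produced by a pile of misaligned reads look identical to twelve mismatches from a real heterozygous site, which is the weakness the abstract describes.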

RESULTS:

To enable more accurate SNP calling, we developed an algorithm that uses a trained support vector machine (SVM) to determine variants from .BAM or .SAM formatted alignments of sequence reads. Our SVM-based implementation determines SNPs with significantly greater sensitivity and specificity than alternative platforms, including the UnifiedGenotyper included with the Genome Analysis Toolkit, samtools and FreeBayes. In addition, the quality scores produced by our implementation more accurately reflect the likelihood that a variant is real when compared with those produced by the Genome Analysis Toolkit. While results depend on the model used, the implementation includes tools to easily build new models and refine existing models with additional training data.
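The general idea of classifying candidate sites with an SVM can be sketched as follows. This is an illustrative toy, not the SNPSVM implementation: it trains a linear SVM by stochastic subgradient descent on the regularized hinge loss, and the three features (alternate-allele fraction, scaled mean base quality, scaled mean mapping quality), the training values and the hyperparameters are all hypothetical. The abstract does not state which features or kernel the authors use.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=500):
    """Fit weights w and bias b by subgradient descent on the hinge loss
    with an L2 penalty. Labels must be +1 (variant) or -1 (artifact)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks the weights a little on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from x.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def decision_value(w, b, x):
    """Signed score; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def is_variant(w, b, x):
    return decision_value(w, b, x) > 0.0


# Made-up feature vectors for illustration only:
# [alt-allele fraction, mean base quality / 40, mean mapping quality / 60]
TOY_SITES = [
    [0.48, 0.90, 0.98], [0.52, 0.85, 0.95], [0.95, 0.88, 1.00], [0.45, 0.80, 0.92],
    [0.08, 0.40, 0.35], [0.12, 0.55, 0.30], [0.10, 0.35, 0.60], [0.15, 0.50, 0.45],
]
TOY_LABELS = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(TOY_SITES, TOY_LABELS)
print(is_variant(w, b, [0.50, 0.875, 0.965]))
```

Unlike a pure count-based test, a classifier of this kind can weigh alignment and quality features together, so a site with many mismatches but poor mapping quality can still be rejected. Adding newly labeled sites to the training lists and refitting corresponds loosely to refining a model with additional training data.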

AVAILABILITY:

Source code and executables are available from github.com/brendanofallon/SNPSVM/
Subjects

Full text: 1
Database: MEDLINE
Main subject: Sequence Analysis, DNA / Polymorphism, Single Nucleotide / High-Throughput Nucleotide Sequencing / Support Vector Machine
Study type: Diagnostic_studies / Prognostic_studies
Language: En
Year of publication: 2013
Document type: Article
