A cross-sample statistical model for SNP detection in short-read sequencing data.

Muralidharan, Omkar; Natsoulis, Georges; Bell, John; Newburger, Daniel; Xu, Hua; Kela, Itai; Ji, Hanlee; Zhang, Nancy

Affiliation

Muralidharan O; Department of Statistics, Stanford University, 390 Serra Mall, Stanford, CA, 94305, USA.

Nucleic Acids Res; 40(1): e5, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22064853

ABSTRACT

Highly multiplex DNA sequencers have greatly expanded our ability to survey human genomes for previously unknown single nucleotide polymorphisms (SNPs). However, sequencing and mapping errors, though rare, contribute substantially to the number of false discoveries in current SNP callers. We demonstrate that we can significantly reduce the number of false positive SNP calls by pooling information across samples. Although many studies prepare and sequence multiple samples with the same protocol, most existing SNP callers ignore cross-sample information. In contrast, we propose an empirical Bayes method that uses cross-sample information to learn the error properties of the data. This error information lets us call SNPs with a lower false discovery rate than existing methods.

Subject(s)

Models, Statistical; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Alleles; Genotyping Techniques; High-Throughput Nucleotide Sequencing

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Models, Statistical / Sequence Analysis, DNA / Polymorphism, Single Nucleotide | Type of study: Diagnostic_studies / Risk_factors_studies | Language: En | Journal: Nucleic Acids Res | Year: 2012 | Document type: Article | Affiliation country: United States
