Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data.
Am J Hum Genet; 97(2): 284-90, 2015 Aug 06.
Article | English | MEDLINE | ID: mdl-26235984
ABSTRACT
DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%-20%), contamination-adjusted calls eliminate 48%-77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%.
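The abstract does not give the model itself, so the following is only a minimal sketch of what modeling contamination during genotype calling could look like, not the authors' implementation. It assumes a biallelic site, a known contamination fraction `alpha`, a contaminating allele drawn at random from a population with alternate-allele frequency `alt_freq`, a symmetric per-read error rate `error`, and Hardy-Weinberg genotype priors. All function and parameter names are hypothetical.

```python
import math


def alt_read_probability(alt_copies, alpha, alt_freq, error):
    """Probability that one read shows the ALT allele (illustrative model).

    alt_copies: ALT alleles carried by the sample itself (0, 1, or 2).
    alpha:      assumed fraction of reads from the contaminating DNA.
    alt_freq:   assumed population ALT frequency for the contaminant.
    error:      assumed symmetric per-read base error rate.
    """
    own = alt_copies / 2.0
    # A read comes from the sample with prob. 1 - alpha, else from the contaminant.
    mixed = (1.0 - alpha) * own + alpha * alt_freq
    # A true ALT read stays ALT unless miscalled; a true REF read may flip to ALT.
    return mixed * (1.0 - error) + (1.0 - mixed) * error


def genotype_posteriors(n_ref, n_alt, alpha, alt_freq, error=0.01):
    """Posterior over genotypes (0, 1, 2 ALT copies) given read counts."""
    if not (0.0 < alt_freq < 1.0 and 0.0 < error < 0.5 and 0.0 <= alpha < 1.0):
        raise ValueError("need 0 < alt_freq < 1, 0 < error < 0.5, 0 <= alpha < 1")
    # Hardy-Weinberg priors (an assumption of this sketch).
    priors = [(1.0 - alt_freq) ** 2,
              2.0 * alt_freq * (1.0 - alt_freq),
              alt_freq ** 2]
    log_post = []
    for alt_copies, prior in enumerate(priors):
        p = alt_read_probability(alt_copies, alpha, alt_freq, error)
        log_lik = n_alt * math.log(p) + n_ref * math.log(1.0 - p)
        log_post.append(math.log(prior) + log_lik)
    # Normalize in log space to avoid underflow at high depth.
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

Setting `alpha=0.0` recovers a call that ignores contamination. Under these assumptions, calling `genotype_posteriors(8, 2, alpha=0.0, alt_freq=0.5)` favors the heterozygous genotype, whereas `genotype_posteriors(8, 2, alpha=0.2, alt_freq=0.5)` favors the homozygous-reference genotype, because the two alternate reads are plausibly explained by the contaminating DNA.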
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Sequence Analysis, DNA / DNA Contamination / Genotyping Techniques / Models, Genetic
Study type: Evaluation_studies / Prognostic_studies
Language: En
Journal: Am J Hum Genet
Year: 2015
Document type: Article
Country of affiliation: United States