Genome assembly quality: assessment and improvement using the neutral indel model.
Meader, Stephen; Hillier, LaDeana W; Locke, Devin; Ponting, Chris P; Lunter, Gerton.
Affiliation
  • Meader S; Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom.
Genome Res; 20(5): 675-84, 2010 May.
Article in English | MEDLINE | ID: mdl-20305016
We describe a statistical and comparative-genomic approach for quantifying error rates of genome sequence assemblies. The method exploits not substitutions but the pattern of insertions and deletions (indels) in genome-scale alignments for closely related species. Using two- or three-way alignments, the approach estimates the amount of aligned sequence containing clusters of nucleotides that were wrongly inserted or deleted during sequencing or assembly. Thus, the method is well-suited to assessing fine-scale sequence quality within single assemblies, between different assemblies of a single set of reads, and between genome assemblies for different species. When applying this approach to four primate genome assemblies, we found that average gap error rates per base varied considerably, by up to sixfold. As expected, bacterial artificial chromosome (BAC) sequences contained lower, but still substantial, predicted numbers of errors, arguing for caution in regarding BACs as the epitome of genome fidelity. We then mapped short reads, at approximately 10-fold statistical coverage, from a Bornean orangutan onto the Sumatran orangutan genome assembly originally constructed from capillary reads. This resulted in a reduced gap error rate and a separation of error-prone from high-fidelity sequence. Over 5000 predicted indel errors in protein-coding sequence were corrected in a hybrid assembly. Our approach contributes a new fine-scale quality metric for assemblies that should facilitate development of improved genome sequencing and assembly strategies.
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Primates / Chromosome Mapping / Genomics / INDEL Mutation / Models, Genetic Type of study: Prognostic_studies Limits: Animals / Humans Language: En Publication year: 2010 Document type: Article
