Reference-guided assembly of four diverse Arabidopsis thaliana genomes.
Schneeberger, Korbinian; Ossowski, Stephan; Ott, Felix; Klein, Juliane D; Wang, Xi; Lanz, Christa; Smith, Lisa M; Cao, Jun; Fitz, Joffrey; Warthmann, Norman; Henz, Stefan R; Huson, Daniel H; Weigel, Detlef.
Affiliation
  • Schneeberger K; Department of Molecular Biology, Max Planck Institute for Developmental Biology, D-72076 Tübingen, Germany.
Proc Natl Acad Sci U S A ; 108(25): 10249-54, 2011 Jun 21.
Article in En | MEDLINE | ID: mdl-21646520
We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly. Comparisons with 2 Mb of dideoxy sequence reveal that the per-base error rate of the reference-guided assemblies was below 1 in 10,000. Our assemblies provide a detailed, genomewide picture of large-scale differences between A. thaliana individuals, most of which are difficult to access with alignment-consensus methods only. We demonstrate their practical relevance in studying the expression differences of polymorphic genes and show how the analysis of sRNA sequencing data can lead to erroneous conclusions if aligned against the reference genome alone. Genome assemblies, raw reads, and further information are accessible through http://1001genomes.org/projects/assemblies.html.
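The contiguity figure quoted in the abstract (half of the noncentromeric C24 genome covered by scaffolds longer than 260 kb) is an N50-style statistic. The sketch below shows one common way to compute such a value; it is an illustration only, not the authors' pipeline. The function name `nx_length`, its parameters, and the toy scaffold lengths are all hypothetical.

```python
def nx_length(scaffold_lengths, genome_size, fraction=0.5):
    """Return the scaffold length L such that scaffolds of length >= L
    together cover at least `fraction` of `genome_size`.

    Returns None if the scaffolds never reach that coverage.
    Assumption: scaffolds do not overlap, so their lengths can be summed.
    """
    target = fraction * genome_size
    covered = 0
    # Walk from the longest scaffold down, accumulating covered bases.
    for length in sorted(scaffold_lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return None


# Toy lengths (arbitrary units), not data from the paper.
toy_lengths = [500, 400, 300, 200, 100]

# Measured against a genome of size 2000: 500 + 400 + 300 reaches 1000.
half_of_genome = nx_length(toy_lengths, genome_size=2000)

# Measured against the assembly's own total length (classic N50).
half_of_assembly = nx_length(toy_lengths, genome_size=sum(toy_lengths))
```

Measuring against the genome size rather than the assembly's own total length gives the more conservative value, because unassembled sequence counts against the result.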
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Arabidopsis / Genome, Plant Language: En Journal: Proc Natl Acad Sci U S A Year of publication: 2011 Document type: Article Country of affiliation: Germany
