Quinoa genome assembly employing genomic variation for guided scaffolding.
Bodrug-Schepers, Alexandrina; Stralis-Pavese, Nancy; Buerstmayr, Hermann; Dohm, Juliane C; Himmelbauer, Heinz.
Affiliation
  • Bodrug-Schepers A; Institute of Computational Biology, Department of Biotechnology, Universität für Bodenkultur, Vienna, Austria.
  • Stralis-Pavese N; Institute of Computational Biology, Department of Biotechnology, Universität für Bodenkultur, Vienna, Austria.
  • Buerstmayr H; Institute of Biotechnology in Plant Production, Department of Agrobiotechnology and Department of Crop Sciences, Universität für Bodenkultur, Tulln, Austria.
  • Dohm JC; Institute of Computational Biology, Department of Biotechnology, Universität für Bodenkultur, Vienna, Austria. dohm@boku.ac.at.
  • Himmelbauer H; Institute of Computational Biology, Department of Biotechnology, Universität für Bodenkultur, Vienna, Austria. heinz.himmelbauer@boku.ac.at.
Theor Appl Genet ; 134(11): 3577-3594, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34365519
KEY MESSAGE: We propose to use the natural variation between individuals of a population for genome assembly scaffolding. In today's genome projects, multiple accessions get sequenced, leading to variant catalogs. Using such information to improve genome assemblies is attractive both in terms of cost and scientifically, because the value of an assembly increases with its contiguity. We conclude that haplotype information is a valuable resource to group and order contigs toward the generation of pseudomolecules.
Quinoa (Chenopodium quinoa) has been under cultivation in Latin America for more than 7500 years. Recently, quinoa has gained increasing attention due to its stress resistance and its nutritional value. We generated a novel quinoa genome assembly for the Bolivian accession CHEN125 using PacBio long-read sequencing data (assembly size 1.32 Gbp, initial N50 size 608 kbp). Next, we re-sequenced 50 quinoa accessions from Peru and Bolivia. This set of accessions differed at 4.4 million single-nucleotide variant (SNV) positions compared to CHEN125 (1.4 million SNV positions on average per accession). We show how to exploit variation in accessions that are distantly related to establish a genome-wide ordered set of contigs for guided scaffolding of a reference assembly. The method is based on detecting shared haplotypes and their expected continuity throughout the genome (i.e., the effect of linkage disequilibrium), as an extension of what is expected in mapping populations where only a few haplotypes are present. We test the approach using Arabidopsis thaliana data from different populations. After applying the method to our CHEN125 quinoa assembly, we validated the results with mate-pairs, genetic markers, and another quinoa assembly originating from a Chilean cultivar. We show consistency between these information sources and the haplotype-based relations as determined by us and obtain an improved assembly with an N50 size of 1079 kbp and ordered contig groups of up to 39.7 Mbp. We conclude that haplotype information in distantly related individuals of the same species is a valuable resource to group and order contigs according to their adjacency in the genome toward the generation of pseudomolecules.
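The idea of grouping contigs by shared haplotypes can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each contig has already been summarized as one binary state per accession (0 = reference-like haplotype, 1 = alternative haplotype), and the names `concordance`, `best_partners`, the example profiles, and the 0.8 threshold are all hypothetical. Under linkage disequilibrium, contigs that are adjacent in the genome would be expected to show similar state patterns across accessions.

```python
from itertools import combinations


def concordance(a, b):
    """Fraction of accessions with the same haplotype state on both contigs."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same accessions")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_partners(profiles, min_score=0.8):
    """For each contig, return (partner, score) for its most concordant contig.

    profiles: dict mapping contig name -> list of 0/1 states, one per accession.
    min_score: hypothetical cutoff below which no relation is reported.
    """
    best = {}
    for c1, c2 in combinations(sorted(profiles), 2):
        score = concordance(profiles[c1], profiles[c2])
        if score < min_score:
            continue
        # Record the relation in both directions if it beats the current best.
        for a, b in ((c1, c2), (c2, c1)):
            if a not in best or score > best[a][1]:
                best[a] = (b, score)
    return best


# Made-up profiles for four contigs across eight accessions.
profiles = {
    "contig_A": [0, 0, 1, 1, 0, 1, 0, 1],
    "contig_B": [0, 0, 1, 1, 0, 1, 1, 1],
    "contig_C": [1, 1, 0, 0, 1, 0, 1, 0],
    "contig_D": [1, 1, 0, 0, 1, 0, 1, 1],
}

for contig, (partner, score) in sorted(best_partners(profiles).items()):
    print(f"{contig} -> {partner} (concordance {score:.3f})")
```

In this toy data, contig_A pairs with contig_B and contig_C with contig_D. A real pipeline would derive haplotype states from called SNVs, treat contig ends separately to obtain order and orientation, and validate relations against independent evidence such as the mate-pairs and genetic markers mentioned above.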
Subjects

Full text: 1 Database: MEDLINE Main subject: Genetic Variation / Genome, Plant / Chenopodium quinoa Country as subject: South America / Bolivia / Chile / Peru Language: En Publication year: 2021 Document type: Article