Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes.
BMC Genomics
; 18(1): 261, 2017 03 27.
Article
em En
| MEDLINE
| ID: mdl-28347275
ABSTRACT
BACKGROUND:
Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome.RESULTS:
Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequences is missing in one or more accession, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable - estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large effect single nucleotide change, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large effect single nucleotide changes and even higher levels of copy number variation.CONCLUSIONS:
Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies to discover and describe structural variation, something that is often under-estimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation.
Texto completo:
1
Coleções:
01-internacional
Base de dados:
MEDLINE
Assunto principal:
Genoma de Planta
/
Medicago truncatula
/
Variações do Número de Cópias de DNA
Idioma:
En
Revista:
BMC Genomics
Assunto da revista:
GENETICA
Ano de publicação:
2017
Tipo de documento:
Article
País de afiliação:
Estados Unidos