ABSTRACT
MOTIVATION: Pangenomes provide novel insights for population and quantitative genetics, genomics and breeding that are not available from studying a single reference genome. Instead, a species is better represented by a pangenome, or collection of genomes. Unfortunately, managing and using pangenomes for genomically diverse species is computationally and practically challenging. We developed a trellis graph representation anchored to the reference genome that represents most pangenomes well and can be used to impute complete genomes from low-density sequence or variant data. RESULTS: The Practical Haplotype Graph (PHG) is a pangenome pipeline, database (PostgreSQL and SQLite), data model (Java, Kotlin or R) and Breeding API (BrAPI) web service. The PHG has already been able to accurately represent diversity in four major crops including maize, one of the most genomically diverse species, with up to 1000-fold data compression. Using simulated data, we show that, even at 0.1× coverage, with appropriate reads and sequence alignment, imputation results in extremely accurate haplotype reconstruction. The PHG is a platform and environment for the understanding and application of genomic diversity. AVAILABILITY AND IMPLEMENTATION: All resources listed here are freely available. The PHG Docker image used to generate the simulation results is available on Docker Hub (https://hub.docker.com/) as maizegenetics/phg:0.0.27. PHG source code is at https://bitbucket.org/bucklerlab/practicalhaplotypegraph/src/master/. The code used for the analysis of simulated data is at https://bitbucket.org/bucklerlab/phg-manuscript/src/master/. The PHG database of NAM parent haplotypes is in the CyVerse data store (https://de.cyverse.org/de/) and is named /iplant/home/shared/panzea/panGenome/PHG_db_maize/phg_v5Assemblies_20200608.db. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genome, Plant Breeding, Haplotypes, Genomics/methods, Software
ABSTRACT
KEY MESSAGE: A large association panel of 836 maize inbreds revealed a broader genetic diversity of cold tolerance, as predominantly favorable QTL with small effects were identified, indicating that genomic selection is the most promising option for breeding maize for cold tolerance. Maize (Zea mays L.) has limited cold tolerance, and breeding for cold tolerance is a noteworthy bottleneck for reaching the high potential of maize production in temperate areas. In this study, we evaluate a large panel of 836 maize inbred lines to detect genetic loci and candidate genes for cold tolerance at the germination and seedling stages. Genetic variation for cold tolerance was larger than in previous reports with moderately high heritability for most traits. We identified 187 significant single-nucleotide polymorphisms (SNPs) that were integrated into 159 quantitative trait loci (QTL) for emergence and traits related to early growth. Most of the QTL have small effects and are specific for each environment, with the majority found under control conditions. Favorable alleles are more frequent in 120 inbreds including all germplasm groups, but mainly from Minnesota and Spain. Therefore, there is a large, potentially novel, genetic variability in the germplasm groups represented by these inbred lines. Most of the candidate genes are involved in metabolic processes and intracellular membrane-bounded organelles. We expect that further evaluations of germplasm with broader genetic diversity could identify additional favorable alleles for cold tolerance. However, it is not likely that further studies will find favorable alleles with large effects for improving cold tolerance in maize.