ABSTRACT
Peanut (Arachis hypogaea L.) is a globally important oil and food crop frequently grown in arid, semi-arid, or dryland environments. Improving drought tolerance is a key goal for peanut crop improvement efforts. Here we present the genome assembly and gene model annotation for 'Line8', a peanut genotype bred from drought-tolerant cultivars. Our assembly and annotation are the most contiguous and complete peanut genome resources currently available. The high contiguity of the Line8 assembly allowed us to explore structural variation between both peanut genotypes and subgenomes. We detect several large inversions between Line8 and other peanut genome assemblies, and inversions between more genetically diverged genotypes tend to have higher gene content. We also relate patterns of subgenome exchange to structural variation between Line8 homeologous chromosomes. Unexpectedly, we discover that Line8 harbors an introgression from A. cardenasii, a diploid peanut relative and important donor of disease resistance alleles to peanut breeding populations. The fully resolved sequences of both haplotypes in this introgression provide the first in situ characterization of A. cardenasii candidate alleles that can be leveraged for future targeted improvement efforts. The completeness of our genome will support peanut biotechnology and broader research into the evolution of hybridization and polyploidy.
ABSTRACT
BACKGROUND: Aspergillus flavus is an important agricultural and food safety threat due to its production of carcinogenic aflatoxins. It has a high level of genetic diversity and is adapted to various environments. Recently, we reported two reference genomes of A. flavus isolates, AF13 (a MAT1-2, highly aflatoxigenic isolate) and NRRL3357 (a MAT1-1, moderate aflatoxin producer), in which a 310 kb insertion in AF13 included a bZIP transcription factor gene, named atfC, linked to aflatoxin production. Observations of significant genomic variants between these isolates of contrasting phenotypes prompted an investigation into variation among other agricultural isolates of A. flavus, with the goal of discovering novel genes potentially associated with the regulation of aflatoxin production. The present study was designed with three main objectives: (1) collection of a large number of A. flavus isolates from diverse sources, including maize plants and field soils; (2) whole-genome sequencing of the collected isolates and development of a pangenome; and (3) a pangenome-wide association study (Pan-GWAS) to identify novel secondary metabolite cluster genes.
RESULTS: Pangenome analysis of 346 A. flavus isolates identified a total of 17,855 unique orthologous gene clusters, with a mere 41% (7,315) core genes and 59% (10,540) accessory genes, indicating accumulation of high genomic diversity during domestication. A total of 5,994 orthologous gene clusters in the accessory genome were not annotated in either the A. flavus AF13 or NRRL3357 reference genomes. Pangenome-wide association analysis of the genomic variations identified 391 pan-genes significantly associated with aflatoxin production. Interestingly, most of the significantly associated pan-genes (94%; 369 associations) belonged to the accessory genome, indicating that genome expansion has resulted in the incorporation of new genes associated with aflatoxin and other secondary metabolites.
CONCLUSION: In summary, this study provides a complete pangenome framework for the species Aspergillus flavus, along with genes associated with pathogen survival and aflatoxin production. The large accessory genome indicates high genomic diversity in A. flavus; however, AflaPan is a closed pangenome that represents the optimum diversity of the species. Most importantly, the newly identified aflatoxin-producing gene clusters will be a new source for seeking aflatoxin mitigation strategies and need new attention in research.
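The core/accessory split reported in this abstract can be illustrated with a gene presence/absence table. The following is only a minimal sketch: the toy matrix, the function name `partition_pangenome`, and the choice of "present in every isolate" as the core threshold are assumptions made for illustration, not the pipeline used in the study. The last lines simply recompute the reported percentages from the counts given in the abstract.

```python
from typing import Dict, List, Set, Tuple


def partition_pangenome(
    presence: Dict[str, Set[str]],
    isolates: List[str],
    core_fraction: float = 1.0,  # assumed threshold: present in all isolates
) -> Tuple[Set[str], Set[str]]:
    """Split gene clusters into core and accessory sets.

    `presence` maps a cluster ID to the set of isolates that carry it.
    """
    n = len(isolates)
    core, accessory = set(), set()
    for cluster, carriers in presence.items():
        if len(carriers) / n >= core_fraction:
            core.add(cluster)
        else:
            accessory.add(cluster)
    return core, accessory


# Hypothetical toy data: three isolates, three orthologous clusters.
isolates = ["iso1", "iso2", "iso3"]
presence = {
    "og1": {"iso1", "iso2", "iso3"},  # found everywhere -> core
    "og2": {"iso1"},                  # found in one isolate -> accessory
    "og3": {"iso2", "iso3"},          # found in two isolates -> accessory
}
core, accessory = partition_pangenome(presence, isolates)

# Recompute the proportions from the counts reported in the abstract.
total, n_core, n_accessory = 17855, 7315, 10540
core_pct = 100 * n_core / total
accessory_pct = 100 * n_accessory / total
accessory_hits_pct = 100 * 369 / 391  # associated pan-genes in the accessory genome
```

Lowering `core_fraction` (for example to 0.95) gives a "soft core" definition, which some pangenome analyses prefer when assemblies are incomplete.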
Subjects
Aflatoxins, Aspergillus flavus, Fungal Genome, Multigene Family, Secondary Metabolism, Aspergillus flavus/genetics, Aspergillus flavus/metabolism, Aflatoxins/genetics, Aflatoxins/metabolism, Secondary Metabolism/genetics, Zea mays/microbiology, Zea mays/genetics, Genome-Wide Association Study, Fungal Genes, Whole Genome Sequencing, Genetic Variation
ABSTRACT
Cultivated peanut or groundnut (Arachis hypogaea L.) is a grain legume grown in many developing countries by smallholder farmers for food, feed, and/or income. The speciation of the cultivated species, which involved polyploidization followed by domestication, greatly reduced its variability at the DNA level. Mobilizing peanut diversity is a prerequisite for any breeding program aiming to overcome the main constraints that plague production and to increase yield in farmers' fields. In this study, the Groundnut Improvement Network for Africa assembled a collection of 1,049 peanut breeding lines, varieties, and landraces from 9 countries in Africa. The collection was genotyped with the Axiom_Arachis2 48K SNP array, and 8,229 polymorphic single nucleotide polymorphism (SNP) markers were used to analyze the genetic structure of this collection and quantify the level of genetic diversity in each breeding program. A supervised model was developed using discriminant analysis of principal components (DAPC) to unambiguously assign 542, 35, and 172 genotypes to the Spanish, Valencia, and Virginia market types, respectively. Distance-based clustering of the collection showed a clear grouping structure according to subspecies and market types, with 73% of the genotypes classified as fastigiata and 27% as hypogaea subspecies. STRUCTURE analysis confirmed the overall structure and showed that, at a minimum membership of 0.8, 76% of the varieties that were not assigned by DAPC were actually admixed. This was particularly the case for most genotypes of the Valencia subgroup, which exhibited admixed genetic heritage. The results also showed that geographic origin (i.e., East, Southern, and West Africa) did not strongly explain the genetic structure. The gene diversity managed by each breeding program, measured by the expected heterozygosity, ranged from 0.25 to 0.39, with the Niger breeding program having the lowest diversity, mainly because only lines that belong to the fastigiata subspecies are used in this program.
Finally, we developed a core collection composed of 300 accessions based on breeding traits and genetic diversity. This collection, which is composed of 205 genotypes of fastigiata subspecies (158 Spanish and 47 Valencia) and 95 genotypes of hypogaea subspecies (all Virginia), improves the genetic diversity of each individual breeding program and is, therefore, a unique resource for allele mining and breeding.
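The gene diversity statistic used above to compare breeding programs, expected heterozygosity, can be sketched for biallelic SNPs as the per-locus value 2p(1 - p) averaged over loci, where p is the alternate allele frequency. The example below is a minimal illustration under stated assumptions: genotypes are coded as 0/1/2 alternate-allele counts with `None` for missing calls, no small-sample correction is applied, and the function name and toy data are hypothetical rather than taken from the study.

```python
from typing import List, Optional


def expected_heterozygosity(genotypes: List[List[Optional[int]]]) -> float:
    """Mean expected heterozygosity over loci.

    `genotypes[i][j]` is the alternate-allele count (0, 1, or 2) of
    sample i at locus j, or None if the call is missing.
    """
    n_loci = len(genotypes[0])
    per_locus = []
    for j in range(n_loci):
        calls = [row[j] for row in genotypes if row[j] is not None]
        if not calls:
            continue  # skip loci with no data at all
        p = sum(calls) / (2 * len(calls))  # alternate allele frequency
        per_locus.append(2 * p * (1 - p))  # He = 1 - p^2 - (1 - p)^2
    return sum(per_locus) / len(per_locus)


# Hypothetical toy data: four samples, two SNP loci.
toy = [
    [0, 0],
    [1, 0],
    [2, 0],
    [1, 2],
]
he = expected_heterozygosity(toy)
# Locus 1: p = 4/8 = 0.5  -> He = 0.5
# Locus 2: p = 2/8 = 0.25 -> He = 0.375
# Mean over both loci = 0.4375
```

A program restricted to one subspecies would be expected to show lower values under this measure, because alleles that distinguish the subspecies are fixed within it.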
Subjects
Genetic Variation, Plant Breeding, Single Nucleotide Polymorphism, Arachis/genetics, Africa, Genetic Association Studies
ABSTRACT
White mold (WM), caused by the ubiquitous fungus Sclerotinia sclerotiorum, is a devastating disease that limits production and quality of dry bean globally. In the present study, classic linkage mapping combined with QTL-seq was employed in two recombinant inbred line (RIL) populations, "Montrose"/I9365-25 (M25) and "Raven"/I9365-31 (R31), with the initial goal of fine-mapping QTL WM5.4 and WM7.5, which condition WM resistance. The RILs were phenotyped for WM reactions under greenhouse (straw test) and field environments. The general regions of WM5.4 and WM7.5 were reconfirmed with both mapping strategies within each population. Combining the results from both mapping strategies, WM5.4 was delimited to a 22.60-36.25 Mb interval in the heterochromatic regions on Pv05, while WM7.5 was narrowed to a 0.83 Mb (3.99-4.82 Mb) region on the Pv07 chromosome. Furthermore, additional QTL WM2.2a (3.81-7.24 Mb), WM2.2b (11.18-17.37 Mb, heterochromatic region), and WM2.2c (23.33-25.94 Mb) were mapped to narrowed genomic intervals on Pv02, and WM4.2 was mapped to a 0.89 Mb physical interval at the distal end of the Pv04 chromosome. Gene models encoding gibberellin 2-oxidase proteins that regulate plant architecture are likely candidate genes associated with WM2.2a resistance. Nine gene models encoding a disease resistance protein (quinone reductase family protein and ATWRKY69) found within the WM5.4 QTL interval are putative candidate genes. Clusters of 13 and 5 copies of gene models encoding cysteine-rich receptor-like kinase and receptor-like protein kinase-related family proteins, respectively, are potential candidate genes associated with WM7.5 resistance and most likely trigger physiological resistance to WM. Acquired knowledge of the narrowed major QTL intervals, flanking markers, and candidate genes provides promising opportunities to develop functional molecular markers to implement marker-assisted selection for WM-resistant dry bean cultivars.
Subjects
Plant Chromosomes, Quantitative Trait Loci, Chromosome Mapping/methods, Phenotype, Disease Resistance/genetics
ABSTRACT
Root nodule symbiosis (RNS) is the pillar behind sustainable agriculture and plays a pivotal role in the environmental nitrogen cycle. Most of the genetic, molecular, and cell-biological knowledge on RNS comes from model legumes that exhibit a root-hair mode of bacterial infection, in contrast to the Dalbergoid legumes exhibiting crack-entry of rhizobia. As a step toward understanding this important group of legumes, we have combined microscopic analysis and temporal transcriptome to obtain a dynamic view of plant gene expression during Arachis hypogaea (peanut) nodule development. We generated comprehensive transcriptome data by mapping the reads to the A. hypogaea genome and two diploid progenitor genomes. Additionally, we performed BLAST searches to identify nodule-induced yet-to-be annotated peanut genes. Comparison between peanut, Medicago truncatula, Lotus japonicus, and Glycine max showed upregulation of 61 peanut orthologs among 111 tested known RNS-related genes, indicating conservation in mechanisms of nodule development among members of the Papilionoid family. Unlike model legumes, recruitment of class 1 phytoglobin-derived symbiotic hemoglobin (SymH) in peanut indicates diversification of oxygen-scavenging mechanisms in the Papilionoid family. Finally, the absence of cysteine-rich motif-1-containing nodule-specific cysteine-rich peptide (NCR) genes but the recruitment of defensin-like NCRs suggest a diverse molecular mechanism of terminal bacteroid differentiation. In summary, our work describes genetic conservation and diversification in legume-rhizobia symbiosis in the Papilionoid family, as well as among members of the Dalbergoid legumes. Copyright © 2022 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
Subjects
Arachis, Medicago truncatula, Arachis/genetics, Arachis/microbiology, Cell Differentiation, Medicago truncatula/microbiology, Nitrogen Fixation/genetics, Plant Root Nodules/microbiology, Symbiosis/genetics, Transcriptome/genetics
ABSTRACT
Aspergillus flavus and Aspergillus parasiticus produce carcinogenic aflatoxins during crop infection, with extensive variations in production among isolates, ranging from atoxigenic to highly toxigenic. Here, we report draft genome sequences of one A. parasiticus isolate and nine A. flavus isolates from field environments for use in comparative, functional, and phylogenetic studies.
ABSTRACT
Efforts in genome sequencing in the Aspergillus genus have led to the development of quality reference genomes for several important species, including A. nidulans, A. fumigatus, and A. oryzae. However, less progress has been made for A. flavus. As part of the effort of the USDA-ARS Annual Aflatoxin Workshop Fungal Genome Project, the isolate NRRL3357 was sequenced, resulting in a scaffold-level genome released in 2005. Our goal has been biologically driven, focusing on two areas: isolate variation in aflatoxin production and drought stress exacerbating aflatoxin production by A. flavus. Therefore, we developed two reference pseudomolecule genome assemblies derived from chromosome arms for two isolates: AF13, a MAT1-2, highly stress-tolerant, and highly aflatoxigenic isolate; and NRRL3357, a MAT1-1, less stress-tolerant, and moderate aflatoxin producer in comparison to AF13. Here, we report these two reference-grade assemblies, produced through a combination of PacBio long-read sequencing and optical mapping, and couple them with comparative, functional, and phylogenetic analyses. This analysis resulted in the identification of 153 and 45 unique genes in AF13 and NRRL3357, respectively. We also confirmed the presence of a unique 310 kb insertion in AF13 containing 60 genes. Analysis of this insertion revealed the presence of a bZIP transcription factor, named atfC, which may contribute to isolate pathogenicity and stress tolerance. Phylogenomic analyses comparing these and other available assemblies also suggest that the species complex of A. flavus is polyphyletic.
Subjects
Aflatoxins, Aspergillus flavus, Aspergillus flavus/genetics, Base Sequence, Fungal Genome, Phylogeny
ABSTRACT
Like many other crops, the cultivated peanut (Arachis hypogaea L.) is of hybrid origin and has a polyploid genome that contains essentially complete sets of chromosomes from two ancestral species. Here we report the genome sequence of peanut and show that after its polyploid origin, the genome has evolved through mobile-element activity, deletions and by the flow of genetic information between corresponding ancestral chromosomes (that is, homeologous recombination). Uniformity of patterns of homeologous recombination at the ends of chromosomes favors a single origin for cultivated peanut and its wild counterpart A. monticola. However, through much of the genome, homeologous recombination has created diversity. Using new polyploid hybrids made from the ancestral species, we show how this can generate phenotypic changes such as spontaneous changes in the color of the flowers. We suggest that diversity generated by these genetic mechanisms helped to favor the domestication of the polyploid A. hypogaea over other diploid Arachis species cultivated by humans.
Subjects
Arachis/genetics, Arachis/classification, Argentina, Plant Chromosomes/genetics, Agricultural Crops/genetics, DNA Methylation, Plant DNA/genetics, Domestication, Molecular Evolution, Plant Gene Expression Regulation, Genetic Variation, Plant Genome, Genetic Hybridization, Phenotype, Polyploidy, Genetic Recombination, Species Specificity, Tetraploidy
ABSTRACT
Single nucleotide polymorphisms (SNPs) have many advantages as molecular markers since they are ubiquitous and codominant. However, the discovery of true SNPs in polyploid species is difficult. Peanut (Arachis hypogaea L.) is an allopolyploid, which has a very low rate of true SNP calling. A large set of true and false SNPs identified from the Axiom_Arachis 58K array was leveraged to train machine-learning models to enable identification of true SNPs directly from sequence data and so reduce ascertainment bias. These models achieved accuracy rates above 80% using real peanut RNA sequencing (RNA-seq) and whole-genome shotgun (WGS) resequencing data, which is higher than previously reported for polyploids and at least a twofold improvement for peanut. A 48K SNP array, Axiom_Arachis2, was designed using this approach, resulting in 75% accuracy of calling SNPs from different tetraploid peanut genotypes. Using the method to simulate SNP variation in several polyploids, models achieved >98% accuracy in selecting true SNPs. Additionally, models built with simulated genotypes were able to select true SNPs at >80% accuracy using real peanut data. This work accomplished the objective of creating an effective approach for calling highly reliable SNPs from polyploids using machine learning. A novel tool, designated SNP machine learning (SNP-ML), was developed for predicting true SNPs from sequence data using the described models. SNP-ML additionally provides functionality, designated SNP machine learner (SNP-MLer), to train new models not included in this study for customized use. SNP-ML is publicly available.
Subjects
Arachis/genetics, Machine Learning, Single Nucleotide Polymorphism, Datasets as Topic, Genetic Models, Polyploidy
ABSTRACT
Accurate identification of polymorphisms from sequence data is crucial to unlocking the potential of high-throughput sequencing for genomics. Single nucleotide polymorphisms (SNPs) are difficult to identify accurately in polyploid crops because the duplicative nature of polyploid genomes leads to low confidence in the true alignment of short reads. Implementing a haplotype-based method that contrasts subgenome-specific sequences leads to higher accuracy of SNP identification in polyploids. To test this method, a large-scale 48K SNP array (Axiom_Arachis2) was developed for Arachis hypogaea (peanut), an allotetraploid, in which 1,674 haplotype-based SNPs were included. Results of the array show that 74% of the haplotype-based SNP markers could be validated, which is considerably higher than with previous methods used for peanut. The haplotype method has been implemented in a standalone program, HAPLOSWEEP, which takes BAM files and a VCF file as input and identifies haplotype-based markers. Haplotype discovery can be made within single reads or span paired reads, and can leverage long-read technology by targeting any length of haplotype. Haplotype-based genotyping is applicable in all allopolyploid genomes and provides confidence in marker identification and in silico-based genotyping for polyploid genomics.
ABSTRACT
Late leaf spot (LLS; Cercosporidium personatum) is a major fungal disease of cultivated peanut (Arachis hypogaea). A recombinant inbred line population segregating for quantitative field resistance was used to identify quantitative trait loci (QTL) using QTL-seq. High rates of false positive SNP calls using established methods in this allotetraploid crop obscured significant QTLs. To resolve this problem, robust parental SNPs were first identified using polyploid-specific SNP identification pipelines, leading to discovery of significant QTLs for LLS resistance. These QTLs were confirmed over 4 years of field data. Selection with markers linked to these QTLs resulted in a significant increase in resistance, showing that these markers can be immediately applied in breeding programs. This study demonstrates that QTL-seq can be used to rapidly identify QTLs controlling highly quantitative traits in polyploid crops with complex genomes. Markers identified can then be deployed in breeding programs, increasing the efficiency of selection using molecular tools. Key Message: Field resistance to late leaf spot is a quantitative trait controlled by many QTLs. Using polyploid-specific methods, QTL-seq is faster and more cost effective than QTL mapping.
ABSTRACT
Pre-harvest aflatoxin contamination (PAC) is a major problem facing peanut production worldwide. Produced by the ubiquitous soil fungus Aspergillus flavus, aflatoxin is the most potent naturally occurring carcinogen known. The interaction between fungus and host resulting in PAC is complex, and breeding for PAC resistance has been slow. It has been shown that aflatoxin production can be induced by applying drought stress as peanut seeds mature. We have implemented an automated rainout shelter that controls temperature and moisture in the root and peg zone to induce aflatoxin production. Using polymerase chain reaction (PCR) and high performance liquid chromatography (HPLC), seeds meeting one of the following conditions were selected: (1) infected with Aspergillus flavus and contaminated with aflatoxin, or (2) not contaminated with aflatoxin. RNA sequencing analysis revealed groups of genes that describe the transcriptional state of contaminated vs. uncontaminated seed. These data suggest that fatty acid biosynthesis and abscisic acid (ABA) signaling are altered in contaminated seeds and point to a potential susceptibility factor, ABR1, as a repressor of ABA signaling that may play a role in permitting PAC.
Subjects
Aflatoxins/analysis, Arachis, Aspergillus flavus/physiology, Seeds, Abscisic Acid/biosynthesis, Arachis/chemistry, Arachis/genetics, Arachis/microbiology, High Pressure Liquid Chromatography, Plant DNA/genetics, Fatty Acids/biosynthesis, Food Contamination, Host-Pathogen Interactions, Plant Diseases/genetics, Plant Diseases/microbiology, Polymerase Chain Reaction, Single Nucleotide Polymorphism, Untranslated RNA/genetics, Seeds/chemistry, Seeds/genetics, Seeds/microbiology, RNA Sequence Analysis
ABSTRACT
Pod-filling is an important stage of peanut (Arachis hypogaea) seed development. It is partially controlled by genetic factors, as cultivars vary considerably in pod-filling potential. Here, a study was done to detect changes in mRNA levels that accompany pod-filling processes. Four seed developmental stages were sampled from two peanut genotypes differing in their oil content and pod-filling potential. Transcriptome data were generated by RNA-Seq and explored with respect to genic and subgenomic patterns of expression. Very dynamic transcriptomic changes occurred during seed development in both genotypes. Yet, generally higher expression rates of transcripts and an enrichment in processes involving "energy generation" and "primary metabolites" were observed in the genotype with the better pod-filling ("Hanoch"). A dataset of 584 oil-related genes was assembled and analyzed, revealing several lipid metabolic processes that are highly expressed in Hanoch, including oil storage and FA synthesis/elongation. Homoeolog-specific gene expression analysis revealed that both subgenomes contribute to oil gene expression. Yet, biases were observed in particular parts of the pathway with possible biological meaning, presumably explaining the genotypic variation in oil biosynthesis and pod-filling. This study provides baseline information and a resource that may be used to understand development and oil biosynthesis in peanut seeds.
Subjects
Arachis/growth & development, Plant Oils/metabolism, Seeds/growth & development, Arachis/genetics, Arachis/metabolism, Gene Expression Profiling, Plant Gene Expression Regulation/genetics, Plant Gene Expression Regulation/physiology, High-Throughput Nucleotide Sequencing, Metabolic Networks and Pathways/genetics, Metabolic Networks and Pathways/physiology, Peanut Oil, Polymerase Chain Reaction, Seeds/genetics, Seeds/metabolism
ABSTRACT
High-throughput next-generation sequence-based genotyping and single nucleotide polymorphism (SNP) detection open the door for emerging genomics-based breeding strategies such as genome-wide association analysis and genomic selection. In polyploids, SNP detection is confounded by highly similar homeologous sequences, where a polymorphism between subgenomes must be differentiated from a SNP. We have developed and implemented a novel tool called SWEEP: Sliding Window Extraction of Explicit Polymorphisms. SWEEP uses subgenome polymorphism haplotypes as contrast to identify true SNPs between genotypes. The tool is a single-command script that calls a series of modules based on user-defined options and takes sorted/indexed BAM files or VCF files as input. Filtering options are highly flexible and include filtering based on sequence depth, alternate allele ratio, and SNP quality on top of the SWEEP filtering procedure. Using real and simulated data, we show that SWEEP outperforms current SNP filtering methods for polyploids. SWEEP can be used for high-quality SNP discovery in polyploid crops.
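The three generic filters named in this abstract (sequence depth, alternate allele ratio, and SNP quality) can be sketched as simple threshold checks on each candidate variant. The code below is only an illustration of that kind of filtering: it does not reproduce the haplotype-contrast step of SWEEP, and the `Variant` record, the function name, the chromosome labels, and all threshold values are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    chrom: str
    pos: int
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting the alternate allele
    qual: float     # caller-reported SNP quality


def passes_filters(
    v: Variant,
    min_depth: int = 10,         # assumed threshold
    min_alt_ratio: float = 0.2,  # assumed threshold
    min_qual: float = 30.0,      # assumed threshold
) -> bool:
    """Return True if the variant clears all three thresholds."""
    if v.depth <= 0 or v.depth < min_depth:
        return False
    if v.qual < min_qual:
        return False
    return v.alt_reads / v.depth >= min_alt_ratio


# Hypothetical candidate calls.
variants: List[Variant] = [
    Variant("A01", 100, depth=25, alt_reads=12, qual=60.0),  # passes
    Variant("A01", 250, depth=4, alt_reads=2, qual=80.0),    # too shallow
    Variant("B01", 75, depth=40, alt_reads=3, qual=55.0),    # alt ratio 0.075
    Variant("B01", 900, depth=30, alt_reads=15, qual=12.0),  # low quality
]
kept = [v for v in variants if passes_filters(v)]
kept_positions = [(v.chrom, v.pos) for v in kept]
```

In a polyploid, thresholds like these alone cannot separate homeologous polymorphisms from true SNPs, which is why the abstract describes them as options applied on top of the haplotype-based procedure.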
Subjects
Agricultural Crops/genetics, Genomics/methods, Single Nucleotide Polymorphism, Polyploidy, Software, Alleles, Gene Expression Profiling/methods, Genotype, Transcriptome
ABSTRACT
Understanding the relationship between genotype and phenotype is a major biological question, and being able to predict phenotypes based on molecular genotypes is integral to molecular breeding. Whole-genome duplications have shaped the history of all flowering plants and present challenges to elucidating the relationship between genotype and phenotype, especially in neopolyploid species. Although single nucleotide polymorphisms (SNPs) have become popular tools for genetic mapping, discovery and application of SNPs in polyploids have been difficult. Here, we summarize common experimental approaches to SNP calling, highlighting recent polyploid successes. To examine the impact of software choice on these analyses, we called SNPs among five peanut genotypes using different alignment programs (BWA-MEM and Bowtie 2) and variant callers (SAMtools, GATK, and FreeBayes). Alignments produced by Bowtie 2 and BWA-MEM and analyzed in SAMtools shared 24.5% concordant SNPs, and SAMtools, GATK, and FreeBayes shared 1.4% concordant SNPs. A subsequent analysis of simulated Brassica napus chromosome 1A and 1C genotypes demonstrated that, of the three software programs, SAMtools performed with the highest sensitivity and specificity on Bowtie 2 alignments. These results, however, are likely to vary among species, and we therefore propose a series of best practices for SNP calling in polyploids.
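The comparison metrics mentioned in this abstract can be illustrated with set operations on SNP positions. The sketch below makes two assumptions that the abstract does not state: concordance is taken as the number of SNPs shared by all call sets divided by the number found by any of them, and sensitivity and specificity are computed against a known truth set over a fixed list of evaluated sites, as is possible with simulated data. The function names and toy positions are hypothetical.

```python
from typing import Iterable, Set, Tuple

Site = Tuple[str, int]  # (chromosome, position)


def concordance(*call_sets: Iterable[Site]) -> float:
    """Fraction of sites called by every set, out of sites called by any set."""
    sets = [set(s) for s in call_sets]
    if not sets:
        return 0.0
    union = set().union(*sets)
    shared = set.intersection(*sets)
    return len(shared) / len(union) if union else 0.0


def sensitivity_specificity(
    called: Set[Site], truth: Set[Site], all_sites: Set[Site]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = len(called & truth)
    fn = len(truth - called)
    fp = len(called - truth)
    tn = len(all_sites - called - truth)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data on a simulated chromosome "1A".
caller_a = {("1A", 10), ("1A", 20), ("1A", 30)}
caller_b = {("1A", 20), ("1A", 30), ("1A", 40)}
truth = {("1A", 20), ("1A", 30), ("1A", 50)}
all_sites = {("1A", p) for p in range(10, 101, 10)}  # ten evaluated sites

shared_fraction = concordance(caller_a, caller_b)  # 2 shared of 4 total
sens, spec = sensitivity_specificity(caller_a, truth, all_sites)
# caller_a: TP = 2, FN = 1, FP = 1, TN = 6
```

Because the shared fraction shrinks with every additional call set, concordance across three callers is expected to be lower than across two, which is consistent with the direction of the figures reported above.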