ABSTRACT
Einkorn (Triticum monococcum) was the first domesticated wheat species, and was central to the birth of agriculture and the Neolithic Revolution in the Fertile Crescent around 10,000 years ago [1,2]. Here we generate and analyse 5.2-Gb genome assemblies for wild and domesticated einkorn, including completely assembled centromeres. Einkorn centromeres are highly dynamic, showing evidence of ancient and recent centromere shifts caused by structural rearrangements. Whole-genome sequencing analysis of a diversity panel uncovered the population structure and evolutionary history of einkorn, revealing complex patterns of hybridizations and introgressions after the dispersal of domesticated einkorn from the Fertile Crescent. We also show that around 1% of the modern bread wheat (Triticum aestivum) A subgenome originates from einkorn. These resources and findings highlight the history of einkorn evolution and provide a basis to accelerate the genomics-assisted improvement of einkorn and bread wheat.
Subjects
Crop Production , Plant Genome , Genomics , Triticum , Triticum/classification , Triticum/genetics , Crop Production/history , Ancient History , Whole Genome Sequencing , Genetic Introgression , Genetic Hybridization , Bread/history , Plant Genome/genetics , Centromere/genetics
ABSTRACT
The current limitations in genome sequencing technology require the construction of physical maps for high-quality draft sequences of large plant genomes, such as that of Aegilops tauschii, the wheat D-genome progenitor. To construct a physical map of the Ae. tauschii genome, we fingerprinted 461,706 bacterial artificial chromosome clones, assembled contigs, designed a 10K Ae. tauschii Infinium SNP array, constructed a 7,185-marker genetic map, and anchored on the map contigs totaling 4.03 Gb. Using whole genome shotgun reads, we extended the SNP marker sequences and found 17,093 genes and gene fragments. We showed that collinearity of the Ae. tauschii genes with Brachypodium distachyon, rice, and sorghum decreased with phylogenetic distance and that structural genome evolution rates have been high across all investigated lineages in subfamily Pooideae, including that of Brachypodieae. We obtained additional information about the evolution of the seven Triticeae chromosomes from 12 ancestral chromosomes and uncovered a pattern of centromere inactivation accompanying nested chromosome insertions in grasses. We showed that the density of noncollinear genes along the Ae. tauschii chromosomes positively correlates with recombination rates, suggested a cause, and showed that new genes, exemplified by disease resistance genes, are preferentially located in high-recombination chromosome regions.
Subjects
Contig Mapping , Plant Genome , Poaceae/genetics , Centromere/ultrastructure , Bacterial Artificial Chromosomes , Plant Chromosomes/ultrastructure , Molecular Evolution , Plant Genes , Genetic Markers , Single Nucleotide Polymorphism , Genetic Recombination , DNA Sequence Analysis , Triticum/genetics
ABSTRACT
BACKGROUND: The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high-resolution genome maps with saturated marker scaffolds to anchor and orient BAC contigs/sequence scaffolds for whole-genome assembly. Radiation hybrid (RH) mapping has proven to be an excellent tool for the development of such maps because it offers much higher and more uniform marker resolution across the length of the chromosome than genetic mapping and does not require marker polymorphism per se, as it is based on a presence (retention) vs. absence (deletion) marker assay. METHODS: In this study, a 178-line RH panel was genotyped with SSR and DArT markers to develop the first high-resolution RH maps of the entire D-genome of Ae. tauschii accession AL8/78. To confirm map order accuracy, the AL8/78-RH maps were compared with: 1) a DArT consensus genetic map constructed using more than 100 bi-parental populations, 2) an RH map of the D-genome of the reference hexaploid wheat 'Chinese Spring', and 3) two SNP-based genetic maps, one with anchored D-genome BAC contigs and another with anchored D-genome sequence scaffolds. Using marker sequences, the RH maps were also anchored to a BAC contig-based physical map and draft sequence of the D-genome of Ae. tauschii. RESULTS: A total of 609 markers were mapped to 503 unique positions on the seven D-genome chromosomes, with a total map length of 14,706.7 cR. The average distance between any two marker loci was 29.2 cR, which corresponds to 2.1 cM or 9.8 Mb. The average mapping resolution across the D-genome was estimated to be 0.34 Mb/cR or 0.07 cM/cR. The RH maps showed almost perfect agreement with several published maps with regard to chromosome assignments of markers. The mean rank correlations between the positions of markers on the AL8/78 maps and the four published maps ranged from 0.75 to 0.92, suggesting good agreement in marker order.
With 609 mapped markers, a total of 2481 deletions for the whole D-genome were detected with an average deletion size of 42.0 Mb. A total of 520 markers were anchored to 216 Ae. tauschii sequence scaffolds, 116 of which were not anchored earlier to the D-genome. CONCLUSION: This study reports the development of the first high-resolution RH maps for the D-genome of Ae. tauschii accession AL8/78, which were then used to anchor unassigned sequence scaffolds. This study demonstrates how RH mapping, which offered high and uniform resolution across the length of the chromosome, can facilitate the complete sequence assembly of large and complex plant genomes.
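The summary figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes (the abstract does not say so explicitly) that the average inter-marker distance is the total map length divided by the number of unique map positions; all variable and function names are illustrative only.

```python
# Figures quoted in the abstract.
TOTAL_MAP_LENGTH_CR = 14706.7  # total RH map length, in centiRays (cR)
UNIQUE_POSITIONS = 503         # unique map positions across the seven chromosomes
MB_PER_CR = 0.34               # reported average resolution, Mb per cR
CM_PER_CR = 0.07               # reported average resolution, cM per cR


def mean_spacing(total_length, n_positions):
    """Average distance between positions (assumed: length / positions)."""
    return total_length / n_positions


spacing_cr = mean_spacing(TOTAL_MAP_LENGTH_CR, UNIQUE_POSITIONS)
spacing_mb = spacing_cr * MB_PER_CR  # convert cR to Mb
spacing_cm = spacing_cr * CM_PER_CR  # convert cR to cM

print(f"mean spacing: {spacing_cr:.1f} cR")
print(f"approx. physical distance: {spacing_mb:.1f} Mb")
print(f"approx. genetic distance: {spacing_cm:.1f} cM")
```

Under that assumption the mean spacing comes out near 29.2 cR, and the conversions give roughly 9.9 Mb and 2.0 cM, close to the 9.8 Mb and 2.1 cM quoted; the small gaps are plausibly due to rounding of the per-cR resolutions.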
Subjects
Plant Genome , Poaceae/genetics , Radiation Hybrid Mapping/methods , Chromosome Mapping , Plant Chromosomes/genetics , Genotype
ABSTRACT
BACKGROUND: Mapping and map-based cloning of genes that control agriculturally and economically important traits remain great challenges for plants with complex, highly repetitive genomes such as those within the grass tribe Triticeae. Mapping limitations in the Triticeae are primarily due to low frequencies of polymorphic gene markers and poor genetic recombination in certain genetic regions. Although abundant repetitive sequences may pose common problems in genome analysis and sequence assembly of large and complex genomes, they provide repeat junction markers with random and unbiased distribution throughout chromosomes. Hence, development of a high-throughput mapping technology that combines both gene-based and repeat junction-based markers is needed to generate maps that have better coverage of the entire genome. RESULTS: In this study, the available genomics resources of the diploid Aegilops tauschii, the D genome donor of bread wheat, were used to develop genome-specific markers that can be applied for mapping in modern hexaploid wheat. A NimbleGen array containing both gene-based and repeat junction probe sequences derived from Ae. tauschii was developed and used to map the Chinese Spring nullisomic-tetrasomic lines and deletion bin lines of the D genome chromosomes. Based on these mapping data, we have now anchored 5,171 repeat junction probes and 10,892 gene probes, corresponding to 5,070 gene markers, to the delineated deletion bins of the D genome. The order of the gene-based markers within the deletion bins of Chinese Spring can be inferred based on their positions on the Ae. tauschii genetic map. Analysis of the probe sequences against the Chinese Spring chromosome sequence assembly database facilitated mapping of the NimbleGen probes to the sequence contigs and allowed assignment or ordering of these sequence contigs within the deletion bins.
The accumulated length of anchored sequence contigs is about 155 Mb, representing ~3.2% of the D genome. A specific database was developed to allow users to search or BLAST against the probe sequence information and to directly download PCR primers for mapping specific genetic loci. CONCLUSIONS: In bread wheat, aneuploid stocks have been extensively used to assign markers linked with genes/traits to chromosomes, chromosome arms, and their specific bins. Through this study, we added thousands of markers to the existing wheat chromosome bin map, representing a significant step forward in providing a resource to navigate the wheat genome. The database website (http://probes.pw.usda.gov/ATRJM/) provides easy access and efficient utilization of the data. The resources developed herein can aid map-based cloning of traits of interest and the sequencing of the D genome of hexaploid wheat.
Subjects
Diploidy , Genetic Markers , Plant Genome , Polyploidy , Triticum/genetics , Chromosome Mapping , Plant Chromosomes , DNA Probes , Molecular Evolution , Genomics/methods , Repetitive Nucleic Acid Sequences , Reproducibility of Results , Sequence Deletion
ABSTRACT
Gene families often show degrees of differences in terms of exon-intron structures depending on their distinct evolutionary histories. Comparative analysis of gene structures is important for understanding their evolutionary and functional relationships within plant species. Here, we present a comparative genomics database named PIECE (http://wheat.pw.usda.gov/piece) for Plant Intron and Exon Comparison and Evolution studies. The database contains all the annotated genes extracted from 25 sequenced plant genomes. These genes were classified based on Pfam motifs. Phylogenetic trees were pre-constructed for each gene category. PIECE provides a user-friendly interface for different types of searches and a graphical viewer for displaying a gene structure pattern diagram linked to the resulting bootstrapped dendrogram for each gene family. The gene structure evolution of orthologous gene groups was determined using the GLOOME, Exalign and GECA software programs that can be accessed within the database. PIECE also provides a web server version of the software, GSDraw, for drawing schematic diagrams of gene structures. PIECE is a powerful tool for comparing gene sequences and provides valuable insights into the evolution of gene structure in plant genomes.
Subjects
Genetic Databases , Molecular Evolution , Exons , Plant Genes , Introns , Plant Genome , Internet , Multigene Family , Phylogeny , Plant Proteins/chemistry , Plant Proteins/genetics , Sequence Alignment
ABSTRACT
In the course of evolution, the genomes of grasses have maintained an observable degree of gene order conservation. The information available for already sequenced genomes can be used to predict the gene order of nonsequenced species by means of comparative colinearity studies. The "Wheat Zapper" application presented here performs on-demand colinearity analysis between wheat, rice, Sorghum, and Brachypodium in a simple, time-efficient, and flexible manner. This application was specifically designed to provide plant scientists with a set of tools comprising not only synteny inference but also automated primer design, intron/exon boundary prediction, visual representation using the graphic tool Circos 0.53, and the possibility of downloading FASTA sequences for downstream applications. The quality of the "Wheat Zapper" prediction was confirmed against the genome of maize, with good correlation (r > 0.83) observed between the gene order predicted on the basis of synteny and the genes' actual positions in the genome. Further, the accuracy of "Wheat Zapper" was calculated at 0.65, taking the "Genome Zipper" application as the "gold" standard. The differences between these two tools are amply discussed, making the point that "Wheat Zapper" is an accurate and reliable on-demand tool that is sure to benefit the cereal scientific community. The Wheat Zapper is available at http://wge.ndsu.nodak.edu/wheatzapper/.
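A comparison between a predicted gene order and actual genome positions can be illustrated with a rank correlation. The abstract does not state which correlation coefficient was used, so the sketch below uses Spearman's rank correlation purely as an assumption; the function names and the toy data are hypothetical and are not taken from the Wheat Zapper tool. The formula used is only valid when there are no tied values.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Toy data: predicted order of five genes and their (made-up) actual positions.
predicted_order = [1, 2, 3, 4, 5]
actual_position = [10, 30, 20, 40, 50]  # genes 2 and 3 are swapped

rho = spearman(predicted_order, actual_position)
print(f"rank correlation: {rho:.2f}")
```

With a single adjacent swap among five genes the coefficient stays high, which is why a rank-based measure is a natural fit for judging how well a predicted order matches an observed one.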
Subjects
Plant Genome , Poaceae/genetics , Software , Synteny
ABSTRACT
Diploid A-genome wheat (einkorn wheat) presents a nutrition-rich option as an ancient grain crop and a resource for the improvement of bread wheat against abiotic and biotic stresses. Realizing the importance of this wheat species, reference-level assemblies of two einkorn wheat accessions were generated (wild and domesticated). This work reports an einkorn genome database that provides an interface to the cereals research community to perform comparative genomics, applied genetics and breeding research. It features queries for annotated genes, the use of a recent genome browser release, and the ability to search for sequence alignments using a modern BLAST interface. Other features include a comparison of reference einkorn assemblies with other wheat cultivars through genomic synteny visualization and an alignment visualization tool for BLAST results. Altogether, this resource will help wheat research and breeding. Database URL https://wheat.pw.usda.gov/GG3/pangenome.
Subjects
Plant Genome , Triticum , Triticum/genetics , Plant Genome/genetics , Plant Breeding , Genomics/methods , Genetic Association Studies
ABSTRACT
BACKGROUND: Development of a high-quality reference sequence is a daunting task in crops like wheat, with a large (~17 Gb), highly repetitive (>80%) and polyploid genome. To achieve complete sequence assembly of such genomes, development of a high-quality physical map is a necessary first step. However, due to the lack of recombination in certain regions of the chromosomes, genetic mapping, which uses recombination frequency to map marker loci, is not sufficient on its own to develop high-quality marker scaffolds for a sequence-ready physical map. Radiation hybrid (RH) mapping, which uses radiation-induced chromosomal breaks, has proven to be a successful approach for developing marker scaffolds for sequence assembly in animal systems. Here, the development and characterization of an RH panel for mapping the D-genome of the wheat progenitor Aegilops tauschii is reported. RESULTS: Radiation dosages of 350 and 450 Gy were optimized for seed irradiation of a synthetic hexaploid (AABBDD) wheat with the D-genome of Ae. tauschii accession AL8/78. The surviving plants after irradiation were crossed to durum wheat (AABB) to produce pentaploid RH1s (AABBD), which allows the simultaneous mapping of the whole D-genome. A panel of 1,510 RH1 plants was obtained, of which 592 plants were generated from the mature RH1 seeds, and 918 plants were rescued through embryo culture due to poor germination (<3%) of mature RH1 seeds. This panel showed a homogeneous marker loss (2.1%) after screening with SSR markers uniformly covering all the D-genome chromosomes. Different marker systems mostly detected different lines with deletions. Using markers covering known distances, the mapping resolution of this RH panel was estimated to be <140 kb. Analysis of only 16 RH lines carrying deletions on chromosome 2D resulted in a physical map with a cM/cR ratio of 1:5.2 and 15 distinct bins. Additionally, with this small set of lines, almost all the tested ESTs could be mapped.
A set of the 399 most informative RH lines, with an average deletion frequency of ~10%, was identified for developing high-density marker scaffolds of the D-genome. CONCLUSIONS: The RH panel reported here is the first developed for any wild ancestor of a major cultivated plant species. The results provided insight into various aspects of RH mapping in plants, including the genetically effective cell number for wheat (for the first time) and the potential implementation of this technique in other plant species. This RH panel will be an invaluable resource for mapping gene-based markers, developing a complete marker scaffold for the whole-genome sequence assembly, fine mapping of markers and functional characterization of genes and gene networks present on the D-genome.
Subjects
Plant Genome/genetics , Poaceae/genetics , Radiation Hybrid Mapping/methods , Genetic Crosses , Triticum/genetics
ABSTRACT
Allotetraploid (2n = 4x = 28) Leymus triticoides and Leymus cinereus are divergent perennial grasses, which form fertile hybrids. Genetic maps with n = 14 linkage groups (LG) comprising 1,583 AFLP and 67 heterologous anchor markers were previously used for mapping quantitative trait loci (QTLs) in these hybrids, and chromosomes of other Leymus wildryes have been transferred to wheat. However, identification of the x = 7 homoeologous groups was tenuous and genetic research has been encumbered by a lack of functional, conserved gene marker sequences. Herein, we mapped 350 simple sequence repeats and 26 putative lignin biosynthesis genes from a new Leymus EST library and constructed one integrated consensus map with 799 markers, including 375 AFLPs and 48 heterologous markers, spanning 2,381 centiMorgans. LG1b and LG6b were reassigned as LG6b* and LG1b*, respectively, and LG4Ns and LG4Xm were inverted so that all 14 linkage groups are aligned to the x = 7 Triticeae chromosomes based on EST alignments to barley and other reference genomes. Amplification of 146 mapped Leymus ESTs representing six of the seven homoeologous groups was shown for 17 wheat-Leymus chromosome introgression lines. Reciprocal translocations between 4L and 5L in both Leymus and Triticum monococcum were aligned to the same regions of Brachypodium chromosome 1. A caffeic acid O-methyltransferase locus aligned to fiber QTL peaks on Leymus LG7a and brown midrib mutations of maize and sorghum. Glaucousness genes on Leymus and wheat chromosome 2 were aligned to the same region of Brachypodium chromosome 5. Markers linked to the S self-incompatibility gene on Leymus LG1a cosegregated with markers on LG2b, possibly cross-linked by gametophytic selection. Homoeologous chromosomes 1 and 2 harbor the S and Z gametophytic self-incompatibility genes of Phalaris, Secale, and Lolium, but the Leymus chromosome-2 self-incompatibility gene aligns to a different region on Brachypodium chromosome 5.
Nevertheless, cosegregation of self-incompatibility genes on Leymus presents a powerful system for mapping these loci.
Subjects
Expressed Sequence Tags , Plant Genes , Genetic Hybridization/genetics , Genetic Translocation , Triticum/genetics , Chromosome Mapping , Plant Chromosomes , Dietary Fiber/metabolism , Genetic Linkage , Lignin/biosynthesis , Phenotype
ABSTRACT
Transposable elements (TE) exist in the genomes of nearly all eukaryotes. TE mobilization through 'cut-and-paste' or 'copy-and-paste' mechanisms causes their insertions into other repetitive sequences, gene loci and other DNA. An insertion of a TE commonly creates a unique TE junction in the genome. TE junctions are also randomly distributed along chromosomes and therefore useful for genome-wide marker development. Several TE-based marker systems have been developed and applied to genetic diversity assays, and to genetic and physical mapping. A software tool 'RJPrimers' reported here allows for accurate identification of unique repeat junctions using BLASTN against annotated repeat databases and a repeat junction finding algorithm, and then for fully automated high-throughput repeat junction-based primer design using Primer3 and BatchPrimer3. The software was tested using the rice genome and genomic sequences of Aegilops tauschii. Over 90% of repeat junction primers designed by RJPrimers were unique. At least one RJM marker per 10 Kb sequence of A. tauschii was expected with an estimate of over 0.45 million such markers in a genome of 4.02 Gb, providing an almost unlimited source of molecular markers for mapping large and complex genomes. A web-based server and a command line-based pipeline for RJPrimers are both available at http://wheat.pw.usda.gov/demos/RJPrimers/.
Subjects
DNA Primers/chemistry , Interspersed Repetitive Sequences , Polymerase Chain Reaction , Software , Genetic Markers , Plant Genome , Internet , Oryza/genetics , Poaceae/genetics
ABSTRACT
As one of the US Department of Agriculture-Agricultural Research Service flagship databases, GrainGenes (https://wheat.pw.usda.gov) serves the data and community needs of globally distributed small grains researchers for the genetic improvement of the Triticeae family and Avena species that include wheat, barley, rye and oat. GrainGenes accomplishes its mission by continually enriching its cross-linked data content following the findable, accessible, interoperable and reusable principles, enhancing and maintaining an intuitive web interface, creating tools to enable easy data access and establishing data connections within and between GrainGenes and other biological databases to facilitate knowledge discovery. GrainGenes operates within the biological database community, collaborates with curators and genome sequencing groups and contributes to the AgBioData Consortium and the International Wheat Initiative through the Wheat Information System (WheatIS). Interactive and linked content is paramount for successful biological databases and GrainGenes now has 2917 manually curated gene records, including 289 genes and 254 alleles from the Wheat Gene Catalogue (WGC). There are >4.8 million gene models in 51 genome browser assemblies, 6273 quantitative trait loci and >1.4 million genetic loci on 4756 genetic and physical maps contained within 443 mapping sets, complete with standardized metadata. Most notably, 50 new genome browsers that include outputs from the Wheat and Barley PanGenome projects have been created. We provide an example of an expression quantitative trait loci track on the International Wheat Genome Sequencing Consortium Chinese Spring wheat browser to demonstrate how genome browser tracks can be adapted for different data types. To help users benefit more from its data, GrainGenes created four tutorials available on YouTube. 
GrainGenes is executing its vision of service by continuously responding to the needs of the global small grains community by creating a centralized, long-term, interconnected data repository. Database URL: https://wheat.pw.usda.gov.
Subjects
Plant Genome , Hordeum , Avena/genetics , Chromosome Mapping , Genetic Databases , Plant Genome/genetics , Genomics , Hordeum/genetics , Quantitative Trait Loci , Triticum/genetics
ABSTRACT
BACKGROUND: Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST) information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. RESULTS: Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM) analysis. Of these, 52 (54%) were polymorphic between parents of the Ogle1040 × TAM O-301 (OT) mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. CONCLUSIONS: The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping. 
These techniques provide a model for SNP discovery and genotyping in other species with complex and poorly-characterized genomes.
Subjects
Avena/genetics , Plant Genome/genetics , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis/methods , Computational Biology , Expressed Sequence Tags , Genotype
ABSTRACT
The small annual grass Brachypodium distachyon (Brachypodium) is rapidly emerging as a powerful model system to study questions unique to the grasses. Many Brachypodium resources have been developed including a whole genome sequence, highly efficient transformation and a large germplasm collection. We developed a genetic linkage map of Brachypodium using single nucleotide polymorphism (SNP) markers and an F2 mapping population of 476 individuals. SNPs were identified by targeted resequencing of single copy genomic sequences. Using the Illumina GoldenGate Genotyping platform we placed 558 markers into five linkage groups corresponding to the five chromosomes of Brachypodium. The unusually long total genetic map length, 1,598 centiMorgans (cM), indicates that the Brachypodium mapping population has a high recombination rate. By comparing the genetic map to genome features we found that the recombination rate was positively correlated with gene density and negatively correlated with repetitive regions and sites of ancestral chromosome fusions that retained centromeric repeat sequences. A comparison of adjacent genome regions with high versus low recombination rates revealed a positive correlation between interspecific synteny and recombination rate.
Subjects
Brachypodium/genetics , Genetic Linkage , Plant Genome , Chromosome Mapping , Plant Chromosomes , Genetic Markers , Genotype , Single Nucleotide Polymorphism , Genetic Recombination , Repetitive Nucleic Acid Sequences , Sequence Alignment/methods
ABSTRACT
BACKGROUND: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. RESULTS: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. CONCLUSIONS: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. 
The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.
Subjects
Chromosome Mapping , Plant Chromosomes/genetics , Genetic Variation , Plant Genome/genetics , Nucleotides/genetics , Triticum/genetics , Codon/genetics , Genetic Databases , Expressed Sequence Tags , Gene Deletion , Plant Genes/genetics , Genetic Linkage , Genetic Loci/genetics , Haplotypes/genetics , Molecular Sequence Data , Single Nucleotide Polymorphism/genetics , Polyploidy
ABSTRACT
BACKGROUND: In some genomic applications it is necessary to design large numbers of PCR primers in exons flanking one or several introns on the basis of orthologous gene sequences in related species. The primer pairs designed by this target gene approach are called "intron-flanking primers" or, because they are located in exonic sequences that are usually conserved between related species, "conserved primers". They are useful for large-scale single nucleotide polymorphism (SNP) discovery and marker development, especially in species, such as wheat, for which a large number of ESTs are available but for which genome sequences and intron/exon boundaries are not available. To date, no suitable high-throughput tool is available for this purpose. RESULTS: We have developed the ConservedPrimers 2.0 pipeline for designing intron-flanking primers for large-scale SNP discovery and marker development, and demonstrated its utility in wheat. This tool uses non-redundant wheat EST sequences, such as wheat contigs and singleton ESTs, and related genomic sequences, such as those of rice, as inputs. It aligns the ESTs to the genomic sequences to identify unique colinear exon blocks and predicts intron lengths. Intron-flanking primers are then designed based on the intron/exon information using the Primer3 core program or BatchPrimer3. Finally, a tab-delimited file containing intron-flanking primer pair sequences and their primer properties is generated for primer ordering and PCR applications. Using this tool, 1,922 bin-mapped wheat ESTs (31.8% of the 6,045 in total) were found to have unique colinear exon blocks suitable for primer design, and 1,821 primer pairs were designed from these single- or low-copy genes for PCR amplification and SNP discovery. With these primers and subsequently designed genome-specific primers, a total of 1,527 loci were found to contain one or more genome-specific SNPs.
CONCLUSION: The ConservedPrimers 2.0 pipeline for designing intron-flanking primers was developed and its utility demonstrated. The tool can be used for SNP discovery, genetic variation assays and marker development for any target genome that has abundant ESTs and a related reference genome that has been fully sequenced. The ConservedPrimers 2.0 pipeline has been implemented as a command-line tool as well as a web application. Both versions are freely available at http://wheat.pw.usda.gov/demos/ConservedPrimers/.
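The step of turning colinear exon blocks into candidate intron-flanking regions can be sketched as follows. This is a minimal illustration under stated assumptions, not the ConservedPrimers 2.0 implementation: the `ExonBlock` structure, the `intron_flanking_pairs` function, and the `min_exon` and `max_intron` thresholds are all hypothetical, and the intron length is simply taken as the gap between adjacent blocks on the reference genome.

```python
from dataclasses import dataclass


@dataclass
class ExonBlock:
    """One aligned block: EST coordinates and reference-genome coordinates."""
    est_start: int
    est_end: int
    genome_start: int
    genome_end: int

    @property
    def length(self):
        return self.est_end - self.est_start + 1


def intron_flanking_pairs(blocks, min_exon=20, max_intron=1000):
    """Return (left, right, predicted_intron_length) for usable adjacent blocks.

    A pair is kept when the blocks are colinear on the genome, both exons
    are long enough to hold a primer, and the predicted intron is not too
    long to amplify. Both thresholds are illustrative assumptions.
    """
    ordered = sorted(blocks, key=lambda b: b.est_start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if right.genome_start <= left.genome_end:
            continue  # not colinear on the reference genome
        intron_length = right.genome_start - left.genome_end - 1
        if intron_length <= 0 or intron_length > max_intron:
            continue
        if left.length < min_exon or right.length < min_exon:
            continue
        pairs.append((left, right, intron_length))
    return pairs


# Toy alignment of one EST to a reference genome (made-up coordinates).
blocks = [
    ExonBlock(1, 100, 1000, 1099),
    ExonBlock(101, 250, 1500, 1649),
    ExonBlock(251, 260, 5000, 5009),  # short exon behind a long intron
]
candidates = intron_flanking_pairs(blocks)
for left, right, intron_length in candidates:
    print(left.est_end, right.est_start, intron_length)
```

In this toy case only the first two blocks qualify, with a predicted intron of 400 bp; a real pipeline would then hand the flanking exon sequences to a primer-design program such as Primer3.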
Subjects
Computational Biology/methods , DNA Primers/chemistry , Plant Genome , Introns/genetics , Polymerase Chain Reaction , Single Nucleotide Polymorphism , Triticum/genetics , Sequence Alignment
ABSTRACT
BACKGROUND: Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research it is necessary to develop genomic resources for it. A BAC-based physical map is one of them. A physical map will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence. RESULTS: A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequences revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene-tags, opening the door to Brachypodium-Triticeae comparative genomics. CONCLUSION: The construction of the Brachypodium physical map, and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of Brachypodium genome sequence and grass comparative genomics. 
A draft of the physical map and its comparisons with rice and wheat are available at http://phymap.ucdavis.edu/brachypodium/.
Subjects
Bacterial Artificial Chromosomes/genetics , Oryza/genetics , Physical Chromosome Mapping/methods , Poaceae/genetics , Triticum/genetics , Contig Mapping , DNA Fingerprinting , Edible Grain/genetics , Molecular Evolution , Expressed Sequence Tags/metabolism , Plant Genome/genetics
ABSTRACT
Brachypodium distachyon (Brachypodium) has been recently recognized as an emerging model system for both comparative and functional genomics in grass species. In this study, 55,221 repeat masked Brachypodium BAC end sequences (BES) were used for comparative analysis against the 12 rice pseudomolecules. The analysis revealed that approximately 26.4% of BES have significant matches with the rice genome and 82.4% of the matches were homologous to known genes. Further analysis of paired-end BES and approximately 1.0 Mb sequences from nine selected BACs proved to be useful in revealing conserved regions and regions that have undergone considerable genomic changes. Differential gene amplification, insertions/deletions and inversions appeared to be the common evolutionary events that caused variations of microcolinearity at different orthologous genomic regions. It was found that approximately 17% of genes in the two genomes are not colinear in the orthologous regions. Analysis of BAC sequences also revealed higher gene density (approximately 9 kb/gene) and lower repeat DNA content (approximately 13.1%) in Brachypodium when compared to the orthologous rice regions, consistent with the smaller size of the Brachypodium genome. The 119 annotated Brachypodium genes were BLASTN compared against the wheat EST database and deletion bin mapped wheat ESTs. About 77% of the genes retrieved significant matches in the EST database, while 9.2% matched to the bin mapped ESTs. In some cases, genes in single Brachypodium BACs matched to multiple ESTs that were mapped to the same deletion bins, suggesting that the Brachypodium genome will be useful for ordering wheat ESTs within the deletion bins and developing specific markers at targeted regions in the wheat genome.
Subjects
Plant Genome , Oryza/genetics , Poaceae/genetics , Synteny , Triticum/genetics , Bacterial Artificial Chromosomes , Conserved Sequence , Plant DNA/genetics , Molecular Evolution , Expressed Sequence Tags , Plant Genes , Genomics , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
A survey and analysis is made of all available omega-gliadin DNA sequences including omega-gliadin genes within a large genomic clone, previously reported gene sequences, and ESTs identified from the large wheat EST collection. A contiguous portion of the Gli-B3 locus is shown to contain two apparently active omega-gliadin genes, two pseudogenes, and four fragments of the 3' portion of omega-gliadin sequences. Comparison of omega-gliadin sequences allows a phylogenetic picture of their relationships and genomes of origin. Results show three groupings of omega-gliadin active gene sequences assigned to each of the three hexaploid wheat genomes, and a fourth group thus far consisting of pseudogenes assigned to the A-genome. Analysis of omega-gliadin ESTs allows reconstruction of two full-length model sequences encoding the AREL- and ARQL-type proteins from the Gli-A3 and Gli-D3 loci, respectively. There is no DNA evidence of multiple active genes from these two loci. In contrast, ESTs allow identification of at least three to four distinct active genes at the Gli-B3 locus of some cultivars. Additional results include more information on the position of cysteines in some omega-gliadin genes and discussion of problems in studying the omega-gliadin gene family.
Subjects
Expressed Sequence Tags , Gliadin/genetics , Triticum , Amino Acid Sequence , Base Sequence , Chromosome Mapping , Plant Genes , Gliadin/classification , Molecular Sequence Data , Phylogeny , Seeds/chemistry , Triticum/chemistry , Triticum/genetics
ABSTRACT
Databases have become an integral part of all aspects of biological research, including basic and applied plant biology. The importance of databases continues to increase as the volume of data from direct and indirect genomics approaches expands. What is not always obvious to users of databases is the range of available database resources, their access points, or some basic elements of database querying. This chapter briefly summarizes the history of data access via the Internet and reviews some basic terms and considerations in dealing with plant and crop databases. The reader is directed to some of the major publicly available Internet-accessible relevant databases with summaries of the major focuses of those databases, and several examples are given to illustrate how to access plant genomics data. Finally, an outline is given of some of the issues facing the future of plant and crop databases.
Subjects
Agricultural Crops/genetics , Genetic Databases , Plants/genetics , Computational Biology , Expressed Sequence Tags , Genetic Markers , Plant Genome , Genomics/statistics & numerical data , Internet
ABSTRACT
GrainGenes (https://wheat.pw.usda.gov or https://graingenes.org) is an international centralized repository for curated, peer-reviewed datasets useful to researchers working on wheat, barley, rye and oat. GrainGenes manages genomic, genetic, germplasm and phenotypic datasets through a dynamically generated web interface for facilitated data discovery. Since 1992, GrainGenes has served geneticists and breeders in both the public and private sectors on six continents. Recently, several new datasets were curated into the database along with new tools for analysis. The GrainGenes homepage was enhanced by making it more visually intuitive and by adding links to commonly used pages. Several genome assemblies and genomic tracks are displayed through the genome browsers at GrainGenes, including the Triticum aestivum (bread wheat) cv. 'Chinese Spring' IWGSC RefSeq v1.0 genome assembly, the Aegilops tauschii (D genome progenitor) Aet v4.0 genome assembly, the Triticum turgidum ssp. dicoccoides (wild emmer wheat) cv. 'Zavitan' WEWSeq v.1.0 genome assembly, a T. aestivum (bread wheat) pangenome, the Hordeum vulgare (barley) cv. 'Morex' IBSC genome assembly, the Secale cereale (rye) select 'Lo7' assembly, a partial hexaploid Avena sativa (oat) assembly and the Triticum durum cv. 'Svevo' (durum wheat) RefSeq Release 1.0 assembly. New genetic maps and markers were added and can be displayed through CMAP. Quantitative trait loci, genetic maps and genes from the Wheat Gene Catalogue are indexed and linked through the Wheat Information System (WheatIS) portal. Training videos were created to help users query and reach the data they need. GSP (Genome Specific Primers) and PIECE2 (Plant Intron Exon Comparison and Evolution) tools were implemented and are available to use. As more small grains reference sequences become available, GrainGenes will play an increasingly vital role in helping researchers improve crops.