ABSTRACT
Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated two-thirds of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant.
Subjects
Chromosomes, Artificial, Bacterial/genetics ; Genome, Plant/genetics ; Hordeum/genetics ; Molecular Sequence Data
ABSTRACT
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and are now in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as, in this case, gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
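The idea of pooling and deconvolution described above can be illustrated with a toy sketch. It is not the authors' pooling design: the function names `make_design` and `deconvolve`, the fixed number of pools per BAC, and the exact-match rule are all assumptions made for illustration. Each BAC is placed in a distinct combination of pools, and a read is assigned to the BAC whose pool combination matches the set of pools in which the read was observed.

```python
from itertools import combinations

def make_design(bacs, n_pools, pools_per_bac):
    """Assign each BAC a distinct set of pools (its signature).

    Hypothetical toy design: signatures are simply the first
    len(bacs) combinations of pool indices.
    """
    design = {}
    for bac, sig in zip(bacs, combinations(range(n_pools), pools_per_bac)):
        design[bac] = frozenset(sig)
    if len(design) < len(bacs):
        raise ValueError("not enough distinct signatures for all BACs")
    return design

def deconvolve(observed_pools, design):
    """Return the BAC whose signature equals the observed pool set, else None."""
    lookup = {sig: bac for bac, sig in design.items()}
    return lookup.get(frozenset(observed_pools))

# Three clones share four pools instead of needing three separate barcodes.
design = make_design(["bacA", "bacB", "bacC"], n_pools=4, pools_per_bac=2)
print(deconvolve([2, 0], design))
```

With 4 pools and 2 pools per clone, at most 6 clones can be told apart, which is why the number of pools can grow much more slowly than the number of clones. A practical design would also have to tolerate sequencing errors and reads shared by overlapping clones, which this sketch ignores.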
Subjects
Contig Mapping/methods ; Hordeum/genetics ; Sequence Analysis, DNA ; Chromosomes, Artificial, Bacterial ; Cloning, Molecular ; Computational Biology/methods ; Computer Simulation ; Genes, Plant ; Genetic Markers/genetics ; Genomic Library ; Genomics ; Models, Genetic ; Oryza/genetics ; Physical Chromosome Mapping ; Species Specificity
ABSTRACT
Consensus genetic linkage maps provide a genomic framework for quantitative trait loci identification, map-based cloning, assessment of genetic diversity, association mapping, and applied breeding in marker-assisted selection schemes. Among "orphan crops" with limited genomic resources such as cowpea [Vigna unguiculata (L.) Walp.] (2n = 2x = 22), the use of transcript-derived SNPs in genetic maps provides opportunities for automated genotyping and estimation of genome structure based on synteny analysis. Here, we report the development and validation of a high-throughput EST-derived SNP assay for cowpea, its application in consensus map building, and determination of synteny to reference genomes. SNP mining from 183,118 ESTs sequenced from 17 cDNA libraries yielded approximately 10,000 high-confidence SNPs from which an Illumina 1,536-SNP GoldenGate genotyping array was developed and applied to 741 recombinant inbred lines from six mapping populations. Approximately 90% of the SNPs were technically successful, providing 1,375 dependable markers. Of these, 928 were incorporated into a consensus genetic map spanning 680 cM with 11 linkage groups and an average marker distance of 0.73 cM. Comparison of this cowpea genetic map to reference legumes, soybean (Glycine max) and Medicago truncatula, revealed extensive macrosynteny encompassing 85 and 82%, respectively, of the cowpea map. Regions of soybean genome duplication were evident relative to the simpler diploid cowpea. Comparison with Arabidopsis revealed extensive genomic rearrangement with some conserved microsynteny. These results support evolutionary closeness between cowpea and soybean and identify regions for synteny-based functional genomics studies in legumes.
Subjects
Expressed Sequence Tags ; Fabaceae/genetics ; Polymorphism, Single Nucleotide ; Chromosome Mapping ; Chromosomes, Plant ; Evolution, Molecular ; Genotype
ABSTRACT
Quantitative trait locus (QTL) detection is commonly performed by analysis of designed segregating populations derived from two inbred parental lines, where absence of selection, mutation and genetic drift is assumed. Even for designed populations, selection cannot always be avoided, with the consequence that the correlation between genotypes varies instead of being uniform. As in linkage disequilibrium mapping, ignoring this type of genetic relatedness will increase the rate of false positives. In this paper, we advocate using mixed models that include genetic relatedness, or 'kinship' information, for QTL detection in populations in which selection has operated. We demonstrate our case with a three-way barley cross, designed to segregate for dwarfing, vernalization and spike morphology genes, in which selection occurred. The population of 161 inbred lines was screened with 1,536 single nucleotide polymorphisms (SNPs), and used for gene and QTL detection. The coefficient of coancestry matrix was estimated based on the SNPs and imposed to structure the distribution of random genotypic effects. The model incorporating kinship (coancestry) information was consistently superior to the one without kinship, according to the Akaike information criterion. We show, for three traits, that ignoring the coancestry information results in an unrealistically high number of marker-trait associations, without providing clear conclusions about QTL locations. We used a number of widely recognized dwarfing and vernalization genes known to segregate in the studied population as landmarks or references to assess the agreement of the mapping results with a priori candidate gene expectations. In addition to the major genes, further QTLs were detected for all traits.
Subjects
Genes, Plant/genetics ; Hordeum/genetics ; Phenotype ; Quantitative Trait Loci/genetics ; Selection, Genetic ; Crosses, Genetic ; Genotype ; Hordeum/anatomy & histology ; Hordeum/growth & development ; Models, Statistical ; Polymorphism, Single Nucleotide/genetics
ABSTRACT
Biotic or abiotic stress can cause considerable damage to crop plants that can be managed by building disease resistance in the cultivated gene pool through breeding for disease resistance genes (R-genes). R-genes conferring resistance to diverse pathogens or pests share a high level of similarity at the DNA and protein levels in different plant species. This property of R-genes has been successfully employed to isolate putative resistance gene analogues (RGAs) from new plant sources using a PCR-based approach. Using a similar approach, in the present study, we have successfully amplified putative RGAs having nucleotide-binding-site leucine-rich repeats (NBS-LRR-type RGAs) from seven different sources: two cultivated coffee species (Coffea arabica L. and Coffea canephora Pierre ex. A. Froehner), four related taxa endemic to India (wild tree coffee species: Psilanthus bengalensis (Roem. & Schuttles) J.-F. Leroy, Psilanthus khasiana, Psilanthus travencorensis (Wight & Arn.) J.-F. Leroy, Psilanthus weightiana (Wall. ex Wight & Arn.) J.-F. Leroy), and a cDNA pool originally prepared from light- and drought-stressed Coffea arabica L. leaves. The total PCR amplicons obtained using NBS-LRR-specific primers from each source were cloned and transformed to construct seven independent libraries, from which 434 randomly picked clones were sequenced. In silico analysis of the sequenced clones revealed 27 sequences that contained characteristic RGA motifs, of which 24 had complete uninterrupted open reading frames. Comparisons of these with published RGAs showed several of these to be novel RGA sequences. Interestingly, most of these novel RGAs belonged to the related wild Psilanthus species. The data thus suggest the potential of the secondary gene pool as a possible untapped donor of resistance genes to the present-day cultivated species of coffee.
Subjects
Coffee/genetics ; Genes, Plant/genetics ; Immunity, Innate/genetics ; Amino Acid Sequence ; Base Sequence ; Cloning, Molecular ; Coffee/classification ; India ; Molecular Sequence Data ; Phylogeny ; Plant Diseases/genetics ; Polymerase Chain Reaction ; Sequence Alignment
ABSTRACT
Genetic linkage maps are cornerstones of a wide spectrum of biotechnology applications, including map-assisted breeding, association genetics, and map-assisted gene cloning. During the past several years, the adoption of high-throughput genotyping technologies has been paralleled by a substantial increase in the density and diversity of genetic markers. New genetic mapping algorithms are needed in order to efficiently process these large datasets and accurately construct high-density genetic maps. In this paper, we introduce a novel algorithm to order markers on a genetic linkage map. Our method is based on a simple yet fundamental mathematical property that we prove under rather general assumptions. The validity of this property allows one to determine efficiently the correct order of markers by computing the minimum spanning tree of an associated graph. Our empirical studies obtained on genotyping data for three mapping populations of barley (Hordeum vulgare), as well as extensive simulations on synthetic data, show that our algorithm consistently outperforms the best available methods in the literature, particularly when the input data are noisy or incomplete. The software implementing our algorithm is available in the public domain as a web tool under the name MSTmap.
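The link between marker order and a minimum spanning tree can be sketched in a few lines. This is a simplified illustration and not the MSTmap implementation: it assumes error-free, complete genotype calls, uses the Hamming distance between marker genotype vectors as a stand-in for the distance measure, and the function names `hamming` and `mst_order` are hypothetical. When the spanning tree is a simple path, walking it from one end to the other yields a marker order.

```python
def hamming(a, b):
    """Number of individuals whose genotype calls differ between two markers."""
    return sum(x != y for x, y in zip(a, b))

def mst_order(genotypes):
    """Order markers by walking the minimum spanning tree of the distance graph.

    genotypes: dict of marker name -> string of calls, one character per line.
    Returns the marker order if the tree is a simple path, otherwise None.
    """
    names = list(genotypes)
    if len(names) < 2:
        return names
    # Prim's algorithm on the complete graph of markers.
    in_tree, remaining = [names[0]], names[1:]
    adj = {n: [] for n in names}
    while remaining:
        u, v = min(((u, v) for u in in_tree for v in remaining),
                   key=lambda e: hamming(genotypes[e[0]], genotypes[e[1]]))
        adj[u].append(v)
        adj[v].append(u)
        in_tree.append(v)
        remaining.remove(v)
    # A path has exactly two degree-1 nodes and no node of degree > 2.
    ends = [n for n in names if len(adj[n]) == 1]
    if len(ends) != 2 or any(len(adj[n]) > 2 for n in names):
        return None
    order, prev, cur = [], None, ends[0]
    while cur is not None:
        order.append(cur)
        nxt = [n for n in adj[cur] if n != prev]
        prev, cur = cur, (nxt[0] if nxt else None)
    return order

# Four markers scored on six doubled haploid lines, given in shuffled order.
calls = {"m3": "ABBBBA", "m1": "ABAABB", "m4": "ABBBAA", "m2": "ABBABA"}
print(mst_order(calls))
```

The recovered order may come out in either orientation, since a linkage group has no intrinsic direction. Cases where the tree is not a path, which arise with noisy or missing data, are outside the scope of this sketch.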
Subjects
Algorithms ; Chromosome Mapping/statistics & numerical data ; Cluster Analysis ; Computer Simulation ; Databases, Genetic ; Genes, Plant ; Genetic Markers ; Genotype ; Hordeum/genetics ; Models, Genetic ; Multigene Family ; Polymorphism, Single Nucleotide ; Software
ABSTRACT
BACKGROUND: High density genetic maps of plants have, nearly without exception, made use of marker datasets containing missing or questionable genotype calls derived from a variety of genic and non-genic or anonymous markers, and been presented as a single linear order of genetic loci for each linkage group. The consequences of missing or erroneous data include falsely separated markers, expansion of cM distances and incorrect marker order. These imperfections are amplified in consensus maps and problematic when fine resolution is critical including comparative genome analyses and map-based cloning. Here we provide a new paradigm, a high-density consensus genetic map of barley based only on complete and error-free datasets and genic markers, represented accurately by graphs and approximately by a best-fit linear order, and supported by a readily available SNP genotyping resource. RESULTS: Approximately 22,000 SNPs were identified from barley ESTs and sequenced amplicons; 4,596 of them were tested for performance in three pilot phase Illumina GoldenGate assays. Data from three barley doubled haploid mapping populations supported the production of an initial consensus map. Over 200 germplasm selections, principally European and US breeding material, were used to estimate minor allele frequency (MAF) for each SNP. We selected 3,072 of these tested SNPs based on technical performance, map location, MAF and biological interest to fill two 1536-SNP "production" assays (BOPA1 and BOPA2), which were made available to the barley genetics community. Data were added using BOPA1 from a fourth mapping population to yield a consensus map containing 2,943 SNP loci in 975 marker bins covering a genetic distance of 1099 cM. CONCLUSION: The unprecedented density of genic markers and marker bins enabled a high resolution comparison of the genomes of barley and rice. 
Low recombination in pericentric regions is evident from bins containing many more than the average number of markers, meaning that a large number of genes are recombinationally locked into the genetic centromeric regions of several barley chromosomes. Examination of US breeding germplasm illustrated the usefulness of BOPA1 and BOPA2 in that they provide excellent marker density and sensitivity for detection of minor alleles in this genetically narrow material.
Subjects
Hordeum/genetics ; Polymorphism, Single Nucleotide ; Alleles ; Genetic Linkage ; Genetic Markers ; Genetic Techniques ; Genotype
ABSTRACT
BACKGROUND: A large number of genetic variations have been identified in rice. Such variations must in many cases control phenotypic differences in abiotic stress tolerance and other traits. A single feature polymorphism (SFP) is an oligonucleotide array-based polymorphism which can be used for identification of SNPs or insertion/deletions (INDELs) for high throughput genotyping and high density mapping. Here we applied SFP markers to a lingering question about the source of salt tolerance in a particular rice recombinant inbred line (RIL) derived from a salt-tolerant and a salt-sensitive parent. RESULTS: Expression data obtained by hybridizing RNA to an oligonucleotide array were analyzed using a statistical method called robustified projection pursuit (RPP). By applying the RPP method, a total of 1208 SFP probes were detected between two presumed parental genotypes (Pokkali and IR29) of a RIL population segregating for salt tolerance. We focused on the Saltol region, a major salt tolerance QTL. Analysis of FL478, a salt-tolerant RIL, revealed a small (< 1 Mb) region carrying alleles from the presumed salt-tolerant parent, flanked by alleles matching the salt-sensitive parent IR29. Sequencing of putative SFP-containing amplicons from this region and other positions in the genome yielded a validation rate of more than 95%. CONCLUSION: Recombinant inbred line FL478 contains a small (< 1 Mb) segment from the salt-tolerant parent in the Saltol region. The Affymetrix rice genome array provides a satisfactory platform for high resolution mapping in rice using RNA hybridization and the RPP method of SFP analysis.
Subjects
Genome, Plant ; Oligonucleotide Array Sequence Analysis/methods ; Oryza/genetics ; Polymorphism, Genetic ; Base Sequence ; Chromosome Mapping ; Chromosomes, Plant/genetics ; Gene Expression ; Genetic Markers ; Molecular Sequence Data ; Quantitative Trait Loci ; RNA, Plant/genetics ; Salt-Tolerant Plants/genetics ; Sequence Analysis, DNA
ABSTRACT
A rye-wheat centric chromosome translocation 1RS.1BL has been widely used in wheat breeding programs around the world. Increased yield of translocation lines was probably a consequence of increased root biomass. In an effort to map loci controlling root characteristics, homoeologous recombinants of 1RS with 1BS were used to generate a consensus genetic map comprising 20 phenotypic and molecular markers, with an average spacing of 2.5 cM. Physically, all recombination events were located in the distal 40% of the arms. A total of 68 recombinants were used, and recombination breakpoints were aligned and ordered over map intervals, with all the markers integrated into a single genetic map. This approach enabled dissection of genetic components of quantitative traits, such as root traits, present on 1S. To validate our hypothesis, phenotyping of 45-day-old wheat roots was performed in five lines including three recombinants representative of the entire short arm along with bread wheat parents 'Pavon 76' and Pavon 1RS.1BL. Individual root characteristics were ranked and the genotypic rank sums were subjected to Quade analysis to compare the overall rooting ability of the genotypes. It appears that the terminal 15% of the rye 1RS arm carries gene(s) for greater rooting ability in wheat.
Subjects
Bread ; Chromosome Mapping ; Chromosomes, Plant/genetics ; Plant Roots/genetics ; Quantitative Trait, Heritable ; Secale/genetics ; Triticum/genetics ; Analysis of Variance ; Genetic Markers ; Minisatellite Repeats/genetics ; Phenotype ; Physical Chromosome Mapping ; Plant Shoots/genetics ; Polymerase Chain Reaction ; Recombination, Genetic/genetics
ABSTRACT
We report mapping of translocation breakpoints using a microarray. We used complex RNA to compare normal hexaploid wheat (17,000 Mb genome) to a ditelosomic stock missing the short arm of chromosome 1B (1BS) and wheat-rye translocations that replace portions of 1BS with rye 1RS. Transcripts detected by a probe set can come from all three Triticeae genomes in ABD hexaploid wheat, and sequences of homoeologous genes on 1AS, 1BS and 1DS often differ from each other. Absence or replacement of 1BS therefore must sometimes result in patterns within a probe set that deviate from hexaploid wheat. We termed these 'high variance probe sets' (HVPs) and examined the extent to which HVPs associated with 1BS aneuploidy are related to rice genes on syntenic rice chromosome 5 short arm (5S). We observed an enrichment of such probe sets to 15-20% of all HVPs, while 1BS represents approximately 2% of the total genome. In total 257 HVPs constitute wheat 1BS markers. Two wheat-rye translocations subdivided 1BS HVPs into three groups, allocating translocation breakpoints to narrow intervals defined by rice 5S coordinates. This approach could be extended to the entire wheat genome or any organism with suitable aneuploid or translocation stocks.
Subjects
Chromosome Breakage ; Chromosome Mapping/methods ; Genomics/methods ; Oligonucleotide Array Sequence Analysis/methods ; Translocation, Genetic ; Triticum/genetics ; Data Interpretation, Statistical ; Genetic Markers ; Genome, Plant ; Oligonucleotide Probes ; Oryza/genetics
ABSTRACT
BACKGROUND: Cowpea (Vigna unguiculata L. Walp) is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP) is a microarray-based marker which can be used for high throughput genotyping and high density mapping. RESULTS: Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max) genome array. Robustified projection pursuit (RPP) was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL) population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded a SFP probe set validation rate of 68%. CONCLUSION: We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some 1000's of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.
Subjects
Fabaceae/genetics ; Genome, Plant/genetics ; Glycine max/genetics ; Oligonucleotide Array Sequence Analysis ; Polymorphism, Genetic ; Animals ; Crops, Agricultural/genetics ; DNA/genetics ; Electrophoresis, Agar Gel ; Genetic Markers/genetics ; Humans ; Mice ; Nucleic Acid Hybridization ; Polymerase Chain Reaction ; RNA, Complementary/genetics ; Reproducibility of Results ; Sequence Alignment
ABSTRACT
BACKGROUND: Flow cytometry facilitates sorting of single chromosomes and chromosome arms which can be used for targeted genome analysis. However, the recovery of microgram amounts of DNA needed for some assays requires sorting of millions of chromosomes which is laborious and time consuming. Yet, many genomic applications such as development of genetic maps or physical mapping do not require large DNA fragments. In such cases time-consuming de novo sorting can be minimized by utilizing whole-genome amplification. RESULTS: Here we report a protocol optimized in barley including amplification of DNA from only ten thousand chromosomes, which can be isolated in less than one hour. Flow-sorted chromosomes were treated with proteinase K and amplified using Phi29 multiple displacement amplification (MDA). Overnight amplification in a 20-microlitre reaction produced 3.7 - 5.7 micrograms DNA with a majority of products between 5 and 30 kb. To determine the purity of sorted fractions and potential amplification bias we used quantitative PCR for specific genes on each chromosome. To extend the analysis to a whole genome level we performed an oligonucleotide pool assay (OPA) for interrogation of 1524 loci, of which 1153 loci had known genetic map positions. Analysis of unamplified genomic DNA of barley cv. Akcent using this OPA resulted in 1426 markers with present calls. Comparison with three replicates of amplified genomic DNA revealed >99% concordance. DNA samples from amplified chromosome 1H and a fraction containing chromosomes 2H - 7H were examined. In addition to loci with known map positions, 349 loci with unknown map positions were included. Based on this analysis 40 new loci were mapped to 1H. CONCLUSION: The results indicate a significant potential of using this approach for physical mapping. 
Moreover, the study showed that multiple displacement amplification of flow-sorted chromosomes is highly efficient and representative which considerably expands the potential of chromosome flow sorting in plant genomics.
Subjects
Chromosomes, Plant/genetics ; Hordeum/genetics ; Nucleic Acid Amplification Techniques/methods ; Physical Chromosome Mapping/methods ; Polymorphism, Single Nucleotide ; DNA, Plant/genetics ; Flow Cytometry ; Genetic Markers ; Polymerase Chain Reaction
ABSTRACT
Genic microsatellites or EST-SSRs derived from expressed sequence tags (ESTs) are desirable because they are inexpensive to develop, represent transcribed genes, and can often be assigned a putative function. In this study we investigated 2,553 coffee ESTs (461 from the public domain and 2,092 in-house generated ESTs) for identification and development of genic microsatellite markers. Of these, 2,458 ESTs (all >100 bp in size) were searched for SSRs using the MISA search module followed by stackPACK clustering, which revealed a total of 425 microsatellites in 331 (13.5%) non-redundant ESTs/consensus sequences, suggesting an approximate frequency of 1 SSR/2.16 kb of the analysed coffee transcriptome. Identified microsatellites mainly comprised di-/tri-nucleotide repeats, of which repeat motifs AG and AAG were the most abundant. A total of 224 primer pairs could be designed from the non-redundant SSR-positive ESTs (excluding those with only mononucleotide repeats) for possible use as genic markers. Of this set, a total of 24 (10%) primer pairs were tested and 18 could be validated as usable markers. Sixteen of these markers revealed moderate to high polymorphism information content (PIC) across 23 genotypes of C. arabica and C. canephora, while 2 markers were found to be monomorphic. All the markers also showed robust cross-species amplifications across 14 Coffea and 4 Psilanthus species. The apparent broad cross-species/genera transferability was further confirmed by cloning and sequencing of the amplified alleles. Thus, the study provides insight into the frequency and distribution of SSRs in the coffee transcriptome, and also demonstrates the successful development of genic SSRs. It is expected that the potential markers described here will add to the repertoire of DNA markers needed for genetic studies in cultivated coffee and also in related taxa that constitute the important secondary genepool for coffee improvement.
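A search for perfect di- and tri-nucleotide repeats of the kind described above can be approximated with a regular expression. This is a minimal sketch and not the MISA tool itself: the function name `find_ssrs` and the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides) are assumptions chosen for illustration, not thresholds reported in the study.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect microsatellites in seq.

    min_repeats maps motif length -> minimum number of tandem copies.
    The default thresholds are hypothetical, not those of the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as TTTTTT
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

est = "TTGC" + "AG" * 7 + "CCGT" + "AAG" * 5 + "T" * 12 + "GC"
print(find_ssrs(est))
```

Start positions are 0-based. Mononucleotide runs are skipped here, mirroring the exclusion of mononucleotide-only ESTs from primer design mentioned in the abstract.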
Subjects
Coffea/genetics ; Expressed Sequence Tags ; Microsatellite Repeats/genetics ; Base Sequence ; Genetic Markers ; Molecular Sequence Data ; Phylogeny
ABSTRACT
Genomewide association studies depend on the extent of linkage disequilibrium (LD), the number and distribution of markers, and the underlying structure in populations under study. Outbreeding species generally exhibit limited LD, and consequently, a very large number of markers are required for effective whole-genome association genetic scans. In contrast, several of the world's major food crops are self-fertilizing inbreeding species with narrow genetic bases and theoretically extensive LD. Together these are predicted to result in a combination of low resolution and a high frequency of spurious associations in LD-based studies. However, inbred elite plant varieties represent a unique human-induced pseudo-outbreeding population that has been subjected to strong selection for advantageous alleles. By assaying 1,524 genomewide SNPs we demonstrate that, after accounting for population substructure, the level of LD exhibited in elite northwest European barley, a typical inbred cereal crop, can be effectively exploited to map traits by using whole-genome association scans with several hundred to thousands of biallelic SNPs.