ABSTRACT
Triticum urartu (diploid, AA) is the progenitor of the A subgenome of tetraploid (Triticum turgidum, AABB) and hexaploid (Triticum aestivum, AABBDD) wheat [1,2]. Genomic studies of T. urartu have been useful for investigating the structure, function and evolution of polyploid wheat genomes. Here we report the generation of a high-quality genome sequence of T. urartu by combining bacterial artificial chromosome (BAC)-by-BAC sequencing, single molecule real-time whole-genome shotgun sequencing [3], linked reads and optical mapping [4,5]. We assembled seven chromosome-scale pseudomolecules and identified protein-coding genes, and we suggest a model for the evolution of T. urartu chromosomes. Comparative analyses with genomes of other grasses showed gene loss and amplification in the numbers of transposable elements in the T. urartu genome. Population genomics analysis of 147 T. urartu accessions from across the Fertile Crescent showed clustering of three groups, with differences in altitude and biostress, such as powdery mildew disease. The T. urartu genome assembly provides a valuable resource for studying genetic variation in wheat and related grasses, and promises to facilitate the discovery of genes that could be useful for wheat improvement.
Subjects
Molecular Evolution , Plant Genome/genetics , Phylogeny , Triticum/classification , Triticum/genetics , Altitude , Bacterial Artificial Chromosomes/genetics , Plant Chromosomes/genetics , DNA Transposable Elements/genetics , Genetic Variation , Geographic Mapping , Molecular Sequence Annotation , Plant Diseases/microbiology , DNA Sequence Analysis , Synteny/genetics
ABSTRACT
Genomics-based breeding of economically important crops such as banana, coffee, cotton, potato, tobacco and wheat is often hampered by genome size, polyploidy and high repeat content. We adapted sequence-based whole-genome profiling (WGP™) technology to obtain insight into the polyploidy of the model plant Nicotiana tabacum (tobacco). N. tabacum is assumed to originate from a hybridization event between ancestors of Nicotiana sylvestris and Nicotiana tomentosiformis approximately 200,000 years ago. This resulted in tobacco having a haploid genome size of 4500 million base pairs, approximately four times larger than the related tomato (Solanum lycopersicum) and potato (Solanum tuberosum) genomes. In this study, a physical map containing 9750 contigs of bacterial artificial chromosomes (BACs) was constructed. The mean contig size was 462 kbp, and the calculated genome coverage equaled the estimated tobacco genome size. We used a method for determination of the ancestral origin of the genome by annotation of WGP sequence tags. This assignment agreed with the ancestral annotation available from the tobacco genetic map, and may be used to investigate the evolution of homoeologous genome segments after polyploidization. The map generated is an essential scaffold for the tobacco genome. We propose the combination of WGP physical mapping technology and tag profiling of ancestral lines as a generally applicable method to elucidate the ancestral origin of genome segments of polyploid species. The physical mapping of genes and their origins will enable application of biotechnology to polyploid plants aimed at accelerating and increasing the precision of breeding for abiotic and biotic stress resistance.
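The coverage statement in this abstract can be checked with simple arithmetic: contig count times mean contig size should land near the stated genome size. The following is a minimal sketch using only the figures quoted above; the variable names are illustrative, and it ignores any overlap between contigs, so the authors' own coverage calculation may differ.

```python
# Figures quoted in the abstract above.
n_contigs = 9750          # BAC contigs in the physical map
mean_contig_kbp = 462     # mean contig size, in kbp
genome_size_mbp = 4500    # stated haploid genome size, in Mbp

# Total length spanned by the contigs, converted from kbp to Mbp.
coverage_mbp = n_contigs * mean_contig_kbp / 1000

# Ratio of the calculated coverage to the stated genome size.
ratio = coverage_mbp / genome_size_mbp

print(f"{coverage_mbp:.1f} Mbp covered, {ratio:.3f}x the stated genome size")
```

Under these assumptions the product is about 4.5 Gbp, which is consistent with the claim that the calculated coverage equals the estimated genome size.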
Subjects
Chromosome Mapping , Plant Genome , Nicotiana/genetics , Physical Chromosome Mapping , Breeding , Genetic Linkage , Genetic Hybridization , Molecular Sequence Annotation , Polyploidy
ABSTRACT
We present whole genome profiling (WGP), a novel next-generation sequencing-based physical mapping technology for construction of bacterial artificial chromosome (BAC) contigs of complex genomes, using Arabidopsis thaliana as an example. WGP leverages short read sequences derived from restriction fragments of two-dimensionally pooled BAC clones to generate sequence tags. These sequence tags are assigned to individual BAC clones, followed by assembly of BAC contigs based on shared regions containing identical sequence tags. Following in silico analysis of WGP sequence tags and simulation of a map of Arabidopsis chromosome 4 and maize, a WGP map of Arabidopsis thaliana ecotype Columbia was constructed de novo using a six-genome equivalent BAC library. Validation of the WGP map using the Columbia reference sequence confirmed that 350 BAC contigs (98%) were assembled correctly, spanning 97% of the 102-Mb calculated genome coverage. We demonstrate that WGP maps can also be generated for more complex plant genomes and will serve as excellent scaffolds to anchor genetic linkage maps and integrate whole genome sequence data.
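The idea of assembling BAC contigs from shared sequence tags can be illustrated with a toy example. This is a simplified sketch, not the WGP assembly software: it assumes each BAC is represented as a set of tag identifiers, treats two BACs as overlapping when they share at least `min_shared` tags (a hypothetical threshold), and reports contigs as connected components. The BAC names and tags are invented for illustration.

```python
from itertools import combinations


def build_contigs(bac_tags, min_shared=2):
    """Group BACs into contigs: connected components of the 'shares tags' graph."""
    parent = {bac: bac for bac in bac_tags}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Join every pair of BACs that shares enough identical tags.
    for a, b in combinations(bac_tags, 2):
        if len(bac_tags[a] & bac_tags[b]) >= min_shared:
            parent[find(a)] = find(b)

    # Collect the members of each component.
    contigs = {}
    for bac in bac_tags:
        contigs.setdefault(find(bac), set()).add(bac)
    return sorted(sorted(members) for members in contigs.values())


# Hypothetical tag sets for five BAC clones.
bac_tags = {
    "BAC1": {"t1", "t2", "t3"},
    "BAC2": {"t2", "t3", "t4"},
    "BAC3": {"t4", "t5", "t6"},
    "BAC4": {"t8", "t9"},
    "BAC5": {"t8", "t9", "t10"},
}
contigs = build_contigs(bac_tags)
print(contigs)
```

Here BAC2 and BAC3 share only one tag, so with a threshold of two they stay in separate contigs. A real assembler would also order the BACs within each contig and weigh the chance of spurious tag matches, which this sketch does not attempt.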
Subjects
Arabidopsis/genetics , Chromosome Mapping/methods , Plant Genome/genetics , High-Throughput Nucleotide Sequencing , Bacterial Artificial Chromosomes/genetics , Computational Biology , Contig Mapping , Genomic Library
ABSTRACT
BACKGROUND: Sequencing projects using a clone-by-clone approach require the availability of a robust physical map. The SNaPshot technology, based on pair-wise comparisons of restriction fragment sizes, has been used recently to build the first physical map of a wheat chromosome and to complete the maize physical map. However, restriction fragment sizes shared randomly between two non-overlapping BACs often lead to chimeric contigs and mis-assembled BACs in such large and repetitive genomes. Whole Genome Profiling (WGP™) was developed recently as a new sequence-based physical mapping technology and has the potential to limit this problem. RESULTS: A subset of the wheat 3B chromosome BAC library covering 230 Mb was used to establish a WGP physical map and to compare it to a map obtained with the SNaPshot technology. We first adapted the WGP-based assembly methodology to cope with the complexity of the wheat genome. The results then showed that the WGP map covers the same length as the SNaPshot map but with 30% fewer contigs and, more importantly, 3.5 times fewer mis-assembled BACs. Finally, we evaluated the benefit of integrating WGP tags in different sequence assemblies obtained after Roche/454 sequencing of BAC pools. We showed that while WGP tag integration improves assemblies performed with unpaired reads and with paired-end reads at low coverage, it does not significantly improve sequence assemblies performed at high coverage (25x) with paired-end reads. CONCLUSIONS: Our results demonstrate that, with a suitable assembly methodology, WGP builds more robust physical maps than the SNaPshot technology in wheat and that WGP can be adapted to any genome. Moreover, WGP tag integration in sequence assemblies improves low-quality assemblies. However, to achieve a high-quality draft sequence assembly, a sequencing depth of 25x paired-end reads is required, at which point WGP tag integration does not provide additional scaffolding value.
Finally, we suggest that WGP tags can support the efficient sequencing of BAC pools by enabling reliable assignment of sequence scaffolds to their BAC of origin, a feature that is of great interest when using BAC pooling strategies to reduce the cost of sequencing large genomes.
Subjects
Plant Genome , Physical Chromosome Mapping , DNA Sequence Analysis/methods , Triticum/genetics , Bacterial Artificial Chromosomes , Plant Chromosomes , Contig Mapping , DNA Transposable Elements , Sequence Alignment
ABSTRACT
Global warming poses severe threats to agricultural production, including soybean. One of the major mechanisms for organisms to combat heat stress is through heat shock proteins (HSPs) that stabilize protein structures at above-optimum temperatures, by assisting in the folding of nascent, misfolded, or unfolded proteins. The HSP40 subgroups, or the J-domain proteins, function as co-chaperones. They capture proteins that require folding or refolding and pass them on to HSP70 for processing. In this study, we have identified a type-I HSP40 gene in soybean, GmDNJ1, with high basal expression under normal growth conditions and also highly inducible under abiotic stresses, especially heat. Gmdnj1-knockout mutants had diminished growth in normal conditions, and when under heat stress, exhibited more severe browning, reduced chlorophyll contents, higher reactive oxygen species (ROS) contents, and higher induction of heat stress-responsive transcription factors and ROS-scavenging enzyme-encoding genes. Under both normal and heat-stress conditions, the mutant lines accumulated more aggregated proteins involved in protein catabolism, sugar metabolism, and membrane transportation, in both roots and leaves. In summary, GmDNJ1 plays crucial roles in the overall plant growth and heat tolerance in soybean, probably through the surveillance of misfolded proteins for refolding to maintain the full capacity of cellular functions.
ABSTRACT
BLAST searchable databases containing insertion flanking sequences have revolutionized reverse genetics in plant research. The development of such databases has so far been limited to a small number of model species and normally requires extensive labour input. Here we describe a highly efficient and widely applicable method that we adapted to identify unique transposon-flanking genomic sequences in Petunia. The procedure is based on a multi-dimensional pooling strategy for the collection of DNA samples; up to thousands of different templates are amplified from each of the DNA pools separately, and knowledge of their source is safeguarded by the use of pool-specific (sample) identification tags in one of the amplification primers. All products are combined into a single sample that is subsequently used as a template for unidirectional pyrosequencing. Computational analysis of the clustered sequence output allows automatic assignment of sequences to individual DNA sources. We have amplified and analysed transposon-flanking sequences from a Petunia transposon insertion library of 1000 individuals. Using 30 DNA isolations, 70 PCR reactions and two GS20 sequencing runs, we were able to allocate around 10 000 transposon flanking sequences to specific plants in the library. These sequences have been organized in a database that can be BLAST-searched for insertions into genes of interest. As a proof of concept, we have performed an in silico screen for insertions into members of the NAM/NAC transcription factor family. All in silico-predicted transposon insertions into members of this family could be confirmed in planta.
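The pooling logic can be sketched in a few lines. The abstract gives 1000 individuals and 30 DNA isolations, which is consistent with a 10 x 10 x 10 three-dimensional grid pooled along each axis (10 + 10 + 10 pools), but that layout is an assumption made here for illustration, as are the function names and the example data. A flanking sequence is assigned to a plant only when its pool tags identify exactly one pool in each dimension.

```python
def plant_index(x, y, z, side=10):
    """Map grid coordinates to a plant number in 0 .. side**3 - 1."""
    return x * side * side + y * side + z


def assign_sequences(observations, side=10):
    """Assign each sequence to a plant from its pool tags.

    observations maps a sequence ID to a set of (dimension, pool_index) tags,
    where dimension is "X", "Y" or "Z".
    """
    assigned, ambiguous = {}, []
    for seq, tags in observations.items():
        coords = {dim: sorted(i for d, i in tags if d == dim) for dim in "XYZ"}
        if all(len(pools) == 1 for pools in coords.values()):
            # Exactly one pool per dimension: the coordinates pinpoint one plant.
            assigned[seq] = plant_index(
                coords["X"][0], coords["Y"][0], coords["Z"][0], side
            )
        else:
            # Missing or multiple pools in some dimension: cannot resolve.
            ambiguous.append(seq)
    return assigned, ambiguous


# Hypothetical tag observations for three flanking sequences.
observations = {
    "flankA": {("X", 3), ("Y", 0), ("Z", 7)},
    "flankB": {("X", 1), ("X", 4), ("Y", 2), ("Z", 2)},
    "flankC": {("X", 5), ("Y", 5)},
}
assigned, ambiguous = assign_sequences(observations)
print(assigned, ambiguous)
```

Sequences seen in more than one pool of a dimension (for example, the same insertion present in two plants) need extra information to resolve, which is why the sketch reports them separately rather than guessing.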
Subjects
Genetic Databases , Insertional Mutagenesis , Petunia/genetics , DNA Sequence Analysis/methods , Cluster Analysis , Computational Biology , DNA Transposable Elements , Plant DNA/genetics , Gene Library , Polymerase Chain Reaction
ABSTRACT
In plant breeding, the use of molecular markers has greatly increased the speed with which new crop varieties are introduced into the market. Single Nucleotide Polymorphism (SNP) genotyping is routinely used for association studies, Linkage Disequilibrium (LD) and Quantitative Trait Locus (QTL) mapping studies, marker-assisted backcrosses and validation of large numbers of novel SNPs. Here we present the KeyGene SNPSelect technology, a scalable and flexible multiplexed, targeted sequence-based genotyping solution. The multiplex composition of SNPSelect assays can be easily changed between experiments by adding or removing loci, demonstrating their content flexibility. To demonstrate this versatility, we first designed a 1,056-plex maize assay and genotyped a total of 374 samples originating from an F2 and a Recombinant Inbred Line (RIL) population and a maize germplasm collection. Next, subsets of the most informative SNP loci were assembled in 384-plex and 768-plex assays for further genotyping. Indeed, selection of the most informative SNPs allows cost-efficient yet highly informative genotyping in a custom-made fashion, with average call rates between 88.1% (1,056-plex assay) and 99.4% (384-plex assay), and average reproducibility rates between duplicate samples ranging from 98.2% (1,056-plex assay) to 99.9% (384-plex assay). The SNPSelect workflow can be completed from a DNA sample to a genotype dataset in less than three days. We propose SNPSelect as an attractive and competitive genotyping solution to meet the targeted genotyping needs in fields such as plant breeding.
Subjects
Genotyping Techniques/methods , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods , Chromosome Mapping , Gene Frequency , Genetic Code , Genotype , Plant Breeding , Reproducibility of Results , Time Factors , Workflow , Zea mays/genetics
ABSTRACT
The SNPWave marker system, based on SNPs between the reference accessions Columbia-0 and Landsberg erecta (Ler), was used to distinguish a set of 92 Arabidopsis accessions from various parts of the world. In addition, we used these markers to genotype three new recombinant inbred line populations for Arabidopsis, having Ler as a common parent that was crossed with the accessions Antwerp-1, Kashmir-2, and Kondara. The benefit of using multiple populations that contain many similar markers and the fact that all markers are linked to the physical map of Arabidopsis facilitate the quantitative comparison of maps. Flowering-time variation was analyzed in the three recombinant inbred line populations. Per population, four to eight quantitative trait loci (QTL) were detected. Comparing the QTL positions relative to the physical map gave an estimate of 12 different QTL segregating for flowering time for which Ler has an allele different from one, two, or three of the other accessions.
Subjects
Arabidopsis/genetics , Chromosome Mapping , Flowers/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , Genetic Recombination , Arabidopsis/physiology , Genetic Crosses , Flowers/physiology , Genetic Linkage , Genetic Markers , Genetically Modified Plants
ABSTRACT
Scalable multiplexed amplification technologies are needed for cost-effective large-scale genotyping of genetic markers such as single nucleotide polymorphisms (SNPs). We present SNPWave, a novel SNP genotyping technology to detect various subsets of sequences in a flexible fashion in a fixed detection format. SNPWave is based on highly multiplexed ligation, followed by amplification of up to 20 ligated probes in a single PCR. Depending on the multiplexing level of the ligation reaction, the latter employs selective amplification using the amplified fragment length polymorphism (AFLP) technology. Detection of SNPWave reaction products is based on size separation on a sequencing instrument with multiple fluorescence labels and short run times. The SNPWave technique is illustrated by a 100-plex genotyping assay for Arabidopsis, a 40-plex assay for tomato and a 10-plex assay for Caenorhabditis elegans, detected on the MegaBACE 1000 capillary sequencer.
Subjects
Arabidopsis/genetics , Caenorhabditis elegans/genetics , Polymerase Chain Reaction/methods , Single Nucleotide Polymorphism/genetics , Solanum lycopersicum/genetics , Alleles , Animals , DNA/analysis , DNA/genetics , DNA Probes/genetics , Genotype , Reference Standards , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
Conventional marker-based genotyping platforms are widely available, but not without their limitations. In this context, we developed Sequence-Based Genotyping (SBG), a technology for simultaneous marker discovery and co-dominant scoring, using next-generation sequencing. SBG offers users several advantages including a generic sample preparation method, a highly robust genome complexity reduction strategy to facilitate de novo marker discovery across entire genomes, and a uniform bioinformatics workflow strategy to achieve genotyping goals tailored to individual species, regardless of the availability of a reference sequence. The most distinguishing features of this technology are the ability to genotype any population structure, regardless of whether parental data is included, and the ability to co-dominantly score SNP markers segregating in populations. To demonstrate the capabilities of SBG, we performed marker discovery and genotyping in Arabidopsis thaliana and lettuce, two plant species of diverse genetic complexity and backgrounds. Initially we obtained 1,409 SNPs for Arabidopsis, and 5,583 SNPs for lettuce. Further filtering of the SNP dataset produced over 1,000 high quality SNP markers for each species. We obtained a genotyping rate of 201.2 genotypes/SNP and 58.3 genotypes/SNP for Arabidopsis (n = 222 samples) and lettuce (n = 87 samples), respectively. Linkage mapping using these SNPs resulted in stable map configurations. We have therefore shown that the SBG approach presented provides users with the utmost flexibility in garnering high quality markers that can be directly used for genotyping and downstream applications. Until advances and costs allow for routine whole-genome sequencing of populations, we expect that sequence-based genotyping technologies such as SBG will be essential for genotyping of model and non-model genomes alike.
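The genotyping rates quoted in this abstract are easier to compare once expressed as a fraction of samples scored per SNP. A minimal sketch, assuming the rate is the mean number of samples with a genotype call per SNP (the abstract does not define it further); the variable names are illustrative.

```python
# (mean genotypes per SNP, number of samples), as quoted in the abstract.
rates = {
    "Arabidopsis": (201.2, 222),
    "lettuce": (58.3, 87),
}

# Fraction of samples that received a genotype call, averaged over SNPs.
fraction_called = {
    species: genotypes / n_samples
    for species, (genotypes, n_samples) in rates.items()
}

for species, fraction in fraction_called.items():
    print(f"{species}: {fraction:.1%} of samples genotyped per SNP")
```

On this reading, roughly nine in ten Arabidopsis samples and about two in three lettuce samples were scored per SNP.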
Subjects
Arabidopsis/genetics , Genotyping Techniques , High-Throughput Nucleotide Sequencing/methods , Lactuca/genetics , Chromosome Mapping , Computational Biology/methods , Genetic Linkage , Genetic Markers , Plant Genome , Genotype , Single Nucleotide Polymorphism , Reproducibility of Results
ABSTRACT
Reverse genetics approaches rely on the detection of sequence alterations in target genes to identify allelic variants among mutant or natural populations. Current (pre-) screening methods such as TILLING and EcoTILLING are based on the detection of single base mismatches in heteroduplexes using endonucleases such as CEL 1. However, there are drawbacks in the use of endonucleases due to their relatively poor cleavage efficiency and exonuclease activity. Moreover, pre-screening methods do not reveal information about the nature of sequence changes and their possible impact on gene function. We present KeyPoint technology, a high-throughput mutation/polymorphism discovery technique based on massive parallel sequencing of target genes amplified from mutant or natural populations. KeyPoint combines multi-dimensional pooling of large numbers of individual DNA samples and the use of sample identification tags ("sample barcoding") with next-generation sequencing technology. We show the power of KeyPoint by identifying two mutants in the tomato eIF4E gene based on screening more than 3000 M2 families in a single GS FLX sequencing run, and discovery of six haplotypes of tomato eIF4E gene by re-sequencing three amplicons in a subset of 92 tomato lines from the EU-SOL core collection. We propose KeyPoint technology as a broadly applicable amplicon sequencing approach to screen mutant populations or germplasm collections for identification of (novel) allelic variation in a high-throughput fashion.
Subjects
Mutation , Nucleic Acid Amplification Techniques/methods , Genetic Polymorphism , DNA Sequence Analysis/methods , Solanum lycopersicum/genetics , Alleles , Base Sequence , Eukaryotic Initiation Factor-4E/genetics , Haplotypes , Single Nucleotide Polymorphism
ABSTRACT
The AFLP technique is a powerful DNA fingerprinting technology applicable to any organism without the need for prior sequence knowledge. The protocol involves the selective PCR amplification of restriction fragments of a total digest of genomic DNA, typically obtained with a mix of two restriction enzymes. Two limited sets of AFLP primers are sufficient to generate a large number of different primer combinations (PCs), each of which will yield unique fingerprints. Visualization of AFLP fingerprints after gel electrophoresis of AFLP products is described using either a conventional autoradiography platform or an automated LI-COR system. The AFLP technology has been used predominantly for assessing the degree of variability among plant cultivars, establishing linkage groups in crosses and saturating genomic regions with markers for gene landing efforts. AFLP fragments may also be used as physical markers to determine the overlap and positions of genomic clones and to integrate genetic and physical maps. Crucial characteristics of the AFLP technology are its robustness, reliability and quantitative nature. This latter feature has been exploited for co-dominant scoring of AFLP markers in sample collections such as F2 or back-cross populations using appropriate AFLP scoring software. This protocol can be completed in 2-3 d.
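The statement that two limited primer sets yield many primer combinations follows from simple counting. The sketch below assumes each AFLP primer carries a fixed number of selective nucleotides at its 3' end, each drawn from A, C, G or T, and that base composition is uniform; the choice of three selective nucleotides per primer is illustrative and is not taken from the abstract.

```python
def aflp_primer_stats(k_first, k_second):
    """Count primer combinations and the idealized fraction of fragments amplified.

    k_first and k_second are the numbers of selective nucleotides on the
    primers for the two restriction-enzyme ends.
    """
    primers_first = 4 ** k_first        # distinct primers for the first end
    primers_second = 4 ** k_second      # distinct primers for the second end
    n_combinations = primers_first * primers_second
    # Each selective nucleotide keeps about one quarter of the fragments.
    fraction_amplified = 1 / 4 ** (k_first + k_second)
    return n_combinations, fraction_amplified


combos, fraction = aflp_primer_stats(3, 3)
print(f"{combos} primer combinations, each amplifying about {fraction:.4%} of fragments")
```

With these assumptions, two sets of 64 primers give 4096 combinations, each sampling a different small subset of the restriction fragments.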
Subjects
DNA Fingerprinting/methods , Nucleic Acid Amplification Techniques/methods , Amino Acid Sequence , Base Sequence , Plant Genes , Plant Genome , Solanum lycopersicum/genetics , Molecular Sequence Data , Plant Proteins/chemistry , Plant Proteins/genetics , Plant Proteins/metabolism , Genetic Polymorphism
ABSTRACT
Although DNA microarrays are currently the standard tool for genome-wide expression analysis, their application is limited to organisms for which the complete genome sequence or large collections of known transcript sequences are available. Here, we describe a protocol for cDNA-AFLP, an AFLP-based transcript profiling method that allows genome-wide expression analysis in any species without the need for prior sequence knowledge. In essence, the cDNA-AFLP method involves reverse transcription of mRNA into double-stranded cDNA, followed by restriction digestion, ligation of specific adapters and fractionation of this mixture of cDNA fragments into smaller subsets by selective PCR amplification. The resulting cDNA-AFLP fragments are separated on high-resolution gels, and visualization of cDNA-AFLP fingerprints is described using either a conventional autoradiography platform or an automated LI-COR system. Observed differences in band intensities between samples provide a good measure of the relative differences in the gene expression levels. Identification of differentially expressed genes can be accomplished by purifying cDNA-AFLP fragments from sequence gels and subsequent sequencing. This method has found widespread use as an attractive technology for gene discovery on the basis of fragment detection and for temporal quantitative gene expression analysis. The protocol can be completed in 3-4 d.
Subjects
Complementary DNA/genetics , Gene Expression Profiling/methods , Genomics/methods , Nucleic Acid Amplification Techniques/methods , Genetic Transcription/genetics , DNA Restriction Enzymes , RNA/genetics , RNA/metabolism
ABSTRACT
Application of single nucleotide polymorphisms (SNPs) is revolutionizing human bio-medical research. However, discovery of polymorphisms in low polymorphic species is still a challenging and costly endeavor, despite widespread availability of Sanger sequencing technology. We present CRoPS as a novel approach for polymorphism discovery by combining the power of reproducible genome complexity reduction of AFLP with Genome Sequencer (GS) 20/GS FLX next-generation sequencing technology. With CRoPS, hundreds-of-thousands of sequence reads derived from complexity-reduced genome sequences of two or more samples are processed and mined for SNPs using a fully-automated bioinformatics pipeline. We show that over 75% of putative maize SNPs discovered using CRoPS are successfully converted to SNPWave assays, confirming them to be true SNPs derived from unique (single-copy) genome sequences. By using CRoPS, polymorphism discovery will become affordable in organisms with high levels of repetitive DNA in the genome and/or low levels of polymorphism in the (breeding) germplasm without the need for prior sequence information.
Subjects
Single Nucleotide Polymorphism , Base Sequence , Plant Genome , Molecular Sequence Data , Nucleic Acid Sequence Homology , Zea mays/genetics
ABSTRACT
Progressive familial intrahepatic cholestasis (PFIC) and benign recurrent intrahepatic cholestasis (BRIC) are clinically distinct hereditary disorders. PFIC patients suffer from chronic cholestasis and develop liver fibrosis. BRIC patients experience intermittent attacks of cholestasis that resolve spontaneously. Mutations in ATP8B1 (previously FIC1) may result in PFIC or BRIC. We report the genomic organization of ATP8B1 and mutation analyses of 180 families with PFIC or BRIC that identified 54 distinct disease mutations, including 10 mutations predicted to disrupt splicing, 6 nonsense mutations, 11 small insertion or deletion mutations predicted to induce frameshifts, 1 large genomic deletion, 2 small inframe deletions, and 24 missense mutations. Most mutations are rare, occurring in 1-3 families, or are limited to specific populations. Many patients are compound heterozygous for 2 mutations. Mutation type or location correlates overall with clinical severity: missense mutations are more common in BRIC (58% vs. 38% in PFIC), while nonsense, frameshifting, and large deletion mutations are more common in PFIC (41% vs. 16% in BRIC). Some mutations, however, lead to a wide range of phenotypes, from PFIC to BRIC or even no clinical disease. ATP8B1 mutations were detected in 30% and 41%, respectively, of the PFIC and BRIC patients screened.