ABSTRACT
The rhizome is responsible for the invasiveness and competitiveness of many plants with great economic and agricultural impact worldwide. Besides its value as an invasive organ, the rhizome plays a role in the establishment and massive growth of forage, providing biomass for biofuel production. Despite these features, little is known about the molecular mechanisms that contribute to rhizome growth, development, and function in plants. In this work, we characterized the proteome of rhizome apical tips and elongation zones from different species using a GeLC-MS/MS (one-dimensional electrophoresis in combination with liquid chromatography coupled online with tandem mass spectrometry) spectral-counting proteomics strategy. Five rhizomatous grasses and an ancient species were compared to study the protein regulation in rhizomes. An average of 2200 rhizome proteins per species were confidently identified and quantified. Rhizome-characteristic proteins showed similar functional distributions across all species analyzed. The over-representation of proteins associated with central roles in cellular, metabolic, and developmental processes indicated accelerated metabolism in growing rhizomes. Moreover, 61 rhizome-characteristic proteins appeared to be regulated similarly among analyzed plants. In addition, 36 showed conserved regulation between rhizome apical tips and elongation zones across species. These proteins were preferentially expressed in rhizome tissues regardless of the species analyzed, making them interesting candidates for more detailed investigative studies about their roles in rhizome development.
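The abstract names a spectral-counting strategy but not the normalization applied to the counts. As a minimal sketch, assuming a normalized spectral abundance factor (NSAF) style calculation, the counts for each protein could be length-normalized and scaled as below. The function name, protein names, counts, and lengths are hypothetical and are not taken from the study.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor per protein (illustrative only).

    spectral_counts: dict mapping protein -> number of MS/MS spectra assigned
    lengths: dict mapping protein -> protein length in residues
    """
    # Divide each count by protein length so long proteins are not favored.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        return {p: 0.0 for p in saf}
    # Scale by the sample total so values are comparable between samples.
    return {p: v / total for p, v in saf.items()}


# Hypothetical counts for three proteins in one rhizome sample.
counts = {"protA": 30, "protB": 10, "protC": 0}
lengths = {"protA": 300, "protB": 100, "protC": 250}
abundance = nsaf(counts, lengths)
```

With these made-up inputs, protA and protB have the same count per residue, so they receive equal abundance even though their raw counts differ threefold.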
Subjects
Equisetum/genetics , Plant Proteins/analysis , Poaceae/genetics , Proteome/genetics , Proteome/metabolism , Proteomics/methods , Rhizome/metabolism , Liquid Chromatography , Polyacrylamide Gel Electrophoresis , Equisetum/metabolism , Plant Proteins/classification , Plant Proteins/metabolism , Poaceae/metabolism , Rhizome/genetics , Species Specificity , Tandem Mass Spectrometry
ABSTRACT
Research on the distribution and structure of fungal communities in caves is lacking. Kartchner Caverns is a wet and mineralogically diverse carbonate cave located in an escarpment of Mississippian Escabrosa limestone in the Whetstone Mountains, Arizona, USA. Fungal diversity from speleothem and rock wall surfaces was examined with 454 FLX Titanium sequencing technology using the Internal Transcribed Spacer 1 as a fungal barcode marker. Fungal diversity was estimated and compared between speleothem and rock wall surfaces, and its variation with distance from the natural entrance of the cave was quantified. Effects of environmental factors and nutrient concentrations in speleothem drip water at different sample sites on fungal diversity were also examined. Sequencing revealed 2,219 fungal operational taxonomic units (OTUs) at the 95% similarity level. Speleothems supported a higher fungal richness and diversity than rock walls. However, community membership and the taxonomic distribution of fungal OTUs at the class level did not differ significantly between speleothems and rock walls. Both OTU richness and diversity decreased significantly with increasing distance from the natural cave entrance. Community membership and taxonomic distribution of fungal OTUs also differed significantly between the sampling sites closest to the entrance and those furthest away. There was no significant effect of temperature, CO2 concentration, or drip water nutrient concentration on fungal community structure on either speleothems or rock walls. Together, these results suggest that proximity to the natural entrance is a critical factor in determining fungal community structure on mineral surfaces in Kartchner Caverns.
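The abstract compares OTU richness and diversity between sites without stating which diversity index was used. As a minimal sketch, assuming observed richness and the Shannon index, both can be computed from a vector of per-OTU read counts as follows. The two count vectors are hypothetical and are not data from the study.

```python
import math


def richness(otu_counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for c in otu_counts if c > 0)


def shannon(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs that were observed."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in otu_counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per OTU at two sampling sites.
near_entrance = [25, 25, 25, 25]   # four OTUs, evenly represented
far_from_entrance = [90, 10, 0, 0]  # two OTUs, one dominant
```

In this toy case the site near the entrance has both higher richness (4 versus 2) and higher Shannon diversity, which is the direction of the trend the abstract reports.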
Subjects
Biological Adaptation/physiology , Biodiversity , Carbonates/chemistry , Caves/microbiology , Environment , Fungi/genetics , Biological Adaptation/genetics , Arizona , Base Sequence , Carbon Dioxide/analysis , DNA Primers/genetics , Fungi/physiology , Molecular Sequence Data , DNA Sequence Analysis , Species Specificity , Temperature
ABSTRACT
BACKGROUND: The rhizome, the original stem of land plants, enables species to invade new territory and is a critical component of perenniality, especially in grasses. Red rice (Oryza longistaminata) is a perennial wild rice species with many valuable traits that could be used to improve rice cultivars, including rhizomatousness, disease resistance and drought tolerance. Despite these features, little is known about the molecular mechanisms that contribute to rhizome growth, development and function in this plant. RESULTS: We used an integrated approach to compare the transcriptome, proteome and metabolome of the rhizome to other tissues of red rice. 116 Gb of transcriptome sequence was obtained from various tissues and used to identify rhizome-specific and preferentially expressed genes, including transcription factors and hormone metabolism and stress response-related genes. Proteomics and metabolomics approaches identified 41 proteins and more than 100 primary metabolites and plant hormones with rhizome preferential accumulation. Of particular interest was the identification of a large number of gene transcripts from Magnaporthe oryzae, the fungus that causes rice blast disease in cultivated rice, even though the red rice plants showed no sign of disease. CONCLUSIONS: A significant set of genes, proteins and metabolites appear to be specifically or preferentially expressed in the rhizome of O. longistaminata. The presence of M. oryzae gene transcripts at a high level in apparently healthy plants suggests that red rice is resistant to this pathogen, and may be able to provide genes to cultivated rice that will enable resistance to rice blast disease.
Subjects
Oryza/metabolism , Rhizome/metabolism , Disease Resistance/genetics , Gene Expression Profiling , Plant Gene Expression Regulation , Oryza/genetics , Oryza/physiology , Rhizome/genetics , Rhizome/physiology , Transcriptome/genetics
ABSTRACT
BACKGROUND: Ginger (Zingiber officinale) and turmeric (Curcuma longa) accumulate important pharmacologically active metabolites at high levels in their rhizomes. Despite their importance, relatively little is known regarding gene expression in the rhizomes of ginger and turmeric. RESULTS: In order to identify rhizome-enriched genes and genes encoding specialized metabolism enzymes and pathway regulators, we evaluated an assembled collection of expressed sequence tags (ESTs) from eight different ginger and turmeric tissues. Comparisons to publicly available sorghum rhizome ESTs revealed a total of 777 gene transcripts expressed in ginger/turmeric and sorghum rhizomes but apparently absent from other tissues. The list of rhizome-specific transcripts was enriched for genes associated with regulation of tissue growth, development, and transcription. In particular, transcripts for ethylene response factors and AUX/IAA proteins appeared to accumulate in patterns mirroring results from previous studies regarding rhizome growth responses to exogenous applications of auxin and ethylene. Thus, these genes may play important roles in defining rhizome growth and development. Additional associations were made for ginger and turmeric rhizome-enriched MADS box transcription factors, their putative rhizome-enriched homologs in sorghum, and rhizomatous QTLs in rice. Additionally, analysis of both primary and specialized metabolism genes indicates that ginger and turmeric rhizomes are primarily devoted to the utilization of leaf supplied sucrose for the production and/or storage of specialized metabolites associated with the phenylpropanoid pathway and putative type III polyketide synthase gene products. This finding reinforces earlier hypotheses predicting roles of this enzyme class in the production of curcuminoids and gingerols. CONCLUSION: A significant set of genes were found to be exclusively or preferentially expressed in the rhizome of ginger and turmeric. 
Specific transcription factors and other regulatory genes were found that were common to the two species and that are excellent candidates for involvement in rhizome growth, differentiation and development. Large classes of enzymes involved in specialized metabolism were also found to have apparent tissue-specific expression, suggesting that gene expression itself may play an important role in regulating metabolite production in these plants.
Subjects
Catechols/metabolism , Curcuma/metabolism , Fatty Alcohols/metabolism , Terpenes/metabolism , Zingiber officinale/metabolism , Curcuma/genetics , Expressed Sequence Tags , Zingiber officinale/genetics
ABSTRACT
Caves are relatively accessible subterranean habitats ideal for the study of subsurface microbial dynamics and metabolisms under oligotrophic, non-photosynthetic conditions. A 454-pyrotag analysis of the V6 region of the 16S rRNA gene was used to systematically evaluate the bacterial diversity of ten cave surfaces within Kartchner Caverns, a limestone cave. Results showed an average of 1,994 operational taxonomic units (97 % cutoff) per speleothem and a broad taxonomic diversity that included 21 phyla and 12 candidate phyla. Comparative analysis of speleothems within a single room of the cave revealed three distinct bacterial taxonomic profiles dominated by either Actinobacteria, Proteobacteria, or Acidobacteria. A gradient in observed species richness along the sampling transect revealed that the communities with lower diversity corresponded to those dominated by Actinobacteria while the more diverse communities were those dominated by Proteobacteria. A 16S rRNA gene clone library from one of the Actinobacteria-dominated speleothems identified clones with 99 % identity to chemoautotrophs and previously characterized oligotrophs, providing insights into potential energy dynamics supporting these communities. The robust analysis conducted for this study demonstrated a rich bacterial diversity on speleothem surfaces. Further, it was shown that seemingly comparable speleothems supported divergent phylogenetic profiles suggesting that these communities are very sensitive to subtle variations in nutritional inputs and environmental factors typifying speleothem surfaces in Kartchner Caverns.
Subjects
Bacteria/classification , Biodiversity , Caves/microbiology , Phylogeny , Soil Microbiology , Arizona , Bacteria/genetics , Bacteria/isolation & purification , Bacterial DNA/genetics , Gene Library , 16S Ribosomal RNA/genetics , DNA Sequence Analysis
ABSTRACT
SyMAP (Synteny Mapping and Analysis Program) was originally developed to compute synteny blocks between a sequenced genome and an FPC map, and has been extended to support pairs of sequenced genomes. SyMAP uses MUMmer to compute the raw hits between the two genomes, which are then clustered and filtered using the optional gene annotation. The filtered hits are input to the synteny algorithm, which was designed to discover duplicated regions and form larger-scale synteny blocks, where intervening micro-rearrangements are allowed. SyMAP provides extensive interactive Java displays at all levels of resolution along with simultaneous displays of multiple aligned pairs. The synteny blocks from multiple chromosomes may be displayed in a high-level dot plot or three-dimensional view, and the user may then drill down to see the details of a region, including the alignments of the hits to the gene annotation. These capabilities are illustrated by showing their application to the study of genome duplication, differential gene loss and transitive homology between sorghum, maize and rice. The software may be used from a website or standalone for the best performance. A project manager is provided to organize and automate the analysis of multi-genome groups. The software is freely distributed at http://www.agcol.arizona.edu/software/symap.
Subjects
Plant Genome , Software , Synteny , Plant Chromosomes , Computer Graphics , Genomics/methods
ABSTRACT
Magnaporthe oryzae causes rice blast disease, which is the most serious disease of cultivated rice worldwide. We previously developed the Magnaporthe grisea-Oryza sativa (MGOS) database as a repository for the M. oryzae and rice genome sequences together with a comprehensive set of functional interaction data generated by a major consortium of U.S. researchers. The MGOS database has now undergone a major redesign to include data from the international blast research community, accessible with a new intuitive, easy-to-use interface. Registered database users can manually annotate gene sequences and features as well as add mutant data and literature on individual gene pages. Over 900 genes have been manually curated based on various biological databases and the scientific literature. Gene names and descriptions, gene ontology annotations, published and unpublished information on mutants and their phenotypes, responses in diverse microarray analyses, and related literature have been incorporated. Thus far, 362 M. oryzae genes have associated information on mutants. MGOS is now poised to become a one-stop repository for all structural and functional data available on all genes of this critically important rice pathogen.
Subjects
Computational Biology , Genetic Databases , Magnaporthe/genetics , Oryza/microbiology , Plant Diseases/microbiology , User-Computer Interface , Fungal Gene Expression Regulation , Fungal Genes/genetics , Fungal Genome , Molecular Sequence Annotation , Mutation , Phenotype , Controlled Vocabulary
ABSTRACT
BACKGROUND: Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. RESULTS: Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified, which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes.
Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism. CONCLUSION: Here we present a dataset for a large-scale study of the mechanisms of plant defense against insect eggs in a co-evolved, natural ecological plant-insect system. The EST database analysis provided here is a first step in elucidating the transcriptional responses of elm to elm leaf beetle infestation, and adds further to our knowledge on insect egg-induced transcriptomic changes in plants. The sequences identified in our comparative analysis give many hints about novel defense mechanisms directed towards eggs.
Subjects
Beetles/growth & development , Genetic Databases , Ulmus/genetics , Animals , Computational Biology , Cyclopentanes/metabolism , Expressed Sequence Tags , Gene Expression Profiling , Gene Library , Metabolic Networks and Pathways , Ovum/chemistry , Ovum/metabolism , Oxylipins/metabolism , Photosynthesis/genetics , Plant Leaves/genetics
ABSTRACT
Glandular trichomes play important roles in protecting plants from biotic attack by producing defensive compounds. We investigated the metabolic profiles and transcriptomes to characterize the differences between different glandular trichome types in several domesticated and wild Solanum species: Solanum lycopersicum (glandular trichome types 1, 6, and 7), Solanum habrochaites (types 1, 4, and 6), Solanum pennellii (types 4 and 6), Solanum arcanum (type 6), and Solanum pimpinellifolium (type 6). Substantial chemical differences in and between Solanum species and glandular trichome types are likely determined by the regulation of metabolism at several levels. Comparison of S. habrochaites type 1 and 4 glandular trichomes revealed few differences in chemical content or transcript abundance, leading to the conclusion that these two glandular trichome types are the same and differ perhaps only in stalk length. The observation that all of the other species examined here contain either type 1 or 4 trichomes (not both) supports the conclusion that these two trichome types are the same. Most differences in metabolites between type 1 and 4 glands on the one hand and type 6 glands on the other hand are quantitative but not qualitative. Several glandular trichome types express genes associated with photosynthesis and carbon fixation, indicating that some carbon destined for specialized metabolism is likely fixed within the trichome secretory cells. Finally, Solanum type 7 glandular trichomes do not appear to be involved in the biosynthesis and storage of specialized metabolites and thus likely serve another unknown function, perhaps as the site of the synthesis of protease inhibitors.
Subjects
Genomics/methods , Plant Epidermis/anatomy & histology , Plant Epidermis/genetics , Solanum/genetics , Liquid Chromatography , Cluster Analysis , Discriminant Analysis , Least-Squares Analysis , Mass Spectrometry , Metabolome/genetics , Molecular Sequence Data , Plant Leaves/anatomy & histology , Plant Leaves/genetics , Principal Component Analysis , Solanum/metabolism , Species Specificity
ABSTRACT
Nearly half the earth's surface is occupied by dryland ecosystems, regions susceptible to reduced states of biological productivity caused by climate fluctuations. Of these regions, arid zones located at the interface between vegetated semiarid regions and biologically unproductive hyperarid zones are considered most vulnerable. The objective of this study was to conduct a deep diversity analysis of bacterial communities in unvegetated arid soils of the Atacama Desert, to characterize community structure and infer the functional potential of these communities based on observed phylogenetic associations. A 454-pyrotag analysis was conducted of three unvegetated arid sites located at the hyperarid-arid margin. The analysis revealed communities with unique bacterial diversity marked by high abundances of novel Actinobacteria and Chloroflexi and low levels of Acidobacteria and Proteobacteria, phyla that are dominant in many biomes. A 16S rRNA gene library of one site revealed the presence of clones with phylogenetic associations to chemoautotrophic taxa able to obtain energy through oxidation of nitrite, carbon monoxide, iron, or sulfur. Thus, soils at the hyperarid margin were found to harbor a wealth of novel bacteria and to support potentially viable communities with phylogenetic associations to non-phototrophic primary producers and bacteria capable of biogeochemical cycling.
Subjects
Actinobacteria , Chloroflexi , Desert Climate , Bacterial RNA/genetics , 16S Ribosomal RNA/genetics , Soil Microbiology , Actinobacteria/classification , Actinobacteria/genetics , Actinobacteria/isolation & purification , Chile , Chloroflexi/classification , Chloroflexi/genetics , Chloroflexi/isolation & purification , Bacterial DNA/genetics , Ribosomal DNA/genetics
ABSTRACT
PREMISE OF THE STUDY: The common reed (Phragmites australis), one of the most widely distributed of all angiosperms, uses its rhizomes (underground stems) to invade new territory, making it one of the most successful weedy species worldwide. Characterization of the rhizome transcriptome and proteome is needed to identify candidate genes and proteins involved in rhizome growth, development, metabolism, and invasiveness. METHODS: We employed next-generation sequencing technologies including 454 and Illumina platforms to characterize the reed rhizome transcriptome and used quantitative proteomics techniques to identify the rhizome proteome. KEY RESULTS: Combining 336514 Roche 454 Titanium reads and 103350802 Illumina paired-end reads in a de novo hybrid assembly yielded 124450 unique transcripts with an average length of 549 bp, of which 54317 were annotated. Rhizome-specific and differentially expressed transcripts were identified between rhizome apical tips (apical meristematic region) and rhizome elongation zones. A total of 1280 nonredundant proteins were identified and quantified using GeLC-MS/MS based label-free proteomics, where 174 and 77 proteins were preferentially expressed in the rhizome elongation zone and apical tip tissues, respectively. Genes involved in allelopathy and in controlling development and potentially invasiveness were identified. CONCLUSIONS: In addition to being a valuable sequence and protein data resource for studying plant rhizome species, our results provide useful insights into identifying specific genes and proteins with potential roles in rhizome differentiation, development, and function.
Subjects
Gene Expression Profiling , Plant Genes , Poaceae/genetics , Proteomics , Rhizome/genetics , Base Sequence , Liquid Chromatography , Genetic Databases , Plant Gene Expression Regulation , Introduced Species , Mass Spectrometry , Meristem/genetics , Meristem/metabolism , Molecular Sequence Annotation , Plant Proteins/analysis , Plant Proteins/genetics , Plant Proteins/metabolism , Poaceae/growth & development , Poaceae/metabolism , Plant RNA/genetics , Rhizome/growth & development , Rhizome/metabolism , RNA Sequence Analysis , Species Specificity , Transcription Factors/genetics , Transcriptome
ABSTRACT
Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5' and 3' UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).
Subjects
Chromosome Mapping/methods , Complementary DNA/genetics , DNA Sequence Analysis/methods , Zea mays/genetics , Arabidopsis/genetics , Base Sequence , Plant Chromosomes/genetics , Contig Mapping , DNA Transposable Elements/genetics , Expressed Sequence Tags , Plant Genes/genetics , Internet , Minisatellite Repeats/genetics , Molecular Sequence Data , Oryza/genetics , Plant Proteins/metabolism , Poly A/genetics , Single Nucleotide Polymorphism/genetics , Populus/genetics , Nucleic Acid Sequence Homology , Sorghum/genetics , Transcription Factors/genetics
ABSTRACT
Eukaryotic plant pathogens are responsible for the destruction of billions of dollars worth of crops each year. With large-scale genomics of both pathogens and hosts and the corresponding computational analysis, biologists are now able to gain knowledge about many pathogenic and defense genes concurrently. To study the interactions between these two organism groups, it is necessary to design experiments to elucidate the genes being expressed during the invasion of the pathogen into the host. For the most part, this does not require new software development, though it does require the use of existing software in novel ways. This article provides a broad overview of several key and illustrative experiments and the corresponding computational analyses, outlining the knowledge gained in each. It goes on to describe databases for plant-pathogen data and important initiatives such as Plant-Associated Microbe Gene Ontology. It discusses how various emerging approaches will increase the power of computers in host-pathogen interaction studies.
Subjects
Chromosome Mapping/methods , Eukaryota/physiology , Fungi/physiology , Host-Parasite Interactions/genetics , Plants/genetics , Plants/microbiology , Software
ABSTRACT
Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial chromosome (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based on either individual BES alignments to the draft, or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available.
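The idea of a minimal tiling path can be illustrated with a greedy interval cover. This is only a sketch under strong assumptions: clones are reduced to known (name, start, end) coordinates on a contig, and a hypothetical `min_overlap` parameter stands in for the overlap evidence that FPC derives from fingerprints and sequence. It is not the FPC MTP module's algorithm.

```python
def minimal_tiling_path(clones, min_overlap=0):
    """Greedy tiling path over clones given as (name, start, end) tuples.

    At each step, among clones that overlap the current path by at least
    min_overlap, pick the one that extends furthest to the right.
    """
    if not clones:
        return []
    ordered = sorted(clones, key=lambda c: c[1])
    right_edge = max(c[2] for c in ordered)
    covered = ordered[0][1]  # rightmost coordinate covered so far
    path = []
    i = 0
    while covered < right_edge:
        # The first clone only has to start at the contig's left edge.
        limit = covered if not path else covered - min_overlap
        best = None
        while i < len(ordered) and ordered[i][1] <= limit:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap: no clone extends the path past {covered}")
        path.append(best)
        covered = best[2]
    return path


# Hypothetical clone coordinates on one contig.
clones = [("a", 0, 100), ("b", 20, 150), ("c", 80, 220),
          ("d", 140, 260), ("e", 200, 300)]
path_names = [name for name, _, _ in minimal_tiling_path(clones, min_overlap=10)]
```

Here three of the five hypothetical clones (a, c, e) are enough to span the contig while keeping at least 10 units of overlap between neighbors.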
Subjects
Contig Mapping/methods , Genomics/methods , Sequence Alignment/methods , Algorithms , Base Sequence , Software
ABSTRACT
Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes.
Subjects
Molecular Evolution , Plant Genome , Zea mays/genetics , Chromosome Mapping , Bacterial Artificial Chromosomes/genetics , Plant Chromosomes/genetics , DNA Fingerprinting , Plant DNA/genetics , Edible Grain/genetics , Gene Duplication , Gene Rearrangement , Oryza/genetics , Phylogeny , Species Specificity
ABSTRACT
BACKGROUND: New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with the traditional Sanger ESTs and experimenting with ESTs generated by the 454 Life Sciences sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information; hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs. RESULTS: The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly; however, it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs. CONCLUSION: The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.
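The "burying" step can be pictured with a deliberately simplified sketch. It assumes an EST is buried only when its sequence is an exact substring of a longer EST that has already been kept, whereas the abstract says PAVE buries similar ESTs. The function names and sequences are hypothetical, and this is not PAVE's implementation. The point it shows is that buried reads are left out of the assembly input but still counted toward expression level.

```python
def bury_ests(ests):
    """Split ESTs into kept representatives and buried reads.

    ests: dict mapping EST name -> sequence.
    Returns (kept, buried_under), where buried_under maps each kept EST
    to the names of the ESTs buried beneath it.
    """
    # Process longest first so a potential parent is seen before its children.
    ordered = sorted(ests.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    kept = {}
    buried_under = {}
    for name, seq in ordered:
        # Simplifying assumption: exact containment stands in for similarity.
        parent = next((k for k, s in kept.items() if seq in s), None)
        if parent is None:
            kept[name] = seq
            buried_under[name] = []
        else:
            buried_under[parent].append(name)
    return kept, buried_under


def expression_level(buried_under):
    """Reads represented by each kept EST: itself plus everything buried under it."""
    return {parent: 1 + len(children) for parent, children in buried_under.items()}


# Hypothetical reads: e2 and e3 are contained in e1; e4 is unrelated.
ests = {"e1": "ACGTACGTGG", "e2": "CGTACG", "e3": "ACGTACGTGG", "e4": "TTTTGGGG"}
kept, buried_under = bury_ests(ests)
levels = expression_level(buried_under)
```

Only e1 and e4 would be passed on to assembly in this toy case, while the count of 3 for e1 preserves the depth information from the buried reads.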
Subjects
Expressed Sequence Tags , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Algorithms , Cluster Analysis , Contig Mapping , Plant Genome , Zea mays/genetics
ABSTRACT
BACKGROUND: Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research, genomic resources must be developed for it. One such resource is a BAC-based physical map, which will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence. RESULTS: A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method, and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs, and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that the Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequence revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene tags, opening the door to Brachypodium-Triticeae comparative genomics. CONCLUSION: The construction of the Brachypodium physical map and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of the Brachypodium genome sequence and for grass comparative genomics. A draft of the physical map and its comparisons with rice and wheat are available at http://phymap.ucdavis.edu/brachypodium/.
Subjects
Chromosomes, Artificial, Bacterial/genetics , Oryza/genetics , Physical Chromosome Mapping/methods , Poaceae/genetics , Triticum/genetics , Contig Mapping , DNA Fingerprinting , Edible Grain/genetics , Evolution, Molecular , Expressed Sequence Tags/metabolism , Genome, Plant/genetics
ABSTRACT
BACKGROUND: Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends. RESULTS: A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to approximately 11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary. CONCLUSION: MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. Although the types and natures of epigenetic boundaries are barely understood at this time, MSLL technology flags both approximate boundaries and methylated genes that deserve additional investigation. MSLL and HMPR sequences provide a valuable resource for maize genome annotation, and are a uniquely valuable complement to any plant genome sequencing project. In order to make these results fully accessible to the community, a web display was developed that shows the alignment of MSLL, HMPR, and other gene-rich sequences to the BACs; this display is continually updated with the latest ESTs and BAC sequences.
Subjects
Chromosome Mapping/methods , DNA Methylation , Genome, Plant , Zea mays/genetics , Chromosomes, Artificial, Bacterial , DNA, Plant/genetics , Epigenesis, Genetic , Gene Library , Genomics/methods , Sequence Alignment , Sequence Analysis, DNA/methods
ABSTRACT
A comparative physical map of the AA genome (Oryza sativa) and the BB genome (O. punctata) was constructed by aligning a physical map of O. punctata, deduced from 63,942 BAC end sequences (BESs) and 34,224 fingerprints, onto the O. sativa genome sequence. The level of conservation of each chromosome between the two species was determined by calculating a ratio of BES alignments. The alignment result suggests more divergence of intergenic and repeat regions in comparison to gene-rich regions. Further, this characteristic enabled localization of heterochromatic and euchromatic regions for each chromosome of both species. The alignment identified 16 locations containing expansions, contractions, inversions, and transpositions. By aligning 40% of the O. punctata BESs on the map, 87% of the O. punctata FPC map covered 98% of the O. sativa genome sequence. The genome size of O. punctata was estimated to be 8% larger than that of O. sativa, with individual chromosome differences of 1.5-16.5%. The sums of expansions and of contractions observed in regions >500 kb were similar, suggesting that most of the contractions/expansions contributing to the genome size difference between the two species are small, thus preserving the macro-collinearity between these species, which diverged approximately 2 million years ago.
Subjects
Genome, Plant/genetics , Oryza/classification , Oryza/genetics , Physical Chromosome Mapping , Chromosome Inversion/genetics , Chromosomes, Artificial, Bacterial/genetics , Chromosomes, Plant/genetics , Clone Cells , Molecular Sequence Data , Translocation, Genetic
ABSTRACT
Identification of important transcripts from fungal pathogens and host plants is indispensable for fully understanding the molecular events occurring during fungal-plant interactions. Recently, we developed an improved LongSAGE method called robust-long serial analysis of gene expression (RL-SAGE) for deep transcriptome analysis of fungal and plant genomes. Using this method, we made 10 RL-SAGE libraries from two plant species (Oryza sativa and Zea mays) and one fungal pathogen (Magnaporthe grisea). Many of the transcripts identified from these libraries were novel in comparison with their corresponding EST collections. Bioinformatic tools and databases for analyzing the RL-SAGE data were developed. Our results demonstrate that RL-SAGE is an effective approach for large-scale identification of expressed genes in fungal and plant genomes.