Results 1 - 19 of 19
1.
G3 (Bethesda) ; 14(1)2023 Dec 29.
Article in English | MEDLINE | ID: mdl-37875130

ABSTRACT

Black spruce (Picea mariana [Mill.] B.S.P.) is a dominant conifer species in the North American boreal forest that plays important ecological and economic roles. Here, we present the first genome assembly of P. mariana with a reconstructed genome size of 18.3 Gbp and NG50 scaffold length of 36.0 kbp. A total of 66,332 protein-coding sequences were predicted in silico and annotated based on sequence homology. We analyzed the evolutionary relationships between P. mariana and 5 other spruces for which complete nuclear and organelle genome sequences were available. The phylogenetic tree estimated from mitochondrial genome sequences agrees with biogeography; specifically, P. mariana was strongly supported as a sister lineage to P. glauca and 3 other taxa found in western North America, followed by the European Picea abies. We obtained mixed topologies with weaker statistical support in phylogenetic trees estimated from nuclear and chloroplast genome sequences, indicative of ancient reticulate evolution affecting these 2 genomes. Clustering of protein-coding sequences from the 6 Picea taxa and 2 Pinus species resulted in 34,776 orthogroups, 560 of which appeared to be specific to P. mariana. Analysis of these specific orthogroups and dN/dS analysis of positive selection signatures for 497 single-copy orthogroups identified gene functions mostly related to plant development and stress response. The P. mariana genome assembly and annotation provides a valuable resource for forest genetics research and applications in this broadly distributed species, especially in relation to climate adaptation.
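
The NG50 statistic quoted above can be illustrated with a minimal sketch. It assumes the standard definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the estimated genome size). The function name and the toy scaffold lengths are hypothetical and are not taken from the P. mariana assembly.

```python
def ng50(scaffold_lengths, genome_size):
    """Return the NG50 of an assembly, or 0 if it covers under half the genome.

    scaffold_lengths: iterable of scaffold lengths (any consistent unit).
    genome_size: estimated genome size in the same unit (an assumption
    supplied by the caller, not derived from the assembly).
    """
    half_genome = genome_size / 2
    covered = 0
    # Walk scaffolds from longest to shortest, accumulating their lengths.
    for length in sorted(scaffold_lengths, reverse=True):
        covered += length
        if covered >= half_genome:
            return length
    return 0


# Toy example with hypothetical lengths: the three longest scaffolds
# (50 + 40 + 30 = 120) are needed to reach half of a genome of size 200.
print(ng50([50, 40, 30, 20, 10], 200))
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses the estimated genome size, so a fragmented assembly of a very large genome is not flattered by missing sequence.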


Subject(s)
Picea , Phylogeny , Picea/genetics , North America
2.
Plant J ; 111(5): 1469-1485, 2022 09.
Article in English | MEDLINE | ID: mdl-35789009

ABSTRACT

Spruces (Picea spp.) are coniferous trees widespread in boreal and mountainous forests of the northern hemisphere, with large economic significance and enormous contributions to global carbon sequestration. Spruces harbor very large genomes with high repetitiveness, hampering their comparative analysis. Here, we present and compare the genomes of four different North American spruces: the genome assemblies for Engelmann spruce (Picea engelmannii) and Sitka spruce (Picea sitchensis) together with improved and more contiguous genome assemblies for white spruce (Picea glauca) and for a naturally occurring introgress of these three species known as interior spruce (P. engelmannii × glauca × sitchensis). The genomes were structurally similar, and a large part of scaffolds could be anchored to a genetic map. The composition of the interior spruce genome indicated asymmetric contributions from the three ancestral genomes. Phylogenetic analysis of the nuclear and organelle genomes revealed a topology indicative of ancient reticulation. Different patterns of expansion of gene families among genomes were observed and related with presumed diversifying ecological adaptations. We identified rapidly evolving genes that harbored high rates of non-synonymous polymorphisms relative to synonymous ones, indicative of positive selection and its hitchhiking effects. These gene sets were mostly distinct between the genomes of ecologically contrasted species, and signatures of convergent balancing selection were detected. Stress and stimulus response was identified as the most frequent function assigned to expanding gene families and rapidly evolving genes. These two aspects of genomic evolution were complementary in their contribution to divergent evolution of presumed adaptive nature. 
These more contiguous spruce giga-genome sequences should strengthen our understanding of conifer genome structure and evolution, as their comparison offers clues into the genetic basis of adaptation and ecology of conifers at the genomic level. They will also provide tools to better monitor natural genetic diversity and improve the management of conifer forests. The genomes of four closely related North American spruces indicate that their high similarity at the morphological level is paralleled by the high conservation of their physical genome structure. Yet, the evidence of divergent evolution is apparent in their rapidly evolving genomes, supported by differential expansion of key gene families and large sets of genes under positive selection, largely in relation to stimulus and environmental stress response.


Subject(s)
Picea , Tracheophyta , Expressed Sequence Tags , Genome, Plant/genetics , Multigene Family/genetics , Phylogeny , Picea/genetics , Tracheophyta/genetics
3.
BMC Genomics ; 19(1): 942, 2018 Dec 17.
Article in English | MEDLINE | ID: mdl-30558528

ABSTRACT

BACKGROUND: Norway spruce [Picea abies (L.) Karst.] is ecologically and economically one of the most important conifers worldwide. Our main goal was to develop a large catalog of annotated high confidence gene SNPs that should sustain the development of genomic tools for the conservation of natural and domesticated genetic diversity resources, and hasten tree breeding efforts in this species. RESULTS: Targeted sequencing was achieved by capturing P. abies exome with probes previously designed from the sequenced transcriptome of white spruce (Picea glauca (Moench) Voss). Capture efficiency was high (74.5%) given a high level of exome conservation between the two species. Using stringent criteria, we delimited a set of 61,771 high-confidence SNPs across 13,543 genes. To validate SNPs, a high-throughput genotyping array was developed for a subset of 5571 predicted SNPs representing as many different gene loci, and was used to genotype over 1000 trees. The estimated true positive rate of the resource was 84.2%, which was comparable with the genotyping success rate obtained for P. abies control SNPs recycled from previous genotyping efforts. We also analyzed SNP abundance across various gene functional categories. Several GO terms and gene families involved in stress response were found over-represented in highly polymorphic genes. CONCLUSION: The annotated high-confidence SNP catalog developed herein represents a valuable genomic resource, being representative of over 13 K genes distributed across the P. abies genome. This resource should serve a variety of population genomics and breeding applications in Norway spruce.


Subject(s)
Exome/genetics , Picea/genetics , Polymorphism, Single Nucleotide , Contig Mapping , DNA, Plant/isolation & purification , DNA, Plant/metabolism , Genotype , Molecular Sequence Annotation , Plant Leaves/genetics , Sequence Analysis, DNA
4.
Plant J ; 90(1): 189-203, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28090692

ABSTRACT

Over the last decade, extensive genetic and genomic resources have been developed for the conifer white spruce (Picea glauca, Pinaceae), which has one of the largest plant genomes (20 Gbp). Draft genome sequences of white spruce and other conifers have recently been produced, but dense genetic maps are needed to comprehend genome macrostructure, delineate regions involved in quantitative traits, complement functional genomic investigations, and assist the assembly of fragmented genomic sequences. A greatly expanded P. glauca composite linkage map was generated from a set of 1976 full-sib progeny, with the positioning of 8793 expressed genes. Regions with significantly low or high gene density were identified. Gene family members tended to be mapped on the same chromosomes, with tandemly arrayed genes significantly biased towards specific functional classes. The map was integrated with transcriptome data surveyed across eight tissues. In total, 69 clusters of co-expressed and co-localising genes were identified. A high level of synteny was found with pine genetic maps, which should facilitate the transfer of structural information in the Pinaceae. Although the current white spruce genome sequence remains highly fragmented, dozens of scaffolds encompassing more than one mapped gene were identified. From these, the relationship between genetic and physical distances was examined and the genome-wide recombination rate was found to be much smaller than most estimates reported for angiosperm genomes. This gene linkage map shall assist the large-scale assembly of the next-generation white spruce genome sequence and provide a reference resource for the conifer genomics community.


Subject(s)
Genome, Plant/genetics , Picea/genetics , Chromosome Mapping/methods , Computational Biology/methods , DNA, Plant/genetics , Genomics/methods , Polymorphism, Single Nucleotide/genetics , Synteny
5.
Mol Ecol Resour ; 16(2): 588-98, 2016 03.
Article in English | MEDLINE | ID: mdl-26391535

ABSTRACT

Picea mariana is a widely distributed boreal conifer across Canada and the subject of advanced breeding programmes for which population genomics and genomic selection approaches are being developed. Targeted sequencing was achieved after capturing P. mariana exome with probes designed from the sequenced transcriptome of Picea glauca, a distant relative. A high capture efficiency of 75.9% was reached although spruce has a complex and large genome including gene sequences interspersed by some long introns. The results confirmed the relevance of using probes from congeneric species to successfully perform interspecific exome capture in the genus Picea. A bioinformatics pipeline was developed including stringent criteria that helped detect a set of 97,075 highly reliable in silico SNPs. These SNPs were distributed across 14,909 genes. Part of an Infinium iSelect array was used to estimate the rate of true positives by validating 4267 of the predicted in silico SNPs by genotyping trees from P. mariana populations. The true positive rate was 96.2% for in silico SNPs, compared to a genotyping success rate of 96.7% for a set of 1115 P. mariana control SNPs recycled from previous genotyping arrays. These results indicate the high success rate of the genotyping array and the relevance of the selection criteria used to delineate the new P. mariana in silico SNP resource. Furthermore, in silico SNPs were generally of medium to high frequency in natural populations, thus providing high informative value for future population genomics applications.


Subject(s)
Exome , Genetic Variation , Genotyping Techniques/methods , Picea/classification , Picea/genetics , Polymorphism, Single Nucleotide , Canada , Sequence Analysis, DNA
6.
Genome Biol Evol ; 7(12): 3269-85, 2015 Nov 11.
Article in English | MEDLINE | ID: mdl-26560341

ABSTRACT

Understanding the genetic basis of adaptation to climate is of paramount importance for preserving and managing genetic diversity in plants in a context of climate change. Yet, this objective has been addressed mainly in short-lived model species. Thus, expanding knowledge to nonmodel species with contrasting life histories, such as forest trees, appears necessary. To uncover the genetic basis of adaptation to climate in the widely distributed boreal conifer white spruce (Picea glauca), an environmental association study was conducted using 11,085 single nucleotide polymorphisms representing 7,819 genes, that is, approximately a quarter of the transcriptome. Linear and quadratic regressions controlling for isolation-by-distance, and the Random Forest algorithm, identified several dozen genes putatively under selection, among which 43 showed strongest signals along temperature and precipitation gradients. Most of them were related to temperature. Small to moderate shifts in allele frequencies were observed. Genes involved encompassed a wide variety of functions and processes, some of them being likely important for plant survival under biotic and abiotic environmental stresses according to expression data. Literature mining and sequence comparison also highlighted conserved sequences and functions with angiosperm homologs. Our results are consistent with theoretical predictions that local adaptation involves genes with small frequency shifts when selection is recent and gene flow among populations is high. Accordingly, genetic adaptation to climate in P. glauca appears to be complex, involving many independent and interacting gene functions, biochemical pathways, and processes. From an applied perspective, these results shall lead to specific functional/association studies in conifers and to the development of markers useful for the conservation of genetic resources.


Subject(s)
Acclimatization/genetics , Gene Frequency , Genes, Plant , Picea/genetics , Models, Genetic
7.
Genome Biol Evol ; 5(10): 1910-25, 2013.
Article in English | MEDLINE | ID: mdl-24065735

ABSTRACT

Gene families differ in composition, expression, and chromosomal organization between conifers and angiosperms, but little is known regarding nucleotide polymorphism. Using various sequencing strategies, an atlas of 212k high-confidence single nucleotide polymorphisms (SNPs) with a validation rate of more than 92% was developed for the conifer white spruce (Picea glauca). Nonsynonymous and synonymous SNPs were annotated over the corresponding 13,498 white spruce genes representative of 2,457 known gene families. Patterns of nucleotide polymorphisms were analyzed by estimating the ratio of nonsynonymous to synonymous numbers of substitutions per site (A/S). A general excess of synonymous SNPs was expected and observed. However, the analysis from several perspectives made it possible to identify groups of genes harboring an excess of nonsynonymous SNPs, thus potentially under positive selection. Four known gene families harbored such an excess: dehydrins, ankyrin-repeats, AP2/DREB, and leucine-rich repeat. Conifer-specific sequences were also generally associated with the highest A/S ratios. A/S values were also distributed asymmetrically across genes specifically expressed in megagametophytes, roots, or in both, harboring on average an excess of nonsynonymous SNPs. These patterns confirm that the breadth of gene expression is a contributing factor to the evolution of nucleotide polymorphism. The A/S ratios of Medicago truncatula genes were also analyzed: several gene families shared between P. glauca and M. truncatula data sets had similar excess of synonymous or nonsynonymous SNPs. However, a number of families with high A/S ratios were found specific to P. glauca, suggesting cases of divergent evolution at the functional level.


Subject(s)
Genome, Plant , Medicago truncatula/genetics , Picea/genetics , Polymorphism, Single Nucleotide/genetics , Base Sequence , Expressed Sequence Tags , Genotype , Medicago truncatula/classification , Multigene Family , Open Reading Frames/genetics , Picea/classification , Sequence Analysis, DNA
8.
Mol Ecol Resour ; 13(2): 324-36, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23351128

ABSTRACT

High-density SNP genotyping arrays can be designed for any species given sufficient sequence information of high quality. Two high-density SNP arrays relying on the Infinium iSelect technology (Illumina) were designed for use in the conifer white spruce (Picea glauca). One array contained 7338 segregating SNPs representative of 2814 genes of various molecular functional classes for main uses in genetic association and population genetics studies. The other one contained 9559 segregating SNPs representative of 9543 genes for main uses in population genetics, linkage mapping of the genome and genomic prediction. The SNPs assayed were discovered from various sources of gene resequencing data. SNPs predicted from high-quality sequences derived from genomic DNA reached a genotyping success rate of 64.7%. Nonsingleton in silico SNPs (i.e. a sequence polymorphism present in at least two reads) predicted from expressed sequence tags obtained with the Roche 454 technology and Illumina GAII analyser resulted in a similar genotyping success rate of 71.6% when the deepest alignment was used and the most favourable SNP probe per gene was selected. A variable proportion of these SNPs was shared by other nordic and subtropical spruce species from North America and Europe. The number of shared SNPs was inversely proportional to phylogenetic divergence and standing genetic variation in the recipient species, but positively related to allele frequency in P. glauca natural populations. These validated SNP resources should open up new avenues for population genetics and comparative genetic mapping at a genomic scale in spruce species.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods , Picea/genetics , Polymorphism, Single Nucleotide , Genomics , Genotype , Phylogeny , Picea/classification
9.
BMC Biol ; 10: 84, 2012 Oct 26.
Article in English | MEDLINE | ID: mdl-23102090

ABSTRACT

BACKGROUND: Seed plants are composed of angiosperms and gymnosperms, which diverged from each other around 300 million years ago. While much light has been shed on the mechanisms and rate of genome evolution in flowering plants, such knowledge remains conspicuously meagre for the gymnosperms. Conifers are key representatives of gymnosperms and the sheer size of their genomes represents a significant challenge for characterization, sequencing and assembling. RESULTS: To gain insight into the macro-organisation and long-term evolution of the conifer genome, we developed a genetic map involving 1,801 spruce genes. We designed a statistical approach based on kernel density estimation to analyse gene density and identified seven gene-rich isochors. Groups of co-localizing genes were also found that were transcriptionally co-regulated, indicative of functional clusters. Phylogenetic analyses of 157 gene families for which at least two duplicates were mapped on the spruce genome indicated that ancient gene duplicates shared by angiosperms and gymnosperms outnumbered conifer-specific duplicates by a ratio of eight to one. Ancient duplicates were much more translocated within and among spruce chromosomes than conifer-specific duplicates, which were mostly organised in tandem arrays. Both high synteny and collinearity were also observed between the genomes of spruce and pine, two conifers that diverged more than 100 million years ago. CONCLUSIONS: Taken together, these results indicate that much genomic evolution has occurred in the seed plant lineage before the split between gymnosperms and angiosperms, and that the pace of evolution of the genome macro-structure has been much slower in the gymnosperm lineage leading to extant conifers than that seen for the same period of time in flowering plants. This trend is largely congruent with the contrasted rates of diversification and morphological evolution observed between these two groups of seed plants.
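
As a rough illustration of how kernel density estimation can expose gene-rich regions along a linkage group, the following sketch sums Gaussian kernels centred on mapped gene positions. It is a generic textbook form, not the statistical approach designed by the authors; the function name, the bandwidth value and the map positions are all hypothetical.

```python
import math


def gene_density(positions, grid, bandwidth):
    """Gaussian kernel estimate of gene density at each grid point.

    positions: map positions of genes on one linkage group (e.g. in cM).
    grid: positions at which the density is evaluated.
    bandwidth: kernel standard deviation (an arbitrary choice here).
    The kernels are summed rather than averaged, so the result is in
    genes per map unit and integrates to the number of genes.
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions)
        density.append(norm * total)
    return density


# Hypothetical positions: four genes clustered near 10-13 cM, one at 50 cM.
genes = [10.0, 11.0, 12.0, 13.0, 50.0]
dense, sparse = gene_density(genes, [11.5, 50.0], bandwidth=2.0)
print(dense > sparse)  # the clustered region has the higher estimate
```

In practice the bandwidth controls how much local clustering is smoothed away, and a significance test against a null distribution of gene positions would be needed before calling a region gene-rich.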


Subject(s)
Chromosome Mapping , DNA Shuffling , Evolution, Molecular , Genome, Plant/genetics , Phylogeny , Picea/genetics , Chromosomes, Plant/genetics , Extinction, Biological , Gene Duplication/genetics , Gene Expression Regulation, Plant , Genes, Plant/genetics , Genetic Linkage , Methyltransferases/genetics , Molecular Sequence Annotation , Multigene Family/genetics , Picea/enzymology , Pinus/genetics
10.
Plant Mol Biol ; 80(6): 555-69, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22960864

ABSTRACT

Several new initiatives have been launched recently to sequence conifer genomes including pines, spruces and Douglas-fir. Owing to the very large genome sizes ranging from 18 to 35 gigabases, sequencing even a single conifer genome had been considered unattainable until the recent throughput increases and cost reductions afforded by next generation sequencers. The purpose of this review is to describe the context for these new initiatives. A knowledge foundation has been acquired in several conifers of commercial and ecological interest through large-scale cDNA analyses, construction of genetic maps and gene mapping studies aiming to link phenotype and genotype. Exploratory sequencing in pines and spruces has pointed out some of the unique properties of these giga-genomes and suggested strategies that may be needed to extract value from their sequencing. The hope is that recent and pending developments in sequencing technology will contribute to rapidly filling the knowledge vacuum surrounding their structure, contents and evolution. Researchers are also making plans to use comparative analyses that will help to turn the data into a valuable resource for enhancing and protecting the world's conifer forests.


Subject(s)
Genome, Plant , Tracheophyta/genetics , Breeding , Chromosome Mapping , Chromosomes, Artificial, Bacterial/genetics , Gene Expression Profiling , Genetic Association Studies , Genomics/methods , Genomics/trends , Multigene Family , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , RNA, Plant/genetics , RNA, Small Untranslated/genetics , Transcriptome
11.
New Phytol ; 188(3): 774-86, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20955415

ABSTRACT

• The eucalyptus R2R3 transcription factor, EgMYB1 contains an active repressor motif in the regulatory domain of the predicted protein. It is preferentially expressed in differentiating xylem and is capable of repressing the transcription of two key lignin genes in vivo. • In order to investigate in planta the role of this putative transcriptional repressor of the lignin biosynthetic pathway, we overexpressed the EgMYB1 gene in Arabidopsis and poplar. • Expression of EgMYB1 produced similar phenotypes in both species, with stronger effects in transgenic Arabidopsis plants than in poplar. Vascular development was altered in overexpressors showing fewer lignified fibres (in phloem and interfascicular zones in poplar and Arabidopsis, respectively) and reduced secondary wall thickening. Klason lignin content was moderately but significantly reduced in both species. Decreased transcript accumulation was observed for genes involved in the biosynthesis of lignins, cellulose and xylan, the three main polymers of secondary cell walls. Transcriptomic profiles of transgenic poplars were reminiscent of those reported when lignin biosynthetic genes are disrupted. • Together, these results strongly suggest that EgMYB1 is a repressor of secondary wall formation and provide new opportunities to dissect the transcriptional regulation of secondary wall biosynthesis.


Subject(s)
Arabidopsis/metabolism , Cell Wall/metabolism , Eucalyptus/metabolism , Gene Expression Regulation, Plant , Lignin/biosynthesis , Populus/metabolism , Transcription Factors/metabolism , Arabidopsis/genetics , Cellulose/biosynthesis , Cellulose/genetics , Eucalyptus/genetics , Gene Expression , Gene Expression Profiling , Genes, Plant , Lignin/genetics , Phenotype , Plant Proteins/genetics , Plant Proteins/metabolism , Plant Vascular Bundle/cytology , Plant Vascular Bundle/metabolism , Plants, Genetically Modified , Populus/genetics , Transcription Factors/genetics , Xylans/biosynthesis , Xylans/genetics
12.
Tree Physiol ; 30(10): 1273-89, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20739427

ABSTRACT

Previous studies indicated that high nitrogen fertilization may impact secondary xylem development and alter fibre anatomy and composition. The resulting wood shares some resemblance with tension wood, which has much thicker cell walls than normal wood due to the deposition of an additional layer known as the G-layer. This report compares the short-term effects of high nitrogen fertilization and tree leaning to induce tension wood, either alone or in combination, upon wood formation in young trees of Populus trichocarpa (Torr. & Gray) × P. deltoides Bartr. ex Marsh. Fibre anatomy, chemical composition and transcript profiles were examined in newly formed secondary xylem. Each of the treatments resulted in thicker cell walls relative to the controls. High nitrogen and tree leaning had overlapping effects on chemical composition based on Fourier transform infrared analysis, specifically indicating that secondary cell wall composition was shifted in favour of cellulose and hemicelluloses relative to lignin content. In contrast, the high-nitrogen trees had shorter fibres, whilst the leaning trees had longer fibres than the controls. Microarray transcript profiling carried out after 28 days of treatment identified 180 transcripts that accumulated differentially in one or more treatments. Only 10% of differentially expressed transcripts were affected in all treatments relative to the controls. Several of the affected transcripts were related to carbohydrate metabolism, secondary cell wall formation, nitrogen metabolism and osmotic stress. RT-qPCR analyses at 1, 7 and 28 days showed that several transcripts followed very different accumulation profiles in terms of rate and level of accumulation, depending on the treatment. Our findings suggest that high nitrogen fertilization and tension wood induction elicit largely distinct molecular pathways with partial overlap. When combined, the two types of environmental cue yielded additive effects.


Subject(s)
Plant Stems/physiology , Populus/growth & development , Wood/growth & development , Light , Nitrogen/metabolism , Polysaccharides/analysis , Populus/genetics , Populus/physiology , Spectroscopy, Fourier Transform Infrared , Stress, Mechanical , Wood/physiology , Xylem/physiology
13.
New Phytol ; 180(4): 766-86, 2008.
Article in English | MEDLINE | ID: mdl-18811621

ABSTRACT

One approach for investigating the molecular basis of wood formation is to integrate microarray profiling data sets and sequence analyses, comparing tree species with model plants such as Arabidopsis. Conifers may be included in comparative studies thanks to large-scale expressed sequence tag (EST) analyses, which enable the development of cDNA microarrays with very significant genome coverage. A microarray of 10,400 low-redundancy sequences was designed starting from white spruce (Picea glauca (Moench.) Voss) cDNAs. Computational procedures that were developed to ensure broad transcriptome coverage and efficient PCR amplification were used to select cDNA clones, which were re-sequenced in the microarray manufacture process. White spruce transcript profiling experiments that compared secondary xylem to phloem and needles identified 360 xylem-preferential gene sequences. The functional annotations of all differentially expressed sequences were highly consistent with the results of similar analyses carried out in angiosperm trees and herbaceous plants. Computational analyses comparing the spruce microarray sequences and core xylem gene sets from Arabidopsis identified 31 transcripts that were highly conserved in angiosperms and gymnosperms, in terms of both sequence and xylem expression. Several other spruce sequences have not previously been linked to xylem differentiation (including genes encoding TUBBY-like domain proteins (TLPs) and a gibberellin insensitive (gai) gene sequence) or were shown to encode proteins of unknown function encompassing diverse conserved domains of unknown function.


Subject(s)
Gene Expression Profiling , Genes, Plant , Picea/genetics , Xylem/genetics , Arabidopsis/genetics , Base Sequence , Expressed Sequence Tags , Gene Expression Regulation, Plant , Genome, Plant , Microarray Analysis/methods , Multigene Family , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , Phloem/genetics , Plant Leaves/genetics , Reverse Transcriptase Polymerase Chain Reaction/methods , Sequence Analysis , Transcription, Genetic , Trees/genetics
14.
J Exp Bot ; 59(14): 3925-39, 2008.
Article in English | MEDLINE | ID: mdl-18805909

ABSTRACT

The involvement of two R2R3-MYB genes from Pinus taeda L., PtMYB1 and PtMYB8, in phenylpropanoid metabolism and secondary cell wall biogenesis was investigated in planta. These pine MYBs were constitutively overexpressed (OE) in Picea glauca (Moench) Voss, used as a heterologous conifer expression system. Morphological, histological, chemical (lignin and soluble phenols), and transcriptional analyses, i.e. microarray and reverse transcription quantitative PCR (RT-qPCR) were used for extensive phenotyping of MYB-overexpressing spruce plantlets. Upon germination of somatic embryos, root growth was reduced in both transgenics. Enhanced lignin deposition was also a common feature but ectopic secondary cell wall deposition was more strongly associated with PtMYB8-OE. Microarray and RT-qPCR data showed that overexpression of each MYB led to an overlapping up-regulation of many genes encoding phenylpropanoid enzymes involved in lignin monomer synthesis, while misregulation of several cell wall-related genes and other MYB transcription factors was specifically associated with PtMYB8-OE. Together, the results suggest that MYB1 and MYB8 may be part of a conserved transcriptional network involved in secondary cell wall deposition in conifers.


Subject(s)
Cell Wall/metabolism , Picea/metabolism , Pinus taeda/genetics , Plant Proteins/metabolism , Transcription Factors/metabolism , Cell Wall/genetics , Gene Expression , Lignin/metabolism , Molecular Sequence Data , Phenols/metabolism , Phloem/metabolism , Picea/genetics , Plant Proteins/genetics , Transcription Factors/genetics , Transcription, Genetic
15.
BMC Genomics ; 9: 21, 2008 Jan 18.
Article in English | MEDLINE | ID: mdl-18205909

ABSTRACT

BACKGROUND: To explore the potential value of high-throughput genotyping assays in the analysis of large and complex genomes, we designed two highly multiplexed Illumina bead arrays using the GoldenGate SNP assay for gene mapping in white spruce (Picea glauca [Moench] Voss) and black spruce (Picea mariana [Mill.] B.S.P.). RESULTS: Each array included 768 SNPs, identified by resequencing genomic DNA from parents of each mapping population. For white spruce and black spruce, respectively, 69.2% and 77.1% of genotyped SNPs had valid GoldenGate assay scores and segregated in the mapping populations. For each of these successful SNPs, on average, valid genotyping scores were obtained for over 99% of progeny. SNP data were integrated with pre-existing AFLP, ESTP, and SSR markers to construct two individual linkage maps and a composite map for white spruce and black spruce genomes. The white spruce composite map contained 821 markers including 348 gene loci. Also, 835 markers including 328 gene loci were positioned on the black spruce composite map. In total, 215 anchor markers (mostly gene markers) were shared between the two species. Considering lineage divergence at least 10 Myr ago between the two spruces, interspecific comparison of homoeologous linkage groups revealed remarkable synteny and marker colinearity. CONCLUSION: The design of customized highly multiplexed Illumina SNP arrays appears as an efficient procedure to enhance the mapping of expressed genes and make linkage maps more informative and powerful in such species with poorly known genomes. This genotyping approach will open new avenues for co-localizing candidate genes and QTLs, partial genome sequencing, and comparative mapping across conifers.


Subject(s)
Chromosome Mapping/methods , Genome, Plant , Oligonucleotide Array Sequence Analysis , Picea/genetics , Polymorphism, Single Nucleotide , Chromosomes, Plant , Cluster Analysis , Computational Biology/methods , Crosses, Genetic , DNA Primers/chemistry , DNA, Plant/genetics , DNA, Plant/isolation & purification , Expressed Sequence Tags , Genetic Markers , Genotype , Nucleic Acid Amplification Techniques , Polymerase Chain Reaction , Reproducibility of Results , Sequence Analysis, DNA , Synteny , Temperature , Time Factors
16.
Nucleic Acids Res ; 35(Database issue): D888-94, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17130142

ABSTRACT

ForestTreeDB is intended as a resource that centralizes large-scale expressed sequence tag (EST) sequencing results from several tree species (http://foresttree.org/ftdb). It currently encompasses 344,878 quality sequences from 68 libraries, from diverse organs of conifer and hybrid poplar trees. It utilizes the Nimbus data model to provide a hosting system for multiple projects, and uses object-relational mapping APIs in Java and Perl for data accesses within an Oracle database designed to be scalable, maintainable and extendable. Transcriptome builds or unigene sets occupy the focal point of the system. Several of the five current species-specific unigenes were used to design microarrays and SNP resources. The ForestTreeDB web application provides the means for multiple combination database queries. It presents the user with a list of discrete queries to retrieve and download large EST datasets or sequences from precompiled unigene assemblies. Functional annotation assignment is not trivial in conifers which are distantly related to angiosperm model plants. Optimal annotations are achieved through database queries that integrate results from several procedures based on open-source tools. ForestTreeDB aims to facilitate sequence mining of coherent annotations in multiple species to support comparative genomic approaches. We plan to continuously enrich ForestTreeDB with other resources through collaborations with other genomic projects.


Subject(s)
Databases, Nucleic Acid , Expressed Sequence Tags/chemistry , Populus/genetics , Tracheophyta/genetics , Gene Expression Profiling , Genomics , Internet , Polymorphism, Single Nucleotide , Transcription, Genetic , Trees/genetics , User-Computer Interface
17.
BMC Genomics ; 7: 174, 2006 Jul 06.
Article in English | MEDLINE | ID: mdl-16824208

ABSTRACT

BACKGROUND: High-throughput genotyping technologies represent a highly efficient way to accelerate genetic mapping and enable association studies. As a first step toward this goal, we aimed to develop a resource of candidate Single Nucleotide Polymorphisms (SNP) in white spruce (Picea glauca [Moench] Voss), a softwood tree of major economic importance. RESULTS: A white spruce SNP resource encompassing 12,264 SNPs was constructed from a set of 6,459 contigs derived from Expressed Sequence Tags (EST) and by using the Bayesian-based statistical software PolyBayes. Several parameters influencing the SNP prediction were analysed including the a priori expected polymorphism, the probability score (PSNP), and the contig depth and length. SNP detection in 3' and 5' reads from the same clones revealed a level of inconsistency between overlapping sequences as low as 1%. A subset of 245 predicted SNPs were verified through the independent resequencing of genomic DNA of a genotype also used to prepare cDNA libraries. The validation rate reached a maximum of 85% for SNPs predicted with either PSNP ≥ 0.95 or ≥ 0.99. A total of 9,310 SNPs were detected by using PSNP ≥ 0.95 as a criterion. The SNPs were distributed among 3,590 contigs encompassing an array of broad functional categories, with an overall frequency of 1 SNP per 700 nucleotide sites. Experimental and statistical approaches were used to evaluate the proportion of paralogous SNPs, with estimates in the range of 8 to 12%. The 3,789 coding SNPs identified through coding region annotation and ORF prediction were distributed into 39% nonsynonymous and 61% synonymous substitutions. Overall, there were 0.9 SNP per 1,000 nonsynonymous sites and 5.2 SNPs per 1,000 synonymous sites, for a genome-wide nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of 0.17. CONCLUSION: We integrated the SNP data in the ForestTreeDB database along with functional annotations to provide a tool facilitating the choice of candidate genes for mapping purposes or association studies.
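
As a worked check of the arithmetic in this abstract, the short Python sketch below recomputes the Ka/Ks ratio from the two per-site rates it reports. The variable names are illustrative only; the calculation simply divides the nonsynonymous rate by the synonymous rate and is not the authors' own pipeline.

```python
# Rates as reported in the abstract, expressed per 1,000 sites.
nonsyn_snps_per_1000_sites = 0.9   # SNPs per 1,000 nonsynonymous sites
syn_snps_per_1000_sites = 5.2      # SNPs per 1,000 synonymous sites

# Convert to per-site rates (the factor of 1,000 cancels in the ratio,
# but the conversion keeps the units explicit).
ka = nonsyn_snps_per_1000_sites / 1000
ks = syn_snps_per_1000_sites / 1000

# Nonsynonymous to synonymous substitution rate ratio.
ka_ks = ka / ks
print(f"Ka/Ks = {ka_ks:.2f}")
```

A ratio well below 1, as here, is conventionally read as a predominance of purifying selection on coding sequences.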


Subject(s)
Expressed Sequence Tags , Picea/genetics , Polymorphism, Single Nucleotide/genetics , Algorithms , Base Sequence , Bayes Theorem , DNA, Complementary/chemistry , DNA, Complementary/genetics , Databases, Genetic , Gene Library , Genes, Plant/genetics , Genome, Plant/genetics , Genotype , Molecular Sequence Data , Sequence Analysis, DNA/methods , Software
18.
BMC Genomics ; 6: 144, 2005 Oct 19.
Article in English | MEDLINE | ID: mdl-16236172

ABSTRACT

BACKGROUND: The sequencing and analysis of ESTs is for now the only practical approach for large-scale gene discovery and annotation in conifers because their very large genomes are unlikely to be sequenced in the near future. Our objective was to produce extensive collections of ESTs and cDNA clones to support manufacture of cDNA microarrays and gene discovery in white spruce (Picea glauca [Moench] Voss). RESULTS: We produced 16 cDNA libraries from different tissues and a variety of treatments, and partially sequenced 50,000 cDNA clones. High quality 3' and 5' reads were assembled into 16,578 consensus sequences, 45% of which represented full length inserts. Consensus sequences derived from 5' and 3' reads of the same cDNA clone were linked to define 14,471 transcripts. A large proportion (84%) of the spruce sequences matched a pine sequence, but only 68% of the spruce transcripts had homologs in Arabidopsis or rice. Nearly all the sequences that matched the Populus trichocarpa genome (the only sequenced tree genome) also matched rice or Arabidopsis genomes. We used several sequence similarity search approaches for assignment of putative functions, including BLAST searches against general and specialized databases (transcription factors, cell wall related proteins), Gene Ontology term assignment and Hidden Markov Model searches against PFAM protein families and domains. In total, 70% of the spruce transcripts displayed matches to proteins of known or unknown function in the UniRef100 database (blastx e-value < 1e-10). We identified multigenic families that appeared larger in spruce than in the Arabidopsis or rice genomes. Detailed analysis of translationally controlled tumour proteins and S-adenosylmethionine synthetase families confirmed a twofold size difference. Sequences and annotations were organized in a dedicated database, SpruceDB. Several search tools were developed to mine the data either based on their occurrence in the cDNA libraries or on functional annotations. CONCLUSION: This report illustrates specific approaches for large-scale gene discovery and annotation in an organism that is very distantly related to any of the fully sequenced genomes. The ArboreaSet sequences and cDNA clones represent a valuable resource for investigations ranging from plant comparative genomics to applied conifer genetics.


Subject(s)
Expressed Sequence Tags , Gene Expression Regulation, Plant , Genes, Plant , Picea/genetics , Arabidopsis/genetics , Cell Wall/metabolism , Cluster Analysis , Contig Mapping , Cytoskeleton/metabolism , DNA, Complementary/metabolism , Databases as Topic , Databases, Genetic , Gene Library , Genome, Plant , Genomics , Multigene Family , Oryza/genetics , RNA, Messenger/metabolism , Software
19.
Plant Mol Biol ; 57(2): 203-24, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15821878

ABSTRACT

A computational analysis of pine transcripts was conducted to contribute to the functional annotation of conifer sequences. A statistical analysis of expressed sequence tags (ESTs) belonging to the 7732 contigs in the TIGR Pinus Gene Index (PGI1.0) identified 260 differentially represented gene sequences across six cDNA libraries from loblolly pine secondary xylem. Cluster analysis of this subset of contigs resulted in five groups representing genes preferentially represented in one of the xylem samples (compression wood, planings, root xylem, latewood) and one group containing mostly genes simultaneously present in compression and side wood libraries. To complement the sequence annotation, 27 cDNA clones representing selected transcripts were completely sequenced. Several genes were identified that could represent putative markers for xylem from different organs, at different stages of development. Several sequences encoding regulatory proteins were over-represented in root xylem as opposed to the other xylem samples. Some of them belonged to known families of plant transcription factors, but two genes were previously uncharacterized in plants. One transcript was homologous to the gene encoding the Smad4 interacting factor, a key co-activator in TGFbeta (transforming growth factor) signalling in animals. Thus, the digital analysis of pine ESTs highlighted a putative gene function of potentially broad interest but that has yet to be investigated in plants. More generally, this study showed that the application of numerical approaches to EST databases should be helpful in establishing priorities among genes to consider for targeted functional studies. Thus, we illustrated the potential of extracting information from conifer sequences already accessible through well-structured public databases.


Subject(s)
Expressed Sequence Tags , Pinus/genetics , Plant Structures/genetics , Amino Acid Sequence , Cluster Analysis , DNA, Complementary/chemistry , DNA, Complementary/genetics , Data Interpretation, Statistical , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Plant , Gene Library , Molecular Sequence Data , Reproducibility of Results , Reverse Transcriptase Polymerase Chain Reaction/methods , Sequence Alignment , Sequence Analysis, DNA , Sequence Homology, Amino Acid