Results 1 - 19 of 19
1.
Plant J ; 111(5): 1469-1485, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35789009

ABSTRACT

Spruces (Picea spp.) are coniferous trees widespread in boreal and mountainous forests of the northern hemisphere, with large economic significance and enormous contributions to global carbon sequestration. Spruces harbor very large genomes with high repetitiveness, hampering their comparative analysis. Here, we present and compare the genomes of four different North American spruces: the genome assemblies for Engelmann spruce (Picea engelmannii) and Sitka spruce (Picea sitchensis) together with improved and more contiguous genome assemblies for white spruce (Picea glauca) and for a naturally occurring introgress of these three species known as interior spruce (P. engelmannii × glauca × sitchensis). The genomes were structurally similar, and a large part of scaffolds could be anchored to a genetic map. The composition of the interior spruce genome indicated asymmetric contributions from the three ancestral genomes. Phylogenetic analysis of the nuclear and organelle genomes revealed a topology indicative of ancient reticulation. Different patterns of expansion of gene families among genomes were observed and related to presumed diversifying ecological adaptations. We identified rapidly evolving genes that harbored high rates of non-synonymous polymorphisms relative to synonymous ones, indicative of positive selection and its hitchhiking effects. These gene sets were mostly distinct between the genomes of ecologically contrasted species, and signatures of convergent balancing selection were detected. Stress and stimulus response was identified as the most frequent function assigned to expanding gene families and rapidly evolving genes. These two aspects of genomic evolution were complementary in their contribution to divergent evolution of presumed adaptive nature. These more contiguous spruce giga-genome sequences should strengthen our understanding of conifer genome structure and evolution, as their comparison offers clues into the genetic basis of adaptation and ecology of conifers at the genomic level. They will also provide tools to better monitor natural genetic diversity and improve the management of conifer forests.
The genomes of four closely related North American spruces indicate that their high similarity at the morphological level is paralleled by the high conservation of their physical genome structure. Yet, the evidence of divergent evolution is apparent in their rapidly evolving genomes, supported by differential expansion of key gene families and large sets of genes under positive selection, largely in relation to stimulus and environmental stress response.


Subject(s)
Picea , Tracheophyta , Expressed Sequence Tags , Genome, Plant/genetics , Multigene Family/genetics , Phylogeny , Picea/genetics , Tracheophyta/genetics
2.
Plant J ; 90(1): 189-203, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28090692

ABSTRACT

Over the last decade, extensive genetic and genomic resources have been developed for the conifer white spruce (Picea glauca, Pinaceae), which has one of the largest plant genomes (20 Gbp). Draft genome sequences of white spruce and other conifers have recently been produced, but dense genetic maps are needed to comprehend genome macrostructure, delineate regions involved in quantitative traits, complement functional genomic investigations, and assist the assembly of fragmented genomic sequences. A greatly expanded P. glauca composite linkage map was generated from a set of 1976 full-sib progeny, with the positioning of 8793 expressed genes. Regions with significantly low or high gene density were identified. Gene family members tended to be mapped on the same chromosomes, with tandemly arrayed genes significantly biased towards specific functional classes. The map was integrated with transcriptome data surveyed across eight tissues. In total, 69 clusters of co-expressed and co-localising genes were identified. A high level of synteny was found with pine genetic maps, which should facilitate the transfer of structural information in the Pinaceae. Although the current white spruce genome sequence remains highly fragmented, dozens of scaffolds encompassing more than one mapped gene were identified. From these, the relationship between genetic and physical distances was examined and the genome-wide recombination rate was found to be much smaller than most estimates reported for angiosperm genomes. This gene linkage map shall assist the large-scale assembly of the next-generation white spruce genome sequence and provide a reference resource for the conifer genomics community.


Subject(s)
Genome, Plant/genetics , Picea/genetics , Chromosome Mapping/methods , Computational Biology/methods , DNA, Plant/genetics , Genomics/methods , Polymorphism, Single Nucleotide/genetics , Synteny
3.
BMC Genomics ; 19(1): 942, 2018 Dec 17.
Article in English | MEDLINE | ID: mdl-30558528

ABSTRACT

BACKGROUND: Norway spruce [Picea abies (L.) Karst.] is ecologically and economically one of the most important conifers worldwide. Our main goal was to develop a large catalog of annotated high-confidence gene SNPs that should sustain the development of genomic tools for the conservation of natural and domesticated genetic diversity resources, and hasten tree breeding efforts in this species. RESULTS: Targeted sequencing was achieved by capturing the P. abies exome with probes previously designed from the sequenced transcriptome of white spruce [Picea glauca (Moench) Voss]. Capture efficiency was high (74.5%) given a high level of exome conservation between the two species. Using stringent criteria, we delimited a set of 61,771 high-confidence SNPs across 13,543 genes. To validate SNPs, a high-throughput genotyping array was developed for a subset of 5571 predicted SNPs representing as many different gene loci, and was used to genotype over 1000 trees. The estimated true positive rate of the resource was 84.2%, which was comparable with the genotyping success rate obtained for P. abies control SNPs recycled from previous genotyping efforts. We also analyzed SNP abundance across various gene functional categories. Several GO terms and gene families involved in stress response were found over-represented in highly polymorphic genes. CONCLUSION: The annotated high-confidence SNP catalog developed herein represents a valuable genomic resource, being representative of over 13 K genes distributed across the P. abies genome. This resource should serve a variety of population genomics and breeding applications in Norway spruce.


Subject(s)
Exome/genetics , Picea/genetics , Polymorphism, Single Nucleotide , Contig Mapping , DNA, Plant/isolation & purification , DNA, Plant/metabolism , Genotype , Molecular Sequence Annotation , Plant Leaves/genetics , Sequence Analysis, DNA
4.
BMC Biol ; 10: 84, 2012 Oct 26.
Article in English | MEDLINE | ID: mdl-23102090

ABSTRACT

BACKGROUND: Seed plants are composed of angiosperms and gymnosperms, which diverged from each other around 300 million years ago. While much light has been shed on the mechanisms and rate of genome evolution in flowering plants, such knowledge remains conspicuously meagre for the gymnosperms. Conifers are key representatives of gymnosperms and the sheer size of their genomes represents a significant challenge for characterization, sequencing and assembly. RESULTS: To gain insight into the macro-organisation and long-term evolution of the conifer genome, we developed a genetic map involving 1,801 spruce genes. We designed a statistical approach based on kernel density estimation to analyse gene density and identified seven gene-rich isochors. Groups of co-localizing genes were also found that were transcriptionally co-regulated, indicative of functional clusters. Phylogenetic analyses of 157 gene families for which at least two duplicates were mapped on the spruce genome indicated that ancient gene duplicates shared by angiosperms and gymnosperms outnumbered conifer-specific duplicates by a ratio of eight to one. Ancient duplicates were much more translocated within and among spruce chromosomes than conifer-specific duplicates, which were mostly organised in tandem arrays. Both high synteny and collinearity were also observed between the genomes of spruce and pine, two conifers that diverged more than 100 million years ago. CONCLUSIONS: Taken together, these results indicate that much genomic evolution has occurred in the seed plant lineage before the split between gymnosperms and angiosperms, and that the pace of evolution of the genome macro-structure has been much slower in the gymnosperm lineage leading to extant conifers than that seen for the same period of time in flowering plants. This trend is largely congruent with the contrasted rates of diversification and morphological evolution observed between these two groups of seed plants.


Subject(s)
Chromosome Mapping , DNA Shuffling , Evolution, Molecular , Genome, Plant/genetics , Phylogeny , Picea/genetics , Chromosomes, Plant/genetics , Extinction, Biological , Gene Duplication/genetics , Gene Expression Regulation, Plant , Genes, Plant/genetics , Genetic Linkage , Methyltransferases/genetics , Molecular Sequence Annotation , Multigene Family/genetics , Picea/enzymology , Pinus/genetics
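
The abstract of entry 4 mentions a statistical approach based on kernel density estimation for analysing gene density along the genetic map. The abstract does not describe the authors' actual procedure, so the following is only a generic sketch of a Gaussian kernel density estimate over marker positions on one linkage group. The function name, the positions (in cM), the grid spacing, and the bandwidth are all hypothetical values chosen for illustration.

```python
import math


def gaussian_kde(positions, grid, bandwidth):
    """Gaussian kernel density estimate of map positions at each grid point.

    positions: gene positions on one linkage group (hypothetical, in cM)
    grid:      points at which the density is evaluated
    bandwidth: kernel standard deviation (an assumed smoothing value)
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not positions:
        raise ValueError("positions must not be empty")
    norm = 1.0 / (len(positions) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions)
        for x in grid
    ]


# Hypothetical data: five genes clustered near 2-5 cM, three isolated genes.
positions = [2.0, 3.5, 4.0, 4.2, 5.1, 40.0, 75.0, 110.0]
grid = [float(x) for x in range(0, 121, 5)]
density = gaussian_kde(positions, grid, bandwidth=5.0)

# The grid point with the highest estimated gene density.
peak = grid[density.index(max(density))]
print(peak)
```

Deciding whether a dense region is statistically gene-rich would additionally require a null model, for example resampling positions uniformly along the linkage group; that step is not shown here.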
5.
G3 (Bethesda) ; 14(1), 2023 Dec 29.
Article in English | MEDLINE | ID: mdl-37875130

ABSTRACT

Black spruce (Picea mariana [Mill.] B.S.P.) is a dominant conifer species in the North American boreal forest that plays important ecological and economic roles. Here, we present the first genome assembly of P. mariana with a reconstructed genome size of 18.3 Gbp and NG50 scaffold length of 36.0 kbp. A total of 66,332 protein-coding sequences were predicted in silico and annotated based on sequence homology. We analyzed the evolutionary relationships between P. mariana and 5 other spruces for which complete nuclear and organelle genome sequences were available. The phylogenetic tree estimated from mitochondrial genome sequences agrees with biogeography; specifically, P. mariana was strongly supported as a sister lineage to P. glauca and 3 other taxa found in western North America, followed by the European Picea abies. We obtained mixed topologies with weaker statistical support in phylogenetic trees estimated from nuclear and chloroplast genome sequences, indicative of ancient reticulate evolution affecting these 2 genomes. Clustering of protein-coding sequences from the 6 Picea taxa and 2 Pinus species resulted in 34,776 orthogroups, 560 of which appeared to be specific to P. mariana. Analysis of these specific orthogroups and dN/dS analysis of positive selection signatures for 497 single-copy orthogroups identified gene functions mostly related to plant development and stress response. The P. mariana genome assembly and annotation provides a valuable resource for forest genetics research and applications in this broadly distributed species, especially in relation to climate adaptation.


Subject(s)
Picea , Phylogeny , Picea/genetics , North America
6.
Plant Mol Biol ; 80(6): 555-69, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22960864

ABSTRACT

Several new initiatives have been launched recently to sequence conifer genomes including pines, spruces and Douglas-fir. Owing to the very large genome sizes ranging from 18 to 35 gigabases, sequencing even a single conifer genome had been considered unattainable until the recent throughput increases and cost reductions afforded by next-generation sequencers. The purpose of this review is to describe the context for these new initiatives. A knowledge foundation has been acquired in several conifers of commercial and ecological interest through large-scale cDNA analyses, construction of genetic maps and gene mapping studies aiming to link phenotype and genotype. Exploratory sequencing in pines and spruces has pointed out some of the unique properties of these giga-genomes and suggested strategies that may be needed to extract value from their sequencing. The hope is that recent and pending developments in sequencing technology will contribute to rapidly filling the knowledge vacuum surrounding their structure, contents and evolution. Researchers are also making plans to use comparative analyses that will help to turn the data into a valuable resource for enhancing and protecting the world's conifer forests.


Subject(s)
Genome, Plant , Tracheophyta/genetics , Breeding , Chromosome Mapping , Chromosomes, Artificial, Bacterial/genetics , Gene Expression Profiling , Genetic Association Studies , Genomics/methods , Genomics/trends , Multigene Family , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , RNA, Plant/genetics , RNA, Small Untranslated/genetics , Transcriptome
7.
New Phytol ; 188(3): 774-86, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20955415

ABSTRACT

• The eucalyptus R2R3 transcription factor EgMYB1 contains an active repressor motif in the regulatory domain of the predicted protein. It is preferentially expressed in differentiating xylem and is capable of repressing the transcription of two key lignin genes in vivo.
• In order to investigate in planta the role of this putative transcriptional repressor of the lignin biosynthetic pathway, we overexpressed the EgMYB1 gene in Arabidopsis and poplar.
• Expression of EgMYB1 produced similar phenotypes in both species, with stronger effects in transgenic Arabidopsis plants than in poplar. Vascular development was altered in overexpressors showing fewer lignified fibres (in phloem and interfascicular zones in poplar and Arabidopsis, respectively) and reduced secondary wall thickening. Klason lignin content was moderately but significantly reduced in both species. Decreased transcript accumulation was observed for genes involved in the biosynthesis of lignins, cellulose and xylan, the three main polymers of secondary cell walls. Transcriptomic profiles of transgenic poplars were reminiscent of those reported when lignin biosynthetic genes are disrupted.
• Together, these results strongly suggest that EgMYB1 is a repressor of secondary wall formation and provide new opportunities to dissect the transcriptional regulation of secondary wall biosynthesis.


Subject(s)
Arabidopsis/metabolism , Cell Wall/metabolism , Eucalyptus/metabolism , Gene Expression Regulation, Plant , Lignin/biosynthesis , Populus/metabolism , Transcription Factors/metabolism , Arabidopsis/genetics , Cellulose/biosynthesis , Cellulose/genetics , Eucalyptus/genetics , Gene Expression , Gene Expression Profiling , Genes, Plant , Lignin/genetics , Phenotype , Plant Proteins/genetics , Plant Proteins/metabolism , Plant Vascular Bundle/cytology , Plant Vascular Bundle/metabolism , Plants, Genetically Modified , Populus/genetics , Transcription Factors/genetics , Xylans/biosynthesis , Xylans/genetics
8.
Tree Physiol ; 30(10): 1273-89, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20739427

ABSTRACT

Previous studies indicated that high nitrogen fertilization may impact secondary xylem development and alter fibre anatomy and composition. The resulting wood shares some resemblance with tension wood, which has much thicker cell walls than normal wood due to the deposition of an additional layer known as the G-layer. This report compares the short-term effects of high nitrogen fertilization and tree leaning to induce tension wood, either alone or in combination, upon wood formation in young trees of Populus trichocarpa (Torr. & Gray) × P. deltoides Bartr. ex Marsh. Fibre anatomy, chemical composition and transcript profiles were examined in newly formed secondary xylem. Each of the treatments resulted in thicker cell walls relative to the controls. High nitrogen and tree leaning had overlapping effects on chemical composition based on Fourier transform infrared analysis, specifically indicating that secondary cell wall composition was shifted in favour of cellulose and hemicelluloses relative to lignin content. In contrast, the high-nitrogen trees had shorter fibres, whilst the leaning trees had longer fibres than the controls. Microarray transcript profiling carried out after 28 days of treatment identified 180 transcripts that accumulated differentially in one or more treatments. Only 10% of differentially expressed transcripts were affected in all treatments relative to the controls. Several of the affected transcripts were related to carbohydrate metabolism, secondary cell wall formation, nitrogen metabolism and osmotic stress. RT-qPCR analyses at 1, 7 and 28 days showed that several transcripts followed very different accumulation profiles in terms of rate and level of accumulation, depending on the treatment. Our findings suggest that high nitrogen fertilization and tension wood induction elicit largely distinct molecular pathways with partial overlap. When combined, the two types of environmental cue yielded additive effects.


Subject(s)
Plant Stems/physiology , Populus/growth & development , Wood/growth & development , Light , Nitrogen/metabolism , Polysaccharides/analysis , Populus/genetics , Populus/physiology , Spectroscopy, Fourier Transform Infrared , Stress, Mechanical , Wood/physiology , Xylem/physiology
9.
Nucleic Acids Res ; 35(Database issue): D888-94, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17130142

ABSTRACT

ForestTreeDB is intended as a resource that centralizes large-scale expressed sequence tag (EST) sequencing results from several tree species (http://foresttree.org/ftdb). It currently encompasses 344,878 quality sequences from 68 libraries, from diverse organs of conifer and hybrid poplar trees. It utilizes the Nimbus data model to provide a hosting system for multiple projects, and uses object-relational mapping APIs in Java and Perl for data access within an Oracle database designed to be scalable, maintainable and extendable. Transcriptome builds or unigene sets occupy the focal point of the system. Several of the five current species-specific unigenes were used to design microarrays and SNP resources. The ForestTreeDB web application provides the means for multiple combination database queries. It presents the user with a list of discrete queries to retrieve and download large EST datasets or sequences from precompiled unigene assemblies. Functional annotation assignment is not trivial in conifers, which are distantly related to angiosperm model plants. Optimal annotations are achieved through database queries that integrate results from several procedures based on open-source tools. ForestTreeDB aims to facilitate sequence mining of coherent annotations in multiple species to support comparative genomic approaches. We plan to continuously enrich ForestTreeDB with other resources through collaborations with other genomic projects.


Subject(s)
Databases, Nucleic Acid , Expressed Sequence Tags/chemistry , Populus/genetics , Tracheophyta/genetics , Gene Expression Profiling , Genomics , Internet , Polymorphism, Single Nucleotide , Transcription, Genetic , Trees/genetics , User-Computer Interface
10.
BMC Genomics ; 9: 21, 2008 Jan 18.
Article in English | MEDLINE | ID: mdl-18205909

ABSTRACT

BACKGROUND: To explore the potential value of high-throughput genotyping assays in the analysis of large and complex genomes, we designed two highly multiplexed Illumina bead arrays using the GoldenGate SNP assay for gene mapping in white spruce (Picea glauca [Moench] Voss) and black spruce (Picea mariana [Mill.] B.S.P.). RESULTS: Each array included 768 SNPs, identified by resequencing genomic DNA from parents of each mapping population. For white spruce and black spruce, respectively, 69.2% and 77.1% of genotyped SNPs had valid GoldenGate assay scores and segregated in the mapping populations. For each of these successful SNPs, on average, valid genotyping scores were obtained for over 99% of progeny. SNP data were integrated with pre-existing AFLP, ESTP, and SSR markers to construct two individual linkage maps and a composite map for white spruce and black spruce genomes. The white spruce composite map contained 821 markers including 348 gene loci. Also, 835 markers including 328 gene loci were positioned on the black spruce composite map. In total, 215 anchor markers (mostly gene markers) were shared between the two species. Considering lineage divergence at least 10 Myr ago between the two spruces, interspecific comparison of homoeologous linkage groups revealed remarkable synteny and marker colinearity. CONCLUSION: The design of customized highly multiplexed Illumina SNP arrays appears to be an efficient procedure to enhance the mapping of expressed genes and make linkage maps more informative and powerful in such species with poorly known genomes. This genotyping approach will open new avenues for co-localizing candidate genes and QTLs, partial genome sequencing, and comparative mapping across conifers.


Subject(s)
Chromosome Mapping/methods , Genome, Plant , Oligonucleotide Array Sequence Analysis , Picea/genetics , Polymorphism, Single Nucleotide , Chromosomes, Plant , Cluster Analysis , Computational Biology/methods , Crosses, Genetic , DNA Primers/chemistry , DNA, Plant/genetics , DNA, Plant/isolation & purification , Expressed Sequence Tags , Genetic Markers , Genotype , Nucleic Acid Amplification Techniques , Polymerase Chain Reaction , Reproducibility of Results , Sequence Analysis, DNA , Synteny , Temperature , Time Factors
11.
New Phytol ; 180(4): 766-86, 2008.
Article in English | MEDLINE | ID: mdl-18811621

ABSTRACT

One approach for investigating the molecular basis of wood formation is to integrate microarray profiling data sets and sequence analyses, comparing tree species with model plants such as Arabidopsis. Conifers may be included in comparative studies thanks to large-scale expressed sequence tag (EST) analyses, which enable the development of cDNA microarrays with very significant genome coverage. A microarray of 10,400 low-redundancy sequences was designed starting from white spruce (Picea glauca (Moench.) Voss) cDNAs. Computational procedures that were developed to ensure broad transcriptome coverage and efficient PCR amplification were used to select cDNA clones, which were re-sequenced in the microarray manufacture process. White spruce transcript profiling experiments that compared secondary xylem to phloem and needles identified 360 xylem-preferential gene sequences. The functional annotations of all differentially expressed sequences were highly consistent with the results of similar analyses carried out in angiosperm trees and herbaceous plants. Computational analyses comparing the spruce microarray sequences and core xylem gene sets from Arabidopsis identified 31 transcripts that were highly conserved in angiosperms and gymnosperms, in terms of both sequence and xylem expression. Several other spruce sequences have not previously been linked to xylem differentiation (including genes encoding TUBBY-like domain proteins (TLPs) and a gibberellin insensitive (gai) gene sequence) or were shown to encode proteins of unknown function encompassing diverse conserved domains of unknown function.


Subject(s)
Gene Expression Profiling , Genes, Plant , Picea/genetics , Xylem/genetics , Arabidopsis/genetics , Base Sequence , Expressed Sequence Tags , Gene Expression Regulation, Plant , Genome, Plant , Microarray Analysis/methods , Multigene Family , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , Phloem/genetics , Plant Leaves/genetics , Reverse Transcriptase Polymerase Chain Reaction/methods , Sequence Analysis , Transcription, Genetic , Trees/genetics
12.
J Exp Bot ; 59(14): 3925-39, 2008.
Article in English | MEDLINE | ID: mdl-18805909

ABSTRACT

The involvement of two R2R3-MYB genes from Pinus taeda L., PtMYB1 and PtMYB8, in phenylpropanoid metabolism and secondary cell wall biogenesis was investigated in planta. These pine MYBs were constitutively overexpressed (OE) in Picea glauca (Moench) Voss, used as a heterologous conifer expression system. Morphological, histological, chemical (lignin and soluble phenols), and transcriptional analyses, i.e. microarray and reverse transcription quantitative PCR (RT-qPCR) were used for extensive phenotyping of MYB-overexpressing spruce plantlets. Upon germination of somatic embryos, root growth was reduced in both transgenics. Enhanced lignin deposition was also a common feature but ectopic secondary cell wall deposition was more strongly associated with PtMYB8-OE. Microarray and RT-qPCR data showed that overexpression of each MYB led to an overlapping up-regulation of many genes encoding phenylpropanoid enzymes involved in lignin monomer synthesis, while misregulation of several cell wall-related genes and other MYB transcription factors was specifically associated with PtMYB8-OE. Together, the results suggest that MYB1 and MYB8 may be part of a conserved transcriptional network involved in secondary cell wall deposition in conifers.


Subject(s)
Cell Wall/metabolism , Picea/metabolism , Pinus taeda/genetics , Plant Proteins/metabolism , Transcription Factors/metabolism , Cell Wall/genetics , Gene Expression , Lignin/metabolism , Molecular Sequence Data , Phenols/metabolism , Phloem/metabolism , Picea/genetics , Plant Proteins/genetics , Transcription Factors/genetics , Transcription, Genetic
13.
BMC Genomics ; 7: 174, 2006 Jul 06.
Article in English | MEDLINE | ID: mdl-16824208

ABSTRACT

BACKGROUND: High-throughput genotyping technologies represent a highly efficient way to accelerate genetic mapping and enable association studies. As a first step toward this goal, we aimed to develop a resource of candidate Single Nucleotide Polymorphisms (SNP) in white spruce (Picea glauca [Moench] Voss), a softwood tree of major economic importance. RESULTS: A white spruce SNP resource encompassing 12,264 SNPs was constructed from a set of 6,459 contigs derived from Expressed Sequence Tags (EST) by using the Bayesian statistical software PolyBayes. Several parameters influencing the SNP prediction were analysed including the a priori expected polymorphism, the probability score (PSNP), and the contig depth and length. SNP detection in 3' and 5' reads from the same clones revealed a level of inconsistency between overlapping sequences as low as 1%. A subset of 245 predicted SNPs were verified through the independent resequencing of genomic DNA of a genotype also used to prepare cDNA libraries. The validation rate reached a maximum of 85% for SNPs predicted with either PSNP ≥ 0.95 or PSNP ≥ 0.99. A total of 9,310 SNPs were detected by using PSNP ≥ 0.95 as a criterion. The SNPs were distributed among 3,590 contigs encompassing an array of broad functional categories, with an overall frequency of 1 SNP per 700 nucleotide sites. Experimental and statistical approaches were used to evaluate the proportion of paralogous SNPs, with estimates in the range of 8 to 12%. The 3,789 coding SNPs identified through coding region annotation and ORF prediction were distributed into 39% nonsynonymous and 61% synonymous substitutions. Overall, there were 0.9 SNPs per 1,000 nonsynonymous sites and 5.2 SNPs per 1,000 synonymous sites, for a genome-wide nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of 0.17. CONCLUSION: We integrated the SNP data in the ForestTreeDB database along with functional annotations to provide a tool facilitating the choice of candidate genes for mapping purposes or association studies.


Subject(s)
Expressed Sequence Tags , Picea/genetics , Polymorphism, Single Nucleotide/genetics , Algorithms , Base Sequence , Bayes Theorem , DNA, Complementary/chemistry , DNA, Complementary/genetics , Databases, Genetic , Gene Library , Genes, Plant/genetics , Genome, Plant/genetics , Genotype , Molecular Sequence Data , Sequence Analysis, DNA/methods , Software
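
The Ka/Ks figure reported in entry 13 follows directly from the two per-site SNP densities given in its abstract. The short sketch below only reproduces that arithmetic; the function name is hypothetical, and dividing the two densities is a simplification of how such ratios are estimated in practice, not a description of the authors' pipeline.

```python
def ka_ks_ratio(nonsyn_per_site, syn_per_site):
    """Ratio of nonsynonymous to synonymous substitution rates.

    Both arguments are rates per site; the synonymous rate must be positive.
    """
    if syn_per_site <= 0:
        raise ValueError("synonymous rate must be positive")
    return nonsyn_per_site / syn_per_site


# Values quoted in the abstract: 0.9 SNPs per 1,000 nonsynonymous sites
# and 5.2 SNPs per 1,000 synonymous sites.
ka = 0.9 / 1000
ks = 5.2 / 1000

ratio = ka_ks_ratio(ka, ks)
print(round(ratio, 2))  # 0.17, matching the reported Ka/Ks
```

A ratio well below 1, as here, is conventionally read as predominant purifying selection on coding sequences.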
14.
Mol Ecol Resour ; 16(2): 588-98, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26391535

ABSTRACT

Picea mariana is a widely distributed boreal conifer across Canada and the subject of advanced breeding programmes for which population genomics and genomic selection approaches are being developed. Targeted sequencing was achieved after capturing the P. mariana exome with probes designed from the sequenced transcriptome of Picea glauca, a distant relative. A high capture efficiency of 75.9% was reached although spruce has a complex and large genome including gene sequences interspersed by some long introns. The results confirmed the relevance of using probes from congeneric species to successfully perform interspecific exome capture in the genus Picea. A bioinformatics pipeline was developed including stringent criteria that helped detect a set of 97,075 highly reliable in silico SNPs. These SNPs were distributed across 14,909 genes. Part of an Infinium iSelect array was used to estimate the rate of true positives by validating 4267 of the predicted in silico SNPs by genotyping trees from P. mariana populations. The true positive rate was 96.2% for in silico SNPs, compared to a genotyping success rate of 96.7% for a set of 1115 P. mariana control SNPs recycled from previous genotyping arrays. These results indicate the high success rate of the genotyping array and the relevance of the selection criteria used to delineate the new P. mariana in silico SNP resource. Furthermore, in silico SNPs were generally of medium to high frequency in natural populations, thus providing high informative value for future population genomics applications.


Subject(s)
Exome , Genetic Variation , Genotyping Techniques/methods , Picea/classification , Picea/genetics , Polymorphism, Single Nucleotide , Canada , Sequence Analysis, DNA
15.
BMC Genomics ; 6: 144, 2005 Oct 19.
Article in English | MEDLINE | ID: mdl-16236172

ABSTRACT

BACKGROUND: The sequencing and analysis of ESTs is for now the only practical approach for large-scale gene discovery and annotation in conifers because their very large genomes are unlikely to be sequenced in the near future. Our objective was to produce extensive collections of ESTs and cDNA clones to support manufacture of cDNA microarrays and gene discovery in white spruce (Picea glauca [Moench] Voss). RESULTS: We produced 16 cDNA libraries from different tissues and a variety of treatments, and partially sequenced 50,000 cDNA clones. High quality 3' and 5' reads were assembled into 16,578 consensus sequences, 45% of which represented full length inserts. Consensus sequences derived from 5' and 3' reads of the same cDNA clone were linked to define 14,471 transcripts. A large proportion (84%) of the spruce sequences matched a pine sequence, but only 68% of the spruce transcripts had homologs in Arabidopsis or rice. Nearly all the sequences that matched the Populus trichocarpa genome (the only sequenced tree genome) also matched rice or Arabidopsis genomes. We used several sequence similarity search approaches for assignment of putative functions, including BLAST searches against general and specialized databases (transcription factors, cell wall related proteins), Gene Ontology term assignation and Hidden Markov Model searches against PFAM protein families and domains. In total, 70% of the spruce transcripts displayed matches to proteins of known or unknown function in the Uniref100 database (blastx e-value < 1e-10). We identified multigenic families that appeared larger in spruce than in the Arabidopsis or rice genomes. Detailed analysis of translationally controlled tumour proteins and S-adenosylmethionine synthetase families confirmed a twofold size difference. Sequences and annotations were organized in a dedicated database, SpruceDB. Several search tools were developed to mine the data either based on their occurrence in the cDNA libraries or on functional annotations. CONCLUSION: This report illustrates specific approaches for large-scale gene discovery and annotation in an organism that is very distantly related to any of the fully sequenced genomes. The ArboreaSet sequences and cDNA clones represent a valuable resource for investigations ranging from plant comparative genomics to applied conifer genetics.


Subject(s)
Expressed Sequence Tags , Gene Expression Regulation, Plant , Genes, Plant , Picea/genetics , Arabidopsis/genetics , Cell Wall/metabolism , Cluster Analysis , Contig Mapping , Cytoskeleton/metabolism , DNA, Complementary/metabolism , Databases as Topic , Databases, Genetic , Gene Library , Genome, Plant , Genomics , Multigene Family , Oryza/genetics , RNA, Messenger/metabolism , Software
16.
Genome Biol Evol ; 7(12): 3269-85, 2015 Nov 11.
Article in English | MEDLINE | ID: mdl-26560341

ABSTRACT

Understanding the genetic basis of adaptation to climate is of paramount importance for preserving and managing genetic diversity in plants in a context of climate change. Yet, this objective has been addressed mainly in short-lived model species. Thus, expanding knowledge to nonmodel species with contrasting life histories, such as forest trees, appears necessary. To uncover the genetic basis of adaptation to climate in the widely distributed boreal conifer white spruce (Picea glauca), an environmental association study was conducted using 11,085 single nucleotide polymorphisms representing 7,819 genes, that is, approximately a quarter of the transcriptome. Linear and quadratic regressions controlling for isolation-by-distance, and the Random Forest algorithm, identified several dozen genes putatively under selection, among which 43 showed strongest signals along temperature and precipitation gradients. Most of them were related to temperature. Small to moderate shifts in allele frequencies were observed. Genes involved encompassed a wide variety of functions and processes, some of them being likely important for plant survival under biotic and abiotic environmental stresses according to expression data. Literature mining and sequence comparison also highlighted conserved sequences and functions with angiosperm homologs. Our results are consistent with theoretical predictions that local adaptation involves genes with small frequency shifts when selection is recent and gene flow among populations is high. Accordingly, genetic adaptation to climate in P. glauca appears to be complex, involving many independent and interacting gene functions, biochemical pathways, and processes. From an applied perspective, these results shall lead to specific functional/association studies in conifers and to the development of markers useful for the conservation of genetic resources.


Subject(s)
Acclimatization/genetics , Gene Frequency , Genes, Plant , Picea/genetics , Models, Genetic
17.
Genome Biol Evol ; 5(10): 1910-25, 2013.
Article in English | MEDLINE | ID: mdl-24065735

ABSTRACT

Gene families differ in composition, expression, and chromosomal organization between conifers and angiosperms, but little is known regarding nucleotide polymorphism. Using various sequencing strategies, an atlas of 212k high-confidence single nucleotide polymorphisms (SNPs) with a validation rate of more than 92% was developed for the conifer white spruce (Picea glauca). Nonsynonymous and synonymous SNPs were annotated over the corresponding 13,498 white spruce genes representative of 2,457 known gene families. Patterns of nucleotide polymorphisms were analyzed by estimating the ratio of nonsynonymous to synonymous numbers of substitutions per site (A/S). A general excess of synonymous SNPs was expected and observed. However, the analysis from several perspectives made it possible to identify groups of genes harboring an excess of nonsynonymous SNPs, thus potentially under positive selection. Four known gene families harbored such an excess: dehydrins, ankyrin-repeats, AP2/DREB, and leucine-rich repeat. Conifer-specific sequences were also generally associated with the highest A/S ratios. A/S values were also distributed asymmetrically across genes specifically expressed in megagametophytes, roots, or in both, harboring on average an excess of nonsynonymous SNPs. These patterns confirm that the breadth of gene expression is a contributing factor to the evolution of nucleotide polymorphism. The A/S ratios of Medicago truncatula genes were also analyzed: several gene families shared between P. glauca and M. truncatula data sets had similar excess of synonymous or nonsynonymous SNPs. However, a number of families with high A/S ratios were found specific to P. glauca, suggesting cases of divergent evolution at the functional level.


Subject(s)
Genome, Plant , Medicago truncatula/genetics , Picea/genetics , Polymorphism, Single Nucleotide/genetics , Base Sequence , Expressed Sequence Tags , Genotype , Medicago truncatula/classification , Multigene Family , Open Reading Frames/genetics , Picea/classification , Sequence Analysis, DNA
18.
Mol Ecol Resour ; 13(2): 324-36, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23351128

ABSTRACT

High-density SNP genotyping arrays can be designed for any species given sufficient sequence information of high quality. Two high-density SNP arrays relying on the Infinium iSelect technology (Illumina) were designed for use in the conifer white spruce (Picea glauca). One array contained 7338 segregating SNPs representative of 2814 genes of various molecular functional classes for main uses in genetic association and population genetics studies. The other one contained 9559 segregating SNPs representative of 9543 genes for main uses in population genetics, linkage mapping of the genome and genomic prediction. The SNPs assayed were discovered from various sources of gene resequencing data. SNPs predicted from high-quality sequences derived from genomic DNA reached a genotyping success rate of 64.7%. Nonsingleton in silico SNPs (i.e. a sequence polymorphism present in at least two reads) predicted from expressed sequence tags obtained with the Roche 454 technology and Illumina GAII analyser resulted in a similar genotyping success rate of 71.6% when the deepest alignment was used and the most favourable SNP probe per gene was selected. A variable proportion of these SNPs was shared by other nordic and subtropical spruce species from North America and Europe. The number of shared SNPs was inversely proportional to phylogenetic divergence and standing genetic variation in the recipient species, but positively related to allele frequency in P. glauca natural populations. These validated SNP resources should open up new avenues for population genetics and comparative genetic mapping at a genomic scale in spruce species.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods , Picea/genetics , Polymorphism, Single Nucleotide , Genomics , Genotype , Phylogeny , Picea/classification
19.
Plant Mol Biol ; 57(2): 203-24, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15821878

ABSTRACT

A computational analysis of pine transcripts was conducted to contribute to the functional annotation of conifer sequences. A statistical analysis of expressed sequence tags (ESTs) belonging to the 7732 contigs in the TIGR Pinus Gene Index (PGI1.0) identified 260 differentially represented gene sequences across six cDNA libraries from loblolly pine secondary xylem. Cluster analysis of this subset of contigs resulted in five groups representing genes preferentially represented in one of the xylem samples (compression wood, planings, root xylem, latewood) and one group containing mostly genes simultaneously present in compression and side wood libraries. To complement the sequence annotation, 27 cDNA clones representing selected transcripts were completely sequenced. Several genes were identified that could represent putative markers for xylem from different organs, at different stages of development. Several sequences encoding regulatory proteins were over-represented in root xylem as opposed to the other xylem samples. Some of them belonged to known families of plant transcription factors, but two genes were previously uncharacterized in plants. One transcript was homologous to the gene encoding the Smad4 interacting factor, a key co-activator in TGFbeta (transforming growth factor) signalling in animals. Thus, the digital analysis of pine ESTs highlighted a putative gene function of potentially broad interest but that has yet to be investigated in plants. More generally, this study showed that the application of numerical approaches to EST databases should be helpful in establishing priorities among genes to consider for targeted functional studies. Thus, we illustrated the potential of extracting information from conifer sequences already accessible through well-structured public databases.


Subject(s)
Expressed Sequence Tags , Pinus/genetics , Plant Structures/genetics , Amino Acid Sequence , Cluster Analysis , DNA, Complementary/chemistry , DNA, Complementary/genetics , Data Interpretation, Statistical , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Plant , Gene Library , Molecular Sequence Data , Reproducibility of Results , Reverse Transcriptase Polymerase Chain Reaction/methods , Sequence Alignment , Sequence Analysis, DNA , Sequence Homology, Amino Acid