ABSTRACT
Fungal plant diseases are a major threat to food security worldwide. Current efforts to identify and list loci involved in different biological processes are more complicated than originally thought, even when complete genome assemblies are available. Despite numerous experimental and computational efforts to characterize gene functions in plants, ~40% of protein-coding genes in the model plant Arabidopsis thaliana L. are still not categorized in the Gene Ontology (GO) Biological Process (BP) annotation. In non-model organisms, such as sunflower (Helianthus annuus L.), BP term annotations are far scarcer (~22%). In the current study, we performed gene co-expression network analysis using eight terabytes of public transcriptome datasets and expression-based functional prediction to categorize and identify loci involved in the response to fungal pathogens. We constructed a reference gene network of healthy green tissue (GreenGCN) and a gene network of healthy and stressed root tissues (RootGCN). Both networks achieved robust, high-quality scores on the metrics of guilt-by-association and selective constraints versus gene connectivity. We identified eight modules enriched in defense functions; two of the three such modules in the RootGCN were also conserved in the GreenGCN, suggesting similar defense-related expression patterns. We identified 16 WRKY genes involved in defense-related functions and 65 previously uncharacterized loci now linked to defense response. In addition, we identified and classified 122 loci linked to defense response that had previously been identified within QTLs or near candidate loci reported in GWAS studies of disease resistance in sunflower. All in all, we have implemented a valuable strategy to better describe genes within specific biological processes.
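The abstract does not describe how the networks or the functional predictions were computed, so the following is only a minimal sketch of the general idea behind co-expression networks and guilt-by-association. It assumes Pearson correlation with a hard threshold and a majority vote among annotated neighbours; the gene IDs, expression values, threshold and function names are all hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_edges(expr, threshold=0.9):
    """Return gene pairs whose absolute correlation reaches the threshold."""
    genes = sorted(expr)
    return {
        (g, h)
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
        if abs(pearson(expr[g], expr[h])) >= threshold
    }


def predict_by_association(gene, edges, annotations):
    """Guilt-by-association: most common annotation among annotated neighbours."""
    neighbours = {h if g == gene else g for g, h in edges if gene in (g, h)}
    votes = Counter(annotations[n] for n in neighbours if n in annotations)
    return votes.most_common(1)[0][0] if votes else None


# Toy expression matrix (hypothetical gene IDs and values, four samples each).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 1.0, 3.5, 1.2],
    "geneX": [0.9, 2.2, 2.9, 4.1],  # unannotated gene to be classified
}
annotations = {
    "geneA": "defense response",
    "geneB": "defense response",
    "geneC": "photosynthesis",
}

edges = coexpression_edges(expr)
prediction = predict_by_association("geneX", edges, annotations)
print(prediction)
```

In this toy case geneX is co-expressed only with the two genes annotated as defense-related, so it inherits that label. Real analyses typically use many more samples and more robust similarity measures.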
ABSTRACT
INTRODUCTION: Coinfection with two SARS-CoV-2 viruses is still a very understudied phenomenon. Although next-generation sequencing methods are very sensitive in detecting heterogeneous viral populations in a sample, there is no standardized method for their characterization, so their clinical and epidemiological importance is unknown. MATERIAL AND METHODS: We developed VICOS (Viral COinfection Surveillance), a new bioinformatic algorithm for variant calling, filtering and statistical analysis to identify samples suspected of being mixed SARS-CoV-2 populations from a large dataset within the framework of community genomic surveillance. VICOS was used to detect SARS-CoV-2 coinfections in a dataset of 1,097 complete genomes collected between March 2020 and August 2021 in Argentina. RESULTS: We detected 23 cases (2%) of SARS-CoV-2 coinfections. Detailed study of VICOS's results together with additional phylogenetic analysis revealed 3 cases of coinfection by two viruses of the same lineage, 2 cases by viruses of different genetic lineages, 13 cases compatible with both coinfection and intra-host evolution, and 5 cases that were likely a product of laboratory contamination. DISCUSSION: Intra-sample viral diversity provides important information to understand the transmission dynamics of SARS-CoV-2. Advanced bioinformatics tools, such as VICOS, are a necessary resource to help unveil the hidden diversity of SARS-CoV-2.
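The abstract does not give VICOS's actual filters or statistics. The sketch below only illustrates one common way of flagging a possibly mixed sample: counting well-covered variant sites whose alternate-allele frequency is intermediate rather than near 0 or 1. The thresholds, field names and read counts are hypothetical. The last lines simply check that the reported case categories add up to the 23 detected cases.

```python
def intermediate_variants(variants, low=0.2, high=0.8, min_depth=100):
    """Keep well-covered sites whose alternate-allele frequency is intermediate."""
    return [
        v for v in variants
        if v["depth"] >= min_depth and low <= v["alt_reads"] / v["depth"] <= high
    ]


def is_suspected_mixed(variants, min_sites=3):
    """Flag a sample when enough sites show intermediate frequencies."""
    return len(intermediate_variants(variants)) >= min_sites


# Hypothetical variant calls for one sample (positions and counts are made up).
sample = [
    {"pos": 100, "depth": 500, "alt_reads": 498},  # near-fixed, ignored
    {"pos": 200, "depth": 400, "alt_reads": 180},  # 0.45, intermediate
    {"pos": 300, "depth": 350, "alt_reads": 150},  # about 0.43, intermediate
    {"pos": 400, "depth": 600, "alt_reads": 330},  # 0.55, intermediate
    {"pos": 500, "depth": 50, "alt_reads": 25},    # too shallow, ignored
]
flagged = is_suspected_mixed(sample)

# Case breakdown as reported in the abstract.
cases = {
    "same lineage": 3,
    "different lineages": 2,
    "coinfection or intra-host evolution": 13,
    "likely laboratory contamination": 5,
}
total_cases = sum(cases.values())
percent_of_dataset = round(100 * total_cases / 1097)
print(flagged, total_cases, percent_of_dataset)
```

A frequency-band filter like this cannot by itself separate coinfection from intra-host evolution or contamination, which is consistent with the abstract's use of additional phylogenetic analysis.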
Subject(s)
COVID-19 , Coinfection , Humans , SARS-CoV-2/genetics , Phylogeny , Genome, Viral , Computational Biology , Consensus Sequence
ABSTRACT
Wood properties and agronomic traits associated with fast growth and frost tolerance make Eucalyptus nitens a valuable forest alternative. However, the rapid age-related decline in adventitious root (AR) formation (herein meaning the induction, initiation, and expression stages) limits its propagation. We analyzed transcriptomic profile variation in leaves and stem bases during AR induction of microcuttings to elucidate the molecular mechanisms involved in AR formation. In addition, we quantified the expression of candidate genes associated with recalcitrance. We delimited the ontogenic phases of root formation using histological techniques and Scarecrow and Short-Root expression quantification for RNA sequencing sample collection. We quantified the expression of genes associated with root meristem formation, auxin biosynthesis, perception, signaling, conjugation, and cytokinin signaling in shoots harvested from 2- to 36-month-old plants. After IBA treatment, 702 transcripts changed their expression. Several were involved in hormone homeostasis and the signaling pathways that determine cell dedifferentiation, leading to root meristem formation. In part, the age-related decline in rooting capacity is attributable to the increase in ARR1 gene expression, which negatively affects auxin homeostasis. The analysis of the transcriptomic variation in the leaves and rooting zones provided abundant information to (1) elucidate auxin metabolism; (2) understand the hormonal and signaling processes involved; and (3) collect data associated with recalcitrance.
ABSTRACT
BACKGROUND: Anastrepha fraterculus sp. 1 is considered a quarantine pest in several American countries. Since chemical control applied in an integrated pest management program is the only strategy utilized against this pest, the development of pesticide-free methods, such as the Sterile Insect Technique, is being considered. The search for genes involved in sex-determination and differentiation, and in metabolic pathways associated with communication and mating behaviour, contributes key information to the development of genetic control strategies. The aims of this work were to perform a comprehensive analysis of the A. fraterculus sp. 1 transcriptome and to obtain an initial evaluation of genes associated with main metabolic pathways by the expression analysis of specific transcripts identified in embryos and adults. RESULTS: Sexually mature adults of both sexes and 72 h embryos were considered for transcriptome analysis. The de novo transcriptome assembly was fairly complete (62.9% complete BUSCO orthologs detected) with a total of 86,925 transcripts assembled and 28,756 GO annotated sequences. Pairwise comparisons between libraries showed 319 transcripts differentially expressed between embryos and females, 1242 between embryos and males, and 464 between sexes. Using this information and gene searches based on published studies from other tephritid species, we evaluated a set of transcripts involved in development, courtship and metabolic pathways. The qPCR analysis showed that the early genes serendipity alpha and transformer-2 displayed similar expression levels in the analyzed stages, while heat shock protein 27 is over-expressed in embryos and females in comparison to males. The expression of genes associated with courtship (takeout-like, odorant-binding protein 50a1) differed between males and females, independently of their reproductive status (virgin vs mated individuals). Genes associated with metabolic pathways (maltase 2-like, androgen-induced gene 1) showed differential expression between embryos and adults. Furthermore, 14,262 microsatellite motifs were identified, with 11,208 transcripts containing at least one simple sequence repeat, including 48% of di/trinucleotide motifs. CONCLUSION: Our results significantly expand the available gene space of A. fraterculus sp. 1, contributing a fairly complete transcript database of embryos and adults. The expression analysis of the selected candidate genes, along with a set of microsatellite markers, provides a valuable resource for further genetic characterization of A. fraterculus sp. 1 and supports the development of specific genetic control strategies.
Subject(s)
Sexual Behavior, Animal , Tephritidae/genetics , Transcriptome , Animals , Embryo, Nonmammalian , Female , Male , Microsatellite Repeats , RNA-Seq , Reproduction , Tephritidae/embryology
ABSTRACT
Sclerotinia head rot (SHR), caused by the necrotrophic fungus Sclerotinia sclerotiorum, is one of the most devastating sunflower crop diseases. Despite its worldwide occurrence, the genetic determinants of plant resistance are still largely unknown. Here, we investigated the Sclerotinia-sunflower pathosystem by analysing temporal changes in gene expression in one susceptible and two tolerant inbred lines (IL) inoculated with the pathogen under field conditions. Differential expression analysis showed little overlap among ILs, suggesting genotype-specific control of cell defense responses possibly related to differences in disease resistance strategies. Functional enrichment assessments yielded a similar pattern. However, all three ILs altered the expression of genes involved in the cellular redox state and cell wall remodeling, in agreement with current knowledge about the initiation of plant immune responses. Remarkably, the over-representation of long non-coding RNAs (lncRNA) was another common feature among ILs. Our findings highlight the diversity of transcriptional responses to SHR within sunflower breeding lines and provide evidence of lncRNAs playing a significant role at early stages of defense.
Subject(s)
Ascomycota/genetics , Helianthus/microbiology , Plant Diseases/microbiology , Breeding/methods , Cell Wall/microbiology , Disease Resistance , Gene Expression/genetics , Genotype , Oxidation-Reduction , RNA, Long Noncoding/genetics , Sequence Analysis, RNA/methods , Transcription, Genetic/genetics
ABSTRACT
Sunflower germplasm collections are valuable resources for broadening the genetic base of commercial hybrids and ameliorating the risk of climate events. Currently, the most studied sunflower pre-breeding collections worldwide belong to INTA (Argentina), INRA (France), and USDA-UBC (United States of America-Canada). In this work, we assess the amount and distribution of genetic diversity (GD) available within and between these collections to estimate the distribution pattern of global diversity. A mixed genotyping strategy was implemented, combining proprietary genotyping-by-sequencing data with public whole-genome-sequencing data, to generate an integrative matrix of 11,834 common single nucleotide polymorphisms including the three breeding collections. In general, the GD estimates obtained were moderate. An analysis of molecular variance provided evidence of population structure between breeding collections. However, the optimal number of subpopulations, studied via discriminant analysis of principal components (K = 12), the Bayesian STRUCTURE algorithm (K = 6) and distance-based methods (K = 9), remains unclear, since no single unifying characteristic is apparent for any of the inferred groups. Different overall patterns of linkage disequilibrium (LD) were observed across chromosomes, with Chr10, Chr17, Chr5, and Chr2 showing the highest LD. This work represents the largest and most comprehensive inter-breeding collection analysis of genomic diversity for cultivated sunflower conducted to date.
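The abstract does not state which LD statistic was used. As a minimal sketch, the widely used r² measure between two biallelic loci can be computed from phased haplotypes as below; the 0/1 allele coding, the function name and the toy haplotypes are assumptions for illustration only.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes coded as (0/1, 0/1)."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom else 0.0           # undefined for monomorphic loci


# Toy haplotype sets (hypothetical): complete LD versus no association.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(complete_ld), ld_r2(no_ld))
```

With unphased SNP genotypes, as is typical for genotyping-by-sequencing data, r² is usually estimated from genotype correlations or inferred haplotype frequencies rather than counted directly.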
Subject(s)
Helianthus/genetics , Linkage Disequilibrium , Polymorphism, Genetic , Seed Bank , Chromosomes, Plant/genetics , Plant Breeding/methods
ABSTRACT
Cercospora kikuchii (Tak. Matsumoto & Tomoy.) M.W. Gardner 1927 is an ascomycete fungal pathogen that causes Cercospora leaf blight and purple seed stain on soybean. Here, we report the first draft genome sequence and assembly of this pathogen. The C. kikuchii strain ARG_18_001 was isolated from soybean purple seed collected from San Pedro, Buenos Aires, Argentina, during the 2018 harvest. The genome was sequenced using a 2 × 150 bp paired-end method by Illumina NovaSeq 6000. The C. kikuchii protein-coding genes were predicted using FunGAP (Fungal Genome Annotation Pipeline). The draft genome assembly was 33.1 Mb in size with a GC-content of 53%. The gene prediction resulted in 14,856 gene models/14,721 protein coding genes. Genomic data of C. kikuchii presented here will be a useful resource for future studies of this pathosystem. The data can be accessed at GenBank under the accession number VTAY00000000 https://www.ncbi.nlm.nih.gov/nuccore/VTAY00000000.
ABSTRACT
MAIN CONCLUSION: miRNA targets from Citrus sinensis are predicted and validated using degradome data. They show an up-regulation upon infection with CPsV, with a positive correlation between target expression and symptom severity. Sweet orange (Citrus sinensis) may suffer from disease symptoms induced by virus infections, resulting in drastic economic losses. Infection of sweet orange plants with two isolates of citrus psorosis virus (CPsV), expressing different symptomatologies, alters the accumulation of a set of endogenous microRNAs (miRNAs). Here, we predicted ten putative targets from four down-regulated miRNAs: three belonging to the CCAAT-binding transcription factor family (CBFAs); an Ethylene-responsive transcription factor (RAP2-7); an Integrase-type DNA-binding superfamily protein (AP2B); Transport inhibitor response 1 (TIR1); GRR1-like protein 1-related (GRR1); Argonaute 2-related (AGO2); Argonaute 7 (AGO7); and a long non-coding RNA (ncRNA). We validated six of them through analysis of leaf degradome data. Expression of the validated targets increases in infected samples compared to healthy tissue, with a more striking up-regulation in those samples with higher symptom severity. This study contributes to the understanding of the miRNA-mediated regulation of important transcripts in Citrus sinensis through target validation and sheds light on the manner in which a virus can alter host regulatory mechanisms, leading to symptom expression.
Subject(s)
Citrus sinensis/metabolism , Citrus sinensis/virology , MicroRNAs/metabolism , Plant Viruses/pathogenicity , Transcriptional Activation/genetics , Transcriptional Activation/physiology
ABSTRACT
BACKGROUND: Leaf senescence is a complex process, controlled by multiple genetic and environmental variables. In sunflower, leaf senescence is triggered abruptly following anthesis, thereby limiting the capacity of plants to keep their green leaf area during grain filling, which subsequently has a strong impact on crop yield. Recently, we selected sunflower inbred lines that contrast in the progress of leaf senescence through a physiological, cytological and molecular approach. Here we present a large-scale transcriptomic analysis using RNA-seq and its integration with metabolic profiles for two contrasting sunflower inbred lines, R453 and B481-6 (early and delayed senescence, respectively), with the aim of identifying metabolic pathways associated with leaf senescence. RESULTS: Gene expression profiles revealed a higher number of differentially expressed genes, as well as higher expression levels, in R453, providing evidence for early activation of the senescence program in this line. Metabolic pathways associated with sugars and nutrient recycling were differentially regulated between the lines. Additionally, we identified transcription factors acting as hubs in the co-expression networks; some were previously reported as senescence-associated genes in model species, but many are novel candidate genes. CONCLUSIONS: Understanding the onset and the progress of the senescence process in crops and the identification of these new candidate genes will likely prove highly useful for different management strategies to mitigate the impact of senescence on crop yield. Functional characterization of candidate genes will help to develop molecular tools for biotechnological applications in breeding for crop yield.
Subject(s)
Gene Expression Regulation, Plant , Gene Regulatory Networks , Helianthus/genetics , Systems Biology , Transcriptome , Genomics , Helianthus/physiology , Phenotype , Plant Leaves/genetics , Plant Leaves/physiology , Species Specificity , Time Factors
ABSTRACT
MAIN CONCLUSION: Abscisic acid is involved in the drought response of Ilex paraguariensis. Acclimation includes root growth stimulation, stomatal closure, osmotic adjustment, photoprotection, and regulation of nonstructural carbohydrate and amino acid metabolism. Ilex paraguariensis (yerba mate) is cultivated in the subtropical region of South America, where the occurrence of drought episodes limits yield. To explore the mechanisms that allow I. paraguariensis to overcome dehydration, we investigated (1) how gene expression varied between water-stressed and non-stressed plants and (2) in what way the modulation of gene expression was linked to physiological status and metabolite composition. A total of 4920 differentially expressed transcripts were obtained through RNA-Seq after water deprivation. Drought induced the expression of several transcripts involved in the ABA-signalling pathway. Stomatal closure and leaf osmotic adjustments were promoted to minimize water loss, and these responses were accompanied by a high transcriptional remodeling of stress perception, signalling and transcriptional regulation, the photoprotective and antioxidant systems, and other stress-responsive genes. Simultaneously, significant changes in metabolite contents were detected. Glutamine, phenylalanine, isomaltose, fucose, and malate levels were shown to be positively correlated with dehydration. Principal component analysis showed differences in the metabolic profiles of control and stressed leaves. These results provide a comprehensive overview of how I. paraguariensis responds to dehydration at transcriptional and metabolomic levels and provide further characterization of the molecular mechanisms associated with drought response in perennial subtropical species.
Subject(s)
Abscisic Acid/metabolism , Gene Expression Regulation, Plant , Ilex paraguariensis/physiology , Metabolome , Plant Growth Regulators/metabolism , Transcriptome , Acclimatization , Dehydration , Droughts , Gene Expression Profiling , Ilex paraguariensis/genetics , Plant Leaves/genetics , Plant Leaves/physiology , Plant Roots/genetics , Plant Roots/physiology , Stress, Physiological
ABSTRACT
Snakin-1 is a member of the Solanum tuberosum Snakin/GASA family. We previously demonstrated that Snakin-1 is involved in plant defense against pathogens as well as in plant growth and development, but its mechanism of action has not yet been completely elucidated. Here, we showed that leaves of Snakin-1-silenced potato transgenic plants exhibited increased levels of reactive oxygen species and significantly reduced content of ascorbic acid. Furthermore, Snakin-1 silencing enhanced salicylic acid content, in accordance with an increased expression of SA-inducible PR genes. Interestingly, gibberellic acid levels were also enhanced, and transcriptome analysis revealed that a large number of genes related to sterol biosynthesis were downregulated in these silenced lines. Moreover, we demonstrated that Snakin-1 directly interacts with StDIM/DWF1, an enzyme involved in plant sterol biosynthesis. Additionally, the analysis of the expression pattern of PStSN1::GUS in potato showed that Snakin-1 is present mainly in young tissues associated with active growth and cell division zones. Our comprehensive analysis of Snakin-1-silenced lines demonstrated for the first time in potato that Snakin-1 plays a role in redox balance and participates in a complex crosstalk among different hormones.
Subject(s)
Plant Growth Regulators , Plant Leaves , Plant Proteins , Plants, Genetically Modified , Solanum tuberosum , Phytosterols/biosynthesis , Phytosterols/genetics , Plant Growth Regulators/genetics , Plant Growth Regulators/metabolism , Plant Leaves/genetics , Plant Leaves/metabolism , Plant Proteins/biosynthesis , Plant Proteins/genetics , Plants, Genetically Modified/genetics , Plants, Genetically Modified/metabolism , Solanum tuberosum/genetics , Solanum tuberosum/metabolism
ABSTRACT
The endangered Cedrela balansae C.DC. (Meliaceae) is a high-value timber species with great potential for forest plantations that inhabits the tropical forests of Northwestern Argentina. Research on this species is scarce because of the limited genetic and genomic information available. Here, we explored the transcriptome of C. balansae using 454 GS FLX Titanium next-generation sequencing (NGS) technology. Following de novo assembly, we identified 27,111 non-redundant unigenes longer than 200 bp, and considered these transcripts for further downstream analysis. The functional annotation was performed by searching the 27,111 unigenes against the NR-Protein and the Interproscan databases. This analysis revealed 26,977 genes with homology in at least one of the databases analyzed. Furthermore, 7,774 unigenes in 142 different active biological pathways in C. balansae were identified with the KEGG database. Moreover, after in silico analyses, we detected 2,663 simple sequence repeat (SSR) markers. A subset of 70 SSRs related to important "stress tolerance" traits, based on functional annotation evidence, was selected for wet PCR validation in C. balansae and other Cedrela species inhabiting the northwest and northeast of Argentina (C. fissilis, C. saltensis and C. angustifolia). Successful transferability was between 77% and 93% and, thanks to this study, 32 polymorphic functional SSRs for all analyzed Cedrela species are now available. The gene catalog and molecular markers obtained here represent a starting point for further research, which will assist genetic breeding programs in the Cedrela genus and will contribute to identifying key populations for its preservation.
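The abstract does not name the tool or thresholds used for in silico SSR detection. A minimal sketch of the general approach is a regular-expression scan for perfect tandem repeats, shown below. The minimum repeat counts, function names and test sequence are hypothetical, and dedicated SSR-mining tools handle many cases this sketch ignores, such as compound or imperfect repeats.

```python
import re

# Hypothetical minimum number of tandem copies per motif length (di/tri/tetra).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. AGAG, AAA)."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A unit of fixed length followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)


# Hypothetical transcript fragment containing (AG)8 and (ATC)5.
fragment = "TTGC" + "AG" * 8 + "CCGT" + "ATC" * 5 + "GGA"
ssrs = find_ssrs(fragment)
print(ssrs)
```

The primitive-motif check keeps an (AG)8 run from being reported a second time as an (AGAG)4 tetranucleotide repeat.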
Subject(s)
Cedrela/genetics , Computer Simulation , Databases, Nucleic Acid , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Transcriptome/physiology , Argentina , Cedrela/growth & development , Genetic Markers
ABSTRACT
KEY MESSAGE: By integrating transcriptional and metabolic profiles, we identified pathways and hub transcription factors regulated during drought conditions in sunflower, useful for applications in molecular and/or biotechnological breeding. Drought is one of the most important environmental stresses that affect crop productivity in many agricultural regions. Sunflower is tolerant to drought conditions, but the mechanisms involved in this tolerance remain unclear at the molecular level. The aim of this study was to characterize and integrate transcriptional and metabolic pathways related to drought stress in sunflower plants, using a systems biology approach. Our results showed a delay in plant senescence with an increase in the expression level of photosynthesis-related genes, as well as higher levels of sugars, osmoprotectant amino acids and ionic nutrients under drought conditions. In addition, we identified transcription factors that were upregulated during drought conditions and that may act as hubs in the transcriptional network. Many of these transcription factors belong to families implicated in the drought response in model species. The integration of transcriptomic and metabolomic data in this study, together with physiological measurements, has improved our understanding of the biological responses during droughts and contributes to elucidating the molecular mechanisms involved under this environmental condition. These findings will provide useful biotechnological tools to improve stress tolerance while maintaining crop yield under restricted water availability.
Subject(s)
Gene Expression Regulation, Plant/physiology , Helianthus/metabolism , Stress, Physiological/physiology , Transcription Factors/metabolism , Water/metabolism , Chlorophyll/metabolism , Helianthus/genetics , Plant Leaves/genetics , Plant Leaves/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Protein Array Analysis , RNA, Plant/genetics , RNA, Plant/metabolism , Transcription Factors/genetics
ABSTRACT
BACKGROUND: In recent years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. RESULTS: We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with next-generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. CONCLUSIONS: ATGC transcriptomics gives non-expert computer users and small research groups access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of GNU public license at http://atgcinta.sourceforge.net .
Subject(s)
Transcriptome , User-Computer Interface , Animals , Databases, Genetic , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Internet , Sequence Analysis, RNA
ABSTRACT
BACKGROUND: Diachasmimorpha longicaudata (Hymenoptera: Braconidae) is a solitary parasitoid of Tephritidae (Diptera) fruit flies of economic importance, currently being mass-reared in bio-factories and successfully used worldwide. A peculiar biological aspect of Hymenoptera is its haplo-diploid life cycle, where females (diploid) develop from fertilized eggs and males (haploid) from unfertilized eggs. Diploid males have been described in many species and were recently evidenced in D. longicaudata by means of inbreeding studies. Sex determination in this parasitoid is based on the Complementary Sex Determination (CSD) system, with alleles from at least one locus involved in early steps of this pathway. Since limited information is available about the genetics of this parasitoid species, a deeper analysis of D. longicaudata's genomics is required to provide molecular tools for achieving a more cost-effective production under artificial rearing conditions. RESULTS: We report here the first transcriptome analysis of male larvae, adult females and adult males of D. longicaudata using 454-pyrosequencing. A total of 469766 reads were analyzed and 8483 high-quality isotigs were assembled. After functional annotation, a total of 51686 unigenes were produced, of which 7021 isotigs and 20227 singletons had at least one BLAST hit against the NCBI non-redundant protein database. A preliminary comparison of adult females and males showed that 98 transcripts had differential expression profiles, with at least a 10-fold difference. Among the functionally annotated transcripts we detected four sequences potentially involved in sex determination and three homologues to two known genes involved in the sex determination cascade. Finally, a total of 4674 Simple Sequence Repeats (SSRs) were identified in silico and characterized. CONCLUSION: The information obtained here will significantly contribute to the development of D. longicaudata functional genomics, genetics and population-based genome studies. Thousands of new microsatellite markers were identified as toolkits for population genetics analysis. The transcriptome characterized here is the starting point to elucidate the molecular bases of the sex determination mechanism in this species.
Subject(s)
Computational Biology , Gene Expression Profiling , Transcriptome , Wasps/genetics , Animals , Computational Biology/methods , Female , Gene Ontology , Genetic Variation , Genetics, Population , High-Throughput Nucleotide Sequencing , Larva , Male , Microsatellite Repeats , Molecular Sequence Annotation , Reproducibility of Results , Sex Determination Processes
ABSTRACT
Cellulomonas sp. strain B6 was isolated from a subtropical forest soil sample and presented (hemi)cellulose-degrading activity. We report here its draft genome sequence, with an estimated genome size of 4 Mb, a G+C content of 75.1%, and 3,443 predicted protein-coding sequences, 92 of which are glycosyl hydrolases involved in polysaccharide degradation.
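As a brief illustration of the G+C statistic reported above, the following sketch computes the G+C percentage of a nucleotide sequence. The function name and example sequences are hypothetical, and ambiguous bases such as N are excluded from the denominator by assumption.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt if acgt else 0.0


# Hypothetical short sequences, not taken from the B6 genome.
print(round(gc_content("GGCCAT"), 1))
print(round(gc_content("ggccNNat"), 1))
```

For a multi-contig draft assembly, the same calculation would normally be applied to the concatenated contigs, or to the summed base counts, rather than averaged per contig.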
ABSTRACT
Leaf senescence is a complex process that has dramatic consequences on crop yield. In sunflower, the gap between potential and actual yields reveals the economic impact of senescence. Indeed, sunflower plants are incapable of maintaining their green leaf area over sustained periods. This study characterizes the leaf senescence process in sunflower through a systems biology approach integrating transcriptomic and metabolomic analyses of plants grown under both glasshouse and field conditions. Our results revealed a correspondence between profile changes detected at the molecular, biochemical and physiological levels throughout the progression of leaf senescence measured at different plant developmental stages. Early metabolic changes were detected prior to anthesis and before the onset of the first senescence symptoms, with more pronounced changes observed when physiological and molecular variables were assessed under field conditions. During leaf development, photosynthetic activity and cell growth processes decreased, whereas sucrose, fatty acid, nucleotide and amino acid metabolism increased. Pathways related to nutrient recycling processes were also up-regulated. Members of the NAC, AP2-EREBP, HB, bZIP and MYB transcription factor families showed high expression levels, and their expression levels were highly correlated, suggesting their involvement in sunflower senescence. The results of this study thus contribute to the elucidation of the molecular mechanisms involved in the onset and progression of leaf senescence in sunflower leaves as well as to the identification of candidate genes involved in this process.
Subject(s)
Gene Expression Profiling/methods , Helianthus/genetics , Helianthus/metabolism , Metabolomics/methods , Plant Leaves/growth & development , Plant Leaves/metabolism , Gas Chromatography-Mass Spectrometry , Gene Expression Regulation, Plant , Gene Ontology , Genes, Plant , Ions , Oligonucleotide Array Sequence Analysis , Plant Leaves/genetics , Principal Component Analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Prosopis alba (Fabaceae) is an important native tree adapted to arid and semiarid regions of north-western Argentina and is of great value as a multipurpose species. Despite its importance, the genomic resources currently available for the entire Prosopis genus are still limited. Here we describe the development of a leaf transcriptome and the identification of new molecular markers that could support functional genetic studies in natural and domesticated populations of this genus. RESULTS: Next generation DNA pyrosequencing technology applied to P. alba transcripts produced a total of 1,103,231 raw reads with an average length of 421 bp. De novo assembly generated a set of 15,814 isotigs and 71,101 non-assembled sequences (singletons) with an average length of 991 bp and 288 bp, respectively. A total of 39,000 unique singletons were identified after clustering natural and artificial duplicates from pyrosequencing reads. Regarding the non-redundant sequences or unigenes, 22,095 out of 54,814 were successfully annotated with Gene Ontology terms. Moreover, simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 5,992 and 6,236 markers, respectively, throughout the genome. To validate the predicted SSR markers, a subset of 87 SSRs selected through functional annotation evidence was successfully amplified from six DNA samples of seedlings. From this analysis, 11 of these 87 SSRs were identified as polymorphic. Additionally, another set of 123 nuclear polymorphic SSRs was determined in silico, of which 50% are likely to be effectively polymorphic. CONCLUSIONS: This study generated a successful global analysis of the P. alba leaf transcriptome after bioinformatic and wet laboratory validations of RNA-Seq data. The limited set of molecular markers currently available will be significantly increased with the thousands of new markers that were identified in this study. This information will strongly contribute to genomics resources for P. alba functional analysis and genetics. Finally, it will also potentially contribute to the development of population-based genome studies in the genus.
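The abstract does not say which tool was used for the SSR search, so the following is only a minimal sketch of how perfect microsatellites can be located in unigene sequences with a regular expression. The function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for illustration, not parameters reported by the authors.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed for illustration,
# not taken from the abstract): di- >= 6, tri- >= 5, tetranucleotide >= 4 copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT").
            if any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits
```

Calling `find_ssrs` on each unigene sequence yields zero-based start positions, which could then be used to design flanking primers for wet-laboratory validation such as the amplification test described above.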
Subject(s)
Plant Leaves/genetics , Prosopis/genetics , Transcriptome , Chloroplasts/genetics , Gene Frequency , Gene Ontology , Genes, Plant , Genetic Markers , High-Throughput Nucleotide Sequencing , Metabolic Networks and Pathways/genetics , Microsatellite Repeats , Molecular Sequence Annotation , Plant Leaves/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Polymorphism, Single Nucleotide , Prosopis/metabolism , Sequence Analysis, DNA
ABSTRACT
Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large-scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigene Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays were performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; of these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and reliable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement.
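As a quick consistency check on the figures reported above, the assembly and microarray counts can be tallied directly. This is only an illustrative sketch; the variable names and the helper `check_counts` are hypothetical, while the numbers are those given in the abstract.

```python
# Counts as reported in the abstract; the names are illustrative only.
contigs, singletons = 12_924, 28_089
putative_transcripts = 41_013

agilent_controls = 1_417
sunflower_control_probes, replicates = 74, 10
non_control_probes = 40_169
total_features = 42_326


def check_counts():
    """Return True if the reported parts sum to the reported totals."""
    assembly_ok = contigs + singletons == putative_transcripts
    # 74 sunflower control probes, each replicated 10 times.
    sunflower_controls = sunflower_control_probes * replicates
    array_ok = (
        agilent_controls + sunflower_controls + non_control_probes
        == total_features
    )
    return assembly_ok and array_ok
```

Both totals agree with their parts, so `check_counts()` returns `True`.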
Subject(s)
Helianthus/genetics , Oligonucleotide Array Sequence Analysis/methods , Expressed Sequence Tags , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology
ABSTRACT
BACKGROUND: Nothofagus nervosa is one of the most emblematic native tree species of Patagonian temperate forests. Here, the shotgun RNA-sequencing (RNA-Seq) of the transcriptome of N. nervosa, including de novo assembly, functional annotation, and in silico discovery of potential molecular markers to support population and association genetic studies, is described. RESULTS: Pyrosequencing of a young leaf cDNA library generated a total of 111,814 high quality reads, with an average length of 447 bp. De novo assembly using Newbler resulted in 3,005 tentative isotigs (including alternative transcripts). The non-assembled sequences (singletons) were clustered with CD-HIT-454 to identify natural and artificial duplicates from pyrosequencing reads, leading to 21,881 unique singletons. Of the 24,886 non-redundant sequences or unigenes, 15,497 were successfully annotated against a plant protein database. A substantial number of simple sequence repeat markers (SSRs) were discovered in the assembled and annotated sequences. More than 40% of the SSR sequences were inside ORF sequences. To confirm the validity of these predicted markers, a subset of 73 SSRs selected through functional annotation evidence was successfully amplified from six seedling DNA samples; 14 of these SSRs were polymorphic. CONCLUSIONS: This paper is the first report that shows a highly precise representation of the mRNA diversity present in young leaves of a native South American tree, N. nervosa, as well as its in silico deduced putative functionality. The reported Nothofagus transcriptome sequences represent a unique resource for genetic studies and provide a tool to discover genes of interest and genetic markers that will greatly aid in addressing questions involving evolution, ecology, and conservation using genetic and genomic approaches in the genus.