ABSTRACT
X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of the gene. Transcriptome analyses identified decreased expression of the canonical cTAF1 transcript among XDP probands, and de novo assembly across multiple pluripotent stem-cell-derived neuronal lineages discovered aberrant TAF1 transcription that involved alternative splicing and intron retention (IR) in proximity to the SVA that was anti-correlated with overall TAF1 expression. CRISPR/Cas9 excision of the SVA rescued this XDP-specific transcriptional signature and normalized TAF1 expression in probands. These data suggest an SVA-mediated aberrant transcriptional mechanism associated with XDP and may provide a roadmap for layered technologies and integrated assembly-based analyses for other unsolved Mendelian disorders.
Subjects
Dystonic Disorders/genetics , Genetic Diseases, X-Linked/genetics , Genome, Human , Transcriptome/genetics , Alternative Splicing/genetics , Alu Elements/genetics , Base Sequence , CRISPR-Cas Systems/genetics , Cohort Studies , Family , Female , Genetic Loci , Haplotypes/genetics , High-Throughput Nucleotide Sequencing , Histone Acetyltransferases/genetics , Histone Acetyltransferases/metabolism , Humans , Induced Pluripotent Stem Cells/metabolism , Introns/genetics , Male , Minisatellite Repeats/genetics , Models, Genetic , Nerve Degeneration/genetics , Nerve Degeneration/pathology , Neural Stem Cells/metabolism , Neurons/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Short Interspersed Nucleotide Elements , TATA-Binding Protein Associated Factors/genetics , TATA-Binding Protein Associated Factors/metabolism , Transcription Factor TFIID/genetics , Transcription Factor TFIID/metabolism
ABSTRACT
Genes can produce transcript variants that perform specific cellular functions. However, accurately detecting all transcript variants remains a long-standing challenge, especially when working with poorly annotated genomes or without a known genome. To address this issue, we have developed a new computational method, TransIntegrator, which enables transcriptome-wide detection of novel transcript variants. To this end, we generated 10 Illumina sequencing transcriptomes and a PacBio full-length transcriptome for consecutive embryonic development stages of amphioxus, a species of great evolutionary importance. Based on these transcriptomes, we employed TransIntegrator to create a comprehensive transcript variant library, namely iTranscriptome. The resulting iTranscriptome contained 91 915 distinct transcript variants, with an average of 2.4 variants per gene. This substantially improved the current amphioxus genome annotation by expanding the number of genes from 21 954 to 38 777. Further analysis showed that the gene expansion was largely attributable to the integration of multiple Illumina datasets rather than to the PacBio data. Moreover, we demonstrated an example application of TransIntegrator, via generating iTranscriptome, in aiding accurate transcriptome assembly, which significantly outperformed other hybrid methods such as IDP-denovo and Trinity. For user convenience, we have deposited the source code of TransIntegrator on GitHub and released a conda package on Anaconda. In summary, this study proposes an affordable but efficient method for reliable transcriptomic research in most species.
Subjects
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , Genome , Gene Library , High-Throughput Nucleotide Sequencing/methods
ABSTRACT
BACKGROUND: Transcriptome assembly from RNA-sequencing data in species without a reliable reference genome has to be performed de novo, but studies have shown that de novo methods often have inadequate ability to reconstruct transcript isoforms. We address this issue by constructing an assembly pipeline whose main purpose is to produce a comprehensive set of transcript isoforms. RESULTS: We present the de novo transcript isoform assembler ClusTrast, which takes short-read RNA-seq data as input, assembles a primary assembly, clusters a set of guiding contigs, aligns the short reads to the guiding contigs, assembles each clustered set of short reads individually, and merges the primary and clusterwise assemblies into the final assembly. We tested ClusTrast on real datasets from six eukaryotic species and showed that ClusTrast reconstructed more expressed known isoforms than any of the other tested de novo assemblers, at a moderate reduction in precision. For recall, ClusTrast ranked first at the lower end of expression levels (below the 15th percentile) for all tested datasets, and over the entire range for almost all datasets. Reference transcripts were often (35-69% for the six datasets) reconstructed to at least 95% of their length by ClusTrast, and more than half of reference transcripts (58-81%) were reconstructed with contigs that exhibited polymorphism, as measured on a subset of reliably predicted contigs. ClusTrast recall increased when using a union of assembled transcripts from more than one assembly tool as the primary assembly. CONCLUSION: We suggest that ClusTrast can be a useful tool for studying isoforms in species without a reliable reference genome, in particular when the goal is to produce a comprehensive transcriptome set with polymorphic variants.
Subjects
High-Throughput Nucleotide Sequencing , Transcriptome , Sequence Analysis , RNA-Seq , Sequence Analysis, RNA , Protein Isoforms/genetics , High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Rapeseed-mustard, the oleiferous Brassica species, are important oilseed crops cultivated all over the globe. The mustard aphid Lipaphis erysimi (L.) Kaltenbach is a major threat to the cultivation of rapeseed-mustard. Wild mustard Rorippa indica (L.) Hiern shows tolerance to mustard aphids as a nonhost and hence is an important source for the bioprospecting of potential resistance genes and defense measures to manage mustard aphids sustainably. We performed mRNA sequencing of R. indica plants, either uninfested or infested by mustard aphids, harvested at 24 hours post-infestation. Following quality control, the high-quality reads were subjected to de novo assembly of the transcriptome. As there is no genomic information available for this wild plant, the raw reads will be useful for further bioinformatics analysis, and the sequence information of the assembled transcripts will be helpful for designing primers to characterize specific gene sequences. In this study, we also used the generated resource to comprehensively analyse the global profile of differential gene expression in R. indica in response to infestation by mustard aphids. The functional enrichment analysis of the differentially expressed genes reveals a significant immune response and suggests the possibility of chitin-induced defense signaling.
Subjects
Aphids , Rorippa , Animals , Mustard Plant/genetics , Transcriptome , Aphids/genetics , Rorippa/genetics
ABSTRACT
Freshwater ecosystems are among the most endangered ecosystems worldwide. While numerous taxa are on the verge of extinction as a result of global changes and direct or indirect anthropogenic activity, genomic and transcriptomic resources represent a key tool for comprehending species' adaptability and serve as the foundation for conservation initiatives. The Loire grayling, Thymallus ligericus, is a freshwater European salmonid endemic to the upper Loire River basin. The species is comprised of fragmented populations that are dispersed over a small area and it has been identified as a vulnerable species. Here, we provide a multi-tissue de novo transcriptome assembly of T. ligericus. The completeness and integrity of the transcriptome were assessed before and after redundancy removal with lineage-specific libraries from Eukaryota, Metazoa, Vertebrata, and Actinopterygii. Relative gene expression was assessed for each of the analyzed tissues, using the de novo assembled transcriptome and a genome-based analysis using the available T. thymallus genome as a reference. The final assembly, with a contig N50 of 1221 and Benchmarking Universal Single-Copy Orthologs (BUSCO) scores above 94%, is made accessible along with structural and functional annotations and relative gene expression of the five tissues (NCBI SRA and FigShare databases). This is the first transcriptomic resource for this species, which provides a foundation for future research on this and other salmonid species that are increasingly exposed to environmental stressors.
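The contig N50 reported above is a length-weighted contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. A minimal sketch of that standard definition follows; the function name `n50` and the example lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (not from the T. ligericus assembly).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

Because N50 rewards long contigs regardless of correctness, it is usually read alongside completeness measures such as the BUSCO scores quoted in the abstract.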
Subjects
Salmonidae , Transcriptome , Animals , Salmonidae/genetics , Fresh Water , Molecular Sequence Annotation , Gene Expression Profiling , Endangered Species , Genome
ABSTRACT
Many questions in biology benefit greatly from the use of a variety of model systems. High-throughput sequencing methods have been a triumph in the democratization of diverse model systems. They allow for the economical sequencing of an entire genome or transcriptome of interest, and with technical variations can even provide insight into genome organization and the expression and regulation of genes. The analysis and biological interpretation of such large datasets can present significant challenges that depend on the 'scientific status' of the model system. While high-quality genome and transcriptome references are readily available for well-established model systems, the establishment of such references for an emerging model system often requires extensive resources such as finances, expertise and computation capabilities. The de novo assembly of a transcriptome represents an excellent entry point for genetic and molecular studies in emerging model systems as it can efficiently assess gene content while also serving as a reference for differential gene expression studies. However, the process of de novo transcriptome assembly is non-trivial, and as a rule must be empirically optimized for every dataset. For the researcher working with an emerging model system, and with little to no experience with assembling and quantifying short-read data from the Illumina platform, these processes can be daunting. In this guide we outline the major challenges faced when establishing a reference transcriptome de novo and we provide advice on how to approach such an endeavor. We describe the major experimental and bioinformatic steps, provide some broad recommendations and cautions for the newcomer to de novo transcriptome assembly and differential gene expression analyses. Moreover, we provide an initial selection of tools that can assist in the journey from raw short-read data to assembled transcriptome and lists of differentially expressed genes.
ABSTRACT
Ziziphus nummularia, an elite heat-stress-tolerant shrub, grows in arid desert regions. However, the molecular mechanism responsible for its heat stress tolerance is unexplored. Therefore, we analysed the whole transcriptomes of the Jaisalmer (heat-tolerant) and Godhra (heat-sensitive) genotypes of Z. nummularia to understand this mechanism. De novo assembly of 162,225,052 clean reads yielded 276,029 transcripts. A total of 208,506 unigenes were identified, which contain 4290 and 1043 differentially expressed genes (DEGs) in TGO (treated Godhra at 42 °C) vs. CGO (control Godhra) and TJR (treated Jaisalmer at 42 °C) vs. CJR (control Jaisalmer), respectively. A total of 987 (67 highly enriched) and 754 (34 highly enriched) pathways were observed in CGO vs. TGO and CJR vs. TJR, respectively. Antioxidant pathways and TFs such as Homeobox, HBP, ARR, PHD, GRAS, CPP, and E2FA were uniquely observed in the Godhra genotype, and SET domains were uniquely observed in the Jaisalmer genotype. Furthermore, transposable elements were highly up-regulated in the Godhra genotype but showed no activation in the Jaisalmer genotype. A total of 43,093 and 39,278 simple sequence repeats were identified in the Godhra and Jaisalmer genotypes, respectively. A total of 10 DEGs linked to heat stress were validated in both genotypes for their expression under different heat stresses using quantitative real-time PCR. Comparing expression patterns of the selected DEGs identified ClpB1 as a potential candidate gene for heat tolerance in Z. nummularia. Here we present the first characterized transcriptome of Z. nummularia in response to heat stress for the identification and characterization of heat stress-responsive genes. Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-024-01431-y.
ABSTRACT
BACKGROUND: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. RESULTS: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. CONCLUSIONS: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
Subjects
Software , Transcriptome , Sequence Analysis, RNA/methods , RNA-Seq , Gene Expression Profiling , Molecular Sequence Annotation
ABSTRACT
Hepatitis B virus (HBV) is one of the smallest human DNA viruses, and its 3.2 kb genome encodes multiple overlapping open reading frames, making its viral transcriptome challenging to dissect. Previous studies have combined quantitative PCR and next-generation sequencing to identify viral transcripts and splice junctions; however, the fragmentation and selective amplification used in short-read sequencing preclude the resolution of full-length RNAs. Our study coupled an oligonucleotide enrichment protocol with state-of-the-art long-read sequencing (PacBio) to identify the repertoire of HBV RNAs. This methodology provides sequencing libraries in which up to 25% of reads are of viral origin and enables the identification of canonical (unspliced), non-canonical (spliced) and chimeric viral-human transcripts. Sequencing RNA isolated from de novo HBV-infected cells or from cells transfected with 1.3 × overlength HBV genomes allowed us to assess the viral transcriptome and to annotate 5' truncations and polyadenylation profiles. The two HBV model systems showed excellent agreement in the pattern of major viral RNAs; however, differences were noted in the abundance of spliced transcripts. Viral-host chimeric transcripts were identified and were more commonly found in the transfected cells. Enrichment capture and PacBio sequencing allow the assignment of canonical and non-canonical HBV RNAs using an open-source analysis pipeline that enables the accurate mapping of the HBV transcriptome.
Subjects
Hepatitis B virus , Transcriptome , Humans , Hepatitis B virus/genetics , High-Throughput Nucleotide Sequencing , RNA, Viral/genetics
ABSTRACT
Despite the economic losses due to walnut anthracnose, Ophiognomonia leptostyla is an orphan fungus with respect to genomic resources. In the present study, the transcriptome of O. leptostyla was assembled for the first time. RNA sequencing was conducted on fungal mycelia grown in liquid medium and on walnut leaf samples inoculated with fungal conidia and sampled at 48, 96 and 144 h post inoculation (hpi). The completeness, correctness, and contiguity of the de novo transcriptome assemblies generated with Trinity, Oases, SOAPdenovo-Trans and Bridger were compared to identify a single superior reference assembly. In most of the assessment criteria, including N50, TransRate score, number of ORFs with a known description in GenBank, the percentage of reads mapped back to the transcripts (RMBT), BUSCO score, Swiss-Prot coverage bin and RSEM-EVAL score, the Bridger assembly was superior and was thus used as a reference for profiling the O. leptostyla transcriptome in liquid medium vs. during walnut infection. The k-means clustering of transcripts resulted in four distinct transcription patterns across the three sampling time points. Most of the detected CAZy transcripts had elevated transcription at 96 hpi, which is hypothetically concurrent with the start of intracellular growth. The in silico analysis revealed 103 candidate effectors, of which six were members of the Necrosis and Ethylene Inducing Like Protein (NLP) gene family belonging to three distinct k-means clusters. This study revealed a complex temporal pattern of CAZy and candidate effector transcription during the six days after O. leptostyla inoculation on walnut leaves, introducing a list of candidate virulence genes for validation in future studies.
Subjects
Ascomycota , Juglans , Transcriptome/genetics , Juglans/genetics , Virulence/genetics , Ascomycota/genetics
ABSTRACT
RNA-seq technology is widely employed in various research areas related to transcriptome analyses, and the identification of all the expressed transcripts from short sequencing reads presents a considerable computational challenge. In this study, we introduce TransRef, a new computational algorithm for accurate transcriptome assembly by redefining a novel graph model, the neo-splicing graph, and then iteratively applying a constrained dynamic programming to reconstruct all the expressed transcripts for each graph. When TransRef is utilized to analyze both real and simulated datasets, its performance is notably better than those of several state-of-the-art assemblers, including StringTie2, Cufflinks and Scallop. In particular, the performance of TransRef is notably strong in identifying novel transcripts and transcripts with low-expression levels, while the other assemblers are less effective.
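To make the idea of dynamic programming over a splicing graph concrete, the sketch below extracts the single heaviest path from a small directed acyclic graph of exons. It is a generic textbook illustration and not TransRef's neo-splicing graph or its constrained formulation, neither of which is described in the abstract; the function name, the input layout, the node ids and the coverage weights are all hypothetical.

```python
def heaviest_path(nodes, edges, weight):
    """Find the maximum-weight path in a DAG by dynamic programming.

    nodes  -- exon ids listed in topological order (assumed precondition)
    edges  -- dict mapping an exon id to the exon ids it can splice to
    weight -- dict mapping an exon id to its read coverage
    Returns (path, total_weight).
    """
    best = {n: weight[n] for n in nodes}   # best score of a path ending at n
    prev = {n: None for n in nodes}        # predecessor on that best path
    for n in nodes:                        # topological order makes best[n] final here
        for succ in edges.get(n, []):
            candidate = best[n] + weight[succ]
            if candidate > best[succ]:
                best[succ] = candidate
                prev[succ] = n
    end = max(nodes, key=lambda n: best[n])
    total = best[end]
    path = []
    while end is not None:                 # walk predecessors back to the start
        path.append(end)
        end = prev[end]
    return path[::-1], total


# Hypothetical four-exon gene with one alternative middle exon.
exons = ["e1", "e2", "e3", "e4"]
junctions = {"e1": ["e2", "e3"], "e2": ["e4"], "e3": ["e4"]}
coverage = {"e1": 10, "e2": 2, "e3": 8, "e4": 10}
print(heaviest_path(exons, junctions, coverage))
```

An assembler that aims to recover all expressed transcripts would repeat such an extraction many times under additional constraints; this sketch shows only the single-path core.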
Subjects
Algorithms , RNA Splicing , Transcriptome , Datasets as Topic , Genome , RNA, Messenger/genetics
ABSTRACT
Fusarium circinatum poses a threat to both commercial and natural pine forests. Large variation in host resistance exists between species, with many economically important species being susceptible. Development of resistant genotypes could be expedited and optimised by investigating the molecular mechanisms underlying host resistance and susceptibility as well as by increasing the available genetic resources. RNA-seq data from F. circinatum-inoculated and mock-inoculated shoot tissue (ca. 6 months old), sampled at 3 and 7 days post-inoculation, were generated for three commercially important tropical pines, Pinus oocarpa, Pinus maximinoi and Pinus greggii. De novo transcriptomes were assembled and used to investigate the NLR and PR gene content within available pine references. Host responses to F. circinatum challenge were investigated in P. oocarpa (resistant) and P. greggii (susceptible), in comparison to previously generated expression profiles from Pinus tecunumanii (resistant) and Pinus patula (susceptible). Expression results indicated that crosstalk between induced salicylate, jasmonate and ethylene signalling is involved in host resistance and compromised in susceptible hosts. Additionally, higher constitutive expression of sulfur metabolism and flavonoid biosynthesis in resistant hosts suggests involvement of these metabolites in resistance.
Subjects
Fusarium , Pinus , Transcriptome/genetics , Fusarium/physiology , Genotype , Pinus/genetics , Plant Diseases/genetics
ABSTRACT
Jack (Artocarpus heterophyllus) is a multipurpose fruit-tree species with minimal genomic resources. The study reports developing comprehensive transcriptome data containing 80,411 unigenes with an N50 value of 1265 bp. We predicted 64,215 CDSs from the unigenes and annotated and functionally categorized them into the biological process (23,230), molecular function (27,149), and cellular components (17,284). From 80,411 unigenes, we discovered 16,853 perfect SSRs with 192 distinct repeat motif types reiterating 4 to 22 times. Besides, we identified 2741 TFs from 69 TF families, 53 miRNAs from 19 conserved miRNA families, 25,953 potential lncRNAs, and placed three functional eTMs in different lncRNA-miRNA pairs. The regulatory networks involving genes, TFs, and miRNAs identified several regulatory and regulated nodes providing insight into miRNAs' gene associations and transcription factor-mediated regulation. The comparison of expression patterns of some selected miRNAs vis-à-vis their corresponding target genes showed an inverse relationship indicating the possible miRNA-mediated regulation of the genes.
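Perfect SSRs of the kind counted above are uninterrupted tandem repeats of a short motif, which can be located with a back-referencing regular expression. The sketch below is one simple way to do this and is not the tool used in the study; the motif length range (2 to 6 nt), the minimum of four repeats, the exclusion of mononucleotide runs and the example sequence are all assumptions chosen for illustration.

```python
import re


def find_perfect_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    A hit is a motif of min_motif..max_motif bases repeated in tandem at
    least min_repeats times with no interruption (0-based start position).
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # ([ACGT]{k}) captures the motif; \1{n,} requires n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # so each locus is reported once, under its shortest motif.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence: (AT)5 at position 3 and (AGC)4 at position 16.
example = "GGC" + "AT" * 5 + "CCG" + "AGC" * 4 + "T"
print(find_perfect_ssrs(example))
```

Dedicated SSR tools typically apply motif-specific repeat thresholds and also report compound repeats, which this sketch does not attempt.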
Subjects
Artocarpus , MicroRNAs , Humans , Transcriptome , Artocarpus/genetics , MicroRNAs/genetics , Gene Expression Regulation , Transcription Factors/genetics , Gene Expression Profiling , Molecular Sequence Annotation
ABSTRACT
Marsupenaeus japonicus is an important marine crustacean species. However, a lack of genomic resources hinders the use of whole genome sequencing to explore its genetic basis and molecular mechanisms for genome-assisted breeding. Here, we determined a chromosome-level genome assembly for M. japonicus from a total of 665.19 Gb of genomic sequencing data, yielding an approximately 1.54 Gb assembly with a contig N50 size of 229.97 kb and a scaffold N50 size of 38.27 Mb. With high-throughput chromosome conformation capture (Hi-C) technology, we anchored 18,019 contigs onto 42 pseudo-chromosomes, accounting for 99.40% of the total genome assembly. Analysis of the present M. japonicus genome revealed 24,317 protein-coding genes and a high proportion of repetitive sequences (61.56%). The high-quality genome assembly enabled the identification of genes associated with cold stress and cold tolerance in kuruma shrimp through the comparison of eyestalk transcriptomes between low-temperature-stressed shrimp (10 °C) and normal-temperature shrimp (28 °C). The genome assembly presented here could be useful in future studies to reveal the molecular mechanisms of the M. japonicus response to low-temperature stress and to support molecular-assisted breeding of M. japonicus for low temperatures.
Subjects
Genome , Genomics , Chromosomes/genetics , Repetitive Sequences, Nucleic Acid , Cold Temperature , Phylogeny
ABSTRACT
BACKGROUND: RNA-seq is being increasingly adopted for gene expression studies in a panoply of non-model organisms, with applications spanning the fields of agriculture, aquaculture, ecology, and environment. For organisms that lack a well-annotated reference genome or transcriptome, a conventional RNA-seq data analysis workflow requires constructing a de-novo transcriptome assembly and annotating it against a high-confidence protein database. The assembly serves as a reference for read mapping, and the annotation is necessary for functional analysis of genes found to be differentially expressed. However, assembly is computationally expensive. It is also prone to errors that impact expression analysis, especially since sequencing depth is typically much lower for expression studies than for transcript discovery. RESULTS: We propose a shortcut, in which we obtain counts for differential expression analysis by directly aligning RNA-seq reads to the high-confidence proteome that would have been otherwise used for annotation. By avoiding assembly, we drastically cut down computational costs - the running time on a typical dataset improves from the order of tens of hours to under half an hour, and the memory requirement is reduced from the order of tens of Gbytes to tens of Mbytes. We show through experiments on simulated and real data that our pipeline not only reduces computational costs, but has higher sensitivity and precision than a typical assembly-based pipeline. A Snakemake implementation of our workflow is available at: https://bitbucket.org/project_samar/samar . CONCLUSIONS: The flip side of RNA-seq becoming accessible to even modestly resourced labs has been that the time, labor, and infrastructure cost of bioinformatics analysis has become a bottleneck. Assembly is one such resource-hungry process, and we show here that it can be avoided for quick and easy, yet more sensitive and precise, differential gene expression analysis in non-model organisms.
Subjects
Gene Expression Profiling , Transcriptome , DNA , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , RNA-Seq , Sequence Analysis, RNA
ABSTRACT
In this study, we characterized the fatty acid production in Neochloris aquatica at transcriptomics and biochemical levels under limiting, normal, and excess nitrate concentrations in different growth phases. At the stationary phase, N. aquatica mainly produced saturated fatty acids such as stearic acid under the limiting nitrate concentration, which is suitable for biodiesel production. However, it produced polyunsaturated fatty acids such as α-linolenic acid under the excess nitrate concentration, which has nutritional values as food supplements. In addition, RNA-seq was employed to identify genes and pathways that were being affected in N. aquatica for three growth phases in the presence of the different nitrate amounts. Genes that are responsible for the production of saturated fatty acids were upregulated in the cells grown under a limiting nitrogen amount while genes that are responsible for the production of polyunsaturated fatty acid were upregulated in the cells grown under excess nitrogen amount. Further analysis showed more genes differentially expressed (DEGs) at the logarithmic phase in all conditions while a relatively steady trend was observed during the transition from the logarithmic phase to the stationary phase under limiting and excess nitrogen. Our results provide a foundation for identifying developmentally important genes and understanding the biological processes in the different growth phases of the N. aquatica in terms of biomass and lipid production.
Subjects
Fatty Acids , Transcriptome , Biomass , Fatty Acids/metabolism , Nitrates , Nitrogen/metabolism
ABSTRACT
BACKGROUND: The underutilized species Vigna aconitifolia (moth bean) is an important legume crop cultivated in semi-arid conditions and is valued for the high protein content of its seeds. It is also a popular green manure cover crop that offers many agronomic benefits, including nitrogen fixation and improved soil nutrients. Despite its economic potential, genomic resources for this crop are scarce and there is limited knowledge of the developmental process of this plant at a molecular level. In the present communication, we have studied the molecular mechanisms that regulate plant development in V. aconitifolia, with a special focus on flower and seed development. We believe that this study will greatly enrich the genomic resources for this plant in the form of differentially expressed genes, transcription factors, and genic molecular markers. RESULTS: We performed de novo transcriptome assembly using six types of tissue from various developmental stages of Vigna aconitifolia (var. RMO-435), namely leaves, roots, flowers, pods, and seed tissue in the early and late stages of development, using the Illumina NextSeq platform. We assembled the transcriptome to obtain 150938 unigenes with an average length of 937.78 bp. About 79.9% of these unigenes were annotated in public databases, and 12839 of those unigenes showed a significant match in the KEGG database. Most of the unigenes displayed significant differential expression in the late stages of seed development as compared with leaves. We annotated 74082 unigenes as transcription factors and identified 12096 simple sequence repeats (SSRs) in the genic regions of V. aconitifolia. Digital expression analysis revealed specific gene activities in different tissues, which were validated using real-time PCR analysis. CONCLUSIONS: The Vigna aconitifolia transcriptomic resources generated in this study provide foundational resources for gene discovery with respect to various developmental stages.
This study provides the first comprehensive analysis revealing the genes involved in molecular as well as metabolic pathways that regulate seed development and may be responsible for the unique nutritive values of moth bean seeds. Hence, this study would serve as a foundation for characterization of candidate genes which would not only provide novel insights into understanding seed development but also provide resources for improved moth bean and related species genetic enhancement.
Subjects
Vigna , Gene Expression Profiling , Gene Expression Regulation, Plant , Genetic Markers , High-Throughput Nucleotide Sequencing , Microsatellite Repeats/genetics , Molecular Sequence Annotation , Transcription Factors/genetics , Transcriptome , Vigna/genetics
ABSTRACT
BACKGROUND: In ipecac (Carapichea ipecacuanha (Brot.) L. Andersson), adventitious shoots can be induced simply by placing internodal segments on phytohormone-free culture medium. The shoots form locally on the epidermis of the apical region of the segments, but not the basal region. Levels of endogenous auxin and cytokinin transiently increase in the segments after 1 week of culture. RESULTS: Here, we conducted RNA-seq analysis to compare gene expression patterns in apical and basal regions of segments before culture and after 1 week of culture for adventitious shoot formation. The results revealed 8987 differentially expressed genes in a de novo assembly of 76,684 genes. Among them, 276 genes were upregulated in the apical region after 1 week of culture relative to before culture and the basal region after 1 week of culture. These genes include 18 phytohormone-response genes and shoot-formation-related genes. Validation of the gene expression by quantitative real-time PCR assay confirmed that the expression patterns were similar to those of the RNA-seq data. CONCLUSIONS: The transcriptome data show that expression of cytokinin biosynthesis genes is induced along with the acquisition of cellular pluripotency and the initiation of cell division by wounding in the apical region of internodal segments, that trigger adventitious shoot formation without callusing.
Subjects
Indoleacetic Acids , Ipecac , Cytokinins/metabolism , Gene Expression Profiling , Gene Expression Regulation, Plant , Indoleacetic Acids/metabolism , Ipecac/metabolism , Plant Growth Regulators/metabolism , Plant Shoots/genetics , Plant Shoots/metabolism
ABSTRACT
Microalgae are key ecological players with a complex evolutionary history. Genomic diversity, together with the limited availability of high-quality genomes, challenges studies that aim to elucidate the molecular mechanisms underlying microalgal ecophysiology. Here, we present a novel and comprehensive transcriptomic hybrid approach to generate a reference for genetic analyses and resolve the microalgal gene landscape at the strain level. The approach is demonstrated for a strain of the coccolithophore microalga Emiliania huxleyi, which is a species complex with considerable genome variability. The investigated strain is commonly studied as a model for algal-bacterial interactions and was therefore sequenced in the presence of bacteria to elicit the expression of interaction-relevant genes. We applied complementary PacBio Iso-Seq full-length cDNA and poly(A)-independent Illumina total RNA sequencing, which resulted in a de novo-assembled, near-complete hybrid transcriptome. In particular, hybrid sequencing improved the reconstruction of long transcripts and increased the recovery of full-length transcript isoforms. To use the resulting hybrid transcriptome as a reference for genetic analyses, we demonstrate a method that collapses the transcriptome into a genome-like data set, termed "synthetic genome" (sGenome). We used the sGenome as a reference to visually confirm the robustness of the CCMP3266 gene assembly, to conduct differential gene expression analysis, and to characterize novel E. huxleyi genes. The newly identified genes contribute to our understanding of E. huxleyi genome diversification and are predicted to play a role in microbial interactions. Our transcriptomic toolkit can be implemented in various microalgae to facilitate mechanistic studies on microalgal diversity and ecology.
IMPORTANCE Microalgae are key players in the ecology and biogeochemistry of our oceans. Efforts to implement genomic and transcriptomic tools in laboratory studies involving microalgae suffer from the lack of published genomes. In the case of coccolithophore microalgae, the problem has long been recognized; the model species Emiliania huxleyi is a species complex with genomes composed of a core and a large variable portion. To study the role of the variable portion in niche adaptation, and specifically in microbial interactions, strain-specific genetic information is required. Here, we present a novel transcriptomic hybrid approach and generate strain-specific genome-like information. We demonstrate our approach on an E. huxleyi strain that is cocultivated with bacteria. By constructing a "synthetic genome," we generated comprehensive gene annotations that enabled accurate analyses of gene expression patterns. Importantly, we unveiled novel genes in the variable portion of E. huxleyi that play putative roles in microbial interactions.
Subjects
Haptophyta , Genomics , Haptophyta/genetics , Haptophyta/metabolism , Molecular Sequence Annotation , Oceans and Seas , Transcriptome
Abstract
Understanding the molecular associations underlying pathogen resistance in invasive plant species is likely to provide useful insights into the effective control of alien plants, thereby facilitating the conservation of native biodiversity. In the current study, we investigated pathogen resistance in an invasive clonal plant, Sphagneticola trilobata, at the molecular level. Sphagneticola trilobata (i.e., Singapore daisy) is a noxious weed that affects both terrestrial and aquatic ecosystems and is less affected by pathogens in the wild than co-occurring native species. We used Illumina sequencing to investigate the transcriptome of S. trilobata following infection by a globally distributed generalist pathogen (Rhizoctonia solani). RNA was extracted from leaves of inoculated and un-inoculated control plants, and a draft transcriptome of S. trilobata was generated to examine the molecular response of this species following infection. We obtained a total of 49,961,014 (94.3%) clean reads for the control (un-inoculated plants) and 54,182,844 (94.5%) for the infected treatment (inoculated with R. solani). Our analyses facilitated the discovery of 117,768 de novo assembled contigs and 78,916 unigenes. Of these, we identified 3506 differentially expressed genes and 60 hormones associated with pathogen resistance. Numerous genes, including candidate genes, were associated with plant-pathogen interactions and stress response in S. trilobata. Many recognition, signaling, and defense genes were differentially regulated between treatments, which was confirmed by qRT-PCR. Overall, our findings improve our understanding of the genes and molecular associations involved in the defense of a rapidly spreading invasive clonal weed, and serve as a valuable resource for further work on the mechanisms of disease resistance and on managing invasive plants.