ABSTRACT
BACKGROUND: Previous studies of plant long noncoding RNAs (lncRNAs) lacked consistency because of factors such as heterogeneous data sources and experimental protocols, different plant tissues, and inconsistent bioinformatics pipelines. For example, sequencing only RNAs with poly(A) tails excluded the large portion of lncRNAs that lack them, and regular RNA sequencing did not distinguish the direction of lncRNA transcripts. The current study was designed to systematically discover and analyze lncRNAs across eight evolutionarily representative plant species using strand-specific (directional), whole-transcriptome (RiboMinus) sequencing.
RESULTS: A total of 39,945 lncRNAs (25,350 lincRNAs and 14,595 lncNATs) were identified. Their molecular features were consistent across divergent plant species but differed from those of mRNAs. Further, transposable elements (TEs) were found to play key roles in the origination of lncRNAs: a significantly large number of lncRNAs contained TEs in their gene bodies and promoter regions, and transcription of many lncRNAs was driven by TE promoters. The lncRNA sequences were divergent even between closely related species, and most plant lncRNAs were genus- or species-specific, amid rapid evolutionary turnover. Evaluated with PhastCons scores, plant lncRNAs showed a conservation level similar to that of intergenic sequences, suggesting that most lincRNAs are evolutionarily young. INDUCED BY PHOSPHATE STARVATION (IPS) is so far the only plant lncRNA group found to have conserved motifs, which may play important roles in adaptation to terrestrial life during the transition from aquatic to terrestrial environments. Most highly and specifically expressed lncRNAs formed co-expression networks with coding genes, and their functions are believed to be closely related to those of their co-expressed genes.
CONCLUSION: The study revealed novel features and complexity of lncRNAs in plants through systematic analysis, providing important insights into the origination and evolution of plant lncRNAs.
Subjects
Long Noncoding RNA , Computational Biology/methods , DNA Transposable Elements , Long Noncoding RNA/genetics , Messenger RNA , Plant RNA/genetics , RNA Sequence Analysis , Transcriptome , Exome Sequencing
ABSTRACT
BACKGROUND: Chrysanthemum seticuspe has emerged as a model species for cultivated chrysanthemums, especially for studies involving the diploid, self-compatible pure line Gojo-0. Its genome has been sequenced and assembled into chromosomes. However, the genome annotation of C. seticuspe still needs to be improved to elucidate the complex regulatory networks in this species.
RESULTS: In addition to the 74,259 mRNAs annotated in the C. seticuspe genome, we identified 18,265 novel mRNAs, 51,425 novel lncRNAs, 501 novel miRNAs and 22,065 novel siRNAs. Two C-class genes and YABBY family genes were highly expressed in disc florets, while B-class genes were highly expressed in ray florets. Weighted gene co-expression network analysis (WGCNA) was performed to identify hub lncRNAs and mRNAs in ray floret- and disc floret-specific modules; CDM19, BBX22, HTH, HSP70 and several lncRNAs were identified. ceRNA and lncNAT networks related to flower development were also constructed, revealing a potentially functional lncNAT-mRNA pair, LXLOC_026470 and MIF2.
CONCLUSIONS: The annotations of mRNAs, lncRNAs and small RNAs in the C. seticuspe genome have been improved. The expression profiles of flower development-related genes, ceRNA networks and lncNAT networks were identified, laying a foundation for elucidating the regulatory mechanisms underlying disc floret and ray floret formation.