ABSTRACT
Long non-coding RNAs (lncRNAs) play essential roles in various biological processes, such as chromatin remodeling, post-transcriptional regulation, and epigenetic modifications. Despite their critical functions in regulating plant growth, root development, and seed dormancy, the identification of plant lncRNAs remains a challenge due to the scarcity of specific and extensively tested identification methods. Most mainstream machine learning-based methods used for plant lncRNA identification were initially developed using human or other animal datasets, and their accuracy and effectiveness in predicting plant lncRNAs have not been fully evaluated or exploited. To overcome this limitation, we retrained several models, including CPAT, PLEK, and LncFinder, using plant datasets and compared their performance with mainstream lncRNA prediction tools such as CPC2, CNCI, RNAplonc, and LncADeep. Retraining these models significantly improved their performance, and two of the retrained models, LncFinder-plant and CPAT-plant, alongside their ensemble, emerged as the most suitable tools for plant lncRNA identification. This underscores the importance of model retraining in tackling the challenges associated with plant lncRNA identification. Finally, we developed a pipeline (Plant-LncPipe) that incorporates an ensemble of the two best-performing models and covers the entire data analysis process, including read mapping, transcript assembly, lncRNA identification, classification, and origin analysis, for the efficient identification of lncRNAs in plants. Plant-LncPipe is available at: https://github.com/xuechantian/Plant-LncRNA-pipline.
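The abstract does not say how the two retrained models are combined. As a minimal sketch of one plausible ensemble rule (an assumption, not the documented behaviour of Plant-LncPipe), a transcript can be kept as a candidate lncRNA only when both models score it as non-coding. The function name, the dictionary inputs, and the 0.5 threshold are all hypothetical; real tools use their own output formats and cutoffs.

```python
from typing import Dict, Set


def ensemble_lncrna_calls(
    cpat_coding_prob: Dict[str, float],
    lncfinder_coding_prob: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> Set[str]:
    """Return transcript IDs that BOTH models score below the coding threshold.

    Inputs map transcript ID -> predicted coding probability (assumed format).
    Transcripts scored by only one model are ignored.
    """
    shared_ids = cpat_coding_prob.keys() & lncfinder_coding_prob.keys()
    return {
        tid
        for tid in shared_ids
        if cpat_coding_prob[tid] < threshold
        and lncfinder_coding_prob[tid] < threshold
    }


if __name__ == "__main__":
    # Made-up scores for illustration only.
    cpat = {"tx1": 0.10, "tx2": 0.80, "tx3": 0.30}
    lncfinder = {"tx1": 0.20, "tx2": 0.10, "tx3": 0.90}
    print(sorted(ensemble_lncrna_calls(cpat, lncfinder)))  # → ['tx1']
```

An intersection rule of this kind trades sensitivity for precision; averaging the two probabilities would be an equally reasonable alternative.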
ABSTRACT
Poplar (Populus) is a well-established model system for tree genomics and molecular breeding, and hybrid poplar is widely used in forest plantations. However, distinguishing its diploid homologous chromosomes is difficult, complicating advanced functional studies on specific alleles. In this study, we applied a trio-binning design and PacBio high-fidelity long-read sequencing to obtain haplotype-phased telomere-to-telomere genome assemblies for the 2 parents of the well-studied F1 hybrid "84K" (Populus alba × Populus tremula var. glandulosa). Almost all chromosomes, including the telomeres and centromeres, were completely assembled for each haplotype subgenome apart from 2 small gaps on one chromosome. By incorporating information from these haplotype assemblies and extensive RNA-seq data, we analyzed gene expression patterns between the 2 subgenomes and alleles. No transcription bias was detected at the subgenome level, but extensive expression differences were detected between alleles. We developed machine-learning (ML) models to predict allele-specific expression (ASE) with high accuracy and identified the underlying genome features that most strongly influence ASE. One of our models with 15 predictor variables achieved 77% accuracy on the training set and 74% accuracy on the testing set. ML models identified gene body CHG methylation, sequence divergence, and transposon occupancy both upstream and downstream of alleles as important factors for ASE. Our haplotype-phased genome assemblies and ML strategy highlight an avenue for functional studies in Populus and provide additional tools for studying ASE and heterosis in hybrids.
Subjects
Alleles , Plant Genome , Populus , Populus/genetics , Plant Genome/genetics , Plant Gene Expression Regulation , Haplotypes/genetics , Genetic Hybridization , Machine Learning
ABSTRACT
Terpenes and terpenoids are key natural compounds for plant defense, development, and composition of plant oil. The synthesis and accumulation of a myriad of volatile terpenoid compounds in oil-producing plants may dramatically alter the quality and flavor of the oils, which gives these plants great commercial value. Terpene synthases (TPSs) are important enzymes responsible for terpene diversity. Investigating the differentiation of the TPS gene family could provide valuable theoretical support for the genetic improvement of oil-producing plants. While the origin and function of TPS genes have been extensively studied, whether the initial gene fusion event occurred in plants or microbes remains uncertain. Furthermore, a comprehensive exploration of TPS gene differentiation is still pending. Here, phylogenetic analysis revealed that the fusion of the TPS gene likely occurred in the ancestor of land plants, following the acquisition of individual C- and N-terminal domains. Potential mutual transfer of TPS genes was observed among microbes and plants. Gene synteny analysis disclosed a differential divergence pattern between the TPS-c and TPS-e/f subfamilies involved in primary metabolism and those (TPS-a/b/d/g/h subfamilies) crucial for secondary metabolites. Biosynthetic gene cluster (BGC) analysis suggested a correlation between lineage divergence and potential natural selection in structuring terpene diversity. This study provides fresh perspectives on the origin and evolution of the TPS gene family.
ABSTRACT
Coriaria nepalensis Wall. (Coriariaceae) is a nitrogen-fixing shrub which forms root nodules with the actinomycete Frankia. Oils and extracts of C. nepalensis have been reported to be bacteriostatic and insecticidal, and C. nepalensis bark provides a valuable tannin resource. Here, by combining PacBio HiFi sequencing and Hi-C scaffolding techniques, we generated a haplotype-resolved chromosome-scale genome assembly for C. nepalensis. This genome assembly is approximately 620 Mb in size with a contig N50 of 11 Mb, with 99.9% of the total assembled sequences anchored to 40 pseudochromosomes. We predicted 60,862 protein-coding genes of which 99.5% were annotated from databases. We further identified 939 tRNAs, 7,297 rRNAs, and 982 ncRNAs. The chromosome-scale genome of C. nepalensis is expected to be a significant resource for understanding the genetic basis of root nodulation with Frankia, toxicity, and tannin biosynthesis.
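The contig N50 quoted above is a standard contiguity metric: the length L such that contigs of length L or longer together contain at least half of the assembled bases. The sketch below computes it from a list of contig lengths; the function name and the example lengths are illustrative and unrelated to the C. nepalensis assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50 is
    the length at which the running sum first reaches half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for safety


if __name__ == "__main__":
    # Toy lengths: total = 32, half = 16, and 8 + 8 reaches 16.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # → 8
```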
Subjects
Plant Genome , Magnoliopsida , Haplotypes , Magnoliopsida/genetics , Molecular Sequence Annotation , Phylogeny , Plant Chromosomes
ABSTRACT
Wood decay resistance (WDR) is a key determinant of the value of wood. Many trees of the Lauraceae have exceptional WDR, as evidenced by their use in ancient royal palace buildings in China. However, the genetics of WDR remain elusive. Here, through comparative genomics, we reveal the unique characteristics related to the high WDR of Lauraceae trees. We present a 1.27-Gb chromosome-level assembly for Lindera megaphylla (Lauraceae). Comparative genomics integrating major groups of angiosperms revealed that Lauraceae species extensively share gene microsynteny associated with the biosynthesis of specialized metabolites such as isoquinoline alkaloids, flavonoids, lignins and terpenoids, which play significant roles in WDR. In Lauraceae genomes, tandem and proximal duplications (TD/PD) significantly expanded the coding space of key enzymes of biosynthesis pathways related to WDR, which may enhance the decay resistance of wood by increasing the accumulation of these compounds. Among Lauraceae species, genes of WDR-related biosynthesis pathways showed remarkable expansion by TD/PD and carried unique and conserved motifs in their promoter and protein sequences, suggesting that conserved gene collinearity, gene expansion and gene regulation support the high WDR. Our study thus reveals genomic profiles related to biochemical transitions among major plant groups and the genomic basis of WDR in the Lauraceae.
ABSTRACT
Quercus dentata Thunb., a dominant forest tree species in northern China, has significant ecological and ornamental value due to its adaptability and beautiful autumn coloration, with color changes from green to yellow into red resulting from the autumnal shifts in leaf pigmentation. However, the key genes and molecular regulatory mechanisms for leaf color transition remain to be investigated. First, we presented a high-quality chromosome-scale assembly for Q. dentata. This 893.54 Mb sized genome (contig N50 = 4.21 Mb, scaffold N50 = 75.55 Mb; 2n = 24) harbors 31 584 protein-coding genes. Second, our metabolome analyses uncovered pelargonidin-3-O-glucoside, cyanidin-3-O-arabinoside, and cyanidin-3-O-glucoside as the main pigments involved in leaf color transition. Third, gene co-expression further identified the MYB-bHLH-WD40 (MBW) transcription activation complex as central to anthocyanin biosynthesis regulation. Notably, transcription factor (TF) QdNAC (QD08G038820) was highly co-expressed with this MBW complex and may regulate anthocyanin accumulation and chlorophyll degradation during leaf senescence through direct interaction with another TF, QdMYB (QD01G020890), as revealed by our further protein-protein and DNA-protein interaction assays. Our high-quality genome assembly, metabolome, and transcriptome resources further enrich Quercus genomics and will facilitate upcoming exploration of ornamental values and environmental adaptability in this important genus.
Subjects
Anthocyanins , Quercus , Anthocyanins/metabolism , Quercus/genetics , Quercus/metabolism , Gene Expression Profiling/methods , Plant Gene Expression Regulation , Transcriptome/genetics , Transcription Factors/metabolism , Metabolome , Pigmentation/genetics , Chromosomes , Glucosides , Color
ABSTRACT
The genus Rhododendron (Ericaceae), with more than 1000 species highly diverse in flower color, provides distinct ornamental value and a model system for flower color studies. Here, we investigated the divergence between two parental species with different flower colors that are widely used for azalea breeding. A gapless genome assembly was generated for the yellow-flowered azalea, Rhododendron molle. Comparative genomics found that recent proliferation of long terminal repeat retrotransposons (LTR-RTs), especially Gypsy, has resulted in a 125 Mb (19%) genome size increase in species-specific regions, and a significant number of dispersed gene duplicates (13 402) and pseudogenes (17 437). Metabolomic assessment revealed that yellow flower coloration is attributed to dynamic changes in carotenoid/flavonol biosynthesis and chlorophyll degradation. Time-ordered gene co-expression networks (TO-GCNs) and their comparison confirmed the metabolome results and uncovered the specific gene regulatory changes underpinning the distinct flower pigmentation. B3 and ERF TFs were found to dominate the gene regulation of carotenoid/flavonol-characterized pigmentation in R. molle, while WRKY, ERF, WD40, C2H2, and NAC TFs collectively regulated the anthocyanin-characterized pigmentation in the red-flowered R. simsii. This study employed a multi-omics strategy to disentangle the complex divergence between two important azaleas and provides references for further functional genetics and molecular breeding.
ABSTRACT
BACKGROUND: We assessed the safety, immunogenicity and antibody persistence of two- and three-dose schedules of the novel bivalent HPV16/18 vaccine (HPV-2, Walrinvax) in the per-protocol target population of initially seronegative 9- to 14-year-old girls, including a non-inferiority comparison with the three-dose schedule in 18- to 26-year-old women. METHODS: This randomized phase 3b trial in Guangxi Zhuang Autonomous Region, China, involved healthy Chinese females in two age cohorts: 600 girls aged 9-14 years and 300 women aged 18-26 years. Girls were randomly assigned (1:1) to receive either two (Months 0, 6) or three (Months 0, 2, 6) intramuscular doses of HPV-2. All participants were monitored for immunogenicity, measured as neutralizing antibodies, up to 36 months. Primary objectives were non-inferiority analyses of immunogenicity between the two- and three-dose girl groups and adult women at Month 7; safety assessments were based on participant-completed diary cards. RESULTS: All groups demonstrated marked increases in neutralizing antibodies against HPV 16 and 18 that persisted above baseline to 36 months. Month 7 responses in both girl groups were non-inferior to those in the women and were statistically higher after two doses than in girls or women who received three doses. GMTs waned after Month 7, but then maintained a plateau level until Month 36. Vaccination was well tolerated in all groups with no serious adverse events reported. CONCLUSIONS: Immune responses to two doses of HPV-2 vaccine in adolescent girls were non-inferior to those after three doses in young women, an age cohort in which clinical efficacy of HPV-2 against cervical cancer has been demonstrated.
ABSTRACT
Xanthoceras sorbifolium (yellowhorn) is a woody oil plant with strong stress resistance and excellent oil characteristics. Yellowhorn oil can be used as biofuel and as edible oil with high nutritional and medicinal value. However, genetic studies on yellowhorn are just beginning, and fundamental biological questions regarding its very long-chain fatty acid (VLCFA) biosynthesis pathway remain largely unanswered. In this study, we reconstructed the VLCFA biosynthesis pathway and annotated 137 genes encoding relevant enzymes. We identified four oleosin genes that package triacylglycerols (TAGs) and are specifically expressed in fruits, likely playing key roles in yellowhorn oil production. In particular, by examining a time-ordered gene co-expression network (TO-GCN) constructed from fruit and leaf development, we identified key enzymatic genes and potential regulatory transcription factors involved in VLCFA synthesis. In fruits, we further inferred a hierarchical regulatory network with MYB-related (XS03G0296800) and B3 (XS02G0057600) transcription factors as top-tier regulators, providing clues to the factors controlling carbon flux into fatty acids. Our results offer new insights into key genes and transcriptional regulators governing fatty acid production in yellowhorn, laying the foundation for efforts to optimize oil content and fatty acid composition. Moreover, the gene expression patterns and putative regulatory relationships identified here will inform metabolic engineering and molecular breeding approaches tailored to meet biofuel and bioproduct demands.
ABSTRACT
In-depth genome characterization is still lacking for most biofuel crops, especially for centromeres, which play a fundamental role during nuclear division and in the maintenance of genome stability. This study applied long-read sequencing technologies to assemble a highly contiguous genome for yellowhorn (Xanthoceras sorbifolium), an oil-producing tree, and conducted extensive comparative analyses to understand centromere structure and evolution, and fatty acid biosynthesis. We produced a reference-level genome of yellowhorn, ~470 Mb in length with ~95% of contigs anchored onto 15 chromosomes. Genome annotation identified 22,049 protein-coding genes and 65.7% of the genome sequence as repetitive elements. Long terminal repeat retrotransposons (LTR-RTs) account for ~30% of the yellowhorn genome, which is maintained by a moderate birth rate and a low removal rate. We identified the centromeric regions on each chromosome and found enrichment of centromere-specific retrotransposons of LINE1 and Gypsy in these regions, which have evolved recently (~0.7 MYA). We compared the genomes of three cultivars and found frequent inversions. We analyzed the transcriptomes from different tissues and identified the candidate genes involved in very-long-chain fatty acid biosynthesis and their expression profiles. Collinear block analysis showed that yellowhorn shared the gamma (γ) hexaploidy event with Vitis vinifera but did not undergo any further whole-genome duplication. This study provides excellent genomic resources for understanding centromere structure and evolution and for functional studies in this important oil-producing plant.
ABSTRACT
Polyploidization plays a key role in plant evolution, but the forces driving the fate of homoeologs in polyploid genomes, i.e., paralogs resulting from a whole-genome duplication (WGD) event, remain to be elucidated. Here, we present a chromosome-scale genome assembly of tetraploid scarlet sage (Salvia splendens), one of the most diverse ornamental plants. We found evidence for three WGD events following an older WGD event shared by most eudicots (the γ event). A comprehensive, spatiotemporal, genome-wide analysis of homoeologs from the most recent WGD unveiled expression asymmetries, which could be associated with genomic rearrangements, transposable element proximity discrepancies, coding sequence variation, selection pressure, and transcription factor binding site differences. The observed differences between homoeologs may reflect the first step toward sub- and/or neofunctionalization. This assembly provides a powerful tool for understanding WGD and gene and genome evolution and is useful in developing functional genomics and genetic engineering strategies for scarlet sage and other Lamiaceae species.
ABSTRACT
Ginger (Zingiber officinale) is one of the most valued spice plants worldwide; it is prized for its culinary and folk medicinal applications and is therefore of high economic and cultural importance. Here, we present a haplotype-resolved, chromosome-scale assembly for diploid ginger anchored to 11 pseudochromosome pairs with a total length of 3.1 Gb. Remarkable structural variation was identified between haplotypes, and two inversions larger than 15 Mb on chromosome 4 may be associated with ginger infertility. We performed a comprehensive, spatiotemporal, genome-wide analysis of allelic expression patterns, revealing that most alleles are coordinately expressed. The alleles that exhibited the largest differences in expression showed closer proximity to transposable elements, greater coding sequence divergence, more relaxed selection pressure, and more transcription factor binding site differences. We also predicted the transcription factors potentially regulating 6-gingerol biosynthesis. Our allele-aware assembly provides a powerful platform for future functional genomics, molecular breeding, and genome editing in ginger.
ABSTRACT
LTR retrotransposons (LTR-RTs) are ubiquitous and represent the dominant repeat element in plant genomes, playing important roles in functional variation, genome plasticity and evolution. With the advent of new sequencing technologies, a growing number of whole-genome sequences have been made publicly available, making it possible to carry out systematic analyses of LTR-RTs. However, a comprehensive and unified annotation of LTR-RTs in plant groups is still lacking. Here, we constructed a plant intact LTR-RTs dataset, which is designed to classify and annotate intact LTR-RTs with a standardized procedure. The dataset currently comprises a total of 2,593,685 intact LTR-RTs from genomes of 300 plant species representing 93 families of 46 orders. The dataset is accompanied by sequence, diverse structural and functional annotation, age determination and classification information associated with the LTR-RTs. This dataset will contribute valuable resources for investigating the evolutionary dynamics and functional implications of LTR-RTs in plant genomes.
Subjects
Plant Genome , Plants/genetics , Retroelements , Terminal Repeat Sequences , Molecular Evolution , Molecular Sequence Annotation
ABSTRACT
This paper aims to develop a method for the determination of aloe-emodin, rhein, chrysophanol and physcion and to study the pharmacokinetic properties of these four anthraquinones in rat plasma after oral administration of gardenia and rhubarb decoction. The plasma concentrations of the four anthraquinones at different time points were determined by an HPLC-FLD method. Plasma samples were prepared by liquid-liquid extraction and separated on a C18 column (4.6 mm × 150 mm, 5 µm), using 0.2% acetic acid and methanol as the mobile phase at a flow rate of 1.0 mL·min⁻¹ with gradient elution. The excitation and emission wavelengths were set at 430 and 525 nm, respectively. DAS 2.0 software was used to calculate the pharmacokinetic parameters. The results showed that all four anthraquinones were absorbed. The main parameters of aloe-emodin, rhein, chrysophanol and physcion, respectively, were as follows: Cmax was (0.085 ± 0.058), (3.772 ± 1.152), (0.464 ± 0.267) and (0.028 ± 0.008) mg·L⁻¹; tmax was (1.042 ± 0.510), (0.805 ± 0.307), (1.167 ± 0.283) and (0.616 ± 0.162) h; t½ was (3.557 ± 1.250), (6.879 ± 1.126), (5.196 ± 2.032) and (4.337 ± 1.816) h; AUC(0-t) was (0.504 ± 0.130), (9.558 ± 1.106), (2.545 ± 1.554) and (0.052 ± 0.018) mg·h·L⁻¹. The method developed is selective, accurate and sensitive for the simultaneous determination of the four anthraquinones in rat plasma.
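The parameters reported above (Cmax, tmax, AUC(0-t)) can be read directly from a concentration-time profile. The sketch below shows the usual non-compartmental calculation, with AUC obtained by the linear trapezoidal rule. It is an illustration only: the abstract states that DAS 2.0 was used but does not describe its settings, and the function name and sample values here are hypothetical.

```python
from typing import Sequence, Tuple


def pk_summary(
    times_h: Sequence[float], conc_mg_per_l: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (Cmax, tmax, AUC(0-t)) for one concentration-time profile.

    Cmax is the highest observed concentration, tmax the time of its first
    occurrence, and AUC(0-t) the linear-trapezoid area under the curve from
    the first to the last sampling time.
    """
    if len(times_h) != len(conc_mg_per_l) or len(times_h) < 2:
        raise ValueError("need at least two paired time/concentration values")
    c_max = max(conc_mg_per_l)
    t_max = times_h[list(conc_mg_per_l).index(c_max)]
    auc = 0.0
    points = list(zip(times_h, conc_mg_per_l))
    for (t0, c0), (t1, c1) in zip(points, points[1:]):
        if t1 <= t0:
            raise ValueError("sampling times must be strictly increasing")
        auc += (t1 - t0) * (c0 + c1) / 2.0  # area of one trapezoid
    return c_max, t_max, auc


if __name__ == "__main__":
    # Made-up profile: time in h, concentration in mg/L.
    print(pk_summary([0.0, 1.0, 2.0, 4.0], [0.0, 2.0, 1.0, 0.0]))
    # → (2.0, 1.0, 3.5)
```

The half-life t½ would additionally require a log-linear fit to the terminal phase, which is omitted here.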
Subjects
Anthraquinones/blood , High Pressure Liquid Chromatography/methods , Chinese Herbal Drugs/analysis , Animals , Anthraquinones/pharmacokinetics , High Pressure Liquid Chromatography/instrumentation , Chinese Herbal Drugs/pharmacokinetics , Male , Rats , Sprague-Dawley Rats
ABSTRACT
A field experiment with 10 wheat cultivars was conducted to study water consumption characteristics at different growth stages and differences in grain yield among the cultivars. Three irrigation treatments were applied, i.e., no irrigation (W0), irrigation before sowing and at the jointing stage (W1), and irrigation before sowing and at the jointing and anthesis stages (W2), with 60 mm of water applied each time. Based on cluster analysis of grain yield and water use efficiency (WUE) in the three treatments, the ten cultivars tested could be divided into three groups, i.e., high yield and high WUE (Group I), high yield and medium WUE (Group II), and medium yield and low WUE (Group III). The average values of grain yield and WUE in each group were calculated to elucidate the water consumption characteristics of the three groups. In treatment W0, the total water consumption over the whole growth period, and the water consumption from anthesis to maturity and its proportion of total water consumption, were lower in Group I than in Group II and Group III, but the grain yield of Group I was the highest. In treatment W1, the water consumption from jointing to anthesis and its proportion of total water consumption were lower in Group I than in Group II and Group III, but the water consumption from anthesis to maturity did not differ significantly among the three groups. In treatment W2, the total soil water consumption, and the water consumption from jointing to anthesis and its proportion of total water consumption, were lower in Group I than in Group II and Group III, while the water consumption from anthesis to maturity and its proportion of total water consumption were lower in both Group I and Group III than in Group II. In terms of high yield and water saving under the present experimental conditions, the results suggest that the most appropriate cultivars are those in Group I, with high yield and high WUE, and that the most appropriate irrigation regime, combining high yield with low water consumption, is treatment W1, i.e., 60 mm of irrigation each time before sowing and at the jointing stage.
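The comparisons above rest on two simple quantities: the share of total water consumption that falls in each growth stage, and WUE. The abstract does not define WUE; the sketch below assumes the common agronomic definition, grain yield divided by total water consumption over the growth period. The function name, units, and example figures are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple


def water_use_summary(
    grain_yield_kg_ha: float, stage_consumption_mm: Dict[str, float]
) -> Tuple[float, Dict[str, float], float]:
    """Return (total consumption, per-stage proportions, WUE).

    WUE is computed as grain yield / total water consumption
    (kg ha^-1 mm^-1), which is an assumed definition.
    """
    total_mm = sum(stage_consumption_mm.values())
    if total_mm <= 0:
        raise ValueError("total water consumption must be positive")
    proportions = {
        stage: mm / total_mm for stage, mm in stage_consumption_mm.items()
    }
    wue = grain_yield_kg_ha / total_mm
    return total_mm, proportions, wue


if __name__ == "__main__":
    # Made-up seasonal figures for one cultivar group.
    stages = {
        "sowing-jointing": 150.0,
        "jointing-anthesis": 100.0,
        "anthesis-maturity": 150.0,
    }
    total, shares, wue = water_use_summary(8000.0, stages)
    print(total, shares["jointing-anthesis"], wue)  # → 400.0 0.25 20.0
```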
Subjects
Agricultural Irrigation/methods , Biomass , Triticum/classification , Triticum/growth & development , Water/metabolism , China , Edible Grain/growth & development , Triticum/metabolism
ABSTRACT
OBJECTIVE: To establish 2-dimensional electrophoresis (2-DE) maps of Helicobacter pylori in human gastritis and gastric cancer, to identify the differentially expressed proteins, and to discuss the role of bacterial factors in pathogenesis. METHODS: The total proteins of Helicobacter pylori in human gastritis and gastric cancer were separated by immobilized pH gradient-based 2-DE. The differentially expressed proteins were screened with PDQuest analysis software, identified by peptide mass fingerprinting based on matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, and searched against a database. RESULTS: A well-resolved and reproducible 2-DE pattern of Helicobacter pylori was obtained from patients with gastritis and gastric cancer. Fourteen differentially expressed proteins were identified, including proteins related to anti-oxidation, molecular chaperones and detoxification, enzymes related to metabolism, proteins related to cytoarchitecture, and proteins related to signal transduction. CONCLUSION: A well-resolved and reproducible 2-DE pattern of Helicobacter pylori in human gastritis and gastric cancer was established, and proteins differentially expressed between these 2 diseases were identified. The differences in protein expression may play an important role in the pathogenesis of gastric cancer.
Subjects
Bacterial Proteins/analysis , Helicobacter Infections/microbiology , Helicobacter pylori , Proteomics/methods , Stomach Neoplasms/microbiology , Two-Dimensional Gel Electrophoresis , Female , Gastritis/microbiology , Humans , Male , Proteome/analysis
ABSTRACT
OBJECTIVE: To determine the factors influencing perinatal transmission of hepatitis B virus (HBV) and to provide scientific evidence for the prevention of perinatal transmission of HBV. METHODS: A 1:1 matched nested case-control study was conducted, in which 141 pairs of HBsAg-positive pregnant women and their newborns were enrolled. A questionnaire was administered and blood indicators were measured. The data were analyzed by univariate analysis and conditional logistic regression using SPSS 13.0 and SAS 8.1. RESULTS: The univariate paired chi-square test showed that education, first-degree family history, liver dysfunction, serum glutamic-pyruvic transaminase, systematic treatment, intrahepatic cholestasis of pregnancy (ICP), fetal distress, and administration of hepatitis B immune globulin (HBIG) were factors associated with perinatal transmission of HBV. Conditional logistic regression analysis indicated that first-degree family history, HBIG administration, systematic treatment, and ICP were factors associated with perinatal transmission of HBV. CONCLUSION: For HBsAg-positive women, active treatment, standard administration of HBIG, and preventing and controlling ICP may reduce the incidence of perinatal transmission of HBV.