Results 1 - 20 of 527
1.
Sci Data ; 11(1): 161, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38307894

ABSTRACT

Anisodus tanguticus is a medicinal herb that belongs to the Anisodus genus of the Solanaceae family. This endangered herb is mainly distributed on the Qinghai-Tibet Plateau. In this study, we combined the Illumina short-read, Nanopore long-read and high-throughput chromosome conformation capture (Hi-C) sequencing technologies to de novo assemble the A. tanguticus genome. A high-quality chromosomal-level genome assembly was obtained with a genome size of 1.26 Gb and a contig N50 of 25.07 Mb. Of the draft genome sequences, 97.47% were anchored to 24 pseudochromosomes with a scaffold N50 of 51.28 Mb. In addition, 842.14 Mb of transposable elements occupying 66.70% of the genome assembly were identified and 44,252 protein-coding genes were predicted. The genome assembly of A. tanguticus will provide a genetic repertoire for understanding the adaptation strategy of Anisodus species on the plateau, which will further promote the conservation of endangered A. tanguticus resources.


Subject(s)
Genome, Plant , Plants, Medicinal , Solanaceae , Molecular Sequence Annotation , Phylogeny , Plants, Medicinal/genetics , Solanaceae/genetics , Tibet , Chromosomes, Plant
2.
Genes Genomics ; 46(2): 187-202, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38240922

ABSTRACT

BACKGROUND: Persicaria maackiana (Regel) is a potential medicinal plant that exerts anti-diabetic effects. However, the lack of genomic information on P. maackiana hinders research at the molecular level. OBJECTIVE: Herein, we aimed to construct a draft genome assembly and obtain comprehensive genomic information on P. maackiana using high-throughput sequencing tools PacBio Sequel II and Illumina. METHODS: Persicaria maackiana samples from three natural populations in Gaecheon, Gichi, and Uiryeong reservoirs in South Korea were used to generate genomic DNA libraries, perform genome de novo assembly, gene ontology analysis, phylogenetic tree analysis, genotyping, and identify microsatellite markers. RESULTS: The assembled P. maackiana genome yielded 32,179 contigs. Assessment of assembly integrity revealed 1503 (93.12%) complete Benchmarking Universal Single-Copy Orthologs. A total of 64,712 protein-coding genes were predicted and annotated successfully in the protein database. In the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologs, 13,778 genes were annotated into 18 categories. Genes that activated AMPK were identified in the KEGG pathway. A total of 316,992 microsatellite loci were identified, and primers targeting the flanking regions were developed for 292,059 microsatellite loci. Of these, 150 primer sets were randomly selected for amplification, and 30 of these primer sets were identified as polymorphic. These primers amplified 3-9 alleles. The mean observed and expected heterozygosity were 0.189 and 0.593, respectively. Polymorphism information content values of the markers were 0.361-0.754. CONCLUSION: Collectively, our study provides a valuable resource for future comparative genomics, phylogeny, and population studies of P. maackiana.


Subject(s)
Polygonaceae , Molecular Sequence Annotation , Phylogeny , Polygonaceae/genetics , Genomics , Microsatellite Repeats/genetics
3.
Nucleic Acids Res ; 52(D1): D1347-D1354, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37870445

ABSTRACT

Medicinal plants have garnered significant attention in ethnomedicine and traditional medicine due to their potential antitumor, anti-inflammatory and antioxidant properties. Recent advancements in genome sequencing and synthetic biology have revitalized interest in natural products. Despite the availability of sequenced genomes and transcriptomes of these plants, the absence of publicly accessible gene annotations and tabular formatted gene expression data has hindered their effective utilization. To address this pressing issue, we have developed IMP (Integrated Medicinal Plantomics), a freely accessible platform at https://www.bic.ac.cn/IMP. IMP curated a total of 8 565 672 genes for 84 high-quality genome assemblies, and 2156 transcriptome sequencing samples encompassing various organs, tissues, developmental stages and stimulations. With the 10 integrated analysis modules, users can easily examine gene annotations, sequences, functions, distributions and expressions in IMP in a one-stop mode. We firmly believe that IMP will play a vital role in enhancing the understanding of molecular metabolic pathways in medicinal plants or plants with medicinal benefits, thereby driving advancements in synthetic biology, and facilitating the exploration of natural sources of valuable chemical constituents for drug discovery and drug production.


Subject(s)
Plants, Medicinal , Software , Transcriptome , Chromosome Mapping , Genomics , Molecular Sequence Annotation , Plants, Medicinal/genetics , Plants, Medicinal/chemistry
4.
Sci Data ; 10(1): 873, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38057329

ABSTRACT

Lithocarpus, with >320 species, is the second largest genus of Fagaceae. However, the lack of a reference genome limits the molecular biology and functional study of Lithocarpus species. Here, we report the chromosome-scale genome assembly of sweet tea (Lithocarpus polystachyus Rehder), the first Lithocarpus species to be sequenced to date. Sweet tea has a 952-Mb genome, with a 21.4-Mb contig N50 value and 98.6% complete BUSCO score. In addition, the per-base consensus accuracy and completeness of the genome were estimated at 60.6 and 81.4, respectively. Genome annotation predicted 37,396 protein-coding genes, with repetitive sequences accounting for 64.2% of the genome. The genome did not undergo whole-genome duplication after the gamma (γ) hexaploidy event. Phylogenetic analysis showed that sweet tea diverged from the genus Quercus approximately 59 million years ago. The high-quality genome assembly and gene annotation resources enrich the genomics of sweet tea, and will facilitate functional genomic studies in sweet tea and other Fagaceae species.


Subject(s)
Genome, Plant , Quercus , Chromosomes , Molecular Sequence Annotation , Phylogeny , Quercus/genetics , Tea
5.
Sci Data ; 10(1): 901, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38102170

ABSTRACT

Microcos paniculata is a shrub used traditionally as folk medicine and to make herbal teas. Previous research into this species has mainly focused on its chemical composition and medicinal value. However, the lack of a reference genome limits the study of the molecular mechanisms of active compounds in this species. Here, we assembled a haplotype-resolved chromosome-level genome of M. paniculata based on PacBio HiFi and Hi-C data. The assembly contains two haploid genomes with sizes 399.43 Mb and 393.10 Mb, with contig N50 lengths of 43.44 Mb and 30.17 Mb, respectively. About 99.93% of the assembled sequences could be anchored to 18 pseudo-chromosomes. Additionally, a total of 482 Mb repeat sequences were identified, accounting for 60.76% of the genome. A total of 49,439 protein-coding genes were identified, of which 48,979 (99%) were functionally annotated. This haplotype-resolved chromosome-level assembly and annotation of M. paniculata will serve as a valuable resource for investigating the biosynthesis and genetic basis of active compounds in this species, as well as advancing evolutionary phylogenomic studies in Malvales.


Subject(s)
Chromosomes, Plant , Genome, Plant , Biological Evolution , Haploidy , Haplotypes , Molecular Sequence Annotation , Phylogeny
6.
Zhongguo Zhong Yao Za Zhi ; 48(20): 5531-5539, 2023 Oct.
Article in Chinese | MEDLINE | ID: mdl-38114145

ABSTRACT

"Tangjie" leaves of cultivated Qinan agarwood were used to obtain the complete chloroplast genome using high-throughput sequencing technology. Combined with 12 chloroplast genomes of Aquilaria species downloaded from NCBI, bioinformatics methods were employed to determine the chloroplast genome characteristics and phylogenetic relationships. The results showed that the chloroplast genome sequence length of cultivated Qinan agarwood "Tangjie" leaves was 174 909 bp with a GC content of 36.7%. A total of 136 genes were annotated, including 90 protein-coding genes, 38 tRNA genes, and 8 rRNA genes. Sequence repeat analysis detected 80 simple sequence repeats (SSRs) and 124 long sequence repeats, with most SSRs composed of A and T bases. Codon preference analysis revealed that AUU was the most frequently used codon, and codons with A and U endings were preferred. Comparative analysis of Aquilaria chloroplast genomes showed relative conservation of the IR region boundaries and identified five highly variable regions: trnD-trnY, trnT-trnL, trnF-ndhJ, petA-cemA, and rpl32, which could serve as potential DNA barcodes specific to the Aquilaria genus. Selection pressure analysis indicated positive selection in the rbcL, rps11, and rpl32 genes. Phylogenetic analysis revealed that cultivated Qinan agarwood "Tangjie" and Aquilaria agallocha clustered together (100% support), supporting the Chinese origin of Qinan agarwood from Aquilaria agallocha. The chloroplast genome data obtained in this study provide a foundation for studying the genetic diversity of cultivated Qinan agarwood and molecular identification of the Aquilaria genus.


Subject(s)
Genome, Chloroplast , Thymelaeaceae , Phylogeny , Codon , Molecular Sequence Annotation , Thymelaeaceae/genetics
7.
BMC Genom Data ; 24(1): 73, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38017381

ABSTRACT

OBJECTIVES: Erythrophleum is a genus in the Fabaceae family. The genus contains only about 10 species, and it is best known worldwide for its hardwood and medicinal properties. Erythrophleum fordii Oliv. is the only species of this genus distributed in China. It has superior wood and can be used in folk medicine, which has led to its overexploitation in the wild. For its effective conservation and elucidation of the distinctive genetic traits of wood formation and medicinal components, we present its first genome assembly. DATA DESCRIPTION: This work generated ~160.8 Gb raw Nanopore whole genome sequencing (WGS) long reads, ~126.0 Gb raw MGI WGS short reads and ~29.0 Gb raw RNA-seq reads using E. fordii leaf tissues. The de novo assembly contained 864,825,911 bp in the E. fordii genome, with 59 contigs and a contig N50 of 30,830,834 bp. Benchmarking Universal Single-Copy Orthologs (BUSCO) revealed 98.7% completeness of the assembly. The assembly contained 471,006,885 bp (54.4%) repetitive sequences and 28,761 genes that coded for 33,803 proteins. The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.


Subject(s)
Fabaceae , Trees , Molecular Sequence Annotation , Genome , China
8.
Genes (Basel) ; 14(11)2023 Nov 02.
Article in English | MEDLINE | ID: mdl-38002978

ABSTRACT

This study introduces a meticulously constructed genome assembly at the chromosome level for the Rosaceae family species Prinsepia uniflora, a traditional Chinese medicinal herb. The final assembly encompasses 1272.71 megabases (Mb) distributed across 16 pseudochromosomes, boasting contig and super-scaffold N50 values of 2.77 and 79.32 Mb, respectively. Annotated within this genome is a substantial 875.99 Mb of repetitive sequences, with transposable elements occupying 777.28 Mb, constituting 61.07% of the entire genome. Our predictive efforts identified 49,261 protein-coding genes within the repeat-masked assembly, with 45,256 (91.87%) having functional annotations, 5127 (10.41%) demonstrating tandem duplication, and 2373 (4.82%) classified as transcription factor genes. Additionally, our investigation unveiled 3080 non-coding RNAs spanning 0.51 Mb of the genome sequences. According to our evolutionary study, P. uniflora underwent recent whole-genome duplication following its separation from Prunus salicina. The presented reference-level genome assembly and annotation for P. uniflora will significantly facilitate the in-depth exploration of genomic information pertaining to this species, offering substantial utility in comparative genomics and evolutionary analyses involving Rosaceae species.


Subject(s)
Rosaceae , Rosaceae/genetics , Molecular Sequence Annotation , Phylogeny , Genomics , DNA Transposable Elements/genetics
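
The percentages quoted in the P. uniflora abstract above follow directly from the reported counts and sizes. The sketch below re-derives them in Python; the helper `percent` and the constant names are illustrative, not taken from the article, and rounding to two decimals is an assumption about how the authors reported their values.

```python
# Figures taken from the abstract (sizes in Mb, gene counts as integers).
GENOME_MB = 1272.71        # final assembly size
TE_MB = 777.28             # transposable elements
TOTAL_GENES = 49_261       # predicted protein-coding genes
ANNOTATED_GENES = 45_256   # genes with functional annotations
TANDEM_GENES = 5_127       # tandemly duplicated genes
TF_GENES = 2_373           # transcription factor genes


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


te_share = percent(TE_MB, GENOME_MB)
annotated_share = percent(ANNOTATED_GENES, TOTAL_GENES)
tandem_share = percent(TANDEM_GENES, TOTAL_GENES)
tf_share = percent(TF_GENES, TOTAL_GENES)

print(te_share, annotated_share, tandem_share, tf_share)
```

Under these assumptions the computed shares agree with the 61.07%, 91.87%, 10.41% and 4.82% stated in the abstract.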
9.
Sci Rep ; 13(1): 17319, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37828031

ABSTRACT

Phyllanthus emblica (Aonla, Indian Gooseberry) is known to have various medicinal properties, but studies to understand its genetic structure are limited. Among the various secondary metabolites, ascorbic acid, flavonoids, terpenoids, phenols and tannins possess great potential for pharmacological applications. With this in mind, we assembled the transcriptome using the Illumina RNASeq500 platform, generating 39,933,248 high-quality paired-end reads assembled into 126,606 transcripts. A total of 87,771 unigenes were recovered after deletion of isoforms and unambiguous sequences. Functional annotation of 43,377 coding sequences against the NCBI non-redundant (Nr) database using BlastX yielded 38,692 sequences containing blast hits and found 4685 coding sequences to be unique. The transcripts showed maximum similarity to Hevea brasiliensis (16%), followed by Jatropha curcas (12%). Considering key genes involved in the biosynthesis of flavonoids and various classes of terpenoid compounds, thirty EST-SSR primer sequences were designed based on transcriptomic data. Of these, 12 were found to be highly polymorphic with an average of 86.38%. The average values for marker index (MI), effective multiplicity ratio (EMR), resolution power (Rp) and polymorphic information content (PIC) were 7.20, 8.34, 8.64 and 0.80, respectively. Thus, in this study we developed new EST-SSRs linked to important genes involved in secondary metabolite biosynthesis that will serve as an invaluable genetic resource for crop improvement, including the selection of elite genotypes in P. emblica and its closely related Phyllanthaceae species.


Subject(s)
Phyllanthus emblica , Plants, Medicinal , Phyllanthus emblica/genetics , Sequence Analysis, DNA , Genes, Plant , Plants, Medicinal/genetics , Gene Expression Profiling , Transcriptome , Flavonoids , Molecular Sequence Annotation , Microsatellite Repeats/genetics
10.
Sheng Wu Gong Cheng Xue Bao ; 39(7): 2954-2964, 2023 Jul 25.
Article in Chinese | MEDLINE | ID: mdl-37584142

ABSTRACT

Incarvillea younghusbandii Sprague is a traditional tonic herb. The roots are used as herbal medicine for nourishing and strengthening, as well as treating postpartum milk deficiency and weakness. In this study, the chloroplast genome of I. younghusbandii was sequenced and assembled using high-throughput sequencing technology. The sequence characteristics, sequence repeats, codon usage bias, phylogenetic relationships and estimated divergence time of I. younghusbandii were analyzed. The 159 323 bp sequence contained a large single copy (80 197 bp), a small single copy (9 030 bp) and two inverted repeat sequences (35 048 bp each). It contained 120 genes, including 77 protein-coding genes, 8 ribosomal RNA genes and 35 transfer RNA genes. AAA was the most frequent codon in the chloroplast coding sequence of I. younghusbandii. A total of 42 simple sequence repeats were identified in the chloroplast genome. Phylogenetic analysis revealed that I. younghusbandii was most similar to its taxonomically close relative Incarvillea compacta. The divergence between I. younghusbandii and I. compacta was dated to 4.66 million years ago. This study is significant for the scientific conservation and development of resources related to I. compacta. It also provides a basic genetic resource for the subsequent species identification of the genus Incarvillea, and the population genetic diversity study of Bignoniaceae.


Subject(s)
Genome, Chloroplast , Phylogeny , Molecular Sequence Annotation , Sequence Analysis, DNA , Whole Genome Sequencing
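
The region lengths reported in the I. younghusbandii abstract above can be cross-checked with simple arithmetic, since a chloroplast genome of this kind consists of one large single copy, one small single copy and two inverted repeats. A minimal sketch in Python, assuming the 35 048 bp figure is the length of each inverted repeat copy (the function and constant names are illustrative):

```python
# Region lengths in bp, taken from the abstract.
LSC = 80_197   # large single copy
SSC = 9_030    # small single copy
IR = 35_048    # one inverted repeat copy (assumed to be a per-copy length)


def chloroplast_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite chloroplast genome: LSC + SSC + 2 * IR."""
    return lsc + ssc + 2 * ir


total = chloroplast_length(LSC, SSC, IR)
print(total)
```

Under that per-copy assumption the total is 159 323 bp, which matches the genome length given in the abstract.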
11.
Sci Data ; 10(1): 507, 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37532689

ABSTRACT

Cyclocarya paliurus, an endemic species in the family Juglandaceae with the character of heterodichogamy, is one of the triterpene-rich medicinal plants in China. To uncover the genetic mechanisms behind these special characteristics, we sequenced the genomes of two diploid (protandry, PA-dip and protogyny, PG-dip) and one auto-tetraploid (PA-tetra) C. paliurus individuals. Based on 134.9 (~225x), 75.5 (~125x) and 271.8 Gb (~226x) subreads of PacBio platform sequencing data, we assembled 586.62 Mb (contig N50 = 1.9 Mb), 583.45 Mb (contig N50 = 1.4 Mb), and 2.38 Gb (contig N50 = 430.9 kb) for the PA-dip, PG-dip and PA-tetra genomes, respectively. Furthermore, 543.53, 553.87, and 2168.65 Mb in PA-dip, PG-dip, and PA-tetra, respectively, were anchored to 16, 16, and 64 pseudo-chromosomes using over 65.4 Gb (~109x), 68 Gb (~113x), and 264 Gb (~220x) of Hi-C sequencing data. Annotation of the PA-dip, PG-dip, and PA-tetra genome assemblies identified 34,699, 35,221, and 34,633 protein-coding genes (90,752 gene models) or allele-defined genes, respectively. In addition, 45 accessions from nine locations were re-sequenced, and reads at more than 10× coverage were generated.


Subject(s)
Genome, Plant , Juglandaceae , Chromosomes , Diploidy , Juglandaceae/genetics , Molecular Sequence Annotation , Phylogeny , Tetraploidy
12.
Sci Data ; 10(1): 341, 2023 Jun 01.
Article in English | MEDLINE | ID: mdl-37264053

ABSTRACT

The prickly nightshade Solanum rostratum, an annual malignant weed, is native to North America and has globally invaded 34 countries, causing serious threats to ecosystems, agriculture, animal husbandry, and human health. In this study, we constructed a chromosome-level genome assembly and annotation of S. rostratum. The contig-level genome was initially assembled in 898.42 Mb with a contig N50 of 62.00 Mb from PacBio high-fidelity reads. With Hi-C sequencing data scaffolding, 96.80% of the initially assembled sequences were anchored and orientated onto 12 pseudo-chromosomes, generating a genome of 869.69 Mb with a contig N50 of 72.15 Mb. We identified 649.92 Mb (72.26%) of repetitive sequences and 3,588 non-coding RNAs in the genome. A total of 29,694 protein-coding genes were predicted, with 28,154 (94.81%) functionally annotated genes. We found 99.5% and 91.3% complete embryophyta_odb10 genes in the pseudo-chromosomes genome and predicted gene datasets by BUSCO assessment. The present genomic resource provides essential information for subsequent research on the mechanisms of environmental adaptation of S. rostratum and host shift in Colorado potato beetles.


Subject(s)
Genome, Plant , Solanum , Chromosomes , Ecosystem , Molecular Sequence Annotation , Phylogeny , Solanum/genetics
13.
Int J Mol Sci ; 24(11)2023 May 24.
Article in English | MEDLINE | ID: mdl-37298166

ABSTRACT

Andrographis paniculata belongs to the family Acanthaceae and is known for its medicinal properties owing to the presence of unique constituents belonging to the lactones, diterpenoids, diterpene glycosides, flavonoids, and flavonoid glycosides groups of chemicals. Andrographolide, a major therapeutic constituent of A. paniculata, is extracted primarily from the leaves of this plant and exhibits antimicrobial and anti-inflammatory activities. Using 454 GS-FLX pyrosequencing, we have generated a whole transcriptome profile of entire leaves of A. paniculata. A total of 22,402 high-quality transcripts were generated, with an average transcript length and N50 of 884 bp and 1007 bp, respectively. Functional annotation revealed that 19,264 (86%) of the total transcripts showed significant similarity with the NCBI-Nr database and were successfully annotated. Out of the 19,264 BLAST hits, 17,623 transcripts were assigned GO terms and distributed into three major functional categories: molecular function (44.62%), biological processes (29.19%), and cellular component (26.18%) based on BLAST2GO. Transcription factor analysis showed 6669 transcripts, belonging to 57 different transcription factor families. Fifteen TF genes that belong to the NAC, MYB, and bHLH TF categories were validated by RT PCR amplification. In silico analysis of gene families involved in the synthesis of biochemical compounds having medicinal values, such as cytochrome p450, protein kinases, heat shock proteins, and transporters, was completed and a total of 102 different transcripts encoding enzymes involved in the biosynthesis of terpenoids were predicted. Out of these, 33 transcripts belonged to terpenoid backbone biosynthesis. This study also identified 4254 EST-SSRs from 3661 transcripts, representing 16.34% of the total transcripts. Fifty-three novel EST-SSR markers generated from our EST dataset were used to assess the genetic diversity among eighteen A. paniculata accessions. 
The genetic diversity analysis revealed two distinct sub-clusters, and all accessions were distinct from each other based on the genetic similarity index. A database of EST transcripts, EST-SSR markers, and transcription factors has been developed by combining data generated in the present study with available transcriptomic resources from a public database through meta-transcriptome analysis, making genomic resources available in one place to researchers working on this medicinal plant.


Subject(s)
Andrographis paniculata , Transcription Factors , Molecular Sequence Annotation , Transcription Factors/genetics , Expressed Sequence Tags , Gene Expression Profiling , Transcriptome , Microsatellite Repeats/genetics , Databases, Genetic , Glycosides
14.
BMC Genomics ; 24(1): 197, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37046210

ABSTRACT

BACKGROUND: The Peepal/Bodhi tree (Ficus religiosa L.) is an important, long-lived keystone ecological species. Communities on the Indian subcontinent have extensively employed the plant in Ayurveda, traditional medicine, and spiritual practices. In India, the Peepal tree is popularly believed to produce oxygen both during the day and at night. The goal of our research was to produce molecular resources using whole-genome and transcriptome sequencing techniques. RESULTS: The complete genome of the Peepal tree was sequenced using two next-generation sequencers, Illumina HiSeq1000 and MGISEQ-2000. We assembled a draft genome of 406 Mb using a hybrid assembly workflow. The genome annotation resulted in 35,093 protein-coding genes; 53% of the genome consists of repetitive sequences. To understand the physiological pathways in leaf tissues, we analyzed photosynthetically distinct conditions: bright sunny days and nights. The RNA-seq analysis supported the expression of 26,479 unigenes. The leaf transcriptomic analysis of the diurnal and nocturnal periods revealed the expression of a significant number of genes involved in the carbon-fixation pathway. CONCLUSIONS: This study presents a draft hybrid genome assembly for F. religiosa and its functionally annotated genes. The genomic and transcriptomic data-derived pathways have been analyzed for future studies on the Peepal tree.


Subject(s)
Ficus , Transcriptome , Gene Expression Profiling , Genomics , Base Sequence , Molecular Sequence Annotation
15.
DNA Res ; 30(1)2023 Feb 01.
Article in English | MEDLINE | ID: mdl-36383440

ABSTRACT

Perilla frutescens (Lamiaceae) is an important herbal plant with hundreds of bioactive chemicals, among which perillaldehyde and rosmarinic acid are the two major bioactive compounds in the plant. The leaves of red perilla are used as traditional Kampo medicine or food ingredients. However, the medicinal and nutritional uses of this plant could be improved by enhancing the production of valuable metabolites through the manipulation of key enzymes or regulatory genes using genome editing technology. Here, we generated a high-quality genome assembly of red perilla domesticated in Japan. A near-complete chromosome-level assembly of P. frutescens was generated from PacBio HiFi reads, with a contig N50 of 41.5 Mb. 99.2% of the assembly was anchored into 20 pseudochromosomes, among which seven pseudochromosomes consisted of one contig, while the rest consisted of fewer than six contigs. Gene annotation and prediction of the sequences successfully predicted 86,258 gene models, including 76,825 protein-coding genes. Further analysis showed that potential targets of genome editing for the engineering of anthocyanin pathways in P. frutescens are located in the late-stage pathways. Overall, our genome assembly could serve as a valuable reference for selecting target genes for genome editing of P. frutescens.


Subject(s)
Lamiaceae , Perilla frutescens , Perilla , Perilla frutescens/genetics , Perilla frutescens/chemistry , Perilla frutescens/metabolism , Perilla/genetics , Perilla/chemistry , Japan , Lamiaceae/genetics , Molecular Sequence Annotation
16.
DNA Res ; 30(1)2023 Feb 01.
Article in English | MEDLINE | ID: mdl-36208288

ABSTRACT

A contiguous assembly of the inbred 'EL10' sugar beet (Beta vulgaris ssp. vulgaris) genome was constructed using PacBio long-read sequencing, BioNano optical mapping, Hi-C scaffolding, and Illumina short-read error correction. The EL10.1 assembly was 540 Mb, of which 96.2% was contained in nine chromosome-sized pseudomolecules with lengths from 52 to 65 Mb, and 31 contigs with a median size of 282 kb that remained unassembled. Gene annotation incorporating RNA-seq data and curated sequences via the MAKER annotation pipeline generated 24,255 gene models. Results indicated that the EL10.1 genome assembly is a contiguous genome assembly highly congruent with the published sugar beet reference genome. Gross duplicate gene analyses of EL10.1 revealed little large-scale intra-genome duplication. Reduced gene copy number for well-annotated gene families relative to other core eudicots was observed, especially for transcription factors. Variation in genome size in B. vulgaris was investigated by flow cytometry among 50 individuals producing estimates from 633 to 875 Mb/1C. Read-depth mapping with short-read whole-genome sequences from other sugar beet germplasm suggested that relatively few regions of the sugar beet genome appeared associated with high-copy number variation.


Subject(s)
Beta vulgaris , Humans , Beta vulgaris/genetics , DNA Copy Number Variations , Chromosomes , Molecular Sequence Annotation , Sugars
17.
Chinese Journal of Biotechnology ; (12): 2954-2964, 2023.
Article in Chinese | WPRIM | ID: wpr-981243

ABSTRACT

Incarvillea younghusbandii Sprague is a traditional tonic herb. The roots are used as herbal medicine for nourishing and strengthening, as well as treating postpartum milk deficiency and weakness. In this study, the chloroplast genome of I. younghusbandii was sequenced and assembled using high-throughput sequencing technology. The sequence characteristics, sequence repeats, codon usage bias, phylogenetic relationships and estimated divergence time of I. younghusbandii were analyzed. The 159 323 bp sequence contained a large single copy (80 197 bp), a small single copy (9 030 bp) and two inverted repeat sequences (35 048 bp each). It contained 120 genes, including 77 protein-coding genes, 8 ribosomal RNA genes and 35 transfer RNA genes. AAA was the most frequent codon in the chloroplast coding sequence of I. younghusbandii. A total of 42 simple sequence repeats were identified in the chloroplast genome. Phylogenetic analysis revealed that I. younghusbandii was most similar to its taxonomically close relative Incarvillea compacta. The divergence between I. younghusbandii and I. compacta was dated to 4.66 million years ago. This study is significant for the scientific conservation and development of resources related to I. compacta. It also provides a basic genetic resource for the subsequent species identification of the genus Incarvillea, and the population genetic diversity study of Bignoniaceae.


Subject(s)
Phylogeny , Molecular Sequence Annotation , Genome, Chloroplast , Sequence Analysis, DNA , Whole Genome Sequencing
18.
Mol Plant Microbe Interact ; 35(12): 1124-1126, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36508486

ABSTRACT

Acinetobacter schindleri is an endophyte of Pseudostellaria heterophylla, a traditional Chinese herbal plant. It has high degradation activity toward toxins produced by the fungal pathogen Fusarium graminearum. Here, we deployed PacBio single-molecule real-time long-read sequencing technology to generate a complete genome assembly for the Acinetobacter schindleri H4-3-C1 strain and obtained 1.59 Gb of clean reads. These reads were assembled into a single circular DNA chromosome with a length of 3,265,024 bp, and no plasmid was found in the genome. Totals of 3,193 coding sequences, 91 transfer RNAs, 21 ribosomal RNAs, and 75 small RNAs were identified in the genome. This high-quality genome assembly and gene annotation resource will facilitate the excavation of the zearalenone degradation gene and provide valuable resources for preventing and controlling toxigenic fungal diseases of P. heterophylla. Copyright © 2022 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.


Subject(s)
Acinetobacter , Endophytes , Molecular Sequence Annotation , Acinetobacter/genetics , Plasmids , Plant Diseases/microbiology , Genome, Fungal
19.
Planta ; 256(6): 109, 2022 Nov 09.
Article in English | MEDLINE | ID: mdl-36350413

ABSTRACT

MAIN CONCLUSION: We report the genome assembly of P. cochinchinensis, as the first high-quality chromosome-level genome of Phyllanthaceae which is rich in medicinal plants. Phyllanthus cochinchinensis, a member of the Phyllanthaceae, is one of the famous medicinal plants in South China. Here, we report a de novo chromosome-level genome assembly for P. cochinchinensis using a combination of Nanopore and Illumina sequencing technologies. In total, the assembled genome consists of 284.88 Mb genomic sequences with a contig N50 of 10.32 Mb, representing ~ 95.49% of the estimated genome size. By applying Hi-C data, 13 pseudochromosomes of P. cochinchinensis were constructed, covering ~ 99.87% of the assembled sequences. The genome is annotated with 59.12% repetitive sequences and 20,836 protein-coding genes. Whole-genome duplication of P. cochinchinensis is likely shared with Ricinus communis as well as Vitis vinifera. Homologous genes within the flavonoid pathway for P. cochinchinensis were identified and copy numbers and expression level of related genes revealed potential critical genes involved in flavonoid biosynthesis. This study provides the first whole-genome sequence for the Phyllanthaceae, confirms the evolutionary status of Phyllanthus from the genomic level, and provides foundations for accelerating functional genomic research of species from Phyllanthus.


Subject(s)
Malpighiales , Phyllanthus , Molecular Sequence Annotation , Phyllanthus/genetics , Phylogeny , Chromosomes
20.
Bioinformatics ; 38(23): 5168-5174, 2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36227117

ABSTRACT

MOTIVATION: The advent of massive DNA sequencing technologies is producing a huge number of human single-nucleotide polymorphisms occurring in protein-coding regions and possibly changing their sequences. Discriminating harmful protein variations from neutral ones is one of the crucial challenges in precision medicine. Computational tools based on artificial intelligence provide models for protein sequence encoding, bypassing database searches for evolutionary information. We leverage the new encoding schemes for an efficient annotation of protein variants. RESULTS: E-SNPs&GO is a novel method that, given an input protein sequence and a single amino acid variation, can predict whether the variation is related to diseases or not. The proposed method adopts an input encoding completely based on protein language models and embedding techniques, specifically devised to encode protein sequences and GO functional annotations. We trained our model on a newly generated dataset of 101 146 human protein single amino acid variants in 13 661 proteins, derived from public resources. When tested on a blind set comprising 10 266 variants, our method compares well with recent approaches in the literature for the same task, reaching a Matthews Correlation Coefficient score of 0.72. We propose E-SNPs&GO as a suitable, efficient and accurate large-scale annotator of protein variant datasets. AVAILABILITY AND IMPLEMENTATION: The method is available as a webserver at https://esnpsandgo.biocomp.unibo.it. Datasets and predictions are available at https://esnpsandgo.biocomp.unibo.it/datasets. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Artificial Intelligence , Polymorphism, Single Nucleotide , Humans , Amino Acid Sequence , Proteins/genetics , Proteins/chemistry , Amino Acids , Computational Biology/methods , Molecular Sequence Annotation