ABSTRACT
BACKGROUND: Dung beetles recycle organic matter through the decomposition of feces and support ecological balance. However, these insects are threatened by the indiscriminate use of agrochemicals and habitat destruction. Copris tripartitus Waterhouse (Coleoptera: Scarabaeidae), a dung beetle, is listed as a class-II Korean endangered species. Although the genetic diversity of C. tripartitus populations has been investigated through analysis of mitochondrial genes, genomic resources for this species remain limited. In this study, we analyzed the transcriptome of C. tripartitus to elucidate functions related to growth, immunity and reproduction for the purpose of informed conservation planning. RESULTS: The transcriptome of C. tripartitus was generated using next-generation Illumina sequencing and assembled de novo using a Trinity-based platform. In total, 98.59% of the raw sequence reads were retained as clean reads. These reads were assembled into 151,177 contigs, 101,352 transcripts, and 25,106 unigenes. A total of 23,450 unigenes (93.40%) were annotated to at least one database. The largest proportion of unigenes (92.76%) was annotated to the locally curated PANM-DB. The species with the most homologous sequences was Tribolium castaneum (5,512 unigenes). In Gene Ontology (GO) analysis, the Molecular function category contained the most unigenes (5,174). Further, in Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, a total of 462 enzymes were associated with established biological pathways. Based on sequence homology to known proteins in PANM-DB, representative immunity, growth, and reproduction-related genes were screened. Potential immunity-related genes were categorized into pattern recognition receptors (PRRs), the Toll-like receptor signaling pathway, the MyD88-dependent pathway, endogenous ligands, immune effectors, antimicrobial peptides, apoptosis, and adaptation-related transcripts. 
Among PRRs, we conducted detailed in silico characterization of TLR-2, CTL, and PGRP_SC2-like. Repetitive elements such as long terminal repeats, short interspersed nuclear elements, long interspersed nuclear elements and DNA elements were enriched in the unigene sequences. A total of 1,493 SSRs were identified among all unigenes of C. tripartitus. CONCLUSIONS: This study provides a comprehensive resource for analysis of the genomic topography of the beetle C. tripartitus. The data presented here clarify the fitness phenotypes of this species in the wild and provide insight to support informed conservation planning.
Subjects
Beetles , Tribolium , Animals , Beetles/genetics , Gene Expression Profiling , Mitochondrial Genes , Transcriptome , Reproduction
ABSTRACT
The Korean endemic land snail Koreanohadra kurodana (Gastropoda: Bradybaenidae), found in humid areas of broadleaf forests and shrubs, has been considered vulnerable because the number of individuals has declined in recent years. The species is poorly characterized at the genomic level, which limits the understanding of its functions at the molecular and genetic level. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset of visceral mass tissue of K. kurodana using Illumina paired-end sequencing technology. Over 234 million quality reads were assembled into a total of 315,924 contigs (average length 585.6 bp, N50 715 bp) and 191,071 unigenes (average length 678 bp, N50 927 bp). Overall, 36.32% of the unigenes matched known protein/nucleotide sequences in the public databases. The assignment of unigenes to functional categories was determined using COG, GO, KEGG, and InterProScan protein domain searches. The GO analysis categorized 22,967 unigenes (12.02%) into 40 functional groups. The KEGG annotation revealed that metabolism pathway genes were enriched. The most prominent protein motifs include the zinc finger, ribonuclease H, reverse transcriptase, and ankyrin repeat domains. The simple sequence repeats (SSRs) identified from unigenes longer than 1 kb show a predominance of dinucleotide repeat motifs, followed by tri- and tetranucleotide motifs. A number of unigenes were putatively assigned to adaptation and defense mechanisms, including heat shock protein 70, Toll-like receptor 4, AMP-activated protein kinase, and aquaporin-2, among others. Our data provide a rich source for the identification and functional characterization of new genes and candidate polymorphic SSR markers in K. kurodana. 
The availability of the transcriptome information (http://bioinfo.sch.ac.kr/submission/) should promote the use of these resources for phylogenetic studies and genetic diversity assessment.
Subjects
Gene Expression Profiling/methods , Microsatellite Repeats , Snails/genetics , Animals , Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Metabolic Networks and Pathways , Molecular Sequence Annotation , Phylogeny , RNA Sequence Analysis/methods
ABSTRACT
Aegista chejuensis and Aegista quelpartensis (family Bradybaenidae) are endemic to Korea and are considered vulnerable due to declines in their populations. The limited genetic resources for these species restrict the ability to prioritize conservation efforts. We sequenced the transcriptomes of these species using Illumina paired-end technology. Approximately 257 and 240 million reads were obtained and assembled into 198,531 and 230,497 unigenes for A. chejuensis and A. quelpartensis, respectively. The average and N50 unigene lengths were 735.4 and 1073 bp, respectively, for A. chejuensis, and 705.6 and 1001 bp, respectively, for A. quelpartensis. In total, 68,484 (34.5%) and 77,745 (33.73%) unigenes for A. chejuensis and A. quelpartensis, respectively, were annotated to databases. Gene Ontology terms were assigned to 23,778 (11.98%) and 26,396 (11.45%) unigenes for A. chejuensis and A. quelpartensis, respectively, while 5050 and 5838 unigenes were mapped to 117 and 124 pathways in the Kyoto Encyclopedia of Genes and Genomes database. In addition, we identified and annotated 9542 and 10,395 putative simple sequence repeats (SSRs) in unigenes from A. chejuensis and A. quelpartensis, respectively. We designed a list of PCR primers flanking the putative SSR regions. These microsatellites may be utilized for future phylogenetics and conservation initiatives.
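As a rough illustration of how the average and N50 unigene lengths quoted above are defined, the following Python sketch computes both statistics from a list of sequence lengths. The function names and the toy lengths are hypothetical and are not taken from the study, and the tool the authors actually used to compute these statistics is not stated in the abstract.

```python
def mean_length(lengths):
    """Arithmetic mean of the sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    return sum(lengths) / len(lengths)


def n50(lengths):
    """N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up unigene lengths (not real data).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(mean_length(toy_lengths), n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the mean does, it is typically larger than the average length, which matches the pattern in the values reported for both species.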
Subjects
Endangered Species , Molecular Sequence Annotation , DNA Sequence Analysis , Snails/genetics , Transcriptome , Animals , Genes , Microsatellite Repeats
ABSTRACT
The Lycaenidae butterflies Protantigius superans and Spindasis takanosis are endangered insects in Korea known for their symbiotic association with ants. However, the necessary genomic and transcriptomic data are lacking for these species, limiting conservation efforts. In this study, the P. superans and S. takanosis transcriptomes were deciphered using Illumina HiSeq 2500 sequencing. The P. superans and S. takanosis transcriptome data included a total of 254,340,693 and 245,110,582 clean reads assembled into 159,074 and 170,449 contigs and 107,950 and 121,140 unigenes, respectively. BLASTX hits (E-value of 1.0 × 10⁻⁵) against the known protein databases annotated a total of 46,754 and 51,908 transcripts for P. superans and S. takanosis, respectively. Approximately 41.25% and 38.68% of the unigenes for P. superans and S. takanosis, respectively, had homologous sequences in the Protostome DB (PANM-DB). BLAST2GO analysis confirmed 18,611 unigenes representing Gene Ontology (GO) terms and a total of 5259 unigenes assigned to 116 pathways for P. superans. For S. takanosis, a total of 6697 unigenes were assigned to 119 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Additionally, 382,164 and 390,516 simple sequence repeats (SSRs) were compiled from the unigenes of P. superans and S. takanosis, respectively. This is the first report of new genes for these species; the data can be used for the conservation of lycaenid populations and as reference information for closely related species.
Subjects
Butterflies/genetics , Endangered Species , High-Throughput Nucleotide Sequencing/methods , Transcriptome/genetics , Animals , Cluster Analysis , Nucleic Acid Databases , Gene Expression Profiling , Gene Ontology , Insect Proteins/chemistry , Insect Proteins/genetics , Microsatellite Repeats/genetics , Molecular Sequence Annotation , Nucleotide Motifs/genetics , Protein Tertiary Structure , Nucleic Acid Sequence Homology , Species Specificity
ABSTRACT
BACKGROUND: The Bradybaenidae snail Karaftohelix adamsi is endemic to Korea and has been recorded on Ulleung Island in North Gyeongsang Province, South Korea. K. adamsi is classified as an Endangered Wildlife Class II species in Korea and faces a severe risk of extinction following habitat disturbances. With no information available at the DNA (genome) or mRNA (transcriptome) level for the species, conservation informed by molecular resources is difficult. OBJECTIVE: In this study, we used Illumina short-read sequencing and Trinity de novo assembly to draft the reference transcriptome of K. adamsi. RESULTS: After assembly, 13,753 unigenes were obtained, of which 10,511 were annotated to public databases (the largest number, 10,165 unigenes, found homologs in PANM-DB). A total of 6,351, 3,535, 358, and 3,407 unigenes were ascribed to functional categories under KOG, GO, KEGG, and IPS, respectively. Transcripts such as HSP70, aquaporin, TLR, and MAPK, among others, were screened as putative functional resources for adaptation. DNA transposons were more abundant than retrotransposons in the assembled unigenes. Further, 2,164 SSRs were screened, with dinucleotide repeats such as AC/GT and AG/CT predominating. CONCLUSION: The transcriptome-guided discovery of molecular resources in K. adamsi will not only serve as a basis for functional genomics studies but also provide sustainable tools for the protection of the species in the wild. Moreover, the development of polymorphic SSRs is valuable for the identification of the species in newer habitats and for cross-species genotyping.
Subjects
Endangered Species , Microsatellite Repeats , Snails , Transcriptome , Animals , Microsatellite Repeats/genetics , Snails/genetics , Transcriptome/genetics , Republic of Korea , Molecular Sequence Annotation , Genetic Fitness
ABSTRACT
Background: Chronic kidney disease is a significant health burden worldwide, with increasing incidence. Although several genome-wide association studies (GWAS) have investigated single nucleotide polymorphisms (SNPs) associated with kidney traits, most have focused on populations of European ancestry. Methods: We utilized clinical and genetic information collected from the Korean Genome and Epidemiology Study (KoGES). Results: More than five million SNPs from 58,406 participants were analyzed. After meta-GWAS, 1,360 loci associated with estimated glomerular filtration rate (eGFR) at a genome-wide significant level (p = 5 × 10⁻⁸) were identified. Among them, 399 loci were validated with at least one other biomarker (blood urea nitrogen [BUN] or eGFRcysC) and 149 loci were validated using both markers. Among them, 18 SNPs (nine known and nine novel) with 20 putative genes were found. The aggregated effect of genes estimated by MAGMA gene analysis showed that these significant genes were enriched in kidney-associated pathways, with the kidney and liver being the most enriched tissues. Conclusion: In this study, we conducted GWAS for more than 50,000 Korean individuals and identified several variants associated with kidney traits, including eGFR, BUN, and eGFRcysC. We also investigated the functions of relevant genes using computational methods to define putative causal variants.
ABSTRACT
BACKGROUND: Ticks are ectoparasites capable of directly damaging their hosts and transmitting vector-borne diseases. The ixodid tick Haemaphysalis flava has a broad distribution that extends from East to South Asia. This tick is a reservoir of severe fever with thrombocytopenia syndrome virus (SFTSV), which causes severe hemorrhagic disease, with cases reported from China, Japan and South Korea. Recently, the distribution of H. flava in South Korea was found to overlap with the occurrence of SFTSV. METHODS: This study was undertaken to discover the molecular resources of H. flava female ticks using the Illumina HiSeq 4000 system, the Trinity de novo sequence assembler and annotation against public databases. The locally curated Protostome database (PANM-DB) was used to screen putative adaptation-related transcripts classified into gene families such as angiotensin-converting enzyme, aquaporin, adenylate cyclase, AMP-activated protein kinase, glutamate receptors, heat shock proteins, molecular chaperones, insulin receptor, mitogen-activated protein kinase and solute carrier family proteins. In addition, repetitive elements and simple sequence repeats (SSRs) were screened from the unigenes using the RepeatMasker (v4.0.6) and MISA (v1.0) software tools, followed by the design of SSR-flanking primers using BatchPrimer3 (v1.0) software. RESULTS: The transcriptome produced a total of 69,822 unigenes, of which 46,175 were annotated to homologous proteins in the PANM-DB. The unigenes were also mapped to the EuKaryotic Orthologous Groups (KOG), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) classifications. Protein kinase, zinc finger (C2H2-type), reverse transcriptase, and RNA recognition motif domains were prevalent among the unigenes. A total of 3480 SSRs were screened, of which 1907 and 1274 were tri- and dinucleotide repeats, respectively. 
A list of primer sequences flanking the SSR motifs was compiled for validation of polymorphism in H. flava and related tick species. CONCLUSIONS: The reference transcriptome information on H. flava female ticks will be useful for an enriched understanding of tick biology, its competency to act as a vector and the study of species diversity related to disease transmission.
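The SSR screening step described above can be sketched, under stated assumptions, as a regular-expression search for perfect tandem repeats. This is not MISA itself and does not reproduce its output format; the minimum repeat counts in `MIN_REPEATS`, the function name `find_ssrs`, and the toy sequence are all hypothetical choices made for illustration, and the thresholds used in the study are not given in the abstract.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
# These are illustrative values, not the settings used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence):
    """Return (start, motif, copies) for each perfect di- to hexanucleotide repeat."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_copies in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least min_copies - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. ACAC, AAA).
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (made up): an (AC)7 dinucleotide and an (AAG)5 trinucleotide repeat.
toy = "TTG" + "AC" * 7 + "GGCT" + "AAG" * 5 + "C"
print(find_ssrs(toy))
```

Counting hits by motif length across all unigenes would give the kind of di- versus trinucleotide breakdown reported in the results, although compound and imperfect repeats, which dedicated tools can handle, are ignored in this sketch.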
Subjects
Gene Expression Profiling , Ixodidae , Female , Animals , Molecular Sequence Annotation , Transcriptome , Genome , Ixodidae/genetics , Microsatellite Repeats
ABSTRACT
Transcriptome studies for the conservation of endangered mollusks are a proactive approach towards managing the threats and uncertainties facing these species in natural environments. The populations of these species are declining due to habitat destruction, illicit wildlife trade, and global climate change. These pressures restrict the free movement of species across the wild landscape, cause the loss of breeding grounds, and limit the expression of physiological attributes that are crucial for faunal welfare. Gastropods face the most negative ecological effects and have been listed under Korea's protected species scheme based on their population dynamics in the last few years. Moreover, with genetic resources limited for such species, conservation by informed planning is not possible. This review provides insights into the activities under the threatened species initiative of Korea, with special reference to the transcriptome assemblies of endangered mollusks. Gastropods such as Ellobium chinense, Aegista chejuensis, Aegista quelpartensis, Incilaria fruhstorferi, Koreanohadra kurodana, Satsuma myomphala, and Clithon retropictus are represented. Moreover, the transcriptome summaries of the bivalve Cristaria plicata and the caenogastropod Charonia lampas sauliae are also discussed. Sequencing, de novo assembly, and annotation identified transcripts or homologs for these species, which were ascribed predicted gene functions based on an understanding of the biochemical and molecular pathways. Mining for simple sequence repeats from the transcriptomes has successfully assisted genetic polymorphism studies. The transcriptome resources of Korean endangered mollusks are compared with the genomic resources of other endangered mollusks, and homologies and analogies are discussed to guide future research.
Subjects
Gastropoda , Transcriptome , Animals , Transcriptome/genetics , Endangered Species , Gastropoda/genetics , Genome , Republic of Korea
ABSTRACT
Satsuma myomphala is critically endangered through loss of natural habitats, predation by natural enemies, and indiscriminate collection. It is a protected species in Korea but lacks genomic resources for an understanding of the varied functional processes attributable to evolutionary success in natural habitats. To assess the genetic information of S. myomphala, we performed, for the first time, de novo transcriptome sequencing and functional annotation of expressed sequences using the Illumina Next-Generation Sequencing (NGS) platform and bioinformatics analysis. We identified 103,774 unigenes, of which 37,959, 12,890, and 17,699 were annotated in the PANM (Protostome DB), Unigene, and COG (Clusters of Orthologous Groups) databases, respectively. In addition, 14,451 unigenes were predicted under Gene Ontology functional categories, with 4581 assigned to a single category. Furthermore, 3369 sequences, 646 of which had Enzyme Commission (EC) numbers, were mapped to 122 pathways in the Kyoto Encyclopedia of Genes and Genomes Pathway database. The prominent protein domains included the Zinc finger (C2H2-like), Reverse Transcriptase, Thioredoxin-like fold, and RNA recognition motif domain. Many unigenes with homology to immunity, defense, and reproduction-related genes were screened in the transcriptome. We also detected 3120 putative simple sequence repeats (SSRs) encompassing dinucleotide to hexanucleotide repeat motifs from unigene sequences longer than 1 kb. A list of PCR primers for the SSR loci has been compiled to study genetic polymorphisms. The transcriptome data represent a valuable resource for further investigations of the species' genome structure and biology. The unigene information and microsatellites should provide an indispensable tool for conservation of the species in natural and adaptive environments.
Subjects
Endangered Species , Gene Expression Profiling , High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats/genetics , Snails/genetics , Transcriptome/genetics , Viscera/metabolism , Animals , Computational Biology , Genome/genetics , Molecular Sequence Annotation , Snails/growth & development
ABSTRACT
BACKGROUND: The freshwater mussel Cristaria plicata (Bivalvia: Eulamellibranchia: Unionidae) is an economically important species in molluscan aquaculture due to its use in pearl farming. The species has been listed as endangered in South Korea due to the loss of natural habitats caused by anthropogenic activities. The decreasing population and a lack of genomic information on the species are of concern to environmentalists and conservationists. In this study, we conducted de novo transcriptome sequencing and annotation analysis of C. plicata using Illumina HiSeq 2500 next-generation sequencing (NGS) technology, the Trinity assembler, and bioinformatics databases to prepare a sustainable resource for the identification of candidate genes involved in immunity, defense, and reproduction. RESULTS: The C. plicata transcriptome analysis included a total of 286,152,584 raw reads and 281,322,837 clean reads. The de novo assembly identified a total of 453,931 contigs and 374,794 non-redundant unigenes with average lengths of 731.2 and 737.1 bp, respectively. Furthermore, 100% coverage of C. plicata mitochondrial genes within two unigenes supported the quality of the assembly. In total, 84,274 unigenes showed homology to entries in at least one database, and 23,246 unigenes were allocated to one or more Gene Ontology (GO) terms. The most prominent GO biological process, cellular component, and molecular function categories (level 2) were cellular process, membrane, and binding, respectively. A total of 4,776 unigenes were mapped to 123 biological pathways in the KEGG database. Based on the GO terms and KEGG annotation, the unigenes were suggested to be involved in immunity, stress responses, sex determination, and reproduction. A total of 17,251 cDNA simple sequence repeats (cSSRs) were identified from 61,141 unigenes (size of >1 kb), with the most abundant being dinucleotide repeats. 
CONCLUSIONS: This dataset represents the first transcriptome analysis of the endangered mollusc, C. plicata. The transcriptome provides a comprehensive sequence resource for the conservation of genetic information in this species and enrichment of the genetic database. The development of molecular markers will assist in the genetic improvement of C. plicata.
Subjects
Endangered Species , Transcriptome , Unionidae/genetics , Animals , Female , Gene Expression Profiling , Gene Ontology , High-Throughput Nucleotide Sequencing , Innate Immunity , Male , Microsatellite Repeats , Reproduction , Sex Determination Processes , Unionidae/immunology , Unionidae/physiology
ABSTRACT
The tadpole shrimp (Triops longicaudatus) is an aquatic crustacean that helps control pest populations. It inhabits freshwater ponds and pools and has been described as a living fossil. T. longicaudatus was officially declared an endangered species in South Korea in 2005; however, through subsequent protection and conservation management, it was removed from the endangered species list in 2012. The limited number of available genetic resources on T. longicaudatus makes it difficult to obtain valuable genetic information for marker-aided selection programs. In this study, whole-transcriptome sequencing of T. longicaudatus generated 39.74 GB of clean data and a total of 269,822 contigs using the Illumina HiSeq 2500 platform. After clustering, a total of 208,813 unigenes with an N50 length of 1089 bp were generated. A total of 95,105 unigenes were successfully annotated against the Protostome (PANM), Unigene, Eukaryotic Orthologous Groups (KOG), Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases using BLASTX with a cut-off of 1E-5. A total of 57,731 unigenes were assigned to GO terms, and 7247 unigenes were mapped to 129 KEGG pathways. Furthermore, 1595 simple sequence repeats (SSRs) were detected from the unigenes, with 1387 potential SSR markers. This is the first report of high-throughput transcriptome analysis of T. longicaudatus, and it provides valuable insights for genetic research and molecular-assisted breeding of this important species.
ABSTRACT
An aquatic gastropod belonging to the family Neritidae, Clithon retropictus is listed as an endangered class II species in South Korea. The lack of information on its genomic background limits the ability to obtain functional data resources and inhibits informed conservation planning for this species. In the present study, the transcriptomic sequencing and de novo assembly of C. retropictus generated a total of 241,696,750 high-quality reads. These were assembled into 282,838 unigenes with mean and N50 lengths of 736.9 and 1201 base pairs, respectively. Of these, 125,616 unigenes were subjected to annotation analysis with known proteins in the Protostome DB, COG, GO, and KEGG protein databases (BLASTX; E ≤ 0.00001) and with known nucleotides in the Unigene database (BLASTN; E ≤ 0.00001). The GO analysis indicated that cellular process, cell, and catalytic activity are the predominant GO terms in the biological process, cellular component, and molecular function categories, respectively. In addition, 2093 unigenes were distributed in 107 different KEGG pathways. Furthermore, 49,280 simple sequence repeats were identified in the unigenes (>1 kilobase sequences). This is the first report on the identification of transcriptomic and microsatellite resources for C. retropictus, which opens up the possibility of exploring traits related to the adaptation and acclimatization of this species.
ABSTRACT
Vespa mandarinia, found in the forests of East Asia, including Korea, occupies the highest rank in the arthropod food web within its geographical range. It serves as a source of nutrition in the form of Vespa amino acid mixture and is listed as a threatened species, although no conservation measures have been implemented. Here, we performed de novo assembly of the V. mandarinia transcriptome using Illumina HiSeq 4000 sequencing. Over 60 million raw reads and 59,184,811 clean reads were obtained. After assembly, a total of 66,837 unigenes were clustered, of which 40,887, 44,455, and 22,390 showed homologous matches against the PANM, Unigene, and KOG databases, respectively. A total of 15,675 unigenes were assigned to Gene Ontology terms, and 5,132 unigenes were mapped to 115 KEGG pathways. The zinc finger domain (C2H2-like), serine/threonine/dual specificity protein kinase domain, and RNA recognition motif domain were among the top InterProScan domains predicted for V. mandarinia sequences. Among the unigenes, we identified 534,922 cDNA simple sequence repeats as potential markers. This is the first transcriptomic analysis of the wasp V. mandarinia using Illumina HiSeq 4000. The datasets obtained should promote the search for new genes to understand the physiological attributes of this wasp.