ABSTRACT
BACKGROUND: Dung beetles recycle organic matter through the decomposition of feces and support ecological balance. However, these insects are threatened by the indiscriminate use of agrochemicals and habitat destruction. Copris tripartitus Waterhouse (Coleoptera: Scarabaeidae), a dung beetle, is listed as a class-II Korean endangered species. Although the genetic diversity of C. tripartitus populations has been investigated through analysis of mitochondrial genes, genomic resources for this species remain limited. In this study, we analyzed the transcriptome of C. tripartitus to elucidate functions related to growth, immunity and reproduction for the purpose of informed conservation planning. RESULTS: The transcriptome of C. tripartitus was generated using next-generation Illumina sequencing and assembled de novo using a Trinity-based platform. In total, 98.59% of the raw sequence reads were processed as clean reads. These reads were assembled into 151,177 contigs, 101,352 transcripts, and 25,106 unigenes. A total of 23,450 unigenes (93.40%) were annotated to at least one database. The largest proportion of unigenes (92.76%) were annotated to the locally curated PANM-DB. A maximum of 5,512 unigenes had homologous sequences in Tribolium castaneum. Gene Ontology (GO) analysis revealed a maximum of 5,174 unigenes in the Molecular function category. Further, in Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, a total of 462 enzymes were associated with established biological pathways. Based on sequence homology to known proteins in PANM-DB, representative immunity, growth, and reproduction-related genes were screened. Potential immunity-related genes were categorized into pattern recognition receptors (PRRs), the Toll-like receptor signaling pathway, the MyD88-dependent pathway, endogenous ligands, immune effectors, antimicrobial peptides, apoptosis, and adaptation-related transcripts. Among PRRs, we conducted detailed in silico characterization of TLR-2, CTL, and PGRP_SC2-like. Repetitive elements such as long terminal repeats, short interspersed nuclear elements, long interspersed nuclear elements and DNA elements were enriched in the unigene sequences. A total of 1,493 SSRs were identified among all unigenes of C. tripartitus. CONCLUSIONS: This study provides a comprehensive resource for analysis of the genomic topography of the beetle C. tripartitus. The data presented here clarify the fitness phenotypes of this species in the wild and provide insight to support informed conservation planning.
Subjects
Beetles; Tribolium; Animals; Beetles/genetics; Gene Expression Profiling; Genes, Mitochondrial; Transcriptome; Reproduction
ABSTRACT
BACKGROUND: Incilaria (= Meghimatium) fruhstorferi is an air-breathing land slug found in restricted habitats of Japan, Taiwan and selected provinces of South Korea (Jeju, Chuncheon, Busan, and Deokjeokdo). The species is in decline due to depletion of forest cover, predation by natural enemies, and collection. To facilitate the conservation of the species, it is important to define a number of traits related to growth, immunity and reproduction that address the fitness advantage of the species. RESULTS: The visceral mass transcriptome of I. fruhstorferi was generated using the Illumina HiSeq 4000 sequencing platform. According to the BUSCO (Benchmarking Universal Single-Copy Orthologs) method, the transcriptome was considered complete, with 91.8% of ortholog genes present (Single: 70.7%; Duplicated: 21.1%). A total of 96.79% of the raw read sequences were processed as clean reads. TransDecoder identified 197,271 contigs that contained candidate coding regions. Of a total of 50,230 unigenes, 34,470 (68.62% of the total unigenes) were annotated to homologous proteins in the Protostome database (PANM-DB). The GO term and KEGG pathway analysis indicated genes involved in metabolism, the phosphatidylinositol signalling system, aminobenzoate degradation, and the T-cell receptor signalling pathway. Many genes associated with molluscan innate immunity were categorized under pathogen recognition receptor, TLR signalling pathway, MyD88-dependent pathway, endogenous ligands, immune effectors, antimicrobial peptides, apoptosis, and adaptation-related. The reproduction-associated unigenes showed homology to protein fem-1, spermatogenesis-associated protein, sperm-associated antigen, and testis-expressed sequences, among others. In addition, we identified key growth-related genes categorized under somatotrophic axis, muscle growth, chitinases and collagens. A total of 4,822 Simple Sequence Repeats (SSRs) were also identified from the unigene sequences of I. fruhstorferi. CONCLUSIONS: This is the first available genomic information for the non-model land slug I. fruhstorferi, focusing on genes related to growth, immunity, and reproduction, with additional focus on microsatellites and repeating elements. The transcriptome provides access to a greater number of traits of unknown relevance in the species that could be exploited for in-depth analyses of evolutionary plasticity and for making informed choices during conservation planning. This would be appropriate for understanding the dynamics of the species on a priority basis considering the ecological, health, and social benefits.
Subjects
Gastropoda/genetics; Animals; DNA/chemistry; Gastropoda/growth & development; Gastropoda/immunology; Gastropoda/metabolism; Gene Expression Profiling; Immunity/genetics; Microsatellite Repeats; Molecular Sequence Annotation; Muscle Development/genetics; Repetitive Sequences, Nucleic Acid; Reproduction/genetics; Sequence Analysis, RNA/standards; Sequence Homology, Nucleic Acid; Sex Determination Processes/genetics
ABSTRACT
The Korean endemic land snail Koreanohadra kurodana (Gastropoda: Bradybaenidae), found in humid areas of broadleaf forests and shrubs, has been considered vulnerable because the number of individuals has been declining in recent years. The species is poorly characterized at the genomic level, which limits the understanding of its functions at the molecular and genetic levels. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset of visceral mass tissue of K. kurodana using Illumina paired-end sequencing technology. Over 234 million quality reads were assembled into a total of 315,924 contigs and 191,071 unigenes, with average and N50 lengths of 585.6 and 715 bp and 678 and 927 bp, respectively. Overall, 36.32% of the unigenes matched known protein/nucleotide sequences in the public databases. The assignment of the unigenes to functional categories was determined using COG, GO, KEGG, and InterProScan protein domain searches. The GO analysis resulted in 22,967 unigenes (12.02%) being categorized into 40 functional groups. The KEGG annotation revealed that metabolism pathway genes were enriched. The most prominent protein motifs included the zinc finger, ribonuclease H, reverse transcriptase, and ankyrin repeat domains. The simple sequence repeats (SSRs) identified from unigenes >1 kb in length showed a predominance of dinucleotide repeat motifs, followed by tri- and tetranucleotide motifs. A number of unigenes were putatively assigned to adaptation and defense mechanisms, including heat shock protein 70, Toll-like receptor 4, AMP-activated protein kinase, and aquaporin-2, among others. Our data provide a rich source for the identification and functional characterization of new genes and candidate polymorphic SSR markers in K. kurodana. The availability of the transcriptome information (http://bioinfo.sch.ac.kr/submission/) would promote the utilization of these resources for phylogenetic studies and genetic diversity assessment.
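The assembly statistics above are reported as mean and N50 lengths. As a minimal sketch of how an N50 is conventionally computed (the length at which the cumulative sum of sequences, sorted longest first, reaches half of the total assembly length), the following may help. The function name `n50` and the toy lengths are illustrative assumptions and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half the
        # assembly is the N50.
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not values from the abstract).
toy_lengths = [200, 300, 400, 500, 600, 1000]
mean_length = sum(toy_lengths) / len(toy_lengths)
print(mean_length, n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the arithmetic mean does, it is typically larger than the mean for assemblies with many short sequences, which is consistent with the pattern in the figures reported above.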
Subjects
Gene Expression Profiling/methods; Microsatellite Repeats; Snails/genetics; Animals; Genetic Variation; High-Throughput Nucleotide Sequencing/methods; Metabolic Networks and Pathways; Molecular Sequence Annotation; Phylogeny; Sequence Analysis, RNA/methods
ABSTRACT
Aegista chejuensis and Aegista quelpartensis (family Bradybaenidae) are endemic to Korea and are considered vulnerable due to declines in their populations. The limited genetic resources for these species restrict the ability to prioritize conservation efforts. We sequenced the transcriptomes of these species using Illumina paired-end technology. Approximately 257 and 240 million reads were obtained and assembled into 198,531 and 230,497 unigenes for A. chejuensis and A. quelpartensis, respectively. The average and N50 unigene lengths were 735.4 and 1073 bp, respectively, for A. chejuensis, and 705.6 and 1001 bp, respectively, for A. quelpartensis. In total, 68,484 (34.5%) and 77,745 (33.73%) unigenes for A. chejuensis and A. quelpartensis, respectively, were annotated to databases. Gene Ontology terms were assigned to 23,778 (11.98%) and 26,396 (11.45%) unigenes for A. chejuensis and A. quelpartensis, respectively, while 5050 and 5838 unigenes were mapped to 117 and 124 pathways in the Kyoto Encyclopedia of Genes and Genomes database. In addition, we identified and annotated 9542 and 10,395 putative simple sequence repeats (SSRs) in unigenes from A. chejuensis and A. quelpartensis, respectively. We designed a list of PCR primers flanking the putative SSR regions. These microsatellites may be utilized for future phylogenetics and conservation initiatives.
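The annotation rates quoted above are simple ratios of annotated unigenes to total unigenes. The short sketch below recomputes them from the counts given in the abstract; the helper name `percent` and the dictionary layout are illustrative assumptions.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100.0 * part / whole, 2)


# Counts taken from the abstract: (total unigenes, annotated, GO-assigned).
counts = {
    "A. chejuensis": (198_531, 68_484, 23_778),
    "A. quelpartensis": (230_497, 77_745, 26_396),
}

for species, (total, annotated, with_go) in counts.items():
    print(species, percent(annotated, total), percent(with_go, total))
```

The recomputed values agree with the percentages stated in the abstract to two decimal places.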
Subjects
Endangered Species; Molecular Sequence Annotation; Sequence Analysis, DNA; Snails/genetics; Transcriptome; Animals; Genes; Microsatellite Repeats
ABSTRACT
Acetyl xylan esterase (AXE), which hydrolyzes the ester linkages of naturally acetylated xylan and is thus known to play an important role in hemicellulose degradation, was isolated from the anaerobic rumen fungus Neocallimastix frontalis PMA02, heterologously expressed in Escherichia coli (E. coli) and characterized. The full-length cDNA encoding NfAXE1 was 1,494 bp, of which 978 bp constituted an open reading frame. The estimated molecular weight of NfAXE1 was 36.5 kDa with 326 amino acid residues, and the calculated isoelectric point was 4.54. The secondary protein structure was predicted to consist of nine α-helices and 12 β-strands. The enzyme expressed in E. coli had the highest activity at 40°C and pH 8. The purified recombinant NfAXE1 had a specific activity of 100.1 U/mg when p-nitrophenyl acetate (p-NA) was used as a substrate at 40°C, the optimum temperature. The amount of liberated acetic acid was highest when p-NA was used as the substrate and lowest when acetylated birchwood xylan was used. The amount of xylose released from acetylated birchwood xylan increased 1.4-fold when NfAXE1 was mixed with xylanase in a reaction cocktail, implying a synergistic effect of NfAXE1 with xylanase on hemicellulose degradation.
ABSTRACT
The mammalian Y chromosome has unique characteristics compared with the autosomes or X chromosomes. Here we report the finished sequence of the chimpanzee Y chromosome (PTRY), including 271 kb of the Y-specific pseudoautosomal region 1 and 12.7 Mb of the male-specific region of the Y chromosome. Greater sequence divergence between the human Y chromosome (HSAY) and PTRY (1.78%) than between their respective whole genomes (1.23%) confirmed the accelerated evolutionary rate of the Y chromosome. Each of the 19 PTRY protein-coding genes analyzed had at least one nonsynonymous substitution, and 11 genes had higher nonsynonymous substitution rates than synonymous ones, suggesting relaxation of selective constraint, positive selection or both. We also identified lineage-specific changes, including deletion of a 200-kb fragment from the pericentromeric region of HSAY, expansion of young Alu families in HSAY and accumulation of young L1 elements and long terminal repeat retrotransposons in PTRY. Reconstruction of the common ancestral Y chromosome reflects the dynamic changes in our genomes in the 5-6 million years since speciation.
Subjects
Chromosomes, Human, Y/genetics; Evolution, Molecular; Pan troglodytes/genetics; Y Chromosome/genetics; Animals; Humans; Male; Molecular Sequence Data; Repetitive Sequences, Nucleic Acid/genetics; Sequence Alignment; Sequence Analysis, DNA; Synteny/genetics
ABSTRACT
The Lycaenidae butterflies Protantigius superans and Spindasis takanosis are endangered insects in Korea known for their symbiotic association with ants. However, the necessary genomic and transcriptomic data are lacking for these species, limiting conservation efforts. In this study, the P. superans and S. takanosis transcriptomes were deciphered using Illumina HiSeq 2500 sequencing. The P. superans and S. takanosis transcriptome data included a total of 254,340,693 and 245,110,582 clean reads assembled into 159,074 and 170,449 contigs and 107,950 and 121,140 unigenes, respectively. BLASTX hits (E-value of 1.0 × 10⁻⁵) against the known protein databases annotated a total of 46,754 and 51,908 transcripts for P. superans and S. takanosis, respectively. Approximately 41.25% and 38.68% of the unigenes for P. superans and S. takanosis, respectively, found homologous sequences in the Protostome DB (PANM-DB). BLAST2GO analysis confirmed 18,611 unigenes representing Gene Ontology (GO) terms and a total of 5259 unigenes assigned to 116 pathways for P. superans. For S. takanosis, a total of 6697 unigenes were assigned to 119 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Additionally, 382,164 and 390,516 Simple Sequence Repeats (SSRs) were compiled from the unigenes of P. superans and S. takanosis, respectively. This is the first report to record new genes and their utilization for the conservation of lycaenid species populations and as reference information for closely related species.
Subjects
Butterflies/genetics; Endangered Species; High-Throughput Nucleotide Sequencing/methods; Transcriptome/genetics; Animals; Cluster Analysis; Databases, Nucleic Acid; Gene Expression Profiling; Gene Ontology; Insect Proteins/chemistry; Insect Proteins/genetics; Microsatellite Repeats/genetics; Molecular Sequence Annotation; Nucleotide Motifs/genetics; Protein Structure, Tertiary; Sequence Homology, Nucleic Acid; Species Specificity
ABSTRACT
BACKGROUND: The Bradybaenidae snail Karaftohelix adamsi is endemic to Korea, with the species tracked from Ulleung Island in North Gyeongsang Province of South Korea. K. adamsi has been classified as an Endangered Wildlife Class II species of Korea and faces a severe risk of extinction following habitat disturbances. With no information available at the DNA (genome) or mRNA (transcriptome) level for the species, conservation using informed molecular resources seems difficult. OBJECTIVE: In this study, we used Illumina short-read sequencing and Trinity de novo assembly to draft the reference transcriptome of K. adamsi. RESULTS: After assembly, 13,753 unigenes were obtained, of which 10,511 were annotated to public databases (a maximum of 10,165 unigenes found homologs in PANM-DB). A total of 6,351, 3,535, 358, and 3,407 unigenes were ascribed to the functional categories under KOG, GO, KEGG, and IPS, respectively. Transcripts such as HSP 70, aquaporin, TLR, and MAPK, among others, were screened as putative functional resources for adaptation. DNA transposons were found to be more densely represented than retrotransposons in the assembled unigenes. Further, 2,164 SSRs were screened, with a widespread presence of dinucleotide repeats such as AC/GT and AG/CT. CONCLUSION: The transcriptome-guided discovery of molecular resources in K. adamsi will not only serve as a basis for functional genomics studies but also provide sustainable tools to be utilized for the protection of the species in the wild. Moreover, the development of polymorphic SSRs is valuable for the identification of species from newer habitats and for cross-species genotyping.
Subjects
Endangered Species; Microsatellite Repeats; Snails; Transcriptome; Animals; Microsatellite Repeats/genetics; Snails/genetics; Transcriptome/genetics; Republic of Korea; Molecular Sequence Annotation; Genetic Fitness
ABSTRACT
A conjoined gene is defined as one formed at the time of transcription by combining at least part of one exon from each of two or more distinct genes that lie on the same chromosome, in the same or opposite orientation, which translate independently into different proteins. We comparatively studied the extent of conjoined genes in thirteen genomes by analyzing the public databases of expressed sequence tags and mRNA sequences using a set of computational tools designed to identify conjoined genes on the same DNA strand or opposite DNA strands of the same genomic locus. The CACG database, available at http://cgc.kribb.re.kr/map/, includes a number of conjoined genes (7131-human, 2-chimpanzee, 5-orangutan, 57-chicken, 4-rhesus monkey, 651-cow, 27-dog, 2512-mouse, 263-rat, 1482-zebrafish, 5-horse, 29-sheep, and 8-medaka) and is very effective and easy to use to analyze the evolutionary process of conjoined genes when comparing different species.
Subjects
Computational Biology/methods; Databases, Genetic; RNA Splicing/genetics; Animals; Cattle; Chickens; Dogs; Exons/genetics; Expressed Sequence Tags; Genome; Genomics; Horses; Humans; Macaca mulatta; Mice; Oryzias; Pan troglodytes; Phylogeny; Pongo; RNA, Messenger/genetics; Rats; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, RNA; Sheep; Zebrafish
ABSTRACT
BACKGROUND: Ticks are ectoparasites capable of directly damaging their hosts and transmitting vector-borne diseases. The ixodid tick Haemaphysalis flava has a broad distribution that extends from East to South Asia. This tick is a reservoir of severe fever with thrombocytopenia syndrome virus (SFTSV), which causes severe hemorrhagic disease, with cases reported from China, Japan and South Korea. Recently, the distribution of H. flava in South Korea was found to overlap with the occurrence of SFTSV. METHODS: This study was undertaken to discover the molecular resources of H. flava female ticks using the Illumina HiSeq 4000 system, the Trinity de novo sequence assembler and annotation against public databases. The locally curated Protostome database (PANM-DB) was used to screen the putative adaptation-related transcripts classified into gene families such as angiotensin-converting enzyme, aquaporin, adenylate cyclase, AMP-activated protein kinase, glutamate receptors, heat shock proteins, molecular chaperones, insulin receptor, mitogen-activated protein kinase and solute carrier family proteins. In addition, repeats and simple sequence repeats (SSRs) were screened from the unigenes using the RepeatMasker (v4.0.6) and MISA (v1.0) software tools, followed by the design of SSR-flanking primers using BatchPrimer 3 (v1.0) software. RESULTS: The transcriptome produced a total of 69,822 unigenes, of which 46,175 were annotated to homologous proteins in the PANM-DB. The unigenes were also mapped to the EuKaryotic Orthologous Groups (KOG), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) specializations. A widespread presence of protein kinase, zinc finger (C2H2-type), reverse transcriptase, and RNA recognition motif domains was observed in the unigenes. A total of 3480 SSRs were screened, of which 1907 and 1274 were tri- and dinucleotide repeats, respectively. A list of primer sequences flanking the SSR motifs was detailed for validation of polymorphism in H. flava and related tick species. CONCLUSIONS: The reference transcriptome information on H. flava female ticks will be useful for an enriched understanding of tick biology, its competency to act as a vector and the study of species diversity related to disease transmission.
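The SSR screening step described above can be illustrated with a small regular-expression scan for perfect di-, tri- and tetranucleotide repeats. This is only a sketch of the general technique and is not the MISA implementation; the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, since the abstract does not state the thresholds that were used.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in a DNA sequence."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ACAC"), so each SSR is reported by its shortest motif.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Toy sequence containing (AC)7 and (CAG)5; hypothetical, not from the dataset.
example = "TTG" + "AC" * 7 + "GGT" + "CAG" * 5 + "TT"
print(find_ssrs(example))
```

In a real workflow, the reported start positions and motif lengths would then be used to extract flanking sequence for primer design, which is the role BatchPrimer 3 plays in the study.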
Subjects
Gene Expression Profiling; Ixodidae; Female; Animals; Molecular Sequence Annotation; Transcriptome; Genome; Ixodidae/genetics; Microsatellite Repeats
ABSTRACT
Transcriptome studies for the conservation of endangered mollusks are a proactive approach towards managing the threats and uncertainties facing these species in natural environments. The populations of these species are declining due to habitat destruction, illicit wildlife trade, and global climate change. These activities put at risk the free movement of species across the wild landscape, lead to loss of breeding grounds, and restrict the display of the physiological attributes so crucial for faunal welfare. Gastropods face the most negative ecological effects and have been listed under Korea's protective species consortium based on their population dynamics in the last few years. Moreover, with the genetic resources restricted for such species, conservation by informed planning is not possible. This review provides insights into the activities under the threatened species initiative of Korea, with special reference to the transcriptome assemblies of endangered mollusks. Gastropods such as Ellobium chinense, Aegista chejuensis, Aegista quelpartensis, Incilaria fruhstorferi, Koreanohadra kurodana, Satsuma myomphala, and Clithon retropictus are represented. Moreover, the transcriptome summaries of the bivalve Cristaria plicata and the caenogastropod Charonia lampas sauliae are also discussed. Sequencing, de novo assembly, and annotation identified transcripts or homologs for the species, which, based on an understanding of the biochemical and molecular pathways, were ascribed to predictive gene functions. Mining for simple sequence repeats from the transcriptomes has successfully assisted genetic polymorphism studies. A comparison of the transcriptome scheme of Korean endangered mollusks with the genomic resources of other endangered mollusks is discussed, with homologies and analogies for guiding future research.
Subjects
Gastropoda; Transcriptome; Animals; Transcriptome/genetics; Endangered Species; Gastropoda/genetics; Genome; Republic of Korea
ABSTRACT
Lactobacillus fructivorans is important in the generation of particular flavors and in other ripening processes associated with fermented food. Here, we present the draft genome sequence of the type strain Lactobacillus fructivorans KCTC 3543 (1,373,326 bp, with a G+C content of 38.9%), which consists of 5 scaffolds. The genome sequence was obtained by using a whole-genome shotgun strategy with Roche 454 GS (FLX Titanium) pyrosequencing, and all of the reads were assembled using Newbler Assembler 2.3.
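The G+C content quoted above is the percentage of guanine and cytosine bases among the unambiguous bases of the assembly. A minimal sketch of that calculation follows; the function name `gc_content` and the toy sequences are illustrative assumptions, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_content(seq):
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in bases)
    return 100.0 * gc / len(bases)


# Toy sequences (hypothetical, not taken from the genome).
print(gc_content("ATGC"))
print(gc_content("GCNNAT"))
```

For a multi-scaffold draft genome, the same ratio would be computed over the concatenated scaffold sequences rather than averaged per scaffold.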
Subjects
Genome, Bacterial; Lactobacillus/genetics; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Base Sequence; Food Microbiology; Gene Expression Regulation, Bacterial; Lactobacillus/classification; Molecular Sequence Data
ABSTRACT
The novel Sporolactobacillus vineae SL153(T) strain shows excellent intestinal adherence and a growth-inhibitory effect on pathogenic microorganisms, including microorganisms of the genus Vibrio, and can therefore be used effectively for the prevention and treatment of diseases caused by pathogenic microorganisms. Here, we present the first report of the draft genome sequence of a novel species in the genus Sporolactobacillus.
Subjects
Bacillales/genetics; Genome, Bacterial; Probiotics/isolation & purification; Bacillales/classification; Bacillales/isolation & purification; Base Sequence; Molecular Sequence Data; Probiotics/classification; Republic of Korea; Soil Microbiology
ABSTRACT
A new Peptoniphilus species has been isolated from samples from a patient who was scheduled for endoscopic sinus surgery for chronic rhinosinusitis. The isolate, Peptoniphilus rhinitidis 1-13(T) (KCTC 5985(T)), can use peptone as a sole carbon source and produce butyrate as a metabolic end product. This is the first report of the draft genome sequence of a novel species in the genus Peptoniphilus within the group of Gram-positive anaerobic cocci.
Subjects
Genome, Bacterial; Gram-Positive Cocci/genetics; Adult; Anaerobiosis; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Gene Expression Regulation, Bacterial/physiology; Gram-Positive Cocci/physiology; Humans; Male; Middle Aged; Molecular Sequence Data
ABSTRACT
Fusobacterium nucleatum is classified into five subspecies. F. nucleatum ChDC F128 was isolated from a periodontitis lesion and proposed as a new subspecies based on the comparison of the nucleotide sequences of the RNA polymerase beta subunit and zinc protease genes. Here, we report the draft genome sequence of the strain.
Subjects
Fusobacterium Infections/microbiology; Fusobacterium nucleatum/genetics; Genome, Bacterial; Periodontitis/microbiology; Fusobacterium nucleatum/classification; Humans; Molecular Sequence Data
ABSTRACT
Fusobacterium nucleatum, one of the major causative bacteria of periodontitis, is classified into five subspecies (nucleatum, polymorphum, vincentii, animalis, and fusiforme) on the basis of several phenotypic characteristics and DNA homology. This is the first report of the draft genome sequence of F. nucleatum subsp. fusiforme ATCC 51190(T).
Subjects
Fusobacterium nucleatum/classification; Fusobacterium nucleatum/genetics; Genome, Bacterial; Molecular Sequence Data
ABSTRACT
A new Myroides species has been isolated from the urine of a patient with fever in spite of multiple antibiotic treatments who had undergone a radical hysterectomy for cervical cancer and percutaneous nephrostomies for hydronephrosis in the past. The isolate, Myroides injenensis M09-0166(T) (KCTC 23367(T)), showed a high level of resistance to multiple antibiotic agents. Here we provide the first report of the draft genome sequence of a novel species in the genus Myroides within the nonfermenting Gram-negative group.
Subjects
Flavobacterium/genetics; Genome, Bacterial; Anti-Bacterial Agents; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Drug Resistance, Multiple, Bacterial; Female; Flavobacterium/classification; Flavobacterium/drug effects; Gene Expression Regulation, Bacterial/physiology; Humans; Molecular Sequence Data; Species Specificity
ABSTRACT
A new Clostridium species has been isolated from pear orchard soil in Daejeon, Republic of Korea. The isolate, Clostridium arbusti SL206(T) (KCTC 5449(T)), showed nitrogenase activity as well as organic acid production. Here we present the first report of the draft genome sequence of a novel species in the genus Clostridium within the largest Gram-positive group.
Subjects
Clostridium/classification; Clostridium/genetics; Genome, Bacterial; Molecular Sequence Data; Republic of Korea; Soil Microbiology
ABSTRACT
Recently, conjoined genes (CGs) have emerged as important genetic factors necessary for understanding the human genome. However, their formation mechanism and precise structures have remained mysterious. Based on a detailed structural analysis of 57 human CG transcript variants (CGTVs, discovered in this study) and all (833) known CGs in the human genome, we discovered that the poly(A) signal site from the upstream parent gene region is completely removed via the skipping or truncation of the final exon; consequently, CG transcription is terminated at the poly(A) signal site of the downstream parent gene. This result led us to propose a novel mechanism of CG formation: the complete removal of the poly(A) signal site from the upstream parent gene is a prerequisite for the CG transcriptional machinery to continue transcribing uninterrupted into the intergenic region and downstream parent gene. The removal of the poly(A) signal sequence from the upstream gene region appears to be caused by a deletion or truncation mutation in the human genome rather than post-transcriptional trans-splicing events. With respect to the characteristics of CG sequence structures, we found that intergenic regions are hot spots for novel exon creation during CGTV formation and that exons farther from the intergenic regions are more highly conserved in the CGTVs. Interestingly, many novel exons newly created within the intergenic and intragenic regions originated from transposable element sequences. Additionally, the CGTVs showed tumor tissue-biased expression. In conclusion, our study provides novel insights into the CG formation mechanism and expands the present concepts of the genetic structural landscape, gene regulation, and gene formation mechanisms in the human genome.
Subjects
Exons; Genome, Human; Mutagenesis; Mutant Chimeric Proteins/genetics; 3' Untranslated Regions; Alternative Splicing; Base Sequence; Cloning, Molecular; HEK293 Cells; Humans; Mutant Chimeric Proteins/metabolism; Neoplasms/metabolism; Polyadenylation; RNA, Messenger/genetics; Reverse Transcriptase Polymerase Chain Reaction; Sequence Deletion; Transcription, Genetic
ABSTRACT
Chromosome 11, although average in size, is one of the most gene- and disease-rich chromosomes in the human genome. Initial gene annotation indicates an average gene density of 11.6 genes per megabase, including 1,524 protein-coding genes, some of which were identified using novel methods, and 765 pseudogenes. One-quarter of the protein-coding genes shows overlap with other genes. Of the 856 olfactory receptor genes in the human genome, more than 40% are located in 28 single- and multi-gene clusters along this chromosome. Out of the 171 disorders currently attributed to the chromosome, 86 remain for which the underlying molecular basis is not yet known, including several mendelian traits, cancer and susceptibility loci. The high-quality data presented here--nearly 134.5 million base pairs representing 99.8% coverage of the euchromatic sequence--provide scientists with a solid foundation for understanding the genetic basis of these disorders and other biological phenomena.