ABSTRACT
BACKGROUND: Epstein-Barr virus (EBV) is an important human pathogenic gammaherpesvirus with carcinogenic potential. The EBV transcriptome has previously been analyzed using both Illumina-based short-read sequencing and Pacific Biosciences RS II-based long-read sequencing technologies. Since the various sequencing methods have distinct strengths and limitations, the use of multiplatform approaches has proven to be valuable. The aim of this study is to provide a more complete picture of the transcriptomic architecture of EBV. METHODS: In this work, we apply the Oxford Nanopore Technologies MinION long-read sequencing (LRS) platform to generate novel transcriptomic data, and integrate these with data generated by others using another LRS approach (Pacific Biosciences RS II sequencing) and the Illumina CAGE-Seq and Poly(A)-Seq approaches. Both amplified and non-amplified cDNA sequencing were applied for the generation of sequencing reads, using both oligo-d(T)-primed and random oligonucleotide-primed reverse transcription. EBV transcripts are identified and annotated using the LoRTIA software suite developed in our laboratory. RESULTS: This study detected novel genes embedded into longer host genes containing 5'-truncated in-frame open reading frames, which potentially encode N-terminally truncated proteins. We also detected a number of novel non-coding RNAs and transcript length isoforms encoded by the same genes but differing in their start and/or end sites. This study also reports the discovery of novel splice isoforms, many of which may have altered coding potential, and of novel replication-origin-associated transcripts. Additionally, novel mono- and multigenic transcripts were identified. An intricate meshwork of transcriptional overlaps was revealed. CONCLUSIONS: An integrative approach applying multiple sequencing technologies is suitable for the reliable identification of complex transcriptomes because each technique has different advantages and limitations, and the techniques can be used to validate the results obtained by any one approach.
Subjects
Epstein-Barr Virus Infections , Transcriptome , Epstein-Barr Virus Infections/genetics , Gene Expression Profiling , Herpesvirus 4, Human/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Open Reading Frames
ABSTRACT
BACKGROUND: Alternative polyadenylation is commonly examined using cDNA sequencing, which is known to be affected by template-switching artifacts. However, the effects of such template-switching artifacts on alternative polyadenylation are generally disregarded, and polyadenylation artifacts are instead attributed to internal priming. RESULTS: Here, we analyzed both long-read cDNA sequencing and direct RNA sequencing data of two organisms, generated by different sequencing platforms. We developed a filtering algorithm that takes into account that template switching can be a source of artifactual polyadenylation when filtering out spurious polyadenylation sites. The algorithm outperformed the conventional internal priming filters when compared against direct RNA sequencing data. We also showed that polyadenylation artifacts arise in cDNA sequencing at consecutive stretches of as few as three adenines. There was no substantial difference between the lengths of poly(A) tails at the artifactual and the true transcriptional end sites, even though internal priming artifacts are expected to have shorter poly(A) tails than genuine polyadenylated reads. CONCLUSIONS: Our findings suggest that template switching plays an important role in the generation of spurious polyadenylation. They support either more rigorous filtering of artifactual polyadenylation sites in cDNA data or the annotation of alternative polyadenylation using native RNA sequencing.
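As a rough illustration of what an adenine-run check at a candidate polyadenylation site might look like, the sketch below flags sites whose downstream genomic sequence contains a run of at least three consecutive adenines, the minimum length the abstract reports as sufficient to produce artifacts. This is not the authors' filtering algorithm, which is not described here. The function names, the 10-nt window and the plus-strand, 0-based coordinate convention are assumptions made for the example.

```python
def longest_a_run(seq: str) -> int:
    """Return the length of the longest run of consecutive 'A' bases in seq."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base == "A" else 0
        best = max(best, run)
    return best


def flag_suspect_sites(genome: str, sites: list[int],
                       window: int = 10, min_run: int = 3) -> dict[int, bool]:
    """Map each candidate site to True if it looks like a possible artifact.

    Assumptions (hypothetical, for illustration only): sites are 0-based
    plus-strand positions of the last templated base, and the genomic
    window immediately downstream of the site is inspected for A-runs.
    """
    return {
        pos: longest_a_run(genome[pos + 1: pos + 1 + window]) >= min_run
        for pos in sites
    }


# Toy sequence: position 6 is followed by "AAAA", position 20 is not.
toy_genome = "GCTAGCTAAAAGCTTGCATGCCTGCAGGTCGACT"
print(flag_suspect_sites(toy_genome, [6, 20]))
```

A site flagged this way is only a candidate artifact. According to the abstract, comparison with direct RNA sequencing data is what distinguishes genuine transcriptional end sites from spurious ones.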
Subjects
Polyadenylation , Artifacts , DNA, Complementary/genetics , Sequence Analysis, DNA , Transcription, Genetic
ABSTRACT
BACKGROUND: Pseudorabies virus (PRV) is the causative agent of Aujeszky's disease, which gives rise to significant economic losses worldwide. Many countries have implemented national programs for the eradication of this virus. In this study, long-read sequencing was used to determine the nucleotide sequence of the genome of a novel PRV strain (PRV-MdBio) isolated in Serbia. RESULTS: A novel PRV strain was isolated and characterized. PRV-MdBio was found to exhibit growth properties similar to those of another wild-type PRV, strain Kaplan. Single-molecule real-time (SMRT) sequencing revealed that the new strain differs significantly in base composition even from strain Kaplan, to which it otherwise exhibits the highest similarity. We compared the genetic composition of PRV-MdBio to that of strain Kaplan and the China reference strain Ea and found that radical base replacements were the most common point mutations, followed by conservative and silent mutations. We also found that the adaptation of PRV to cell culture does not lead to any directional genetic alteration in the viral genome. CONCLUSION: PRV-MdBio is a wild-type virus that differs in base composition from other PRV strains to a relatively large extent.
ABSTRACT
BACKGROUND: Varicella zoster virus (VZV) is a human pathogenic alphaherpesvirus harboring a relatively large DNA molecule. The VZV transcriptome has already been analyzed by microarray and short-read sequencing analyses. However, both approaches have substantial limitations when used for structural characterization of transcript isoforms, even if supplemented with primer extension or other techniques. Among other shortcomings, they are inefficient at distinguishing between embedded RNA molecules and between transcript isoforms, including splice and length variants, as well as between alternative polycistronic transcripts. It has been demonstrated in several studies that long-read sequencing is able to circumvent these problems. RESULTS: In this work, we report the analysis of the VZV lytic transcriptome using the Oxford Nanopore Technologies sequencing platform. These investigations have led to the identification of 114 novel transcripts, including mRNAs, non-coding RNAs, polycistronic RNAs and complex transcripts, as well as 10 novel spliced transcripts and 25 novel transcription start site isoforms and transcription end site isoforms. A novel class of transcripts, the nroRNAs, is described in this study. These transcripts are encoded by the genomic region located in close vicinity to the viral replication origin. We also show that ORF63 exhibits a complex structural variation encompassing the splice sites of VZV latency transcripts. Additionally, we have detected RNA editing in a novel non-coding RNA molecule. CONCLUSIONS: Our investigations disclosed a composite transcriptomic architecture of VZV, including the discovery of novel RNA molecules and transcript isoforms, as well as a complex meshwork of transcriptional read-throughs and overlaps. The results represent a substantial advance in the annotation of the VZV transcriptome and in understanding the molecular biology of the herpesviruses in general.
Subjects
Herpesvirus 3, Human/genetics , Transcriptome , Cell Line , Humans , Open Reading Frames/genetics , Protein Isoforms/genetics , RNA Editing , RNA Splicing , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA, Viral/isolation & purification , RNA, Viral/metabolism , Sequence Analysis, DNA , Transcription Initiation Site , Viral Proteins/genetics
ABSTRACT
Pseudorabies virus (PRV) is an animal alphaherpesvirus with a wide host range. PRV has 67 protein-coding genes and several non-coding RNA molecules, which can be classified into three temporal classes: immediate-early, early and late. The ul54 gene of PRV and its homolog icp27 of herpes simplex virus have a multitude of functions, including the regulation of viral DNA synthesis and the control of gene expression. Therefore, abrogation of PRV ul54 function was expected to exert a significant effect on the global transcriptome and on DNA replication. Real-time PCR and real-time RT-PCR platforms were used to investigate these presumed effects. Our analyses revealed a drastic impact of the ul54 mutation on the genome-wide expression of PRV genes, especially on the transcription of the true late genes. A delay of more than two hours was observed in the onset of DNA replication, and the amount of synthesized DNA was significantly decreased in comparison to the wild-type virus. Furthermore, in this work, we demonstrated the utility of long-read SMRT sequencing for genotyping mutant viruses.
Subjects
DNA Replication/physiology , DNA, Viral/physiology , Gene Deletion , Herpesvirus 1, Suid/physiology , Animals , Cell Line , Genotype , RNA, Viral/genetics , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction , Swine , Viral Proteins
ABSTRACT
The pseudorabies virus (PRV; also known as Suid herpesvirus-1) is a neurotropic herpesvirus of swine. The us7 and us8 genes of this virus encode the glycoprotein I and E membrane proteins, which form a heterodimer that is known to control cell-to-cell spread in tissue culture and in animals. In this study, we investigated the effect of the deletion of the PRV us7 and us8 genes on genome-wide transcription and DNA replication using a multi-time-point quantitative reverse transcriptase-based real-time PCR technique. Abrogation of the us7/8 gene function was found to exert a drastic but differential effect on the expression of PRV genes during lytic infection. In the mutant virus, all kinetic classes of viral genes were significantly down-regulated during the first 6 h of infection but up-regulated later. The level of up-regulation was the highest in the immediate-early (IE) and the early (E) genes, lower in the early-late (E/L) genes, and the lowest in the late (L) genes. The relative contribution of the L transcripts to the global transcriptome became lower, while the rest of the transcripts were expressed at a higher level in the mutant than in the wild-type virus.
Subjects
Gene Deletion , Gene Expression Regulation, Viral , Herpesvirus 1, Suid/genetics , Pseudorabies/virology , Swine Diseases/virology , Viral Proteins/genetics , Animals , DNA Replication , Herpesvirus 1, Suid/metabolism , Swine , Viral Proteins/metabolism
ABSTRACT
BACKGROUND: Pseudorabies virus is a widely studied model organism of the Herpesviridae family, with a compact genome arrangement of 72 known coding sequences. In order to obtain an up-to-date genetic map of the virus, a combination of RNA-sequencing approaches was applied, as recent advancements in high-throughput sequencing methods have provided a wealth of information on novel RNA species and transcript isoforms, revealing additional layers of transcriptome complexity in several viral species. RESULTS: The total RNA content and polyadenylation landscape of pseudorabies virus were characterized for the first time at high coverage by Illumina high-throughput sequencing of cDNA samples collected during the lytic infectious cycle. As anticipated, nearly all of the viral genome was transcribed, with the exception of loci in the large internal and terminal repeats, and several small intergenic repetitive sequences. Our findings included a small novel polyadenylated non-coding RNA near an origin of replication, and the single-base resolution mapping of 3' UTRs across the viral genome. Alternative polyadenylation sites were found in a number of genes and a novel alternative splice site was characterized in the ep0 gene, while previously known splicing events were confirmed, yielding no alternative splice isoforms. Additionally, we detected the active polyadenylation of transcripts earlier believed to be transcribed as part of polycistronic RNAs. CONCLUSION: To the best of our knowledge, the present work has furnished the highest-resolution transcriptome map of an alphaherpesvirus to date, and reveals further complexities of viral gene expression, with the identification of novel transcript boundaries, alternative splicing of the key transactivator EP0, and a highly abundant, novel non-coding RNA near the lytic replication origin. These advances provide a detailed genetic map of PRV for future research.
Subjects
Gene Expression Profiling/methods , Herpesvirus 1, Suid/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Genome, Viral , Molecular Sequence Data , Polyadenylation , RNA Splice Sites , RNA, Messenger/analysis , RNA, Viral/analysis
ABSTRACT
The architectural high mobility group box 1 (Hmgb1) protein acts as both a nuclear and an extracellular regulator of various biological processes, including skeletogenesis. Here we report its contribution to the evolutionarily conserved, distinctive regulation of the matrilin-1 gene (Matn1) expression in amniotes. We previously demonstrated that uniquely assembled proximal promoter elements restrict Matn1 expression to specific growth plate cartilage zones by allowing varying doses of L-Sox5/Sox6 and Nfi proteins to fine-tune their Sox9-mediated transactivation. Here, we dissected the regulatory mechanisms underlying the activity of a conserved distal promoter element 1. We show that this element carries three Sox-binding sites, works as an enhancer in vivo, and allows promoter activation by the Sox5/6/9 chondrogenic trio. In early steps of chondrogenesis, declining Hmgb1 expression overlaps with the onset of Sox9 expression. Unlike repression in late steps, Hmgb1 overexpression in early chondrogenesis increases Matn1 promoter activation by the Sox trio, and forced Hmgb1 expression in COS-7 cells facilitates induction of Matn1 expression by the Sox trio. The conserved Matn1 control elements bind Hmgb1 and SOX9 with opposite efficiency in vitro. They show higher HMGB1 than SOX trio occupancy in established chondrogenic cell lines, and HMGB1 silencing greatly increases MATN1 and COL2A1 expression. Together, these data thus suggest a model whereby Hmgb1 helps recruit the Sox trio to the Matn1 promoter and thereby facilitates activation of the gene in early chondrogenesis. We anticipate that Hmgb1 may similarly affect transcription of other cartilage-specific genes.
Subjects
Chondrogenesis/genetics , HMGB1 Protein/metabolism , Matrilin Proteins/genetics , Promoter Regions, Genetic/genetics , SOX9 Transcription Factor/metabolism , SOXD Transcription Factors/metabolism , Animals , Binding Sites , Blotting, Western , COS Cells , Cells, Cultured , Chick Embryo , Chlorocebus aethiops , Chondrocytes/cytology , Chondrocytes/metabolism , Chromatin Immunoprecipitation , Electrophoretic Mobility Shift Assay , Fluorescent Antibody Technique , HMGB1 Protein/genetics , Humans , Matrilin Proteins/metabolism , Mesoderm/cytology , Mesoderm/metabolism , Mice , Mice, Inbred C57BL , Mice, Inbred CBA , Mice, Transgenic , RNA, Messenger/genetics , Rats , Real-Time Polymerase Chain Reaction , Response Elements/genetics , Reverse Transcriptase Polymerase Chain Reaction , SOX9 Transcription Factor/genetics , SOXD Transcription Factors/genetics
ABSTRACT
In this study, we employed short- and long-read sequencing technologies to delineate the transcriptional architecture of the human monkeypox virus and to identify key regulatory elements that govern its gene expression. Specifically, we conducted a transcriptomic analysis to annotate the transcription start sites (TSSs) and transcription end sites (TESs) of the virus by utilizing Cap Analysis of Gene Expression sequencing on the Illumina platform and direct RNA sequencing on an Oxford Nanopore Technologies device. Our investigations uncovered significant complexity in the use of alternative TSSs and TESs in viral genes. In this research, we also detected the promoter elements and poly(A) signals associated with the viral genes. Additionally, we identified novel genes in both the left and right variable regions of the viral genome. IMPORTANCE: In general, understanding how the transcription of a virus is regulated offers insight into the key mechanisms that control its life cycle. The recent outbreak of the human monkeypox virus has underscored the necessity of understanding the basic biology of its causative agent. Our results are pivotal for constructing a comprehensive transcriptomic atlas of the human monkeypox virus, providing valuable resources for future studies.
Subjects
Sequence Analysis, RNA , Transcription Initiation Site , Transcriptome , Humans , Sequence Analysis, RNA/methods , Monkeypox virus/genetics , Gene Expression Profiling , Genome, Viral , Promoter Regions, Genetic , High-Throughput Nucleotide Sequencing , RNA, Viral/genetics
ABSTRACT
This study employed both short-read sequencing (SRS, Illumina) and long-read sequencing (LRS, Oxford Nanopore Technologies) platforms to conduct a comprehensive analysis of the equid alphaherpesvirus 1 (EHV-1) transcriptome. The study involved the annotation of canonical mRNAs and their transcript variants, encompassing transcription start site (TSS) and transcription end site (TES) isoforms, in addition to alternative splicing forms. Furthermore, the study revealed the presence of numerous non-coding RNA (ncRNA) molecules, including intergenic and antisense transcripts, produced by EHV-1. An intriguing finding was the abundant production of chimeric transcripts, some of which potentially encode fusion polypeptides. Moreover, EHV-1 exhibited a greater incidence of transcriptional overlaps and splicing compared to related viruses. It is noteworthy that many genes have their own unique TESs in addition to the co-terminal transcription ends, a characteristic scarcely seen in other alphaherpesviruses. The study also identified transcripts that overlap the replication origins of the virus. Moreover, a novel ncRNA, referred to as NOIR, was found to intersect with the 5'-ends of the longer transcript isoforms specified by the major transactivator genes ORF64 and ORF65, surrounding the OriL. These findings together imply the existence of a key regulatory mechanism that governs both transcription and replication through, among others, a process that involves interference between the DNA and RNA synthesis machineries.
ABSTRACT
Long-read sequencing (LRS) techniques enable the identification of full-length RNA molecules in a single run eliminating the need for additional assembly steps. LRS research has exposed unanticipated transcriptomic complexity in various organisms, including viruses. Herpesviruses are known to produce a range of transcripts, either close to or overlapping replication origins (Oris) and neighboring genes related to transcription or replication, which possess confirmed or potential regulatory roles. In our research, we employed both new and previously published LRS and short-read sequencing datasets to uncover additional Ori-proximal transcripts in nine herpesviruses from all three subfamilies (alpha, beta and gamma). We discovered novel long non-coding RNAs, as well as splice and length isoforms of mRNAs. Moreover, our analysis uncovered an intricate network of transcriptional overlaps within the examined genomic regions. We demonstrated that herpesviruses display distinct patterns of transcriptional overlaps in the vicinity of or at the Oris. Our findings suggest the existence of a 'super regulatory center' in the genome of alphaherpesviruses that governs the initiation of both DNA replication and global transcription through multilayered interactions among the molecular machineries.
Subjects
Herpesviridae , Replication Origin , Replication Origin/genetics , Herpesviridae/genetics , Transcriptome , Gene Expression Profiling , Genomics
ABSTRACT
The recent human monkeypox outbreak underlined the importance of studying the basic biology of orthopoxviruses. However, the transcriptome of its causative agent has not previously been investigated with either short- or long-read sequencing approaches. This Oxford Nanopore long-read RNA-sequencing dataset fills this gap. It will enable the in-depth characterization of the transcriptomic architecture of the monkeypox virus, and may even make it possible to annotate novel host transcripts. Moreover, our direct cDNA and native RNA sequencing reads will allow the estimation of gene expression changes of both the virus and the host cells during the infection. Overall, our study will lead to a deeper understanding of the alterations caused by the viral infection at the transcriptome level.
Subjects
Mpox , Nanopore Sequencing , Humans , DNA, Complementary , Gene Expression Profiling , Transcriptome
ABSTRACT
In this study, two long-read sequencing (LRS) techniques, MinION from Oxford Nanopore Technologies and Sequel from Pacific Biosciences, were used for the transcriptional characterization of a prototype baculovirus, Autographa californica multiple nucleopolyhedrovirus. LRS is able to read full-length RNA molecules, and thereby distinguish between transcript isoforms, mono- and polycistronic RNAs, and overlapping transcripts. Altogether, we detected 875 transcript species, of which 759 were novel and 116 were annotated previously. These RNA molecules include 41 novel putative protein-coding transcripts [each containing 5'-truncated in-frame open reading frames (ORFs)], 14 monocistronic transcripts, 99 polygenic RNAs, 101 non-coding RNAs, and 504 untranslated region isoforms. This work also identified novel replication origin-associated transcripts, upstream ORFs, cis-regulatory sequences and poly(A) sites. We also detected RNA methylation in 99 viral genes and RNA hyper-editing in the longer 5'-UTR transcript isoform of the canonical ORF 19 transcript.
Subjects
Baculoviridae/genetics , High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics , Methylation , Nucleopolyhedroviruses/genetics , Open Reading Frames , RNA, Viral , TATA Box , Untranslated Regions
ABSTRACT
BACKGROUND: Recent studies have disclosed the genome, transcriptome, and epigenetic compositions of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the effect of viral infection on gene expression of the host cells. It has been demonstrated that, besides the major canonical transcripts, the viral genome also codes for noncanonical RNA molecules. While the structural characterizations have revealed a detailed transcriptomic architecture of the virus, the kinetic studies provided poor and often misleading results on the dynamics of both the viral and host transcripts due to the low temporal resolution of the infection event and the low virus/cell ratio (multiplicity of infection [MOI] = 0.1) applied for the infection. It has never been tested whether the alteration in host gene expression is caused by aging of the cells or by the viral infection. FINDINGS: In this study, we used Oxford Nanopore's direct cDNA and direct RNA sequencing methods to generate a high-coverage, high temporal resolution transcriptomic dataset of SARS-CoV-2 and of the primate host cells, using a high infection titer (MOI = 5). Sixteen sampling time points ranging from 1 to 96 hours with a varying time resolution and 3 biological replicates were used in the experiment. In addition, for each infected sample, corresponding noninfected samples were employed. The raw reads were mapped to the viral and to the host reference genomes, resulting in 49,661,499 mapped reads (54.62 Gb). The genome of the viral isolate was also sequenced and phylogenetically classified. CONCLUSIONS: This dataset can serve as a valuable resource for profiling the SARS-CoV-2 transcriptome dynamics, the virus-host interactions, and the RNA base modifications. Comparison of expression profiles of the host genes in the virally infected and in noninfected cells at different time points allows making a distinction between the effect of the aging of cells in culture and the viral infection. These data can provide useful information for potential novel gene annotations and can also be used for studying the currently available bioinformatics pipelines.
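The reported mapped-read count and total yield imply an average mapped read length, which can be checked with a few lines of arithmetic. The sketch below assumes that the yield figure in the abstract denotes 54.62 gigabases (10^9 bases). The function name is hypothetical.

```python
def mean_read_length(total_gigabases: float, n_reads: int) -> float:
    """Average read length in bases, assuming 1 gigabase = 1e9 bases."""
    return total_gigabases * 1e9 / n_reads


# Figures taken from the abstract: 49,661,499 mapped reads, 54.62 gigabases.
avg = mean_read_length(54.62, 49_661_499)
print(f"mean mapped read length: {avg:.0f} bases")
```

Under that assumption the average comes out at roughly 1,100 bases per mapped read.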
Subjects
COVID-19 , Nanopore Sequencing , Animals , COVID-19/genetics , DNA, Complementary/genetics , Kinetics , RNA , SARS-CoV-2/genetics
ABSTRACT
In this work, a long-read sequencing (LRS) technique based on the Oxford Nanopore Technologies MinION platform was used for the quantification and kinetic characterization of the poly(A) fraction of the bovine alphaherpesvirus type 1 (BoHV-1) lytic transcriptome across a 12-h infection period. Amplification-based LRS techniques frequently generate artefactual transcription reads and are biased towards the production of shorter amplicons. To avoid these undesired effects, we applied direct cDNA sequencing, an amplification-free technique. Here, we show that a single promoter can produce multiple transcription start sites whose distribution patterns differ among the viral genes but are similar in the same gene at different time points. Our investigations revealed that the circ gene is expressed with immediate-early (IE) kinetics by utilizing a special mechanism based on the use of the promoter of another IE gene (bicp4) for transcriptional control. Furthermore, we detected an overlap between the initiation of DNA replication and the transcription from the bicp22 gene, which suggests an interaction between the two molecular machineries. This study developed a generally applicable LRS-based method for the time-course characterization of transcriptomes of any organism.
Subjects
Herpesvirus 1, Bovine , Nanopore Sequencing , Gene Expression Profiling/methods , Herpesvirus 1, Bovine/genetics , High-Throughput Nucleotide Sequencing/methods , Transcription Initiation Site , Transcriptome
ABSTRACT
Long-read sequencing (LRS) approaches shed new light on the complexity of viral (Kakuk et al., 2021 [1]; Boldogkoi et al., 2019 [2]; Depledge et al., 2019 [3]), bacterial (Yan et al., 2018 [4]) and eukaryotic (Tilgner et al., 2014 [5]) transcriptomes. Emerging RNA viruses are zoonotic (Woolhouse et al., 2016 [6]) and create public health problems, e.g. the influenza pandemic caused by the H1N1 virus (Fraser et al., 2009 [7]), as well as the current SARS-CoV-2 pandemic (Kim et al., 2020 [8]). In this study, we carried out nanopore sequencing to generate transcriptomic data valuable for the structural and kinetic profiling of six important human pathogenic RNA viruses, the H1N1 subtype of Influenza A virus (IVA), the Zika virus (ZIKV), the West Nile virus (WNV), the Crimean-Congo hemorrhagic fever virus (CCHFV), the Coxsackievirus [group B serotype 5 (CVB5)] and the Vesicular stomatitis Indiana virus (VSIV), and of the response of host cells to viral infection. The raw sequencing data were filtered during basecalling and only high-quality reads (Qscore ≥ 7) were mapped to the appropriate viral and host genomes. The length distribution of the sequencing reads was assessed and statistics of the data were plotted with the ReadStat.4 Python script. The datasets can be used to profile the transcriptomic landscape of RNA viruses, to provide information for novel gene annotations, and as a resource for studying virus-host interactions and for the analysis of RNA base modifications. These datasets can also be used to compare different sequencing techniques, library preparation approaches and bioinformatics pipelines, and to analyze the RNA profiles of viruses with small RNA genomes.
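The quality filter and read-length summary described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the ReadStat.4 script or the basecaller's own filter. It assumes Phred+33 encoded FASTQ input and assumes that a read's Qscore is derived from its mean per-base error probability rather than from the arithmetic mean of its Phred values. All function names and the 500-base bin size are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Iterator


def parse_fastq(lines: Iterable[str]) -> Iterator[tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip()[1:], seq, qual


def mean_qscore(qual: str, offset: int = 33) -> float:
    """Read-level Q from the mean per-base error probability (assumed convention)."""
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records, min_q: float = 7.0) -> list[tuple[str, str]]:
    """Keep (read_id, sequence) for reads whose mean Qscore is at least min_q."""
    return [(rid, seq) for rid, seq, qual in records
            if qual and mean_qscore(qual) >= min_q]


def length_histogram(reads, bin_size: int = 500) -> Counter:
    """Count reads per length bin; keys are the lower edge of each bin."""
    return Counter((len(seq) // bin_size) * bin_size for _, seq in reads)


# Two toy reads: r1 has Q40 bases throughout, r2 has Q4 bases and is dropped.
toy = ["@r1", "ACGTACGT", "+", "IIIIIIII", "@r2", "ACGT", "+", "%%%%"]
kept = filter_reads(parse_fastq(toy))
print(kept, dict(length_histogram(kept, bin_size=5)))
```

Averaging error probabilities rather than Phred values penalizes reads with a few very poor bases more strongly, which is one reason read-level Qscores from different tools may not agree exactly.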
ABSTRACT
In the last couple of years, the implementation of long-read sequencing (LRS) technologies for transcriptome profiling has uncovered an extreme complexity of viral gene expression. In this study, we carried out a systematic analysis on the pseudorabies virus transcriptome by combining our current data obtained by using Pacific Biosciences Sequel and Oxford Nanopore Technologies MinION sequencing with our earlier data generated by other LRS and short-read sequencing techniques. As a result, we identified a number of novel genes, transcripts, and transcript isoforms, including splice and length variants, and also confirmed earlier annotated RNA molecules. One of the major findings of this study is the discovery of a large number of 5'-truncations of larger putative mRNAs being 3'-co-terminal with canonical mRNAs of PRV. A large fraction of these putative RNAs contain in-frame ATGs, which might initiate translation of N-terminally truncated polypeptides. Our analyses indicate that CTO-S, a replication origin-associated RNA molecule is expressed at an extremely high level. This study demonstrates that the PRV transcriptome is much more complex than previously appreciated.
ABSTRACT
African swine fever virus (ASFV) is a large DNA virus belonging to the Asfarviridae family. Despite its agricultural importance, little is known about the fundamental molecular mechanisms of this pathogen. Short-read sequencing (SRS) can produce a huge amount of high-precision sequencing reads for transcriptomic profiling, but it is inefficient for comprehensively annotating transcriptomes. Long-read sequencing (LRS) can overcome some of the limitations of SRS, but it also has drawbacks, such as low coverage and a high error rate. The limitations of the two approaches can be surmounted by using them in combination. In this study, we used Illumina SRS and Oxford Nanopore Technologies LRS platforms with multiple library preparation methods (amplified and direct cDNA sequencing and native RNA sequencing) to construct the ASFV transcriptomic atlas. This work identified many novel transcripts and transcript isoforms and annotated the precise termini of previously described RNAs. This study identified a novel species of ASFV transcripts, the replication origin-associated RNAs. Additionally, we discovered several nested genes embedded into larger canonical genes. In contrast to the current view that the ASFV transcripts are monocistronic, we detected a significant extent of polycistronism. A multifaceted meshwork of transcriptional overlaps was also discovered.
Subjects
African Swine Fever Virus/genetics , Gene Expression Profiling , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Animals , Cells, Cultured , Gene Library , Genome, Viral , Macrophages, Alveolar/virology , RNA, Viral/genetics , Swine
ABSTRACT
OBJECTIVE: In this study, we applied two long-read sequencing (LRS) approaches, single-molecule real-time and nanopore-based sequencing, to investigate the time-lapse transcriptome patterns of host gene expression as a response to Vaccinia virus infection. Transcriptomes determined using short-read sequencing approaches are incomplete because these platforms are inefficient at distinguishing, or fail to distinguish, between polycistronic RNAs, transcript isoforms, transcriptional start sites, as well as transcriptional readthroughs and overlaps. Long-read sequencing is able to read full-length nucleic acids and can therefore be used to assemble complete transcriptome atlases. RESULTS: In this work, we identified a number of novel transcripts and transcript isoforms of Chlorocebus sabaeus. Additionally, analysis of the 768 most abundant host transcripts revealed a significant overrepresentation of genes in the "regulation of signaling receptor activity" Gene Ontology annotation as a result of viral infection.
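The abstract does not say which statistical test underlies the reported overrepresentation. A common choice for this kind of Gene Ontology analysis is a one-sided hypergeometric test, sketched below under that assumption. Only the sample size of 768 comes from the abstract. The background size, the number of annotated genes and the observed overlap are hypothetical placeholders, as is the function name.

```python
from math import comb


def hypergeom_enrichment_p(background: int, annotated: int,
                           sample: int, overlap: int) -> float:
    """P(X >= overlap) when drawing `sample` genes without replacement from
    `background` genes, of which `annotated` carry the GO term."""
    upper = min(sample, annotated)
    favorable = sum(
        comb(annotated, i) * comb(background - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(background, sample)


# Hypothetical inputs: 20,000 background genes, 300 annotated with the term,
# 768 genes in the sample (from the abstract), 30 of them annotated.
p_value = hypergeom_enrichment_p(20_000, 300, 768, 30)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

In practice such p-values are usually corrected for testing many GO terms at once, a step this sketch leaves out.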
Subjects
Gene Expression Profiling , Poxviridae Infections , Animals , Chlorocebus aethiops , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Protein Isoforms/genetics , Transcriptome
ABSTRACT
Long-read sequencing (LRS), a powerful novel approach, is able to read full-length transcripts and confers a major advantage over the earlier gold standard, short-read sequencing, in efficiently identifying, for example, polycistronic transcripts and transcript isoforms, including transcript length and splice variants. In this work, we profile the human cytomegalovirus transcriptome using two third-generation LRS platforms: the Sequel from Pacific Biosciences and the MinION from Oxford Nanopore Technologies. We carried out both cDNA and direct RNA sequencing, and applied the LoRTIA software, developed in our laboratory, for the transcript annotations. This study identified a large number of novel transcript variants, including splice isoforms and transcript start and end site isoforms, as well as putative mRNAs with truncated in-frame ORFs (located within the larger ORFs of the canonical mRNAs), which potentially encode N-terminally truncated polypeptides. Our work also disclosed a highly complex meshwork of transcriptional read-throughs and overlaps.