ABSTRACT
Mutation density patterns reveal unique biological properties of specific genomic regions and shed light on the mechanisms of carcinogenesis. Although previous studies reported insightful mutation density patterns associated with certain genomic regions, such as transcription start sites and DNA replication origins, a tool that can systematically investigate mutational spatial patterns is still lacking. We therefore developed MutDens, a bioinformatic tool for comprehensive analysis of mutation density patterns around genomic features (that is, genomic positions) in humans and model species. By scanning the regions flanking given positions in both directions, MutDens systematically characterizes the mutation density for single-base substitution mutational classes after adjusting for total mutation burden and local nucleotide proportion. Analyses using MutDens not only verified the previously reported transcriptional strand bias around transcription start sites and replicative strand bias around DNA replication origins, but also identified novel mutation density patterns around other genomic features, such as enhancers and retrotransposon insertion polymorphism sites. To our knowledge, MutDens is the first tool that systematically calculates, examines, and compares mutation density patterns, thus providing a valuable avenue for investigating the mutational landscapes associated with important genomic features.
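The adjustment described above can be illustrated with a minimal sketch. It is not MutDens' actual formula: the function name, the binning scheme, and the simple normalization (mutation count divided by the number of matching reference bases in the bin and by the sample's total mutation count) are assumptions made for illustration only.

```python
def adjusted_density(positions, sequence, center, flank, bin_size,
                     ref_base, total_mutations):
    """Return (offset, density) pairs for bins spanning center +/- flank.

    positions: 0-based positions of mutations of one substitution class
               (e.g. C>T) on `sequence`.
    ref_base:  reference base of that class, used to correct for local
               nucleotide proportion.
    Hypothetical normalization: count / (ref bases in bin * total mutations).
    """
    results = []
    for start in range(center - flank, center + flank, bin_size):
        end = start + bin_size
        # Clip the window to the start of the sequence; slicing past the
        # end is already safe in Python.
        window = sequence[max(start, 0):max(end, 0)]
        n_ref = window.count(ref_base)
        n_mut = sum(1 for p in positions if start <= p < end)
        # A bin without any matching reference base has no defined density.
        density = n_mut / (n_ref * total_mutations) if n_ref else None
        results.append((start - center, density))
    return results


# Toy input: two bins of 5 bases on either side of position 5.
profile = adjusted_density(
    positions=[0, 6], sequence="CCAAACCCAA", center=5, flank=5,
    bin_size=5, ref_base="C", total_mutations=4,
)
```

Negative offsets are upstream of the feature and non-negative offsets are downstream, so plotting the pairs gives a bidirectional density profile around the position.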
Subjects
Genomics , Replication Origin , Humans , Mutation , Transcription Initiation Site , DNA
ABSTRACT
RCF1 is a highly conserved DEAD-box RNA helicase found in yeast, plants, and mammals. Studies about the functions of RCF1 in plants are limited. Here, we uncovered the functions of RCF1 in Arabidopsis thaliana as a player in pri-miRNA processing and splicing, as well as in pre-mRNA splicing. A mutant with miRNA biogenesis defects was isolated, and the defect was traced to a recessive point mutation in RCF1 (rcf1-4). We show that RCF1 promotes D-body formation and facilitates the interaction between pri-miRNAs and HYL1. Finally, we show that intron-containing pri-miRNAs and pre-mRNAs exhibit a global splicing defect in rcf1-4. Together, this work uncovers roles for RCF1 in miRNA biogenesis and RNA splicing in Arabidopsis.
Subjects
Arabidopsis Proteins , Arabidopsis , MicroRNAs , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , DEAD-box RNA Helicases/genetics , Gene Expression Regulation, Plant/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Processing, Post-Transcriptional , RNA Splicing/genetics , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
ABSTRACT
Cadmium (Cd) is a non-essential heavy metal that is taken up into plant tissue along with other nutrients and disturbs ion homeostasis. Plants have developed different mechanisms to tolerate the hazardous environmental effects of Cd. Recent studies have identified several miRNAs that are involved in Cd stress. In the current study, miR397 mutant lines were constructed to explore the molecular mechanisms by which miR397 affects Cd tolerance. Compared with the line overexpressing miR397 (artificial miR397, amiR397), the lines with downregulated miR397 (Short Tandem Target Mimic miR397, STTM miR397) showed more substantial Cd tolerance, with higher chlorophyll a and b, carotenoid, and lignin content. ICP-OES revealed higher cell wall Cd and lower total Cd levels in STTM miR397 than in the wild-type and amiR397 plants. Further, the STTM plants produced fewer reactive oxygen species (ROS) and showed lower activity of antioxidant enzymes (e.g., catalase [CAT], malondialdehyde [MDA]) compared with amiR397 and wild-type plants after stress, indicating that silencing the expression of miR397 can reduce oxidative damage. In addition, expression of genes from different transporter families was much higher in the amiR397 plants than in the wild-type and STTM miR397 plants. Our results suggest that miR397 plays a role in Cd tolerance in Arabidopsis thaliana. Overexpression of miR397 could decrease Cd tolerance in plants by regulating the expression of LAC 2/4/17 and changing the lignin content, which may play an important role in inducing different stress-tolerance mechanisms and protecting the cell from hazardous conditions. This study provides a basis for elucidating the functions of miR397 and the Cd stress tolerance mechanism in Arabidopsis thaliana.
Subjects
Arabidopsis Proteins , Arabidopsis , Arabidopsis/metabolism , Cadmium/metabolism , Lignin/metabolism , Chlorophyll A/metabolism , Antioxidants/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Plants, Genetically Modified/metabolism , Gene Expression Regulation, Plant
ABSTRACT
Significant advances have been achieved in understanding the critical role of enhancer RNAs (eRNAs) in the complex field of gene regulation. However, notable uncertainty remains concerning the biology of eRNAs, highlighting the need for continued research to uncover their exact functions in cellular processes and diseases. We present a comprehensive study to scrutinize mutation density patterns, mutation strand bias, and mutation burden in eRNAs across multiple cancer types. Our findings reveal that eRNAs exhibit mutation strand bias akin to that observed in protein-coding RNAs. We also identified a novel pattern, in which mutation density is notably diminished around the central region of the eRNA, but conspicuously elevated towards both the beginning and end. This pattern can be potentially explained by a mechanism involving heightened transcriptional activity and the activation of transcription-coupled repair. The central regions of the eRNAs appear to be more conserved, hinting at a potential mechanism preserving their structural and functional integrity, while the extremities may be more susceptible to mutations due to increased exposure. The evolutionary trajectory of this mutational pattern suggests a nuanced adaptation in eRNAs, where stability at their core coexists with flexibility at their extremities, potentially facilitating their diverse interactions with other genetic entities.
Subjects
Enhancer RNAs , Neoplasms , Humans , Biological Evolution , Excision Repair , Mutation , Neoplasms/genetics , Psychomotor Agitation
ABSTRACT
BACKGROUND: Genome-wide association studies (GWAS) have uncovered thousands of genetic variants that are associated with complex human traits and diseases. miRNAs are single-stranded non-coding RNAs. In particular, genetic variants located in the 3'UTR region of mRNAs may play an important role in gene regulation through their interaction with miRNAs. Existing studies have not thoroughly characterized 3'UTR variants discovered through GWAS. The goal of this study is to analyze patterns of GWAS functional variants located in 3'UTRs with respect to their relevance in the network between hosting genes and targeting miRNAs, and to elucidate the association between the genes harboring these variants and genetic traits. METHODS: We employed the MIGWAS, ANNOVAR, MEME, and DAVID software packages to annotate the variants obtained from GWAS for 31 traits and to elucidate the association between their harboring genes and their related traits. We identified variants that occurred in motif regions that may be functionally important in affecting miRNA binding. We also conducted pathway analysis and functional annotation on miRNA-targeted genes harboring 3'UTR variants for the trait with the highest percentage of 3'UTR variants. RESULTS: The Child Obesity trait has the highest percentage of 3'UTR variants (75%). Of the 16 genes related to the Child Obesity trait, 5 genes (ETV7, GMEB1, NFIX, ZNF566, ZBTB40) had a significant association with the term DNA-Binding (p < 0.05). eQTL analysis revealed 2 relevant tissues and 10 targeted genes associated with the Child Obesity trait. In addition, Red Blood Cells (RBC), Hemoglobin (HB), and Packed Cell Volume (PCV) have overlapping variants. In particular, the PIM1 variant occurred inside the HB motif region 37,174,641-37,174,660, and the LUC7L3 variant occurred inside the RBC motif region 50,753,918-50,753,937.
CONCLUSION: Variants located in 3'UTR can alter the binding affinity of miRNA and impact gene regulation, thus warranting further annotation and analysis. We have developed a bioinformatics bash pipeline to automatically annotate variants, determine the number of variants in different categories for each given trait, and check common variants across different traits. This is a valuable tool to annotate a large number of GWAS result files.
Subjects
MicroRNAs , Pediatric Obesity , 3' Untranslated Regions , Child , Genome-Wide Association Study , Humans , MicroRNAs/genetics , Pediatric Obesity/genetics
ABSTRACT
The alga Chlamydomonas reinhardtii is a potential platform for recombinant protein expression in the future due to various advantages. Dozens of C. reinhardtii strains producing genetically engineered recombinant therapeutic protein have been reported. However, owing to extremely low protein expression efficiency, none have been applied for industrial purposes. Improving protein expression efficiency at the molecular level is, therefore, a priority. The 3'-end poly(A) tail of mRNAs is strongly correlated with mRNA transcription and protein translation efficiency. In this study, we identified a canonical C. reinhardtii poly(A) polymerase (CrePAPS), verified its polyadenylate activity, generated a series of overexpressing transformants, and performed proteomic analysis. Proteomic results demonstrated that overexpressing CrePAPS promoted ribosomal assembly and enhanced protein accumulation. The accelerated translation was further verified by increased crude and dissolved protein content detected by Kjeldahl and bicinchoninic acid (BCA) assay approaches. The findings provide a novel direction in which to exploit photosynthetic green algae as a recombinant protein expression platform.
Subjects
Chlamydomonas reinhardtii , Chlamydomonas reinhardtii/genetics , Chlamydomonas reinhardtii/metabolism , Protein Biosynthesis , Proteomics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Recombinant Proteins/metabolism
ABSTRACT
Complex two-dimensional warranty equipment is usually composed of many multi-component systems, which include several key components. During the warranty period, conducting maintenance according to the preventive maintenance plan of each component will increase the warranty costs. Opportunistic maintenance is an effective approach to combine the preventive maintenance of each individual component, which can reduce the warranty cost and improve the system availability. This study explored the optimal opportunistic maintenance scheme of multi-component systems. Firstly, the failure rate model and reliability evaluation model of the multi-component system considering failure dependence were established. Secondly, the preventive maintenance plan of each individual component was determined, with the goal of obtaining the lowest warranty cost per unit time in the component life cycle. Thirdly, the preventive maintenance work of each individual component was combined, and the two-dimensional warranty cost model of the multi-component system was established according to the reliability threshold when performing opportunistic maintenance. In the experimental verification and result analysis, the genetic algorithm was used to find the optimal opportunistic maintenance scheme for the power transmission device. The comparative analysis results show that the opportunistic maintenance scheme reduced the warranty cost by 5.5% and improved the availability by 10%, which fully verified the effectiveness of the opportunistic maintenance strategy.
Subjects
Algorithms , Reproducibility of Results
ABSTRACT
BACKGROUND: Although there are many studies on the characteristics of miRNA-mRNA interactions using miRNA and mRNA sequencing data, the complexity of the change of the correlation coefficients and expression values of the miRNA-mRNA pairs between tumor and normal samples is still not resolved, and this hinders the potential clinical applications. There is an urgent need to develop innovative methodologies and tools that can characterize and visualize functional consequences of cancer risk gene and miRNA pairs while analyzing the tumor and normal samples simultaneously. RESULTS: We developed an innovative bioinformatics tool for visualizing functional annotation of miRNA-mRNA pairs in a network, known as MMiRNA-Viewer2. The tool takes mRNA and miRNA interaction pairs and visualizes mRNA and miRNA regulation network. Moreover, our MMiRNA-Viewer2 web server integrates and displays the mRNA and miRNA gene annotation information, signaling cascade pathways and direct cancer association between miRNAs and mRNAs. Functional annotation and gene regulatory information can be directly retrieved from our web server, which can help users quickly identify significant interaction sub-network and report possible disease or cancer association. The tool can identify pivotal miRNAs or mRNAs that contribute to the complexity of cancer, while engaging modern next-generation sequencing technology to analyze the tumor and normal samples concurrently. We compared our tools with other visualization tools. CONCLUSION: Our MMiRNA-Viewer2 serves as a multitasking platform in which users can identify significant interaction clusters and retrieve functional and cancer-associated information for miRNA-mRNA pairs between tumor and normal samples. Our tool is applicable across a range of diseases and cancers and has advantages over existing tools.
Subjects
Computational Biology/methods , MicroRNAs/genetics , RNA, Messenger/genetics , Humans
ABSTRACT
MicroRNAs (miRNA) are short noncoding RNAs that can repress the expression of protein-coding messenger RNAs (mRNAs) by binding to the 3'-untranslated region (UTR) of the target. Genetic mutations such as single nucleotide variants (SNVs) in the 3'-UTR of the mRNAs can disrupt miRNA regulation. In this study, we presented dbMTS, a database for miRNA target site (MTS) SNVs and their functional annotations. This database can help studies easily identify putative SNVs that affect miRNA targeting and facilitate the prioritization of their functional importance. dbMTS is freely available for academic use at http://database.liulab.science/dbMTS as a web service or a downloadable attached database of dbNSFP.
Subjects
Databases, Genetic , MicroRNAs , Polymorphism, Single Nucleotide , 3' Untranslated Regions , Computational Biology , Humans , Internet , MicroRNAs/genetics , Software
ABSTRACT
Although glucose uniquely stimulates proinsulin biosynthesis in β cells, surprisingly little is known of the underlying mechanism(s). Here, we demonstrate that glucose activates the unfolded protein response transducer inositol-requiring enzyme 1 alpha (IRE1α) to initiate X-box-binding protein 1 (Xbp1) mRNA splicing in adult primary β cells. Using mRNA sequencing (mRNA-Seq), we show that unconventional Xbp1 mRNA splicing is required to increase and decrease the expression of several hundred mRNAs encoding functions that expand the protein secretory capacity for increased insulin production and protect from oxidative damage, respectively. At 2 wk after tamoxifen-mediated Ire1α deletion, mice develop hyperglycemia and hypoinsulinemia, due to defective β cell function that was exacerbated upon feeding and glucose stimulation. Although previous reports suggest IRE1α degrades insulin mRNAs, Ire1α deletion did not alter insulin mRNA expression either in the presence or absence of glucose stimulation. Instead, β cell failure upon Ire1α deletion was primarily due to reduced proinsulin mRNA translation, resulting from defective glucose-stimulated induction of a dozen genes required for the signal recognition particle (SRP), SRP receptors, the translocon, the signal peptidase complex, and over 100 other genes with many other intracellular functions. In contrast, Ire1α deletion in β cells increased the expression of over 300 mRNAs encoding functions that cause inflammation and oxidative stress, yet only a few of these accumulated during high glucose. Antioxidant treatment significantly reduced glucose intolerance and markers of inflammation and oxidative stress in mice with β cell-specific Ire1α deletion. The results demonstrate that glucose activates IRE1α-mediated Xbp1 splicing to expand the secretory capacity of the β cell for increased proinsulin synthesis and to limit oxidative stress that leads to β cell failure.
Subjects
Alternative Splicing , DNA-Binding Proteins/metabolism , Endoribonucleases/metabolism , Hyperglycemia/metabolism , Insulin-Secreting Cells/metabolism , Insulin/metabolism , Oxidative Stress , Protein Serine-Threonine Kinases/metabolism , Transcription Factors/metabolism , Adolescent , Adult , Animals , Cells, Cultured , Crosses, Genetic , DNA-Binding Proteins/genetics , Endoribonucleases/genetics , Female , Humans , Hyperglycemia/blood , Hyperglycemia/pathology , Insulin Secretion , Insulin-Secreting Cells/pathology , Insulin-Secreting Cells/ultrastructure , Male , Mice, Knockout , Mice, Transgenic , Middle Aged , Protein Serine-Threonine Kinases/genetics , Recombinant Proteins/metabolism , Regulatory Factor X Transcription Factors , Signal Transduction , Tissue Donors , Transcription Factors/genetics , X-Box Binding Protein 1 , Young Adult
ABSTRACT
BACKGROUND: Small noncoding regulatory RNAs (sRNAs) are post-transcriptional regulators that act on mRNAs, proteins, and DNA in bacteria. One class of sRNAs, trans-acting sRNAs, are the most abundant sRNAs transcribed from the intergenic regions (IGRs) of the bacterial genome. In Streptococcus pyogenes, a common and potentially deadly pathogen, many sRNAs have been identified, but only a few have been studied. The goal of this study is to identify trans-acting sRNAs that can be substrates of RNase III. The endoribonuclease RNase III cleaves double-stranded RNAs, which can be formed during the interaction between an sRNA and target mRNAs. RESULTS: For this study, we created an RNase III null mutant of Streptococcus pyogenes and analyzed its RNA sequencing (RNA-Seq) data in comparison with those of the wild type. First, we developed a custom script that can detect intergenic regions of the S. pyogenes genome. A differential expression analysis with Cufflinks and StringTie was then performed to identify the intergenic regions whose expression was influenced by the RNase III gene deletion. CONCLUSION: This analysis yielded 12 differentially expressed regions with |fold change| > 2 and p ≤ 0.05. These regions were visually verified using the Artemis and BamView genome viewers, leaving 6 putative sRNAs. This study not only expands our knowledge of novel sRNAs but may also give new insight into sRNA degradation.
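The intergenic-region detection step mentioned above can be sketched as follows. The authors' custom script is not described in the abstract, so this is only one plausible approach under stated assumptions: gene coordinates are 1-based and inclusive, the chromosome is treated as linear (the circular wrap-around of a bacterial genome is ignored), and strand is not considered. The function name and inputs are hypothetical.

```python
def intergenic_regions(genes, genome_length):
    """Return the gaps between annotated genes as (start, end) tuples.

    genes: iterable of (start, end) gene coordinates, 1-based inclusive.
    Overlapping or nested genes are handled by tracking the furthest
    annotated position seen so far.
    """
    regions = []
    cursor = 1  # first position not yet covered by a gene
    for start, end in sorted(genes):
        if start > cursor:
            regions.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= genome_length:
        regions.append((cursor, genome_length))
    return regions


# Toy annotation with two overlapping genes and one isolated gene.
igrs = intergenic_regions([(100, 200), (150, 300), (500, 600)], 1000)
```

Read counts over each returned interval could then be passed to a differential expression tool; that step is outside the scope of this sketch.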
Subjects
Computational Biology/methods , RNA, Bacterial/genetics , RNA, Small Untranslated/genetics , Ribonuclease III/metabolism , Sequence Analysis, RNA , Streptococcus pyogenes/genetics , Base Sequence , DNA, Intergenic/genetics , Gene Deletion , Genome, Bacterial , RNA, Bacterial/metabolism , RNA, Messenger/genetics , RNA, Small Untranslated/metabolism , Reproducibility of Results
ABSTRACT
BACKGROUND: It is generally thought that most canonical or non-canonical splicing events involving U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm is still unclear. In recent years, next-generation sequencing technologies have revolutionized the field. The "Read-Split-Walk" (RSW) and "Read-Split-Run" (RSR) methods were developed to identify genome-wide non-canonical spliced regions, including special events occurring in the cytoplasm. As a significant amount of genome/transcriptome data, such as that from the Encyclopedia of DNA Elements (ENCODE) project, has been generated, we have developed a newer, more memory-efficient version of the algorithm, "Read-Split-Fly" (RSF), which can detect non-canonical spliced regions with higher sensitivity and improved speed. The RSF algorithm also outputs the spliced sequences for further downstream biological function analysis. RESULTS: We used open-access ENCODE project RNA-Seq data to search spliced intron sequences against the U12-type spliced intron sequence database to examine whether some events could occur as potential signatures of U12-type splicing. The check was performed by searching spliced sequences against 5'ss and 3'ss sequences from the well-known orthologous U12-type spliceosomal intron database U12DB. Preliminary results of searching 70 ENCODE samples indicated that the presence of 5'ss with a U12-type signature is more frequent than U2-type and is prevalent in non-canonical junctions reported by RSF. The selected spliced sequences have also been further studied using miRBase to elucidate their functionality. Preliminary results from 70 samples of ENCODE datasets show that several miRNAs are prevalent in the studied ENCODE samples. Two of these, hsa-miR-1273 and hsa-miR-548, are associated with many diseases and cancers, as suggested in the literature.
CONCLUSIONS: Our RSF pipeline is able to detect many possible junctions (especially those with a high RPKM) with very high overall accuracy and relatively high accuracy for novel junctions. We have incorporated useful parameter features into the pipeline, such as handling variable-length read data and searching spliced sequences for splicing signatures and miRNA events. We suggest that RSF, a tool for identifying novel splicing events, is applicable to the study of a range of diseases across biological systems under different experimental conditions.
Subjects
Algorithms , Alternative Splicing/genetics , Genome , Base Sequence , Databases, Nucleic Acid , Humans , Introns/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Splice Sites/genetics
ABSTRACT
BACKGROUND: Alternative splicing (AS) is a posttranscriptional process that produces different transcripts from the same gene and is important for producing diverse protein products in response to environmental stimuli. AS occurs at specific sites on the mRNA sequence, some of which have been defined. Multiple bioinformatics tools have been developed to detect AS from experimental data. OBJECTIVES: The goal of this review is to help researchers use specific tools to aid their research and to develop new AS detection tools based on these previously established tools. METHOD: We selected 15 recently published AS detection tools and classified and delineated them on several aspects. We also conducted a performance comparison of these tools using the same starting input. RESULT: We reviewed the following categorized features of the tools: publication information, working principles, generic and distinct workflows, running platform, input data requirement, sequencing depth dependency, reads mapped to multiple locations, isoform annotation basis, precise detected AS types, and performance benchmarks. CONCLUSION: Through comparisons of these tools, we provide a panorama of the advantages and shortcomings of each tool and their scopes of application.
ABSTRACT
BACKGROUND: MicroRNAs (miRNAs) are short RNA molecules that interact with their target genes through 3' untranslated regions (UTRs). The Cancer Genome Atlas (TCGA) harbors an increasing amount of cancer genome data for both tumor and normal samples. However, there are few visualization tools that focus on concurrently displaying important relationships and attributes between miRNAs and mRNAs of both tumor and normal samples. Moreover, a deep investigation of miRNA-mRNA target and biological relationships across multiple cancer types by integrating web-based analysis has not been thoroughly conducted. RESULTS: We developed an interactive visualization tool called MMiRNA-Viewer that can concurrently present the co-relationships of expression between miRNA-mRNA pairs of both tumor and normal samples in a single graph. The input file of MMiRNA-Viewer contains the expression information, including fold changes between normal and tumor samples for mRNAs and miRNAs, the correlation between mRNA and miRNA, and the target relationships predicted by a number of databases. Users can also load their own input data into MMiRNA-Viewer to visualize and compare detailed information about cancer-related gene expression changes, as well as changes in the expression of transcription-regulating miRNAs. To validate MMiRNA-Viewer, eight types of TCGA cancer datasets with both tumor and normal samples were selected in this study, and three filter steps were subsequently applied. We performed Gene Ontology (GO) analysis for genes in the final selected 238 pairs and also for genes in the top 5% (95th percentile) for each of the eight cancer types to report a significant number of genes involved in various biological functions and pathways. We also calculated various centrality measures for the largest connected component(s) in each of the eight cancers and reported the top genes and miRNAs with high centrality.
CONCLUSIONS: With its user-friendly interface, dynamic visualization, and advanced queries, we believe MMiRNA-Viewer offers an intuitive approach for visualizing and elucidating co-relationships between miRNAs and mRNAs of both tumor and normal samples. We suggest that miRNA and mRNA pairs with opposite fold changes in their expression and with inverted correlation values between tumor and normal samples might be most relevant for explaining the decoupling of mRNAs and their targeting miRNAs in tumor samples for certain cancer types.
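The selection criterion suggested in the conclusion (opposite fold changes plus an inverted correlation between tumor and normal samples) can be written as a small filter. This is a sketch only: the record layout, field names, and example values are hypothetical and do not reflect the actual MMiRNA-Viewer input format or any reported result.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """One miRNA-mRNA pair (hypothetical record layout)."""
    mirna: str
    mrna: str
    mirna_log2fc: float   # tumor vs. normal, log2 scale
    mrna_log2fc: float    # tumor vs. normal, log2 scale
    corr_tumor: float     # expression correlation in tumor samples
    corr_normal: float    # expression correlation in normal samples


def is_decoupled(pair):
    """True when fold changes have opposite signs and the correlation
    flips sign between tumor and normal samples."""
    opposite_fc = pair.mirna_log2fc * pair.mrna_log2fc < 0
    inverted_corr = pair.corr_tumor * pair.corr_normal < 0
    return opposite_fc and inverted_corr


# Placeholder identifiers and values, for illustration only.
pairs = [
    Pair("miR-A", "GENE1", 1.2, -0.8, 0.4, -0.6),
    Pair("miR-B", "GENE2", 1.0, 0.5, 0.3, -0.2),
    Pair("miR-C", "GENE3", -0.7, 0.9, -0.5, -0.4),
]
selected = [p.mirna for p in pairs if is_decoupled(p)]
```

Pairs with a zero fold change or zero correlation are excluded by the strict inequality, which seems a reasonable default but is a design choice of this sketch.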
Subjects
Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , MicroRNAs/genetics , Neoplasms/metabolism , RNA, Messenger/genetics , Software , 3' Untranslated Regions , Computational Biology/methods , Humans , MicroRNAs/metabolism , Neoplasms/genetics , RNA, Messenger/metabolism , Regulatory Sequences, Nucleic Acid , Sequence Analysis, RNA
ABSTRACT
BACKGROUND: The objective of this research was to investigate the variation in gene expression in the blood transcriptome profile of Chinese Holstein cows associated with milk yield traits. RESULTS: We used RNA-seq to generate the bovine transcriptome from the blood of 23 lactating Chinese Holstein cows with extremely high and low milk yield. A total of 100 differentially expressed genes (DEGs) (p < 0.05, FDR < 0.05) were revealed between the high and low groups. Gene ontology (GO) analysis demonstrated that the 100 DEGs were enriched in specific biological processes related to defense response, immune response, inflammatory response, icosanoid metabolic process, and fatty acid metabolic process (p < 0.05). KEGG pathway analysis of the 100 DEGs revealed that the most statistically significant pathway was the Toll-like receptor signaling pathway (p < 0.05). The expression level of four selected DEGs was analyzed by qRT-PCR, and the results indicated that the expression patterns were consistent with the deep sequencing results from RNA-Seq. Furthermore, alternative splicing analysis of the 100 DEGs demonstrated that splicing patterns differed between high and low yielders. The alternative 3' splicing site was the major splicing pattern detected in high yielders, whereas in low yielders the major type was exon skipping. CONCLUSION: This study provides a non-invasive method to identify DEGs for milk yield in cattle blood using RNA-seq. The 100 DEGs revealed between Holstein cows with extremely high and low milk yield, and the associated immunological pathways, are likely involved in the milk yield trait. Finally, this study allowed us to explore associations between immune traits and production traits related to milk production.
Subjects
Alternative Splicing/genetics , Lactation/genetics , Transcription, Genetic , Transcriptome/genetics , Animals , Cattle , Exons , Female , High-Throughput Nucleotide Sequencing , Lipid Metabolism/genetics , Milk , Polymorphism, Single Nucleotide
ABSTRACT
BACKGROUND: Most existing tools for detecting next-generation sequencing-based splicing events focus on generic splicing events. Consequently, special types of non-canonical splicing events of short mRNA regions (IRE1α targeted) have not yet been thoroughly addressed at a genome-wide level using bioinformatics approaches in conjunction with next-generation sequencing technologies. During endoplasmic reticulum (ER) stress, the RNase Ire1α is known to splice out a short 26 nt region from the mRNA of the transcription factor Xbp1 non-canonically within the cytosol. This causes an open reading frame-shift that induces expression of many downstream genes in reaction to ER stress as part of the unfolded protein response (UPR). We previously published an algorithm termed "Read-Split-Walk" (RSW) to identify non-canonical splicing regions using RNA-Seq data and applied it to ER stress-induced Ire1α heterozygote and knockout mouse embryonic fibroblast cell lines. In this study, we have developed an improved algorithm, "Read-Split-Run" (RSR), for detecting genome-wide Ire1α-targeted genes with non-canonical spliced regions at a faster speed. We applied the RSR algorithm, using different combinations of several parameters, to the mouse embryonic fibroblast (MEF) cells previously tested with RSW and to the human Encyclopedia of DNA Elements (ENCODE) RNA-Seq data. We also compared the performance of RSR with two other tools for identifying alternative splicing events, TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012), using the spliced Xbp1 mRNA as a positive control: in our data sets, it was identified as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. This comparison was also extended to human ENCODE RNA-Seq data.
RESULTS: As proof of principle, the 26 nt non-conventional splice site in Xbp1 was detected as the top hit by our new RSR algorithm in heterozygote (Het) samples from both Thapsigargin (Tg) and Dithiothreitol (Dtt) treated experiments but was absent in the negative control Ire1α knock-out (KO) samples. Applying different combinations of parameters to the mouse MEF RNA-Seq data, we propose a General Linear Model (GLM) for both Tg and Dtt treated experiments. We also ran RSR on a human ENCODE RNA-Seq dataset and identified 32,597 spliced regions for regular chromosomes. TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012) identified 237,155 spliced junctions and 9,129 exon skipping events (excluding chr14), respectively. Our Read-Split-Run algorithm also outperformed the others in ranking the Xbp1 gene as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. The RSR package, including source code, is available at http://bioinf1.indstate.edu/RSR and its pipeline source code is also freely available at https://github.com/xuric/read-split-run for academic use. CONCLUSIONS: Our new RSR algorithm is capable of processing massive amounts of human ENCODE RNA-Seq data to identify novel splice junction sites at a genome-wide level much more efficiently than the previous RSW algorithm. Our proposed model can also predict the number of spliced regions under any combination of parameters. Our pipeline can detect novel spliced sites for other species using RNA-Seq data generated under similar conditions.
Subjects
Alternative Splicing/genetics, Base Sequence/genetics, Genome, RNA Splice Sites/genetics, RNA Splicing/genetics, Algorithms, Animals, Computational Biology/methods, DNA-Binding Proteins/genetics, Genetic Databases, Genomics/methods, Humans, Mice, Software, Unfolded Protein Response/genetics
ABSTRACT
The purpose of this study was to identify the major molecular components in the secretory and maturation stages of amelogenesis through transcriptome analyses. Ameloblasts (40 sections per age group) were laser micro-dissected from Day 5 (secretory stage) and Days 11-12 (maturation stage) first molars. PolyA+ RNA was isolated from the lysed cells, converted to cDNA, and amplified to generate a cDNA library. DNA sequences were obtained using next-generation sequencing and analyzed to identify genes whose expression had increased or decreased at least 1.5-fold in maturation stage relative to secretory stage ameloblasts. Among the 9198 genes that surpassed the quality threshold, 373 showed higher expression in secretory stage ameloblasts, while 614 showed higher expression in maturation stage ameloblasts. The results were cross-checked against a previously published transcriptome generated from tissues overlying secretory and maturation stage mouse incisor enamel, and 34 increasing and 26 decreasing expressers common to the two studies were identified. Expression of F2r, which encodes protease-activated receptor 1 (PAR1) and showed 10-fold higher expression during the secretory stage in our transcriptome analysis, was characterized in mouse incisors by immunohistochemistry. PAR1 was detected in secretory, but not maturation stage ameloblasts. We conclude that transcriptome analyses are a good starting point for identifying genes/proteins that are critical for proper dental enamel formation and that PAR1 is specifically expressed by secretory stage ameloblasts.
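The 1.5-fold selection criterion described above can be sketched in Python as follows. The gene names and expression values are invented placeholders, and the function name, the handling of zero values, and the absence of normalization or statistical testing are simplifying assumptions. This is not the study's actual analysis pipeline.

```python
def classify_by_fold_change(secretory, maturation, threshold=1.5):
    """Label genes by the maturation/secretory expression ratio.

    A gene is 'increased' if the ratio is at least `threshold`,
    'decreased' if it is at most 1/threshold, and 'unchanged' otherwise.
    Genes missing from either stage, or with non-positive values, are skipped.
    """
    labels = {}
    for gene in secretory.keys() & maturation.keys():
        s, m = secretory[gene], maturation[gene]
        if s <= 0 or m <= 0:
            continue  # ratio is undefined; a real pipeline would filter earlier
        ratio = m / s
        if ratio >= threshold:
            labels[gene] = "increased"
        elif ratio <= 1 / threshold:
            labels[gene] = "decreased"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical expression values (arbitrary units), for illustration only.
secretory_stage = {"geneA": 100.0, "geneB": 20.0, "geneC": 50.0}
maturation_stage = {"geneA": 10.0, "geneB": 60.0, "geneC": 55.0}

labels = classify_by_fold_change(secretory_stage, maturation_stage)
```

With these placeholder values, `geneA` falls in the decreased group (higher in the secretory stage), `geneB` in the increased group, and `geneC` stays below the threshold in either direction.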
Subjects
Ameloblasts/metabolism, Amelogenesis/genetics, Dental Enamel Proteins/genetics, Gene Expression Profiling, Transcriptome/genetics, Animals, Enamel Organ/growth & development, High-Throughput Nucleotide Sequencing, Mice
ABSTRACT
Around 80 to 85% of all lung cancers are non-small cell lung cancer (NSCLC). Previous research has explored the genetic basis of NSCLC through individual approaches, but studies have yet to investigate the results of combining them. Here we show that analyzing NSCLC genetics through three approaches simultaneously yields unique insights into the disease. Through a combination of previous research and bioinformatics tools, we determined 35 NSCLC candidate genes. We analyzed these genes using three different approaches. First, we found the gene fusions between these candidate genes. Second, we found the superfamilies shared among the genes. Finally, we identified mutational signatures that are possibly associated with NSCLC. Each approach yields its own distinct results. Fusion relationships identify specific gene fusion targets, common superfamilies suggest possible avenues for determining novel target genes, and NSCLC-associated mutational signatures have diagnostic and prognostic benefits. Combining the approaches, we found that the gene CD74 has significant fusion relationships but no association with the other two approaches, suggesting that CD74 is associated with NSCLC mainly because of its fusion relationships. Targeting the gene fusions of CD74 may be an alternative NSCLC treatment. This genetic analysis has thus provided unique insight into NSCLC genes. The results of each approach, both separately and combined, allow the pursuit of more effective treatment strategies for this cancer. The methodology presented can also be applied to other cancers, yielding insights that current analytical methods could not find.
ABSTRACT
Mutations in oncogenes and tumor suppressor genes can significantly impact cellular function during cancer development. A comprehensive analysis of their mutation patterns and significant gene ontology terms can provide insights into cancer emergence and suggest potential targets for drug development. This study analyzes twelve cancer subtypes by focusing on significant genetic and molecular factors. Two common types of genetic mutation associated with cancer are single nucleotide variants (SNVs) and copy number alterations (CNAs). Oncogenes, derived from mutated proto-oncogenes, disrupt normal cell functions and promote cancer, while tumor suppressor genes, often inactivated by mutations, regulate cell processes such as proliferation and the DNA damage response. This study analyzed datasets from The Cancer Genome Atlas (TCGA), which provides extensive genomic data across various cancers. In our analysis, many genes with significant p-values based on Kaplan-Meier gene expression data were identified in eight cancers (BRCA, BLCA, HNSC, KIRC, LUAD, KIRP, LUSC, STAD). Moreover, STAD is the only cancer for which genes with both significant p-values and functional terms were reported. Interestingly, LIHC was reported with only one CNA-mutated gene, whose survival plot p-value was significant. Additionally, KICH has no reported significant genes at all. Our study proposes a relationship between tumor suppressor genes and oncogenes and sheds light on tumorigenesis due to genetic mutations.