Results 1 - 20 of 32
1.
EMBO J ; 40(15): e107976, 2021 08 02.
Article in English | MEDLINE | ID: mdl-34184765

ABSTRACT

Nuclear stress bodies (nSBs) are nuclear membraneless organelles formed around stress-inducible HSATIII architectural long noncoding RNAs (lncRNAs). nSBs repress splicing of hundreds of introns during thermal stress recovery, which are partly regulated by CLK1 kinase phosphorylation of temperature-dependent Ser/Arg-rich splicing factors (SRSFs). Here, we report a distinct mechanism for this splicing repression through protein sequestration by nSBs. Comprehensive identification of RNA-binding proteins revealed HSATIII association with proteins related to N6-methyladenosine (m6A) RNA modification. 11% of the first adenosine in the repetitive HSATIII sequence were m6A-modified. nSBs sequester the m6A writer complex to methylate HSATIII, leading to subsequent sequestration of the nuclear m6A reader, YTHDC1. Sequestration of these factors from the nucleoplasm represses m6A modification of pre-mRNAs, leading to repression of m6A-dependent splicing during stress recovery phase. Thus, nSBs serve as a common platform for regulation of temperature-dependent splicing through dual mechanisms employing two distinct ribonucleoprotein modules with partially m6A-modified architectural lncRNAs.


Subject(s)
Nerve Tissue Proteins/genetics , RNA Splicing Factors/genetics , RNA Splicing , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Adenosine/analogs & derivatives , Adenosine/metabolism , Cell Nucleus/genetics , HeLa Cells , Humans , Nerve Tissue Proteins/metabolism , Phosphorylation , Protein Serine-Threonine Kinases/genetics , Protein-Tyrosine Kinases/genetics , RNA Splicing Factors/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Repetitive Sequences, Nucleic Acid , Serine-Arginine Splicing Factors/genetics , Serine-Arginine Splicing Factors/metabolism , Temperature
2.
RNA ; 29(2): 170-177, 2023 02.
Article in English | MEDLINE | ID: mdl-36384963

ABSTRACT

The mammalian cell nucleus contains dozens of membrane-less nuclear bodies that play significant roles in various aspects of gene expression. Several nuclear bodies are nucleated by specific architectural noncoding RNAs (arcRNAs) acting as structural scaffolds. We have reported that a minor population of cellular RNAs exhibits an unusual semi-extractable feature upon using the conventional procedure of RNA preparation and that needle shearing or heating of cell lysates remarkably improves extraction of dozens of RNAs. Because semi-extractable RNAs, including known arcRNAs, commonly localize in nuclear bodies, this feature may be a hallmark of arcRNAs. Using the semi-extractability of RNA, we performed genome-wide screening of semi-extractable long noncoding RNAs to identify new candidate arcRNAs under hyperosmotic and heat stress conditions. After screening stress-inducible and semi-extractable RNAs, hundreds of readthrough downstream-of-gene (DoG) transcripts over several hundreds of kilobases, many of which were not detected among RNAs prepared by the conventional extraction procedure, were found to be stress-inducible and semi-extractable. We further characterized some of the abundant DoGs and found that stress-inducible transient extension of the 3'-UTR made DoGs semi-extractable. Furthermore, they were localized in distinct nuclear foci that were sensitive to 1,6-hexanediol. These data suggest that semi-extractable DoGs exhibit arcRNA-like features and our semi-extractable RNA-seq is a powerful tool to extensively monitor DoGs that are induced under specific physiological conditions.


Subject(s)
Cell Nucleus , RNA, Long Noncoding , Animals , Base Sequence , Cell Nucleus/metabolism , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Mammals/genetics
3.
EMBO J ; 39(3): e102729, 2020 02 03.
Article in English | MEDLINE | ID: mdl-31782550

ABSTRACT

A number of long noncoding RNAs (lncRNAs) are induced in response to specific stresses to construct membrane-less nuclear bodies; however, their function remains poorly understood. Here, we report the role of nuclear stress bodies (nSBs) formed on highly repetitive satellite III (HSATIII) lncRNAs derived from primate-specific satellite III repeats upon thermal stress exposure. A transcriptomic analysis revealed that depletion of HSATIII lncRNAs, resulting in elimination of nSBs, promoted splicing of 533 retained introns during thermal stress recovery. An HSATIII comprehensive identification of RNA-binding proteins by mass spectrometry (ChIRP-MS) analysis identified multiple splicing factors in nSBs, including serine and arginine-rich pre-mRNA splicing factors (SRSFs), the phosphorylation states of which affect splicing patterns. SRSFs are rapidly de-phosphorylated upon thermal stress exposure. During stress recovery, CDC like kinase 1 (CLK1) was recruited to nSBs and accelerated the re-phosphorylation of SRSF9, thereby promoting target intron retention. Our findings suggest that HSATIII-dependent nSBs serve as a conditional platform for phosphorylation of SRSFs by CLK1 to promote the rapid adaptation of gene expression through intron retention following thermal stress exposure.


Subject(s)
Cell Nucleus/metabolism , Heat-Shock Response , Microsatellite Repeats , Protein Serine-Threonine Kinases/metabolism , Protein-Tyrosine Kinases/metabolism , RNA, Long Noncoding/metabolism , Serine-Arginine Splicing Factors/metabolism , Animals , CHO Cells , Cricetulus , Gene Expression Profiling , Gene Expression Regulation , HeLa Cells , Humans , Introns , Phosphorylation , RNA Splicing Factors/metabolism , Exome Sequencing
4.
Nucleic Acids Res ; 50(13): e73, 2022 07 22.
Article in English | MEDLINE | ID: mdl-35390152

ABSTRACT

Recent technological advances have enabled the generation of large amounts of data consisting of RNA sequences and their functional activity. Here, we propose a method for extracting secondary structure features that affect the functional activity of RNA from sequence-activity data. Given pairs of RNA sequences and their corresponding bioactivity values, our method calculates position-specific structural features of the input RNA sequences, considering every possible secondary structure of each RNA. A Ridge regression model is trained using the structural features as feature vectors and the bioactivity values as response variables. Optimized model parameters indicate how secondary structure features affect bioactivity. We used our method to extract intramolecular structural features of bacterial translation initiation sites and self-cleaving ribozymes, and the intermolecular features between rRNAs and Shine-Dalgarno sequences and between U1 RNAs and splicing sites. We not only identified known structural features but also revealed more detailed insights into structure-activity relationships than previously reported. Importantly, the datasets we analyzed here were obtained from different experimental systems and differed in size, sequence length and similarity, and number of RNA molecules involved, demonstrating that our method is applicable to various types of data consisting of RNA sequences and bioactivity values.
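
The regression step described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the position-specific structural features have already been computed (that step is not shown), and the feature values and activity values below are invented toy numbers. The function name `ridge_fit` is hypothetical.

```python
def ridge_fit(X, y, lam=1.0):
    """Closed-form Ridge regression: solve (X^T X + lam*I) w = X^T y."""
    n, d = len(X), len(X[0])
    # Normal-equation matrix with the L2 penalty added on the diagonal.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in range(d - 1, -1, -1):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w


# Toy data (assumed): each row holds per-position feature values for one RNA,
# e.g. the probability that each of three positions is unpaired.
features = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.4], [0.7, 0.3, 0.6], [0.1, 0.9, 0.2]]
activity = [1.8, 0.4, 1.5, 0.2]
weights = ridge_fit(features, activity, lam=0.1)
print(weights)  # the sign and size of each weight hint at how that position relates to activity
```

A positive weight suggests that a larger feature value at that position goes with higher activity, which is the kind of reading of "optimized model parameters" the abstract refers to.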


Subject(s)
RNA, Catalytic , Nucleic Acid Conformation , Peptide Chain Initiation, Translational , RNA, Catalytic/chemistry , Regression Analysis , Structure-Activity Relationship
5.
Chembiochem ; 24(14): e202200572, 2023 07 17.
Article in English | MEDLINE | ID: mdl-37253903

ABSTRACT

Controlling PCR fidelity is an important issue for molecular biology and high-fidelity PCR is essential for gene cloning. In general, fidelity control is achieved by protein engineering of polymerases. In contrast, only a few studies have reported controlling fidelity using chemically modified nucleotide substrates. In this report, we synthesized nucleotide substrates possessing a modification on Pγ and evaluated the effect of this modification on PCR fidelity. One of the substrates, nucleotide tetraphosphate, caused a modest decrease in Taq DNA polymerase activity and the effect on PCR fidelity was dependent on the type of mutation. The use of deoxyadenosine tetraphosphate enhanced the A:T→G:C mutation dramatically, which is common when using Taq polymerase. Conversely, deoxyguanosine tetraphosphate (dG4P) suppressed this mutation but increased the G:C→A:T mutation during PCR. Using an excess amount of dG4P suppressed both mutations successfully and total fidelity was improved.


Subject(s)
Nucleic Acid Amplification Techniques , Phosphates , Taq Polymerase/genetics , Taq Polymerase/metabolism , Polymerase Chain Reaction , Mutation , Nucleotides
6.
Nucleic Acids Res ; 48(14): e81, 2020 08 20.
Article in English | MEDLINE | ID: mdl-32504488

ABSTRACT

RNA secondary structure around translation initiation sites strongly affects the abundance of expressed proteins in Escherichia coli. However, detailed secondary structural features governing protein abundance remain elusive. Recent advances in high-throughput DNA synthesis and experimental systems enable us to obtain large amounts of data. Here, we evaluated six types of structural features using two large-scale datasets. We found that accessibility, which is the probability that a given region around the start codon has no base-paired nucleotides, showed the highest correlation with protein abundance in both datasets. Accessibility showed a significantly higher correlation (Spearman's ρ = 0.709) than the widely used minimum free energy (0.554) in one of the datasets. Interestingly, accessibility showed the highest correlation only when it was calculated by a log-linear model, indicating that the RNA structural model and how to utilize it are important. Furthermore, by combining the accessibility and activity of the Shine-Dalgarno sequence, we devised a method for predicting protein abundance more accurately than existing methods. We inferred that the log-linear model has a broader probabilistic distribution than the widely used Turner energy model, which contributed to more accurate quantification of ribosome accessibility to translation initiation sites.
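
The two quantities in the abstract above, accessibility and Spearman's rank correlation, can be illustrated with a short sketch. It assumes accessibility is estimated as the fraction of sampled structures (dot-bracket strings) in which a region is fully unpaired; the study itself computes accessibility from an RNA structure model, which is not reproduced here. All function names and example data are hypothetical.

```python
def accessibility(structures, start, end):
    """Fraction of sampled dot-bracket structures with positions [start, end) all unpaired."""
    hits = sum(1 for s in structures if set(s[start:end]) <= {"."})
    return hits / len(structures)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Toy sample of structures for one mRNA (assumed data).
sample = ["....((..))", "((..))....", ".........."]
print(accessibility(sample, 0, 4))
```

Given one accessibility value and one measured protein abundance per construct, `spearman(accessibilities, abundances)` yields a single correlation of the kind reported in the abstract.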


Subject(s)
Escherichia coli/genetics , Nucleic Acid Conformation , RNA, Bacterial/metabolism , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , 5' Untranslated Regions , Algorithms , Codon, Initiator/metabolism , Datasets as Topic , Escherichia coli/metabolism , Forecasting , Linear Models , Machine Learning , Peptide Chain Initiation, Translational , RNA, Bacterial/chemistry , RNA, Messenger/chemistry , Regulatory Sequences, Ribonucleic Acid , Structure-Activity Relationship
7.
Nucleic Acids Res ; 48(22): 13000-13012, 2020 12 16.
Article in English | MEDLINE | ID: mdl-33257988

ABSTRACT

In the yeast Saccharomyces cerevisiae, terminator sequences not only terminate transcription but also affect expression levels of the protein encoded upstream of the terminator. The non-conventional yeast Pichia pastoris (syn. Komagataella phaffii) has frequently been used as a platform for metabolic engineering but knowledge regarding P. pastoris terminators is limited. To explore terminator sequences available to tune protein expression levels in P. pastoris, we created a 'terminator catalog' by testing 72 sequences, including terminators from S. cerevisiae or P. pastoris and synthetic terminators. Altogether, we found that the terminators have a tunable range of 17-fold. We also found that S. cerevisiae terminator sequences maintain function when transferred to P. pastoris. Successful tuning of protein expression levels was shown not only for the reporter gene used to define the catalog but also using betaxanthin production as an example application in pathway flux regulation. Moreover, we found experimental evidence that protein expression levels result from mRNA abundance and in silico evidence that levels reflect the stability of mRNA 3'-UTR secondary structure. In combination with promoter selection, the novel terminator catalog constitutes a basic toolbox for tuning protein expression levels in metabolic engineering and synthetic biology in P. pastoris.


Subject(s)
RNA Stability/genetics , RNA, Messenger/genetics , Saccharomycetales/genetics , Terminator Regions, Genetic/genetics , Gene Expression Regulation, Fungal/genetics , Metabolic Engineering , Promoter Regions, Genetic , Saccharomyces cerevisiae/genetics , Synthetic Biology
8.
Bioinformatics ; 36(Suppl_1): i227-i235, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657400

ABSTRACT

MOTIVATION: RNA folding kinetics plays an important role in the biological functions of RNA molecules. An important goal in the investigation of the kinetic behavior of RNAs is to find the folding pathway with the lowest energy barrier. For this purpose, most of the existing methods use heuristics because the number of possible pathways is huge even if only the shortest (direct) folding pathways are considered. RESULTS: In this study, we propose a new method using a best-first search strategy to efficiently compute the exact solution of the minimum barrier energy of direct pathways. Using our method, we can find the exact direct pathways within a Hamming distance of 20, whereas the previous methods even miss the exact short pathways. Moreover, our method can be used to improve the pathways found by existing methods for exploring indirect pathways. AVAILABILITY AND IMPLEMENTATION: The source code and datasets created and used in this research are available at https://github.com/eukaryo/czno. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
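
The idea of a best-first search for the minimum barrier over direct pathways can be sketched as follows. This is a simplified illustration and not the code in the repository cited above: structures are modelled as sets of base pairs, a direct pathway only removes pairs unique to the start structure or adds pairs unique to the goal, and `toy_energy` is an invented stand-in for a real energy model (it counts pairs and rejects conflicting or crossing pairs).

```python
import heapq
from itertools import count


def toy_energy(pairs):
    """Hypothetical energy: -1 per base pair; infinite for shared positions or crossing pairs."""
    ps = sorted(pairs)
    used = [x for p in ps for x in p]
    if len(used) != len(set(used)):
        return float("inf")
    for a in range(len(ps)):
        for b in range(a + 1, len(ps)):
            (i, j), (k, l) = ps[a], ps[b]
            if i < k < j < l:
                return float("inf")
    return -float(len(ps))


def min_barrier_direct(start, goal, energy):
    """Smallest possible maximum energy along any direct pathway from start to goal."""
    to_remove, to_add = start - goal, goal - start
    tie = count()  # tie-breaker so the heap never compares frozensets
    best = {start: energy(start)}
    heap = [(best[start], next(tie), start)]
    while heap:
        barrier, _, s = heapq.heappop(heap)
        if s == goal:
            return barrier  # first time the goal is popped, its barrier is minimal
        if barrier > best.get(s, float("inf")):
            continue
        # Direct moves only: delete a start-only pair or insert a goal-only pair.
        neighbours = [s - {p} for p in to_remove & s] + [s | {p} for p in to_add - s]
        for t in neighbours:
            b = max(barrier, energy(t))
            if b < best.get(t, float("inf")):
                best[t] = b
                heapq.heappush(heap, (b, next(tie), t))
    return float("inf")


start = frozenset({(0, 9), (1, 8)})
goal = frozenset({(0, 5), (1, 4)})
print(min_barrier_direct(start, goal, toy_energy))
```

Because states are expanded in order of the highest energy seen so far, the search is exact for this toy model; the number of reachable states still grows exponentially with the number of differing base pairs, which is why the abstract stresses efficiency.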


Subject(s)
Algorithms , RNA , Nucleic Acid Conformation , RNA Folding , Software
9.
RNA Biol ; 17(2): 264-280, 2020 02.
Article in English | MEDLINE | ID: mdl-31601146

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs that play essential roles in the regulation of gene function by a mechanism known as RNA silencing. In a previous study, we revealed that miRNA-mediated silencing efficacy is correlated with the combinatorial thermodynamic properties of the miRNA seed-target mRNA duplex and the 5'-terminus of the miRNA duplex, which can be predicted using 'miScore'. In this study, a robust refined-miScore was developed by integrating the thermodynamic properties of various miRNA secondary structures and the latest thermodynamic parameters of wobble base-pairing, including newly established parameters for I:C base pairing. Through repeated random sampling and machine learning, refined-miScore models calculated with either melting temperature (Tm) or free energy change (ΔG) values were successfully built and validated in both wild-type and adenosine-to-inosine edited miRNAs. In addition to the previously reported contribution of the seed-target duplex and 5'-terminus region, the refined-miScore suggests that the central and 3'-terminus regions of the miRNA duplex also play a role in the thermodynamic regulation of miRNA-mediated silencing efficacy.


Subject(s)
Adenosine , Amino Acid Substitution , Inosine , MicroRNAs/genetics , Models, Biological , RNA Editing , RNA Interference , Algorithms , Machine Learning , Nucleic Acid Conformation , RNA Stability , RNA, Messenger/genetics , Thermodynamics
10.
Bioinformatics ; 33(11): 1613-1620, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28130234

ABSTRACT

MOTIVATION: Enhancing expression levels of a target protein is an important goal in synthetic biology. A widely used strategy is to integrate multiple copies of genes encoding a target protein into a host organism genome. Integrating highly similar sequences, however, can induce homologous recombination between them, resulting in the ultimate reduction of the number of integrated genes. RESULTS: We propose a method for designing multiple protein-coding sequences (i.e. CDSs) that are unlikely to induce homologous recombination, while encoding the same protein. The method, which is based on a multi-objective genetic algorithm, is intended to design a set of CDSs whose nucleotide sequences are as different as possible and whose codon usage frequencies are as highly adapted as possible to the host organism. We show that our method not only successfully designs a set of intended CDSs, but also provides insight into the trade-off between nucleotide differences among gene copies and codon usage frequencies. AVAILABILITY AND IMPLEMENTATION: Our method, named Tandem Designer, is available as a web-based application at http://tandem.trahed.jp/tandem/ . CONTACT: terai_goro@intec.co.jp or asai@k.u-tokyo.ac.jp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
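
The two competing objectives named in the abstract above can be made concrete with a small sketch. It shows only how a candidate set of CDSs might be scored and compared, not the genetic algorithm itself and not Tandem Designer's actual scoring. The codon weights are invented placeholders, and the function names are hypothetical.

```python
from itertools import combinations
from math import exp, log


def hamming(a, b):
    """Number of differing nucleotides between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_difference(cdss):
    """Objective 1: the least-different pair limits resistance to recombination."""
    return min(hamming(a, b) for a, b in combinations(cdss, 2))


def codon_adaptation(cds, weights):
    """Objective 2: geometric mean of per-codon weights (a CAI-like score)."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return exp(sum(log(weights[c]) for c in codons) / len(codons))


def dominates(p, q):
    """True if score tuple p is at least as good as q everywhere and better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


# Assumed relative codon weights for alanine and lysine in some host organism.
weights = {"GCT": 1.0, "GCC": 0.6, "AAA": 1.0, "AAG": 0.4}
design = ["GCTAAA", "GCCAAG"]  # two synonymous CDSs encoding Ala-Lys
score = (min_pairwise_difference(design),
         min(codon_adaptation(c, weights) for c in design))
print(score)
```

A multi-objective search keeps every design whose score tuple is not dominated by another, which is how a trade-off curve between sequence difference and codon adaptation emerges.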


Subject(s)
Algorithms , Open Reading Frames , Sequence Analysis, DNA/methods , Biological Evolution , Codon , Homologous Recombination , Proteins/genetics , Sequence Analysis, Protein/methods
11.
Bioinformatics ; 32(6): 828-34, 2016 03 15.
Article in English | MEDLINE | ID: mdl-26589279

ABSTRACT

MOTIVATION: An important problem in synthetic biology is to design a nucleotide sequence of an mRNA that confers a desirable expression level of a target protein. The secondary structure of protein-coding sequences (CDSs) is one potential factor that could have both positive and negative effects on protein production. To elucidate the role of secondary structure in CDSs, algorithms for manipulating secondary structure should be developed. RESULTS: We developed an algorithm for designing a CDS with the most stable secondary structure among all possible ones translated into the same protein, and implemented it as the program CDSfold. The algorithm runs the Zuker algorithm under the constraint of a given amino acid sequence. The time and space complexity is O(L^3) and O(L^2), respectively, where L is the length of the CDS to be designed. Although our algorithm is slower than the original Zuker algorithm, it could design a relatively long (2.7-kb) CDS in approximately 1 h. AVAILABILITY AND IMPLEMENTATION: The CDSfold program is freely available for non-commercial users as stand-alone and web-based software from http://cdsfold.trahed.jp/cdsfold/ CONTACTS: terai-goro@aist.go.jp or asai@k.u-tokyo.ac.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
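
To make the design problem in the abstract above tangible, here is a brute-force stand-in, clearly not the CDSfold algorithm. It enumerates every synonymous CDS of a short peptide and scores each with a Nussinov-style base-pair count instead of Zuker free energies. The enumeration is exponential in peptide length, which is exactly what a constrained dynamic program avoids. The codon lists are from the standard genetic code; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, restricted to the amino acids used in the example.
CODONS = {
    "M": ["ATG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "K": ["AAA", "AAG"],
}
# Watson-Crick and G-T (G-U in RNA) wobble pairs, written in DNA letters.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}


def max_pairs(seq, min_loop=3):
    """Nussinov DP: maximum number of nested base pairs with hairpin loops >= min_loop."""
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def most_structured_cds(peptide):
    """Exhaustively pick the synonymous CDS with the most base pairs (toy objective)."""
    candidates = ("".join(c) for c in product(*(CODONS[a] for a in peptide)))
    return max(candidates, key=max_pairs)


best = most_structured_cds("MGKP")
print(best, max_pairs(best))
```

The inner DP has the same cubic time and quadratic space shape as the complexity quoted in the abstract, but the outer enumeration does not scale; the example is meant only to show what "most structured among all synonymous sequences" means.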


Subject(s)
Software , Algorithms , Amino Acid Sequence , Base Sequence , Open Reading Frames , Protein Structure, Secondary , Sequence Alignment
12.
BMC Genomics ; 17 Suppl 1: 12, 2016 Jan 11.
Article in English | MEDLINE | ID: mdl-26818453

ABSTRACT

MOTIVATION: Recent studies have revealed that large numbers of non-coding RNAs are transcribed in humans, but only a few of them have been identified with their functions. Identification of the interaction target RNAs of the non-coding RNAs is an important step in predicting their functions. The current experimental methods to identify RNA-RNA interactions, however, are not fast enough to apply to a whole human transcriptome. Therefore, computational predictions of RNA-RNA interactions are desirable, but this is a challenging task due to the huge computational costs involved. RESULTS: Here, we report comprehensive predictions of the interaction targets of lncRNAs in a whole human transcriptome for the first time. To achieve this, we developed an integrated pipeline for predicting RNA-RNA interactions on the K computer, which is one of the fastest super-computers in the world. Comparisons with experimentally-validated lncRNA-RNA interactions support the quality of the predictions. Additionally, we have developed a database that catalogs the predicted lncRNA-RNA interactions to provide fundamental information about the targets of lncRNAs.


Subject(s)
MicroRNAs/metabolism , RNA, Long Noncoding/metabolism , RNA, Messenger/metabolism , Transcriptome , User-Computer Interface , 3' Untranslated Regions , Algorithms , Databases, Genetic , Humans , Internet , RNA, Long Noncoding/genetics , Tumor Suppressor Protein p53/genetics
13.
Bioinformatics ; 31(7): 981-5, 2015 Apr 01.
Article in English | MEDLINE | ID: mdl-25414363

ABSTRACT

MOTIVATION: Ustiloxins A and B are toxic cyclic tetrapeptides, Tyr-Val/Ala-Ile-Gly (Y-V/A-I-G), that were originally identified from Ustilaginoidea virens, a pathogenic fungus affecting rice plants. Contrary to our report that ustiloxin B is ribosomally synthesized in Aspergillus flavus, a recent report suggested that ustiloxins are synthesized by a non-ribosomal peptide synthetase in U. virens. Thus, we analyzed the U. virens genome, to identify the responsible gene cluster. RESULTS: The biosynthetic gene cluster was identified from the genome of U. virens based on homologies to the ribosomal peptide biosynthetic gene cluster for ustiloxin B identified from A. flavus. It contains a gene encoding precursor protein having five Tyr-Val-Ile-Gly and three Tyr-Ala-Ile-Gly motifs for ustiloxins A and B, respectively, strongly indicating that ustiloxins A and B from U. virens are ribosomally synthesized. AVAILABILITY AND IMPLEMENTATION: Accession codes of the U. virens and A. flavus gene clusters in NCBI are BR001221 and BR001206, respectively. Supplementary data are available at Bioinformatics online.


Subject(s)
Fungal Proteins/genetics , Genes, Fungal , Multigene Family , Peptides, Cyclic/genetics , Ribosomes/metabolism , Ustilago/genetics , Amino Acid Sequence , Biosynthetic Pathways , Fungal Proteins/biosynthesis , Genome, Fungal , Molecular Sequence Data , Peptides, Cyclic/biosynthesis , Sequence Analysis, DNA/methods , Sequence Homology, Amino Acid , Ustilago/growth & development , Ustilago/metabolism
14.
RSC Chem Biol ; 5(4): 360-371, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38576723

ABSTRACT

We developed chemically modified PCR primers that allow the design of flexible sticky ends by introducing a photo-cleavable group at the phosphate moiety. Nucleic acid derivatives containing o-nitrobenzyl photo-cleavable groups with a tert-butyl group at the benzyl position were stable during strong base treatment for oligonucleotide synthesis and thermal cycling in PCR reactions. PCR using primers incorporating these nucleic acid derivatives confirmed that chain extension reactions completely stopped at position 1 before and after the site of the photo-cleavable group was introduced. DNA fragments of 2 and 3 kbp, with sticky ends of 50 bases, were successfully concatenated with a high yield of 77%. A plasmid was constructed using this method. Finally, we applied this approach to construct a 48.5 kbp lambda phage DNA, which is difficult to achieve using restriction enzyme-based methods. After 7 days, we were able to confirm the generation of DNA of the desired length. Although the efficiency is yet to be improved, the chemically modified PCR primer offers potential to complement enzymatic methods and serve as a DNA concatenation technique.

15.
Bioinformatics ; 27(13): 1788-97, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21531769

ABSTRACT

MOTIVATION: The importance of RNA sequence analysis has been increasing since the discovery of various types of non-coding RNAs transcribed in animal cells. Conventional RNA sequence analyses have mainly focused on structured regions, which are stabilized by the stacking energies acting on adjacent base pairs. On the other hand, recent findings regarding the mechanisms of small interfering RNAs (siRNAs) and transcription regulation by microRNAs (miRNAs) indicate the importance of analyzing accessible regions where no base pairs exist. So far, relatively few studies have investigated the nature of such regions. RESULTS: We have conducted a detailed investigation of accessibilities around the target sites of siRNAs and miRNAs. We have exhaustively calculated the correlations between the accessibilities around the target sites and the repression levels of the corresponding mRNAs. We have computed the accessibilities with an originally developed software package, called 'Raccess', which computes the accessibility of all the segments of a fixed length for a given RNA sequence when the maximal distance between base pairs is limited to a fixed size W. We show that the computed accessibilities are relatively insensitive to the choice of the maximal span W. We have found that the efficacy of siRNAs depends strongly on the accessibility of the very 3'-end of their binding sites, which might reflect a target site recognition mechanism in the RNA-induced silencing complex. We also show that the efficacy of miRNAs has a similar dependence on the accessibilities, but some miRNAs also show positive correlations between the efficacy and the accessibilities in broad regions downstream of their putative binding sites, which might imply that the downstream regions of the target sites are bound by other proteins that allow the miRNAs to implement their functions. We have also investigated the off-target effects of an siRNA as a potential RNAi therapeutic. 
We show that the off-target effects of the siRNA have similar correlations to the miRNA repression, indicating that they are caused by the same mechanism. AVAILABILITY: The C++ source code of the Raccess software is available at http://www.ncrna.org/software/Raccess/ The microarray data on the measurements of the siRNA off-target effects are also available at the same site. CONTACT: kiryu-h@k.u-tokyo.ac.jp


Subject(s)
MicroRNAs/chemistry , MicroRNAs/metabolism , RNA, Small Interfering/chemistry , RNA, Small Interfering/metabolism , Animals , Fibroblasts/metabolism , Gene Expression Regulation , HeLa Cells , Humans , Mice , Nucleic Acid Conformation , RNA, Messenger/genetics , RNA-Induced Silencing Complex/metabolism , Rats , Sequence Analysis, RNA , Thermodynamics
16.
Nucleic Acids Res ; 38(4): 1163-71, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19965772

ABSTRACT

More than 40% of the human genome is generated by retrotransposition, a series of in vivo processes involving reverse transcription of RNA molecules and integration of the transcripts into the genomic sequence. The mechanism of retrotransposition, however, is not fully understood, and additional genomic elements generated by retrotransposition may remain to be discovered. Here, we report that the human genome contains many previously unidentified short pseudogenes generated by retrotransposition of mRNAs. Genomic elements generated by non-long terminal repeat retrotransposition have specific sequence signatures: a poly-A tract that is immediately downstream and a pair of duplicated sequences, called target site duplications (TSDs), at either end. Using a new computer program, TSDscan, that can accurately detect pseudogenes based on the presence of the poly-A tract and TSDs, we found 654 short (≤300 bp), previously unknown pseudogenes derived from mRNAs. Comprehensive analyses of the pseudogenes that we identified and their parent mRNAs revealed that the pseudogene length depends on the parent mRNA length: long mRNAs generate more short pseudogenes than do short mRNAs. To explain this phenomenon, we hypothesize that most long mRNAs are truncated before they are reverse transcribed. Truncated mRNAs would be rapidly degraded during reverse transcription, resulting in the generation of short pseudogenes.
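
The sequence signature described in the abstract above (a poly-A tract immediately downstream of the element, flanked by target site duplications) can be checked with a short sketch. This is not TSDscan: it uses exact string matching only, and the thresholds `min_polya`, `min_tsd`, and `max_tsd` as well as the function name are assumptions chosen for illustration.

```python
def find_tsd(genome, start, end, min_polya=8, min_tsd=6, max_tsd=20):
    """Return the TSD flanking the candidate element genome[start:end], or None.

    Expected layout: TSD | element | poly-A tract | TSD.
    """
    # Measure the run of A's that begins right after the element.
    a_end = end
    while a_end < len(genome) and genome[a_end] == "A":
        a_end += 1
    if a_end - end < min_polya:
        return None
    # Try the longest duplication first: the sequence just upstream of the
    # element must reappear right after the poly-A tract.
    for k in range(max_tsd, min_tsd - 1, -1):
        if start - k < 0 or a_end + k > len(genome):
            continue
        if genome[start - k:start] == genome[a_end:a_end + k]:
            return genome[start - k:start]
    return None


# Synthetic example (assumed sequences, not real genomic data).
tsd, element = "GATTCGC", "CCGGTTGGCC"
genome = "TTT" + tsd + element + "A" * 10 + tsd + "GGG"
print(find_tsd(genome, 10, 20))
```

A practical detector would also tolerate mismatches in the duplications and handle a TSD that itself begins with A, both of which this exact-match version ignores.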


Subject(s)
Pseudogenes , RNA, Messenger/chemistry , Retroelements , Algorithms , Genome, Human , Humans , Models, Genetic , Poly A/analysis
17.
Microb Biotechnol ; 15(9): 2364-2378, 2022 09.
Article in English | MEDLINE | ID: mdl-35656803

ABSTRACT

In our previous study, we serendipitously discovered that protein secretion in the methylotrophic yeast Pichia pastoris is enhanced by a mutation (V50A) in the mating factor alpha (MFα) prepro-leader signal derived from Saccharomyces cerevisiae. In the present study, we investigated 20 single-amino-acid substitutions, including V50A, located within the MFα signal peptide, indicating that V50A and several single mutations alone provided a significant increase in production of the secreted proteins. In addition to hydrophobicity index analysis, both an unfolded protein response (UPR) biosensor analysis and a microscopic observation showed a clear difference in the levels of UPR induction and mis-sorting of secretory protein into vacuoles among the wild-type and mutated MFα signal peptides. This work demonstrates the importance of avoiding entry of secretory proteins into the intracellular protein degradation pathways, an observation that is expected to contribute to the engineering of strains with increased production of recombinant secreted proteins.


Subject(s)
Fungal Proteins , Pichia , Amino Acid Sequence , Fungal Proteins/genetics , Fungal Proteins/metabolism , Mating Factor/genetics , Mating Factor/metabolism , Mutation , Pichia/genetics , Pichia/metabolism , Protein Sorting Signals/genetics , Proteolysis , Recombinant Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Saccharomycetales
18.
Commun Biol ; 5(1): 561, 2022 06 08.
Article in English | MEDLINE | ID: mdl-35676418

ABSTRACT

Expression of secreted recombinant proteins burdens the protein secretion machinery, limiting production. Here, we describe an approach to improving protein production by the non-conventional yeast Komagataella phaffii comprised of genome-wide screening for effective gene disruptions, combining them in a single strain, and recovering growth reduction by adaptive evolution. For the screen, we designed a multiwell-formatted, streamlined workflow to high-throughput assay of secretion of a single-chain small antibody, which is cumbersome to detect but serves as a good model of proteins that are difficult to secrete. Using the consolidated screening system, we evaluated >19,000 mutant strains from a mutant library prepared by a modified random gene-disruption method, and identified six factors for which disruption led to increased antibody production. We then combined the disruptions, up to quadruple gene knockouts, which appeared to contribute independently, in a single strain and observed an additive effect. Target protein and promoter were basically interchangeable for the effects of knockout genes screened. We finally used adaptive evolution to recover reduced cell growth by multiple gene knockouts and examine the possibility for further enhancing protein secretion. Our successful, three-part approach holds promise as a method for improving protein production by non-conventional microorganisms.


Subject(s)
Saccharomycetales , Gene Knockout Techniques , Recombinant Proteins/metabolism , Saccharomycetales/genetics , Saccharomycetales/metabolism , Workflow
19.
Nature ; 438(7071): 1157-61, 2005 Dec 22.
Article in English | MEDLINE | ID: mdl-16372010

ABSTRACT

The genome of Aspergillus oryzae, a fungus important for the production of traditional fermented foods and beverages in Japan, has been sequenced. The ability to secrete large amounts of proteins and the development of a transformation system have facilitated the use of A. oryzae in modern biotechnology. Although both A. oryzae and Aspergillus flavus belong to the section Flavi of the subgenus Circumdati of Aspergillus, A. oryzae, unlike A. flavus, does not produce aflatoxin, and its long history of use in the food industry has proved its safety. Here we show that the 37-megabase (Mb) genome of A. oryzae contains 12,074 genes and is expanded by 7-9 Mb in comparison with the genomes of Aspergillus nidulans and Aspergillus fumigatus. Comparison of the three aspergilli species revealed the presence of syntenic blocks and A. oryzae-specific blocks (lacking synteny with A. nidulans and A. fumigatus) in a mosaic manner throughout the genome of A. oryzae. The blocks of A. oryzae-specific sequence are enriched for genes involved in metabolism, particularly those for the synthesis of secondary metabolites. Specific expansion of genes for secretory hydrolytic enzymes, amino acid metabolism and amino acid/sugar uptake transporters supports the idea that A. oryzae is an ideal microorganism for fermentation.


Subject(s)
Aspergillus oryzae/genetics , Genome, Fungal , Genomics , Aspartic Acid Endopeptidases/genetics , Aspergillus oryzae/enzymology , Aspergillus oryzae/metabolism , Chromosomes, Fungal/genetics , Cytochrome P-450 Enzyme System/genetics , Genes, Fungal/genetics , Molecular Sequence Data , Phylogeny , Synteny
20.
Nucleic Acids Res ; 37(Database issue): D89-92, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18948287

ABSTRACT

We developed a pair of databases that support two important tasks: annotation of anonymous RNA transcripts and discovery of novel non-coding RNAs. The database combo is called the Functional RNA Database and consists of two databases: a rewrite of the original version of the Functional RNA Database (fRNAdb) and the latest version of the UCSC GenomeBrowser for Functional RNA. The former is a sequence database equipped with a powerful search function and hosts a large collection of known/predicted non-coding RNA sequences acquired from existing databases as well as novel/predicted sequences reported by researchers of the Functional RNA Project. The latter is a UCSC Genome Browser mirror with large additional custom tracks specifically associated with non-coding elements. It also includes several functional enhancements such as a presentation of a common secondary structure prediction at any given genomic window ≤500 bp. Our GenomeBrowser supports user authentication and user-specific tracks. The current version of the fRNAdb is a complete rewrite of the former version, hosting a larger number of sequences and with a much friendlier interface. The current version of UCSC GenomeBrowser for Functional RNA features a larger number of tracks and richer features than the former version. The databases are available at http://www.ncrna.org/.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , Animals , Genomics , Humans , Mice , RNA, Untranslated/physiology , Rats , Sequence Analysis, RNA