Results 1 - 20 of 818
1.
J Chem Inf Model ; 64(7): 2221-2235, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37158609

ABSTRACT

Noncoding RNAs (ncRNAs) play crucial roles in many cellular life activities by interacting with proteins. Identification of ncRNA-protein interactions (ncRPIs) is key to understanding the function of ncRNAs. Although a number of computational methods for predicting ncRPIs have been developed, the problem of predicting ncRPIs remains challenging. Selecting suitable feature extraction methods and developing a deep learning architecture with better recognition performance have always been the focus of ncRPI research. In this work, we proposed an ensemble deep learning framework, RPI-EDLCN, based on a capsule network (CapsuleNet) to predict ncRPIs. In terms of feature input, we extracted the sequence features, secondary structure sequence features, motif information, and physicochemical properties of ncRNA/protein. The sequence and secondary structure sequence features of ncRNA/protein are encoded by the conjoint k-mer method and then input into an ensemble deep learning model based on CapsuleNet by combining the motif information and physicochemical properties. In this model, the encoding features are processed by a convolutional neural network (CNN), a deep neural network (DNN), and a stacked autoencoder (SAE). Then the advanced features obtained from the processing are input into the CapsuleNet for further feature learning. Compared with other state-of-the-art methods under 5-fold cross-validation, the performance of RPI-EDLCN is the best, and the accuracy of RPI-EDLCN on RPI1807, RPI2241, and NPInter v2.0 data sets was 93.8%, 88.2%, and 91.9%, respectively. The results of the independent test indicated that RPI-EDLCN can effectively predict potential ncRPIs in different organisms. In addition, RPI-EDLCN successfully predicted hub ncRNAs and proteins in Mus musculus ncRNA-protein networks. Overall, our model can be used as an effective tool to predict ncRPIs and provides some useful guidance for future biological studies.
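
The conjoint k-mer encoding mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the grouping or the value of k, so everything specific here is an assumption: the seven-class reduced amino-acid alphabet is the one commonly used for conjoint-triad features, k = 3 is used for proteins and k = 4 for RNA, and the function names are hypothetical.

```python
from itertools import product

# Seven-class reduced amino-acid alphabet often used for conjoint-triad
# features (an assumption; the abstract does not list the grouping).
AA_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: str(i) for i, group in enumerate(AA_CLASSES) for aa in group}


def kmer_frequencies(seq, alphabet, k):
    """Return normalized k-mer frequencies of seq in a fixed k-mer order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing unknown symbols
            counts[window] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


def encode_protein(seq, k=3):
    """Map residues to class labels, then count k-mers over the 7 classes."""
    reduced = "".join(AA_TO_CLASS.get(aa, "X") for aa in seq.upper())
    return kmer_frequencies(reduced, "0123456", k)


def encode_rna(seq, k=4):
    """Count k-mers over the four-letter RNA alphabet (T is read as U)."""
    return kmer_frequencies(seq.upper().replace("T", "U"), "ACGU", k)


protein_vec = encode_protein("MKVLAAGIRKDE")
rna_vec = encode_rna("AUGGCUACGUAGCUAGC")
print(len(protein_vec), len(rna_vec))  # 343 256
```

With these assumed settings each protein becomes a 343-dimensional vector (7^3) and each RNA a 256-dimensional vector (4^4), which could then be fed to downstream networks.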


Subjects
Deep Learning , Animals , Mice , Untranslated RNA/chemistry , Untranslated RNA/metabolism , Proteins , Computer Neural Networks
2.
Nucleic Acids Res ; 52(1): 274-287, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38000384

ABSTRACT

Most of the transcribed eukaryotic genomes are composed of non-coding transcripts. Among these transcripts, some are newly transcribed when compared to outgroups and are referred to as de novo transcripts. De novo transcripts have been shown to play a major role in genomic innovations. However, little is known about the rates at which de novo transcripts are gained and lost in individuals of the same species. Here, we address this gap and estimate the de novo transcript turnover rate with an evolutionary model. We use DNA long reads and RNA short reads from seven geographically remote samples of inbred individuals of Drosophila melanogaster to detect de novo transcripts that are gained on a short evolutionary time scale. Overall, each sampled individual contains around 2500 unspliced de novo transcripts, with most of them being sample specific. We estimate that around 0.15 transcripts are gained per year, and that each gained transcript is lost at a rate around 5 × 10⁻⁵ per year. This high turnover of transcripts suggests frequent exploration of new genomic sequences within species. These rate estimates are essential to comprehend the process and timescale of de novo gene birth.
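
To see how the two reported rates relate to each other, the following sketch assumes the simplest possible gain/loss model: a constant gain rate and an independent, constant per-transcript loss rate, i.e. dN/dt = gain - loss * N. This model is an assumption made for illustration and is not necessarily the evolutionary model used in the paper; the function names are hypothetical.

```python
import math

GAIN_PER_YEAR = 0.15   # transcripts gained per year (from the abstract)
LOSS_PER_YEAR = 5e-5   # per-transcript loss rate per year (from the abstract)


def equilibrium_count(gain, loss):
    """Steady state of dN/dt = gain - loss * N."""
    return gain / loss


def expected_count(t_years, gain, loss, n0=0.0):
    """Solution N(t) of the linear gain/loss model starting from n0."""
    n_eq = equilibrium_count(gain, loss)
    return n_eq + (n0 - n_eq) * math.exp(-loss * t_years)


def half_life(loss):
    """Time after which half of the transcripts present at t = 0 are lost."""
    return math.log(2) / loss


print(equilibrium_count(GAIN_PER_YEAR, LOSS_PER_YEAR))
print(half_life(LOSS_PER_YEAR))
```

Under this assumed model the equilibrium is about 3000 transcripts, the same order of magnitude as the roughly 2500 de novo transcripts reported per individual, and a transcript's half-life is on the order of 14,000 years.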


Subjects
Drosophila melanogaster , Molecular Evolution , Untranslated RNA , Genetic Transcription , Animals , Humans , Biological Evolution , Drosophila melanogaster/genetics , Genome , Genomics , RNA , Untranslated RNA/chemistry , Untranslated RNA/genetics , Untranslated RNA/metabolism , Geography
3.
Eur J Med Chem ; 261: 115850, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-37839343

ABSTRACT

The growing information currently available on the central role of non-coding RNAs (ncRNAs), including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), in chronic and degenerative human diseases makes them attractive therapeutic targets. RNAs carry out different functional roles in human biology and are deeply deregulated in several diseases. So far, different attempts to therapeutically target 3D RNA structures with small molecules have been reported. In this scenario, the development of computational tools suitable for describing RNA structures and their potential interactions with small molecules is gaining more and more interest. Here, we describe the most suitable strategies to study ncRNAs through computational tools. We focus on methods capable of predicting 2D and 3D ncRNA structures. Furthermore, we describe computational tools to identify, design and optimize small molecule ncRNA binders. This review aims to outline the state of the art and perspectives of computational methods for ncRNAs over the past decade.


Subjects
MicroRNAs , Long Noncoding RNA , Humans , Untranslated RNA/genetics , Untranslated RNA/chemistry , MicroRNAs/genetics , Long Noncoding RNA/genetics , Long Noncoding RNA/therapeutic use
4.
Nucleic Acids Res ; 51(16): 8367-8382, 2023 09 08.
Article in English | MEDLINE | ID: mdl-37471030

ABSTRACT

Understanding the 3D structure of RNA is key to understanding RNA function. RNA 3D structure is modular and can be seen as a composition of building blocks of various sizes called tertiary motifs. Currently, long-range motifs formed between distant loops and helical regions are largely less studied than the local motifs determined by the RNA secondary structure. We surveyed long-range tertiary interactions and motifs in a non-redundant set of non-coding RNA 3D structures. A new dataset of annotated LOng-RAnge RNA 3D modules (LORA) was built using an approach that does not rely on the automatic annotations of non-canonical interactions. An original algorithm, ARTEM, was developed for annotation-, sequence- and topology-independent superposition of two arbitrary RNA 3D modules. The proposed methods allowed us to identify and describe the most common long-range RNA tertiary motifs. Along with the prevalent canonical A-minor interactions, a large number of previously undescribed staple interactions were observed. The most frequent long-range motifs were found to belong to three main motif families: planar staples, tilted staples, and helical packing motifs.


Subjects
Nucleic Acid Conformation , Untranslated RNA , Base Pairing , Nucleotide Motifs , Untranslated RNA/chemistry
5.
Int J Mol Sci ; 24(10), 2023 May 17.
Article in English | MEDLINE | ID: mdl-37240230

ABSTRACT

Non-coding RNA (ncRNA) classes perform important housekeeping and regulatory functions and are quite heterogeneous in terms of length, sequence conservation and secondary structure. High-throughput sequencing reveals that expressed novel ncRNAs and their classification are important for understanding cell regulation and identifying potential diagnostic and therapeutic biomarkers. To improve the classification of ncRNAs, we investigated different approaches of utilizing primary sequences and secondary structures as well as the late integration of both using machine learning models, including different neural network architectures. As input, we used the newest version of RNAcentral, focusing on six ncRNA classes: lncRNA, rRNA, tRNA, miRNA, snRNA and snoRNA. The late integration of graph-encoded structural features and primary sequences in our MncR classifier achieved an overall accuracy of >97%, which could not be increased by more fine-grained subclassification. In comparison to the current best-performing tool, ncRDense, we achieved a minimal increase of 0.5% in all four overlapping ncRNA classes on a similar test set of sequences. In summary, MncR is not only more accurate than current ncRNA prediction tools but also allows the prediction of long ncRNA classes (lncRNAs, certain rRNAs) up to 12,000 nt and is trained on a more diverse ncRNA dataset retrieved from RNAcentral.


Subjects
MicroRNAs , Long Noncoding RNA , Untranslated RNA/chemistry , Long Noncoding RNA/genetics , Computer Neural Networks , Machine Learning , Ribosomal RNA
6.
Comput Biol Med ; 157: 106783, 2023 05.
Article in English | MEDLINE | ID: mdl-36958237

ABSTRACT

Noncoding RNA (ncRNA) is a functional RNA derived from DNA transcription, and most transcribed genes are transcribed into ncRNA. ncRNA is not directly involved in the translation of proteins, but it can participate in gene expression in cells and affect protein synthesis, thus playing an important role in biological processes such as growth, proliferation, metabolism, and information transmission. Therefore, understanding the interaction between ncRNA and protein is the basis for studying ncRNA regulation of protein-related biological activities. However, it is very expensive and time-consuming to verify ncRNA-protein interaction through biological experiments, and prediction methods based on machine learning have been developed rapidly. Recently, the graph neural network model (GNN) stands out for its excellent performance, but lacks a general framework for predicting ncRNA-protein interactions. We propose a GNN-based framework to predict ncRNA-protein interactions, which can utilize topological structure information to complete prediction tasks faster and more accurately. Meanwhile, for some smaller datasets, many ncRNA nodes lack neighbor information, resulting in lower prediction accuracy. For some larger datasets, the long-tail distribution causes the prediction of the tail nodes (sparse nodes linking few neighbors) to be affected. Therefore, we propose a new sampling method named HeadTailTransfer to mitigate these effects. Experimental results illustrate the effectiveness of this method. Especially for task-specific prediction on the RPI369 dataset in the Graphsage-based neural network framework, the AUC and ACC values increased from 56.8% and 52.2% to 80.2% and 71.8%, respectively. Our data and codes are available: https://github.com/kkkayle/HeadTailTransfer.


Subjects
Computer Neural Networks , Untranslated RNA , Untranslated RNA/genetics , Untranslated RNA/chemistry , Untranslated RNA/metabolism , Machine Learning , Protein Binding , Proteins/metabolism
7.
Methods Mol Biol ; 2586: 121-146, 2023.
Article in English | MEDLINE | ID: mdl-36705902

ABSTRACT

Noncoding RNAs, ncRNAs, naturally fold into structures, which allow them to perform their functions in the cell. Evolutionarily close species share structures and functions. This occurs because of shared selective pressures, resulting in conserved groups. Previous efforts in finding functional RNAs have been made in detecting conserved structures in genomes or alignments. It may occur that, within a conserved group, species-specific structures arise after species split due to positive selection. Detecting positive selection in ncRNAs is a hard problem in biology as well as bioinformatics. To detect positive selection, one should find species-specific structures within a conserved set. This chapter provides protocols to detect and analyze positive selection in ncRNA structures with the SSS-test and other free software.


Subjects
Untranslated RNA , RNA , RNA/genetics , Untranslated RNA/genetics , Untranslated RNA/chemistry , Software , Biological Evolution , Nucleic Acid Conformation
8.
J Phys Chem B ; 126(48): 10018-10033, 2022 12 08.
Article in English | MEDLINE | ID: mdl-36417896

ABSTRACT

Less than one in thirty of the RNA sequences transcribed in humans are translated into protein. The noncoding RNA (ncRNA) functions in catalysis, structure, regulation, and more. However, for the most part, these functions are poorly characterized. RNA is modular and described by motifs that include helical A-RNA with canonical Watson-Crick base-pairing as well as structures with only noncanonical base pairs. Understanding the structure and dynamics of motifs will aid in deciphering functions of specific ncRNAs. We present computational studies on a standard sarcin/ricin domain (SRD), citrus bark cracking viroid SRD, as well as A-RNA. We have applied enhanced molecular dynamics techniques that construct an inverse free-energy surface (iFES) determined by collective variables that monitor base-pairing and backbone conformation. Each SRD RNA is flanked on each side by A-RNA, allowing comparison of the behavior of these motifs in the same molecule. The RNA iFESs have single peaks, indicating that the combined motifs should denature as a single cohesive unit, rather than by regional melting. Local root-mean-square deviation (RMSD) analysis and communication propensity (CProp, variance in distances between residue pairs) reveal distinct motif properties. Our analysis indicates that the standard SRD is more stable than the viroid SRD, which is more stable than A-RNA. Base pairs at SRD to A-RNA transitions have limited flexibility. Application of CProp reveals extraordinary stiffness of the SRD, allowing residues on opposite sides of the motif to sense each other's motions.
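
The abstract defines communication propensity (CProp) as the variance in distances between residue pairs. A minimal sketch of that definition follows, assuming one representative 3D point per residue per trajectory frame and using the population variance; which atom represents a residue and which variance estimator the authors used are not stated in the abstract, and the function name is hypothetical.

```python
import math
from statistics import pvariance


def communication_propensity(trajectory):
    """Variance over frames of each inter-residue distance.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one representative point per residue (an assumption for this sketch).
    Returns a dict mapping (i, j) with i < j to the distance variance.
    """
    n_res = len(trajectory[0])
    cprop = {}
    for i in range(n_res):
        for j in range(i + 1, n_res):
            dists = [math.dist(frame[i], frame[j]) for frame in trajectory]
            cprop[(i, j)] = pvariance(dists)
    return cprop


# Toy trajectory: residues 0 and 1 keep a fixed distance, residue 2 moves.
toy = [
    [(0, 0, 0), (1, 0, 0), (0, 2, 0)],
    [(0, 0, 0), (1, 0, 0), (0, 4, 0)],
    [(0, 0, 0), (1, 0, 0), (0, 3, 0)],
]
cprop = communication_propensity(toy)
print(cprop[(0, 1)], cprop[(0, 2)])
```

A low value for a pair means the two residues move as if rigidly coupled, which is how the abstract interprets the stiffness of the SRD motif.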


Subjects
Molecular Dynamics Simulation , Nucleotide Motifs , Untranslated RNA , Humans , Ricin , Untranslated RNA/chemistry , Base Pairing , Nucleic Acid Conformation
9.
Nucleic Acids Res ; 50(19): 11229-11242, 2022 10 28.
Article in English | MEDLINE | ID: mdl-36259651

ABSTRACT

Non-coding RNAs (ncRNAs) ubiquitously exist in normal and cancer cells. Despite their prevalent distribution, the functions of most long ncRNAs remain uncharacterized. The fission yeast Schizosaccharomyces pombe expresses >1800 ncRNAs annotated to date, but most unconventional ncRNAs (excluding tRNA, rRNA, snRNA and snoRNA) remain uncharacterized. To discover the functional ncRNAs, here we performed a combinatory screening of computational and biological tests. First, all S. pombe ncRNAs were screened in silico for those showing conservation in sequence as well as in secondary structure with ncRNAs in closely related species. Almost a half of the 151 selected conserved ncRNA genes were uncharacterized. Twelve ncRNA genes that did not overlap with protein-coding sequences were next chosen for biological screening that examines defects in growth or sexual differentiation, as well as sensitivities to drugs and stresses. Finally, we highlighted an ncRNA transcribed from SPNCRNA.1669, which inhibited untimely initiation of sexual differentiation. A domain that was predicted as conserved secondary structure by the computational operations was essential for the ncRNA to function. Thus, this study demonstrates that in silico selection focusing on conservation of the secondary structure over species is a powerful method to pinpoint novel functional ncRNAs.


Subjects
Schizosaccharomyces , Schizosaccharomyces/genetics , Sex Differentiation , Untranslated RNA/genetics , Untranslated RNA/chemistry , Small Nucleolar RNA/genetics , Open Reading Frames
10.
Neural Netw ; 156: 170-178, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36274524

ABSTRACT

Non-coding RNAs (ncRNAs) play an important role in revealing the mechanism of human disease for anti-tumor and anti-virus substances. Detecting the subcellular locations of ncRNAs is a necessary step in studying ncRNAs. Traditional biochemical methods are time-consuming and labor-intensive, whereas computational methods can help detect the location of ncRNAs on a large scale. However, many models do not consider the correlation information among multiple subcellular localizations of ncRNAs. This study proposes a radial basis function neural network based on shared subspace learning (RBFNN-SSL), which extracts shared structures in multi-labels. To evaluate performance, our classifier is tested on three ncRNA datasets. Our model achieves better performance in experimental results.


Subjects
Computer Neural Networks , Untranslated RNA , Humans , Untranslated RNA/genetics , Untranslated RNA/chemistry , Computational Biology/methods
11.
Nature ; 609(7926): 394-399, 2022 09.
Article in English | MEDLINE | ID: mdl-35978193

ABSTRACT

Cellular RNAs are heterogeneous with respect to their alternative processing and secondary structures, but the functional importance of this complexity is still poorly understood. A set of alternatively processed antisense non-coding transcripts, which are collectively called COOLAIR, are generated at the Arabidopsis floral-repressor locus FLOWERING LOCUS C (FLC) [1]. Different isoforms of COOLAIR influence FLC transcriptional output in warm and cold conditions [2-7]. Here, to further investigate the function of COOLAIR, we developed an RNA structure-profiling method to determine the in vivo structure of single RNA molecules rather than the RNA population average. This revealed that individual isoforms of the COOLAIR transcript adopt multiple structures with different conformational dynamics. The major distally polyadenylated COOLAIR isoform in warm conditions adopts three predominant structural conformations, the proportions and conformations of which change after cold exposure. An alternatively spliced, strongly cold-upregulated distal COOLAIR isoform [6] shows high structural diversity, in contrast to proximally polyadenylated COOLAIR. A hyper-variable COOLAIR structural element was identified that was complementary to the FLC transcription start site. Mutations altering the structure of this region changed FLC expression and flowering time, consistent with an important regulatory role of the COOLAIR structure in FLC transcription. Our work demonstrates that isoforms of non-coding RNA transcripts adopt multiple distinct and functionally relevant structural conformations, which change in abundance and shape in response to external conditions.


Subjects
Arabidopsis , Nucleic Acid Conformation , Antisense RNA , Plant RNA , Untranslated RNA , Single Molecule Imaging , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Flowers/genetics , Flowers/growth & development , Plant Gene Expression Regulation , MADS Domain Proteins/genetics , Antisense RNA/chemistry , Antisense RNA/genetics , Plant RNA/chemistry , Plant RNA/genetics , Untranslated RNA/chemistry , Untranslated RNA/genetics , Transcription Initiation Site , Genetic Transcription
12.
PLoS Comput Biol ; 18(7): e1010240, 2022 07.
Article in English | MEDLINE | ID: mdl-35797361

ABSTRACT

It is well-established that neural networks can predict or identify structural motifs of non-coding RNAs (ncRNAs). Yet, the neural network based identification of RNA structural motifs is limited by the availability of training data that are often insufficient for learning features of specific ncRNA families or structural motifs. Aiming to reliably identify intrinsic transcription terminators in bacteria, we introduce a novel pre-training approach that uses inverse folding to generate training data for predicting or identifying a specific family or structural motif of ncRNA. We assess the ability of neural networks to identify secondary structure by systematic in silico mutagenesis experiments. In a study to identify intrinsic transcription terminators as functionally well-understood RNA structural motifs, our inverse folding based pre-training approach significantly boosts the performance of neural network topologies, which outperform previous approaches to identify intrinsic transcription terminators. Inverse-folding based pre-training provides a simple, yet highly effective way to integrate the well-established thermodynamic energy model into deep neural networks for identifying ncRNA families or motifs. The pre-training technique is broadly applicable to a range of network topologies as well as different types of ncRNA families and motifs.
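
The systematic in silico mutagenesis mentioned in the abstract can be sketched as scoring every single-nucleotide substitution of a sequence against the unmutated one. The scoring function below is a hypothetical stand-in (GC fraction), not the authors' neural network, and all names are illustrative assumptions.

```python
def single_point_mutants(seq, alphabet="ACGU"):
    """Yield (position, original, new_base, mutated_sequence) for every substitution."""
    for pos, original in enumerate(seq):
        for base in alphabet:
            if base != original:
                yield pos, original, base, seq[:pos] + base + seq[pos + 1:]


def mutation_effects(seq, score):
    """Score change of every single-point mutant relative to the wild type.

    `score` is any callable mapping a sequence to a number, for example a
    trained classifier's output probability; it is a placeholder here.
    """
    baseline = score(seq)
    return {(pos, base): score(mutant) - baseline
            for pos, _orig, base, mutant in single_point_mutants(seq)}


def toy_gc_score(seq):
    """Hypothetical stand-in for a model: fraction of G and C bases."""
    return sum(base in "GC" for base in seq) / len(seq)


effects = mutation_effects("GGCAUU", toy_gc_score)
print(len(effects))  # 18 substitutions: 6 positions x 3 alternative bases
```

Positions whose substitutions change the score most are the ones the model relies on, which is one way to check whether a network has learned a structural motif rather than incidental sequence features.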


Subjects
Computer Neural Networks , Untranslated RNA , Humans , Nucleotide Motifs , Untranslated RNA/chemistry , Untranslated RNA/genetics
13.
J Virol ; 96(8): e0194621, 2022 04 27.
Article in English | MEDLINE | ID: mdl-35353000

ABSTRACT

Hepatitis C virus (HCV) is a positive-strand RNA virus that remains one of the main contributors to chronic liver disease worldwide. Studies over the last 30 years have demonstrated that HCV contains a highly structured RNA genome and many of these structures play essential roles in the HCV life cycle. Despite the importance of riboregulation in this virus, most of the HCV RNA genome remains functionally unstudied. Here, we report a complete secondary structure map of the HCV RNA genome in vivo, which was studied in parallel with the secondary structure of the same RNA obtained in vitro. Our results show that HCV is folded extensively in the cellular context. By performing comprehensive structural analyses on both in vivo data and in vitro data, we identify compact and conserved secondary and tertiary structures throughout the genome. Genetic and evolutionary functional analyses demonstrate that many of these elements play important roles in the virus life cycle. In addition to providing a comprehensive map of RNA structures and riboregulatory elements in HCV, this work provides a resource for future studies aimed at identifying therapeutic targets and conducting further mechanistic studies on this important human pathogen. IMPORTANCE: HCV has one of the most highly structured RNA genomes studied to date, and it is a valuable model system for studying the role of RNA structure in protein-coding genes. While previous studies have identified individual cases of regulatory RNA structures within the HCV genome, the full-length structure of the HCV genome has not been determined in vivo. Here, we present the complete secondary structure map of HCV determined both in cells and from corresponding transcripts generated in vitro. In addition to providing a comprehensive atlas of functional secondary structural elements throughout the genomic RNA, we identified a novel set of tertiary interactions and demonstrated their functional importance. In terms of broader implications, the pipeline developed in this study can be applied to other long RNAs, such as long noncoding RNAs. In addition, the RNA structural motifs characterized in this study broaden the repertoire of known riboregulatory elements.


Subjects
Viral Genome , Hepacivirus , Viral RNA , Viral Genome/genetics , Hepacivirus/genetics , Hepatitis C/virology , Humans , Untranslated RNA/chemistry , Viral RNA/chemistry , Viral RNA/genetics
14.
Genome Res ; 32(5): 968-985, 2022 05.
Article in English | MEDLINE | ID: mdl-35332099

ABSTRACT

The recent development and application of methods based on the general principle of "crosslinking and proximity ligation" (crosslink-ligation) are revolutionizing RNA structure studies in living cells. However, extracting structure information from such data presents unique challenges. Here, we introduce a set of computational tools for the systematic analysis of data from a wide variety of crosslink-ligation methods, specifically focusing on read mapping, alignment classification, and clustering. We design a new strategy to map short reads with irregular gaps at high sensitivity and specificity. Analysis of previously published data reveals distinct properties and bias caused by the crosslinking reactions. We perform rigorous and exhaustive classification of alignments and discover eight types of arrangements that provide distinct information on RNA structures and interactions. To deconvolve the dense and intertwined gapped alignments, we develop a network/graph-based tool Crosslinked RNA Secondary Structure Analysis using Network Techniques (CRSSANT), which enables clustering of gapped alignments and discovery of new alternative and dynamic conformations. We discover that multiple crosslinking and ligation events can occur on the same RNA, generating multisegment alignments to report complex high-level RNA structures and multi-RNA interactions. We find that alignments with overlapped segments are produced from potential homodimers and develop a new method for their de novo identification. Analysis of overlapping alignments revealed potential new homodimers in cellular noncoding RNAs and RNA virus genomes in the Picornaviridae family. Together, this suite of computational tools enables rapid and efficient analysis of RNA structure and interaction data in living cells.


Subjects
Untranslated RNA , RNA , Algorithms , Cluster Analysis , RNA/chemistry , RNA/genetics , Untranslated RNA/chemistry , RNA Sequence Analysis/methods , Software
15.
Gigascience ; 12, 2022 12 28.
Article in English | MEDLINE | ID: mdl-37848616

ABSTRACT

BACKGROUND: While web-based tools such as BLAST have made identifying conserved gene homologs appear easy, genes with variable sequences pose significant challenges. Functionally important noncoding RNAs (ncRNA) often show low sequence conservation due to genetic variations, including insertions and deletions. Rather than conserved sequences, these RNAs possess highly conserved structural features across a broad phylogenetic range. Such features can be identified using the covariance models approach, which combines sequence alignment with a secondary RNA structure consensus. However, running standard implementation of that approach (Infernal) requires advanced bioinformatics knowledge compared to user-friendly web services like BLAST. The issue is partially addressed by RNAcentral, which can be used to search for homologs across a broad range of ncRNA sequence collections from diverse organisms but not across the genome assemblies. RESULTS: Here, we present GERONIMO, which conducts evolutionary searches across hundreds of genomes in a fully automated way. It provides results extended with taxonomy context, as summary tables and visualizations, to facilitate analysis for user convenience. Additionally, GERONIMO supplements homologous sequences with genomic regions to analyze promoter motifs or gene collinearity, enhancing the validation of results. CONCLUSION: GERONIMO, built using Snakemake, has undergone extensive testing on hundreds of genomes, establishing itself as a valuable tool in the identification of ncRNA homologs across diverse taxonomic groups. Consequently, GERONIMO facilitates the investigation of the evolutionary patterns of functionally significant ncRNA players, whose understanding has previously been limited to individual organisms and close relatives.


Subjects
Algorithms , RNA , Phylogeny , Sequence Alignment , Genomics , Untranslated RNA/genetics , Untranslated RNA/chemistry
16.
Molecules ; 26(20), 2021 Oct 13.
Article in English | MEDLINE | ID: mdl-34684745

ABSTRACT

The non-coding RNAs (ncRNA) are RNA transcripts with different sizes, structures and biological functions that do not encode functional proteins. RNA G-quadruplexes (rG4s) have been found in small and long ncRNAs. The existence of an equilibrium between rG4 and stem-loop structures in ncRNAs and its effect on biological processes remains unexplored. For example, deviation from the stem-loop leads to deregulated mature miRNA levels, demonstrating that miRNA biogenesis can be modulated by ions or small molecules. In light of this, we report several examples of rG4s in certain types of ncRNAs, and the implications of G4 stabilization using small molecules, also known as G4 ligands, in the regulation of gene expression, miRNA biogenesis, and miRNA-mRNA interactions. Until now, different G4 ligands scaffolds were synthesized for these targets. The regulatory role of the above-mentioned rG4s in ncRNAs can be used as novel therapeutic approaches for adjusting miRNA levels.


Subjects
G-Quadruplexes/drug effects , Untranslated RNA/chemistry , Humans , Inverted Repeat Sequences/genetics , Inverted Repeat Sequences/physiology , Ligands , MicroRNAs/genetics , Messenger RNA/genetics , Untranslated RNA/metabolism
17.
J Mol Biol ; 433(21): 167229, 2021 10 15.
Article in English | MEDLINE | ID: mdl-34487791

ABSTRACT

Although RNA-binding proteins (RBPs) are known to be enriched in intrinsic disorder, no previous analysis focused on RBPs interacting with specific RNA types. We fill this gap with a comprehensive analysis of the putative disorder in RBPs binding to six common RNA types: messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), non-coding RNA (ncRNA), ribosomal RNA (rRNA), and internal ribosome RNA (irRNA). We also analyze the amount of putative intrinsic disorder in the RNA-binding domains (RBDs) and non-RNA-binding-domain regions (non-RBD regions). Consistent with previous studies, we show that in comparison with human proteome, RBPs are significantly enriched in disorder. However, closer examination finds significant enrichment in predicted disorder for the mRNA-, rRNA- and snRNA-binding proteins, while the proteins that interact with ncRNA and irRNA are not enriched in disorder, and the tRNA-binding proteins are significantly depleted in disorder. We show a consistent pattern of significant disorder enrichment in the non-RBD regions coupled with low levels of disorder in RBDs, which suggests that disorder is relatively rarely utilized in the RNA-binding regions. Our analysis of the non-RBD regions suggests that disorder harbors posttranslational modification sites and is involved in the putative interactions with DNA. Importantly, we utilize experimental data from DisProt and independent data from Pfam to validate the above observations that rely on the disorder predictions. This study provides new insights into the distribution of disorder across proteins that bind different RNA types and the functional role of disorder in the regions where it is enriched.


Subjects
Intrinsically Disordered Proteins/chemistry , Messenger RNA/chemistry , Ribosomal RNA/chemistry , Small Nuclear RNA/chemistry , Transfer RNA/chemistry , Untranslated RNA/chemistry , RNA-Binding Proteins/chemistry , Acetylation , Binding Sites , Gene Expression , Humans , Intrinsically Disordered Proteins/genetics , Intrinsically Disordered Proteins/metabolism , Methylation , Phosphorylation , Protein Binding , Post-Translational Protein Processing , Proteome/genetics , Proteome/metabolism , Messenger RNA/genetics , Messenger RNA/metabolism , Ribosomal RNA/genetics , Ribosomal RNA/metabolism , Small Nuclear RNA/genetics , Small Nuclear RNA/metabolism , Transfer RNA/genetics , Transfer RNA/metabolism , Untranslated RNA/genetics , Untranslated RNA/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Ubiquitination
18.
Biochem Soc Trans ; 49(4): 1867-1879, 2021 08 27.
Article in English | MEDLINE | ID: mdl-34338292

ABSTRACT

Different classes of non-coding RNA (ncRNA) influence the organization of chromatin. Imprinted gene domains constitute a paradigm for exploring functional long ncRNAs (lncRNAs). Almost all express an lncRNA in a parent-of-origin dependent manner. The mono-allelic expression of these lncRNAs represses close by and distant protein-coding genes, through diverse mechanisms. Some control genes on other chromosomes as well. Interestingly, several imprinted chromosomal domains show a developmentally regulated, chromatin-based mechanism of imprinting with apparent similarities to X-chromosome inactivation. At these domains, the mono-allelic lncRNAs show a relatively stable, focal accumulation in cis. This facilitates the recruitment of Polycomb repressive complexes, lysine methyltransferases and other nuclear proteins, in part through direct RNA-protein interactions. Recent chromosome conformation capture and microscopy studies indicate that the focal aggregation of lncRNA and interacting proteins could play an architectural role as well, and correlates with close positioning of target genes. Higher-order chromatin structure is strongly influenced by CTCF/cohesin complexes, whose allelic association patterns and actions may be influenced by lncRNAs as well. Here, we review the gene-repressive roles of imprinted non-coding RNAs, particularly of lncRNAs, and discuss emerging links with chromatin architecture.


Subjects
Chromatin/chemistry , Genomic Imprinting , Protein Domains , Untranslated RNA/chemistry , Animals , Humans , Protein Conformation , X Chromosome Inactivation
19.
Nucleic Acids Res ; 49(11): 6128-6143, 2021 06 21.
Article in English | MEDLINE | ID: mdl-34086938

ABSTRACT

Many non-coding RNAs with known functions are structurally conserved: their intramolecular secondary and tertiary interactions are maintained across evolutionary time. Consequently, the presence of conserved structure in multiple sequence alignments can be used to identify candidate functional non-coding RNAs. Here, we present a bioinformatics method that couples iterative homology search with covariation analysis to assess whether a genomic region has evidence of conserved RNA structure. We used this method to examine all unannotated regions of five well-studied fungal genomes (Saccharomyces cerevisiae, Candida albicans, Neurospora crassa, Aspergillus fumigatus, and Schizosaccharomyces pombe). We identified 17 novel structurally conserved non-coding RNA candidates, which include four H/ACA box small nucleolar RNAs, four intergenic RNAs and nine RNA structures located within the introns and untranslated regions (UTRs) of mRNAs. For the two structures in the 3' UTRs of the metabolic genes GLY1 and MET13, we performed experiments that provide evidence against them being eukaryotic riboswitches.
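
The idea behind covariation analysis can be illustrated with a small sketch: base-paired positions tend to mutate together, so their alignment columns carry shared information. The abstract does not name the covariation statistic the authors used, so mutual information is used here as a simple stand-in, and the toy alignment and function name are assumptions.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j, skipping gapped rows."""
    pairs = [(row[i], row[j]) for row in alignment
             if row[i] != "-" and row[j] != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    pair_counts = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


# Toy alignment: columns 0 and 3 always form a Watson-Crick pair,
# columns 1 and 2 are invariant.
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
print(mutual_information(toy_alignment, 0, 3))  # 2.0
print(mutual_information(toy_alignment, 1, 2))  # 0.0
```

Perfectly compensating columns reach the maximum of 2 bits for a four-letter alphabet, while invariant columns score 0 even though they are conserved, which is why covariation rather than conservation is taken as evidence of structure. Real analyses also need to correct for phylogenetic relatedness, which this sketch ignores.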


Subjects
Fungal RNA/chemistry , Untranslated RNA/chemistry , 3' Untranslated Regions , Computational Biology/methods , Fungal Genome , Introns , Lysine-tRNA Ligase/genetics , Markov Chains , Nucleic Acid Conformation , Small Nucleolar RNA/chemistry , Ribosomal Proteins/genetics , Riboswitch , Sequence Alignment , Thioredoxins/genetics
20.
Environ Microbiol Rep ; 13(4): 540-554, 2021 08.
Article in English | MEDLINE | ID: mdl-34121356

ABSTRACT

The expression of non-coding RNAs (ncRNAs) has been observed in a variety of bacteria. However, the function of ncRNAs and their regulatory targets are largely unknown, and few ncRNAs are found to be associated with bacterial virulence. The bacterial brown stripe pathogen Acidovorax oryzae (Ao) RS-1 shows a high level of condition-dependent differential expression of ncRNA, which we identified in a genome wide screen. We experimentally validated 66 differentially expressed ncRNAs using an integrative analysis of conservative genome sequences and transcriptomic data during in vivo interaction of the bacterial pathogen with the rice plant. To test the relevance of the differentially expressed ncRNAs, we chose four with different positions within the genome, and with different secondary structures and promoter activities. The results show that the overexpression of the four ncRNAs caused a significant change in virulence-related phenotypes, resistance to various environmental stresses, expression of secretion systems and effector proteins, while changing the expression of ncRNA putative target genes. We conclude that these ncRNAs are examples for the inherent regulatory roles for many of the observed ncRNAs in response to changing conditions such as host interaction or environmental adaption.


Subjects
Comamonadaceae , Oryza , Comamonadaceae/genetics , Oryza/microbiology , Untranslated RNA/chemistry , Untranslated RNA/genetics , Virulence/genetics