Results 1 - 20 of 131
1.
Nucleic Acids Res ; 52(9): 5152-5165, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38647067

ABSTRACT

Structured noncoding RNAs (ncRNAs) contribute to many important cellular processes involving chemical catalysis, molecular recognition and gene regulation. Few ncRNA classes are broadly distributed among organisms from all three domains of life, but the list of rarer classes that exhibit surprisingly diverse functions is growing. We previously developed a computational pipeline that enables the near-comprehensive identification of structured ncRNAs expressed from individual bacterial genomes. The regions between protein coding genes are first sorted based on length and the fraction of guanosine and cytidine nucleotides. Long, GC-rich intergenic regions are then examined for sequence and structural similarity to other bacterial genomes. Herein, we describe the implementation of this pipeline on 50 bacterial genomes from varied phyla. More than 4700 candidate intergenic regions with the desired characteristics were identified, which yielded 44 novel riboswitch candidates and numerous other putative ncRNA motifs. Although experimental validation studies have yet to be conducted, this rate of riboswitch candidate discovery is consistent with predictions that many hundreds of novel riboswitch classes remain to be discovered among the bacterial species whose genomes have already been sequenced. Thus, many thousands of additional novel ncRNA classes likely remain to be discovered in the bacterial domain of life.


Subject(s)
Genome, Bacterial , RNA, Bacterial , RNA, Untranslated , DNA, Intergenic/genetics , Genome, Bacterial/genetics , Genomics/methods , Riboswitch/genetics , RNA, Bacterial/genetics , RNA, Bacterial/chemistry , RNA, Untranslated/genetics , RNA, Untranslated/classification , RNA, Untranslated/chemistry
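The first step of the pipeline described in record 1 (sorting intergenic regions by length and by the fraction of guanosine and cytidine nucleotides) can be illustrated with a minimal Python sketch. The `IntergenicRegion` type, the function names and the cutoff values `min_length` and `min_gc` are assumptions made for illustration; the abstract does not state the thresholds or the implementation the authors used.

```python
from typing import List, NamedTuple


class IntergenicRegion(NamedTuple):
    """A hypothetical record for one region between protein-coding genes."""
    name: str
    sequence: str


def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C nucleotides in a sequence (0.0 if empty)."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def select_candidates(
    regions: List[IntergenicRegion],
    min_length: int = 100,   # assumed cutoff, not taken from the paper
    min_gc: float = 0.5,     # assumed cutoff, not taken from the paper
) -> List[IntergenicRegion]:
    """Keep regions that are both long and GC-rich, longest first."""
    kept = [
        region
        for region in regions
        if len(region.sequence) >= min_length
        and gc_fraction(region.sequence) >= min_gc
    ]
    return sorted(kept, key=lambda region: len(region.sequence), reverse=True)
```

The regions returned by `select_candidates` would then go to the second step of the pipeline, the comparison against other bacterial genomes for sequence and structural similarity, which this sketch does not attempt.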
2.
PLoS Comput Biol ; 20(9): e1012446, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39264986

ABSTRACT

The involvement of non-coding RNAs in biological processes and diseases has made the exploration of their functions crucial. Most non-coding RNAs have yet to be studied, creating the need for methods that can rapidly classify large sets of non-coding RNAs into functional groups, or classes. In recent years, the success of deep learning in various domains led to its application to non-coding RNA classification. Multiple novel architectures have been developed, but these advancements are not covered by current literature reviews. We present an exhaustive comparison of the different methods proposed in the state-of-the-art and describe their associated datasets. Moreover, the literature lacks objective benchmarks. We perform experiments to fairly evaluate the performance of various tools for non-coding RNA classification on popular datasets. The robustness of methods to non-functional sequences and sequence boundary noise is explored. We also measure computation time and CO2 emissions. With regard to these results, we assess the relevance of the different architectural choices and provide recommendations to consider in future methods.


Subject(s)
Benchmarking , Computational Biology , Deep Learning , RNA, Untranslated , Benchmarking/methods , Computational Biology/methods , RNA, Untranslated/genetics , RNA, Untranslated/classification , Humans , Algorithms
3.
EMBO J ; 39(6): e103777, 2020 03 16.
Article in English | MEDLINE | ID: mdl-32090359

ABSTRACT

Research on non-coding RNA (ncRNA) is a rapidly expanding field. Providing an official gene symbol and name to ncRNA genes brings order to otherwise potential chaos as it allows unambiguous communication about each gene. The HUGO Gene Nomenclature Committee (HGNC, www.genenames.org) is the only group with the authority to approve symbols for human genes. The HGNC works with specialist advisors for different classes of ncRNA to ensure that ncRNA nomenclature is accurate and informative, where possible. Here, we review each major class of ncRNA that is currently annotated in the human genome and describe how each class is assigned a standardised nomenclature.


Subject(s)
Genome, Human/genetics , RNA, Untranslated/classification , Terminology as Topic , Humans , RNA, Untranslated/genetics
4.
Mol Cell ; 63(1): 7-20, 2016 07 07.
Article in English | MEDLINE | ID: mdl-27392145

ABSTRACT

In modern molecular biology, RNA has emerged as a versatile macromolecule capable of mediating an astonishing number of biological functions beyond its role as a transient messenger of genetic information. The recent discovery and functional analyses of new classes of noncoding RNAs (ncRNAs) have revealed their widespread use in many pathways, including several in the nucleus. This Review focuses on the mechanisms by which nuclear ncRNAs directly contribute to the maintenance of genome stability. We discuss how ncRNAs inhibit spurious recombination among repetitive DNA elements, repress mobilization of transposable elements (TEs), template or bridge DNA double-strand breaks (DSBs) during repair, and direct developmentally regulated genome rearrangements in some ciliates. These studies reveal an unexpected repertoire of mechanisms by which ncRNAs contribute to genome stability and even potentially fuel evolution by acting as templates for genome modification.


Subject(s)
Cell Nucleus/metabolism , Genomic Instability , RNA, Untranslated/genetics , Animals , DNA Breaks, Double-Stranded , DNA Repair , Gene Dosage , Gene Silencing , Heterochromatin/genetics , Heterochromatin/metabolism , Humans , Nucleic Acid Conformation , RNA, Untranslated/chemistry , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Structure-Activity Relationship , Telomere/genetics , Telomere/metabolism
5.
Nucleic Acids Res ; 50(D1): D950-D955, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34723317

ABSTRACT

The rapid development of single-molecule long-read sequencing (LRS) and single-cell assay for transposase accessible chromatin sequencing (scATAC-seq) technologies presents both challenges and opportunities for the annotation of noncoding variants. Here, we updated 3DSNP, a comprehensive database for human noncoding variant annotation, to expand its applications to structural variation (SV) and to implement variant annotation down to single-cell resolution. The updates of 3DSNP include (i) annotation of 108 317 SVs from a full spectrum of functions, especially their potential effects on three-dimensional chromatin structures, (ii) evaluation of the accessible chromatin peaks flanking the variants across 126 cell types/subtypes in 15 human fetal tissues and 54 cell types/subtypes in 25 human adult tissues by integrating scATAC-seq data and (iii) expansion of Hi-C data to 49 human cell types. In summary, this version is a significant and comprehensive improvement over the previous version. The 3DSNP v2.0 database is freely available at https://omic.tech/3dsnpv2/.


Subject(s)
Chromatin/chemistry , Databases, Genetic , Molecular Sequence Annotation , RNA, Untranslated/genetics , Software , Adult , Cell Lineage/genetics , Chromatin/metabolism , Chromosome Mapping , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Fetus , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Internet , Polymorphism, Single Nucleotide , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Single Molecule Imaging/methods , Single-Cell Analysis/methods
6.
Nucleic Acids Res ; 50(D1): D928-D933, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34723320

ABSTRACT

As a means to aid in the investigation of viral infection mechanisms and identification of more effective antivirus targets, the availability of a source which continually collects and updates information on the virus and host ncRNA-associated interaction resources is essential. Here, we update the ViRBase database to version 3.0 (http://www.virbase.org/ or http://www.rna-society.org/virbase/). This update represents a major revision: (i) the total number of interaction entries is now greater than 820,000, an approximately 70-fold increment, involving 116 virus and 36 host organisms, (ii) it supplements and provides more details on RNA annotations (including RNA editing, RNA localization and RNA modification), ncRNA SNP and ncRNA-drug related information and (iii) it provides two additional tools for predicting binding sites (IntaRNA and PRIdictor), a visual plug-in to display interactions and a website which is optimized for more practical and user-friendly operation. Overall, ViRBase v3.0 provides a more comprehensive resource for virus and host ncRNA-associated interactions enabling researchers a more effective means for investigation of viral infections.


Subject(s)
Databases, Genetic , Genome, Viral , Host-Pathogen Interactions/genetics , RNA, Untranslated/genetics , Software , Viruses/genetics , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , Humans , Internet , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , RNA Editing , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Signal Transduction , Virus Diseases/genetics , Virus Diseases/metabolism , Virus Diseases/pathology , Virus Diseases/virology , Viruses/classification , Viruses/metabolism , Viruses/pathogenicity
7.
Nucleic Acids Res ; 50(D1): D333-D339, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34551440

ABSTRACT

Resolving the spatial distribution of the transcriptome at a subcellular level can increase our understanding of biology and diseases. To facilitate studies of biological functions and molecular mechanisms in the transcriptome, we updated RNALocate, a resource for RNA subcellular localization analysis that is freely accessible at http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/. Compared to RNALocate v1.0, the new features in version 2.0 include (i) expansion of the data sources and the coverage of species; (ii) incorporation and integration of RNA-seq datasets containing information about subcellular localization; (iii) addition and reorganization of RNA information (RNA subcellular localization conditions and descriptive figures for method, RNA homology information, RNA interaction and ncRNA disease information) and (iv) three additional prediction tools: DM3Loc, iLoc-lncRNA and iLoc-mRNA. Overall, RNALocate v2.0 provides a comprehensive RNA subcellular localization resource for researchers to deconvolute the highly complex architecture of the cell.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/genetics , Software , Transcriptome , Animals , Base Sequence , Cell Compartmentation , Datasets as Topic , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Gene Expression Regulation , Gene Ontology , Humans , Internet , Mice , Molecular Sequence Annotation , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Sequence Alignment , Sequence Homology, Nucleic Acid , Subcellular Fractions/chemistry , Subcellular Fractions/metabolism , Zebrafish/genetics , Zebrafish/metabolism
8.
Nucleic Acids Res ; 50(D1): D222-D230, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850920

ABSTRACT

MicroRNAs (miRNAs) are noncoding RNAs with 18-26 nucleotides; they pair with target mRNAs to regulate gene expression and produce significant changes in various physiological and pathological processes. In recent years, the interaction between miRNAs and their target genes has become one of the mainstream directions for drug development. As a large-scale biological database that mainly provides miRNA-target interactions (MTIs) verified by biological experiments, miRTarBase has undergone five revisions and enhancements. The database has accumulated >2 200 449 verified MTIs from 13 389 manually curated articles and CLIP-seq data. An optimized scoring system is adopted to enhance this update's critical recognition of MTI-related articles and corresponding disease information. In addition, single-nucleotide polymorphisms and disease-related variants related to the binding efficiency of miRNA and target were characterized in miRNAs and gene 3' untranslated regions. miRNA expression profiles across extracellular vesicles, blood and different tissues, including exosomal miRNAs and tissue-specific miRNAs, were integrated to explore miRNA functions and biomarkers. For the user interface, we have classified attributes, including RNA expression, specific interaction, protein expression and biological function, for various validation experiments related to the role of miRNA. We also used seed sequence information to evaluate the binding sites of miRNA. In summary, these enhancements render miRTarBase as one of the most research-amicable MTI databases that contain comprehensive and experimentally verified annotations. The newly updated version of miRTarBase is now available at https://miRTarBase.cuhk.edu.cn/.


Subject(s)
3' Untranslated Regions , Databases, Nucleic Acid , Gene Regulatory Networks , MicroRNAs/genetics , Neoplasms/genetics , RNA, Untranslated/genetics , Animals , Binding Sites , Biomarkers/metabolism , Data Mining/statistics & numerical data , Exosomes/chemistry , Exosomes/metabolism , Gene Expression Regulation , Humans , Internet , Mice , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , Neoplasms/metabolism , Neoplasms/pathology , Polymorphism, Single Nucleotide , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Tumor Cells, Cultured , User-Computer Interface
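Record 8 mentions using seed sequence information to evaluate miRNA binding sites. The sketch below shows one common way to do this kind of check: take nucleotides 2 to 8 of the miRNA as the seed and look for its reverse complement in a 3' UTR. This is an illustration under stated assumptions, not miRTarBase's scoring system; the seed window and the function names are hypothetical choices.

```python
from typing import List

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(sequence: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet (T becomes U)."""
    return sequence.upper().replace("T", "U")


def seed_of(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the seed: nucleotides start..end (1-based, inclusive).

    The 2-8 window is an assumption; other definitions (e.g. 2-7) are also used.
    """
    return to_rna(mirna)[start - 1:end]


def seed_match_positions(mirna: str, utr: str) -> List[int]:
    """Return 0-based start positions in the UTR that pair perfectly with the seed."""
    site = seed_of(mirna).translate(COMPLEMENT)[::-1]  # reverse complement
    target = to_rna(utr)
    positions = []
    index = target.find(site)
    while index != -1:
        positions.append(index)
        index = target.find(site, index + 1)
    return positions
```

For the let-7a sequence `UGAGGUAGUAGGUUGUAUAGUU`, the seed under this definition is `GAGGUAG`, so the function searches the UTR for `CUACCUC`.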
9.
Nucleic Acids Res ; 50(D1): D279-D286, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34747466

ABSTRACT

RNA polymerase III (Pol III) transcribes hundreds of non-coding RNA genes (ncRNAs), which are involved in a variety of cellular processes. However, the expression, functions, regulatory networks and evolution of these Pol III-transcribed ncRNAs are still largely unknown. In this study, we developed a novel resource, Pol3Base (http://rna.sysu.edu.cn/pol3base/), to decode the interactome, expression, evolution, epitranscriptome and disease variations of Pol III-transcribed ncRNAs. The current release of Pol3Base includes thousands of regulatory relationships between ∼79 000 ncRNAs and transcription factors by mining 56 ChIP-seq datasets. By integrating CLIP-seq datasets, we deciphered the interactions of these ncRNAs with >240 RNA binding proteins. Moreover, Pol3Base contains ∼9700 RNA modifications located within thousands of Pol III-transcribed ncRNAs. Importantly, we characterized expression profiles of ncRNAs in >70 tissues and 28 different tumor types. In addition, by comparing these ncRNAs from human and mouse, we revealed about 4000 evolutionarily conserved ncRNAs. We also identified ∼11 403 tRNA-derived small RNAs (tsRNAs) in 32 different tumor types. Finally, by analyzing somatic mutation data, we investigated the mutation map of these ncRNAs to help uncover their potential roles in diverse diseases. This resource will help expand our understanding of potential functions and regulatory networks of Pol III-transcribed ncRNAs.


Subject(s)
Databases, Genetic , Neoplasms/genetics , RNA Polymerase III/genetics , RNA, Untranslated/genetics , RNA-Binding Proteins/genetics , Software , Transcription Factors/genetics , Animals , Data Mining , Datasets as Topic , Evolution, Molecular , Gene Expression Regulation , Gene Regulatory Networks , Humans , Internet , Mice , Mutation , Neoplasms/classification , Neoplasms/metabolism , Neoplasms/pathology , RNA Polymerase III/metabolism , RNA, Transfer/classification , RNA, Transfer/genetics , RNA, Transfer/metabolism , RNA, Untranslated/classification , RNA, Untranslated/metabolism , RNA-Binding Proteins/classification , RNA-Binding Proteins/metabolism , Transcription Factors/classification , Transcription Factors/metabolism , Transcription, Genetic
10.
Nucleic Acids Res ; 49(D1): D212-D220, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33106848

ABSTRACT

RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world's largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community. RNAcentral is freely available at https://rnacentral.org.


Subject(s)
Databases, Nucleic Acid/organization & administration , Molecular Sequence Annotation , RNA, Untranslated/genetics , Software , Animals , Apicomplexa/classification , Apicomplexa/genetics , Base Sequence , Betacoronavirus/classification , Betacoronavirus/genetics , Databases, Nucleic Acid/supply & distribution , Fungi/classification , Fungi/genetics , Gene Ontology , Humans , Internet , Nucleic Acid Conformation , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Sequence Analysis, RNA
11.
Nucleic Acids Res ; 49(D1): D1094-D1101, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33095860

ABSTRACT

Most mutations in cancer genomes occur in the non-coding regions with unknown impact on tumor development. Although the increase in the number of cancer whole-genome sequences has revealed numerous putative non-coding cancer drivers, their information is dispersed across multiple studies making it difficult to understand their roles in tumorigenesis of different cancer types. We have developed CNCDatabase, Cornell Non-coding Cancer driver Database (https://cncdatabase.med.cornell.edu/) that contains detailed information about predicted non-coding drivers at gene promoters, 5' and 3' UTRs (untranslated regions), enhancers, CTCF insulators and non-coding RNAs. CNCDatabase documents 1111 protein-coding genes and 90 non-coding RNAs with reported drivers in their non-coding regions from 32 cancer types by computational predictions of positive selection using whole-genome sequences; differential gene expression in samples with and without mutations; or another set of experimental validations including luciferase reporter assays and genome editing. The database can be easily modified and scaled as lists of non-coding drivers are revised in the community with larger whole-genome sequencing studies, CRISPR screens and further experimental validations. Overall, CNCDatabase provides a helpful resource for researchers to explore the pathological role of non-coding alterations in human cancers.


Subject(s)
Carcinogenesis/genetics , Databases, Genetic , Gene Expression Regulation, Neoplastic , Genome, Human , Neoplasms/genetics , 3' Untranslated Regions , 5' Untranslated Regions , Carcinogenesis/metabolism , Carcinogenesis/pathology , Clustered Regularly Interspaced Short Palindromic Repeats , Enhancer Elements, Genetic , Genes, Reporter , Humans , Insulator Elements , Luciferases/genetics , Luciferases/metabolism , Mutation , Neoplasms/metabolism , Neoplasms/pathology , Open Reading Frames , Promoter Regions, Genetic , RNA, Untranslated/classification , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Untranslated Regions , Whole Genome Sequencing
12.
Nucleic Acids Res ; 49(D1): D160-D164, 2021 01 08.
Article in English | MEDLINE | ID: mdl-32833025

ABSTRACT

Many studies have indicated that non-coding RNA (ncRNA) dysfunction is closely related to numerous diseases. Recently, accumulated ncRNA-disease associations have made related databases insufficient to meet the demands of biomedical research. The constant updating of ncRNA-disease resources has become essential. Here, we have updated the mammal ncRNA-disease repository (MNDR, http://www.rna-society.org/mndr/) to version 3.0, containing more than one million entries, four-fold increment in data compared to the previous version. Experimental and predicted circRNA-disease associations have been integrated, increasing the number of categories of ncRNAs to five, and the number of mammalian species to 11. Moreover, ncRNA-disease related drug annotations and associations, as well as ncRNA subcellular localizations and interactions, were added. In addition, three ncRNA-disease (miRNA/lncRNA/circRNA) prediction tools were provided, and the website was also optimized, making it more practical and user-friendly. In summary, MNDR v3.0 will be a valuable resource for the investigation of disease mechanisms and clinical treatment strategies.


Subject(s)
Databases, Nucleic Acid , MicroRNAs/genetics , Neoplasms/genetics , RNA, Circular/genetics , RNA, Untranslated/genetics , Animals , Humans , Internet , Mammals , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , Neoplasms/classification , Neoplasms/metabolism , Neoplasms/pathology , RNA, Circular/classification , RNA, Circular/metabolism , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Software
13.
Nucleic Acids Res ; 49(D1): D192-D200, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211869

ABSTRACT

Rfam is a database of RNA families where each of the 3444 families is represented by a multiple sequence alignment of known RNA sequences and a covariance model that can be used to search for additional members of the family. Recent developments have involved expert collaborations to improve the quality and coverage of Rfam data, focusing on microRNAs, viral and bacterial RNAs. We have completed the first phase of synchronising microRNA families in Rfam and miRBase, creating 356 new Rfam families and updating 40. We established a procedure for comprehensive annotation of viral RNA families starting with Flavivirus and Coronaviridae RNAs. We have also increased the coverage of bacterial and metagenome-based RNA families from the ZWD database. These developments have enabled a significant growth of the database, with the addition of 759 new families in Rfam 14. To facilitate further community contribution to Rfam, expert users are now able to build and submit new families using the newly developed Rfam Cloud family curation system. New Rfam website features include a new sequence similarity search powered by RNAcentral, as well as search and visualisation of families with pseudoknots. Rfam is freely available at https://rfam.org.


Subject(s)
Databases, Nucleic Acid , Metagenome , MicroRNAs/genetics , RNA, Bacterial/genetics , RNA, Untranslated/genetics , RNA, Viral/genetics , Bacteria/genetics , Bacteria/metabolism , Base Pairing , Base Sequence , Humans , Internet , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA, Bacterial/classification , RNA, Bacterial/metabolism , RNA, Untranslated/classification , RNA, Untranslated/metabolism , RNA, Viral/classification , RNA, Viral/metabolism , Sequence Alignment , Sequence Analysis, RNA , Software , Viruses/genetics , Viruses/metabolism
14.
Nucleic Acids Res ; 48(5): 2332-2347, 2020 03 18.
Article in English | MEDLINE | ID: mdl-31863587

ABSTRACT

Temperature profoundly affects the kinetics of biochemical reactions, yet how large molecular complexes such as the transcription machinery accommodate changing temperatures to maintain cellular function is poorly understood. Here, we developed plant native elongating transcripts sequencing (plaNET-seq) to profile genome-wide nascent RNA polymerase II (RNAPII) transcription during the cold-response of Arabidopsis thaliana with single-nucleotide resolution. Combined with temporal resolution, these data revealed transient genome-wide reprogramming of nascent RNAPII transcription during cold, including characteristics of RNAPII elongation and thousands of non-coding transcripts connected to gene expression. Our results suggest a role for promoter-proximal RNAPII stalling in predisposing genes for transcriptional activation during plant-environment interactions. At gene 3'-ends, cold initially facilitated transcriptional termination by limiting the distance of read-through transcription. Within gene bodies, cold reduced the kinetics of co-transcriptional splicing leading to increased intragenic stalling. Our data resolved multiple distinct mechanisms by which temperature transiently altered the dynamics of nascent RNAPII transcription and associated RNA processing, illustrating potential biotechnological solutions and future focus areas to promote food security in the context of a changing climate.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Gene Expression Regulation, Plant , Genome, Plant , RNA Polymerase II/genetics , RNA, Messenger/genetics , RNA, Untranslated/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Cold Temperature , Gene-Environment Interaction , High-Throughput Nucleotide Sequencing , Promoter Regions, Genetic , RNA Polymerase II/metabolism , RNA Splicing , RNA, Messenger/classification , RNA, Messenger/metabolism , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Transcriptional Activation
15.
Nucleic Acids Res ; 48(5): 2271-2286, 2020 03 18.
Article in English | MEDLINE | ID: mdl-31980822

ABSTRACT

The study of RNA expression is the fastest growing area of genomic research. However, despite the dramatic increase in the number of sequenced transcriptomes, we still do not have accurate estimates of the number and expression levels of non-coding RNA genes. Non-coding transcripts are often overlooked due to incomplete genome annotation. In this study, we use annotation-independent detection of RNA reads generated using a reverse transcriptase with low structure bias to identify non-coding RNA. Transcripts between 20 and 500 nucleotides were filtered and crosschecked with non-coding RNA annotations revealing 111 non-annotated non-coding RNAs expressed in different cell lines and tissues. Inspecting the sequence and structural features of these transcripts indicated that 60% of these transcripts correspond to new snoRNA and tRNA-like genes. The identified genes exhibited features of their respective families in terms of structure, expression, conservation and response to depletion of interacting proteins. Together, our data reveal a new group of RNA that are difficult to detect using standard gene prediction and RNA sequencing techniques, suggesting that reliance on actual gene annotation and sequencing techniques distorts the perceived architecture of the human transcriptome.


Subject(s)
Molecular Sequence Annotation/methods , RNA, Messenger/genetics , RNA, Small Nucleolar/genetics , RNA, Transfer/genetics , RNA, Untranslated/genetics , Transcriptome , Animals , Base Pairing , Base Sequence , Cell Line, Tumor , Datasets as Topic , Gene Expression Profiling , Gene Expression Regulation , Humans , Nucleic Acid Conformation , Phylogeny , RNA, Messenger/classification , RNA, Messenger/metabolism , RNA, Small Nucleolar/classification , RNA, Small Nucleolar/metabolism , RNA, Transfer/classification , RNA, Transfer/metabolism , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Sequence Analysis, RNA , Exome Sequencing
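The filtering step described in record 15 (keep transcripts between 20 and 500 nucleotides, then crosscheck them against existing non-coding RNA annotations) can be sketched as a length filter followed by an interval-overlap test. The `Interval` type, the function names and the half-open coordinate convention are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval; start is 0-based inclusive, end is exclusive."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def unannotated_short_transcripts(
    transcripts: List[Interval],
    annotations: List[Interval],
    min_len: int = 20,    # lower bound stated in the abstract
    max_len: int = 500,   # upper bound stated in the abstract
) -> List[Interval]:
    """Keep transcripts in the length window that overlap no known annotation."""
    result = []
    for transcript in transcripts:
        length = transcript.end - transcript.start
        if not (min_len <= length <= max_len):
            continue
        if any(overlaps(transcript, known) for known in annotations):
            continue
        result.append(transcript)
    return result
```

The nested loop is quadratic in the worst case; for genome-scale inputs an interval tree or a sorted sweep would be a more practical choice.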
16.
RNA Biol ; 18(12): 2168-2182, 2021 12.
Article in English | MEDLINE | ID: mdl-34110970

ABSTRACT

Mitochondrial noncoding RNAs (mt-ncRNAs) include noncoding RNAs inside the mitochondria that are transcribed from the mitochondrial genome or nuclear genome, and noncoding RNAs transcribed from the mitochondrial genome that are transported to the cytosol or nucleus. Recent findings have revealed that mt-ncRNAs play important roles in not only mitochondrial functions, but also other cellular activities. This review proposes a classification of mt-ncRNAs and outlines the emerging understanding of mitochondrial circular RNAs (mt-circRNAs), mitochondrial microRNAs (mitomiRs), and mitochondrial long noncoding RNAs (mt-lncRNAs), with an emphasis on their identification and functions.


Subject(s)
Mitochondria/genetics , RNA, Untranslated/genetics , Animals , Epigenesis, Genetic , Gene Expression Regulation , Humans , RNA, Mitochondrial/genetics , RNA, Untranslated/classification
17.
Nucleic Acids Res ; 47(8): e43, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30753596

ABSTRACT

The rapid and accurate approach to distinguish between coding RNAs and ncRNAs has been playing a critical role in analyzing thousands of novel transcripts, which have been generated in recent years by next-generation sequencing technology. Previously developed methods CPAT, CPC2 and PLEK can distinguish coding RNAs and ncRNAs very well, but poorly distinguish between small coding RNAs and small ncRNAs. Herein, we report an approach, CPPred (coding potential prediction), which is based on SVM classifier and multiple sequence features including novel RNA features encoded by the global description. The CPPred can better distinguish not only between coding RNAs and ncRNAs, but also between small coding RNAs and small ncRNAs than the state-of-the-art methods due to the addition of the novel RNA features. A recent study proposes 1335 novel human coding RNAs from a large number of RNA-seq datasets. However, only 119 transcripts are predicted as coding RNAs by the CPPred. In fact, almost all proposed novel coding RNAs are ncRNAs (91.1%), which is consistent with previous reports. Remarkably, we also reveal that the global description of encoding features (T2, C0 and GC) plays an important role in the prediction of coding potential.


Subject(s)
Algorithms , Computational Biology/methods , RNA, Messenger/genetics , RNA, Untranslated/genetics , Animals , Base Sequence , Datasets as Topic , Drosophila melanogaster/genetics , High-Throughput Nucleotide Sequencing , Humans , Mice , RNA, Messenger/classification , RNA, Messenger/metabolism , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Saccharomyces cerevisiae/genetics , Sequence Analysis, RNA , Zebrafish/genetics
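Record 17 describes a classifier built on multiple sequence features. As a rough illustration of what sequence-derived features for coding-potential prediction can look like, the sketch below computes three simple ones: longest open reading frame length, ORF coverage and GC content. These are assumed stand-ins chosen for clarity; they are not CPPred's feature set (the abstract names global-description features such as T2, C0 and GC without defining them), and no classifier is trained here.

```python
from typing import Dict

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(sequence: str) -> int:
    """Length in nucleotides of the longest ATG..stop ORF on the forward strand.

    Only ORFs closed by an in-frame stop codon are counted; the stop codon
    is included in the length.
    """
    seq = sequence.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def sequence_features(sequence: str) -> Dict[str, float]:
    """Return a small, hypothetical feature vector for one transcript."""
    seq = sequence.upper().replace("U", "T")
    length = len(seq)
    if length == 0:
        return {"orf_length": 0.0, "orf_coverage": 0.0, "gc": 0.0}
    orf = longest_orf_length(seq)
    gc = (seq.count("G") + seq.count("C")) / length
    return {"orf_length": float(orf), "orf_coverage": orf / length, "gc": gc}
```

A feature dictionary like this could be turned into a numeric vector and passed to any standard classifier; which features and which model work best is exactly what the paper evaluates.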
18.
Int J Mol Sci ; 22(16)2021 Aug 13.
Article in English | MEDLINE | ID: mdl-34445436

ABSTRACT

Apart from protein-coding ribonucleic acids (RNAs), there exists a variety of non-coding RNAs (ncRNAs) which regulate complex cellular and molecular processes. High-throughput sequencing technologies and bioinformatics approaches have largely promoted the exploration of ncRNAs which revealed their crucial roles in gene regulation, miRNA binding, protein interactions, and splicing. Furthermore, ncRNAs are involved in the development of complicated diseases like cancer. Categorization of ncRNAs is essential to understand the mechanisms of diseases and to develop effective treatments. Sub-cellular localization information of ncRNAs demystifies diverse functionalities of ncRNAs. To date, several computational methodologies have been proposed to precisely identify the class as well as sub-cellular localization patterns of RNAs. This paper discusses different types of ncRNAs, reviews computational approaches proposed in the last 10 years to distinguish coding-RNA from ncRNA, to identify sub-types of ncRNAs such as piwi-associated RNA, micro RNA, long ncRNA, and circular RNA, and to determine sub-cellular localization of distinct ncRNAs and RNAs. Furthermore, it summarizes diverse ncRNA classification and sub-cellular localization determination datasets along with benchmark performance to aid the development and evaluation of novel computational methodologies. It identifies research gaps, heterogeneity, and challenges in the development of computational approaches for RNA sequence analysis. We consider that our expert analysis will assist Artificial Intelligence researchers with knowing state-of-the-art performance, model selection for various tasks on one platform, dominantly used sequence descriptors, neural architectures, and interpreting inter-species and intra-species performance deviation.


Subject(s)
Computational Biology/methods , Untranslated RNA/classification , Untranslated RNA/metabolism , Animals , Artificial Intelligence , Factual Databases , High-Throughput Nucleotide Sequencing , Humans , Untranslated RNA/genetics , RNA Sequence Analysis , Tissue Distribution
19.
Microbiology (Reading) ; 166(2): 149-156, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31860438

ABSTRACT

Pseudomonas putida is a micro-organism with great potential for industry due to its stress-endurance traits and the ease with which its metabolism can be manipulated. However, optimization is still required to improve production yields. In recent years, manipulation of bacterial small non-coding RNAs (ncRNAs) has been recognized as an effective tool to improve the production of industrial compounds. So far, very few ncRNAs have been annotated in P. putida beyond the generally conserved ones. In the present study, P. putida was cultivated in a two-compartment scale-down bioreactor that simulates large-scale industrial bioreactors. We performed RNA-Seq on samples collected at distinct locations and time-points to predict novel and potentially important ncRNAs for the adaptation of P. putida to bioreactor stress conditions. Rather than using a purely genomic approach, we identified regions of putative ncRNAs with high expression levels using two different programs (Artemis and sRNA detect). Only the regions identified with both approaches were considered for further analysis and, in total, 725 novel ncRNAs were predicted. We also found that their expression was not constant throughout the bioreactor, showing different patterns of expression with time and position. This is the first work focusing on the ncRNAs whose expression is triggered in a bioreactor environment. This information is of great importance for industry, since it provides possible targets for engineering more effective P. putida strains for large-scale production.
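
The consensus step described in the abstract above, keeping only regions reported by both programs, can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `consensus_regions`, the `min_overlap` parameter, and the coordinates are hypothetical, and it assumes each program's output has already been reduced to non-overlapping (start, end) intervals on the same strand.

```python
from typing import List, Tuple

# (start, end) in 1-based inclusive coordinates; one strand, one replicon (assumption)
Region = Tuple[int, int]


def consensus_regions(a: List[Region], b: List[Region], min_overlap: int = 1) -> List[Region]:
    """Return the overlapping parts of regions present in both lists.

    Assumes the regions within each list do not overlap one another.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    i = j = 0
    shared: List[Region] = []
    while i < len(a_sorted) and j < len(b_sorted):
        start = max(a_sorted[i][0], b_sorted[j][0])
        end = min(a_sorted[i][1], b_sorted[j][1])
        # Keep the intersection only if it spans at least min_overlap bases
        if end - start + 1 >= min_overlap:
            shared.append((start, end))
        # Advance the list whose current region ends first
        if a_sorted[i][1] < b_sorted[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical calls from two tools (illustrative coordinates only)
tool_a = [(100, 200), (500, 650), (900, 950)]
tool_b = [(150, 260), (640, 700), (1000, 1100)]
print(consensus_regions(tool_a, tool_b))
```

Raising `min_overlap` makes the filter stricter; whether the study applied any such overlap threshold is not stated in the abstract.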


Subject(s)
Bioreactors/microbiology , Pseudomonas putida/physiology , Bacterial RNA/metabolism , Untranslated RNA/metabolism , Bacterial Gene Expression Regulation , Bacterial Genome/genetics , Pseudomonas putida/genetics , Pseudomonas putida/growth & development , Pseudomonas putida/metabolism , Bacterial RNA/classification , Bacterial RNA/genetics , Untranslated RNA/classification , Untranslated RNA/genetics , RNA Sequence Analysis , Physiological Stress
20.
EMBO Rep ; 19(10), 2018 Oct.
Article in English | MEDLINE | ID: mdl-30126926

ABSTRACT

The molecular roles of the dually targeted ElaC domain protein 2 (ELAC2) during nuclear and mitochondrial RNA processing in vivo have not been distinguished. We generated conditional knockout mice of ELAC2 and found that it is essential for life and that its activity is non-redundant. Heart and skeletal muscle-specific loss of ELAC2 causes dilated cardiomyopathy and premature death at 4 weeks. Transcriptome-wide analyses of total RNAs, small RNAs, mitochondrial RNAs, and miRNAs identified the molecular targets of ELAC2 in vivo. We show that ELAC2 is required for processing of tRNAs and for the balanced maintenance of C/D box snoRNAs, miRNAs, and a new class of tRNA fragments. We find that correct biogenesis of regulatory non-coding RNAs is essential for both cytoplasmic and mitochondrial protein synthesis and for the assembly of mitochondrial ribosomes and cytoplasmic polysomes. We show that nuclear tRNA processing is required for the balanced production of snoRNAs and miRNAs for gene expression and that 3' tRNA processing is an essential step in the production of all mature mitochondrial RNAs and the majority of nuclear tRNAs.


Subject(s)
Endoribonucleases/genetics , Neoplasm Proteins/genetics , Mitochondrial RNA/genetics , Untranslated RNA/genetics , Animals , Cell Nucleus/genetics , Gene Expression Profiling , Mice , MicroRNAs/genetics , Small Nucleolar RNA/genetics , Transfer RNA/genetics , Untranslated RNA/classification , Untranslated RNA/isolation & purification