ABSTRACT
MicroRNAs (miRNAs) are small non-coding RNAs of ~22 nucleotides that negatively regulate mRNA at the post-transcriptional level. Previously, we developed miRTarBase, which provides information about experimentally validated miRNA-target interactions (MTIs). Here, we describe an updated database containing 422 517 curated MTIs from 4076 miRNAs and 23 054 target genes collected from over 8500 articles. The number of MTIs supported by strong evidence has increased ~1.4-fold since the last update in 2016. In this updated version, target sites validated by reporter assay that are available in the literature can be downloaded. New features can be extracted from the target site sequences for machine learning analyses, which can help to evaluate the performance of miRNA-target prediction tools. Furthermore, additional browsing options make it easier for users to find specific MTIs. With these improvements, miRTarBase serves as one of the more comprehensively annotated databases of experimentally validated miRNA-target interactions in the field of miRNA-related research. miRTarBase is available at http://miRTarBase.mbc.nctu.edu.tw/.
Subject(s)
Genetic Databases , MicroRNAs/metabolism , Messenger RNA/metabolism , Data Mining , Humans , Messenger RNA/chemistry , User-Computer Interface
ABSTRACT
BACKGROUND: Emerging evidence indicates that circular RNAs (circRNAs) exert post-transcriptional regulation of gene expression. A subclass of circRNAs was found to be enriched with miRNA target sites, which suggests that this kind of circRNA functions as a natural miRNA sponge. Given the potential impact of circular RNA research, we set out to identify novel circRNAs as well as putative circRNA-miRNA interactions in retrospectively sourced transcriptome sequencing samples. RESULTS: Through the analysis of 465 RNA-seq runs and 22 reports published in recent years, we found miRNAs, putatively sponged by circRNAs, that had been experimentally verified to target the circRNA host genes. This observation provides supporting evidence for a competitive endogenous relationship between circRNAs and the miRNAs that target circRNA host genes. Given the self-regulating and self-inducing nature of these circRNAs, we call this hypothetical phenomenon the Ouroboros Resembling Competitive Endogenous Loop (ORCEL) in circular RNAs. CONCLUSIONS: The fact that miRNA-sponging circRNAs originate from regions enriched in miRNA target sites, while the genes encoded in these regions are conserved miRNA targets, rationalizes the existence of ORCEL.
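The notion of a circRNA being "enriched with miRNA target sites" can be illustrated with a minimal sketch that counts canonical seed matches. This is not the method used in the study: it assumes the common simplification that a target site is a perfect reverse complement of miRNA nucleotides 2-8, and the function names and both sequences are hypothetical toy values.

```python
# Toy illustration only: counts perfect 7-mer seed matches, a simplified
# stand-in for target-site prediction. Sequences below are made up.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def count_seed_sites(mirna: str, target: str) -> int:
    """Count (possibly overlapping) sites in `target` that pair perfectly
    with the miRNA seed, taken here as nucleotides 2-8 of the miRNA."""
    seed = mirna[1:8]                      # nucleotides 2-8 (0-based slice)
    site = reverse_complement_rna(seed)    # sequence expected in the target
    count = 0
    start = target.find(site)
    while start != -1:
        count += 1
        start = target.find(site, start + 1)
    return count


# Hypothetical toy sequences (not real miRNA or circRNA entries).
toy_mirna = "UGGAAGACUAGUGAUUUUGUUG"
toy_circrna = "AAGUCUUCCAAGUCUUCCUU"

print(count_seed_sites(toy_mirna, toy_circrna))
```

A sequence with many such sites relative to its length would be a candidate sponge under this simplified view; real analyses typically also weigh site context, conservation, and experimental support.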
Subject(s)
Gene Expression Profiling/methods , MicroRNAs/genetics , RNA/genetics , RNA Sequence Analysis/methods , Algorithms , Computational Biology/methods , Genetic Databases , Gene Expression Regulation , Gene Regulatory Networks , Humans , Circular RNA
ABSTRACT
MicroRNAs (miRNAs) are small non-coding RNAs of approximately 22 nucleotides that negatively regulate gene expression at the post-transcriptional level. This study describes an update of miRTarBase (http://miRTarBase.mbc.nctu.edu.tw/), which provides information about experimentally validated miRNA-target interactions (MTIs). The latest update expanded miRTarBase to systematically identify Argonaute-miRNA-RNA interactions from 138 crosslinking and immunoprecipitation sequencing (CLIP-seq) data sets generated by 21 independent studies. The database contains 4966 articles, 7439 strongly validated MTIs (using reporter assays or western blots) and 348 007 MTIs from CLIP-seq. The number of MTIs in miRTarBase has increased around 7-fold since the 2014 miRTarBase update. The miRNA and gene expression profiles from The Cancer Genome Atlas (TCGA) are integrated to provide an effective overview of this exponential growth in miRNA experimental data. These improvements make miRTarBase one of the more comprehensively annotated databases of experimentally validated miRNA-target interactions and motivate additional miRNA research efforts.
Subject(s)
Nucleic Acid Databases , MicroRNAs/metabolism , Messenger RNA/metabolism , Disease/genetics , Gene Expression Profiling , Humans , Messenger RNA/chemistry , RNA Sequence Analysis
ABSTRACT
Sudden cardiac death (SCD) is an important cause of mortality worldwide. It accounts for approximately half of all deaths from cardiovascular disease. While coronary artery disease and acute myocardial infarction account for the majority of SCD in the elderly population, inherited cardiac diseases (inherited CDs), which have a significant genetic component, account for a substantial proportion of SCD in younger victims. Currently, next-generation sequencing enables rapid investigation of the relationships between genetic variants and the inherited CDs that cause SCD. Genetic contribution to risk has been considered an alternative predictor of SCD. In recent years, large numbers of SCD susceptibility variants have been reported, but these results are scattered across numerous publications. Here, we present the SCD-associated Variants Annotation Database (SVAD) to facilitate the interpretation of variants and to meet the need for data integration. SVAD contains data from a broad screening of the scientific literature. It was constructed to provide a comprehensive collection of genetic variants along with integrated information regarding their effects. At present, SVAD has accumulated 2,292 entries covering 1,239 variants through manual survey of the pertinent literature, and approximately one-third of the collected variants are pathogenic or likely pathogenic according to the ACMG guidelines. To the best of our knowledge, SVAD is the most comprehensive database providing integrated information on the variants associated with various types of inherited CDs. SVAD represents a valuable source of literature-based variant information that benefits clinicians and researchers, and it is now available at http://svad.mbc.nctu.edu.tw/.
Subject(s)
Genetic Databases/statistics & numerical data , Sudden Cardiac Death/etiology , Heart Diseases/genetics , Genetic Models , Computer Simulation , Sudden Cardiac Death/epidemiology , Heart Diseases/mortality , Humans , Mutation , Single Nucleotide Polymorphism , Risk Assessment/methods
ABSTRACT
Introduction: In the United States and Europe, endometrial endometrioid carcinoma (EEC) is the most prevalent gynecologic malignancy. Lymph node metastasis (LNM) is the key determinant of the prognosis and treatment of EEC. A biomarker that predicts LNM in patients with EEC would be beneficial, enabling individualized treatment. Current preoperative assessment of LNM in EEC is not sufficiently accurate to predict LNM and prevent overtreatment. This pilot study established a biomarker signature for the prediction of LNM in early stage EEC. Methods: We performed RNA sequencing in 24 clinically early stage (T1) EEC tumors (lymph node positive and negative in 6 and 18, respectively) from Cathay General Hospital and analyzed the RNA sequencing data of 289 patients with EEC from The Cancer Genome Atlas (lymph node positive and negative in 33 and 256, respectively). We analyzed clinical data including tumor grade, depth of tumor invasion, and age to construct a sequencing-based prediction model using machine learning. For validation, we used another independent cohort of early stage EEC samples (n = 72) and performed quantitative real-time polymerase chain reaction (qRT-PCR). Finally, a PCR-based prediction model and risk score formula were established. Results: Eight genes (ASRGL1, ESR1, EYA2, MSX1, RHEX, SCGB2A1, SOX17, and STX18) plus one clinical parameter (depth of myometrial invasion) were identified for use in a sequencing-based prediction model. After qRT-PCR validation, five genes (ASRGL1, RHEX, SCGB2A1, SOX17, and STX18) were identified as predictive biomarkers. Receiver operating characteristic curve analysis revealed that these five genes can predict LNM. Combined use of these five genes resulted in higher diagnostic accuracy than use of any single gene, with an area under the curve of 0.898, sensitivity of 88.9%, and specificity of 84.1%. The accuracy, negative predictive value, and positive predictive value were 84.7%, 98.1%, and 44.4%, respectively.
Conclusion: We developed a five-gene biomarker panel associated with LNM in early stage EEC. These five genes may represent novel targets for further mechanistic study. Our results, after corroboration by a prospective study, may have useful clinical implications and prevent unnecessary elective lymph node dissection while not adversely affecting the outcome of treatment for early stage EEC.
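The reported performance figures can be cross-checked with a short worked example. The sketch below assumes the figures refer to the independent validation cohort (n = 72); the confusion-matrix counts are not given in the abstract and are inferred here as integer counts that reproduce the reported percentages, so they should be read as an illustration rather than as the study's data.

```python
# Worked check of the reported diagnostic metrics.
# ASSUMPTION: the counts below are inferred, not reported in the abstract.
# They are integer counts summing to n = 72 that reproduce the stated values.

def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # true positives among LNM-positive
        "specificity": tn / (tn + fp),   # true negatives among LNM-negative
        "accuracy": (tp + tn) / total,
        "npv": tn / (tn + fn),           # negative predictive value
        "ppv": tp / (tp + fp),           # positive predictive value
    }


# Inferred counts: 9 LNM-positive and 63 LNM-negative samples (9 + 63 = 72).
inferred_counts = {"tp": 8, "fn": 1, "tn": 53, "fp": 10}
metrics = diagnostic_metrics(**inferred_counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these inferred counts, the low positive predictive value alongside the high negative predictive value follows from the small number of LNM-positive cases, which is consistent with the abstract's emphasis on avoiding unnecessary lymph node dissection.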
ABSTRACT
Dysbiosis of the human gut microbiota is strongly associated with the development of colorectal cancer (CRC). The dysbiotic features of the transition from advanced polyp to early-stage CRC are largely unknown. We performed a 16S rRNA gene sequencing and enterotype-based gut microbiota analysis study. In addition to Bacteroides- and Prevotella-dominated enterotypes, we identified an Escherichia-dominated enterotype. We found that the dysbiotic features of CRC in the overall samples were dissimilar from those within enterotypes, especially the Escherichia-dominated enterotype. Besides a higher abundance of Fusobacterium, Enterococcus, and Aeromonas in all CRC faecal microbiota, we found that the most notable characteristic of CRC faecal microbiota was a decreased abundance of potentially beneficial butyrate-producing bacteria. Notably, Oscillospira was depleted in the transition from advanced adenoma to stage 0 CRC, whereas Haemophilus was depleted in the transition from stage 0 to early-stage CRC. We further identified 7 different co-abundance groups (CAGs) by analysing bacterial clusters. The abundance of microbiota in cluster 3 significantly increased in the CRC group, whereas that of cluster 5 decreased. The abundance of both cluster 5 and cluster 7 decreased in the Escherichia-dominated enterotype of the CRC group. We present the first enterotype-based faecal microbiota analysis. The gut microbiota of colorectal neoplasms can be influenced by its enterotype.