ABSTRACT
Long non-coding RNAs (lncRNAs) were commonly viewed as non-coding elements. However, they are increasingly recognized for their ability to be translated into proteins, thereby playing a significant role in various cellular processes and diseases. With developments in biotechnology and computational algorithms, a range of novel approaches is being applied to investigate the translation of lncRNAs. Herein, we developed the LncPepAtlas database (http://www.cnitbiotool.net/LncPepAtlas/), which aims to compile multiple lines of evidence for the translation of lncRNAs and annotations for the upstream regulation of lncRNAs across various species. LncPepAtlas integrates compelling evidence from nine distinct sources for the translation of lncRNAs, including a collected and analysed dataset of 2631 publicly available Ribo-seq samples from nine species. LncPepAtlas offers extensive annotation for lncRNA upstream regulation and expression profiles across various cancers, tissues or cell lines at transcriptional and translational levels. Importantly, it enables novel antigen predictions for lncRNA-encoded peptides. By identifying numerous peptide candidates that could potentially bind to major histocompatibility complex class I and II molecules, this work may provide new insights into cancer immunotherapy. The functions of the peptides were inferred by aligning them with experimentally detected proteins. LncPepAtlas aims to become a convenient resource for exploring translatable lncRNAs.
ABSTRACT
Sepsis is one of the major challenges in intensive care units, characterized by the complexity of the host immune status. To gain a deeper understanding of the pathogenesis of sepsis, it is crucial to study the phenotypic changes in immune cells and their underlying molecular mechanisms. We conducted Summary data-based Mendelian randomization analysis by integrating genome-wide association studies data for sepsis with expression quantitative trait locus data, revealing a significant decrease in the expression levels of 17 biomarkers in sepsis patients. Furthermore, based on single-cell RNA sequencing data, we elucidated potential molecular mechanisms at single-cell resolution and identified that LGALS9 inhibition in sepsis patients leads to the activation and differentiation of monocyte and T-cell subtypes. These findings are expected to assist researchers in gaining a more in-depth understanding of the immune dysregulation in sepsis.
Subject(s)
Galectins, Genome-Wide Association Study, Mendelian Randomization Analysis, Quantitative Trait Loci, Sepsis, RNA Sequence Analysis, Single-Cell Analysis, Humans, Sepsis/genetics, Sepsis/immunology, Sepsis/blood, Single-Cell Analysis/methods, Galectins/genetics, RNA Sequence Analysis/methods, Biomarkers, Single Nucleotide Polymorphism, Monocytes/metabolism, Monocytes/immunology, Genetic Predisposition to Disease
ABSTRACT
The rapid development of genomic high-throughput sequencing has identified a large number of DNA regulatory elements with abundant epigenetic markers, promoting the rapid accumulation of functional genomic region data. A comprehensive understanding of human functional genomic regions remains an urgent task. However, existing analysis tools lack extensive annotation and enrichment analysis capabilities for these regions. Here, we designed a novel software, Genomic Region sets Enrichment Analysis Platform (GREAP), which provides comprehensive region annotation and enrichment analysis capabilities. Currently, GREAP supports 85 370 genomic region reference sets, which cover 634 681 107 regions across 11 different data types, including super enhancers, transcription factors, accessible chromatin, etc. To assess the significance of enrichment, we used the hypergeometric test and also provided a Locus Overlap Analysis. In addition, we developed a customizable genome browser containing >400 000 000 customizable tracks for visualization. In summary, GREAP is a powerful platform that provides many types of genomic region sets for users and supports genomic region annotation and enrichment analyses. The platform is freely available at http://www.liclab.net/Greap/view/index.
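As a rough illustration of the hypergeometric test mentioned above, the following minimal sketch computes an upper-tail enrichment p-value in plain Python. It is not GREAP's implementation: the function name, the parameter names, and the choice to count individual regions against a fixed background "universe" are all assumptions made for illustration, since the abstract does not describe how overlaps or the background are defined.

```python
from math import comb


def hypergeom_enrichment_p(overlap, query_size, set_size, universe_size):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    Hypothetical parameters (not taken from GREAP):
      overlap       -- query regions that fall in the reference set
      query_size    -- number of regions submitted by the user
      set_size      -- number of regions in the reference set
      universe_size -- number of regions in the assumed background
    """
    upper = min(query_size, set_size)
    # Number of ways to draw the query from the background.
    total = comb(universe_size, query_size)
    # Sum the ways to obtain `overlap` or more hits in the reference set.
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the function.
    print(hypergeom_enrichment_p(overlap=8, query_size=50, set_size=200, universe_size=10_000))
```

A small p-value here would suggest that the query overlaps the reference set more often than expected by chance under the assumed background; in practice the result depends strongly on how that background is chosen.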
Subject(s)
Genomics, Software, Chromatin, Human Genome, Humans, Molecular Sequence Annotation, Transcription Factors
ABSTRACT
MOTIVATION: DNA methylation within gene bodies and promoters in cancer cells is well documented. An increasing number of studies have shown that cytosine-phosphate-guanine (CpG) sites falling within other regulatory elements can also regulate target gene activation, mainly by affecting transcription factor (TF) binding in human cancers. This has led to an urgent need to comprehensively and effectively collect distinct cis-regulatory elements and TF-binding sites (TFBS) to annotate DNA methylation regulation. RESULTS: We developed a database (CanMethdb, http://meth.liclab.net/CanMethdb/) that focuses on the upstream and downstream annotations for CpG-genes in cancers. These include upstream cis-regulatory elements, especially those involving regions distal to genes, TFBS annotations for the CpGs, and downstream functional annotations for the target genes, computed by integrating abundant DNA methylation and gene expression profiles in diverse cancers. Users can query CpG-target gene pairs for a cancer type by inputting a genomic region, a CpG or a gene name, or by selecting hypo/hypermethylated CpG sets. The current version of CanMethdb documents a total of 38 986 060 CpG-target gene pairs (with 6 769 130 unique pairs), involving 385 217 CpGs and 18 044 target genes, along with abundant cis-regulatory elements and TFs, for 33 TCGA cancer types. CanMethdb might help biologists perform in-depth studies of target gene regulation based on DNA methylation in cancer. AVAILABILITY AND IMPLEMENTATION: The main program is available at https://github.com/chunquanlipathway/CanMethdb. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
DNA Methylation, Neoplasms, Humans, Transcription Factors/metabolism, Genome, Nucleic Acid Regulatory Sequences, Genetic Promoter Regions, Neoplasms/genetics, DNA/metabolism, CpG Islands
ABSTRACT
Transcription co-factors (TcoFs) play crucial roles in gene expression regulation by communicating regulatory cues from enhancers to promoters. With the rapid accumulation of TcoF-associated chromatin immunoprecipitation sequencing (ChIP-seq) data, the comprehensive collection and integrative analysis of these data are urgently required. Here, we developed the TcoFBase database (http://tcof.liclab.net/TcoFbase), which aims to document a large number of available resources for mammalian TcoFs and to provide annotations and enrichment analyses of TcoFs. TcoFBase curated 2322 TcoFs and 6759 TcoF-associated ChIP-seq datasets from over 500 tissues/cell types in human and mouse. Importantly, TcoFBase provides detailed and abundant (epi)genetic annotations of ChIP-seq-based TcoF binding regions. Furthermore, TcoFBase supports regulatory annotation information and various functional annotations for TcoFs. TcoFBase also embeds five types of TcoF regulatory analyses for users: TcoF gene set enrichment, TcoF binding genomic region annotation, TcoF regulatory network analysis, TcoF-TF co-occupancy analysis and TcoF regulatory axis analysis. TcoFBase was designed to be a useful resource that will help reveal the potential biological effects of TcoFs and elucidate TcoF-related regulatory mechanisms.
Subject(s)
Genetic Databases, Gene Regulatory Networks, Software, Transcription Factors/genetics, Genetic Transcription, Animals, Chromatin/chemistry, Chromatin/metabolism, Datasets as Topic, Genetic Enhancer Elements, Gene Expression Regulation, Humans, Internet, Mice, Molecular Sequence Annotation, Genetic Promoter Regions, Transcription Factors/classification, Transcription Factors/metabolism
ABSTRACT
Transcription factors (TFs) play key roles in biological processes and are often used as cell markers. The emerging importance of TFs and related markers in identifying specific cell types in human diseases increases the need for a comprehensive collection of human TFs and related marker sets. Here, we developed the TF-Marker database (TF-Marker, http://bio.liclab.net/TF-Marker/), aiming to provide cell/tissue-specific TFs and related markers for humans. By manually curating thousands of published articles, we classified 5905 entries containing information about TFs and related markers into five types according to their functions: (i) TF: TFs which regulate expression of the markers; (ii) T Marker: markers which are regulated by the TF; (iii) I Marker: markers which influence the activity of TFs; (iv) TFMarker: TFs which play roles as markers and (v) TF Pmarker: TFs which play roles as potential markers. The 5905 entries of TF-Marker include 1316 TFs, 1092 T Markers, 473 I Markers, 1600 TFMarkers and 1424 TF Pmarkers, involving 383 cell types and 95 tissue types in human. TF-Marker further provides a user-friendly interface to browse, query and visualize the detailed information about TFs and related markers. We believe TF-Marker will become a valuable resource for understanding the regulation patterns of different tissues and cells.
Subject(s)
Genetic Databases, Neoplasms/genetics, Software, Transcription Factors/genetics, Genetic Transcription, Bone and Bones/chemistry, Bone and Bones/metabolism, Brain/metabolism, Colon/chemistry, Colon/metabolism, Female, Gene Expression Regulation, Genetic Markers, Humans, Internet, Liver/chemistry, Liver/metabolism, Lung/chemistry, Lung/metabolism, Male, Human Mammary Glands/chemistry, Human Mammary Glands/metabolism, Molecular Sequence Annotation, Neoplasms/metabolism, Neoplasms/pathology, Organ Specificity, Prostate/chemistry, Prostate/metabolism, Transcription Factors/classification, Transcription Factors/metabolism
ABSTRACT
Long noncoding RNAs (lncRNAs) have been proven to play important roles in transcriptional processes and biological functions. With the increasing study of human diseases and biological processes, information in human H3K27ac ChIP-seq, ATAC-seq and DNase-seq datasets is accumulating rapidly, resulting in an urgent need to collect and process data to identify transcriptional regulatory regions of lncRNAs. We therefore developed a comprehensive database for human regulatory information of lncRNAs (TRlnc, http://bio.licpathway.net/TRlnc), which aimed to collect available resources of transcriptional regulatory regions of lncRNAs and to annotate and illustrate their potential roles in the regulation of lncRNAs in a cell type-specific manner. The current version of TRlnc contains 8 683 028 typical enhancers/super-enhancers and 32 348 244 chromatin accessibility regions associated with 91 906 human lncRNAs. These regions are identified from over 900 human H3K27ac ChIP-seq, ATAC-seq and DNase-seq samples. Furthermore, TRlnc provides the detailed genetic and epigenetic annotation information within transcriptional regulatory regions (promoter, enhancer/super-enhancer and chromatin accessibility regions) of lncRNAs, including common SNPs, risk SNPs, eQTLs, linkage disequilibrium SNPs, transcription factors, methylation sites, histone modifications and 3D chromatin interactions. It is anticipated that the use of TRlnc will help users to gain in-depth and useful insights into the transcriptional regulatory mechanisms of lncRNAs.
Subject(s)
Genetic Databases, Long Noncoding RNA/genetics, Nucleic Acid Regulatory Sequences, Genetic Transcription, Chromatin Immunoprecipitation, Genetic Enhancer Elements, Humans, Linkage Disequilibrium, Methylation, Single Nucleotide Polymorphism, Genetic Promoter Regions, Quantitative Trait Loci
ABSTRACT
Long non-coding RNAs (lncRNAs) have been proven to play important roles in transcriptional processes and various biological functions. Establishing a comprehensive collection of human lncRNA sets is urgent work at present. Using reference lncRNA sets, enrichment analyses will be useful for analyzing lncRNA lists of interest submitted by users. Therefore, we developed a human lncRNA sets database, called LncSEA, which aimed to document a large number of available resources for human lncRNA sets and provide annotation and enrichment analyses for lncRNAs. LncSEA supports >40 000 lncRNA reference sets across 18 categories and 66 sub-categories, and covers over 50 000 lncRNAs. We not only collected lncRNA sets based on downstream regulatory data sources, but also identified a large number of lncRNA sets regulated by upstream transcription factors (TFs) and DNA regulatory elements by integrating TF ChIP-seq, DNase-seq, ATAC-seq and H3K27ac ChIP-seq data. Importantly, LncSEA provides annotation and enrichment analyses of lncRNA sets associated with upstream regulators and downstream targets. In summary, LncSEA is a powerful platform that provides a variety of types of lncRNA sets for users, and supports lncRNA annotations and enrichment analyses. The LncSEA database is freely accessible at http://bio.liclab.net/LncSEA/index.php.
Subject(s)
Computational Biology/methods, Genetic Databases, Gene Expression Regulation, Long Noncoding RNA/genetics, Nucleic Acid Regulatory Sequences/genetics, Transcription Factors/genetics, Data Mining/methods, Humans, Internet, Molecular Sequence Annotation/methods, RNA Sequence Analysis/methods, User-Computer Interface
ABSTRACT
In recent years, high-throughput genomic technologies such as chromatin immunoprecipitation sequencing (ChIP-seq) and transcriptome sequencing (RNA-seq) have become both more refined and less expensive, making them more accessible. Many circular RNAs (circRNAs) that originate from back-spliced exons have been identified in various cell lines across different species. However, the regulatory mechanism for transcription of circRNAs remains unclear. Therefore, there is an urgent need to construct a database detailing the transcriptional regulation of circRNAs. TRCirc (http://www.licpathway.net/TRCirc) provides a resource for efficient retrieval, browsing and visualization of transcriptional regulation information of circRNAs. The current version of TRCirc documents 92 375 circRNAs and 161 transcription factors (TFs) from more than 100 cell types, which together represent more than 765 000 TF-circRNA regulatory relationships. Furthermore, TRCirc provides other regulatory information about transcription of circRNAs, including their expression, methylation levels, H3K27ac signals in regulatory regions and super-enhancers associated with circRNAs. TRCirc provides a convenient, user-friendly interface to search, browse and visualize detailed information about these circRNAs.
Subject(s)
Gene Expression Regulation, Circular RNA/genetics, Genetic Transcription, Genetic Databases, Humans, Information Storage and Retrieval
ABSTRACT
Transcription factors (TFs) are major contributors to gene transcription, especially in controlling cell-specific gene expression and disease occurrence and development. Uncovering the relationship between TFs and their target genes is critical to understanding the mechanism of action of TFs. With the development of high-throughput sequencing techniques, a large amount of TF-related data has accumulated, which can be used to identify their target genes. In this study, we developed the TFTG (Transcription Factor and Target Genes) database (http://tf.liclab.net/TFTG), which aims to provide a large number of available human TF-target gene resources obtained by multiple strategies, in addition to comprehensive functional and epigenetic annotations and regulatory analyses of TFs. We identified extensive TF-target gene relationships by collecting and processing TF-associated ChIP-seq datasets, perturbation RNA-seq datasets and motifs. We also obtained experimentally confirmed relationships between TFs and target genes from available resources. Overall, the target genes of TFs were obtained by integrating the relevant data of various TFs using fourteen identification strategies. TFTG also embeds user-friendly search, analysis, browsing, downloading and visualization functions. TFTG is designed to be a convenient resource for exploring human TF-target gene regulation, which will be useful for most users in TF and gene expression regulation research.
ABSTRACT
Esophageal carcinoma ranks as the sixth leading cause of cancer-related mortality globally, with esophageal squamous cell carcinoma (ESCC) being particularly prevalent among Asian populations. Alternative splicing (AS) plays a pivotal role in ESCC development and progression by generating diverse transcript isoforms. However, the current landscape lacks a specialized database focusing on alternative splicing events (ASEs) derived from a large number of ESCC cases. Additionally, most existing AS databases overlook the contribution of long non-coding RNAs (lncRNAs) in ESCC molecular mechanisms, predominantly focusing on mRNA-based ASE identification. To address these limitations, we deployed DASES (http://www.hxdsjzx.cn/DASES). Employing a combination of publicly available and in-house ESCC RNA-seq datasets, our extensive analysis of 346 samples, with 93% being paired tumor and adjacent non-tumor tissues, led to the identification of 257 novel lncRNAs in esophageal squamous cell carcinoma. Leveraging a paired comparison of tumor and adjacent normal tissues, DASES identified 59,094 ASEs that may be associated with ESCC. DASES fills a critical gap by providing comprehensive insights into ASEs in ESCC, encompassing lncRNAs and mRNA, thus facilitating a deeper understanding of ESCC molecular mechanisms and serving as a valuable resource for ESCC research communities.