ABSTRACT
Circular RNAs (circRNAs) are RNA molecules with a continuous loop structure characterized by back-splice junctions (BSJs). While analyses of short-read RNA sequencing data have identified millions of BSJ events, it is inherently challenging to determine the exact full-length sequences and alternatively spliced (AS) isoforms of circRNAs from short reads. Recent advances in nanopore long-read sequencing with circRNA enrichment bring an unprecedented opportunity to address these issues. Here, we developed FL-circAS (https://cosbi.ee.ncku.edu.tw/FL-circAS/), which collects such long-read sequencing data from 20 cell lines/tissues and thereby identifies 884 636 BSJs with 1 853 692 full-length circRNA isoforms in human and 115 173 BSJs with 135 617 full-length circRNA isoforms in mouse. FL-circAS also provides multiple circRNA features. For circRNA expression, FL-circAS calculates the expression level of each circRNA isoform, cell line/tissue specificity at both the BSJ and isoform levels, and AS entropy for each BSJ across samples. For circRNA biogenesis, FL-circAS identifies reverse complementary sequences and RNA binding protein (RBP) binding sites residing in the flanking sequences of BSJs. For functional patterns, FL-circAS identifies potential microRNA/RBP binding sites and several types of evidence for circRNA translation on each full-length circRNA isoform. FL-circAS provides user-friendly interfaces for browsing, searching, analyzing, and downloading data, serving as the first resource for discovering full-length circRNAs at the isoform level.
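The abstract does not define how AS entropy is computed. As a minimal sketch, assuming it is the Shannon entropy of isoform usage within one BSJ (an assumption, not a formula confirmed by the source), it could be calculated from per-isoform read counts as follows. The function name `as_entropy` and the input layout are hypothetical.

```python
import math


def as_entropy(isoform_counts):
    """Shannon entropy (in bits) of isoform usage for one BSJ.

    ASSUMPTION: AS entropy is taken to be the Shannon entropy of the
    isoform proportions; the source does not state its exact formula.
    isoform_counts: read counts, one per full-length isoform sharing the BSJ.
    """
    total = sum(isoform_counts)
    if total == 0:
        return 0.0  # no reads, so no measurable isoform diversity
    entropy = 0.0
    for count in isoform_counts:
        if count > 0:  # isoforms with zero reads contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# A BSJ with a single isoform has entropy 0; two equally used isoforms give 1 bit.
single = as_entropy([10])
balanced = as_entropy([5, 5])
```

Under this assumption, a higher value means the reads of a BSJ are spread more evenly over its isoforms, and a value of 0 means a single isoform accounts for all reads.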
Subjects
Nucleic Acid Databases, Circular RNA, Animals, Humans, Mice, Alternative Splicing/genetics, MicroRNAs/genetics, MicroRNAs/metabolism, Nanopore Sequencing, Circular RNA/genetics, RNA Isoforms/genetics
ABSTRACT
BACKGROUND AND OBJECTIVE: Proteome microarrays are a popular high-throughput screening method for large-scale investigation of protein interactions in cells. These interactions can be measured on protein chips when coupled with fluorescence-labeled probes, helping to identify potential biomarkers or discover drugs. Several computational tools have been developed to help analyze protein chip results. However, existing tools fail to provide a user-friendly interface for biologists and offer only one or two data analysis methods suited to a limited range of experimental designs, which restricts their use cases. METHODS: To facilitate biomarker examination using protein chips, we implemented a user-friendly and comprehensive web tool called BAPCP (Biomarker Analysis tool for Protein Chip Platforms) that handles diverse chip data distributions. RESULTS: BAPCP is well integrated with standard chip result files and includes 7 data normalization methods and 7 custom-designed quality control/differential analysis filters for biomarker extraction among experiment groups. Moreover, it can handle cost-efficient chip designs that repeat several blocks/samples within a single slide. Using experiments with the human coronavirus (HCoV) protein microarray and the E. coli proteome chip, which helps study the immune response in Kawasaki disease, as examples, we demonstrated that BAPCP can shorten the time-consuming, week-long manual biomarker identification process to merely 3 min. CONCLUSIONS: The developed BAPCP tool provides substantial analysis support for protein interaction studies and meets the growing need for computational support and information exchange in bioscience and medicine. The web service of BAPCP is available at https://cosbi.ee.ncku.edu.tw/BAPCP/.
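The abstract does not name the 7 normalization methods that BAPCP offers. To illustrate what normalizing chip intensities across samples can involve, here is a minimal sketch of median scaling, a common generic approach that is assumed here for illustration only and is not confirmed to be one of BAPCP's methods. The function name `median_normalize` and the input layout are hypothetical.

```python
from statistics import median


def median_normalize(samples):
    """Rescale each sample so that all samples share the same median intensity.

    ASSUMPTION: median scaling is a generic illustration, not necessarily
    one of the normalization methods implemented in BAPCP.
    samples: dict mapping a sample name to a list of positive spot intensities.
    """
    # Median intensity of each sample (assumed to be non-zero).
    medians = {name: median(values) for name, values in samples.items()}
    # Common target: the median of the per-sample medians.
    target = median(medians.values())
    return {
        name: [value * target / medians[name] for value in values]
        for name, values in samples.items()
    }


# Two samples whose overall brightness differs by a factor of two.
raw = {"chip_a": [1.0, 2.0, 3.0], "chip_b": [2.0, 4.0, 6.0]}
normalized = median_normalize(raw)
```

After scaling, both hypothetical chips have the same median, so remaining differences between spots are less likely to reflect overall brightness differences between samples.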