Search | VHL Search Portal

Genome-wide identification of associations between enhancer and alternative splicing in human and mouse.

Shiau, Cheng-Kai; Huang, Jia-Hsin; Liu, Yu-Ting; Tsai, Huai-Kuang.

BMC Genomics ; 22(Suppl 5): 919, 2022 May 09.

Article in English | MEDLINE | ID: mdl-35534820

ABSTRACT

BACKGROUND: Alternative splicing (AS) increases the diversity of transcriptome and could fine-tune the function of genes, so that understanding the regulation of AS is vital. AS could be regulated by many different cis-regulatory elements, such as enhancer. Enhancer has been experimentally proved to regulate AS in some genes. However, there is a lack of genome-wide studies on the association between enhancer and AS (enhancer-AS association). To bridge the gap, here we developed an integrative analysis on a genome-wide scale to identify enhancer-AS associations in human and mouse.

RESULT: We collected enhancer datasets which include 28 human and 24 mouse tissues and cell lines, and RNA-seq datasets which are paired with the selected tissues. Combining with data integration and statistical analysis, we identified 3,242 human and 7,716 mouse genes which have significant enhancer-AS associations in at least one tissue. On average, for each gene, about 6% of enhancers in human (5% in mouse) are associated to AS change and for each enhancer, approximately one gene is identified to have enhancer-AS association in both human and mouse. We found that 52% of the human significant (34% in mouse) enhancer-AS associations are the co-existence of homologous genes and homologous enhancers. We further constructed a user-friendly platform, named Visualization of Enhancer-associated Alternative Splicing (VEnAS, http://venas.iis.sinica.edu.tw/), to provide genomic architecture, intuitive association plot, and contingency table of the significant enhancer-AS associations.

CONCLUSION: This study provides the first genome-wide identification of enhancer-AS associations in human and mouse. The results suggest that a notable portion of enhancers are playing roles in AS regulations. The analyzed results and the proposed platform VEnAS would provide a further understanding of enhancers on regulating alternative splicing.

Subject(s)

Alternative Splicing , Enhancer Elements, Genetic , Animals , Genome-Wide Association Study , Genomics/methods , Humans , Mice , RNA-Seq

CATANA: a tool for generating comprehensive annotations of alternative transcript events.

Shiau, Cheng-Kai; Huang, Jia-Hsin; Tsai, Huai-Kuang.

Bioinformatics ; 35(8): 1414-1415, 2019 04 15.

Article in English | MEDLINE | ID: mdl-30202999

ABSTRACT

SUMMARY: In higher eukaryotes, the generation of transcript isoforms from a single gene through alternative splicing (AS) and alternative transcription (AT) mechanisms increases functional and regulatory diversities. Annotating these alternative transcript events is essential for genomic studies. However, there are no existing tools that generate comprehensive annotations of all these alternative transcript events including both AS and AT events. In the present study, we develop CATANA, with the encoded exon usage patterns based on the flattened gene model, to identify ten types of AS and AT events. We demonstrate the power and versatility of CATANA by showing greater depth of annotations of alternative transcript events according to either genome annotation or RNA-seq data. AVAILABILITY AND IMPLEMENTATION: CATANA is available on https://github.com/shiauck/CATANA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Alternative Splicing , Software , Transcription, Genetic , Exons , Genome , Sequence Analysis, RNA

IGDB.NSCLC: integrated genomic database of non-small cell lung cancer.

Kao, Sen; Shiau, Cheng-Kai; Gu, De-Leung; Ho, Chun-Ming; Su, Wen-Hui; Chen, Chian-Feng; Lin, Chi-Hung; Jou, Yuh-Shan.

Nucleic Acids Res ; 40(Database issue): D972-7, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22139933

ABSTRACT

Lung cancer is the most common cause of cancer-related mortality with more than 1.4 million deaths per year worldwide. To search for significant somatic alterations in lung cancer, we analyzed, integrated and manually curated various data sets and literatures to present an integrated genomic database of non-small cell lung cancer (IGDB.NSCLC, http://igdb.nsclc.ibms.sinica.edu.tw). We collected data sets derived from hundreds of human NSCLC (lung adenocarcinomas and/or squamous cell carcinomas) to illustrate genomic alterations [chromosomal regions with copy number alterations (CNAs), gain/loss and loss of heterozygosity], aberrant expressed genes and microRNAs, somatic mutations and experimental evidence and clinical information of alterations retrieved from literatures. IGDB.NSCLC provides user friendly interfaces and searching functions to display multiple layers of evidence especially emphasizing on concordant alterations of CNAs with co-localized altered gene expression, aberrant microRNAs expression, somatic mutations or genes with associated clinicopathological features. These significant concordant alterations in NSCLC are graphically or tabularly presented to facilitate and prioritize as the putative cancer targets for pathological and mechanistic studies of lung tumorigenesis and for developing new strategies in clinical interventions.

Subject(s)

Carcinoma, Non-Small-Cell Lung/genetics , Databases, Genetic , Lung Neoplasms/genetics , Carcinoma, Non-Small-Cell Lung/metabolism , Gene Expression Profiling , Genes, Neoplasm , Genetic Variation , Genomics , Humans , Lung Neoplasms/metabolism , MicroRNAs/metabolism , Mutation , Systems Integration

IGRhCellID: integrated genomic resources of human cell lines for identification.

Shiau, Cheng-Kai; Gu, De-Leung; Chen, Chian-Feng; Lin, Chi-Hung; Jou, Yuh-Shan.

Nucleic Acids Res ; 39(Database issue): D520-4, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21051335

ABSTRACT

Cell line identification is emerging as an essential method for every cell line user in research community to avoid using misidentified cell lines for experiments and publications. IGRhCellID (http://igrcid.ibms.sinica.edu.tw) is designed to integrate eight cell identification methods including seven methods (STR profile, gender, immunotypes, karyotype, isoenzyme profile, TP53 mutation and mutations of cancer genes) available in various public databases and our method of profiling genome alterations of human cell lines. With data validation of 11 small deleted genes in human cancer cell lines, profiles of genomic alterations further allow users to search for human cell lines with deleted gene to serve as indigenous knock-out cell model (such as SMAD4 in gene view), with amplified gene to be the cell models for testing therapeutic efficacy (such as ERBB2 in gene view) and with overlapped aberrant chromosomal loci for revealing common cancer genes (such as 9p21.3 homozygous deletion with co-deleted CDKN2A, CDKN2B and MTAP in chromosome view). IGRhCellID provides not only available methods for cell identification to help eradicating concerns of using misidentified cells but also designated genetic features of human cell lines for experiments.

Subject(s)

Cell Line , Databases, Factual , Genomics , Cell Line, Tumor , Genes , Genetic Loci , Humans

Delineating genotypes and phenotypes of individual cells from long-read single cell transcriptomes.

Shiau, Cheng-Kai; Lu, Lina; Kieser, Rachel; Fukumura, Kazutaka; Pan, Timothy; Lin, Hsiao-Yun; Yang, Jie; Tong, Eric L; Lee, GaHyun; Yan, Yuanqing; Huse, Jason T; Gao, Ruli.

bioRxiv ; 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36778278

ABSTRACT

Single-cell nanopore sequencing of full-length mRNAs (scNanoRNAseq) is transforming single-cell multi-omics studies. However, challenges include computational complexity and dependence on short-read curation. To address this, we developed a comprehensive toolkit, scNanoGPS, to calculate same-cell genotypes-phenotypes without short-read guidance. We applied scNanoGPS onto 23,587 long-read transcriptomes from 4 tumors and 2 cell lines. Standalone, scNanoGPS accurately deconvoluted error-prone long-reads into single-cells and single-molecules. Further, scNanoGPS simultaneously accessed both phenotypes (expressions/isoforms) and genotypes (mutations) of individual cells. Our analyses revealed that tumor and stroma/immune cells often expressed significantly distinct combinations of isoforms (DCIs). In a kidney tumor, we identified 924 genes with DCIs involved in cell-type-specific functions such as PDE10A in tumor cells and CCL3 in lymphocytes. Moreover, transcriptome-wide mutation analyses identified many cell-type-specific mutations including VEGFA mutations in tumor cells and HLA-A mutations in immune cells, highlighting critical roles of different populations in tumors. Together, scNanoGPS facilitates applications of single-cell long-read sequencing.

High throughput single cell long-read sequencing analyses of same-cell genotypes and phenotypes in human tumors.

Shiau, Cheng-Kai; Lu, Lina; Kieser, Rachel; Fukumura, Kazutaka; Pan, Timothy; Lin, Hsiao-Yun; Yang, Jie; Tong, Eric L; Lee, GaHyun; Yan, Yuanqing; Huse, Jason T; Gao, Ruli.

Nat Commun ; 14(1): 4124, 2023 07 11.

Article in English | MEDLINE | ID: mdl-37433798

ABSTRACT

Single-cell nanopore sequencing of full-length mRNAs transforms single-cell multi-omics studies. However, challenges include high sequencing errors and dependence on short-reads and/or barcode whitelists. To address these, we develop scNanoGPS to calculate same-cell genotypes (mutations) and phenotypes (gene/isoform expressions) without short-read nor whitelist guidance. We apply scNanoGPS onto 23,587 long-read transcriptomes from 4 tumors and 2 cell-lines. Standalone, scNanoGPS deconvolutes error-prone long-reads into single-cells and single-molecules, and simultaneously accesses both phenotypes and genotypes of individual cells. Our analyses reveal that tumor and stroma/immune cells express distinct combination of isoforms (DCIs). In a kidney tumor, we identify 924 DCI genes involved in cell-type-specific functions such as PDE10A in tumor cells and CCL3 in lymphocytes. Transcriptome-wide mutation analyses identify many cell-type-specific mutations including VEGFA mutations in tumor cells and HLA-A mutations in immune cells, highlighting the critical roles of different mutant populations in tumors. Together, scNanoGPS facilitates applications of single-cell long-read sequencing technologies.

Subject(s)

Carcinoma, Intraductal, Noninfiltrating , Kidney Neoplasms , Humans , Genotype , High-Throughput Nucleotide Sequencing , Phenotype , Phosphoric Diester Hydrolases

Anaplastic transformation in thyroid cancer revealed by single-cell transcriptomics.

Lu, Lina; Wang, Jennifer Rui; Henderson, Ying C; Bai, Shanshan; Yang, Jie; Hu, Min; Shiau, Cheng-Kai; Pan, Timothy; Yan, Yuanqing; Tran, Tuan M; Li, Jianzhuo; Kieser, Rachel; Zhao, Xiao; Wang, Jiping; Nurieva, Roza; Williams, Michelle D; Cabanillas, Maria E; Dadu, Ramona; Busaidy, Naifa Lamki; Zafereo, Mark; Navin, Nicholas; Lai, Stephen Y; Gao, Ruli.

J Clin Invest ; 133(11), 2023 06 01.

Article in English | MEDLINE | ID: mdl-37053016

ABSTRACT

The deadliest anaplastic thyroid cancer (ATC) often transforms from indolent differentiated thyroid cancer (DTC); however, the complex intratumor transformation process is poorly understood. We investigated an anaplastic transformation model by dissecting both cell lineage and cell fate transitions using single-cell transcriptomic and genetic alteration data from patients with different subtypes of thyroid cancer. The resulting spectrum of ATC transformation included stress-responsive DTC cells, inflammatory ATC cells (iATCs), and mitotic-defective ATC cells and extended all the way to mesenchymal ATC cells (mATCs). Furthermore, our analysis identified 2 important milestones: (a) a diploid stage, in which iATC cells were diploids with inflammatory phenotypes and (b) an aneuploid stage, in which mATCs gained aneuploid genomes and mesenchymal phenotypes, producing excessive amounts of collagen and collagen-interacting receptors. In parallel, cancer-associated fibroblasts showed strong interactions among mesenchymal cell types, macrophages shifted from M1 to M2 states, and T cells reprogrammed from cytotoxic to exhausted states, highlighting new therapeutic opportunities for the treatment of ATC.

Subject(s)

Thyroid Carcinoma, Anaplastic , Thyroid Neoplasms , Humans , Transcriptome , Thyroid Neoplasms/genetics , Thyroid Neoplasms/metabolism , Thyroid Carcinoma, Anaplastic/genetics , Gene Expression Profiling , Aneuploidy , Cell Line, Tumor

Vir-Mir db: prediction of viral microRNA candidate hairpins.

Li, Sung-Chou; Shiau, Cheng-Kai; Lin, Wen-Chang.

Nucleic Acids Res ; 36(Database issue): D184-9, 2008 Jan.

Article in English | MEDLINE | ID: mdl-17702763

ABSTRACT

MicroRNAs have been found in various organisms and play essential roles in gene expression regulation of many critical cellular processes. Large-scale computational prediction of miRNAs has been conducted for many organisms using known genomic sequences; however, there has been no such effort for the thousands of known viral genomes. Some viruses utilize existing host cellular pathways for their own benefit. Furthermore, viruses are capable of encoding miRNAs and using them to repress host genes. Thus, identifying potential miRNAs in all viral genomes would be valuable to virologists who study virus-host interactions. Based on our previously reported hairpin secondary structure and feature selection filters, we have examined the 2266 available viral genome sequences for putative miRNA hairpins and identified 33 691 hairpin candidates in 1491 genomes. Evaluation of the system performance indicated that our discovery pipeline exhibited 84.4% sensitivity. We established an interface for users to query the predicted viral miRNA hairpins based on taxonomic classification, and a host target gene prediction service based on the RNAhybrid program and the 3'-UTR gene sequences of human, mouse, rat, zebrafish, rice and Arabidopsis. The viral miRNA prediction database (Vir-Mir) can be accessed via http://alk.ibms.sinica.edu.tw.

Subject(s)

Databases, Nucleic Acid , MicroRNAs/chemistry , RNA, Viral/chemistry , 3' Untranslated Regions/chemistry , Animals , Databases, Nucleic Acid/statistics & numerical data , Genome, Viral , Humans , Internet , Mice , MicroRNAs/classification , Open Reading Frames , RNA, Viral/classification , Rats , Sequence Analysis, RNA , Software , User-Computer Interface
