MkcDBGAS: a reference-free approach to identify comprehensive alternative splicing events in a transcriptome.
Zhang, Quanbao; Cao, Lei; Song, Hongtao; Lin, Kui; Pang, Erli.
Affiliation
  • Zhang Q; MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China.
  • Cao L; MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China.
  • Song H; MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China.
  • Lin K; MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China.
  • Pang E; MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China.
Brief Bioinform; 24(6), 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37833843
ABSTRACT
Alternative splicing (AS) is an essential post-transcriptional mechanism that regulates many biological processes. However, comprehensively identifying all types of AS events without guidance from a reference genome remains a challenge. Here, we propose a novel method, MkcDBGAS, to identify all seven types of AS events using the transcriptome alone, without a reference genome. MkcDBGAS, modeled on full-length transcripts of human and Arabidopsis thaliana, consists of three modules. In the first module, MkcDBGAS, for the first time, uses a colored de Bruijn graph with dynamic and mixed k-mers to identify bubbles generated by AS with precision higher than 98.17% and to detect AS types overlooked by other tools. In the second module, to further classify the types of AS, MkcDBGAS adds exon motifs to construct the feature matrix, followed by an XGBoost-based classifier with classification accuracy greater than 93.40%, which outperforms other widely used machine learning models and state-of-the-art methods. Highly scalable, MkcDBGAS performed well when applied to Iso-Seq data of Amborella and the transcriptome of mouse. In the third module, MkcDBGAS provides analysis of differential splicing across multiple biological conditions when RNA-sequencing data are available. MkcDBGAS is the first accurate and scalable method for detecting all seven types of AS events using the transcriptome alone, which will greatly empower the study of AS more broadly.
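The bubble idea behind the first module can be illustrated with a toy sketch. This is not the MkcDBGAS implementation: it uses a single fixed k rather than dynamic and mixed k-mers, it assumes no k-mer repeats within a transcript, and the function names (`kmers`, `build_colored_dbg`, `find_bubbles`) and the exon sequences are hypothetical. Each transcript acts as one color, and a bubble is reported wherever two colors leave a shared k-mer along different edges and later rejoin.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return the k-mers of seq in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_colored_dbg(transcripts, k):
    """Map each edge (k-mer, next k-mer) to the set of transcript names (colors) using it."""
    edges = defaultdict(set)
    for name, seq in transcripts.items():
        path = kmers(seq, k)
        for u, v in zip(path, path[1:]):
            edges[(u, v)].add(name)
    return edges


def find_bubbles(transcripts, k):
    """Report where two transcripts diverge at a shared k-mer and later rejoin.

    Simplifying assumption (not from the paper): no k-mer repeats within a transcript.
    """
    edges = build_colored_dbg(transcripts, k)
    successors = defaultdict(set)
    for u, v in edges:
        successors[u].add(v)

    paths = {name: kmers(seq, k) for name, seq in transcripts.items()}
    bubbles = []
    for source, nxt in successors.items():
        if len(nxt) < 2:
            continue  # not a branching node
        # Colors on the outgoing edges tell us which transcripts leave this node.
        carriers = sorted(set().union(*(edges[(source, v)] for v in nxt)))
        for a, b in combinations(carriers, 2):
            tail_a = paths[a][paths[a].index(source) + 1:]
            tail_b = paths[b][paths[b].index(source) + 1:]
            if tail_a[0] == tail_b[0]:
                continue  # this pair takes the same edge out of source
            pos_b = {km: j for j, km in enumerate(tail_b)}
            # Sink: first k-mer on a's tail that b's tail also reaches.
            hit = next((i for i, km in enumerate(tail_a) if km in pos_b), None)
            if hit is None:
                continue  # the two paths never rejoin, so no closed bubble
            sink = tail_a[hit]
            bubbles.append({
                "source": source,
                "sink": sink,
                "branches": {a: tail_a[:hit], b: tail_b[:pos_b[sink]]},
            })
    return bubbles


# Toy exon-skipping case with made-up exon sequences.
E1, E2, E3 = "ATGGCGTACG", "TTCAGGCTAA", "CCGTGATCCA"
toy = {"with_exon": E1 + E2 + E3, "skipped": E1 + E3}
for bubble in find_bubbles(toy, k=5):
    sizes = {name: len(branch) for name, branch in bubble["branches"].items()}
    print(bubble["source"], bubble["sink"], sizes)
```

In this toy case the two branches differ in length by exactly the length of the skipped exon, which is the kind of structural signal a downstream classifier could use alongside sequence motifs. Real transcriptomes contain repeats, paralogs and sequencing errors, which this sketch does not handle.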

Full text: 1 | Database: MEDLINE | Main subject: Arabidopsis / Alternative Splicing | Limits: Animals / Humans | Language: En | Journal: Brief Bioinform | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Year of publication: 2023 | Document type: Article | Country of affiliation: China