COSSMO: predicting competitive alternative splice site selection using deep learning.
Bretschneider, Hannes; Gandhi, Shreshth; Deshwar, Amit G; Zuberi, Khalid; Frey, Brendan J.
Affiliation
  • Bretschneider H; Deep Genomics Inc, Toronto, Canada.
  • Gandhi S; Department of Computer Science, University of Toronto, Toronto, Canada.
  • Deshwar AG; Deep Genomics Inc, Toronto, Canada.
  • Zuberi K; Deep Genomics Inc, Toronto, Canada.
  • Frey BJ; Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada.
Bioinformatics ; 34(13): i429-i437, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29949959
ABSTRACT
Motivation:

Alternative splice site selection is inherently competitive: the probability that a given splice site is used also depends on the strength of neighboring sites. Here, we present a new model, the competitive splice site model (COSSMO), which explicitly accounts for these competitive effects and predicts the percent selected index (PSI) distribution over any number of putative splice sites. We model an alternative splicing event as the choice of a 3' acceptor site conditional on a fixed upstream 5' donor site, or the choice of a 5' donor site conditional on a fixed 3' acceptor site. We build four different architectures that use convolutional layers, communication layers, long short-term memory and residual networks, respectively, to learn relevant motifs from sequence alone. We also construct a new dataset from genome annotations and RNA-Seq read data that we use to train our model.
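
The competitive framing above can be illustrated with a minimal sketch. It assumes, as is common for this kind of model but not stated in the abstract, that each putative splice site receives a scalar score and that the scores are normalized with a softmax, so raising the score of one site lowers the predicted PSI of its competitors. The function name `psi_distribution` and the example scores are hypothetical and do not come from the COSSMO code base.

```python
import math


def psi_distribution(site_scores):
    """Turn per-site scores into a PSI-like distribution via softmax.

    Assumption: one scalar score per putative splice site; any number
    of sites is allowed, mirroring the variable-size output described
    in the abstract.
    """
    if not site_scores:
        raise ValueError("at least one putative splice site is required")
    # Subtract the maximum score for numerical stability.
    max_score = max(site_scores)
    exps = [math.exp(s - max_score) for s in site_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three competing acceptor sites.
psi = psi_distribution([2.0, 0.5, -1.0])

# Strengthening the second site reduces the predicted usage of the first,
# even though the first site's own score is unchanged.
psi_with_stronger_neighbor = psi_distribution([2.0, 2.0, -1.0])
```

Because the normalization is shared across all sites of an event, the prediction for one site cannot be read in isolation, which is the competitive effect the abstract describes.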

Results:

COSSMO predicts the most frequently used splice site with an accuracy of 70% on unseen test data and achieves an R² of 0.6 in modeling the PSI distribution. We visualize the motifs that COSSMO learns from sequence and show that COSSMO recognizes the consensus splice site sequences and many known splicing factors with high specificity.

Availability and implementation:

Model predictions, our training dataset, and code are available from http://cossmo.genes.toronto.edu.

Supplementary information:

Supplementary data are available at Bioinformatics online.
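
The two figures reported in the Results (accuracy on the most frequently used site, and R² on the PSI distribution) could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not define how R² is aggregated, so the code uses the standard coefficient of determination pooled over all sites of all events, and the helper names and toy data are hypothetical.

```python
def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def top1_accuracy(predicted, observed):
    """Fraction of events whose top predicted site is the most used site."""
    hits = sum(
        1 for p, o in zip(predicted, observed) if argmax(p) == argmax(o)
    )
    return hits / len(predicted)


def r_squared(predicted, observed):
    """Coefficient of determination pooled over every site of every event.

    Assumption: pooling across events; the paper may aggregate differently.
    """
    flat_p = [x for event in predicted for x in event]
    flat_o = [x for event in observed for x in event]
    mean_o = sum(flat_o) / len(flat_o)
    ss_res = sum((o - p) ** 2 for p, o in zip(flat_p, flat_o))
    ss_tot = sum((o - mean_o) ** 2 for o in flat_o)
    return 1.0 - ss_res / ss_tot


# Toy data: two events with three putative sites each (not from the paper).
predicted_psi = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]]
observed_psi = [[0.8, 0.1, 0.1], [0.5, 0.4, 0.1]]

accuracy = top1_accuracy(predicted_psi, observed_psi)
r2 = r_squared(predicted_psi, observed_psi)
```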
Subject(s)

Full text: 1 Database: MEDLINE Main subject: Sequence Analysis, RNA / Alternative Splicing / RNA Splice Sites / Deep Learning Type of study: Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2018 Document type: Article Affiliation country: Canada
