1.
Bioinformatics ; 34(13): i429-i437, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29949959

ABSTRACT

Motivation: Alternative splice site selection is inherently competitive: the probability that a given splice site is used also depends on the strength of neighboring sites. Here, we present a new model named the competitive splice site model (COSSMO), which explicitly accounts for these competitive effects and predicts the percent selected index (PSI) distribution over any number of putative splice sites. We model an alternative splicing event as the choice of a 3' acceptor site conditional on a fixed upstream 5' donor site, or the choice of a 5' donor site conditional on a fixed 3' acceptor site. We build four different architectures that use convolutional layers, communication layers, long short-term memory and residual networks, respectively, to learn relevant motifs from sequence alone. We also construct a new dataset from genome annotations and RNA-Seq read data that we use to train our model.

Results: COSSMO predicts the most frequently used splice site with an accuracy of 70% on unseen test data and achieves an R² of 0.6 in modeling the PSI distribution. We visualize the motifs that COSSMO learns from sequence and show that COSSMO recognizes the consensus splice site sequences and many known splicing factors with high specificity.

Availability and implementation: Model predictions, our training dataset, and code are available from http://cossmo.genes.toronto.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.
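The competitive formulation in the abstract above can be illustrated with a minimal sketch. It is not the COSSMO implementation: it assumes that each putative splice site receives a scalar score (the values below are made up) and that the scores are normalized with a softmax, which the abstract does not specify. The function names `psi_distribution`, `most_used_site` and `r_squared` are hypothetical, and `r_squared` uses the standard coefficient of determination, which may differ from how the paper computes its R².

```python
import math


def psi_distribution(site_scores):
    """Turn per-site scores into PSI values that sum to 1.

    Assumption: a softmax normalization, so that strengthening one
    site necessarily lowers the PSI of its competitors.
    """
    if not site_scores:
        raise ValueError("need at least one putative splice site")
    top = max(site_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in site_scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_used_site(psi):
    """Index of the site with the highest PSI."""
    return max(range(len(psi)), key=psi.__getitem__)


def r_squared(observed, predicted):
    """Standard coefficient of determination between two PSI vectors."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Hypothetical scores for three competing acceptor sites.
psi = psi_distribution([2.0, 0.5, -1.0])
# Same first site, but with a stronger neighboring competitor.
psi_strong_rival = psi_distribution([2.0, 2.0, -1.0])
```

In this sketch the first site keeps the same score in both calls, yet its PSI drops when a neighboring site becomes stronger, which is the competitive effect the abstract describes. Because the normalization runs over a list of arbitrary length, the same function covers any number of putative sites.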


Subjects
Alternative Splicing , Deep Learning , RNA Splice Sites , RNA Sequence Analysis/methods , Computational Biology/methods , Humans , Genetic Models , Probability , Software
2.
Science ; 347(6218): 1254806, 2015 Jan 09.
Article in English | MEDLINE | ID: mdl-25525159

ABSTRACT

To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.


Subjects
Artificial Intelligence , Pervasive Child Development Disorders/genetics , Hereditary Nonpolyposis Colorectal Neoplasms/genetics , Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Spinal Muscular Atrophy/genetics , RNA Splicing/genetics , Signal Transducing Adaptor Proteins/genetics , Computer Simulation , DNA/genetics , Exons/genetics , Genetic Code , Genetic Markers , Genetic Variation , Humans , Introns/genetics , Genetic Models , MutL Protein Homolog 1 , Missense Mutation , Nuclear Proteins/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , RNA Splice Sites/genetics , RNA-Binding Proteins/genetics