Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.
Nat Biotechnol; 33(8): 831-8, 2015 Aug.
Article | En | MEDLINE | ID: mdl-26213851
Knowing the sequence specificities of DNA- and RNA-binding proteins is essential for developing models of the regulatory processes in biological systems and for identifying causal disease variants. Here we show that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery. Using a diverse array of experimental data and evaluation metrics, we find that deep learning outperforms other state-of-the-art methods, even when training on in vitro data and testing on in vivo data. We call this approach DeepBind and have built a stand-alone software tool that is fully automatic and handles millions of sequences per experiment. Specificities determined by DeepBind are readily visualized as a weighted ensemble of position weight matrices or as a 'mutation map' that indicates how variations affect binding within a specific sequence.
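To make the 'mutation map' idea in the abstract concrete, here is a minimal sketch in Python. It is not DeepBind's model or its exact map definition: it assumes a single toy position weight matrix with made-up values (`PWM`), a hypothetical best-window scoring function (`score`), and defines each map entry simply as the change in score when one base is substituted. All names and numbers are illustrative assumptions.

```python
BASES = "ACGT"

# Hypothetical 3-position weight matrix favouring the motif "TGA".
# Values are invented for illustration: +1.0 for the preferred base, -1.0 otherwise.
PWM = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
]


def score(seq, pwm=PWM):
    """Return the best score of the matrix over all windows of seq."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the weight matrix")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def mutation_map(seq, scorer=score):
    """For every position and base, report score(mutant) - score(original).

    Assumption: a plain score difference stands in for the effect of a
    variant; the published tool may weight or normalise this differently.
    """
    reference = scorer(seq)
    rows = []
    for pos in range(len(seq)):
        row = {}
        for base in BASES:
            mutant = seq[:pos] + base + seq[pos + 1:]
            row[base] = scorer(mutant) - reference
        rows.append(row)
    return rows


if __name__ == "__main__":
    sequence = "CCTGACC"
    for pos, row in enumerate(mutation_map(sequence)):
        print(pos, sequence[pos], row)
```

Negative entries mark substitutions that weaken the predicted binding score, and entries near zero mark positions the scorer is insensitive to. Any sequence scorer can be passed as `scorer`, so the same loop would apply to a trained model's prediction function.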
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Software / RNA-Binding Proteins / Computational Biology / Sequence Analysis, Protein / DNA-Binding Proteins
Type of study: Prognostic_studies / Risk_factors_studies
Language: En
Journal: Nat Biotechnol
Journal subject: BIOTECHNOLOGY
Year: 2015
Document type: Article
Country of affiliation: Canada