Search | BVS España Search Portal

De novo distillation of thermodynamic affinity from deep learning regulatory sequence models of in vivo protein-DNA binding.

Alexandari, Amr M; Horton, Connor A; Shrikumar, Avanti; Shah, Nilay; Li, Eileen; Weilert, Melanie; Pufall, Miles A; Zeitlinger, Julia; Fordyce, Polly M; Kundaje, Anshul.

bioRxiv ; 2023 May 11.

Article in English | MEDLINE | ID: mdl-37214836

ABSTRACT

Transcription factors (TFs) are proteins that bind DNA in a sequence-specific manner to regulate gene transcription. Despite their unique intrinsic sequence preferences, in vivo genomic occupancy profiles of TFs differ across cellular contexts. Hence, deciphering the sequence determinants of TF binding, both intrinsic and context-specific, is essential to understand gene regulation and the impact of regulatory, non-coding genetic variation. Biophysical models trained on in vitro TF binding assays can estimate intrinsic affinity landscapes and predict occupancy based on TF concentration and affinity. However, these models cannot adequately explain context-specific, in vivo binding profiles. Conversely, deep learning models, trained on in vivo TF binding assays, effectively predict and explain genomic occupancy profiles as a function of complex regulatory sequence syntax, albeit without a clear biophysical interpretation. To reconcile these complementary models of in vitro and in vivo TF binding, we developed Affinity Distillation (AD), a method that extracts thermodynamic affinities de novo from deep learning models of TF chromatin immunoprecipitation (ChIP) experiments by marginalizing away the influence of genomic sequence context. Applied to neural networks modeling diverse classes of yeast and mammalian TFs, AD predicts energetic impacts of sequence variation within and surrounding motifs on TF binding as measured by diverse in vitro assays with superior dynamic range and accuracy compared to motif-based methods. Furthermore, AD can accurately discern affinities of TF paralogs. Our results highlight thermodynamic affinity as a key determinant of in vivo binding, suggest that deep learning models of in vivo binding implicitly learn high-resolution affinity landscapes, and show that these affinities can be successfully distilled using AD. This new biophysical interpretation of deep learning models enables high-throughput in silico experiments to explore the influence of sequence context and variation on both intrinsic affinity and in vivo occupancy.
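
The abstract describes marginalizing away genomic sequence context but does not spell out the procedure. The sketch below is one plausible reading, not the authors' implementation: a candidate site is inserted into many random background sequences, and the change in model output is averaged. The scoring function `toy_model`, the helper names, and all parameter values are hypothetical stand-ins for a trained network and its settings.

```python
import random
from statistics import mean


def toy_model(seq: str) -> float:
    """Hypothetical stand-in for a trained network's scalar output.

    It rewards an arbitrary strong site, a weaker variant, and adds a
    small GC-content term that plays the role of a context effect.
    """
    strong_hits = seq.count("CACGTG")
    weak_hits = seq.count("CACGTT")
    gc_fraction = sum(base in "GC" for base in seq) / len(seq)
    return 2.0 * strong_hits + 0.5 * weak_hits + 0.5 * gc_fraction


def random_background(length: int, rng: random.Random) -> str:
    """Draw a uniform random DNA sequence (an assumed background model)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def marginal_score(model, motif: str, n_backgrounds: int = 200,
                   length: int = 100, seed: int = 0) -> float:
    """Average the effect of inserting `motif` over many backgrounds.

    For each background, the motif replaces the central bases and the
    difference in model output (with motif minus without) is recorded.
    Averaging these differences marginalizes over sequence context.
    """
    rng = random.Random(seed)
    start = (length - len(motif)) // 2
    deltas = []
    for _ in range(n_backgrounds):
        background = random_background(length, rng)
        with_motif = (background[:start] + motif
                      + background[start + len(motif):])
        deltas.append(model(with_motif) - model(background))
    return mean(deltas)


if __name__ == "__main__":
    for site in ("CACGTG", "CACGTT", "AAAAAA"):
        print(site, round(marginal_score(toy_model, site), 3))
```

With a real model, the marginal scores of a set of sequences would then be compared against measured in vitro affinities; how that comparison is calibrated is not stated in the abstract and is left open here.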

Short tandem repeats bind transcription factors to tune eukaryotic gene expression.

Horton, Connor A; Alexandari, Amr M; Hayes, Michael G B; Marklund, Emil; Schaepe, Julia M; Aditham, Arjun K; Shah, Nilay; Suzuki, Peter H; Shrikumar, Avanti; Afek, Ariel; Greenleaf, William J; Gordân, Raluca; Zeitlinger, Julia; Kundaje, Anshul; Fordyce, Polly M.

Science ; 381(6664): eadd1250, 2023 09 22.

Article in English | MEDLINE | ID: mdl-37733848

ABSTRACT

Short tandem repeats (STRs) are enriched in eukaryotic cis-regulatory elements and alter gene expression, yet how they regulate transcription remains unknown. We found that STRs modulate transcription factor (TF)-DNA affinities and apparent on-rates by about 70-fold by directly binding TF DNA-binding domains, with energetic impacts exceeding many consensus motif mutations. STRs maximize the number of weakly preferred microstates near target sites, thereby increasing TF density, with impacts well predicted by statistical mechanics. Confirming that STRs also affect TF binding in cells, neural networks trained only on in vivo occupancies predicted effects identical to those observed in vitro. Approximately 90% of TFs preferentially bound STRs that need not resemble known motifs, providing a cis-regulatory mechanism to target TFs to genomic sites.
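
The microstate argument above can be illustrated with a toy partition sum. This is a minimal sketch under strong assumptions, not the statistical-mechanics model used in the paper: every binding register along the DNA is treated as a microstate, each mismatch to a hypothetical consensus costs a fixed, arbitrary energy, and the apparent binding free energy follows from the sum of Boltzmann weights over all registers. The consensus, the penalty, and the flank sequences are chosen for illustration only.

```python
import math

RT = 0.593               # kcal/mol at roughly 298 K
CONSENSUS = "CACGTG"     # hypothetical consensus site
MISMATCH_PENALTY = 0.5   # assumed kcal/mol per mismatched base


def register_energy(window: str) -> float:
    """Energy of one binding register relative to the consensus."""
    mismatches = sum(a != b for a, b in zip(window, CONSENSUS))
    return MISMATCH_PENALTY * mismatches


def partition_sum(seq: str) -> float:
    """Sum of Boltzmann weights over every register (microstate) in seq."""
    k = len(CONSENSUS)
    return sum(math.exp(-register_energy(seq[i:i + k]) / RT)
               for i in range(len(seq) - k + 1))


def apparent_ddg(seq: str, reference: str) -> float:
    """Apparent free-energy change of seq relative to a reference sequence.

    Negative values mean that seq offers more total Boltzmann weight,
    which corresponds to tighter apparent binding in this toy model.
    """
    return -RT * math.log(partition_sum(seq) / partition_sum(reference))


if __name__ == "__main__":
    repeat_flank = CONSENSUS + "CA" * 8   # site followed by a dinucleotide repeat
    poly_t_flank = CONSENSUS + "T" * 16   # same site, less favorable flank
    print(partition_sum(repeat_flank), partition_sum(poly_t_flank))
    print(apparent_ddg(repeat_flank, poly_t_flank))
```

In this toy setting the repeat flank contributes several registers that each match half of the consensus, so its partition sum is larger even though no single register in the flank resembles a full site.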

Subject(s)

Gene Expression Regulation; Microsatellite Repeats; Transcription Factors; Eukaryotic Cells; Transcription Factors/chemistry; Transcription Factors/genetics; Protein Binding; Humans; Animals; Saccharomyces cerevisiae; Protein Domains; Protein Conformation

Base-resolution models of transcription-factor binding reveal soft motif syntax.

Avsec, Ziga; Weilert, Melanie; Shrikumar, Avanti; Krueger, Sabrina; Alexandari, Amr; Dalal, Khyati; Fropf, Robin; McAnany, Charles; Gagneur, Julien; Kundaje, Anshul; Zeitlinger, Julia.

Nat Genet ; 53(3): 354-366, 2021 03.

Article in English | MEDLINE | ID: mdl-33603233

ABSTRACT

The arrangement (syntax) of transcription factor (TF) binding motifs is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution chromatin immunoprecipitation (ChIP)-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using clustered regularly interspaced short palindromic repeat (CRISPR)-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.

Subject(s)

Computational Biology/methods; Nucleotide Motifs; Transcription Factors/metabolism; Animals; Binding Sites; Chromatin Immunoprecipitation; Clustered Regularly Interspaced Short Palindromic Repeats; Deep Learning; Mice; Mouse Embryonic Stem Cells/physiology; Nanog Homeobox Protein/metabolism; Neural Networks, Computer; Octamer Transcription Factor-3/metabolism; Reproducibility of Results; SOXB1 Transcription Factors/metabolism

Opportunities and obstacles for deep learning in biology and medicine.

Ching, Travers; Himmelstein, Daniel S; Beaulieu-Jones, Brett K; Kalinin, Alexandr A; Do, Brian T; Way, Gregory P; Ferrero, Enrico; Agapow, Paul-Michael; Zietz, Michael; Hoffman, Michael M; Xie, Wei; Rosen, Gail L; Lengerich, Benjamin J; Israeli, Johnny; Lanchantin, Jack; Woloszynek, Stephen; Carpenter, Anne E; Shrikumar, Avanti; Xu, Jinbo; Cofer, Evan M; Lavender, Christopher A; Turaga, Srinivas C; Alexandari, Amr M; Lu, Zhiyong; Harris, David J; DeCaprio, Dave; Qi, Yanjun; Kundaje, Anshul; Peng, Yifan; Wiley, Laura K; Segler, Marwin H S; Boca, Simina M; Swamidass, S Joshua; Huang, Austin; Gitter, Anthony; Greene, Casey S.

J R Soc Interface ; 15(141)2018 04.

Article in English | MEDLINE | ID: mdl-29618526

ABSTRACT

Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.

Subject(s)

Biomedical Research/trends; Biomedical Technology/trends; Deep Learning/trends; Algorithms; Biomedical Research/methods; Decision Making; Delivery of Health Care/methods; Delivery of Health Care/trends; Disease/genetics; Drug Design; Electronic Health Records/trends; Humans; Terminology as Topic
