Search | VHL Regional Portal

1.

A systematic analysis of regression models for protein engineering.

Michael, Richard; Kæstel-Hansen, Jacob; Mørch Groth, Peter; Bartels, Simon; Salomon, Jesper; Tian, Pengfei; Hatzakis, Nikos S; Boomsma, Wouter.

PLoS Comput Biol ; 20(5): e1012061, 2024 May.

Article in English | MEDLINE | ID: mdl-38701099

ABSTRACT

To optimize proteins for particular traits holds great promise for industrial and pharmaceutical purposes. Machine Learning is increasingly applied in this field to predict properties of proteins, thereby guiding the experimental optimization process. A natural question is: How much progress are we making with such predictions, and how important is the choice of regressor and representation? In this paper, we demonstrate that different assessment criteria for regressor performance can lead to dramatically different conclusions, depending on the choice of metric, and how one defines generalization. We highlight the fundamental issues of sample bias in typical regression scenarios and how this can lead to misleading conclusions about regressor performance. Finally, we make the case for the importance of calibrated uncertainty in this domain.

Subject(s)

Computational Biology , Machine Learning , Protein Engineering , Protein Engineering/methods , Regression Analysis , Computational Biology/methods , Proteins/chemistry , Algorithms
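
The abstract's point that the choice of metric can change the conclusion about a regressor can be illustrated with a minimal sketch. The two metrics (RMSE and Spearman rank correlation) and all numbers below are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def ranks(values):
    """Rank values from 1..n (assumes no ties, for brevity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(y_true, y_pred):
    """Spearman rank correlation using the no-ties formula."""
    n = len(y_true)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(y_true), ranks(y_pred)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Toy values invented purely for illustration (hypothetical, not from the paper).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
model_a = [3.0, 4.0, 5.0, 6.0, 7.0]  # correct ordering, constant offset
model_b = [1.5, 1.4, 3.5, 3.4, 5.0]  # small errors, two neighbouring pairs swapped

for name, pred in (("A", model_a), ("B", model_b)):
    print(name, round(rmse(measured, pred), 3), round(spearman(measured, pred), 3))
```

In this toy case model B has the lower RMSE while model A has the higher rank correlation, so which model looks better depends on the metric chosen.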

2.

Deep learning assisted single particle tracking for automated correlation between diffusion and function.

Hatzakis, Nikos; Kaestel-Hansen, Jacob; de Sautu, Marilina; Saminathan, Anand; Scanavachi, Gustavo; Correia, Ricardo; Nielsen, Annette Juma; Bleshoey, Sara; Boomsma, Wouter; Kirchhausen, Tomas.

Res Sq ; 2024 Feb 02.

Article in English | MEDLINE | ID: mdl-38352328

ABSTRACT

Sub-cellular diffusion in living systems reflects cellular processes and interactions. Recent advances in optical microscopy allow the tracking of this nanoscale diffusion of individual objects with an unprecedented level of precision. However, the agnostic and automated extraction of functional information from the diffusion of molecules and organelles within the sub-cellular environment is labor-intensive and poses a significant challenge. Here we introduce DeepSPT, a deep learning framework to interpret the diffusional 2D or 3D temporal behavior of objects in a rapid and efficient manner, agnostically. Demonstrating its versatility, we have applied DeepSPT to automated mapping of the early events of viral infections, identifying distinct types of endosomal organelles, and clathrin-coated pits and vesicles with up to 95% accuracy and within seconds instead of weeks. The fact that DeepSPT effectively extracts biological information from diffusion alone illustrates that besides structure, motion encodes function at the molecular and subcellular level.

3.

Deep learning assisted single particle tracking for automated correlation between diffusion and function.

Kæstel-Hansen, Jacob; de Sautu, Marilina; Saminathan, Anand; Scanavachi, Gustavo; Da Cunha Correia, Ricardo F Bango; Nielsen, Annette Juma; Bleshøy, Sara Vogt; Boomsma, Wouter; Kirchhausen, Tom; Hatzakis, Nikos S.

bioRxiv ; 2023 Nov 17.

Article in English | MEDLINE | ID: mdl-38014323

ABSTRACT

Sub-cellular diffusion in living systems reflects cellular processes and interactions. Recent advances in optical microscopy allow the tracking of this nanoscale diffusion of individual objects with an unprecedented level of precision. However, the agnostic and automated extraction of functional information from the diffusion of molecules and organelles within the sub-cellular environment is labor-intensive and poses a significant challenge. Here we introduce DeepSPT, a deep learning framework to interpret the diffusional 2D or 3D temporal behavior of objects in a rapid and efficient manner, agnostically. Demonstrating its versatility, we have applied DeepSPT to automated mapping of the early events of viral infections, identifying distinct types of endosomal organelles, and clathrin-coated pits and vesicles with up to 95% accuracy and within seconds instead of weeks. The fact that DeepSPT effectively extracts biological information from diffusion alone indicates that besides structure, motion encodes function at the molecular and subcellular level.

4.

Phosphorylation of Schizosaccharomyces pombe Dss1 mediates direct binding to the ubiquitin-ligase Dma1 in vitro.

Jacobsen, Nina L; Bloch, Magnus; Millard, Peter S; Ruidiaz, Sarah F; Elsborg, Jonas D; Boomsma, Wouter; Hendus-Altenburger, Ruth; Hartmann-Petersen, Rasmus; Kragelund, Birthe B.

Protein Sci ; 32(9): e4733, 2023 09.

Article in English | MEDLINE | ID: mdl-37463013

ABSTRACT

Intrinsically disordered proteins (IDPs) are often multifunctional and frequently posttranslationally modified. Deleted in split hand/split foot 1 (Dss1-Sem1 in budding yeast) is a highly multifunctional IDP associated with a range of protein complexes. However, it remains unknown if the different functions relate to different modified states. In this work, we show that Schizosaccharomyces pombe Dss1 is a substrate for casein kinase 2 in vitro, and we identify three phosphorylated threonines in its linker region separating two known disordered ubiquitin-binding motifs. Phosphorylations of the threonines had no effect on ubiquitin-binding but caused a slight destabilization of the C-terminal α-helix and mediated a direct interaction with the forkhead-associated (FHA) domain of the RING-FHA E3-ubiquitin ligase defective in mitosis 1 (Dma1). The phosphorylation sites are not conserved and are absent in human Dss1. Sequence analyses revealed that the Txx(E/D) motif, which is important for phosphorylation and Dma1 binding, is not linked to certain branches of the evolutionary tree. Instead, we find that the motif appears randomly, supporting the mechanism of ex nihilo evolution of novel motifs. In support of this, other threonine-based motifs, although frequent, are nonconserved in the linker, pointing to additional functions connected to this region. We suggest that Dss1 acts as an adaptor protein that docks to Dma1 via the phosphorylated FHA-binding motifs, while the C-terminal α-helix is free to bind mitotic septins, thereby stabilizing the complex. The presence of Txx(D/E) motifs in the disordered regions of certain septin subunits may be of further relevance to the formation and stabilization of these complexes.

Subject(s)

Cell Cycle Proteins , Schizosaccharomyces pombe Proteins , Schizosaccharomyces , Ubiquitin-Protein Ligases , Humans , Cell Cycle Proteins/genetics , Cell Cycle Proteins/metabolism , Phosphorylation , Protein Binding , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism , Schizosaccharomyces pombe Proteins/genetics , Schizosaccharomyces pombe Proteins/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism

5.

Lysine deserts prevent adventitious ubiquitylation of ubiquitin-proteasome components.

Kampmeyer, Caroline; Grønbæk-Thygesen, Martin; Oelerich, Nicole; Tatham, Michael H; Cagiada, Matteo; Lindorff-Larsen, Kresten; Boomsma, Wouter; Hofmann, Kay; Hartmann-Petersen, Rasmus.

Cell Mol Life Sci ; 80(6): 143, 2023 May 09.

Article in English | MEDLINE | ID: mdl-37160462

ABSTRACT

In terms of its relative frequency, lysine is a common amino acid in the human proteome. However, by bioinformatics we find hundreds of proteins that contain long and evolutionarily conserved stretches completely devoid of lysine residues. These so-called lysine deserts show a high prevalence in intrinsically disordered proteins with known or predicted functions within the ubiquitin-proteasome system (UPS), including many E3 ubiquitin-protein ligases and UBL domain proteasome substrate shuttles, such as BAG6, RAD23A, UBQLN1 and UBQLN2. We show that introduction of lysine residues into the deserts leads to a striking increase in ubiquitylation of some of these proteins. In case of BAG6, we show that ubiquitylation is catalyzed by the E3 RNF126, while RAD23A is ubiquitylated by E6AP. Despite the elevated ubiquitylation, mutant RAD23A appears stable, but displays a partial loss of function phenotype in fission yeast. In case of UBQLN1 and BAG6, introducing lysine leads to a reduced abundance due to proteasomal degradation of the proteins. For UBQLN1 we show that arginine residues within the lysine depleted region are critical for its ability to form cytosolic speckles/inclusions. We propose that selective pressure to avoid lysine residues may be a common evolutionary mechanism to prevent unwarranted ubiquitylation and/or perhaps other lysine post-translational modifications. This may be particularly relevant for UPS components as they closely and frequently encounter the ubiquitylation machinery and are thus more susceptible to nonspecific ubiquitylation.

Subject(s)

Proteasome Endopeptidase Complex , Schizosaccharomyces , Humans , Ubiquitin , Lysine , Cytoplasm , Ubiquitination , Schizosaccharomyces/genetics , Molecular Chaperones , Autophagy-Related Proteins , Adaptor Proteins, Signal Transducing , Ubiquitin-Protein Ligases

6.

Rapid protein stability prediction using deep learning representations.

Blaabjerg, Lasse M; Kassem, Maher M; Good, Lydia L; Jonsson, Nicolas; Cagiada, Matteo; Johansson, Kristoffer E; Boomsma, Wouter; Stein, Amelie; Lindorff-Larsen, Kresten.

Elife ; 12, 2023 05 15.

Article in English | MEDLINE | ID: mdl-37184062

ABSTRACT

Predicting the thermodynamic stability of proteins is a common and widely used step in protein engineering, and when elucidating the molecular mechanisms behind evolution and disease. Here, we present RaSP, a method for making rapid and accurate predictions of changes in protein stability by leveraging deep learning representations. RaSP performs on par with biophysics-based methods and enables saturation mutagenesis stability predictions in less than a second per residue. We use RaSP to calculate ~230 million stability changes for nearly all single amino acid changes in the human proteome, and examine variants observed in the human population. We find that variants that are common in the population are substantially depleted for severe destabilization, and that there are substantial differences between benign and pathogenic variants, highlighting the role of protein stability in genetic diseases. RaSP is freely available, including via a Web interface, and enables large-scale analyses of stability in experimental and predicted protein structures.

Subject(s)

Deep Learning , Humans , Proteins/metabolism , Mutagenesis , Amino Acids/genetics , Protein Stability , Computational Biology/methods

7.

A context-dependent and disordered ubiquitin-binding motif.

Dreier, Jesper E; Prestel, Andreas; Martins, João M; Brøndum, Sebastian S; Nielsen, Olaf; Garbers, Anna E; Suga, Hiroaki; Boomsma, Wouter; Rogers, Joseph M; Hartmann-Petersen, Rasmus; Kragelund, Birthe B.

Cell Mol Life Sci ; 79(9): 484, 2022 Aug 16.

Article in English | MEDLINE | ID: mdl-35974206

ABSTRACT

Ubiquitin is a small, globular protein that is conjugated to other proteins as a posttranslational event. A palette of small, folded domains recognizes and binds ubiquitin to translate and effectuate this posttranslational signal. Recent computational studies have suggested that protein regions can recognize ubiquitin via a process of folding upon binding. Using peptide binding arrays, bioinformatics, and NMR spectroscopy, we have uncovered a disordered ubiquitin-binding motif that likely remains disordered when bound and thus expands the palette of ubiquitin-binding proteins. We term this motif Disordered Ubiquitin-Binding Motif (DisUBM) and find it to be present in many proteins with known or predicted functions in degradation and transcription. We decompose the determinants of the motif showing it to rely on features of aromatic and negatively charged residues, and less so on distinct sequence positions in line with its disordered nature. We show that the affinity of the motif is low and moldable by the surrounding disordered chain, allowing for an enhanced interaction surface with ubiquitin, whereby the affinity increases ~ tenfold. Further affinity optimization using peptide arrays pushed the affinity into the low micromolar range, but compromised context dependence. Finally, we find that DisUBMs can emerge from unbiased screening of randomized peptide libraries, featuring in de novo cyclic peptides selected to bind ubiquitin chains. We suggest that naturally occurring DisUBMs can recognize ubiquitin as a posttranslational signal to act as affinity enhancers in IDPs that bind to folded and ubiquitylated binding partners.

Subject(s)

Intrinsically Disordered Proteins , Proteins , Amino Acid Sequence , Intrinsically Disordered Proteins/chemistry , Peptides/metabolism , Protein Binding , Proteins/metabolism , Ubiquitin/metabolism

8.

Learning meaningful representations of protein sequences.

Detlefsen, Nicki Skafte; Hauberg, Søren; Boomsma, Wouter.

Nat Commun ; 13(1): 1914, 2022 04 08.

Article in English | MEDLINE | ID: mdl-35395843

ABSTRACT

How we choose to represent our data has a fundamental impact on our ability to subsequently extract information from them. Machine learning promises to automatically determine efficient representations from large unstructured datasets, such as those arising in biology. However, empirical evidence suggests that seemingly minor changes to these machine learning models yield drastically different data representations that result in different biological interpretations of data. This begs the question of what even constitutes the most meaningful representation. Here, we approach this question for representations of protein sequences, which have received considerable attention in the recent literature. We explore two key contexts in which representations naturally arise: transfer learning and interpretable learning. In the first context, we demonstrate that several contemporary practices yield suboptimal performance, and in the latter we demonstrate that taking representation geometry into account significantly improves interpretability and lets the models reveal biological information that is otherwise obscured.

Subject(s)

Machine Learning , Amino Acid Sequence

9.

Single-particle diffusional fingerprinting: A machine-learning framework for quantitative analysis of heterogeneous diffusion.

Pinholt, Henrik D; Bohr, Søren S-R; Iversen, Josephine F; Boomsma, Wouter; Hatzakis, Nikos S.

Proc Natl Acad Sci U S A ; 118(31), 2021 08 03.

Article in English | MEDLINE | ID: mdl-34321355

ABSTRACT

Single-particle tracking (SPT) is a key tool for quantitative analysis of dynamic biological processes and has provided unprecedented insights into a wide range of systems such as receptor localization, enzyme propulsion, bacteria motility, and drug nanocarrier delivery. The inherently complex diffusion in such biological systems can vary drastically both in time and across systems, consequently imposing considerable analytical challenges, and currently requires an a priori knowledge of the system. Here we introduce a method for SPT data analysis, processing, and classification, which we term "diffusional fingerprinting." This method allows for dissecting the features that underlie diffusional behavior and establishing molecular identity, regardless of the underlying diffusion type. The method operates by isolating 17 descriptive features for each observed motion trajectory and generating a diffusional map of all features for each type of particle. Precise classification of the diffusing particle identity is then obtained by training a simple logistic regression model. A linear discriminant analysis generates a feature ranking that outputs the main differences among diffusional features, providing key mechanistic insights. Fingerprinting operates by both training on and predicting experimental data, without the need for pretraining on simulated data. We found this approach to work across a wide range of simulated and experimentally diverse systems, such as tracked lipases on fat substrates, transcription factors diffusing in cells, and nanoparticles diffusing in mucus. This flexibility ultimately supports diffusional fingerprinting's utility as a universal paradigm for SPT diffusional analysis and prediction.

Subject(s)

Machine Learning , Single Molecule Imaging/methods , Computer Simulation , Diffusion , Image Interpretation, Computer-Assisted , Movement , Particle Size
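
The feature-extraction step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract mentions 17 descriptive features without listing them, so the three features below (mean step length, mean squared displacement at lag 1, and a crude anomalous exponent) and the function names are assumptions. In the published method such feature vectors are passed to a logistic regression classifier, which is omitted here.

```python
import math

def step_lengths(track):
    """Euclidean distance between consecutive positions of one trajectory."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]

def msd(track, lag):
    """Mean squared displacement at a given time lag (in frames)."""
    disp = [math.dist(track[i], track[i + lag]) ** 2 for i in range(len(track) - lag)]
    return sum(disp) / len(disp)

def fingerprint(track):
    """Small, hypothetical feature vector for one trajectory."""
    steps = step_lengths(track)
    # Crude exponent from two lags, assuming MSD ~ lag**alpha.
    alpha = math.log(msd(track, 2) / msd(track, 1)) / math.log(2)
    return {
        "mean_step": sum(steps) / len(steps),
        "msd_lag1": msd(track, 1),
        "alpha": alpha,
    }

# Synthetic, perfectly directed 2D track (illustrative only).
directed = [(float(i), 0.0) for i in range(6)]
features = fingerprint(directed)
print(features)
```

For the synthetic directed track the exponent comes out as 2, the value expected for ballistic motion; freely diffusing particles would give values near 1.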

10.

Developing and validating COVID-19 adverse outcome risk prediction models from a bi-national European cohort of 5594 patients.

Jimenez-Solem, Espen; Petersen, Tonny S; Hansen, Casper; Hansen, Christian; Lioma, Christina; Igel, Christian; Boomsma, Wouter; Krause, Oswin; Lorenzen, Stephan; Selvan, Raghavendra; Petersen, Janne; Nyeland, Martin Erik; Ankarfeldt, Mikkel Zöllner; Virenfeldt, Gert Mehl; Winther-Jensen, Matilde; Linneberg, Allan; Ghazi, Mostafa Mehdipour; Detlefsen, Nicki; Lauritzen, Andreas David; Smith, Abraham George; de Bruijne, Marleen; Ibragimov, Bulat; Petersen, Jens; Lillholm, Martin; Middleton, Jon; Mogensen, Stine Hasling; Thorsen-Meyer, Hans-Christian; Perner, Anders; Helleberg, Marie; Kaas-Hansen, Benjamin Skov; Bonde, Mikkel; Bonde, Alexander; Pai, Akshay; Nielsen, Mads; Sillesen, Martin.

Sci Rep ; 11(1): 3246, 2021 02 05.

Article in English | MEDLINE | ID: mdl-33547335

ABSTRACT

Patients with severe COVID-19 have overwhelmed healthcare systems worldwide. We hypothesized that machine learning (ML) models could be used to predict risks at different stages of management and thereby provide insights into drivers and prognostic markers of disease progression and death. From a cohort of approx. 2.6 million citizens in Denmark, SARS-CoV-2 PCR tests were performed on subjects suspected for COVID-19 disease; 3944 cases had at least one positive test and were subjected to further analysis. SARS-CoV-2 positive cases from the United Kingdom Biobank were used for external validation. The ML models predicted the risk of death (Receiver Operating Characteristic-Area Under the Curve, ROC-AUC) of 0.906 at diagnosis, 0.818 at hospital admission and 0.721 at Intensive Care Unit (ICU) admission. Similar metrics were achieved for predicted risks of hospital and ICU admission and use of mechanical ventilation. Common risk factors included age, body mass index and hypertension, although the top risk features shifted towards markers of shock and organ dysfunction in ICU patients. The external validation indicated fair predictive performance for mortality prediction, but suboptimal performance for predicting ICU admission. ML may be used to identify drivers of progression to more severe disease and for prognostication in patients with COVID-19. We provide access to an online risk calculator based on these findings.

Subject(s)

COVID-19/diagnosis , COVID-19/mortality , Computer Simulation , Machine Learning , Age Factors , Aged , Aged, 80 and over , Body Mass Index , COVID-19/complications , COVID-19/physiopathology , Comorbidity , Critical Care , Female , Hospitalization , Humans , Hypertension/complications , Intensive Care Units , Male , Middle Aged , Prognosis , Prospective Studies , ROC Curve , Respiration, Artificial , Risk Factors , Sex Factors

11.

Orchestration of signaling by structural disorder in class 1 cytokine receptors.

Seiffert, Pernille; Bugge, Katrine; Nygaard, Mads; Haxholm, Gitte W; Martinsen, Jacob H; Pedersen, Martin N; Arleth, Lise; Boomsma, Wouter; Kragelund, Birthe B.

Cell Commun Signal ; 18(1): 132, 2020 08 24.

Article in English | MEDLINE | ID: mdl-32831102

ABSTRACT

BACKGROUND: Class 1 cytokine receptors (C1CRs) are single-pass transmembrane proteins responsible for transmitting signals between the outside and the inside of cells. Remarkably, they orchestrate key biological processes such as proliferation, differentiation, immunity and growth through long disordered intracellular domains (ICDs), but without having intrinsic kinase activity. Despite these key roles, their characteristics remain rudimentarily understood. METHODS: The current paper asks the question of why disorder has evolved to govern signaling of C1CRs by reviewing the literature in combination with new sequence and biophysical analyses of chain properties across the family. RESULTS: We uncover that the C1CR-ICDs are fully disordered and brimming with SLiMs. Many of these short linear motifs (SLiMs) are overlapping, jointly signifying a complex regulation of interactions, including network rewiring by isoforms. The C1CR-ICDs have unique properties that distinguish them from most IDPs and we forward the perception that the C1CR-ICDs are far from simple strings with constitutively bound kinases. Rather, they carry both organizational and operational features left uncovered within their disorder, including mechanisms and complexities of regulatory functions. CONCLUSIONS: Critically, the understanding of the fascinating ability of these long, completely disordered chains to orchestrate complex cellular signaling pathways is still in its infancy, and we urge a perceptional shift away from the current simplistic view towards uncovering their full functionalities and potential. Video abstract.

Subject(s)

Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Receptors, Cytokine/chemistry , Receptors, Cytokine/metabolism , Signal Transduction , Amino Acid Motifs , Amino Acid Sequence , Humans , Protein Conformation , Protein Isoforms/chemistry , Protein Isoforms/metabolism

12.

IDDomainSpotter: Compositional bias reveals domains in long disordered protein regions-Insights from transcription factors.

Millard, Peter S; Bugge, Katrine; Marabini, Riccardo; Boomsma, Wouter; Burow, Meike; Kragelund, Birthe B.

Protein Sci ; 29(1): 169-183, 2020 01.

Article in English | MEDLINE | ID: mdl-31642121

ABSTRACT

Protein domains constitute regions of distinct structural properties and molecular functions that are retained when removed from the rest of the protein. However, due to the lack of tertiary structure, the identification of domains has been largely neglected for long (>50 residues) intrinsically disordered regions. Here we present a sequence-based approach to assess and visualize domain organization in long intrinsically disordered regions based on compositional sequence biases. An online tool to find putative intrinsically disordered domains (IDDomainSpotter) in any protein sequence or sequence alignment using any particular sequence trait is available at http://www.bio.ku.dk/sbinlab/IDDomainSpotter. Using this tool, we have identified a putative domain enriched in hydrophilic and disorder-promoting residues (Pro, Ser, and Thr) and depleted in positive charges (Arg and Lys) bordering the folded DNA-binding domains of several transcription factors (p53, GCR, NAC46, MYB28, and MYB29). This domain, from two different MYB transcription factors, was characterized biophysically to determine its properties. Our analyses show the domain to be extended, dynamic and highly disordered. It connects the DNA-binding domain to other disordered domains and is present and conserved in several transcription factors from different families and domains of life. This example illustrates the potential of IDDomainSpotter to predict, from sequence alone, putative domains of functional interest in otherwise uncharacterized disordered proteins.

Subject(s)

Arabidopsis Proteins/chemistry , Arabidopsis Proteins/genetics , Arabidopsis/chemistry , Arabidopsis/genetics , Transcription Factors/chemistry , Transcription Factors/genetics , Amino Acid Sequence , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Bias , Binding Sites , Histone Acetyltransferases , Humans , Models, Molecular , Protein Binding , Protein Domains , Protein Unfolding , Scattering, Small Angle , Transcription Factors/metabolism , X-Ray Diffraction
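
The idea of locating domains through compositional sequence bias can be sketched with a sliding window. This is only an illustration under stated assumptions: the scoring rule (fraction of enriched residues minus fraction of depleted residues), the window size, and the function name are hypothetical and are not claimed to match the IDDomainSpotter tool. The residue sets follow the abstract (Pro, Ser, Thr enriched; Arg, Lys depleted).

```python
def composition_profile(seq, enriched="PST", depleted="RK", window=15):
    """Per-window score: fraction of enriched residues minus fraction of depleted ones."""
    scores = []
    for start in range(len(seq) - window + 1):
        win = seq[start:start + window]
        plus = sum(win.count(aa) for aa in enriched)
        minus = sum(win.count(aa) for aa in depleted)
        scores.append((plus - minus) / window)
    return scores

# Artificial sequence: a positively charged stretch followed by a Pro/Ser/Thr-rich one.
toy = "KRKRKRKRKR" + "PSTPSTPSTP"
profile = composition_profile(toy, window=10)
print([round(s, 1) for s in profile])
```

The profile rises from -1.0 over the charged stretch to 1.0 over the Pro/Ser/Thr-rich stretch, and the position where it crosses zero marks a candidate domain boundary.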

13.

Random coil chemical shifts for serine, threonine and tyrosine phosphorylation over a broad pH range.

Hendus-Altenburger, Ruth; Fernandes, Catarina B; Bugge, Katrine; Kunze, Micha B A; Boomsma, Wouter; Kragelund, Birthe B.

J Biomol NMR ; 73(12): 713-725, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31598803

ABSTRACT

Phosphorylation is one of the main regulators of cellular signaling typically occurring in flexible parts of folded proteins and in intrinsically disordered regions. It can have distinct effects on the chemical environment as well as on the structural properties near the modification site. Secondary chemical shift analysis is the main NMR method for detection of transiently formed secondary structure in intrinsically disordered proteins (IDPs) and the reliability of the analysis depends on an appropriate choice of random coil model. Random coil chemical shifts and sequence correction factors were previously determined for an Ac-QQXQQ-NH2-peptide series with X being any of the 20 common amino acids. However, a matching dataset on the phosphorylated states has so far only been incompletely determined or determined only at a single pH value. Here we extend the database by the addition of the random coil chemical shifts of the phosphorylated states of serine, threonine and tyrosine measured over a range of pH values covering the pKas of the phosphates and at several temperatures (www.bio.ku.dk/sbinlab/randomcoil). The combined results allow for accurate random coil chemical shift determination of phosphorylated regions at any pH and temperature, minimizing systematic biases of the secondary chemical shifts. Comparison of chemical shifts using random coil sets with and without inclusion of the phosphoryl group, revealed under/over estimations of helicity of up to 33%. The expanded set of random coil values will improve the reliability in detection and quantification of transient secondary structure in phosphorylation-modified IDPs.

Subject(s)

Amino Acids/metabolism , Intrinsically Disordered Proteins/chemistry , Nuclear Magnetic Resonance, Biomolecular/methods , Hydrogen-Ion Concentration , Phosphorylation , Protein Structure, Secondary , Serine/metabolism , Temperature , Threonine/metabolism , Tyrosine/metabolism
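
Secondary chemical shift analysis, as described in the abstract, subtracts a random coil reference from the observed shift, so the result depends directly on which reference set is used. The sketch below shows that dependence. The residue label and all shift values are placeholders invented for illustration; they are not measured values and not entries from the authors' database.

```python
def secondary_shifts(observed, random_coil):
    """Secondary chemical shift per residue: observed minus random coil (ppm)."""
    return {res: observed[res] - random_coil[res] for res in observed}

# Placeholder numbers for illustration only; these are not measured values.
observed_ca = {"pSer12": 58.0}
rc_unmodified = {"pSer12": 58.3}  # hypothetical reference ignoring the phosphoryl group
rc_phospho = {"pSer12": 57.6}     # hypothetical reference for the phosphorylated state

delta_wrong_reference = secondary_shifts(observed_ca, rc_unmodified)
delta_matched_reference = secondary_shifts(observed_ca, rc_phospho)
print(delta_wrong_reference, delta_matched_reference)
```

With these placeholder values the sign of the secondary shift flips depending on the reference, which is the kind of systematic bias in estimated helicity that a matched phosphorylated reference set is meant to remove.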

14.

The PCNA interaction motifs revisited: thinking outside the PIP-box.

Prestel, Andreas; Wichmann, Nanna; Martins, Joao M; Marabini, Riccardo; Kassem, Noah; Broendum, Sebastian S; Otterlei, Marit; Nielsen, Olaf; Willemoës, Martin; Ploug, Michael; Boomsma, Wouter; Kragelund, Birthe B.

Cell Mol Life Sci ; 76(24): 4923-4943, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31134302

ABSTRACT

Proliferating cell nuclear antigen (PCNA) is a cellular hub in DNA metabolism and a potential drug target. Its binding partners carry a short linear motif (SLiM) known as the PCNA-interacting protein-box (PIP-box), but sequence-divergent motifs have been reported to bind to the same binding pocket. To investigate how PCNA accommodates motif diversity, we assembled a set of 77 experimentally confirmed PCNA-binding proteins and analyzed features underlying their binding affinity. Combining NMR spectroscopy, affinity measurements and computational analyses, we corroborate that most PCNA-binding motifs reside in intrinsically disordered regions, that structure preformation is unrelated to affinity, and that the sequence-patterns that encode binding affinity extend substantially beyond the boundaries of the PIP-box. Our systematic multidisciplinary approach expands current views on PCNA interactions and reveals that the PIP-box affinity can be modulated over four orders of magnitude by positive charges in the flanking regions. Including the flanking regions as part of the motif is expected to have broad implications, particularly for interpretation of disease-causing mutations and drug-design, targeting DNA-replication and -repair.

Subject(s)

Amino Acid Motifs/genetics , DNA-Binding Proteins/chemistry , DNA/chemistry , Proliferating Cell Nuclear Antigen/chemistry , DNA/genetics , DNA Repair/genetics , DNA Replication/genetics , DNA-Binding Proteins/genetics , Humans , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/genetics , Magnetic Resonance Spectroscopy , Proliferating Cell Nuclear Antigen/genetics , Protein Conformation

15.

Barnaba: software for analysis of nucleic acid structures and trajectories.

Bottaro, Sandro; Bussi, Giovanni; Pinamonti, Giovanni; Reißer, Sabine; Boomsma, Wouter; Lindorff-Larsen, Kresten.

RNA ; 25(2): 219-231, 2019 02.

Article in English | MEDLINE | ID: mdl-30420522

ABSTRACT

RNA molecules are highly dynamic systems characterized by a complex interplay between sequence, structure, dynamics, and function. Molecular simulations can potentially provide powerful insights into the nature of these relationships. The analysis of structures and molecular trajectories of nucleic acids can be nontrivial because it requires processing very high-dimensional data that are not easy to visualize and interpret. Here we introduce Barnaba, a Python library aimed at facilitating the analysis of nucleic acid structures and molecular simulations. The software consists of a variety of analysis tools that allow the user to (i) calculate distances between three-dimensional structures using different metrics, (ii) back-calculate experimental data from three-dimensional structures, (iii) perform cluster analysis and dimensionality reductions, (iv) search three-dimensional motifs in PDB structures and trajectories, and (v) construct elastic network models for nucleic acids and nucleic acids-protein complexes. In addition, Barnaba makes it possible to calculate torsion angles, pucker conformations, and to detect base-pairing/base-stacking interactions. Barnaba produces graphics that conveniently visualize both extended secondary structure and dynamics for a set of molecular conformations. The software is available as a command-line tool as well as a library, and supports a variety of file formats such as PDB, dcd, and xtc files. Source code, documentation, and examples are freely available at https://github.com/srnas/barnaba under GNU GPLv3 license.

Subject(s)

Computational Biology/methods , Nucleic Acid Conformation , RNA/ultrastructure , Software , Base Pairing/genetics , Databases, Protein , Models, Molecular

16.

Monte Carlo Sampling of Protein Folding by Combining an All-Atom Physics-Based Model with a Native State Bias.

Wang, Yong; Tian, Pengfei; Boomsma, Wouter; Lindorff-Larsen, Kresten.

J Phys Chem B ; 122(49): 11174-11185, 2018 12 13.

Artículo en Inglés | MEDLINE | ID: mdl-30141937

RESUMEN

Energy landscape theory suggests that native interactions are a major determinant of the folding mechanism of a protein. Thus, structure-based (GoÌ) models have, aided by coarse-graining techniques, shown great success in capturing the mechanisms of protein folding and conformational changes. In certain cases, however, non-native interactions and atomic details are also essential to describe the protein dynamics, prompting the development of a variety of structure-based models that include non-native interactions, and differentiate between different types of attractive potentials. Here, we describe an all-protein-atom hybrid model, termed ProfasiGo, that integrates an implicit solvent all-atom physics-based model (called Profasi) and a structure-based GoÌ potential and its implementation in two software packages (PHAISTOS and ProFASi) that are developed for Monte Carlo sampling of protein molecules. We apply the ProfasiGo model to study the folding free energy landscapes of four topologically similar proteins, one of which can be folded by the simplified potential Profasi and two that have been folded by explicit solvent, all-atom molecular dynamics simulations with the CHARMM22* force field. Our results reveal that the hybrid ProfasiGo model is able to capture many of the details present in the physics-based potentials while retaining the advantages of GoÌ models for sampling and guiding to the native state. We expect that the model will be widely applicable to the study of the folding of more complex proteins or to the study of conformational dynamics and integration with experimental data.

Subject(s)

Homeodomain Proteins/chemistry , Monte Carlo Method , Protein Folding , Algorithms , Molecular Dynamics Simulation , Protein Domains , Thermodynamics

17.

Driving Structural Transitions in Molecular Simulations Using the Nonequilibrium Candidate Monte Carlo.

Kurut, Anil; Fonseca, Rasmus; Boomsma, Wouter.

J Phys Chem B ; 122(3): 1195-1204, 2018 Jan 25.

Article in English | MEDLINE | ID: mdl-29260565

ABSTRACT

Hybrid simulation procedures which combine molecular dynamics with Monte Carlo are attracting increasing attention as tools for improving the sampling efficiency in molecular simulations. In particular, encouraging results have been reported for nonequilibrium candidate protocols, in which a Monte Carlo move is applied gradually, and interleaved with a process that equilibrates the remaining degrees of freedom. Although initial studies have uncovered a substantial potential of the method, its practical applicability for sampling structural transitions in macromolecules remains incompletely understood. Here, we address this issue by systematically investigating the efficiency of the nonequilibrium candidate Monte Carlo on the sampling of rotameric distributions of two peptide systems at atomistic resolution both in vacuum and explicit solvent. The studied systems allow us to directly probe the efficiency with which a single or a few slow degrees of freedom can be driven between well-separated free-energy minima and to explore the sensitivity of the method toward the involved free parameters. In line with results on other systems, our study suggests that order-of-magnitude gains can be obtained in certain scenarios but also identifies challenges that arise when applying the procedure in explicit solvent.

18.

Structure of the Bacterial Cytoskeleton Protein Bactofilin by NMR Chemical Shifts and Sequence Variation.

Kassem, Maher M; Wang, Yong; Boomsma, Wouter; Lindorff-Larsen, Kresten.

Biophys J ; 110(11): 2342-2348, 2016 Jun 07.

Article in English | MEDLINE | ID: mdl-27276252

ABSTRACT

Bactofilins constitute a recently discovered class of bacterial proteins that form cytoskeletal filaments. They share a highly conserved domain (DUF583) of which the structure remains unknown, in part due to the large size and noncrystalline nature of the filaments. Here, we describe the atomic structure of a bactofilin domain from Caulobacter crescentus. To determine the structure, we developed an approach that combines a biophysical model for proteins with recently obtained solid-state NMR spectroscopy data and amino acid contacts predicted from a detailed analysis of the evolutionary history of bactofilins. Our structure reveals a triangular β-helical (solenoid) conformation with conserved residues forming the tightly packed core and polar residues lining the surface. The repetitive structure explains the presence of internal repeats as well as strongly conserved positions, and is reminiscent of other fibrillar proteins. Our work provides a structural basis for future studies of bactofilin biology and for designing molecules that target them, as well as a starting point for determining the organization of the entire bactofilin filament. Finally, our approach presents new avenues for determining structures that are difficult to obtain by traditional means.

Subject(s)

Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Cytoskeleton/chemistry , Cytoskeleton/genetics , Amino Acid Sequence , Caulobacter crescentus , Computer Simulation , Molecular Models , Monte Carlo Method , Biomolecular Nuclear Magnetic Resonance , Protein Secondary Structure , Surface Properties

19.

Rapid expansion of the protein disulfide isomerase gene family facilitates the folding of venom peptides.

Safavi-Hemami, Helena; Li, Qing; Jackson, Ronneshia L; Song, Albert S; Boomsma, Wouter; Bandyopadhyay, Pradip K; Gruber, Christian W; Purcell, Anthony W; Yandell, Mark; Olivera, Baldomero M; Ellgaard, Lars.

Proc Natl Acad Sci U S A ; 113(12): 3227-32, 2016 Mar 22.

Article in English | MEDLINE | ID: mdl-26957604

ABSTRACT

Formation of correct disulfide bonds in the endoplasmic reticulum is a crucial step for folding proteins destined for secretion. Protein disulfide isomerases (PDIs) play a central role in this process. We report a previously unidentified, hypervariable family of PDIs that represents the most diverse gene family of oxidoreductases described in a single genus to date. These enzymes are highly expressed specifically in the venom glands of predatory cone snails, animals that synthesize a remarkably diverse set of cysteine-rich peptide toxins (conotoxins). Enzymes in this PDI family, termed conotoxin-specific PDIs, significantly and differentially accelerate the kinetics of disulfide-bond formation of several conotoxins. Our results are consistent with a unique biological scenario associated with protein folding: The diversification of a family of foldases can be correlated with the rapid evolution of an unprecedented diversity of disulfide-rich structural domains expressed by venomous marine snails in the superfamily Conoidea.

Subject(s)

Mollusk Venoms/chemistry , Peptides/chemistry , Protein Disulfide-Isomerases/genetics , Amino Acid Sequence , Animals , Conus Snail , Molecular Sequence Data , Protein Disulfide-Isomerases/chemistry , Protein Folding , Amino Acid Sequence Homology

20.

Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases.

Boomsma, Wouter; Nielsen, Sofie V; Lindorff-Larsen, Kresten; Hartmann-Petersen, Rasmus; Ellgaard, Lars.

PeerJ ; 4: e1725, 2016.

Article in English | MEDLINE | ID: mdl-26966660

ABSTRACT

The ubiquitin-proteasome system targets misfolded proteins for degradation. Since the accumulation of such proteins is potentially harmful for the cell, their prompt removal is important. E3 ubiquitin-protein ligases mediate substrate ubiquitination by bringing together the substrate with an E2 ubiquitin-conjugating enzyme, which transfers ubiquitin to the substrate. For misfolded proteins, substrate recognition is generally delegated to molecular chaperones that subsequently interact with specific E3 ligases. An important exception is San1, a yeast E3 ligase. San1 harbors extensive regions of intrinsic disorder, which provide both conformational flexibility and sites for direct recognition of misfolded targets of vastly different conformations. So far, no mammalian ortholog of San1 is known, nor is it clear whether other E3 ligases utilize disordered regions for substrate recognition. Here, we conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology of their ordered regions, and did not capture the unique disorder patterns that encode the functional mechanism of San1. However, by searching specifically for key features of the San1 sequence, such as long regions of intrinsic disorder embedded with short stretches predicted to be suitable for substrate interaction, we identified several E3 ligases with these characteristics. Our initial analysis revealed that another remarkable trait of San1 is shared with several candidate E3 ligases: long stretches of complete lysine suppression, which in San1 limits auto-ubiquitination. We encode these characteristic features into a San1 similarity-score, and present a set of proteins that are plausible candidates as San1 counterparts in humans. 
In conclusion, our work indicates that San1 is not a unique case, and that several other yeast and human E3 ligases have sequence properties that may allow them to recognize substrates by a similar mechanism as San1.
