1.
PLoS Comput Biol ; 20(5): e1012144, 2024 May.
Article in English | MEDLINE | ID: mdl-38781245

ABSTRACT

Intrinsically disordered proteins have dynamic structures through which they play key biological roles. The elucidation of their conformational ensembles is a challenging problem requiring an integrated use of computational and experimental methods. Molecular simulations are a valuable computational strategy for constructing structural ensembles of disordered proteins but are highly resource-intensive. Recently, machine learning approaches based on deep generative models that learn from simulation data have emerged as an efficient alternative for generating structural ensembles. However, such methods currently suffer from limited transferability when modeling sequences and conformations absent in the training data. Here, we develop a novel generative model that achieves high levels of transferability for intrinsically disordered protein ensembles. The approach, named idpSAM, is a latent diffusion model based on transformer neural networks. It combines an autoencoder to learn a representation of protein geometry and a diffusion model to sample novel conformations in the encoded space. IdpSAM was trained on a large dataset of simulations of disordered protein regions performed with the ABSINTH implicit solvent model. Thanks to the expressiveness of its neural networks and its training stability, idpSAM faithfully captures 3D structural ensembles of test sequences that have no similarity to those in the training set. Our study also demonstrates the potential for generating full conformational ensembles from datasets with limited sampling and underscores the importance of training set size for generalization. We believe that idpSAM represents significant progress in transferable protein ensemble modeling through machine learning.
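
The abstract above describes a diffusion model that samples conformations in an encoded (latent) space. As a rough illustration of what "diffusion" means in that setting, the following minimal sketch shows the forward noising step of a generic DDPM-style process applied to a latent vector. It is not idpSAM's implementation: the linear variance schedule, the function names, and the toy latent vector are all assumptions made for this example.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return a linearly spaced noise-variance schedule (an assumed, common choice)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return the running product of (1 - beta_t), usually written alpha-bar_t."""
    alpha_bars = []
    product = 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def noise_latent(z0, t, alpha_bars, rng):
    """Sample z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1)."""
    a = alpha_bars[t]
    return [
        math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for z in z0
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    latent = [0.5, -1.0, 2.0]  # hypothetical encoded conformation
    print(noise_latent(latent, 10, alpha_bars, rng))   # still close to the input
    print(noise_latent(latent, 999, alpha_bars, rng))  # essentially pure noise
```

In a full latent diffusion model, a neural network would be trained to reverse this noising process, and a decoder would map the denoised latent vectors back to 3D coordinates; neither part is shown here.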


Subjects
Computational Biology , Intrinsically Disordered Proteins , Neural Networks, Computer , Protein Conformation , Intrinsically Disordered Proteins/chemistry , Computational Biology/methods , Models, Molecular , Machine Learning , Deep Learning , Algorithms , Databases, Protein
2.
Bioinformatics ; 37(10): 1471-1472, 2021 06 16.
Article in English | MEDLINE | ID: mdl-33010156

ABSTRACT

SUMMARY: The PyMod project is designed to act as a fully integrated interface between the popular molecular graphics viewer PyMOL, and some of the most frequently used tools for structural bioinformatics, e.g. BLAST, HMMER, Clustal, MUSCLE, PSIPRED, DOPE and MODELLER. Here we report its latest release, PyMod 3, which has been completely renewed with a graphical interface written in PyQt, to make it compatible with the most recent PyMOL versions, and has been extended with a large set of new functionalities compared to its predecessor, i.e. PyMod 2. Starting from the amino acid sequence of a target protein, users can take advantage of PyMod 3 to carry out all the steps of the homology modeling process (i.e. template searching, target-template sequence alignment, model building and quality assessment). Additionally, the integrated tools in PyMod 3 may also be used alone, in order to extend PyMOL with a wide range of capabilities. Sequence similarity searches, multiple sequence/structure alignment building, phylogenetic trees and evolutionary conservation analyses, domain parsing, single/multiple chains and loop modeling can be performed in the PyMod 3/PyMOL environment. AVAILABILITY AND IMPLEMENTATION: A cross-platform PyMod 3 installer package for Windows, Linux and Mac OS X and a complete user guide with tutorials, are available at https://github.com/pymodproject/pymod. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology , Software , Phylogeny , Proteins , Sequence Alignment
3.
Proteins ; 89(12): 1870-1887, 2021 12.
Article in English | MEDLINE | ID: mdl-34156124

ABSTRACT

Protein structure refinement is the last step in protein structure prediction pipelines. Physics-based refinement via molecular dynamics (MD) simulations has made significant progress during recent years. During CASP14, we tested a new refinement protocol based on an improved sampling strategy via MD simulations. MD simulations were carried out at an elevated temperature (360 K). An optimized use of biasing restraints and the use of multiple starting models led to enhanced sampling. The new protocol generally improved the model quality. In comparison with our previous protocols, the CASP14 protocol showed clear improvements. Our approach was successful with most initial models, many based on deep learning methods. However, we found that our approach was not able to refine machine-learning models from the AlphaFold2 group, often decreasing already high initial qualities. To better understand the role of refinement given new types of models based on machine-learning, a detailed analysis via MD simulations and Markov state modeling is presented here. We continue to find that MD-based refinement has the potential to improve AI predictions. We also identified several practical issues that make it difficult to realize that potential. Increasingly important is the consideration of inter-domain and oligomeric contacts in simulations; the presence of large kinetic barriers in refinement pathways also continues to present challenges. Finally, we provide a perspective on how physics-based refinement could continue to play a role in the future for improving initial predictions based on machine learning-based methods.


Subjects
Artificial Intelligence , Computational Biology/methods , Molecular Dynamics Simulation , Protein Conformation , Proteins , Software , Markov Chains , Physical Phenomena , Proteins/chemistry , Proteins/metabolism
4.
PLoS Comput Biol ; 15(12): e1007219, 2019 12.
Article in English | MEDLINE | ID: mdl-31846452

ABSTRACT

The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the "modeling by satisfaction of spatial restraints" strategy and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the accuracy with which the structural variability between a target protein and its templates is estimated, in the form of σ values, profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated with the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program's predictions, especially in multiple-template modeling. Secondly, we have examined how incorporating statistical potential terms (such as the DOPE potential) into MODELLER's objective function positively impacts 3D modeling quality, providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is ample room for improving MODELLER in terms of 3D modeling quality and we propose strategies that could be pursued in order to further increase its performance.
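
The first strategy in the abstract above rests on measuring how well estimated σ values correlate with the true structural divergence between target and template. A minimal sketch of such a check is given below, assuming a plain Pearson correlation over per-restraint values. The abstract does not state which correlation measure the authors used, and the function name and toy numbers are hypothetical, chosen only to show the calculation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values, not data from the paper: estimated sigma per restraint and
    # the observed target-template distance deviation for the same restraints.
    estimated_sigma = [0.3, 0.5, 0.4, 0.9, 0.6]
    observed_deviation = [0.2, 1.1, 0.3, 1.0, 0.4]
    print(round(pearson(estimated_sigma, observed_deviation), 3))
```

A coefficient near 1 would mean the σ estimates track the real divergence closely; the abstract reports that MODELLER's default estimates are, on average, only weakly correlated with it.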


Subjects
Models, Molecular , Software , Structural Homology, Protein , Algorithms , Computational Biology , Molecular Dynamics Simulation/statistics & numerical data , Proteins/chemistry , Sequence Alignment/statistics & numerical data
5.
Dermatol Online J ; 26(8)2020 Aug 15.
Article in English | MEDLINE | ID: mdl-32941720

ABSTRACT

Pancreatic cancer-melanoma syndrome (PCMS) is an inherited condition in which mutation carriers have an increased risk of malignant melanoma and/or pancreatic cancer. About 30% of PCMS cases carry mutations in CDKN2A. This gene encodes several protein isoforms, one of which, known as p16, regulates the cell cycle by interacting with CDK4/CDK6 kinases and with several non-CDK proteins. Herein, we report on a novel CDKN2A germline in-frame deletion (c.52_57delACGGCC) found in an Italian family with PCMS. By segregation analysis, the c.52_57delACGGCC was proven to segregate in kindred with cutaneous melanoma (CM), in kindred with CM and pancreatic cancer, and in a single case presenting only with pancreatic cancer. In the literature, a duplication mapping to the same genic region has already been reported at the germline level in several unrelated CM cases as a variant of unknown clinical significance. A computational approach for studying the effect of mutational changes on p16 protein structure showed that both the deletion and the duplication of the c.52_57 nucleotides result in protein misfolding and loss of interactor binding. In conclusion, the present results argue that the quantitative alteration of nucleotides c.52_57 has a pathogenic role in p16 function and that the c.52_57delACGGCC is associated with PCMS.


Subjects
Cyclin-Dependent Kinase Inhibitor p16/genetics , Germ-Line Mutation , Melanoma/genetics , Neoplastic Syndromes, Hereditary/genetics , Pancreatic Neoplasms/genetics , Cyclin-Dependent Kinase Inhibitor p16/ultrastructure , Female , Gene Deletion , Humans , Male , Melanoma/etiology , Middle Aged , Neoplastic Syndromes, Hereditary/etiology , Pancreatic Neoplasms/etiology , Pedigree , Protein Structure, Quaternary
6.
Bioinformatics ; 33(3): 444-446, 2017 02 01.
Article in English | MEDLINE | ID: mdl-28158668

ABSTRACT

Motivation: The recently released PyMod GUI integrates many of the individual steps required for protein sequence-structure analysis and homology modeling within the interactive visualization capabilities of PyMOL. Here we describe the improvements introduced into the version 2.0 of PyMod. Results: The original code of PyMod has been completely rewritten and improved in version 2.0 to extend PyMOL with packages such as Clustal Omega, PSIPRED and CAMPO. Integration with the popular web services ESPript and WebLogo is also provided. Finally, a number of new MODELLER functionalities have also been implemented, including SALIGN, modeling of quaternary structures, DOPE scores, disulfide bond modeling and choice of heteroatoms to be included in the final model. Availability and Implementation: PyMod 2.0 installer packages for Windows, Linux and Mac OS X and user guides are available at http://schubert.bio.uniroma1.it/pymod/index.html. The open source code of the project is hosted at https://github.com/pymodproject/pymod. Contact: alessandro.paiardini@uniroma1.it or giacomo.janson@uniroma1.it


Subjects
Computational Biology/methods , Models, Molecular , Protein Conformation , Sequence Analysis, Protein/methods , Software
7.
IUBMB Life ; 70(3): 215-223, 2018 03.
Article in English | MEDLINE | ID: mdl-29356298

ABSTRACT

Aromatic amino acid or Dopa decarboxylase (AADC or DDC) is a homodimeric pyridoxal 5'-phosphate (PLP) enzyme responsible for the generation of the neurotransmitters dopamine and serotonin. AADC deficiency is a rare inborn disease caused by mutations of the AADC gene leading to a defect of AADC enzyme and resulting in impaired dopamine and serotonin synthesis. Until now, only the molecular effects of homozygous mutations have been analyzed. Although heterozygous carriers of AADC deficiency have been identified, the molecular aspects of their enzymatic phenotypes have not yet been investigated. Here, we focus our attention on the R347Q/R358H and R347Q/R160W heterozygous mutations, and report for the first time the isolation and characterization, in the purified recombinant form, of the R347Q/R358H heterodimer and of the R358H homodimer. The results, integrated with those already known for the R347Q homodimeric variant, provide evidence that (i) the R358H mutation strongly reduces the PLP-binding affinity and the catalytic activity, and (ii) a positive interallelic complementation exists between the R347Q and the R358H mutations. Bioinformatics analyses provide the structural basis for these data. Unfortunately, the R347Q/R160W heterodimer was not obtained in a sufficient amount to allow its purification and characterization. Nevertheless, the biochemical features of the R160W homodimer contribute to the understanding of the enzymatic phenotype of the heterozygous R347Q/R160W and suggest the possible relevance of Arg160 in the proper folding of human DDC. © 2018 IUBMB Life, 70(3):215-223, 2018.


Subjects
Amino Acid Metabolism, Inborn Errors/embryology , Aromatic-L-Amino-Acid Decarboxylases/chemistry , Aromatic-L-Amino-Acid Decarboxylases/deficiency , Protein Multimerization/genetics , Recombinant Proteins/chemistry , Amino Acid Metabolism, Inborn Errors/enzymology , Amino Acid Metabolism, Inborn Errors/genetics , Aromatic-L-Amino-Acid Decarboxylases/genetics , Aromatic-L-Amino-Acid Decarboxylases/metabolism , Catalysis , Dopamine/biosynthesis , Heterozygote , Humans , Mutation , Protein Folding , Recombinant Proteins/genetics , Serotonin/biosynthesis
8.
Int J Mol Sci ; 19(7)2018 Jul 20.
Article in English | MEDLINE | ID: mdl-30036991

ABSTRACT

Sulfur-containing amino acids play essential roles in many organisms. The protozoan parasite Toxoplasma gondii includes the genes for cystathionine β-synthase and cystathionine γ-lyase (TgCGL), as well as for cysteine synthase, which are crucial enzymes of the transsulfuration and de novo pathways for cysteine biosynthesis, respectively. These enzymes are specifically expressed in the oocyst stage of T. gondii. However, their functionality has not been investigated. Herein, we expressed and characterized the putative CGL from T. gondii. Recombinant TgCGL almost exclusively catalyses the α,γ-hydrolysis of L-cystathionine to form L-cysteine and displays marginal reactivity toward L-cysteine. Structure-guided homology modelling revealed two striking amino acid differences between the human and parasite CGL active sites (Glu59 and Ser340 in human to Ser77 and Asn360 in Toxoplasma). Mutation of Asn360 to Ser demonstrated the importance of this residue in modulating the specificity for the catalysis of α,β- versus α,γ-elimination of L-cystathionine. Replacement of Ser77 by Glu completely abolished activity towards L-cystathionine. Our results suggest that CGL is an important functional enzyme in T. gondii, likely implying that the reverse transsulfuration pathway is operative in the parasite; we also probed the roles of active-site architecture and substrate binding conformations as determinants of reaction specificity in transsulfuration enzymes.


Subjects
Cystathionine gamma-Lyase/genetics , Cystathionine gamma-Lyase/metabolism , DNA Mutational Analysis/methods , Mutation/genetics , Toxoplasma/enzymology , Cystathionine , Phosphoric Monoester Hydrolases/genetics , Phosphoric Monoester Hydrolases/metabolism , Toxoplasma/genetics , Toxoplasma/metabolism
9.
IUBMB Life ; 66(1): 52-62, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24408864

ABSTRACT

Modulation of the interaction of regulatory 14-3-3 proteins with their physiological partners through small cell-permeant molecules is a promising strategy to control cellular processes where 14-3-3s are engaged. Here, we show that the fungal phytotoxin fusicoccin (FC), known to stabilize 14-3-3 association with the plant plasma membrane H(+)-ATPase, is able to stabilize 14-3-3 interaction with several client proteins with a mode III binding motif. Isothermal titration calorimetry analysis of the interaction between 14-3-3s and different peptides reproducing a mode III binding site demonstrated the ability of FC to stimulate 14-3-3 association. Moreover, molecular docking studies provided the structural rationale for the differential FC effect, which depends exclusively on the biochemical properties of the residue at the C-terminal position of the peptide. Our study proposes FC as a promising tool to control cellular processes regulated by 14-3-3 proteins, opening new perspectives on its potential pharmacological applications.


Subjects
14-3-3 Proteins/metabolism , Gene Expression Regulation/drug effects , Glycosides/pharmacology , Mycotoxins/pharmacology , Phosphopeptides/metabolism , Protein Interaction Domains and Motifs/drug effects , 14-3-3 Proteins/chemistry , Binding Sites , Calorimetry , Cell Membrane/metabolism , Cyclin-Dependent Kinase Inhibitor p27/metabolism , Humans , Models, Molecular , Nerve Tissue Proteins/metabolism , Phospholipase D/metabolism , Phosphopeptides/chemistry , Potassium Channels, Tandem Pore Domain/metabolism , Protein Binding , Protein Conformation , Proton-Translocating ATPases/metabolism , Receptors, G-Protein-Coupled/metabolism , Receptors, Interleukin-9/metabolism , Receptors, Peptide/metabolism , Thermodynamics
10.
bioRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38370653

ABSTRACT

Intrinsically disordered proteins have dynamic structures through which they play key biological roles. The elucidation of their conformational ensembles is a challenging problem requiring an integrated use of computational and experimental methods. Molecular simulations are a valuable computational strategy for constructing structural ensembles of disordered proteins but are highly resource-intensive. Recently, machine learning approaches based on deep generative models that learn from simulation data have emerged as an efficient alternative for generating structural ensembles. However, such methods currently suffer from limited transferability when modeling sequences and conformations absent in the training data. Here, we develop a novel generative model that achieves high levels of transferability for intrinsically disordered protein ensembles. The approach, named idpSAM, is a latent diffusion model based on transformer neural networks. It combines an autoencoder to learn a representation of protein geometry and a diffusion model to sample novel conformations in the encoded space. IdpSAM was trained on a large dataset of simulations of disordered protein regions performed with the ABSINTH implicit solvent model. Thanks to the expressiveness of its neural networks and its training stability, idpSAM faithfully captures 3D structural ensembles of test sequences that have no similarity to those in the training set. Our study also demonstrates the potential for generating full conformational ensembles from datasets with limited sampling and underscores the importance of training set size for generalization. We believe that idpSAM represents significant progress in transferable protein ensemble modeling through machine learning.

11.
Nat Commun ; 14(1): 774, 2023 02 11.
Article in English | MEDLINE | ID: mdl-36774359

ABSTRACT

Dynamics and conformational sampling are essential for linking protein structure to biological function. Because protein dynamics are challenging to probe experimentally, computer simulations are widely used to describe them, but at significant computational costs that continue to limit the systems that can be studied. Here, we demonstrate that machine learning can be trained with simulation data to directly generate physically realistic conformational ensembles of proteins without the need for any sampling and at negligible computational cost. As a proof-of-principle we train a generative adversarial network based on a transformer architecture with self-attention on coarse-grained simulations of intrinsically disordered peptides. The resulting model, idpGAN, can predict sequence-dependent coarse-grained ensembles for sequences that are not present in the training set, demonstrating that transferability can be achieved beyond the limited training data. We also retrain idpGAN on atomistic simulation data to show that the approach can be extended in principle to higher-resolution conformational ensemble generation.


Subjects
Intrinsically Disordered Proteins , Molecular Dynamics Simulation , Protein Conformation , Proteins , Peptides/chemistry , Machine Learning , Intrinsically Disordered Proteins/metabolism
12.
Life Sci Alliance ; 6(2)2023 02.
Article in English | MEDLINE | ID: mdl-36450448

ABSTRACT

Mitotic kinase Aurora A (AURKA) diverges from other kinases in its multiple active conformations that may explain its interphase roles and the limited efficacy of drugs targeting the kinase pocket. Regulation of AURKA activity by the cell is critically dependent on destruction mediated by the anaphase-promoting complex (APC/C(FZR1)) during mitotic exit and G1 phase and requires an atypical N-terminal degron in AURKA called the "A-box" in addition to a reported canonical D-box degron in the C-terminus. Here, we find that the reported C-terminal D-box of AURKA does not act as a degron and instead mediates essential structural features of the protein. In living cells, the N-terminal intrinsically disordered region of AURKA containing the A-box is sufficient to confer FZR1-dependent mitotic degradation. Both in silico and in cellulo assays predict the QRVL short linear interacting motif of the A-box to be a phospho-regulated D-box. We propose that degradation of full-length AURKA also depends on an intact C-terminal domain because of critical conformational parameters permissive for both activity and mitotic degradation of AURKA.


Subjects
Aurora Kinase A , Biological Assay , Humans , Aurora Kinase A/genetics , Cell Nucleus , Cdh1 Proteins
13.
Biomolecules ; 12(2)2022 01 25.
Article in English | MEDLINE | ID: mdl-35204702

ABSTRACT

Protein-peptide interactions (PpIs) are a subset of the overall protein-protein interaction (PPI) network in the living cell and are pivotal for the majority of cell processes and functions. High-throughput methods to detect PpIs and PPIs usually require time and costs that are not always affordable. Therefore, reliable in silico predictions represent a valid and effective alternative. In this work, a new algorithm is described, implemented in a freely available tool, i.e., "PepThreader", to carry out PPI and PpI prediction and analysis. PepThreader threads multiple fragments derived from a full-length protein sequence (or from a peptide library) onto a second template peptide, in complex with a protein target, "spotting" the potential binding peptides and ranking them according to a sequence-based and structure-based threading score. The threading algorithm first makes use of a scoring function that is based on peptide sequence similarity. Then, the initial hits are reranked according to structure-based scoring functions. PepThreader has been benchmarked on a dataset of 292 protein-peptide complexes that were collected from existing databases of experimentally determined protein-peptide interactions. An accuracy of 80% was achieved when considering the top 25 predicted hits, which is comparable to that of other state-of-the-art tools for PPI and PpI modeling. Nonetheless, PepThreader is unique in that it can, at the same time, spot a binding peptide within a full-length sequence involved in a PPI and model its structure within the receptor. Therefore, PepThreader adds to the already-available tools supporting the experimental identification and characterization of PPIs and PpIs.
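
The two-stage procedure described in the abstract above (sequence-based scoring of fragments, then structure-based reranking of the initial hits) can be sketched roughly as follows. This is not PepThreader's code: the sliding-window fragmentation, the fraction-of-identical-residues score, the optional `rescore` callback standing in for a structure-based function, and the example sequences are all assumptions made for illustration.

```python
def fragments(sequence, length):
    """Return (start_index, fragment) pairs for every window of the given length."""
    return [
        (i, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]


def identity_score(fragment, template):
    """Fraction of positions at which fragment and template residues are identical."""
    return sum(a == b for a, b in zip(fragment, template)) / len(template)


def rank_fragments(sequence, template, top_n=25, rescore=None):
    """Rank fragments by sequence score, keep top_n, then optionally rerank them.

    rescore is a hypothetical callable (fragment -> float, higher is better)
    that stands in for a structure-based scoring function.
    """
    hits = [
        (identity_score(frag, template), start, frag)
        for start, frag in fragments(sequence, len(template))
    ]
    # Best sequence score first; ties broken by position in the sequence.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    hits = hits[:top_n]
    if rescore is not None:
        hits.sort(key=lambda hit: (-rescore(hit[2]), hit[1]))
    return hits


if __name__ == "__main__":
    # Hypothetical full-length sequence and template peptide.
    for score, start, frag in rank_fragments("AAARSASLPAAA", "RSASLP", top_n=3):
        print(start, frag, round(score, 2))
```

Keeping only the top-ranked hits before the second stage mirrors the abstract's description of reranking the initial hits; a real structure-based score would require building and evaluating a 3D model for each retained fragment.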


Subjects
Peptides , Protein Interaction Mapping , Amino Acid Sequence , Peptide Library , Peptides/chemistry , Protein Interaction Mapping/methods , Software
14.
J Chem Theory Comput ; 17(3): 1931-1943, 2021 Mar 09.
Article in English | MEDLINE | ID: mdl-33562962

ABSTRACT

Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. These methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve model accuracy in the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in the model quality become possible. Molecular dynamics simulations have been especially successful for improving model qualities but although consistent refinement can be achieved, the improvements in model qualities are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore the conformational space more broadly. Based on the insights of this analysis, we are proposing a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.


Subjects
Molecular Dynamics Simulation , Proteins/chemistry , Crystallography, X-Ray , Databases, Protein , Machine Learning , Protein Conformation
15.
Cell Rep ; 33(12): 108548, 2020 12 22.
Article in English | MEDLINE | ID: mdl-33357424

ABSTRACT

Chromatin architect of muscle expression (Charme) is a muscle-restricted long noncoding RNA (lncRNA) that plays an important role in myogenesis. Earlier evidence indicates that the nuclear Charme isoform, named pCharme, acts on the chromatin by assisting the formation of chromatin domains where myogenic transcription occurs. By combining RNA antisense purification (RAP) with mass spectrometry and loss-of-function analyses, we have now identified the proteins that assist these chromatin activities. These proteins, which include a subset of splicing regulators, principally PTBP1 and the multifunctional RNA/DNA binding protein MATR3, bind to sequences located within the alternatively spliced intron-1 to form nuclear aggregates. Consistent with the functional importance of the pCharme interactome in vivo, a targeted deletion of intron-1 by a CRISPR-Cas9 approach in mouse causes the release of pCharme from the chromatin and results in cardiac defects similar to those observed upon knockout of the full-length transcript.


Subjects
Heterogeneous-Nuclear Ribonucleoproteins/metabolism , Introns/genetics , Nuclear Matrix-Associated Proteins/metabolism , Polypyrimidine Tract-Binding Protein/metabolism , RNA, Long Noncoding/metabolism , RNA-Binding Proteins/metabolism , Animals , Humans , Mice