1.
J Proteome Res ; 22(8): 2548-2557, 2023 08 04.
Article in English | MEDLINE | ID: mdl-37459437

ABSTRACT

Phosphorylation is one of the most important post-translational modifications and plays a pivotal role in various cellular processes. Although several computational tools exist to predict phosphorylation sites, they have not yet harnessed the knowledge distilled by pretrained protein language models. Herein, we present a novel deep learning-based approach called LMPhosSite for general phosphorylation site prediction that integrates embeddings from the local window sequence with the contextualized embedding obtained from the global (overall) protein sequence using a pretrained protein language model to improve prediction performance. LMPhosSite thus consists of two base models: one for capturing an effective local representation and the other for capturing the global per-residue contextualized embedding from a pretrained protein language model. The outputs of these base models are integrated using a score-level fusion approach. LMPhosSite achieves a precision, recall, Matthews correlation coefficient, and F1-score of 38.78%, 67.12%, 0.390, and 49.15%, respectively, for the combined serine and threonine independent test data set and 34.90%, 62.03%, 0.298, and 44.67%, respectively, for the tyrosine independent test data set, which is better than the compared approaches. These results demonstrate that LMPhosSite is a robust computational tool for the prediction of general phosphorylation sites in proteins.
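
The score-level fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two base-model scores are combined, so the weighted average below, the function names `fuse_scores` and `call_sites`, and the default weight and threshold are all assumptions made for illustration, not the LMPhosSite implementation.

```python
from typing import List, Sequence


def fuse_scores(local_scores: Sequence[float],
                global_scores: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Combine per-site probabilities from two base models.

    Assumption: a weighted average; the paper may use a different rule.
    `weight` is the share given to the local-window model.
    """
    if len(local_scores) != len(global_scores):
        raise ValueError("both models must score the same candidate sites")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * l + (1.0 - weight) * g
            for l, g in zip(local_scores, global_scores)]


def call_sites(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a candidate site as phosphorylated when its fused score
    reaches the (hypothetical) decision threshold."""
    return [s >= threshold for s in scores]


if __name__ == "__main__":
    # Hypothetical scores for three candidate S/T sites.
    fused = fuse_scores([0.9, 0.2, 0.6], [0.7, 0.1, 0.3])
    print(fused, call_sites(fused))
```

Fusing at the score level keeps the two base models independent, so either one can be retrained or replaced without touching the other.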


Subjects
Deep Learning , Phosphorylation , Proteins/metabolism , Post-Translational Protein Processing , Amino Acid Sequence
2.
PLoS Comput Biol ; 17(7): e1009135, 2021 07.
Article in English | MEDLINE | ID: mdl-34214078

ABSTRACT

There are currently 85,000 chemicals registered with the Environmental Protection Agency (EPA) under the Toxic Substances Control Act, but only a small fraction have measured toxicological data. To address this gap, high-throughput screening (HTS) and computational methods are vital. As part of one such HTS effort, embryonic zebrafish were used to examine a suite of morphological and mortality endpoints at six concentrations for over 1,000 unique chemicals found in the ToxCast library (phase 1 and 2). We hypothesized that by using a conditional generative adversarial network (cGAN) or deep neural networks (DNN), and leveraging this large set of toxicity data, we could efficiently predict toxic outcomes of untested chemicals. Utilizing a novel method in this space, we converted the 3D structural information into a weighted set of points while retaining all information about the structure. In vivo toxicity and chemical data were used to train two neural network generators. The first was a DNN (Go-ZT), while the second utilized a cGAN architecture (GAN-ZT) to train generators to produce toxicity data. Our results showed that Go-ZT significantly outperformed the cGAN, support vector machine, random forest, and multilayer perceptron models in cross-validation and when tested against an external test dataset. By combining both Go-ZT and GAN-ZT, our consensus model improved the sensitivity (SE), specificity (SP), positive predictive value (PPV), and kappa to 71.4%, 95.9%, 71.4%, and 0.673, respectively, resulting in an area under the receiver operating characteristic curve (AUROC) of 0.837. Considering their potential use as prescreening tools, these models could provide in vivo toxicity predictions and insight into the hundreds of thousands of untested chemicals to prioritize compounds for high-throughput testing.


Subjects
Computational Biology , High-Throughput Screening Assays , Computer Neural Networks , Toxicology , Animals , Nonmammalian Embryo/drug effects , Chemical Models , Toxicity Tests , Zebrafish
3.
BMC Bioinformatics ; 21(Suppl 3): 63, 2020 Apr 23.
Article in English | MEDLINE | ID: mdl-32321437

ABSTRACT

BACKGROUND: Protein succinylation has recently emerged as an important and common post-translational modification (PTM) that occurs on lysine residues. Succinylation is notable both in its size (e.g., at 100 Da, it is one of the larger chemical PTMs) and in its ability to modify the net charge of the modified lysine residue from +1 to -1 at physiological pH. The gross local changes that occur in proteins upon succinylation have been shown to correspond with changes in gene activity and to be perturbed by defects in the citric acid cycle. These observations, together with the fact that succinate is generated as a metabolic intermediate during cellular respiration, have led to suggestions that protein succinylation may play a role in the interaction between cellular metabolism and important cellular functions. For instance, succinylation likely represents an important aspect of genomic regulation and repair and may have important consequences in the etiology of a number of disease states. In this study, we developed DeepSuccinylSite, a novel prediction tool that uses deep learning methodology along with embedding to identify succinylation sites in proteins based on their primary structure. RESULTS: Using an independent test set of experimentally identified succinylation sites, our method achieved efficiency scores of 79%, 68.7%, and 0.48 for sensitivity, specificity, and Matthews correlation coefficient (MCC), respectively, with an area under the receiver operating characteristic (ROC) curve of 0.8. In side-by-side comparisons with previously described succinylation predictors, DeepSuccinylSite represents a significant improvement in overall accuracy for prediction of succinylation sites. CONCLUSION: Together, these results suggest that our method represents a robust and complementary technique for advanced exploration of protein succinylation.
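
For readers comparing the predictors in this listing, the reported metrics (sensitivity, specificity, precision, F1, MCC) all follow from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the papers.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0 by convention.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient: ranges from -1 to +1.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for an independent test set.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity taken alone, MCC uses all four counts, which is one reason it is commonly reported for imbalanced site-prediction test sets.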


Subjects
Deep Learning , Post-Translational Protein Processing , Proteins/metabolism , Succinates/metabolism , Binding Sites , Citric Acid Cycle , Lysine/metabolism , Proteins/chemistry
4.
J Am Chem Soc ; 140(24): 7377-7380, 2018 06 20.
Article in English | MEDLINE | ID: mdl-29851341

ABSTRACT

This work addresses the need for chemical tools that can selectively form cross-links. Contemporary thiol-selective cross-linkers, for example, modify all accessible thiols, but only form cross-links between a subset. The resulting terminal "dead-end" modifications of lone thiols are toxic, confound cross-linking-based studies of macromolecular structure, and are an undesired, and currently unavoidable, byproduct in polymer synthesis. Using the thiol pair of Cu/Zn-superoxide dismutase (SOD1), we demonstrated that cyclic disulfides, including the drug/nutritional supplement lipoic acid, efficiently cross-linked thiol pairs but avoided dead-end modifications. Thiolate-directed nucleophilic attack upon the cyclic disulfide resulted in thiol-disulfide exchange and ring cleavage. The resulting disulfide-tethered terminal thiolate moiety either directed the reverse reaction, releasing the cyclic disulfide, or participated in oxidative disulfide (cross-link) formation. We hypothesized, and confirmed with density functional theory (DFT) calculations, that mono-S-oxo derivatives of cyclic disulfides formed a terminal sulfenic acid upon ring cleavage that obviated the previously rate-limiting step, thiol oxidation, and accelerated the new rate-determining step, ring cleavage. Our calculations suggest that the origin of accelerated ring cleavage is improved frontier molecular orbital overlap in the thiolate-disulfide interchange transition. Five- to seven-membered cyclic thiosulfinates were synthesized and efficiently cross-linked up to 10^4-fold faster than their cyclic disulfide precursors; functioned in the presence of biological concentrations of glutathione; and acted as cell-permeable, potent, tolerable, intracellular cross-linkers. This new class of thiol cross-linkers exhibited click-like attributes, including high yields driven by the enthalpies of disulfide and water formation, orthogonality with common functional groups, water compatibility, and ring strain dependence.


Subjects
Cross-Linking Reagents/chemistry , Disulfides/chemistry , Sulfhydryl Compounds/chemistry , Sulfinic Acids/chemistry , Superoxide Dismutase-1/chemistry , Tumor Cell Line , Cross-Linking Reagents/chemical synthesis , Disulfides/chemical synthesis , Humans , Chemical Models , Oxidation-Reduction , Quantum Theory , Sulfenic Acids/chemistry , Sulfinic Acids/chemical synthesis
6.
Curr Opin Plant Biol ; 71: 102326, 2023 02.
Article in English | MEDLINE | ID: mdl-36538837

ABSTRACT

The plant-associated microbiome is a key component of plant systems, contributing to their health, growth, and productivity. The application of machine learning (ML) in this field promises to help untangle the relationships involved. However, measurements of microbial communities by high-throughput sequencing pose challenges for ML. Noise from low sample sizes, soil heterogeneity, and technical factors can impact the performance of ML. Additionally, the compositional and sparse nature of these datasets can impact the predictive accuracy of ML. We review recent literature from plant studies to illustrate that these properties often go unmentioned. We expand our analysis to other fields to quantify the degree to which mitigation approaches improve the performance of ML and describe the mathematical basis for this. With the advent of accessible analytical packages for microbiome data including learning models, researchers must be familiar with the nature of their datasets.
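
The abstract does not name the mitigation approaches it reviews. As one common example of handling compositional, sparse count data before machine learning, the sketch below applies a centered log-ratio (CLR) transform with a pseudocount. The choice of CLR, the function name `clr`, and the pseudocount value are assumptions for illustration only.

```python
import math
from typing import List, Sequence


def clr(counts: Sequence[float], pseudocount: float = 0.5) -> List[float]:
    """Centered log-ratio transform of one sample's taxon counts.

    A pseudocount is added so that zero counts (sparsity) do not
    produce log(0). The value 0.5 is an arbitrary illustrative choice.
    """
    shifted = [c + pseudocount for c in counts]
    if not shifted or any(x <= 0 for x in shifted):
        raise ValueError("need at least one count, all positive after shifting")
    logs = [math.log(x) for x in shifted]
    mean_log = sum(logs) / len(logs)   # log of the geometric mean
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    # Hypothetical read counts for four taxa in one sample.
    print(clr([120, 0, 30, 850]))
```

Because each value is expressed relative to the sample's geometric mean, the transformed values sum to zero, and without a pseudocount they do not depend on sequencing depth.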


Subjects
Microbiota , Algorithms , Machine Learning , Plants
7.
iScience ; 26(10): 107817, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37744034

ABSTRACT

Extracellular signal-regulated kinases 1 and 2 (ERK1/2) are dysregulated in many pervasive diseases. Recently, we discovered that ERK1/2 is oxidized by signal-generated hydrogen peroxide in various cell types. Since the putative sites of oxidation lie within or near ERK1/2's ligand-binding surfaces, we investigated how oxidation of ERK2 regulates interactions with the model substrates Sub-D and Sub-F. These studies revealed that ERK2 undergoes sulfenylation at C159 on its D-recruitment site surface and that this modification modulates ERK2 activity differentially between substrates. Integrated biochemical, computational, and mutational analyses suggest a plausible mechanism for peroxide-dependent changes in ERK2-substrate interactions. Interestingly, oxidation decreased ERK2's affinity for some D-site ligands while increasing its affinity for others. Finally, oxidation by signal-generated peroxide enhanced ERK1/2's ability to phosphorylate ribosomal S6 kinase A1 (RSK1) in HeLa cells. Together, these studies lay the foundation for examining crosstalk between redox- and phosphorylation-dependent signaling at the level of kinase-substrate selection.

8.
Front Cell Dev Biol ; 9: 662983, 2021.
Article in English | MEDLINE | ID: mdl-34249915

ABSTRACT

Phosphorylation, which is mediated by protein kinases and opposed by protein phosphatases, is an important post-translational modification that regulates many cellular processes, including cellular metabolism, cell migration, and cell division. Due to its essential role in cellular physiology, a great deal of attention has been devoted to identifying sites of phosphorylation on cellular proteins and understanding how modification of these sites affects their cellular functions. This has led to the development of several computational methods designed to predict sites of phosphorylation based on a protein's primary amino acid sequence. In contrast, much less attention has been paid to dephosphorylation and its role in regulating the phosphorylation status of proteins inside cells. Indeed, to date, dephosphorylation site prediction tools have been restricted to a few tyrosine phosphatases. To fill this knowledge gap, we have employed a transfer learning strategy to develop a deep learning-based model to predict sites that are likely to be dephosphorylated. Based on independent test results, our model, which we termed DTL-DephosSite, achieved efficiency scores for phosphoserine/phosphothreonine residues of 84%, 84%, and 0.68 with respect to sensitivity (SN), specificity (SP), and Matthews correlation coefficient (MCC), respectively. Similarly, DTL-DephosSite exhibited efficiency scores of 75%, 88%, and 0.64 for phosphotyrosine residues with respect to SN, SP, and MCC, respectively.

9.
Sci Rep ; 11(1): 12550, 2021 06 15.
Article in English | MEDLINE | ID: mdl-34131195

ABSTRACT

Protein phosphorylation, which is one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning-based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas reinhardtii, a model algal phototroph. An ensemble model combining convolutional neural networks and long short-term memory (LSTM) achieves the best performance in predicting phosphorylation sites in C. reinhardtii. For this model, termed Chlamy-EnPhosSite, the best measured AUC and MCC are 0.90 and 0.64, respectively, for a combined dataset of serine (S) and threonine (T) in independent testing, higher than those of other predictors. When applied to the entire C. reinhardtii proteome (totaling 1,809,304 S and T sites), Chlamy-EnPhosSite yielded 499,411 phosphorylated sites with a cut-off value of 0.5 and 237,949 phosphorylated sites with a cut-off value of 0.7. These predictions were compared to an experimental dataset of phosphosites identified by liquid chromatography-tandem mass spectrometry (LC-MS/MS) in a blinded study: approximately 89.69% of 2,663 C. reinhardtii S and T phosphorylation sites were successfully predicted by Chlamy-EnPhosSite at a probability cut-off of 0.5, and 76.83% of sites were successfully identified at a more stringent 0.7 cut-off. Interestingly, Chlamy-EnPhosSite also successfully predicted experimentally confirmed phosphorylation sites in a protein sequence (e.g., RPS6 S245) that did not appear in the training dataset, highlighting prediction accuracy and the power of leveraging predictions to identify biologically relevant PTM sites. These results demonstrate that our method represents a robust and complementary technique for high-throughput phosphorylation site prediction in C. reinhardtii. It has the potential to serve as a useful tool for the community. Chlamy-EnPhosSite will contribute to the understanding of how protein phosphorylation influences various biological processes in this important model microalga.
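
The effect of the probability cut-off described in the abstract (fewer predicted sites and a lower recovered fraction at 0.7 than at 0.5) can be sketched as follows. The site labels, scores, and function names are hypothetical; this is not the Chlamy-EnPhosSite code or data.

```python
from typing import Dict, Set


def sites_above_cutoff(scores: Dict[str, float], cutoff: float) -> Set[str]:
    """Return the sites whose predicted probability reaches the cut-off."""
    return {site for site, prob in scores.items() if prob >= cutoff}


def fraction_recovered(predicted: Set[str], experimental: Set[str]) -> float:
    """Share of experimentally identified sites that were also predicted."""
    if not experimental:
        raise ValueError("experimental set must not be empty")
    return len(predicted & experimental) / len(experimental)


if __name__ == "__main__":
    # Hypothetical per-site probabilities and LC-MS/MS-confirmed sites.
    scores = {"S12": 0.93, "T40": 0.66, "S77": 0.52, "T103": 0.31}
    experimental = {"S12", "T40", "S77"}
    for cutoff in (0.5, 0.7):
        called = sites_above_cutoff(scores, cutoff)
        share = fraction_recovered(called, experimental)
        print(f"cut-off {cutoff}: {len(called)} sites called, "
              f"{share:.2%} of experimental sites recovered")
```

Raising the cut-off trades recovered experimental sites for a shorter, higher-confidence list of predictions.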


Subjects
Chlamydomonas reinhardtii/genetics , Deep Learning , Phosphoproteins/genetics , Proteome/genetics , Liquid Chromatography , Phosphorylation/genetics , Post-Translational Protein Processing/genetics , Serine/genetics , Tandem Mass Spectrometry , Threonine/genetics
10.
Mol Omics ; 16(5): 448-454, 2020 10 12.
Article in English | MEDLINE | ID: mdl-32555810

ABSTRACT

Methylation, which is one of the most prominent post-translational modifications on proteins, regulates many important cellular functions. Though several model-based methylation site predictors have been reported, all existing methods employ machine learning strategies, such as support vector machines and random forest, to predict sites of methylation based on a set of "hand-selected" features. As a consequence, the resulting models may be biased toward one set of features. Moreover, due to the large number of features, model development can often be computationally expensive. In this paper, we propose an alternative approach based on deep learning to predict arginine methylation sites. Our model, which we termed DeepRMethylSite, is computationally less expensive than traditional feature-based methods while eliminating potential biases that can arise through feature selection. Based on independent testing on our dataset, DeepRMethylSite achieved efficiency scores of 68%, 82%, and 0.51 with respect to sensitivity (SN), specificity (SP), and Matthews correlation coefficient (MCC), respectively. Importantly, in side-by-side comparisons with other state-of-the-art methylation site predictors, our method performs on par or better in all scoring metrics tested.


Subjects
Algorithms , Arginine/metabolism , Deep Learning , Post-Translational Protein Processing , Proteins/metabolism , Protein Databases , Methylation , Computer Neural Networks , ROC Curve , Reproducibility of Results