ABSTRACT
Cysteine (Cys) is the most reactive amino acid and participates in a wide range of biological functions. In silico predictions complement experiments in meeting the need for functional characterization. Algorithms that predict multiple Cys functions are scarce, in contrast to algorithms that predict a specific function. Here we present a deep neural network-based predictor of multiple Cys functions, available as a web server (DeepCys) (https://deepcys.herokuapp.com/). The DeepCys model was trained and tested on two independent datasets curated from protein crystal structures. The method requires three inputs for a given Cys, namely the PDB identifier (ID), chain ID and residue ID; it outputs the probabilities of four cysteine functions (disulphide, metal-binding, thioether and sulphenylation) and predicts the most probable Cys function. The algorithm exploits local and global protein properties, such as sequence and secondary structure motifs, buried fractions, microenvironments and protein/enzyme class. DeepCys outperformed most of the multiple and specific Cys function algorithms. Among such methods, it predicts the largest number of cysteine functions and, for the first time, explicitly predicts the thioether function. The tool was used to elucidate cysteine functions in domains of unknown function belonging to cytochrome C oxidase subunit-II like transmembrane domains. Apart from the web server, a standalone program is also available on GitHub (https://github.com/vam-sin/deepcys).
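The output step described above (four class probabilities, then the most probable function) can be illustrated with a minimal sketch. It assumes the network emits one raw score per function and that a softmax turns the scores into probabilities; the abstract does not state this, and the function names, query values and scores below are hypothetical, not taken from the DeepCys code.

```python
import math

# The four Cys functions named in the abstract.
CYS_FUNCTIONS = ("disulphide", "metal-binding", "thioether", "sulphenylation")


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_probable_function(scores):
    """Return (label, probabilities) for one cysteine, given four raw scores."""
    if len(scores) != len(CYS_FUNCTIONS):
        raise ValueError("expected one score per Cys function")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CYS_FUNCTIONS[best], dict(zip(CYS_FUNCTIONS, probs))


# Hypothetical query (placeholder identifiers) and hypothetical raw scores.
query = {"pdb_id": "XXXX", "chain_id": "A", "residue_id": 42}
label, probs = most_probable_function([2.0, 0.5, -1.0, 0.1])
print(query, label, {name: round(p, 3) for name, p in probs.items()})
```

Reporting the full probability dictionary alongside the top label mirrors the abstract's description of the output and lets a user see how close the runner-up function is.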
Subjects
Cysteine/chemistry, Deep Learning, Disulfides/chemistry, Electron Transport Complex IV/chemistry, Post-Translational Protein Processing, Software, Amino Acid Sequence, Divalent Cations/chemistry, Divalent Cations/metabolism, Cysteine/metabolism, Disulfides/metabolism, Electron Transport Complex IV/metabolism, Glutathione/chemistry, Glutathione/metabolism, Molecular Models, Nitroso Compounds/chemistry, Nitroso Compounds/metabolism, Protein Domains, Secondary Protein Structure, Structure-Activity Relationship, Sulfides/chemistry, Sulfides/metabolism, Sulfinic Acids/chemistry, Sulfinic Acids/metabolism, Sulfonic Acids/chemistry, Sulfonic Acids/metabolism
ABSTRACT
We demonstrated earlier that protein microenvironments are conserved around disulfide-bridged cystine motifs with similar functions, irrespective of diversity in protein sequences. Here, cysteine thiol modifications were characterized based on protein microenvironments, secondary structures and specific protein functions. The protein microenvironment around an amino acid was defined as the summation of hydrophobic contributions from the surrounding protein fragments and the solvent molecules present within its first contact shell. Cysteine functions (modifications) were grouped into enzymatic and non-enzymatic classes. The modifications studied were disulfide formation, thioether formation, metal binding, nitrosylation, acylation, selenylation, glutathionylation, sulfenylation, and ribosylation. A total of 1079 enzymatic proteins were reported from high-resolution crystal structures. Protein microenvironments around the cysteine thiol, derived from these crystal structures, were clustered into 3 groups: buried-hydrophobic, intermediate and exposed-hydrophilic. Characterization of cysteine functions was statistically meaningful for the 4 modifications (disulfide formation, thioether formation, sulfenylation, and iron/zinc binding) that have a sufficient amount of data in the current dataset. Results showed that protein microenvironment, secondary structure and protein functions were conserved for enzymatic cysteine functions, in contrast to the same functions of non-enzymatic cysteines. Disulfide-forming enzymatic cysteines were tightly packed within the intermediate protein microenvironment cluster, had alpha-helical conformation and mostly belonged to the CxxC motif of electron transport proteins. Disulfide-forming non-enzymatic cysteines did not belong to a conserved motif and had variable secondary structures. Similarly, enzymatic thioether-forming cysteines had a conserved microenvironment compared to non-enzymatic cysteines. Based on the compatibility between protein microenvironment and cysteine modifications, more efficient drug molecules could be designed against cysteine-related diseases.
Subjects
Cysteine/analysis, Proteins/chemistry, Sulfhydryl Compounds/analysis, Animals, Bacteria/chemistry, Bacteria/metabolism, Bacterial Proteins/chemistry, Bacterial Proteins/metabolism, Cysteine/metabolism, Protein Databases, Humans, Hydrophobic and Hydrophilic Interactions, Metals/metabolism, Molecular Models, Oxidation-Reduction, Protein Binding, Protein Conformation, Secondary Protein Structure, Proteins/metabolism, Sulfhydryl Compounds/metabolism
ABSTRACT
In our previous study, we showed that the microenvironments around conserved amino acids are also conserved in protein families (Bandyopadhyay and Mehler, Proteins 2008; 72:646-659). In this study, we hypothesized that amino acids perform similar functions when embedded in a certain type of protein microenvironment. We tested this hypothesis on the microenvironments around disulfide-bridged cysteines from high-resolution protein crystal structures. Although such cystines mainly play a structural role in proteins, in certain enzymes they participate in catalysis and redox reactions. We report a functional annotation of enzymatically active cystines with respect to their microenvironments. Three protein microenvironment clusters were identified: (i) buried-hydrophobic, (ii) exposed-hydrophilic, and (iii) buried-hydrophilic. The buried-hydrophobic cluster encompasses a small group of 22 redox-active cystines, mostly in alpha-helical conformations in a -C-x-x-C- motif from the oxidoreductase enzyme class. All these cystines have high strain energy and near-identical microenvironments. Most of the active cystines in the hydrolase enzyme class belong to the buried-hydrophilic microenvironment cluster. In total, 34 half-cystines from hydrolases were detected in the buried-hydrophilic cluster as part of an enzyme active site. Even within the buried-hydrophilic cluster, there is a clear separation of active half-cystines between the surface-exposed part of the protein and the protein interior. Half-cystines toward the surface-exposed region are more numerous than those in the protein interior. Apart from cystines at the active sites of the enzymes, many more half-cystines that are part of the microenvironment of enzyme active sites were detected in the buried-hydrophilic cluster. However, no active half-cystines were detected in the extremely hydrophilic microenvironment cluster, that is, the exposed-hydrophilic cluster, indicating that total exposure of a cystine to the solvent is not favored for enzymatic reactions. Nevertheless, half-cystines in the exposed-hydrophilic cluster occasionally stabilize enzyme active sites as part of their microenvironments. The analysis performed in this work revealed that cystines that are part of active sites in specific enzyme families or folds share very similar protein microenvironment regions, despite their dissimilarity in protein sequences and position-specific sequence conservation. Proteins 2016; 84:1576-1589. © 2016 Wiley Periodicals, Inc.
Subjects
Cystine/chemistry, Disulfides/chemistry, Hydrolases/chemistry, Lyases/chemistry, Oxidoreductases/chemistry, Transferases/chemistry, Amino Acid Motifs, Animals, Catalytic Domain, X-Ray Crystallography, Cysteine/chemistry, Humans, Hydrophobic and Hydrophilic Interactions, Molecular Models, Oxidation-Reduction, Alpha-Helical Protein Conformation, Beta-Strand Protein Conformation, Protein Folding
ABSTRACT
Computational characterization of multiple histidine (His) post-translational modifications (PTMs) at enzyme active sites complements tedious experimental characterization in proteins of unknown function (PUFs) and domains of unknown function (DUFs). Only a handful of histidine PTM prediction tools exist, and each annotates only a single function. Here, we addressed the problem using artificial neural networks on a functional histidine dataset curated from enzyme (protein) sequences available in the UniProt database (sample size n = 1584). The convolutional neural network (CNN) model ('Hist-i-fy') performed best, with 75% overall accuracy/F1-score. A case study was performed on histidine phosphorylation (n = 34) obtained from mass spectroscopy data. For the first time, we report a multiple His-PTM prediction tool (https://histify.streamlit.app/ and https://github.com/dibyansu24-maker/Histify) with optimal performance. The inputs to the tool are (i) a protein sequence containing histidine, and (ii) the histidine residue number. The prediction output is one of eight histidine functions: acetylation, ribosylation, glycosylation, hydroxylation, methylation, oxidation, phosphorylation, and protein splicing. Communicated by Ramaswamy H. Sarma.
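The two inputs named above (a protein sequence and a histidine residue number) suggest a preprocessing step of the following kind. This is only a sketch under stated assumptions: the abstract does not describe how Hist-i-fy encodes its input, so the fixed-length window, the flank size, the "X" padding and the one-hot encoding are all hypothetical choices, and the toy sequence is made up.

```python
# The 20 standard amino acids, used as the one-hot alphabet (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def histidine_window(sequence, residue_number, flank=7, pad="X"):
    """Return the window centred on a histidine, using 1-based numbering."""
    idx = residue_number - 1
    if not 0 <= idx < len(sequence):
        raise IndexError("residue number outside the sequence")
    if sequence[idx] != "H":
        raise ValueError(f"residue {residue_number} is {sequence[idx]}, not H")
    # Pad both ends so windows near the termini keep a fixed length.
    padded = pad * flank + sequence + pad * flank
    return padded[idx: idx + 2 * flank + 1]


def one_hot(window):
    """Encode each residue as a 20-element 0/1 vector; padding is all zeros."""
    return [[1 if aa == res else 0 for aa in AMINO_ACIDS] for res in window]


# Toy sequence for illustration only.
window = histidine_window("MKTAHLLG", 5, flank=3)
print(window, len(one_hot(window)))
```

Checking that the indexed residue really is a histidine catches off-by-one errors in residue numbering before anything reaches a model.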
ABSTRACT
Quinolone synthase from Aegle marmelos (AmQNS) is a type III polyketide synthase that yields therapeutically effective quinolone and acridone compounds. Addressing the structural and molecular underpinnings of AmQNS and its substrate interaction in terms of its high selectivity and specificity can aid in the development of numerous novel compounds. This paper presents a high-resolution AmQNS crystal structure and explains its mechanistic role in synthetic selectivity. Additionally, we provide a model framework to comprehend structural constraints on ketide insertion and postulate that AmQNS's steric and electrostatic selectivity plays a role in its ability to bind to various core substrates, resulting in its synthetic diversity. AmQNS prefers quinolone synthesis and can accommodate large substrates because of its wide active site entrance. However, our research suggests that acridone is exclusively synthesized in the presence of high malonyl-CoA concentrations. Potential implications of functionally relevant residue mutations were also investigated, which will assist in harnessing the benefits of mutations for targeted polyketide production. The pharmaceutical industry stands to gain from these findings as they expand the pool of potential drug candidates, and these methodologies can also be applied to additional promising enzymes.
Subjects
Quinolones, Substrate Specificity, Quinolones/chemistry, Quinolones/metabolism, Catalytic Domain, Molecular Models, Polyketide Synthases/chemistry, Polyketide Synthases/metabolism, Polyketide Synthases/genetics, X-Ray Crystallography, Protein Conformation
ABSTRACT
Background and Aims: The nonstructural (NS1) protein is mainly involved in the virulence and replication of several viruses, including influenza virus A(H1N1); surveillance of the latter started in India in 2009. The objective of this study was to identify new substitutions in the NS1 protein from the influenza virus A(H1N1) pandemic 2009 (pdm09) strain isolated in India. Methods: The sequences of NS1 proteins from influenza A(H1N1) pdm09 strains isolated in India were obtained from publicly available databases. Multiple sequence alignment and phylogeny analyses were performed to confirm the "consistent substitutions" in the NS1 protein from H1N1 (pdm09) Indian strains. Here, "consistent substitutions" were defined as the substitutions observed in all the sequences isolated in a year. Comparative analyses were performed among NS1 Indian sequences from A(H1N1) pdm09, A(H1N1) seasonal and A(H3N2) strains, and from A(H1N1) pdm09 global strains. Results: Eight substitutions were identified in the NS1 Indian sequence from the A(H1N1) pdm09 strain: two in the RBD, five in the ED, and one in the linker region. Three new substitutions were reported in this study at NS1 sequence positions 2, 80, and 155, which evolved within 2015-2019 and became "consistent." These new substitutions were associated with conservative paired substitutions in the alternative domains of the NS1 protein. The three paired substitutions were (i) D2E and E125D, (ii) T80A and A155T, and (iii) E55K and K131E. Conclusions: This study indicates the continuous evolution of the NS1 protein from the influenza A virus. The new substitutions at positions 2 and 80 occurred in the RNA binding and eIF4GI binding domains. The D2E substitution evolved simultaneously with the E125D substitution, which is involved in viral replication. The third new substitution, at position 155, occurred in the PI3K binding domain. The possible consequences of these substitutions on host-pathogen interactions are subject to further experimental and computational verification.
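The definition of "consistent substitutions" given in the Methods (substitutions observed in all the sequences isolated in a year) can be expressed as a set intersection. The sketch below is an illustration, not the authors' pipeline: it assumes the sequences are already aligned to a reference of equal length with an ungapped reference, and the reference, years and sequences are toy data.

```python
def substitutions(reference, sequence):
    """Return substitutions such as 'D2E' between two aligned sequences."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return {
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sequence), start=1)
        if ref != alt and "-" not in (ref, alt)  # skip alignment gaps
    }


def consistent_substitutions(reference, sequences_by_year):
    """Map each year to the substitutions shared by every sequence of that year."""
    result = {}
    for year, seqs in sequences_by_year.items():
        per_sequence = [substitutions(reference, s) for s in seqs]
        result[year] = set.intersection(*per_sequence) if per_sequence else set()
    return result


# Toy data: a 5-residue reference and two hypothetical isolation years.
reference = "MDSNT"
isolates = {
    2015: ["MESNT", "MESNA"],  # D2E in both; T5A in only one
    2016: ["MESNT", "MDSNT"],  # D2E in only one of the two
}
print(consistent_substitutions(reference, isolates))
```

With this toy input, D2E is consistent in 2015 but not in 2016, because one 2016 sequence still carries the reference residue.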
ABSTRACT
Proteins involved in proton-/electron-transfer processes often possess "functional" aspartates/aspartic acids (Asp) with variable protonation states. The mechanism of Asp protonation-deprotonation within proteins is unclear. Two questions were asked: what types of determinants are responsible for Asp protonation-deprotonation, and which spatial arrangements of the determinants lead to selective stabilization. The questions were analyzed using nine different solvent models, which scanned the complete protein dielectric range, and four protein models, which illustrated the spatial arrangements around Asp, termed "molecular association". The methods employed were quantum chemical calculations and constant pH simulations. The types of determinants identified were charge-charge interaction, H bonding, dipole-π interaction, extended electronic conjugation, dielectric effect, and solvent accessibility. All solvent-exposed Asp [buried fraction (BF) less than 0.5] were aspartates, and buried Asp were either aspartic acids or aspartates, each having a different "molecular association". The exposed aspartates were stabilized via an H-bonding network with bulk water, buried aspartates via a salt bridge or at least two intramolecular H bonds, and buried aspartic acids via at least one intramolecular H bond. An "acid-alcohol pair" (involving Ser/Thr/Tyr) was a determinant common to any "functional" buried aspartate/aspartic acid. The higher-energy "molecular associations" observed within proteins, compared to those within water, presumably indicate easy molecular restructuring and alteration of the Asp protonation states during a protein-mediated proton/electron transfer.
Subjects
Aspartic Acid, Protons, Electron Transport, Hydrogen Bonding, Protein Conformation, Water
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pone.0228156.].
ABSTRACT
Mutations conferring susceptibility to complex disorders also occur in healthy individuals, but at significantly lower frequencies than in patients, indicating that these mutations are not completely penetrant. Therefore, it is important to estimate the penetrance, that is, the likelihood of developing a disease in the presence of a mutation. Recently, a method to calculate penetrance and its credible intervals was developed on the basis of the Bayesian method and has since been used in the literature. However, in its present form, this approach requires programming skills to use. Here, we developed 'CalPen', a web-based tool for straightforward calculation of penetrance and its credible intervals by entering the number of mutations identified in controls and patients, and the number of patients and controls studied. For validation purposes, we show that CalPen-derived penetrance values are in good agreement with the published values. As a further demonstration of its utility, we used schizophrenia as an example of a complex disorder and estimated penetrance values for 15 different copy number variants (CNVs) reported in 39,059 patients and 55,084 controls, and 145 SNPs reported in 45,405 patients and 122,761 controls. CNVs showed an average penetrance of 7%, with 22q11.21 CNVs having the highest value (~20%) and 15q11.2 deletions the lowest value (~1.4%). Most SNPs, on the other hand, showed a penetrance of 0.7%, with rs1801028 having the highest penetrance (1.6%). In summary, CalPen is an accurate and user-friendly web-based tool useful in human genetic research to ascertain the ability of a mutation/variant to cause a complex genetic disorder.
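A Bayesian penetrance estimate of the kind described above can be sketched as follows. The abstract does not give CalPen's formula, so this is one common formulation rather than the tool's implementation: penetrance is taken as P(disease | variant) via Bayes' theorem, carrier frequencies get Beta posteriors under a uniform prior, and the credible interval comes from Monte Carlo sampling. The `prevalence` argument (baseline disease risk) is an assumption that is not among the inputs listed in the abstract, and the counts in the usage line are hypothetical.

```python
import random


def penetrance(freq_cases, freq_controls, prevalence):
    """Bayes' theorem: P(disease | variant) from carrier frequencies."""
    numerator = freq_cases * prevalence
    denominator = numerator + freq_controls * (1.0 - prevalence)
    return numerator / denominator if denominator > 0 else float("nan")


def penetrance_interval(k_cases, n_cases, k_controls, n_controls,
                        prevalence, draws=20000, level=0.95, seed=0):
    """Return (median, lower, upper) of the sampled penetrance distribution."""
    rng = random.Random(seed)
    # Beta(k + 1, n - k + 1) is the posterior of a carrier frequency
    # under a uniform prior (an assumption of this sketch).
    samples = sorted(
        penetrance(
            rng.betavariate(k_cases + 1, n_cases - k_cases + 1),
            rng.betavariate(k_controls + 1, n_controls - k_controls + 1),
            prevalence,
        )
        for _ in range(draws)
    )
    lower = samples[int((1 - level) / 2 * draws)]
    upper = samples[int((1 + level) / 2 * draws) - 1]
    return samples[draws // 2], lower, upper


# Hypothetical counts: 30 carriers in 10,000 patients, 5 in 20,000 controls,
# with an assumed baseline disease risk of 1%.
print(penetrance_interval(30, 10_000, 5, 20_000, prevalence=0.01))
```

Fixing the random seed makes the interval reproducible between runs; increasing `draws` narrows the Monte Carlo error at the cost of run time.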
Subjects
Inborn Genetic Diseases/genetics, Penetrance, Bayes Theorem, DNA Copy Number Variations/genetics, Genetic Predisposition to Disease/genetics, Humans, Internet, Statistical Models, Single Nucleotide Polymorphism/genetics, Schizophrenia/genetics, Software
ABSTRACT
A general method has been developed to characterize the hydrophobicity or hydrophilicity of the microenvironment (MENV) in which a given amino acid side chain is immersed, by calculating a quantitative property descriptor (QPD) based on the hydrophobicity of the MENV relative to water. Values of the QPD were calculated for a test set of 733 proteins to analyze how the MENV modulates the properties of the amino acid residues embedded in it. The QPD values and solvent accessibility were used to derive a partitioning of residues based on the MENV hydrophobicities. From this partitioning, a new hydrophobicity scale was developed, entirely in the context of protein structure, where amino acid residues are immersed in one or more "MENV pockets." Thus, the partitioning is based on the residues "sampling" a large number of "solvents" (MENVs) that represent a very large range of hydrophobicity values. It was found that the hydrophobicities of around 80% of amino acid side chains and their MENVs are complementary to each other, but for about 20%, the MENV and its embedded residue can be considered mismatched. Many of these mismatches could be rationalized in terms of the structural stability of the protein and/or the involvement of the embedded residue in function. The analysis also indicated a remarkable conservation of local environments around highly conserved active site residues that have similar functions across protein families whose members have relatively low sequence homology. Thus, quantitative evaluation of this QPD is suggested here as a tool for structure-function prediction, analysis, and parameter development for the calculation of properties in proteins.
Subjects
Amino Acids/chemistry, Proteins/chemistry, Algorithms, Binding Sites, Solvents/chemistry, Structure-Activity Relationship
ABSTRACT
The role of the magnesium ion in DNA structure and function is well substantiated, though the exact nature of the DNA-magnesium interaction is not yet fully established. We have analyzed available DNA crystal structures containing magnesium ions, which provide experimental evidence for various modes of interaction between the DNA molecule and the magnesium ion. Two preferred modes are found: direct coordination between the magnesium ion and electronegative DNA atoms, and a secondary mode of interaction via the formation of hydrogen bonds. These qualitative data are further supported by ab initio quantum chemical calculations using restricted Hartree-Fock and Density Functional Theory. We have analyzed the energies and partial charges of different DNA fragments and hydrated magnesium ions, following restrained and unrestrained geometry optimizations along the reaction coordinate. The restrained optimizations generally show two energy minima separated by an energy barrier whose height ranges from about 5 to 15 kcal/mol, in agreement with experimental observations. All these analyses suggest that both modes of interaction occur with almost equal probability, although the water-mediated secondary mode, which has so far been neglected, is preferred in most cases.