ABSTRACT
With the notable surge in therapeutic peptide development, various peptides have emerged as potential agents against virus-induced diseases. Viral entry inhibitory peptides (VEIPs), a subset of antiviral peptides (AVPs), offer a promising avenue as entry inhibitors (EIs) with distinct advantages over chemical counterparts. Despite this, a comprehensive analytical platform for characterizing these peptides and their effectiveness in blocking viral entry remains lacking. In this study, we introduce an in silico approach that leverages bioinformatics analysis and machine learning to characterize and identify novel VEIPs. Cross-validation results demonstrate that a model combining sequence-based features predicts VEIPs with high accuracy, which was validated through independent testing. Additionally, an EI type model has been developed to distinguish peptides specifically acting as EIs from AVPs with alternative activities. Notably, we present iDVEIP, a web-based tool accessible at http://mer.hc.mmh.org.tw/iDVEIP/, designed for automatic analysis and prediction of VEIPs. The tool facilitates comprehensive analyses of peptide characteristics, providing detailed amino acid composition data for each prediction. Furthermore, we showcase the tool's utility in identifying EIs against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2).
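The amino acid composition reported alongside each prediction can be illustrated with a minimal sketch. This is not the iDVEIP implementation; the function name, the handling of non-standard residues, and the example sequence are assumptions made for illustration only.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> dict[str, float]:
    """Return the fraction of each standard residue in a peptide sequence."""
    sequence = peptide.strip().upper()
    # Count only standard residues; anything else (e.g. 'X') is ignored.
    counts = Counter(residue for residue in sequence if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical example sequence, not taken from the iDVEIP dataset.
composition = amino_acid_composition("GIGKFLHSAKKF")
```

The result is a fixed-length, 20-dimensional vector regardless of peptide length, which is one reason composition features are convenient inputs for sequence-based classifiers.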
Subjects
Antiviral Agents , Computational Biology , Machine Learning , Peptides , SARS-CoV-2 , Virus Internalization , Virus Internalization/drug effects , Antiviral Agents/pharmacology , Antiviral Agents/chemistry , Humans , Peptides/chemistry , Peptides/pharmacology , Computational Biology/methods , SARS-CoV-2/drug effects , COVID-19 Drug Treatment , Computer Simulation , COVID-19/virology , Software
ABSTRACT
Precise gene-editing using CRISPR/Cas9 technology remains a long-standing challenge, especially for genes with low expression and no selectable phenotypes in Chlamydomonas reinhardtii, a classic model for photosynthesis and cilia research. Here, we developed a multi-type and precise genetic manipulation method in which a DNA break was generated by Cas9 nuclease and the repair was mediated using a homologous DNA template. The efficacy of this method was demonstrated for several types of gene editing, including inactivation of two low-expression genes (CrTET1 and CrKU80), the introduction of a FLAG-HA epitope tag into VIPP1, IFT46, CrTET1 and CrKU80 genes, and placing a YFP tag into VIPP1 and IFT46 for live-cell imaging. We also successfully performed a single amino acid substitution for the FLA3, FLA10 and FTSY genes, and documented the attainment of the anticipated phenotypes. Lastly, we demonstrated that precise fragment deletion from the 3'-UTR of MAA7 and VIPP1 resulted in a stable knock-down effect. Overall, our study has established efficient methods for multiple types of precise gene editing in Chlamydomonas, enabling substitution, insertion and deletion at the base resolution, thus improving the potential of this alga in both basic research and industrial applications.
Subjects
Chlamydomonas reinhardtii , Chlamydomonas , CRISPR-Cas Systems , Chlamydomonas/genetics , Gene Editing/methods , Chlamydomonas reinhardtii/genetics
ABSTRACT
Antiretroviral peptides are bioactive peptides that inhibit retroviruses through various mechanisms. Among them, viral integrase inhibitory peptides (VINIPs) are a class of antiretroviral peptides that block the action of integrase proteins, which are essential for retroviral replication. Although the number of experimentally verified bioactive peptides has increased significantly, in silico machine learning approaches that can effectively predict peptides with integrase inhibitory activity are still lacking. Here, we have developed the first prediction model for identifying novel VINIPs from sequence characteristics, and a hybrid feature set was considered to improve the predictive ability. The performance was evaluated by 5-fold cross-validation on the training dataset, and the results indicate that the proposed model is capable of predicting VINIPs, with a sensitivity of 85.82%, a specificity of 88.81%, an accuracy of 88.37%, a balanced accuracy of 87.32% and a Matthews correlation coefficient value of 0.64. Most importantly, the model also performs consistently well in independent testing. To sum up, we propose the first computational approach for identifying and characterizing VINIPs, which can be considered novel antiretroviral therapy agents. Finally, to facilitate further research and development, iDVIP, an automatic computational tool that predicts VINIPs, has been developed and is now freely available at http://mer.hc.mmh.org.tw/iDVIP/.
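The five measures quoted above all derive from one confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical, since the abstract does not report the underlying numbers of true and false predictions. As a consistency check on the reported figures, the mean of 85.82% and 88.81% is 87.315%, which agrees with the stated balanced accuracy of 87.32% up to rounding.

```python
from math import sqrt


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Standard binary classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration; not the iDVIP results.
metrics = binary_metrics(tp=40, fn=10, tn=90, fp=10)
```

On imbalanced data, accuracy is dominated by the larger class, which is why balanced accuracy and the Matthews correlation coefficient are usually reported next to it.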
Subjects
HIV Infections , Integrases , Humans , Amino Acid Sequence , Peptides/pharmacology , Peptides/chemistry , Proteins/chemistry
ABSTRACT
Anticancer peptides (ACPs) are bioactive compounds known for their selective cytotoxicity against tumor cells via various mechanisms. Recent studies have demonstrated that in silico machine learning methods are effective in predicting peptides with anticancer activity. In this study, we collected and analyzed over a thousand experimentally verified ACPs, specifically targeting peptides derived from natural sources. We developed a precise prediction model based on their sequence and structural features, and the model's evaluation results suggest its strong predictive ability for anticancer activity. To enhance reliability, we integrated the results of this model with those from other available methods. In total, we identified 176 potential ACPs, some of which were synthesized and further evaluated using the MTT colorimetric assay. All of these putative ACPs exhibited significant anticancer effects and selective cytotoxicity against specific tumor cells. In summary, we present a strategy for identifying and characterizing natural peptides with selective cytotoxicity against cancer cells, which could serve as novel therapeutic agents. Our prediction model can effectively screen new molecules for potential anticancer activity, and the results from in vitro experiments provide compelling evidence of the candidates' anticancer effects and selective cytotoxicity.
Subjects
Antineoplastic Agents , Computer Simulation , Peptides , Humans , Peptides/pharmacology , Peptides/chemistry , Antineoplastic Agents/pharmacology , Antineoplastic Agents/chemistry , Cell Line, Tumor , Neoplasms/drug therapy , Neoplasms/pathology , Neoplasms/metabolism , Biological Products/pharmacology , Biological Products/chemistry , Cell Survival/drug effects , Machine Learning , Drug Screening Assays, Antitumor
ABSTRACT
Transdermal neoadjuvant therapy could benefit clinical melanoma treatment by reducing the surgical area and increasing the efficacy of immunotherapy. A transdermally administered nano-liposomal drug complex (LPPC-DOX-CpG), in which a lipoplex (Lipo-PEG-PEI-complex, LPPC) encapsulates doxorubicin (DOX) and carries CpG oligodeoxynucleotide, is expected to have high cytotoxicity and immunostimulatory activity to suppress systemic metastasis of melanoma. LPPC-DOX-CpG dramatically suppressed subcutaneous melanoma growth by inducing tumor cell apoptosis and recruiting immune cells into the tumor area. Animal studies further showed that the colonization and growth of spontaneously metastatic melanoma cells in the liver and lung were suppressed by transdermal LPPC-DOX-CpG. Furthermore, NGS analysis revealed that the IFN-γ and NF-κB pathways were triggered to recruit and activate antigen-presenting cells and effector cells, which could activate anti-tumor responses as the major mechanism responsible for the therapeutic effect of LPPC-DOX-CpG. Finally, we demonstrated that transdermal LPPC-DOX-CpG is a promising penetrative carrier for activating systemic anti-tumor immunity against subcutaneous and metastatic tumors.
Subjects
Melanoma , Humans , Melanoma/drug therapy
ABSTRACT
Endometrial cancer is one of the most common malignancies affecting women in developed countries. Resection of the uterus or the lesion area is usually the first option because it is simple and efficient. Therefore, it is necessary to find a new therapeutic drug that reduces the surgical area and preserves fertility. Anticancer peptides (ACPs) are bioactive amino acid sequences with lower toxicity and higher specificity than chemical drugs. This study describes an ACP, herein named Q7, which downregulates 24-dehydrocholesterol reductase (DHCR24) to disrupt lipid raft formation and subsequently affects the AKT signaling pathway of HEC-1-A cells, suppressing tumorigenic properties such as proliferation and migration. Moreover, a lipo-PEI-PEG-complex (LPPC) was used to enhance the anticancer activity of Q7 in vitro and to deliver its effects on HEC-1-A cells efficiently. Furthermore, LPPC-Q7 exhibited a synergistic effect in combination with doxorubicin or paclitaxel. In summary, Q7 was shown for the first time to exert an anticancer effect on endometrial cancer cells, and combining it with LPPC efficiently improved its cytotoxicity.
Subjects
Endometrial Neoplasms , Oxidoreductases Acting on CH-CH Group Donors , Humans , Female , Endometrial Neoplasms/drug therapy , Endometrial Neoplasms/genetics , Peptides/pharmacology , Peptides/therapeutic use , Nerve Tissue Proteins
ABSTRACT
MicroRNAs (miRNAs) are small non-coding RNAs (typically consisting of 18-25 nucleotides) that negatively control the expression of target genes at the post-transcriptional level. Owing to the biological significance of miRNAs, miRTarBase was developed to provide comprehensive information on experimentally validated miRNA-target interactions (MTIs). To date, the database has accumulated >13,404 validated MTIs from 11,021 articles through manual curation. In this update, a text-mining system was incorporated to enhance the recognition of MTI-related articles by adopting a scoring system. In addition, a variety of biological databases were integrated to provide information on the regulatory network of miRNAs and their expression in blood. Both the targets and the regulators of miRNAs are provided so that users can investigate the up- and downstream regulation of miRNAs. Moreover, the number of MTIs with high-throughput experimental evidence (validated by CLIP-seq technology) increased remarkably. In conclusion, these improvements make miRTarBase one of the most comprehensively annotated and experimentally validated miRNA-target interaction databases. The updated version of miRTarBase is now available at http://miRTarBase.cuhk.edu.cn/.
Subjects
Databases, Nucleic Acid , MicroRNAs/metabolism , Circulating MicroRNA/metabolism , Data Mining , Gene Expression Regulation , RNA, Messenger/metabolism , User-Computer Interface
ABSTRACT
Antimicrobial peptides (AMPs), which are naturally encoded by genes and generally contain 10-100 amino acids, are crucial components of the innate immune system and can protect the host from various pathogenic bacteria, as well as viruses. In recent years, the widespread use of antibiotics has driven the rapid growth of antibiotic-resistant microorganisms that often cause critical infection and pathogenesis. This has motivated increasing interest in exploring natural AMPs that could enable the development of new antibiotics. Given the potential of AMPs as new drugs against multidrug-resistant pathogens, we were motivated to develop a database (dbAMP, http://csb.cse.yzu.edu.tw/dbAMP/) by accumulating comprehensive AMPs from the public domain and manually curated literature. Currently dbAMP contains 12 389 unique entries, including 4271 experimentally verified AMPs and 8118 putative AMPs along with their functional activities, supported by 1924 research articles. The advent of high-throughput biotechnologies, such as mass spectrometry and next-generation sequencing, has led us to further expand dbAMP into a database-assisted platform that provides comprehensive functional and physicochemical analyses of AMPs based on large-scale transcriptome and proteome data. Significant improvements in dbAMP include information on AMP-protein interactions, antimicrobial potency analysis for 'cryptic' region detection, annotations of AMP target species, and AMP detection on transcriptome and proteome datasets. Additionally, a Docker container has been developed as a downloadable package for discovering known and novel AMPs in high-throughput omics data. User-friendly visualization interfaces have been created to facilitate peptide searching, browsing, and sequence alignment against dbAMP entries. All the facilities integrated into dbAMP can promote the functional analysis of AMPs and the discovery of new antimicrobial drugs.
Subjects
Anti-Infective Agents/chemistry , Antimicrobial Cationic Peptides/chemistry , Databases, Chemical , Proteome , Transcriptome , Anti-Infective Agents/chemical synthesis , Antimicrobial Cationic Peptides/chemical synthesis , Antimicrobial Cationic Peptides/genetics , Computer Simulation , Drug Discovery , Gene Ontology , Hydrophobic and Hydrophilic Interactions , Immunity, Innate , Internet , Software , Solubility , Species Specificity , Structure-Activity Relationship
ABSTRACT
The dbPTM (http://dbPTM.mbc.nctu.edu.tw/) has been maintained for over 10 years with the aim to provide functional and structural analyses for post-translational modifications (PTMs). In this update, dbPTM not only integrates more experimentally validated PTMs from available databases and through manual curation of literature but also provides PTM-disease associations based on non-synonymous single nucleotide polymorphisms (nsSNPs). The high-throughput deep sequencing technology has led to a surge in the data generated through analysis of association between SNPs and diseases, both in terms of growth amount and scope. This update thus integrated disease-associated nsSNPs from dbSNP based on genome-wide association studies. The PTM substrate sites located at a specified distance in terms of the amino acids encoded from nsSNPs were deemed to have an association with the involved diseases. In recent years, increasing evidence for crosstalk between PTMs has been reported. Although mass spectrometry-based proteomics has substantially improved our knowledge about substrate site specificity of single PTMs, the fact that the crosstalk of combinatorial PTMs may act in concert with the regulation of protein function and activity is neglected. Because of the relatively limited information about concurrent frequency and functional relevance of PTM crosstalk, in this update, the PTM sites neighboring other PTM sites in a specified window length were subjected to motif discovery and functional enrichment analysis. This update highlights the current challenges in PTM crosstalk investigation and breaks the bottleneck of how proteomics may contribute to understanding PTM codes, revealing the next level of data complexity and proteomic limitation in prospective PTM research.
Subjects
Databases, Protein , Protein Processing, Post-Translational , Amino Acid Motifs , Computational Biology , Genome-Wide Association Study , Glycosylation , High-Throughput Nucleotide Sequencing , Humans , Mass Spectrometry/methods , Phosphorylation , Polymorphism, Single Nucleotide , Proteomics/methods , Structure-Activity Relationship , Substrate Specificity , User-Computer Interface
ABSTRACT
BACKGROUND: Protein phosphoglycerylation, the addition of 1,3-bisphosphoglyceric acid (1,3-BPG) to a lysine residue of a protein to form 3-phosphoglyceryl-lysine, is a reversible and non-enzymatic post-translational modification (PTM) that plays a regulatory role in glucose metabolism and the glycolytic process. As the number of experimentally verified phosphoglycerylated sites has increased significantly, statistical or machine learning methods are imperative for investigating the characteristics of phosphoglycerylation sites. Currently, research into phosphoglycerylation is very limited, and only a few resources are available for the computational identification of phosphoglycerylation sites. RESULT: We present a bioinformatics investigation of phosphoglycerylation sites based on sequence-based features. The TwoSampleLogo analysis reveals that the regions surrounding the phosphoglycerylation sites contain a relatively high proportion of positively charged amino acids, especially in the upstream flanking region. Additionally, according to the results of PTM-Logo, non-polar and aliphatic amino acids are more abundant around phosphoglycerylated lysines, which may play a functional role in discriminating between phosphoglycerylation and non-phosphoglycerylation sites. Many types of features were adopted to build the prediction model on the training dataset, including amino acid composition, amino acid pair composition, positional weighted matrix and position-specific scoring matrix. Further, to improve the predictive power, the top features ranked by F-score were combined as the final feature set for classification, and the predictive models were trained using DT, RF and SVM classifiers. Evaluation by five-fold cross-validation showed that the selected features were most effective in discriminating between phosphoglycerylated and non-phosphoglycerylated sites.
CONCLUSION: The SVM model trained with the selected sequence-based features performed well, with a sensitivity of 77.5%, a specificity of 73.6%, an accuracy of 74.9%, and a Matthews Correlation Coefficient value of 0.49. Furthermore, the model also performed consistently on the independent testing set, yielding a sensitivity of 75.7% and a specificity of 64.9%. Finally, the model has been implemented as a web-based system, namely iDPGK, which is now freely available at http://mer.hc.mmh.org.tw/iDPGK/.
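Ranking features by F-score, as mentioned in the RESULT section, can be sketched as follows. The abstract does not say which F-score definition was used; the code assumes the Fisher-style score commonly paired with SVM feature selection, where a feature scores highly when its class means are far apart relative to the within-class variances. The function names and the toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(pos: list[float], neg: list[float]) -> float:
    """Fisher-style F-score of one feature (assumed definition, see lead-in)."""
    overall = mean(pos + neg)
    numerator = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # statistics.variance is the sample variance (n - 1 denominator),
    # so each class needs at least two samples.
    denominator = variance(pos) + variance(neg)
    if denominator == 0:
        return 0.0 if numerator == 0 else float("inf")
    return numerator / denominator


def rank_features(pos_rows, neg_rows):
    """Score every feature column and return (scores, indices best-first)."""
    n_features = len(pos_rows[0])
    scores = [
        f_score([row[j] for row in pos_rows], [row[j] for row in neg_rows])
        for j in range(n_features)
    ]
    ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return scores, ranking


# Toy data: feature 0 is uninformative, feature 1 separates the classes.
positive_sites = [[1.0, 5.0], [2.0, 5.5], [3.0, 4.5]]
negative_sites = [[1.5, 1.0], [2.5, 1.5], [2.0, 0.5]]
scores, ranking = rank_features(positive_sites, negative_sites)
```

In practice the ranked list would be truncated to the top-scoring features, and the cut-off would be chosen by cross-validation rather than fixed in advance.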
Subjects
Computational Biology/methods , Lysine/metabolism , Software , Amino Acid Sequence , Glycosylation , Internet , Lysine/chemistry , Machine Learning , Position-Specific Scoring Matrices , Protein Processing, Post-Translational , Proteins/chemistry , ROC Curve , Reproducibility of Results , Support Vector Machine
ABSTRACT
In mammals, microRNAs (miRNAs) play key roles in controlling posttranscriptional regulation through binding to the mRNAs of target genes. Recently, it was discovered that viral miRNAs may be involved in human cancers and diseases. It is likely that viral miRNAs help viruses enter the latent phase of their life cycle and remain undetected by the host's immune system, while increasing the host's risk for cancer development. Cervical cancer is typically related to the infection of human papillomavirus (HPV) through sexual transmission. To further understand the molecular mechanisms underlying the associations of HPV infection with genital diseases, we developed a systematic method for viral miRNA identification and viral miRNA-mediated regulatory network construction based on genome-wide sequence analysis. The complete genomes of certain high-risk HPV subtypes were used to predict putative viral pre-miRNAs by bioinformatics approaches. In addition, small RNA libraries in human cervical lesions from existing publications were collected to validate the predicted HPV pre-miRNAs. For the construction of the virally encoded miRNA-mediated regulatory network of HPV infection, cervical squamous epithelial carcinoma gene expression data were extracted from the RNA sequencing platform in The Cancer Genome Atlas; the differentially expressed genes were used to identify the putative targets of viral miRNAs. Predicted cellular target genes of HPV-encoded miRNAs provide an overview of these viral miRNAs' putative functions. Finally, a large-scale genome analysis was carried out to examine the phylogenetic relationship and structural evolution among genital HPV types that have the potential to cause genital cancer. In this study, we discovered putative HPV-encoded miRNAs, which were validated against the small RNA libraries in human cervical lesions.
Furthermore, as indicated by their biological functions, host genes targeted by HPV-encoded miRNAs may play significant roles in virus infection and carcinogenesis. These viral miRNAs are promising candidates for the development of antiviral drugs. More importantly, the identified subtype-specific miRNAs have the potential to be used as biomarkers for HPV subtype determination.
Subjects
Evolution, Molecular , Genome, Viral , MicroRNAs/genetics , Papillomaviridae/genetics , Phylogeny , RNA, Viral/genetics , Carcinogenesis , Gene Expression Profiling , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Host-Pathogen Interactions , Humans , Papillomaviridae/classification , Reproducibility of Results
ABSTRACT
BACKGROUND: Glutarylation, the addition of a glutaryl group (five carbons) to a lysine residue of a protein molecule, is an important post-translational modification and plays a regulatory role in a variety of physiological and biological processes. As the number of experimentally identified glutarylated peptides increases, it becomes imperative to investigate substrate motifs to enhance the study of protein glutarylation. We carried out a bioinformatics investigation of glutarylation sites based on amino acid composition using a public database containing information on 430 non-homologous glutarylation sites. RESULTS: The TwoSampleLogo analysis indicates that positively charged and polar amino acids surrounding glutarylated sites may be associated with the specificity in substrate site of protein glutarylation. Additionally, the chi-squared test was utilized to explore the intrinsic interdependence between two positions around glutarylation sites. Further, maximal dependence decomposition (MDD), which consists of partitioning a large-scale dataset into subgroups with statistically significant amino acid conservation, was used to capture motif signatures of glutarylation sites. We considered single features, such as amino acid composition (AAC), amino acid pair composition (AAPC), and composition of k-spaced amino acid pairs (CKSAAP), as well as the effectiveness of incorporating MDD-identified substrate motifs into an integrated prediction model. Evaluation by five-fold cross-validation showed that AAC was most effective in discriminating between glutarylation and non-glutarylation sites, according to support vector machine (SVM). CONCLUSIONS: The SVM model integrating MDD-identified substrate motifs performed well, with a sensitivity of 0.677, a specificity of 0.619, an accuracy of 0.638, and a Matthews Correlation Coefficient (MCC) value of 0.28. 
Using an independent testing dataset (46 glutarylated and 92 non-glutarylated sites) obtained from the literature, we demonstrated that the integrated SVM model could improve the predictive performance effectively, yielding a balanced sensitivity and specificity of 0.652 and 0.739, respectively. This integrated SVM model has been implemented as a web-based system (MDDGlutar), which is now freely available at http://csb.cse.yzu.edu.tw/MDDGlutar/ .
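The composition of k-spaced amino acid pairs (CKSAAP) named in the RESULTS can be sketched as below. This follows the usual definition, counting residue pairs separated by exactly k intervening residues and normalizing by the number of such pairs; it is not taken from the MDDGlutar code, and the window sequence is hypothetical. Setting k = 0 reduces the feature to adjacent-pair composition.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
PAIRS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def cksaap(window: str, k: int) -> dict[str, float]:
    """Frequencies of residue pairs separated by exactly k other residues."""
    n_pairs = len(window) - k - 1
    if n_pairs <= 0:
        raise ValueError("window is too short for this gap size")
    counts = dict.fromkeys(PAIRS, 0)
    for i in range(n_pairs):
        pair = window[i] + window[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return {pair: count / n_pairs for pair, count in counts.items()}


# Hypothetical sequence window centred on a lysine; not from the dataset.
features = cksaap("AKRLKEKAA", k=1)
```

Concatenating the 400-dimensional vectors for several values of k gives the full CKSAAP encoding, so the feature length grows linearly with the number of gap sizes considered.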
Subjects
Computational Biology/methods , Glutarates/metabolism , Lysine/metabolism , Amino Acid Motifs , Amino Acid Sequence , Animals , Databases, Protein , Lysine/chemistry , Mice , Proteins/chemistry , ROC Curve , Reproducibility of Results , Substrate Specificity , Support Vector Machine , User-Computer Interface
ABSTRACT
BACKGROUND: Group B streptococcus (GBS) is an important pathogen that is responsible for invasive infections, including sepsis and meningitis. GBS serotyping is an essential means for the investigation of possible infection outbreaks and can identify possible sources of infection. Although it is possible to determine GBS serotypes by either immuno-serotyping or geno-serotyping, both traditional methods are time-consuming and labor-intensive. In recent years, the matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) has been reported as an effective tool for the determination of GBS serotypes in a more rapid and accurate manner. Thus, this work aims to investigate GBS serotypes by incorporating machine learning techniques with MALDI-TOF MS to carry out the identification. RESULTS: In this study, a total of 787 GBS isolates, obtained from three research and teaching hospitals, were analyzed by MALDI-TOF MS, and the serotype of the GBS was determined by a geno-serotyping experiment. The peaks of mass-to-charge ratios were regarded as the attributes to characterize the various serotypes of GBS. Machine learning algorithms, such as support vector machine (SVM) and random forest (RF), were then used to construct predictive models for the five different serotypes (Types Ia, Ib, III, V, and VI). After optimization of feature selection and model generation based on training datasets, the accuracies of the selected models attained 54.9-87.1% for various serotypes based on independent testing data. Specifically, for the major serotypes, namely type III and type VI, the accuracies were 73.9 and 70.4%, respectively. CONCLUSION: The proposed models have been adopted to implement a web-based tool (GBSTyper), which is now freely accessible at http://csb.cse.yzu.edu.tw/GBSTyper/, for providing efficient and effective detection of GBS serotypes based on a MALDI-TOF MS spectrum. 
Overall, this work has demonstrated that the combination of MALDI-TOF MS and machine intelligence could provide a practical means of clinical pathogen testing.
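Treating mass-to-charge peaks as attributes requires turning a variable-length peak list into a fixed-length vector. One simple way to do this, shown here only as an assumption-laden sketch, is to mark the presence of a peak in fixed-width m/z bins. The abstract does not describe the preprocessing used for GBSTyper, so the m/z range, the bin width, and the function name are all hypothetical.

```python
def bin_peaks(peaks_mz, mz_min=2000.0, mz_max=20000.0, bin_width=10.0):
    """Encode a peak list as a binary presence vector over fixed m/z bins.

    The range and bin width are illustrative defaults, not values from the study.
    """
    n_bins = int((mz_max - mz_min) / bin_width)
    features = [0] * n_bins
    for mz in peaks_mz:
        if mz_min <= mz < mz_max:
            index = int((mz - mz_min) // bin_width)
            # Guard against floating-point edge cases at the upper boundary.
            features[min(index, n_bins - 1)] = 1
    return features


# Hypothetical peak list; the last peak lies outside the range and is ignored.
vector = bin_peaks([2005.3, 6250.1, 6254.9, 25000.0])
```

Vectors of this kind can be passed directly to classifiers such as SVM or random forest; a narrower bin keeps nearby peaks distinct but makes the encoding more sensitive to small calibration shifts between spectra.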
Subjects
Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Streptococcus/classification , Machine Learning , Serotyping
ABSTRACT
We previously found that circulating β2-glycoprotein I inhibits human endothelial cell migration, proliferation, and angiogenesis by diverse mechanisms. In the present study, we investigated the antitumor activities of β2-glycoprotein I using structure-function analysis and mapped the critical region within the β2-glycoprotein I peptide sequence that mediates anticancer effects. We constructed recombinant cDNA and purified different β2-glycoprotein I polypeptide domains using a baculovirus expression system. We found that purified β2-glycoprotein I, as well as recombinant β2-glycoprotein I full-length (D12345), polypeptide domains I-IV (D1234), and polypeptide domain I (D1) significantly inhibited melanoma cell migration, proliferation and invasion. Western blot analyses were used to determine the dysregulated expression of proteins essential for intracellular signaling pathways in B16-F10 treated with β2-glycoprotein I and variant recombinant polypeptides. Using a melanoma mouse model, we found that D1 polypeptide showed stronger potency in suppressing tumor growth. Structural analysis showed that fragments A and B within domain I would be the critical regions responsible for antitumor activity. Annexin A2 was identified as the counterpart molecule for β2-glycoprotein I by immunofluorescence and coimmunoprecipitation assays. Interaction between specific amino acids of β2-glycoprotein I D1 and annexin A2 was later evaluated by the molecular docking approach. Moreover, five amino acid residues were selected from fragments A and B for functional evaluation using site-directed mutagenesis, and P11A, M42A, and I55P mutations were shown to disrupt the anti-melanoma cell migration ability of β2-glycoprotein I. This is the first study to show the therapeutic potential of β2-glycoprotein I D1 in the treatment of melanoma progression.
Subjects
Cell Movement/drug effects , Melanoma, Experimental/drug therapy , Peptides/pharmacology , beta 2-Glycoprotein I/chemistry , Amino Acid Sequence , Animals , Binding Sites/genetics , Cell Line, Tumor , Male , Melanoma, Experimental/genetics , Melanoma, Experimental/metabolism , Mice, Inbred C57BL , Molecular Docking Simulation , Mutagenesis, Site-Directed , Peptides/chemistry , Peptides/metabolism , Protein Domains , Sequence Homology, Amino Acid , beta 2-Glycoprotein I/genetics , beta 2-Glycoprotein I/metabolism
ABSTRACT
Owing to the importance of the post-translational modifications (PTMs) of proteins in regulating biological processes, the dbPTM (http://dbPTM.mbc.nctu.edu.tw/) was developed as a comprehensive database of experimentally verified PTMs from several databases with annotations of potential PTMs for all UniProtKB protein entries. For this 10th anniversary of dbPTM, the updated resource provides not only a comprehensive dataset of experimentally verified PTMs, supported by the literature, but also an integrative interface for accessing all available databases and tools that are associated with PTM analysis. As well as collecting experimental PTM data from 14 public databases, this update manually curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation, from approximately 500 research articles, which were retrieved by text mining. As the number of available PTM prediction methods increases, this work compiles a non-homologous benchmark dataset to evaluate the predictive power of online PTM prediction tools. An increasing interest in the structural investigation of PTM substrate sites motivated the mapping of all experimental PTM peptides to protein entries of Protein Data Bank (PDB) based on database identifier and sequence identity, which enables users to examine spatially neighboring amino acids, solvent-accessible surface area and side-chain orientations for PTM substrate sites on tertiary structures. Since drug binding in PDB is annotated, this update identified over 1100 PTM sites that are associated with drug binding. The update also integrates metabolic pathways and protein-protein interactions to support the PTM network analysis for a group of proteins. Finally, the web interface is redesigned and enhanced to facilitate access to this resource.
Subjects
Databases, Protein , Protein Processing, Post-Translational , Binding Sites , Disease , Glycosylation , Metabolic Networks and Pathways , Pharmaceutical Preparations/chemistry , Protein Conformation , Protein Interaction Mapping
ABSTRACT
BACKGROUND: Protein carbonylation, an irreversible and non-enzymatic post-translational modification (PTM), is often used as a marker of oxidative stress. When reactive oxygen species (ROS) oxidize amino acid side chains, carbonyl (CO) groups are produced, especially on lysine (K), arginine (R), threonine (T), and proline (P) residues. Because little is known about the substrate specificity of carbonylation, we developed a systematic method for the comprehensive investigation of protein carbonylation sites. RESULTS: After redundant data were removed from multiple carbonylation-related articles, a total of 226 human carbonylated proteins were used as the training dataset, comprising 307, 126, 128, and 129 carbonylation sites on K, R, T, and P residues, respectively. To identify features useful for predicting carbonylation sites, the linear amino acid sequence was used both to build a predictive model from the training dataset and as a baseline for comparison with other feature types, including amino acid composition (AAC), amino acid pair composition (AAPC), position-specific scoring matrix (PSSM), positional weighted matrix (PWM), solvent-accessible surface area (ASA), and physicochemical properties (AAindex). The investigation of position-specific amino acid composition revealed that positively charged amino acids (K and R) are remarkably enriched around carbonylated sites, which may help discriminate carbonylation sites from non-carbonylation sites. A variety of predictive models were built using these features and three different machine learning methods. Based on five-fold cross-validation, the models trained with the PWM feature provided better sensitivity on the positive training data, whereas the models trained with the AAindex feature achieved higher specificity on the negative training data. Additionally, the models trained with hybrid features (PWM, AAC, and AAindex) achieved the best Matthews correlation coefficient (MCC) values: 0.432, 0.472, 0.443, and 0.467 for K, R, T, and P residues, respectively. CONCLUSION: Compared with an existing prediction tool, the selected models trained with hybrid features provided promising accuracy on an independent testing dataset. In short, this work not only characterized carbonylation substrate preferences but also demonstrated that the proposed method offers a feasible means of accelerating the preliminary discovery of protein carbonylation sites.
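The sensitivity, specificity, and MCC figures reported in this abstract all derive from a confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function name and the example counts are hypothetical and chosen only for illustration.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    tp/fn count carbonylation sites predicted correctly/incorrectly,
    tn/fp count non-carbonylation sites predicted correctly/incorrectly.
    """
    # Sensitivity: fraction of true sites that the model recovers.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites that the model rejects.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts from one cross-validation fold (not data from the study).
sens, spec, mcc = confusion_metrics(tp=40, fp=10, tn=40, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

MCC is a useful summary for site prediction because it stays informative when positive and negative sites are imbalanced, whereas accuracy alone can look high for a model that rarely predicts a site.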
Subjects
Amino Acids/chemistry, Chemical Phenomena, Protein Carbonylation, Amino Acid Sequence, Arginine/chemistry, Humans, Lysine/chemistry, Theoretical Models, Position-Specific Scoring Matrices, Proline/chemistry, Post-Translational Protein Processing, Proteins/chemistry, Reactive Oxygen Species/chemistry, Substrate Specificity, Threonine/chemistry
ABSTRACT
Given the increasing number of proteins reported to be regulated by S-nitrosylation (SNO), SNO is considered to act, in a manner analogous to phosphorylation, as a pleiotropic regulator that elicits dual effects on diverse pathophysiological processes by altering protein function, stability, and conformation in various cancers and human disorders. Because of its importance in regulating protein functions and cell signaling, dbSNO (http://dbSNO.mbc.nctu.edu.tw) has been extended as a resource for exploring the structural environment of SNO substrate sites and the regulatory networks of S-nitrosylated proteins. Growing interest in the structural environment of PTM substrate sites motivated us to map all manually curated SNO peptides (4165 SNO sites within 2277 proteins) to PDB protein entries by sequence identity, which provides information on spatial amino acid composition, solvent-accessible surface area, spatially neighboring amino acids, and side chain orientation for 298 substrate cysteine residues. Additionally, annotations of protein molecular functions, biological processes, functional domains, and human diseases are integrated to explore the functional and disease associations of the S-nitrosoproteome. In this update, users can search for a group of proteins/genes of interest, and the system reconstructs the SNO regulatory network from information on metabolic pathways and protein-protein interactions. Most importantly, an endogenous yet pathophysiological S-nitrosoproteomic dataset from colorectal cancer patients was used to demonstrate that dbSNO can discover potential SNO proteins involved in the regulation of NO signaling in cancer pathways.
Subjects
Protein Databases, Nitric Oxide/metabolism, Post-Translational Protein Processing, Amino Acids/chemistry, Animals, Disease, Humans, Internet, Metabolic Networks and Pathways, Mice, Protein Interaction Mapping, Proteins/chemistry, Proteins/metabolism, Rats, Signal Transduction
ABSTRACT
Poly-γ-glutamic acid (γ-PGA) is a biodegradable biopolymer produced by several bacteria, including Bacillus subtilis and other Bacillus species; it is biocompatible and non-toxic and has various potential biological applications in the food, pharmaceutical, cosmetic, and other industries. In this review, we describe the mechanisms of γ-PGA synthesis and gene regulation, its role in fermentation, and the phylogenetic relationships among various pgsBCAE (the γ-PGA biosynthesis gene cluster) and pgdS (a γ-PGA degradation gene) sequences. We also discuss potential applications of γ-PGA and highlight established genetically recombinant bacterial strains that produce high levels of γ-PGA, which can be useful for large-scale γ-PGA production.
Subjects
Bacillus/metabolism, Polyglutamic Acid/analogs & derivatives, Bacillus/classification, Bacillus/genetics, Fermentation, Bacterial Gene Expression Regulation, Industrial Microbiology/methods, Phylogeny, Polyglutamic Acid/biosynthesis, Polyglutamic Acid/genetics
ABSTRACT
BACKGROUND: Tuberculosis (TB) is a serious infectious disease in which 90% of those latently infected with Mycobacterium tuberculosis present no symptoms but have a 10% lifetime chance of developing active TB. To prevent the spread of the disease, early diagnosis is crucial. However, current detection methods require improvement in sensitivity, efficiency, or specificity. In the present study, we conducted a microarray experiment comparing gene expression profiles in peripheral blood mononuclear cells among individuals with active TB, individuals with latent infection, and healthy individuals in a Taiwanese population. RESULTS: Bioinformatics analysis revealed that most of the differentially expressed genes were involved in immune responses, inflammation pathways, and cell cycle control. Subsequent RT-PCR validation identified four differentially expressed genes, NEMF, ASUN, DHX29, and PTPRC, as potential biomarkers for the detection of active and latent TB infections. Receiver operating characteristic analysis showed that the expression level of PTPRC may discriminate active TB patients from healthy individuals, while ASUN could differentiate between latent TB infection and the healthy condition. In contrast, DHX29 may be used to identify latently infected individuals among active TB patients or healthy individuals. To test the concept of using these biomarkers as diagnostic support, we constructed classification models using these candidate biomarkers and found that the Naïve Bayes-based model built with ASUN, DHX29, and PTPRC yielded the best performance. CONCLUSIONS: Our study demonstrated that gene expression profiles in the blood can be used not only to identify active TB patients but also to differentiate latently infected patients from their healthy counterparts. Validation of the constructed computational model in a larger sample would confirm the reliability of the biomarkers and facilitate the development of a cost-effective and sensitive molecular diagnostic platform for TB.
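The receiver operating characteristic analysis mentioned in this abstract can be illustrated for a single biomarker. The sketch below computes the area under the ROC curve with the rank-based (Mann-Whitney) formulation; it is an illustration under stated assumptions, not the authors' pipeline, and the function name and expression values are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve for one marker.

    Equals the probability that a randomly chosen case has a higher
    score than a randomly chosen control, with ties counted as half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical relative expression levels (not data from the study).
active_tb = [3.0, 4.0, 5.0]
healthy = [1.0, 2.0, 3.0]
print(f"AUC = {roc_auc(active_tb, healthy):.3f}")
```

An AUC near 1.0 means the marker is consistently higher in cases than in controls, a value near 0.0 means it is consistently lower, and 0.5 means it does not separate the two groups.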
Subjects
Biomarkers/analysis, Latent Tuberculosis/diagnosis, Mycobacterium tuberculosis/genetics, Transcriptome, Tuberculosis/diagnosis, Bayes Theorem, Case-Control Studies, Gene Expression Profiling/methods, Humans, Latent Tuberculosis/genetics, Latent Tuberculosis/microbiology, Mononuclear Leukocytes/metabolism, Microarray Analysis, ROC Curve, Reproducibility of Results, Tuberculosis/genetics, Tuberculosis/microbiology
ABSTRACT
Transmembrane (TM) proteins have crucial roles in various cellular processes, and the location of post-translational modifications (PTMs) on TM proteins is associated with their functional roles. Given the importance of PTMs in the functioning of TM proteins, this study developed topPTM (available online at http://topPTM.cse.yzu.edu.tw), a new dbPTM module that provides a public resource for identifying functional PTM sites on TM proteins with structural topology. Experimentally verified TM topology data were integrated from TMPad, TOPDB, PDBTM, and OPM. In addition to the PTMs obtained from dbPTM, experimentally verified PTM sites were manually extracted from research articles by text mining. To provide a full investigation of PTM sites on TM proteins, all UniProtKB protein entries containing annotations related to membrane localization and TM topology were considered potential TM proteins. Two effective tools were then used to annotate the structural topology of the potential TM proteins. The TM topology of each protein is displayed graphically together with its PTM sites. To delineate the structural correlation between PTM sites and TM topologies, the tertiary structure of PTM sites on TM proteins is visualized with the Jmol program. Supported by manually curated research articles and by the investigation of domain-domain interactions in the Protein Data Bank, 1347 PTM substrate sites are associated with protein-protein interactions for 773 TM proteins. The database content is regularly updated as new data are published, through continuous surveys of research articles and available resources.
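Placing PTM sites on a TM topology, as described above, amounts to looking up each modified residue in a list of annotated sequence segments. The sketch below shows that lookup under assumed conventions: the segment tuples, region labels, and example sites are hypothetical and do not reflect the topPTM schema.

```python
def locate_site(topology, position):
    """Return the region label covering a 1-based residue position.

    topology is a list of (start, end, region) tuples with inclusive
    bounds; None is returned when no segment covers the position.
    """
    for start, end, region in topology:
        if start <= position <= end:
            return region
    return None


def annotate_sites(topology, sites):
    """Attach a topology region to each (position, ptm_type) pair."""
    return [(pos, ptm, locate_site(topology, pos)) for pos, ptm in sites]


# Hypothetical single-pass TM protein and PTM sites, for illustration only.
topology = [
    (1, 30, "extracellular"),
    (31, 53, "transmembrane"),
    (54, 120, "cytoplasmic"),
]
sites = [(12, "N-linked glycosylation"), (40, "palmitoylation"), (88, "phosphorylation")]

for pos, ptm, region in annotate_sites(topology, sites):
    print(f"{ptm} at residue {pos}: {region}")
```

A linear scan is enough for the handful of segments in a typical TM protein; a position that falls outside every annotated segment is reported as None rather than assigned to a guessed region.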