Results 1 - 20 of 49
1.
Int J Mol Sci ; 24(18)2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37762466

ABSTRACT

In flowering plants, C4 photosynthesis is superior to C3 type in carbon fixation efficiency and adaptation to extreme environmental conditions, but the mechanisms behind the assembly of C4 machinery remain elusive. This study attempts to dissect the evolutionary divergence from C3 to C4 photosynthesis in five photosynthetic model plants from the grass family, using a combined comparative transcriptomics and deep learning technology. By examining and comparing gene expression levels in bundle sheath and mesophyll cells of five model plants, we identified 16 differentially expressed signature genes showing cell-specific expression patterns in C3 and C4 plants. Among them, two showed distinctively opposite cell-specific expression patterns in C3 vs. C4 plants (named as FOGs). The in silico physicochemical analysis of the two FOGs illustrated that C3 homologous proteins of LHCA6 had low and stable pI values of ~6, while the pI values of LHCA6 homologs increased drastically in C4 plants Setaria viridis (7), Zea mays (8), and Sorghum bicolor (over 9), suggesting this protein may have different functions in C3 and C4 plants. Interestingly, based on pairwise protein sequence/structure similarities between each homologous FOG protein, one FOG PGRL1A showed local inconsistency between sequence similarity and structure similarity. To find more examples of the evolutionary characteristics of FOG proteins, we investigated the protein sequence/structure similarities of other FOGs (transcription factors) and found that FOG proteins have diversified incompatibility between sequence and structure similarities during grass family evolution. This raised an interesting question as to whether the sequence similarity is related to structure similarity during C4 photosynthesis evolution.


Subjects
Magnoliopsida , Setaria Plant , Sorghum , Zea mays/genetics , Photosynthesis/genetics
2.
Int J Mol Sci ; 24(4)2023 Feb 16.
Article in English | MEDLINE | ID: mdl-36835386

ABSTRACT

With climate change and labor shortages, direct-seeding rice cultivation is becoming popular worldwide, especially in Asia. Salinity stress negatively affects rice seed germination in the direct-seeding process, and the cultivation of suitable direct-seeding rice varieties under salinity stress is necessary. However, little is known about the underlying mechanism of salt responses during seed germination under salt stress. To investigate the salt tolerance mechanism at the seed germination stage, two contrasting rice genotypes differing in salt tolerance, namely, FL478 (salt-tolerant) and IR29 (salt-sensitive), were used in this study. We observed that, compared to IR29, FL478 was more tolerant to salt stress, with a higher germination rate. GD1 (germination defective 1), which is involved in seed germination by regulating alpha-amylase, was upregulated significantly in the salt-sensitive IR29 strain under salt stress during germination. Transcriptomic data showed that salt-responsive genes tended to be up/downregulated in IR29 but not in FL478. Furthermore, we investigated the epigenetic changes in FL478 and IR29 during germination under saline treatment using whole-genome bisulfite sequencing (BS-seq) technology. BS-seq data showed that the global CHH methylation level increased dramatically under salinity stress in both strains, and the hyper CHH differentially methylated regions (DMRs) were predominantly located within transposable element regions. Compared with FL478, differentially expressed genes with DMRs in IR29 were mainly related to gene ontology terms such as response to water deprivation, response to salt stress, seed germination, and response to hydrogen peroxide. These results may provide valuable insights into the genetic and epigenetic basis of salt tolerance at the seed germination stage, which is important for direct-seeding rice breeding.


Subjects
Oryza , Transcriptome , Oryza/genetics , Epigenome , Germination , Plant Breeding , Salt Stress , Genotype
3.
Int J Mol Sci ; 23(20)2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36293547

ABSTRACT

Proteins are modular functional units that regulate multiple cellular activities in prokaryotes and eukaryotes. As a consequence of higher plants adapting to arid and thermal conditions, C4 photosynthesis is a carbon fixation process involving multiple enzymes working in a coordinated fashion. However, how these enzymes interact with each other and whether they co-evolve in parallel to maintain interactions in different plants remain elusive to date. Here, we report our findings on the global protein co-evolution relationship and local dynamics of co-varying site shifts in key C4 photosynthetic enzymes. We found that in most of the selected key C4 photosynthetic enzymes, global pairwise co-evolution events exist to form functional couplings. In addition, protein-protein interactions between these enzymes may suggest unknown functionalities in the carbon delivery process. For the PEPC and PPCK regulation pair, pocket formation at the interactive interface is not necessary for function. This feature is distinct from another well-known regulation pair in C4 photosynthesis, namely, PPDK and PPDK-RP, where the pockets are necessary. Our findings facilitate the discovery of novel protein regulation types and contribute to expanding our knowledge about C4 photosynthesis.


Subjects
Carbon , Photosynthesis , Carbon/metabolism , Photosynthesis/physiology , Plants/metabolism , Carbon Cycle
4.
Development ; 149(18)2022 09 15.
Article in English | MEDLINE | ID: mdl-35950926

ABSTRACT

The morphology of the flowering plant is established during early embryogenesis. In recent years, many studies have focused on transcriptional profiling in plant embryogenesis, but the dynamic landscape of the Arabidopsis thaliana proteome remains elusive. In this study, Arabidopsis embryos at 2/4-cell, 8-cell, 16-cell, 32-cell, globular and heart stages were collected for nanoproteomic analysis. In total, 5386 proteins were identified. Of these, 1051 proteins were universally identified in all developmental stages and a range of 27 to 2154 proteins was found to be stage specific. These proteins could be grouped into eight clusters according to their expression levels. Gene Ontology enrichment analysis showed that genes involved in ribosome biogenesis and auxin-activated signalling were enriched during early embryogenesis, indicating that active translation and auxin signalling are important events in Arabidopsis embryo development. Combining RNA-sequencing data with the proteomics analysis, the correlation between mRNA and protein was evaluated. An overall positive correlation was found between mRNA and protein. This work provides a comprehensive landscape of the Arabidopsis proteome in early embryogenesis. Some important proteins/transcription factors identified through network analysis may serve as potential targets for future investigation.


Subjects
Arabidopsis Proteins , Arabidopsis , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Embryonic Development , Gene Expression Regulation, Plant , Indoleacetic Acids/metabolism , Proteome/metabolism , RNA/metabolism , RNA, Messenger/metabolism , Transcription Factors/metabolism
5.
Biology (Basel) ; 11(1)2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35053135

ABSTRACT

Tomato Fusarium wilt, caused by Fusarium oxysporum f. sp. lycopersici (Fol), is a destructive disease that threatens the agricultural production of tomatoes. In the present study, the biocontrol potential of strain KR2-7 against Fol was investigated through integrated genome mining and chemical analysis. Strain KR2-7 was identified as B. inaquosorum based on phylogenetic analysis. Through the genome mining of strain KR2-7, we identified nine antifungal and antibacterial compound biosynthetic gene clusters (BGCs) including fengycin, surfactin and Bacillomycin F, bacillaene, macrolactin, sporulation killing factor (skf), subtilosin A, bacilysin, and bacillibactin. The corresponding compounds were confirmed through MALDI-TOF-MS chemical analysis. The gene/gene clusters involved in plant colonization, plant growth promotion, and induced systemic resistance were also identified in the KR2-7 genome, and their related secondary metabolites were detected. In light of these results, the biocontrol potential of strain KR2-7 against tomato Fusarium wilt was identified. This study highlights the potential to use strain KR2-7 as a plant-growth promotion agent.

6.
Plant Physiol Biochem ; 168: 321-328, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34678644

ABSTRACT

ChIP-seq (Chromatin immunoprecipitation with sequencing) is the gold standard for determining genome-wide in vivo transcription factor binding sites, the first step for targets prediction and network construction. For non-model plants, it is challenging to perform ChIP-seq due to the difficulty in generating stable transgenic plants. AaHY5 is a positive regulator in artemisinin biosynthesis, whose detailed mode of action remains elusive. Here, we established a protoplast transformation procedure for Artemisia annua by optimizing different conditions in protoplast isolation and transfection. We then performed AaHY5 ChIP-seq based on the established transient expression system. Combining RNA-seq data for various tissues, we identified four transcription factors (one MYB and three WRKY family members) in AaHY5 targets that potentially regulated artemisinin biosynthesis. The three WRKY transcription factors could be induced by light and the overexpression of AaHY5 and upregulate two artemisinin biosynthetic genes, ADS and CYP71AV1. Furthermore, AaWRKY14 showed transcriptional activation activity on artemisinin biosynthetic gene CYP71AV1. Together, AaWRKY14 was identified as a potential transcription factor linking AaHY5 and the artemisinin biosynthetic gene regulation.


Subjects
Artemisia annua , Artemisinins , Artemisia annua/genetics , Artemisia annua/metabolism , Chromatin Immunoprecipitation Sequencing , Gene Expression Regulation, Plant , Plant Proteins/genetics , Plant Proteins/metabolism , Plants, Genetically Modified/metabolism
7.
Molecules ; 26(4)2021 Feb 23.
Article in English | MEDLINE | ID: mdl-33672342

ABSTRACT

Glandular trichome (GT) is the dominant site for artemisinin production in Artemisia annua. Several critical genes involved in artemisinin biosynthesis are specifically expressed in GT. However, the molecular mechanism of differential gene expression between GT and other tissue types remains elusive. Chromatin accessibility, defined as the degree to which nuclear molecules are able to interact with chromatin DNA, reflects gene expression capacity to a certain extent. Here, we investigated and compared the landscape of chromatin accessibility in Artemisia annua leaf and GT using the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) technique. We identified 5413 GT high accessible and 4045 GT low accessible regions, and these GT high accessible regions may contribute to GT-specific biological functions. Several GT-specific artemisinin biosynthetic genes, such as DBR2 and CYP71AV1, showed higher accessible regions in GT compared to that in leaf, implying that they might be regulated by chromatin accessibility. In addition, transcription factor binding motifs for MYB, bZIP, C2H2, and AP2 were overrepresented in the highly accessible chromatin regions associated with artemisinin biosynthetic genes in glandular trichomes. Finally, we proposed a working model illustrating the chromatin accessibility dynamics in regulating artemisinin biosynthetic gene expression. This work provided new insights into epigenetic regulation of gene expression in GT.


Subjects
Artemisia annua/metabolism , Artemisinins/metabolism , Chromatin/metabolism , Chromatin/genetics , Plant Leaves/metabolism
8.
Front Genet ; 11: 551787, 2020.
Article in English | MEDLINE | ID: mdl-33363566

ABSTRACT

As one of the most common malignant tumors worldwide, gastric adenocarcinoma (GC) and its prognosis are still poorly understood. Various genetic and epigenetic factors have been implicated in GC carcinogenesis. However, a comprehensive and in-depth investigation of epigenetic alteration in gastric cancer is still missing. In this study, we systematically investigated some key epigenetic features in GC, including DNA methylation and five core histone modifications. Data from The Cancer Genome Atlas Program and other studies (Gene Expression Omnibus) were collected, analyzed, and validated with multivariate statistical analysis methods. The landscape of epi-modifications in gastric cancer was described. Chromatin state transition analysis using a Hidden-Markov-Model based approach showed a histone marker shift in the gastric cancer genome, indicating that histone marks tend to label different sets of genes in GC compared to control. An additive effect of these epigenetic marks was observed by integrated analysis with gene expression data, suggesting epigenetic modifications may cooperatively regulate gene expression. However, the effect of DNA methylation was found to be more significant in the absence of the five histone modifications in our study. By constructing a PPI network, key genes to distinguish GC from normal samples were identified, and distinct patterns of oncogenic pathways in GC were revealed. Some of these genes can also serve as potential biomarkers to classify various GC molecular subtypes. Our results provide important insights into the epigenetic regulation in gastric cancer and other cancers in general. This study describes the aberrant epigenetic variation pattern in GC and provides potential direction for epigenetic biomarker discovery.

9.
Int J Mol Sci ; 21(17)2020 Aug 24.
Article in English | MEDLINE | ID: mdl-32846981

ABSTRACT

Long noncoding RNA (lncRNA)/microRNA (miRNA)/mRNA triplets contribute to cancer biology. However, identifying significant triplets remains a major challenge for cancer research. The dynamic changes among factors of the triplets are poorly understood. Here, by integrating target information and expression datasets, we proposed a novel computational framework to identify triplets termed "lncRNA-perturbated triplets". We applied the framework to five cancer datasets in The Cancer Genome Atlas (TCGA) project and identified 109 triplets. We showed that the paired miRNAs and mRNAs were widely perturbated by lncRNAs in different cancer types. LncRNA perturbators and lncRNA-perturbated mRNAs showed significantly higher evolutionary conservation than other lncRNAs and mRNAs. Importantly, the lncRNA-perturbated triplets exhibited high cancer specificity. The pan-cancer perturbator OIP5-AS1 had a higher expression level than the cancer-specific perturbators. These lncRNA perturbators were significantly enriched in known cancer-related pathways. Furthermore, among the 25 lncRNAs in the 109 triplets, lncRNA SNHG7 was identified as a stable potential biomarker in lung adenocarcinoma (LUAD) by combining the TCGA dataset and two independent GEO datasets. Results from cell transfection also indicated that overexpression of lncRNA SNHG7 and TUG1 enhanced the expression of the corresponding mRNA PNMA2 and CDC7 in LUAD. Our study provides a systematic dissection of lncRNA-perturbated triplets and facilitates our understanding of the molecular roles of lncRNAs in cancers.


Subjects
MicroRNAs/genetics , Neoplasms/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Transcriptome , A549 Cells , Adenocarcinoma/genetics , Adenocarcinoma/pathology , Biomarkers, Tumor/genetics , Datasets as Topic , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genomics/methods , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Neoplasms/pathology , Prognosis , Trinucleotide Repeats/genetics
10.
Biophys Chem ; 266: 106455, 2020 11.
Article in English | MEDLINE | ID: mdl-32835911

ABSTRACT

Identifying drug targets is one of the major tasks in drug discovery. As experimental identification of targets is rather challenging, development of computational methods is necessary for efficient identification of drug-target interaction. Traditional computational methods, such as docking, are based solely on chemical structure, which is not available for most targets. On the other hand, bioassay data might contain information helpful for prediction of drug-target interaction. In this study, a feature enrichment method integrating bioassay and chemical structure data was developed to predict drug-target interaction. Using a large-scale benchmark on the datasets, we demonstrated that the model adopting the integrated fingerprint outperformed the one using the chemical fingerprint. The influence of false positive hits in bioassays and of algorithm-related factors on model performance was also investigated. The results suggested that prediction using the integrated fingerprint was robust to false positive hits, the choice of classifiers, and different random splits of the datasets.


Subjects
Algorithms , Biological Assay , Computational Biology , Pharmaceutical Preparations/chemistry , Databases, Factual , Drug Discovery
11.
Molecules ; 24(13)2019 Jun 30.
Article in English | MEDLINE | ID: mdl-31262005

ABSTRACT

Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient when dealing with such problems where the data are imbalanced and features describing the chemical characteristic of ligands are high-dimensional. We here describe a machine learning algorithm LBS (local beta screening) for ligand-based virtual screening. The unique characteristic of LBS is that it quantifies the generalization ability of screening directly by a refined loss function, and thus can assess the risk of over-fitting accurately and efficiently for imbalanced and high-dimensional data in ligand-based virtual screening without the help of resampling methods such as cross validation. The robustness of LBS was demonstrated by a simulation study and tests on real datasets, in which LBS outperformed conventional algorithms in terms of screening accuracy and model interpretation. LBS was then used for screening potential activators of HIV-1 integrase multimerization in an independent compound library, and the virtual screening result was experimentally validated. Of the 25 compounds tested, six were proved to be active. The most potent compound in experimental validation showed an EC50 value of 0.71 µM.


Subjects
HIV Integrase/chemistry , HIV-1/enzymology , Machine Learning , Models, Chemical , Computer Simulation , Ligands
12.
BMC Cancer ; 19(1): 263, 2019 Mar 22.
Article in English | MEDLINE | ID: mdl-30902072

ABSTRACT

BACKGROUND: Lung adenocarcinoma is the most common type of lung cancer. Whole-genome sequencing studies have disclosed the genomic landscape of lung adenocarcinomas. However, it remains unclear whether the genetic alterations could guide prognosis prediction. Effective genetic markers and prediction models based on them are also lacking for prognosis evaluation. METHODS: We obtained the somatic mutation data and clinical data for 371 lung adenocarcinoma cases from The Cancer Genome Atlas. The cases were classified into two prognostic groups (3-year survival), and a comparison was performed between the groups for the somatic mutation frequencies of genes, followed by development of computational models to discriminate between the prognostic groups. RESULTS: Genes were found with higher mutation rates in the good (≥ 3-year survival) than in the poor (< 3-year survival) prognosis group of lung adenocarcinoma patients. Genes participating in cell-cell adhesion and motility were significantly enriched in the top gene list with mutation rate difference between the good and poor prognosis groups. Support Vector Machine models with the gene somatic mutation features could predict prognosis well, and the performance improved as feature size increased. An 85-gene model reached an average cross-validated accuracy of 81% and an Area Under the Curve (AUC) of 0.896 for the Receiver Operating Characteristic (ROC) curves. The model also exhibited good inter-stage prognosis prediction performance, with an average AUC of 0.846 for the ROC curves. CONCLUSION: The prognosis of lung adenocarcinomas is related to somatic gene mutations. The genetic markers could be used for prognosis prediction and furthermore provide guidance for personalized medicine.


Subjects
Adenocarcinoma of Lung/mortality , Biomarkers, Tumor/genetics , Lung Neoplasms/mortality , Models, Biological , Support Vector Machine , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/pathology , Adenocarcinoma of Lung/therapy , Computational Biology , Datasets as Topic , Feasibility Studies , Genomics/methods , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Lung Neoplasms/therapy , Mutation , Precision Medicine/methods , Prognosis , ROC Curve , Survival Analysis , Survival Rate
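
The AUC figures reported in the abstract above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The following is a minimal Python sketch of that rank-based calculation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not code or data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-rank formulation.

    labels: sequence of 0/1 class labels (1 = positive class)
    scores: sequence of classifier scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly;
    # a tie between a positive and a negative counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few hundred cases; a sort-based version would be preferable for large datasets.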
13.
Entropy (Basel) ; 21(8)2019 Aug 07.
Article in English | MEDLINE | ID: mdl-33267482

ABSTRACT

Analysis of high-dimensional data is a challenge in machine learning and data mining. Feature selection plays an important role in dealing with high-dimensional data for improvement of predictive accuracy, as well as better interpretation of the data. Frequently used evaluation functions for feature selection include resampling methods such as cross-validation, which show an advantage in predictive accuracy. However, these conventional methods are not only computationally expensive, but also tend to be over-optimistic. We propose a novel cross-entropy which is based on beta distribution for feature selection. In beta distribution-based cross-entropy (BetaDCE) for feature selection, the probability density is estimated by beta distribution and the cross-entropy is computed by the expected value of beta distribution, so that the generalization ability can be estimated more precisely than conventional methods where the probability density is learnt from data. Analysis of the generalization ability of BetaDCE revealed that it was a trade-off between bias and variance. The robustness of BetaDCE was demonstrated by experiments on three types of data. In the exclusive or-like (XOR-like) dataset, the false discovery rate of BetaDCE was significantly smaller than that of other methods. For the leukemia dataset, the area under the curve (AUC) of BetaDCE on the test set was 0.93 with only four selected features, which indicated that BetaDCE not only detected the irrelevant and redundant features precisely, but also more accurately predicted the class labels with a smaller number of features than the original method, whose AUC was 0.83 with 50 features. In the metabonomic dataset, the overall AUC of prediction with features selected by BetaDCE was significantly larger than that by the original reported method. Therefore, BetaDCE can be used as a general and efficient framework for feature selection.

14.
Plant Cell Environ ; 42(4): 1302-1317, 2019 04.
Article in English | MEDLINE | ID: mdl-30474863

ABSTRACT

Light is essential for plant establishment. Arabidopsis seedlings germinated in the dark cannot grow leaves and only have closed cotyledons. However, exogenous application of H2O2 can induce leaves (establishment) in the dark. Comparative transcriptomic analysis revealed that light-responsive genes were activated by H2O2 treatment. These genes are functionally correlated with photosynthesis, photorespiration, and components of the photosystem, such as antenna proteins and light-harvesting chlorophyll proteins. We further found that application of H2O2 facilitates the cell cycle by accelerating the G2-M checkpoint transition in the shoot apical meristem. The phytochrome-mediated light signalling pathway was also involved in the H2O2-facilitated establishment process. The constitutive photomorphogenesis 1 and phytochrome interacting factor 3 proteins were shown to be down-regulated by H2O2 treatment, which removed their inhibitory effects on photomorphogenesis in the dark. The crosstalk between oxidation and light signal pathways explains the mechanism by which H2O2 regulates plant dark establishment. The endogenous photorespiratory H2O2 production was mimicked by overexpression of glycolate oxidase genes and supplementation with the substrate glycolate. As expected, seedling establishment was also induced by the endogenously produced H2O2 under dark conditions. These findings also suggest that photorespiratory H2O2 production is at least partially involved in postgermination establishment.


Subjects
Arabidopsis/radiation effects , Hydrogen Peroxide/pharmacology , Seedlings/radiation effects , Signal Transduction/radiation effects , Alcohol Oxidoreductases/metabolism , Arabidopsis/drug effects , Arabidopsis/growth & development , Arabidopsis Proteins/metabolism , Cell Cycle/drug effects , Cell Cycle/radiation effects , Darkness , Flow Cytometry , Light , Microscopy, Confocal , Microscopy, Electron, Scanning , Plants, Genetically Modified , Real-Time Polymerase Chain Reaction , Seedlings/drug effects , Seedlings/growth & development , Signal Transduction/drug effects , Signal Transduction/physiology
15.
BMC Bioinformatics ; 18(1): 459, 2017 Oct 24.
Article in English | MEDLINE | ID: mdl-29065858

ABSTRACT

BACKGROUND: Pre-mRNA splicing is the removal of introns from precursor mRNAs (pre-mRNAs) and the concurrent ligation of the flanking exons to generate mature mRNA. This process is catalyzed by the spliceosome, where the splicing factor 1 (SF1) specifically recognizes the seven-nucleotide branch point sequence (BPS) and the U2 snRNP later displaces the SF1 and binds to the BPS. In mammals, the degeneracy of BPS motifs together with the lack of a large set of experimentally verified BPSs complicates the task of BPS prediction in silico. RESULTS: In this paper, we develop a simple and yet efficient heuristic model for human BPS prediction based on a novel scoring scheme, which quantifies the splicing strength of putative BPSs. The candidate BPS is restricted exclusively within a defined BPS search region to avoid the influences of other elements in the intron and therefore the prediction accuracy is improved. Moreover, using two types of relative frequencies for human BPS prediction, we demonstrate our model outperformed other current implementations on experimentally verified human introns. CONCLUSION: We propose that the binding energy contributes to the molecular recognition involved in human pre-mRNA splicing. In addition, a genome-wide human BPS prediction is carried out. The characteristics of predicted BPSs are in accordance with experimentally verified human BPSs, and branch site positions relative to the 3'ss and the 5'end of the shortened AGEZ are consistent with the results of published papers. Meanwhile, a webserver for BPS predictor is freely available at http://biocomputer.bio.cuhk.edu.hk/BPS .


Subjects
Models, Molecular , RNA, Messenger/metabolism , Exons , Humans , Introns , Protein Binding , RNA Precursors/metabolism , RNA Splicing , Ribonucleoprotein, U2 Small Nuclear/chemistry , Ribonucleoprotein, U2 Small Nuclear/metabolism , Thermodynamics
16.
Mol Biosyst ; 13(12): 2545-2550, 2017 Nov 21.
Article in English | MEDLINE | ID: mdl-28990628

ABSTRACT

Cysteine S-sulfenylation is a major type of posttranslational modification that contributes to protein structure and function regulation in many cellular processes. Experimental identification of S-sulfenylation sites is challenging, due to the low abundance of proteins and the inefficient experimental methods. Computational identification of S-sulfenylation sites is an alternative strategy to annotate the S-sulfenylated proteome. In this study, a novel computational predictor SulCysSite was developed for accurate prediction of S-sulfenylation sites based on multiple sequence features, including amino acid index properties, binary amino acid codes, position specific scoring matrix, and compositions of profile-based amino acids. To learn the prediction model of SulCysSite, a random forest classifier was applied. The final SulCysSite achieved an AUC value of 0.819 in a 10-fold cross-validation test. It also exhibited higher performance than other existing computational predictors. In addition, the hidden and complex mechanisms were extracted from the predictive model of SulCysSite to investigate the understandable rules (i.e. feature combination) of S-sulfenylation sites. The SulCysSite is a useful computational resource for prediction of S-sulfenylation sites. The online interface and datasets are publicly available at .


Subjects
Computational Biology/methods , Algorithms , Amino Acid Sequence , Humans , Position-Specific Scoring Matrices , Protein Processing, Post-Translational , Software , Support Vector Machine
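
The abstract above reports its AUC from a 10-fold cross-validation test, in which every sample is held out exactly once while the model is trained on the remaining folds. The following is a minimal Python sketch of how such a split can be generated. The function name `k_fold_indices`, the seed, and the sample count of 25 are illustrative assumptions and do not reflect the SulCysSite implementation or its datasets.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical sample count (assumption, not from the study).
splits = list(k_fold_indices(25, k=10))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

A performance figure such as AUC would then be computed on each held-out fold and averaged; for imbalanced classes a stratified split, which preserves the class ratio in each fold, is usually the safer choice.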
17.
Int J Nanomedicine ; 12: 6303-6315, 2017.
Article in English | MEDLINE | ID: mdl-28894368

ABSTRACT

Lysine succinylation, an important type of protein posttranslational modification, plays significant roles in many cellular processes. Accurate identification of succinylation sites can facilitate our understanding about the molecular mechanism and potential roles of lysine succinylation. However, even in well-studied systems, a majority of the succinylation sites remain undetected because the traditional experimental approaches to succinylation site identification are often costly, time-consuming, and laborious. In silico approach, on the other hand, is potentially an alternative strategy to predict succinylation substrates. In this paper, a novel computational predictor SuccinSite2.0 was developed for predicting generic and species-specific protein succinylation sites. This predictor takes the composition of profile-based amino acid and orthogonal binary features, which were used to train a random forest classifier. We demonstrated that the proposed SuccinSite2.0 predictor outperformed other currently existing implementations on a complementarily independent dataset. Furthermore, the important features that make visible contributions to species-specific and cross-species-specific prediction of protein succinylation site were analyzed. The proposed predictor is anticipated to be a useful computational resource for lysine succinylation site prediction. The integrated species-specific online tool of SuccinSite2.0 is publicly accessible.


Subjects
Computational Biology/methods , Lysine/metabolism , Proteins/metabolism , Succinic Acid/metabolism , Computer Simulation , Databases, Protein , Lysine/chemistry , Protein Processing, Post-Translational , Proteins/chemistry , Software , Species Specificity
18.
Mol Cell Proteomics ; 16(10): 1815-1828, 2017 10.
Article in English | MEDLINE | ID: mdl-28827280

ABSTRACT

Protein cysteinyl residues are the mediators of hydrogen peroxide (H2O2)-dependent redox signaling. However, site-specific mapping of the selectivity and dynamics of these redox reactions in cells poses a major analytical challenge. Here we describe a chemoproteomic platform to systematically and quantitatively analyze the reactivity of thousands of cysteines toward H2O2 in human cells. We identified >900 H2O2-sensitive cysteines, which are defined as the H2O2-dependent redoxome. Although redox sites associated with antioxidative and metabolic functions are consistent, most of the H2O2-dependent redoxome varies dramatically between different cells. Structural analyses reveal that H2O2-sensitive cysteines are less conserved than their redox-insensitive counterparts and display distinct sequence motifs, structural features, and potential for crosstalk with lysine modifications. Notably, our chemoproteomic platform also provides an opportunity to predict oxidation-triggered protein conformational changes. The data are freely accessible as a resource at http://redox.ncpsb.org/OXID/.


Subjects
Cysteine/chemistry , Hydrogen Peroxide/chemistry , Proteome/analysis , Proteomics/methods , Amino Acid Motifs , Cell Line, Tumor , Computer Simulation , Cysteine/analysis , HEK293 Cells , Hep G2 Cells , Humans , Hydrogen Peroxide/analysis , Lysine/analysis , Lysine/chemistry , Oxidation-Reduction , Protein Conformation , Proteome/chemistry
19.
Bioinformatics ; 33(20): 3166-3172, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28633445

ABSTRACT

MOTIVATION: Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can detect only a small fraction of the branch points because they are limited by sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. RESULTS: Here we propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/zhqingit/BPP. CONTACT: djguo@cuhk.edu.hk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , RNA Splicing , Sequence Analysis, RNA/methods , Software , Algorithms , Genome, Human , Humans
20.
BMC Genomics ; 18(1): 279, 2017 04 04.
Article in English | MEDLINE | ID: mdl-28376774

ABSTRACT

BACKGROUND: Disulfide bonds are traditionally considered to play only structural roles. In recent years, increasing evidence has suggested that the disulfide proteome is made up of structural disulfides and reversible disulfides. Unlike structural disulfides, reversible disulfides usually have important functional roles and may serve as redox switches. Interestingly, only specific disulfide bonds are reversible while others are not. However, whether reversible disulfides can be predicted from structural information remains largely unknown. METHODS: In this study, two datasets with both types of disulfides were compiled using independent approaches. By comparing various features extracted from the local structural signatures, we identified several features that differ significantly between reversible and structural disulfides, including disulfide bond length, along with the number, amino acid composition, secondary structure and physical-chemical properties of surrounding amino acids. An SVM-based classifier was developed for predicting reversible disulfides. RESULTS: By 10-fold cross-validation, the model achieved an accuracy of 0.750, sensitivity of 0.352, specificity of 0.953, MCC of 0.405 and AUC of 0.751 on the RevSS_PDB dataset. Its robustness was further validated using RevSS_RedoxDB as an independent testing dataset. The model was applied to proteins with known structures in the PDB database. The results show that one third of the predicted reversible-disulfide-containing proteins are well-known redox enzymes, while the remainder are non-enzyme proteins. Given that reversible disulfides are frequently reported in functionally important non-enzyme proteins such as transcription factors, the predictions may provide valuable candidates of novel reversible disulfides for further experimental investigation. CONCLUSIONS: This study provides the first comparative analysis of reversible and structural disulfides. Features that differ markedly between these two groups of disulfides were identified, and an SVM-based classifier for predicting reversible disulfides was developed accordingly. A web server named RevssPred can be accessed freely at http://biocomputer.bio.cuhk.edu.hk/RevssPred.


Subjects
Cystine/chemistry , Proteins/chemistry , Software , Amino Acid Sequence , Computer Simulation , Humans , Models, Molecular , Protein Structure, Secondary , ROC Curve , Support Vector Machine