GlycoMine: a machine learning-based approach for predicting N-, C- and O-linked glycosylation in the human proteome.
Li, Fuyi; Li, Chen; Wang, Mingjun; Webb, Geoffrey I; Zhang, Yang; Whisstock, James C; Song, Jiangning.
Affiliation
  • Li F; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
  • Li C; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
  • Wang M; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
  • Webb GI; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
  • Zhang Y; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
  • Whisstock JC; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
  • Song J; College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology,
Bioinformatics; 31(9): 1411-9, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25568279
ABSTRACT
MOTIVATION:

Glycosylation is a ubiquitous type of protein post-translational modification (PTM) in eukaryotic cells, which plays vital roles in various biological processes (BPs) such as cellular communication, ligand recognition and subcellular recognition. It is estimated that >50% of the entire human proteome is glycosylated. However, it is still a significant challenge to identify glycosylation sites, which requires expensive/laborious experimental research. Thus, bioinformatics approaches that can predict the glycan occupancy at specific sequons in protein sequences would be useful for understanding and utilizing this important PTM.

RESULTS:

In this study, we present GlycoMine, a comprehensive bioinformatics tool for the systematic in silico identification of C-linked, N-linked, and O-linked glycosylation sites in the human proteome. GlycoMine was developed using the random forest algorithm and evaluated on a carefully prepared, up-to-date benchmark dataset, curated from multiple public resources, that encompasses all three types of glycosylation sites. Heterogeneous sequence and functional features were derived from various sources and subjected to two-step feature selection to obtain a condensed subset of optimal features that contributed most to the type-specific prediction of glycosylation sites. Five-fold cross-validation and independent tests show that this approach significantly improved prediction performance compared with four existing prediction tools: NetNGlyc, NetOGlyc, EnsembleGly and GPP. We demonstrated that this tool could identify candidate glycosylation sites in case study proteins, and applied it to identify many high-confidence glycosylation target proteins by screening the entire human proteome.

AVAILABILITY AND IMPLEMENTATION:

The webserver, Java Applet, user instructions, datasets, and predicted glycosylation sites in the human proteome are freely available at http://www.structbioinfor.org/Lab/GlycoMine/.

CONTACT:

Jiangning.Song@monash.edu or James.Whisstock@monash.edu or zhangyang@nwsuaf.edu.cn

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
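The abstract frames prediction as scoring candidate sites (sequons) in a protein sequence. As a rough illustration of that first step only, the sketch below scans a sequence for the canonical N-linked sequon (Asn, any residue except Pro, then Ser or Thr) and cuts a fixed-length window around each candidate Asn, the kind of input from which sequence features are commonly derived. It is not GlycoMine's implementation: the function name `candidate_windows`, the flank size, the padding character and the demo sequence are all assumptions made for this example.

```python
import re

# Canonical N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead consumes only the Asn, so overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def candidate_windows(sequence, flank=7, pad="-"):
    """Return (1-based position, window) pairs for each candidate Asn.

    `flank` residues are kept on each side of the Asn; `pad` fills
    positions that fall outside the sequence. Both are illustrative
    defaults, not values taken from the paper.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for match in SEQUON.finditer(sequence):
        i = match.start()  # 0-based index of the Asn in `sequence`
        # In `padded`, the Asn sits at index i + flank, so the window
        # starting at i is centred on it.
        windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    demo = "MKNGSAPNPTLLNNSSQ"  # made-up sequence, for illustration only
    for position, window in candidate_windows(demo, flank=3):
        print(position, window)
```

Each window would then be turned into a feature vector and passed to a classifier such as a random forest; that stage is omitted here because the abstract does not specify the features or parameters used.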
Subject(s)

Full text: 1 | Databases: MEDLINE | Main subject: Software / Artificial Intelligence / Protein Processing, Post-Translational / Proteome | Study type: Prognostic_studies / Risk_factors_studies | Language: English | Journal: Bioinformatics | Year: 2015 | Document type: Article
