ABSTRACT
BACKGROUND: Protein phosphorylation is one of the most prevalent posttranslational modifications involved in molecular control of cellular processes, and is mediated by over 520 protein kinases in humans and other mammals. Identifying the protein kinases responsible for phosphorylation events is key to understanding signaling pathways. Unbiased phosphoproteomics experiments have generated a wealth of data that can be used to identify protein kinase targets and their preferred substrate sequences.
METHODS: This study used prior data from mass spectrometry-based studies that identified sites of protein phosphorylation after in vitro incubation of protein mixtures with recombinant protein kinases. PTM-Logo software was used with these data to generate position-dependent Shannon information matrices and sequence motif 'logos'. Webpages were constructed to give easy access to the logos for each kinase, and a new stand-alone application was written in Python that uses the position-dependent Shannon information matrices to identify the kinases most likely to phosphorylate a particular phosphorylation site.
RESULTS: A database of kinase substrate target preference logos allows browsing, searching, or downloading target motif data for each protein kinase (https://esbl.nhlbi.nih.gov/Databases/Kinase_Logos/). These logos were combined with phylogenetic analysis of protein kinase catalytic sequences to reveal substrate preference patterns specific to particular groups of kinases (https://esbl.nhlbi.nih.gov/Databases/Kinase_Logos/KinaseTree.html). A stand-alone program, KinasePredictor, is provided (https://esbl.nhlbi.nih.gov/Databases/Kinase_Logos/KinasePredictor.html). It takes as input the amino-acid sequence surrounding a given phosphorylation site and generates a ranked list of the protein kinases most likely to phosphorylate that site.
CONCLUSIONS: This study provides three new resources for protein kinase characterization. It provides a tool for predicting kinase-substrate interactions which, in combination with other types of data (e.g., co-localization), can help predict which kinases are likely responsible for a given phosphorylation event in a given tissue.
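The idea of ranking kinases with position-dependent information matrices can be illustrated with a minimal Python sketch. It is not the KinasePredictor implementation: the function names (`information_matrix`, `rank_kinases`), the uniform background frequencies, the add-one pseudocount, the additive score, and the toy substrate peptides for the hypothetical "KinaseA" and "KinaseB" are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def information_matrix(peptides, background=None):
    """Build a per-position, per-residue information matrix (in bits).

    Each entry is p * log2(p / q), where p is the pseudocounted residue
    frequency at that position and q is the background frequency.
    Assumption: a uniform background unless one is supplied.
    """
    if background is None:
        background = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts[aa] for aa in AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            # Add-one pseudocount so unseen residues do not give log(0).
            freq = (counts[aa] + 1) / (total + len(AMINO_ACIDS))
            column[aa] = freq * math.log2(freq / background[aa])
        matrix.append(column)
    return matrix


def score(query, matrix):
    """Sum the matrix entries selected by the query residues."""
    if len(query) != len(matrix):
        raise ValueError("query length must match the matrix length")
    # Residues outside the 20 standard amino acids contribute nothing.
    return sum(col.get(aa, 0.0) for aa, col in zip(query, matrix))


def rank_kinases(query, matrices):
    """Return (kinase, score) pairs sorted from best to worst match."""
    scored = [(name, score(query, m)) for name, m in matrices.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 7-residue windows centered on the phosphorylated S/T.
toy_substrates = {
    "KinaseA": ["RRRSLAA", "GRRSVGA", "ARKSLPA", "LRRTLAG"],
    "KinaseB": ["AGLSPEK", "GAVTPGR", "LPASPAK", "EGTSPLR"],
}
matrices = {k: information_matrix(v) for k, v in toy_substrates.items()}

for kinase, value in rank_kinases("ARRSLGA", matrices):
    print(f"{kinase}\t{value:.3f}")
```

In this toy setup, a query with arginines upstream of the phosphosite ranks the hypothetical KinaseA first, while a query with proline at position +1 favors KinaseB. A real analysis would use the published matrices and a measured background distribution rather than these invented peptides.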