Review: Biochemistry

Revisiting protein kinase–substrate interactions: Toward therapeutic development

Sci. Signal.  22 Mar 2016:
Vol. 9, Issue 420, pp. re3
DOI: 10.1126/scisignal.aad4016

Gloss

Protein phosphorylation is a common posttranslational modification and is involved in many physiological and pathophysiological processes. The deregulation of protein kinase activities can lead to cellular transformation and cancer, among other diseases. Thus, kinases are good drug targets. Understanding how kinases interact with their substrates may elucidate processes that lead to disease, as well as aid in the development of better, more specific kinase inhibitors with improved clinical success. In this Review, which contains 4 figures, 2 tables, and 129 references, we summarize the advances in understanding how kinases physically interact with their substrates and discuss the technologies to detect kinase substrates.

Abstract

Despite the efforts of pharmaceutical companies to develop specific kinase modulators, few drugs targeting kinases have been completely successful in the clinic. This is primarily due to the conserved nature of kinases, especially in the catalytic domains. Consequently, many currently available inhibitors lack sufficient selectivity for effective clinical application. Kinases phosphorylate their substrates to modulate their activity. One of the important steps in the catalytic reaction of protein phosphorylation is the correct positioning of the target residue within the catalytic site. This positioning is mediated by several regions in the substrate binding site, which is typically a shallow crevice that has critical subpockets that anchor and orient the substrate. The structural characterization of this protein-protein interaction can aid in the elucidation of the roles of distinct kinases in different cellular processes, the identification of substrates, and the development of specific inhibitors. Because the region of the substrate that is recognized by the kinase can be part of a linear consensus motif or a nonlinear motif, advances in technology beyond simple linear sequence scanning for consensus motifs were needed. Cost-effective bioinformatics tools are already frequently used to predict kinase-substrate interactions for linear consensus motifs, and new tools based on the structural data of these interactions improve the accuracy of these predictions and enable the identification of phosphorylation sites within nonlinear motifs. In this Review, we revisit kinase-substrate interactions and discuss the various approaches that can be used to identify them and analyze their binding structures for targeted drug development.

The Catalytic Domain of Eukaryotic Protein Kinases

Typically, eukaryotic protein kinases are composed of nonconserved regulatory domains and a conserved catalytic core of ~250 amino acid residues that binds and anchors substrates and is responsible for catalysis (1). The catalytic domain consists of two lobes called N and C (also known as small and large lobes, respectively), named for their N- or C-terminal position, respectively, within the domain. The N-lobe consists of five-stranded, anti-parallel β sheets that are an essential part of the adenosine triphosphate (ATP) binding site, whereas the C-lobe is mostly helical (Fig. 1A). The active-site cleft, which contains the ATP binding site, lies between the two lobes (2). In an activated kinase, the lobes converge to form a deep cleft where the adenine ring of ATP binds such that the γ-phosphate is positioned at the outer edge where the transfer of the phosphoryl group occurs, whereas the adenosine moiety is buried in a hydrophobic region of the pocket (Fig. 1B). Adjacent to the ATP binding pocket is a shallow crevice called the substrate binding site (SBS) that anchors the substrate and correctly positions the phosphorylatable residue (2).

Fig. 1

Structure of the catalytic domain of protein kinases. (A) Classical division of the catalytic domain, showing a small N-terminal lobe (blue) and the C-terminal lobe (red). (B) Same structure as in (A) highlighting the functional aspects of the catalytic domain. The ATP binding site (green) is present in the cleft between the lobes. The SBS (yellow) interacts directly with the substrate (purple), aiding the selectivity process. PKI, protein kinase inhibitor.

Catalysis is mediated by opening and closing of this active-site cleft. Substrates are anchored and positioned near this cleft so that the hydroxyl group of the phosphorylatable residue (termed P0) can accept the γ-phosphate. Flanking regions help stabilize the active kinase and are also essential for catalysis (1). Tyrosine kinases have a deeper cleft around P0 than serine/threonine (Ser/Thr) kinases, which better accommodates the bulky tyrosine side chain (3).

An increase in the catalytic activity of kinases often leads to cancer (4); therefore, their activation must be tightly regulated, and several regulatory mechanisms keep kinases inactive. One such mechanism involves intramolecular interactions that hide the catalytic domain of a kinase (5, 6). Activation then involves conserved movements of the activation loop and DFG (Asp-Phe-Gly) motif that expose the ATP binding pocket (described further below). Other regulatory mechanisms are kinase-specific and involve blocking the SBS. For example, protein kinase C (PKC) presents a pseudosubstrate region in its regulatory domain that interacts with the SBS in the catalytic domain (5), helping to maintain the inactive state of the kinase. This pseudosubstrate region is similar to PKC substrates except that the phosphorylatable residues Ser and Thr are replaced by Ala. For protein kinase A (PKA), a pseudosubstrate also helps keep the kinase inactive; however, in this case, regulatory domains are encoded by separate genes (5). The SH2 and SH3 domains found in the regulatory domain of the nonreceptor tyrosine kinase SRC interact with the catalytic domain, thereby keeping the kinase inactive. Dephosphorylation of a Tyr residue that interacts with the SH2 domain disrupts this intramolecular interaction and is an important step for SRC activation (6). Activation of the receptor-coupled tyrosine kinases typically occurs upon binding of a ligand and dimerization of the receptor, followed by conformational changes that make the catalytic domain available to interact with the substrates. Conformational changes are usually followed by a series of phosphorylations that lead to kinase activation (1).

Active kinases form what is called a regulatory hydrophobic spine (R-spine) that is assembled after the phosphorylation of the activation loop. This spine is composed of two residues from the N-lobe and two from the C-lobe (1). A catalytic spine (C-spine) is assembled upon ATP binding. Thus, kinase activation involves assembly of R- and C-spines, and inactivation involves disassembly of the R-spine. Once the R- and C-spines have been formed, kinases are termed “primed” for catalysis, and binding to scaffold proteins and substrates is enhanced (4). The activation loop, which in many kinases is the site of regulatory phosphorylations or interactions with activity modulators, shows considerable structural diversity (7). Many kinases are activated by phosphorylation of the activation loop, which is then released from the active site, thereby enabling substrate binding. Some kinases require more than one phosphorylation in the activation loop, and both auto- and heterophosphorylation of the activation loop may occur (4). The DFG motif lies at the N terminus of the activation loop. In some kinases (such as ABL), this motif is flipped outward in the inactive kinase (referred to as “DFG out”), exposing a hydrophobic (allosteric) pocket that can be a binding site for drugs, and flipped inward in the active kinase (8). A “gatekeeper” residue helps regulate whether DFG is “in” or “out.” Mutations in the gatekeeper residue can lead to a constitutive activation of the kinase by changing the position of the activation loop and stabilizing the hydrophobic spine (9). The gatekeeper residue may also influence substrate specificity (10). Kinase activation ultimately serves to bring into position residues involved in catalysis and substrate binding that may be distant in the primary sequence.

Substrate Recognition by the SBS

Most substrates are anchored by binding to the C-lobe to facilitate phosphoryl group transfer. Furthermore, substrate anchoring is an important factor for determining kinase-substrate specificity and occurs mainly through electrostatic interactions. Variability between protein kinases is found in differences in charge and hydrophobicity of surface residues in the SBS of the catalytic domain, and this variability contributes to the specificity of kinase-substrate interactions (5). Analysis of the sequence surrounding the phosphorylated residue (referred to as P0) and use of synthetic peptides have shown that Ser/Thr kinases specifically recognize residues surrounding the P0 residue. These residues from the N to C terminus are named according to their positions relative to P0, namely P-3, P-2, P-1 and P1, P2, P3 (11), confer substrate specificity, and aid substrate anchoring. Thus, linear consensus motifs recognized by kinases have been established for most of the protein kinases by analysis of the primary structure of proteins and experimentation with synthetic peptide libraries (12). In an elegant study, Alexander and collaborators (13) used positional scanning–oriented peptide library screening (PS-OPLS) to independently test the position of each amino acid in a specific sequence and determine the consensus sequence preferably recognized by several mitotic kinases. These sequences were further validated as to their presence in the linear sequence of identified substrates. The authors demonstrated that despite overlapping localizations of the kinases, substrate recognition of specific motifs is particular to the different mitotic kinases, thus further strengthening the fact that kinases recognize specific consensus sequences (13).
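The position nomenclature described above can be illustrated with a short sketch. The function name `label_positions` and its parameters are hypothetical and introduced only for illustration; the example peptide is Kemptide (LRRASLG), a widely used synthetic PKA substrate peptide.

```python
def label_positions(sequence, p0_index, flank=3):
    """Map position labels (P-3 ... P3) to residues around the phosphoacceptor.

    sequence  : one-letter amino acid string, N to C terminus
    p0_index  : 0-based index of the phosphorylatable residue (P0)
    flank     : number of residues to label on each side (illustrative default)
    """
    if sequence[p0_index] not in "STY":
        raise ValueError("P0 must be Ser, Thr, or Tyr")
    labels = {}
    for offset in range(-flank, flank + 1):
        i = p0_index + offset
        if 0 <= i < len(sequence):  # skip positions beyond the peptide ends
            label = "P0" if offset == 0 else f"P{offset}"
            labels[label] = sequence[i]
    return labels


# Kemptide: the serine at index 4 is P0.
print(label_positions("LRRASLG", 4))
# {'P-3': 'R', 'P-2': 'R', 'P-1': 'A', 'P0': 'S', 'P1': 'L', 'P2': 'G'}
```

In this peptide the two arginines occupy P-3 and P-2, the positions that the basic consensus motifs discussed below depend on.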

There are five categories of Ser/Thr kinase substrates based on consensus recognition motifs formed by basic, prolyl, acidic, or hydrophobic residues or even previously phosphorylated seryl, threonyl, and tyrosyl residues (14). The active site interacts with four residues on either side of the phosphorylated P0 residue. More distant residues interact with regions outside the active site (3) but still may contribute to substrate specificity, as has been shown for glycogen synthase kinase 3 (GSK3) (15). These studies suggested that anchoring residues other than the phosphorylatable residue are important to achieve kinase-substrate specificity and affinity (15).

Linear consensus motifs recognized by tyrosine kinases are still being determined. The search for linear consensus motifs has been widely used as a strategy to find kinase-specific substrates and is further discussed below. For example, a crystal structure of PKA [Protein Data Bank (PDB): 3FJQ] (16) interacting with a pseudosubstrate peptide containing a linear consensus motif for PKA formed by flanking basic amino acids ([K/R][K/R]X[S/T]) is depicted in Fig. 2A. However, substrates frequently do not contain linear consensus sequences; about 13% of the PKA substrates reported in PhosphoSitePlus database do not contain a linear consensus sequence (17). Mutations or deletions of regions far from the phosphorylatable residue in casein kinase 2 affect autophosphorylation (18), and mutation of noncontiguous regions in the casein kinase 2 substrate 46-kD mannose 6-phosphate receptor (MPR 46) also affected substrate phosphorylation (19). Mutation of a linear consensus motif for PKA does not prevent the cyclic adenosine monophosphate (cAMP)–dependent PKA phosphorylation of acetylcholinesterase, suggesting that PKA recognized nonlinear consensus motifs (20). Site-directed mutagenesis of the basic residues (Lys163 and Lys164) located far from a phosphorylated threonine (Thr253) in α-tubulin decreases phosphorylation of this residue by PKC, and structural analysis shows that these lysines form a PKC consensus phosphorylation site resembling a linear consensus named “structurally formed consensus motif” (Fig. 2B). Additionally, experimentally validated sites phosphorylated by PKA (17) deposited in the PhosphoSitePlus data bank that did not contain a linear consensus motif were modeled and suggested to also contain structurally formed consensus motifs (17).
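A linear consensus search of the kind mentioned above can be sketched as a regular expression scan. This is a minimal illustration of the [K/R][K/R]X[S/T] pattern only; the function name `find_linear_sites` is hypothetical, and real prediction tools use more elaborate scoring than an exact pattern match.

```python
import re

# [K/R][K/R]X[S/T]: two basic residues, any residue, then the phosphoacceptor.
# The lookahead allows overlapping candidate sites to be reported.
PKA_MOTIF = re.compile(r"(?=([KR][KR].[ST]))")


def find_linear_sites(sequence):
    """Return 0-based indices of Ser/Thr residues that sit at P0 of the motif."""
    return [m.start() + 3 for m in PKA_MOTIF.finditer(sequence)]


print(find_linear_sites("LRRASLG"))  # [4]  (Kemptide, a synthetic PKA substrate)
print(find_linear_sites("GAGSAG"))   # []   (toy sequence without flanking basic residues)
```

A scan like this, by construction, cannot find sites whose anchoring basic residues are far away in the primary sequence, which is the limitation discussed in this section.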

Fig. 2

Examples of protein kinases interacting with substrates with linear and structurally formed consensus sites. (A) Crystal structure of PKA (PDB: 3FJQ; gray) (16) interacting with a pseudosubstrate peptide that presents the linear consensus (yellow). The red spots on the surface of PKA highlight acidic residues that interact with the P-2 and P-3 positions of the substrate (blue sticks, both arginines), which determine the PKA/PKC consensus phosphorylation motif as [K/R][K/R]X[S/T]. The purple spot on the surface denotes the catalytic residue, and the orange mark on the substrate denotes the position of the phosphorylated residue (P0). (B) Model of PKC (gray) interacting with a known substrate (α-tubulin, blue). Acidic residues that interact with the basic residues on the substrate are depicted in the SBS (red). The phosphorylatable residue, Thr253 (orange) of α-tubulin is not found in a PKA/PKC phosphorylation motif. Structural analyses of this substrate revealed that basic residues Lys163 and Lys164 (blue sticks) are close to Thr253, presenting a spatial conformation similar to the linear substrate.

The idea that phosphorylation occurs mainly in unstructured, flexible regions has been proposed as an explanation for the lack of linear consensus sites because unstructured regions could easily be accommodated in the catalytic site (21), but this notion should be revisited. Recently, it was shown that 37% of reported phosphorylation sites occur in structured regions (22). Thus, the concept of a conformational motif, seen only in structured proteins, in which noncontiguous residues come close together to form conformations similar to those found in linear consensus sites, can at least in part explain kinase interactions with substrates lacking linear consensus motifs. This view reinforces the importance of anchoring residues in determining substrate specificity and in catalysis itself and suggests that phosphorylation in structured, less flexible regions of proteins may also have important functions and deserves more attention. By taking three-dimensional structures into consideration, detection and validation of substrates containing “conformationally formed consensus motifs” will aid in developing new software that attributes specific kinases to substrates or predicts phosphorylation sites (as discussed below).
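The notion of a structurally formed consensus motif can be made concrete with a small geometric sketch: look for basic residues that are far from the phosphoacceptor in sequence but close to it in space. Everything here is an assumption made for illustration, including the function name, the 10 Å distance cutoff, the minimum sequence separation, and the toy hairpin coordinates; a real analysis would use atomic coordinates from a solved or modeled structure.

```python
import math


def spatial_basic_neighbors(residues, coords, p0_index, cutoff=10.0, min_separation=5):
    """Indices of Lys/Arg residues near P0 in space but distant in sequence.

    residues       : one-letter amino acid string
    coords         : list of (x, y, z) tuples, one per residue (for example C-alpha atoms)
    cutoff         : distance threshold in angstroms (hypothetical value)
    min_separation : minimum sequence distance that counts as "nonlinear" (hypothetical)
    """
    hits = []
    for i, (aa, xyz) in enumerate(zip(residues, coords)):
        far_in_sequence = abs(i - p0_index) >= min_separation
        if aa in "KR" and far_in_sequence:
            if math.dist(xyz, coords[p0_index]) <= cutoff:
                hits.append(i)
    return hits


# Toy hairpin: the chain runs out along y = 0 and folds back along y = 5,
# bringing the C-terminal Thr next to the two N-terminal Lys residues.
residues = "KKAAAAAAAAT"
coords = [(3.8 * i, 0.0, 0.0) for i in range(6)] + \
         [(3.8 * (5 - j), 5.0, 0.0) for j in range(5)]

print(spatial_basic_neighbors(residues, coords, p0_index=10))  # [0, 1]
```

The toy sequence contains no linear [K/R][K/R]X[S/T] match, yet the two lysines end up within the cutoff of the threonine, which is the situation described above for Lys163, Lys164, and Thr253 in α-tubulin.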

Substrate-Anchoring Domains in Kinases

Substrate recruitment by interactions with other regions of the kinase, mainly but not exclusively in the regulatory domain, has also been suggested to be important for kinase-substrate interactions (23). Substrate binding regions that are not in the active site (though often present in the catalytic domain) are frequently called “docking domains” or docking sites. Docking sites aid in substrate anchoring to and recognition specificity by the kinase. In some cases, substrate docking may also have an allosteric effect (24). These docking sites in kinases often bind short peptide motifs (such as SH2, SH3, PH, and C2 domains) (25) that mediate specific protein-protein interactions (PPIs). Besides the target substrate, these interactions at docking sites often involve scaffold proteins or phosphatases. Interactions with scaffold proteins increase the local concentration of substrates (discussed in more detail below), whereas interactions with phosphatases help sequester inactive kinases in the cytoplasm and may compete with substrates for kinase binding (26). Sites other than the active site that interact with substrates and are important for substrate recognition and specificity are observed in many kinases, particularly members of the mitogen-activated protein kinases (MAPKs), such as extracellular signal–regulated kinases 1 and 2 (ERK1/2), p38, c-Jun N-terminal kinase (JNK), and ERK5.

The MAPK pathway is composed of (i) MAP kinase kinase kinase (MAP3K), (ii) MAP kinase kinase (MAP2K), and (iii) MAPK. Dual phosphorylation of Tyr and Thr residues at the activation loop activates MAPKs, and inactivation is mediated by several phosphatases. This family of kinases contains common docking (CD) motifs, usually found in close proximity of the catalytic site, that interact with D-motifs (also known as Kim motifs, D boxes, or DEJL domains) that are found in their specific substrates and phosphatases. D-motifs are 13 to 16 amino acids long and composed of a cluster of positively charged residues surrounded by hydrophobic residues (27), namely, (R/K)2–3-X2–6-UA-X-UB, where the ranges 2–3 and 2–6 give the number of repeated residues, X is any residue, and U is any hydrophobic residue (28). Thus, interaction with the D-motif is mainly hydrophobic and electrostatic. Recognition of the D-motif is one of the factors determining substrate specificity for the kinases in the MAPK signaling cascade, and most of the proteins interacting with MAPKs contain D-motifs (28). Peptides derived from different MAPK substrates, including the kinases MAPKAP (MAPK-activated protein) kinase 2 (MK2, a p38α substrate), MAPK-interacting Ser/Thr protein kinase 1 (MNK1, an ERK1 and p38α substrate), and the transcription factor nuclear factor of activated T cells, cytoplasmic 3 (NFAT4, a JNK1 substrate that also interacts with p38α), all contain linear D-motifs (Fig. 3A). Peptides containing D-motifs can discriminate effectively between JNK and p38 but not between p38 and ERK (28). In this case, besides the CD domain of the kinases, specificity is also found in a site called ED that also contributes to substrate docking. Swapping ED residues in ERK2 for ED residues in p38 changes the substrate specificity of ERK2 to that of p38. Besides the above-listed substrates, scaffold proteins [such as JNK-interacting protein 1 (JIP1)] (29) and some phosphatases [such as dual specificity phosphatases (DUSPs), Tyr and Ser/Thr phosphatases, hematopoietic Tyr phosphatase (HePTP), immune system–specific, striatum-enriched phosphatase (STEP), and brain-specific, STEP-like PTP] interact with MAPKs (28) at a D-motif.
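The D-motif pattern (R/K)2–3-X2–6-UA-X-UB given above translates directly into a regular expression. The sketch below is an illustration under stated assumptions: the set of residues treated as hydrophobic (L, I, V, M, F) is a simplifying choice, the function name `find_d_motifs` is hypothetical, and the test peptide is a made-up toy sequence rather than a documented substrate.

```python
import re

# Assumption: only Leu, Ile, Val, Met, and Phe count as "hydrophobic" here.
HYDROPHOBIC = "LIVMF"

# (R/K)2-3  X2-6  U_A  X  U_B
D_MOTIF = re.compile(rf"[RK]{{2,3}}.{{2,6}}[{HYDROPHOBIC}].[{HYDROPHOBIC}]")


def find_d_motifs(sequence):
    """Return (start_index, matched_core) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in D_MOTIF.finditer(sequence)]


toy_peptide = "AAKKKPTPIQLNAA"  # hypothetical sequence for illustration
print(find_d_motifs(toy_peptide))   # [(2, 'KKKPTPIQL')]
print(find_d_motifs("AAGGSSGGAA"))  # []
```

The pattern captures only the basic and hydrophobic core. A match is a candidate docking sequence, not evidence of binding, because discrimination among JNK, p38, and ERK depends on additional contacts such as the ED site.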

Fig. 3

Examples of protein kinases interacting through the docking site. (A) Structure of protein kinase p38α (gray; PDB: 3GC7) (122) with the docking site highlighted (orange). Three peptides of substrates from different MAPKs are overlapped to show small differences in physical interactions that determine specificity: in blue, a peptide from p38α substrate MK2 (PDB: 2OKR) (123); in red, a peptide from ERK1 substrate, MNK1 (PDB: 2Y9Q) (124); and in green, a peptide from JNK1 substrate, NFAT4 (PDB: 2XRW) (124). The peptide in pink is a spatial reference for the location of the SBS, showing the distance between these two key interaction sites in p38α. (B) Structure of p38α interacting with phosphatase MKP5 (pink) compared to the interaction with the linear peptide from a p38α substrate (blue), showing that the interaction on the docking site also presents the possibility of “structural consensus specificity,” which can be relevant for substrate specificity.

Substrate docking sites have also been described in Tyr kinases. For example, in the C-terminal SRC kinase (CSK), a kinase that specifically phosphorylates SRC, mutations in the docking site of CSK decrease SRC phosphorylation but only partially affect general kinase activity (30). Likewise, cyclin-dependent kinases (CDKs) are activated by cyclins that recognize specific docking motifs in substrates. For example, substrates containing LP motifs (enriched in Leu and Pro residues) are preferentially recognized by the G1/S cyclin complex Cln1/2-CDK2, whereas RXL motifs are preferentially recognized by S-phase cyclins. A conserved region in Cln1/2 first described in yeast was shown to recognize the LP motif and aid in the phosphorylation of substrates with multiple phosphorylation sites. Disruption of substrate binding to the docking site delayed the transition between G1 and S phases (31). As has been shown for p38α, substrate docking not only ensures specificity but also has an allosteric effect on kinase activity, enhancing substrate phosphorylation and enabling phosphorylation even at low concentrations of ATP. ATP binding and substrate docking are cooperative; thus, prior ATP binding may assist the docking of substrates that otherwise have low affinities for their docking sites (24). Docking sites found in substrates and phosphatases generally are linear and short; however, the presence of a “conformational D-motif” has also been reported in MAPK phosphatase 5 (MKP5). Crystal structure analysis revealed that the binding of p38α to MKP5 is mediated by distinct helical regions in the phosphatase that come together to form the kinase binding domain (Fig. 3B) (32). These results further support the idea that structured regions are also important for the interactions of kinases with their substrates and phosphatases because conformation can aid the binding of kinases to substrates (as discussed above).

One of the factors that controls signal transduction pathways is the balance between kinase and phosphatase activity toward a substrate. Regulation of these activities can be mediated by docking of the kinase or phosphatase to a specific substrate, as is the case for the retinoblastoma tumor suppressor protein (Rb). Upon mitogen stimulation, CDK phosphorylates Rb, thereby coordinating the initiation of S phase. The dephosphorylation of Rb by protein phosphatase 1 (PP1) is required for mitotic exit. Because the binding sites for the kinase and phosphatase occupy the same region on Rb, the kinase and phosphatase compete for the same binding site. This competition provides a mechanism for regulating the antagonistic phosphorylation and dephosphorylation of Rb (33). Besides substrate docking sites within kinases, scaffold proteins that bind to kinases also help determine substrate and phosphatase specificity, in some cases enhancing substrate phosphorylation, as discussed below (34).

Substrate Scaffolding by Adaptor Proteins

Substrates are frequently found in low abundance in cells. Thus, cells have found other means to increase kinase-substrate interactions. In cells, scaffold proteins are essential components of signal transduction, increasing the kinase-substrate interactions and reaction kinetics. Scaffold proteins also contribute to substrate specificity and localization of kinases at different subcellular locations within cells (23, 35).

Substrate recruitment in cells is an essential step because it increases local concentration of the substrate and thus the frequency of proximity between kinase and substrate. The idea that recruitment mediated by scaffolds could substitute for the absence of linear consensus motifs in some substrates (23) should be revisited to include the fact that structurally formed consensus motifs could be present in substrates that do not contain linear motifs. Nevertheless, scaffolding is underscored, specifically within the context of the cell where, besides increasing the interaction with substrates in low concentrations, scaffold proteins position kinases and substrates within specific subcellular locations. Several reviews discuss scaffold proteins; here, we acknowledge that they are also key components of signal transduction pathways and are important for the recognition and interaction of the substrate by a specific kinase. Interactions between kinases and scaffold proteins have also been explored to develop more specific kinase modulators. One well-studied example is RACKs (receptors for activated C kinase) that help anchor active PKCs to specific subcellular locations and often promote the interaction of a specific PKC isoenzyme with its substrates (36). Peptide inhibitors derived from RACK binding sites in PKC inhibit isoenzyme-specific interactions between a specific PKC isoenzyme and its RACK, whereas peptide agonists promote these interactions (37). A second example is the A-kinase–anchoring proteins (AKAPs) that anchor inactive PKA to specific subcellular locations and promote substrate kinase interactions upon increases in cAMP concentrations. Inhibition of this interaction inhibits PKA signaling at distinct subcellular locations within cells (38).

Identification and Prediction of Specific Kinase Substrates

Biochemical methods

Even though kinases have been studied for several years, few physiological substrates, meaning proteins that would be phosphorylated by a specific kinase in cells, have been established. One of the reasons for this is the fact that kinase-substrate interactions are transient. Thus, several methods have been developed to detect kinase-specific substrates. Here, we describe some of the most recently used methods (Table 1).

Table 1. Biochemical methods used to detect protein kinase substrates.

Listed are various methods using biochemical approaches to predict substrates on the basis of linear and structural motifs.

With the development of more sensitive equipment, mass spectrometry is frequently coupled to other methods used to detect kinase-specific substrates (39). Among these are separation of proteins by two-dimensional gel electrophoresis stained with phospho-specific dyes (40) and analysis of more complex samples with, for example, stable isotope labeling of amino acids in cell culture (SILAC) (41, 42) combined with genetic (overexpression or knockdown) (43) or pharmacological manipulation [with small molecules (44) or peptides (45, 46)] of a specific kinase. However, an important limitation of mass spectrometry is that the signal of proteins that are present in low abundance is frequently suppressed in favor of those that are more abundant (47). The combination of subcellular fractionation and techniques of phosphopeptide enrichment has improved the detection of low-abundance proteins and phosphorylation sites (48).

Immunoprecipitation using kinase-specific antibodies combined with mass spectrometry is also applied to discover candidate substrates. However, kinase-substrate copurification is often difficult because of the transient nature of this interaction. After substrate phosphorylation, the negative charge of the added phosphate frequently repels the kinase, disrupting the association between the substrate and the kinase (49). To improve kinase-substrate coimmunoprecipitation efficiency, many methods to covalently cross-link proteins have been developed (15, 50–53). Immunoprecipitation of substrates with specific antibodies that recognize proteins that contain a phosphorylated residue within a consensus sequence has also been used. For example, in detecting substrates of the kinase AKT in 3T3-L1 adipocytes stimulated with insulin (54), new AKT substrates that contain a Rab GAP (guanosine triphosphatase–activating protein) domain and two phosphotyrosine binding (PTB) domains were identified (54). Although there has been much success using antibodies against phosphorylation motifs for screening of kinase substrates (54–57), the identification of a specific kinase responsible for a given phosphorylation event remains a major challenge because many kinases recognize similar consensus sequences. However, substrate promiscuity may occur also within cells because many kinases exhibit overlapping functions (3, 58). Because linear and structurally formed consensus sites are similar in conformations, antibodies against phosphorylation motifs may be able to detect both types of substrates in immunoprecipitation assays under native conditions.

An important drawback of immunoprecipitation assays is the frequent nonspecific binding of proteins to resins, and several strategies have been developed to eliminate the detection of these proteins. In addition, cross-linking the antibody to the beads used during immunoprecipitation is recommended to reduce contamination of the eluted sample with antibody (59). Despite these difficulties, immunoprecipitation combined with mass spectrometry remains a technique of choice for finding substrates, with the advantage of being able to detect both substrates and signaling complexes.

An alternate assay called the kinase-interacting substrate screening (KISS) assay consists of binding a specific kinase to beads, which then interact with proteins in cell lysates. Bound proteins are subsequently digested with trypsin and enriched for phosphopeptides, which are then detected by mass spectrometry. Using this method, for example, 356 phosphorylation sites of 140 proteins were identified as candidate substrates for Rho-associated kinase (ROCK2), among which some of the substrates detected were validated and shown to interact with ROCK2 within the context of cells (60).

Another method called the yeast two-hybrid (Y2H) system is used to detect a physical association between a specific kinase and substrate within the context of cells. In the Y2H system, a DNA sequence including the coding region of the catalytic domain of a kinase is fused to the DNA binding domain of the transcription factor Gal4p (this serves as a bait protein). At the same time, a complementary DNA (cDNA) library is constructed to encode putative substrates in fusion with a transcriptional activation domain of Gal4p (these serve as prey proteins). If the bait (kinase) and prey proteins (substrate) interact, they reconstitute a functional Gal4p that activates the expression of a reporter gene in a yeast strain (61, 62). Despite the use of a heterologous system, the Y2H system has the advantage of being able to detect PPIs within a cellular context. However, Y2H assays have some limitations, mainly associated with a high frequency of false positives due to overexpression of both kinases and putative substrates. In addition, phosphorylation is a transient event (as noted above), and consequently, the brief interaction between the substrate and the kinase is often not sufficient to activate the reporter gene. Despite these limitations, the Y2H system has successfully identified some protein kinase substrates (62, 63). One of the first papers showing the importance of the Y2H system in identifying PKC substrates was published by Staudinger and colleagues in 1995. The authors used the catalytic region of PKCα fused to the DNA binding domain of yeast GAL4 as bait to screen a mouse T cell cDNA library, in which cDNA was fused to the GAL4 activation domain. Using this approach, it was possible to identify the PKC substrate, protein interacting with PKCα 1 (PICK1). More recently, de Souza and colleagues used the Y2H approach to study the functions of the Ser/Thr kinase NEK7 and identified putative binding partners and new substrates. Immunoprecipitation assays combined with mass spectrometry analysis validated kinase-substrate interactions and the phosphorylation status of identified proteins. This analysis suggested that NEK7 is involved in key cell division processes and chromosome segregation (63).

Other methods developed from the Y2H system to detect PPIs have been successfully used to detect kinase substrates. In the split-ubiquitin system (SUS), interacting proteins are combined with the N- or C-terminal half of ubiquitin, and upon successful interaction, a complete ubiquitin moiety is reconstituted and coupled to a metabolic readout, such as degradation of the orotidine 5′-phosphate decarboxylase (URA3) fused to the C-terminal half of ubiquitin. Degradation of URA3 prevents the conversion of 5-fluoroorotic acid monohydrate (5-FOA) to the toxic compound 5-fluorouracil; thus, when there is a PPI, the cells survive in the presence of 5-FOA (64). Such an assay was used to detect substrates of CDKA in Arabidopsis (65). A variation of the SUS is the bimolecular fluorescence complementation [BiFC; also called split yellow fluorescent protein (YFP)] assay (66). Pusch and collaborators used both the SUS and BiFC systems to detect and validate in vivo substrates for the Arabidopsis CDKA. A few of the substrates were further validated to show that CDKA phosphorylates proteins that control the redox state by the cell cycle (65).

An assay described in 2012 by Xue and colleagues (67) uses an integrated proteomic strategy termed “kinase assay linked with phosphoproteomics” (KALIP); this assay combines a sensitive kinase reaction with endogenous kinase-dependent phosphoproteomics to identify direct substrates of protein kinases. This assay consists of preparing cell lysates, digesting the cellular proteins, and then dephosphorylating them. The proteins are then phosphorylated in vitro with a specific kinase, and phosphorylation sites are detected by mass spectrometry. KALIP may not be effective for kinases that require priming phosphorylation events (such as GSK3), additional interacting surfaces, or a docking site on the protein (such as in the case of CSK). In addition, because localization information is lost when the cell is lysed, this approach cannot eliminate certain false positives in cases where substrates of other kinases carry motifs similar to those recognized by the kinase of interest. Furthermore, digestion may abolish certain motifs, and structurally formed phosphorylation sites will not be detected using this approach.

Metabolic labeling enabled the isolation of different subsets of the proteome containing posttranslational modifications (68). In 2007, Green and Pflum validated that several kinases can use γ-biotinylated ATP as a substrate, thus transferring phosphobiotin directly to the substrates (69). Although the range of kinases that can accept biotin-ATP as a substrate and the precise catalytic parameters for its use have not been determined, this approach holds promise as a generally accessible way to identify new kinase substrates through direct labeling (69).

Chemical biology approaches are also being successfully developed to find kinase-specific substrates. In 1997, Shokat and colleagues published an approach using [γ-32P]-labeled orthogonal ATP analogs containing sterically bulky groups in the adenosyl moiety. These analogs were used by analog-sensitive kinase generated by site-directed mutagenesis of the ATP binding site, thus generating substrates labeled with specific analogs, which were then detected by mass spectrometry (70). Because only the mutant kinase was active on these bulky ATP analogs, this approach allowed identification of substrates for only the engineered kinase v-Src (70). The Shokat group later developed another set of ATP analogs by replacing the [32P]phosphate with γ-thiophosphate and specifically enriched the thiophosphate-modified proteins using iodoacetyl agarose (71). This technique was then successfully used to identify >70 substrates of the CDK1–cyclin B complex (72). The disadvantage of this approach, however, is that some ATP analogs are not cell-permeable, and therefore, one cannot detect substrates within the cellular context. However, both linear and structurally formed phosphorylation sites can be detected, and the transient nature of kinase-substrate interactions would not hinder the detection of substrates because they are labeled with the ATP analogs.

Computational methods

Despite the improvement in techniques used to detect phosphorylated amino acid residues, credited largely to advances in mass spectrometry, it still has been difficult to find phosphorylation sites of low-abundance substrates within the cellular context. Ideally, predictive tools capable of finding kinase-specific substrates would be more cost-effective and could predict phosphorylation of low-abundance proteins. With an increasing number of validated substrates deposited in several phosphoprotein data banks, new computational approaches can be developed to more accurately predict kinase-specific substrates.

Predictive tools can be briefly summarized, from a computational point of view, as artificial intelligence algorithms that search for recognizable patterns in samples of phosphorylation data (Table 2). Most of the available methods rely on the amino acid sequence alone. This is apparent in a review of computational methods for predicting eukaryotic phosphorylation sites, which cites 29 methods that use only the primary structure of proteins but only 10 that use some kind of structural information (73).
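As a minimal illustration of the sequence-only approach, the sketch below scans a protein sequence for a linear consensus motif with a regular expression. The pattern R-R-x-[S/T] (a commonly cited PKA-like consensus), the function name, and the toy sequence are assumptions chosen for illustration and are not taken from any of the tools in Table 2.

```python
import re

# Assumed consensus: two Arg, any residue, then a Ser/Thr phosphoacceptor.
# The lookahead lets overlapping matches be reported as well.
PKA_LIKE_MOTIF = re.compile(r"(?=(RR.[ST]))")

def scan_motif(sequence, pattern=PKA_LIKE_MOTIF):
    """Return (1-based position of the Ser/Thr acceptor, matched motif) pairs."""
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        # The acceptor is the last residue of the motif; convert to 1-based.
        acceptor_pos = match.start(1) + len(motif)
        hits.append((acceptor_pos, motif))
    return hits

# Invented toy sequence, not a real protein.
toy_sequence = "MGLRRASLGAKRRTTPE"
hits = scan_motif(toy_sequence)
for position, motif in hits:
    print(position, motif)
```

A scan of this kind only reports candidate sites that match the pattern; it says nothing about whether the site is accessible in the folded protein, which is the limitation that the structure-based methods try to address.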

Table 2. Computational methods used to predict protein kinase substrates.

Listed are various methods using computational tools to predict substrates on the basis of sequence or structure.


The importance of using three-dimensional (3D) information in phosphorylation prediction studies is not a new concept. One of the first methods ever developed in this field, NetPhos (74), was already a first attempt to use structural data to predict phosphorylation sites. Blom et al. justify the NetPhos development, stating that “it is obvious that what the kinase actually recognizes is the three-dimensional structure of the polypeptide at the acceptor residue, and not the primary structure” (74). This method is based on a neural network fed with structural backbone information of phosphorylated sites. Although at the time this method presented an overall worse performance than sequence-based neural network methods, it was able to correctly predict sites that did not match the typical consensus, therefore generating negative predictions on sequence-based methods but including structural features that resembled some of the known phosphorylation sites. This method was later updated to deal with kinase specificity (75).

Whereas NetPhos used a statistical learning algorithm, thus generating predictions on computer-generated parameters, Predkin (76) was developed totally based on structural analyses. On the basis of crystal structures of PKA, phosphorylase kinase (PHK), and CDK2 with bound substrate peptides, key residues for their interactions were determined, and a set of rules to determine specificity was generated. Predkin 2.0 (77) implemented a more robust approach, in which the substrate-determining residues are predicted by a scoring scheme on the basis of substrate-weighted matrices. In the DREAM (Dialogue on Reverse Engineering Assessment and Methods) challenge, an open science initiative aiming at promoting collaborative efforts joining computational and experimental biologists focused on methods to reverse engineer cellular networks from high-throughput data (78), Predkin had the best performance in specificity on the category of peptide recognition domain, in which participants had to determine the specificity of uncharacterized protein kinases and compare it with previously unknown experimental data (79).
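The general idea of scoring candidate peptides with a weighted matrix can be sketched as follows. This is a generic position-specific log-odds example, not the Predkin 2.0 implementation; the training peptides, the uniform background frequency, and the pseudocount are all assumptions chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_log_odds_matrix(peptides, pseudocount=1.0):
    """Build a per-position log-odds matrix from aligned, equal-length peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        matrix.append(column)
    return matrix

def score_peptide(peptide, matrix):
    """Sum the log-odds of each residue at its position (standard residues only)."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length does not match the matrix")
    return sum(matrix[i][aa] for i, aa in enumerate(peptide))

# Invented 7-residue training peptides centered on the phosphoacceptor.
training = ["LRRASLG", "KRRASVA", "GRRSSLE", "ARRGSIA"]
matrix = build_log_odds_matrix(training)
consensus_like = score_peptide("LRRASLG", matrix)
unrelated = score_peptide("GAGASAG", matrix)
```

With these toy inputs, a peptide resembling the training set scores higher than an unrelated one. Real tools derive their matrices from curated substrate data and, in the case described above, from residues identified in kinase-peptide crystal structures.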

An increase in the complexity of the newer versions of NetPhos and Predkin indicates new trends in the field. DISPHOS (disorder-enhanced phosphorylation predictor) (80) adds physical chemical parameters in a learning reinforcement algorithm. Plewczynski and collaborators used a structural fragment database on a support vector machine (SVM) method (81). pkaPS (82) uses biochemical properties of the amino acids for a highly specific method to predict PKA substrates. NetworKIN (83) combines sequence-based phosphorylation prediction with protein network interaction data.

Although all these methods brought innovative approaches and a notable increase in prediction capabilities, they all share a central characteristic with the sequence-based methods: They limit the search around the phosphorylation site to residues that are adjacent in the primary structure. Some methods (such as Predkin) work with 7-residue long peptides (76), whereas others (such as pkaPS) (82) use a ≤81-residue window. This creates a limitation because the structure of the sequential residues may not be representative of the environment of the phosphorylation site in the context of the folded protein. In the folded protein, the residues spatially flanking the phosphorylation site are not necessarily in proximity in the primary structure. The 3D structure of phosphorylation sites on proteins has been studied for some time. Phospho3D (84) is a public database of such structures. The new version of this database, Phospho3D 2.0 (22), includes a 3D zone tool that compares information on the residue composition of the spatial surroundings of the phosphorylated residue (P0).
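To make the notion of a sequence window concrete, the sketch below extracts the residues flanking a candidate phosphoacceptor (P0). The function name, the padding character, and the toy sequence are assumptions; a flank of 3 yields a 7-residue peptide, and a flank of 40 yields an 81-residue window.

```python
def sequence_window(sequence, p0_index, flank, pad="-"):
    """Return the 2*flank + 1 residues centered on P0 (0-based index).

    Positions that fall outside the sequence are filled with `pad`.
    """
    if not 0 <= p0_index < len(sequence):
        raise IndexError("p0_index is outside the sequence")
    if sequence[p0_index] not in "STY":
        raise ValueError("P0 is expected to be Ser, Thr, or Tyr")
    left = sequence[max(0, p0_index - flank):p0_index]
    right = sequence[p0_index + 1:p0_index + 1 + flank]
    return left.rjust(flank, pad) + sequence[p0_index] + right.ljust(flank, pad)

# Invented toy sequence; index 6 is a Ser.
toy_sequence = "MGLRRASLGAK"
short_window = sequence_window(toy_sequence, 6, 3)
long_window = sequence_window(toy_sequence, 6, 40)
```

Whatever the flank size, such a window only ever contains residues that are contiguous in the primary structure, which is exactly the limitation discussed above.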

Prediction tools have also incorporated this kind of information. Phos3D (85) evaluated a long list of physical-chemical characteristics of the region within 2 and 10 Å of P0. Using a statistical approach on an SVM algorithm, an improvement in the prediction capabilities was achieved, although the general conclusion was that most of the discriminatory effect was connected to the local one-dimensional sequence. Using a similar approach in a more recent work, Su and Lee (86) were able to outperform most of the sequence-based methods. The bioinformatics methods that use structural knowledge to predict phosphorylation sites and the search space of each one of them are listed in Table 2.
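The difference between sequence neighbors and spatial neighbors can be sketched as below. The coordinates are invented toy values (one representative atom per residue), and the function name and data layout are hypothetical; the 10 Å cutoff simply reuses the larger distance mentioned above. This is not the Phos3D procedure, only an illustration of a distance-based neighborhood.

```python
import math

def spatial_neighbors(residues, p0_number, radius):
    """Return residues whose representative atom lies within `radius` Å of P0.

    `residues` maps residue number -> (residue name, (x, y, z)).
    """
    _, p0_xyz = residues[p0_number]
    neighbors = []
    for number, (name, xyz) in sorted(residues.items()):
        if number == p0_number:
            continue
        distance = math.dist(p0_xyz, xyz)  # Euclidean distance (Python 3.8+)
        if distance <= radius:
            neighbors.append((number, name, round(distance, 2)))
    return neighbors

# Invented coordinates in Å; residue 10 plays the role of P0.
toy_residues = {
    10: ("SER", (0.0, 0.0, 0.0)),
    11: ("LEU", (3.8, 0.0, 0.0)),   # adjacent in sequence and in space
    12: ("GLY", (7.6, 0.0, 0.0)),
    15: ("ASP", (14.0, 3.0, 0.0)),  # near in sequence, outside the cutoff
    87: ("ARG", (0.0, 6.0, 2.0)),   # distant in sequence, close in space
}
near = spatial_neighbors(toy_residues, 10, 10.0)
```

In this toy case, residue 87 is included even though it is far from P0 in the primary structure, whereas residue 15 is excluded despite being only five positions away.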

Computational methods present a great set of advantages. They are inexpensive compared to experimental techniques, fast in that they are capable of analyzing whole genomes, and easily available to everyone. Innovative methods are flourishing. One such method is the Phosphorylation Set Enrichment Analysis (PSEA), an analog of Gene Set Enrichment Analysis (GSEA), used for analyses of DNA microarrays, which is adapted for the analysis of phosphorylated substrates (87). However, all the existing methods, especially the structure-based ones, can be further improved by the advances in the related fields and increased knowledge about the structural basis of kinase-substrate interactions.

Once the substrates are detected or predicted, they should be validated. The most commonly used biochemical method to determine kinase activity toward substrates is the in vitro kinase assay, in which the purified kinase is incubated with a putative substrate in the presence of ATP (67). In vitro phosphorylation frequently may differ from what takes place physiologically. First, the use of concentrated, purified kinase in vitro is partially responsible for a lower specificity. Second, the use of exogenous kinases outside cellular contexts often leads to a loss of physiological regulatory mechanisms (67). The substrates must interact with a specific kinase within the context of a living cell. Therefore, site-directed mutagenesis and phospho-specific antibodies are commonly used to confirm phosphorylation sites both in vitro and in vivo (17).

Perhaps because of the transient nature of kinase-substrate interactions, relatively few crystal structures of kinases with their respective substrates have been reported (88–98). Nuclear magnetic resonance (NMR) has also been used to map kinase-substrate interaction sites (99, 100). Improved techniques that can directly determine the structural nature of kinase-substrate interactions or increasing the affinity between the kinase and the substrate may lead to a better understanding of the structural nature of this interaction and the development of more specific kinase inhibitors.

The SBS as a Drug Targeting Site

More than 4500 lead compounds have been described as kinase inhibitors. Among these, 40 compounds were launched and 27 compounds are in clinical phase 3 (data obtained from Thomson Reuters Integrity, 2015). However, none of these target the SBS or interactions with substrates. Figure 4 depicts the different types of kinase inhibitors and how they overlap with each other.

Fig. 4

Different regions (mainly N-lobe) of the catalytic domain of protein kinases explored as drug binding sites. (A) The ATP pocket is targeted by type 1 inhibitors (PDB: 1FMO) (88). (B) A pocket formed in the DFG-out conformation is targeted by type 2 inhibitors, such as imatinib [PDB: 2HYY (125)], depicted in darker blue sticks. (C) Type 3 inhibitors target a hydrophobic pocket (but not the ATP binding region) released in DFG-out conformations. For example, depicted in darker red sticks within the red hydrophobic pocket is the non–ATP-competitive inhibitor N-{4-[(1S)-1,2-dihydroxyethyl]benzyl}-N-methyl-4-(phenylsulfamoyl)benzamide of human LIMK2 kinase domain (PDB: 4TPT) (126). (D) A pocket formed in the surface of the N-lobe of MEK1 binds the non–ATP-competitive inhibitor 2-([3R-3,4-dihydroxybutyl]oxy)-4-fluoro-6-[(2-fluoro-4-iodophenyl)amino]benzamide (PDB: 4ARK) (127). (E) A shallow crevice and ATP binding pocket are occupied by an inhibitor formed by a synthetic peptide linked to thiophosphoric acid o-((adenosyl-phospho)phospho)-s-acetamidyl-diester, a typical type 4 inhibitor (magenta sticks; PDB: 1GAG) (117). (F) General view of all surfaces of pockets used by different inhibitor types. The reference structure (gray) is PDB: 2HYY (125).

Several of these compounds are ATP-competitive inhibitors and are called type 1 inhibitors. Because ATP binding sites are highly conserved among the different kinases (as discussed above), most ATP binding site competitive inhibitors are not very selective. Furthermore, intracellular concentrations of ATP may be high, which contributes to the low efficiency of this class of compounds. Both the specificity and the efficiency problems have been addressed by the discovery of inhibitors that bind the inactive conformation of kinases (“DFG-out,” as discussed above), which is less conserved among the different kinases. Early examples of these compounds, called type 2 inhibitors, include imatinib (STI571), BIRB796, and sorafenib (BAY43-9006) (101). Furthermore, type 3 inhibitors bind the catalytic domain of the kinase close to the ATP binding site but do not interact with the hinge region. Rather, type 3 inhibitors can interact with the hydrophobic (allosteric) pocket generated by the DFG-out conformation (102). Type 4 inhibitors bind allosteric regions around the catalytic domain, and type 5 inhibitors are bivalent or bisubstrate compounds that at least in part would occupy the SBS and the ATP binding site (103).

The main strategy to overcome the issue of lack of specificity would be to explore allosteric sites or PPIs between kinases and substrates, scaffolds, and modulator proteins. Competitive inhibitors of substrate binding are difficult to develop because kinase-substrate interactions are usually found at shallow crevices, making it difficult to find small molecules that would directly compete for substrate binding.

The development of inhibitors of PPIs is not trivial. Such inhibitors frequently violate Lipinski’s “rule-of-five” (104), which consists of a group of experimental parameters that are important for the pharmacokinetics of a drug in the human body and help determine the druggability of a compound: (i) no more than 5 hydrogen bond donors (the total number of nitrogen-hydrogen and oxygen-hydrogen bonds), (ii) no more than 10 hydrogen bond acceptors (all nitrogen or oxygen atoms), (iii) a molecular weight of less than 500 daltons, and (iv) an octanol-water partition coefficient (the ratio of concentrations of a compound in a mixture of two immiscible phases at equilibrium), expressed as log P, not greater than 5 (105). Several molecules that do not follow these rules (for example, molecules larger than 500 daltons) effectively bind shallow pockets (106) and could be good PPI inhibitors. The development of inhibitors of PPIs is highly dependent on new chemical approaches that involve the combination of hits from screenings of fragment libraries (107) and generation of new libraries using diversity-oriented synthesis (a strategy aimed at improving compound synthesis, frequently building compounds upon an original scaffold) (107) or even libraries of large macrocycles (ring-shaped molecules containing 12 or more ring atoms) (106). The druggability of compounds that do not follow Lipinski’s rules (105) will have to be determined de novo because they do not have the same pharmacokinetic properties as the commonly used kinase inhibitors.
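The four criteria listed above can be written as a simple check. This is a minimal sketch: the function name is hypothetical, and the descriptor values in the example are invented; in practice they would be computed or measured for each compound.

```python
def rule_of_five_violations(h_bond_donors, h_bond_acceptors, molecular_weight, log_p):
    """Return the rule-of-five criteria, as stated in the text, that are violated."""
    violations = []
    if h_bond_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    if molecular_weight >= 500:
        violations.append("molecular weight of 500 daltons or more")
    if log_p > 5:
        violations.append("log P greater than 5")
    return violations

# Invented descriptor values for two hypothetical compounds.
small_molecule = rule_of_five_violations(2, 5, 350.4, 3.1)
macrocycle_like = rule_of_five_violations(6, 12, 800.0, 4.0)
print(small_molecule)
print(macrocycle_like)
```

As the text notes, a compound that fails this check is not necessarily a poor PPI inhibitor; the check only flags that its pharmacokinetic behavior cannot be assumed to resemble that of conventional small-molecule drugs.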

Peptides have proven to be good inhibitors of PPIs despite their well-known limitations (108). Pseudosubstrate peptides, which contain the consensus motifs recognized by a specific kinase but with an Ala substitution for the phosphorylatable residue, efficiently inhibit substrate phosphorylation [reviewed in (109)]. In another example, a peptide mimicking the docking site potently inhibited CSK-mediated phosphorylation of SRC but only moderately inhibited its general kinase activity, suggesting that inhibition of substrate docking sites may be a good strategy to develop specific kinase-substrate interaction inhibitors (30). The disadvantage of using peptides is that their delivery to cells is often difficult, although several peptide delivery systems have been described, the most popular being attachment to cell-penetrating peptides (108). Peptide inhibitors of scaffold proteins have been shown to efficiently inhibit kinase activity; examples of these have been extensively discussed elsewhere (37, 110, 111).
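The pseudosubstrate design described above amounts to a single substitution at the phosphoacceptor position, which can be sketched as follows. The function name and the example peptide are hypothetical and are not taken from any of the cited studies.

```python
def pseudosubstrate(peptide, p0_index):
    """Replace the phosphorylatable residue at p0_index (0-based) with Ala."""
    if peptide[p0_index] not in "STY":
        raise ValueError("residue at p0_index is not Ser, Thr, or Tyr")
    return peptide[:p0_index] + "A" + peptide[p0_index + 1:]

# Invented substrate-like peptide with Ser at index 4.
inhibitor_candidate = pseudosubstrate("LRRASLG", 4)
print(inhibitor_candidate)
```

The flanking residues are left untouched so that the peptide can still be recognized by the kinase, while the Ala at P0 cannot accept a phosphate group.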

Small molecules that are either substrate competitors or allosteric inhibitors have also been found. For example, benzothiazinone compounds are relatively specific substrate-competitive inhibitors and allosteric modulators of GSK3, a key enzyme involved in the regulation of glycogen metabolism and thus a target for diabetes (112).

The Ser/Thr kinase Polo-like kinase 1 (PLK1), which is involved in mitosis, has a substrate binding domain termed “polo-box domain” (PBD). Peptides that compete with PBD binding to substrates are selective PLK inhibitors, as are fragment-ligated inhibitory peptides (113). Recently, a selective PLK1 small-molecule inhibitor that blocks substrate binding to PBD was rationally developed using docking strategies and is relatively selective and effective both in vitro and in vivo (114). Screening assays generally find ATP-competitive inhibitors; therefore, there is a need to develop new assays to find allosteric or substrate-competitive inhibitors. For example, the substrate activity screening assay, which is based on the conversion of substrates into inhibitors, was developed to find a small-molecule competitor of the kinase c-Src that was effective both in vitro and in cells and acted synergistically with ATP-competitive inhibitors (115).

Assurance that a small molecule is in fact a substrate-competitive inhibitor may be difficult at times and some small molecules that were initially described as substrate-competitive inhibitors were later shown to have other targets or to in fact be ATP-competitive inhibitors. For an extensive review of small molecules that are substrate-competitive inhibitors, see Breen and Soellner (116).

To this end, bispecific type 5 inhibitors constitute a clear evolution. Structurally, they are bifunctional, occupying both the substrate and the ATP binding sites (Fig. 4E) (117) [extensively reviewed in (103)]. Usually, type 5 inhibitors are peptide/small-molecule hybrids (118); their druggability is still to be explored.

Future Perspectives for Clinical Development

Several studies demonstrate the power of interfering with PPIs as a new frontier for drug development. Particularly, in the case of protein kinases, this may be a way of overcoming the problem of lack of specificity of kinase inhibitors. Peptides that interfere with PPIs involving kinases and scaffold proteins or other binding proteins are being developed (37, 110, 111) and may serve as lead compounds for more specific kinase inhibitors. At the substrate end, because structurally formed consensus sites require correct folding, one can envision developing inhibitors that bind to the substrate in a manner that interferes with this folding, thereby inhibiting its interaction with a kinase.

One of the reasons for the limited success of kinase inhibitors in clinical settings is the lack of specificity of these inhibitors. With the advancement in mass spectrometry, chemical biology techniques, and predictive tools, more kinase-specific substrates have been and will be detected. The detection of substrates, together with advancements in the area of NMR and crystallography aimed at mapping interaction sites, will be helpful to further understand the structural basis of the interaction between kinases and their substrates and to develop more accurate predictive tools.

To this end, one can envision that kinase and substrate mimotopes based on the SBS and neighboring regions may have several applications. For basic research and elucidating signal transduction cascades, these mimotopes can be competitive inhibitors used to detect substrates when coupled to phosphoproteomic approaches. Furthermore, key kinase-substrate interactions have been found to be important for certain diseases (119); thus, as a diagnostic or prognostic tool, we can use peptides derived from mimotopes to produce antibodies that can distinguish substrate-bound from unbound kinases.

There are very few structures of kinases bound to substrates; the elucidation of structures of new complexes and molecular modeling approaches will be essential for the design of mimotopes of kinase-bound states. For therapeutic purposes, mimotopes of either kinases or substrates can serve as drug leads to design mimetics that can specifically compete with the interaction between a given kinase and a specific substrate. In some cases, such as certain types of cancer (120) and Parkinson’s disease (121), in which increasing the catalytic activity of a kinase toward a specific substrate may be beneficial, understanding the nature of this interaction may help design allosteric activators.

Overall, the knowledge of PPI surfaces involved in kinase-substrate recognition and activation will certainly be prominent for basic research and development of more specific therapeutics for cancer and beyond.

REFERENCES AND NOTES

Acknowledgments: We would like to thank L. Devi (Mount Sinai School of Medicine) for helpful discussions. Funding: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) grants 2012/24154-4 and 2015/21786-8, Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) grant 473665/2012-3, and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) grant 88881.0622072014-01 to D.S.; CAPES postdoctoral fellowship to F.A.M. (88881.0622072014-01); and FAPESP doctoral fellowships to D.A.P. (2011/10321-3) and D.T.P. (2015/17812-3).