Research Resource | Systems Biology

The switches.ELM Resource: A Compendium of Conditional Regulatory Interaction Interfaces

Science Signaling  02 Apr 2013:
Vol. 6, Issue 269, pp. rs7
DOI: 10.1126/scisignal.2003345

Abstract

Short linear motifs (SLiMs) are protein interaction sites that play an important role in cell regulation by controlling protein activity, localization, and local abundance. The functionality of a SLiM can be modulated in a context-dependent manner to induce a gain, loss, or exchange of binding partners, which will affect the function of the SLiM-containing protein. As such, these conditional interactions underlie molecular decision-making in cell signaling. We identified multiple types of pre- and posttranslational switch mechanisms that can regulate the function of a SLiM and thereby control its interactions. The collected examples of experimentally characterized SLiM-based switch mechanisms were curated in the freely accessible switches.ELM resource (http://switches.elm.eu.org). On the basis of these examples, we defined and integrated rules to analyze SLiMs for putative regulatory switch mechanisms. We applied these rules to known validated SLiMs, providing evidence that more than half of these are likely to be pre- or posttranslationally regulated. In addition, we showed that posttranslationally modified sites are enriched around SLiMs, which enables cooperative and integrative regulation of protein interaction interfaces. We foresee switches.ELM complementing available resources to extend our knowledge of the molecular mechanisms underlying cell signaling.

Introduction

Tight regulation of protein function is vital for reliable and robust control of cell physiology and is enforced at every stage of a protein’s life cycle (1). This includes control of local protein abundance by modulating gene transcription, mRNA decay, translation efficiency, protein degradation, subcellular localization, and scaffolding (1, 2). Furthermore, pretranslational mechanisms, such as alternative promoter usage, alternative splicing, and RNA editing, can produce functionally distinct protein species derived from the same gene (3–5). Most commonly, however, protein function is controlled at the posttranslational level through interactions mainly with other proteins but also with other cellular components including membranes, small molecules, and nucleic acids (6–9). Regulatory interactions are often context-dependent, and the function of a protein can be altered by rewiring its interaction network in response to a specific signal. As such, these conditional interactions control whether and how subsequent signaling proceeds to elicit appropriate responses to internal and external cues, and thereby mediate molecular decision-making in cell regulation (10–16).

Regulatory protein interactions are often mediated by motifs [also referred to as short linear motifs (SLiMs)], a subset of compact and degenerate binding modules typically located in the proteome’s intrinsically disordered regions (IDRs), which lack a stable tertiary structure under native conditions (13, 17–20). SLiMs are a spatially efficient and convergently evolvable solution for encoding interaction interfaces (21, 22). They are ubiquitous in eukaryotic proteomes and mediate many regulatory functions, including directing ligand binding, providing docking sites for modifying enzymes, controlling protein stability, and acting as signals to target proteins to specific subcellular locations. Typically, only three to four residues of a SLiM determine the majority of the binding specificity and affinity; hence, motif-mediated interactions occur with low affinity, are transient and reversible, and can be easily modulated (15, 20, 22). These intrinsic properties make SLiMs ideal regulatory modules and enable them to conditionally switch between their “on” and “off” states or between multiple, functionally distinct on states. Switching of a motif between different functional states can be mediated by multiple mechanisms that often depend on posttranslational modifications (PTMs) and the cooperative or competitive use of multiple overlapping or adjacent SLiMs (7, 15, 22, 23). Conditional switching of motif functionality can stimulate a gain, loss, or exchange of binding partners and thereby regulate the function of SLiM-containing proteins in a context-dependent manner.

Pretranslational mechanisms can irreversibly control the presence and binding properties of a SLiM, respectively, by producing protein isoforms that lack a specific motif or have a motif with altered flanking regions (5, 24–28). Posttranslationally, several mechanisms, most commonly protein modification, can reversibly regulate motif function by switching a single motif on or off, or by gradually altering its binding strength (29–34). When multiple SLiMs are present, they can be used cooperatively to mediate multivalent binding, having a more than additive effect on protein interaction strength (7, 35–38). Alternatively, multiple overlapping or adjacent SLiMs can bind competitively to mutually exclusive interaction partners. A signal can shift the balance of the competition in favor of a specific binding partner (39, 40) or mediate a sequential exchange of binding partners by altering the specificity of the SLiM-containing region in a stepwise manner (41, 42). These different modes of motif regulation are widely used in the cell and mediate conditional, combinatorial, and fine-tuned control of protein function (15). This includes controlling the subcellular localization of proteins or protein isoforms, temporally and spatially regulating their stability, and directing the assembly of functionally distinct signaling complexes depending on prevailing conditions (1, 15, 37, 38).

Despite being prevalent in and imperative for dynamic and robust cell regulation, the mechanisms that control SLiM-mediated interactions through context-dependent switching of motif function, and thereby control the function of SLiM-containing proteins and the processes they regulate, are not fully captured in the currently available resources. Here, we present the switches.ELM resource (http://switches.elm.eu.org), which consists of a categorized repository of 710 manually curated, experimentally validated instances of SLiM-based switch mechanisms that we collected from the literature. On the basis of these instances, we defined a set of rules to suggest switch mechanisms for SLiMs. These rules were integrated in the resource to enable users to explore a motif of interest for putative regulatory mechanisms and to direct them to the relevant literature and data sources. We then applied these rules to analyze the 2070 experimentally validated SLiMs that are manually curated in the Eukaryotic Linear Motif (ELM) database (43). This survey provides suggestive evidence that more than half of the characterized SLiMs are likely to be pre- or posttranslationally regulated. In addition, we elaborated on a previous study (23) and found that modification sites are enriched in regions flanking the validated SLiMs from ELM, and that a large portion of modification sites regulating annotated switch instances are modified by multiple enzymes. Finally, we found that SLiMs involved in the curated switches are more likely to be located in nonconstitutive exons compared to SLiMs in general, and are thus more likely to be removed from specific protein isoforms.

Results

Collection and categorization of switches.ELM data

In the context of switches.ELM, a “switch mechanism” is a regulatory process that controls SLiM-mediated interactions by modulating the function of a SLiM, and hence affects the function of the SLiM-containing protein because of the concomitant gain, loss, or exchange of binding partners. The modulation of motif function can be mediated by a pretranslational mechanism, the addition of a PTM, or the binding of a molecule. The experimentally validated instances of SLiM-based switch mechanisms curated in the switches.ELM resource were collected by a literature survey (see Materials and Methods). On the basis of the mechanism that mediates switching of motif functionality, these instances were classified into different switch types and subtypes, listed with examples in Table 1 and explained in detail here. It should be taken into consideration that the curated switch instances were regarded in isolation; however, in a biological context, these definitions partially overlap and the different mechanisms combine to mediate higher-order regulatory functions that allow integration of multiple signals. Also, some instances have a unique regulatory mechanism and are currently labeled as uncategorized switches. As more instances accumulate, these switches may be worthy of a novel switch type or subtype.

Table 1

Examples for the different SLiM-based switch mechanism types and subtypes. Defined residues of a motif are in bold. Residues requiring phosphorylation in order for the motif to be functional are preceded by a lowercase p, whereas a residue that will be phosphorylated by a kinase that interacts with and modifies a motif is written in brackets.

Simple “binary” switches shift between two distinct functional states of a motif. This can be either the on and off states or two on states with different affinities for a single binding partner. On the basis of the mechanism that regulates switching between the two states, different subtypes were defined. A first subtype comprises the “altered physicochemical compatibility” switches that are regulated by addition of a PTM to a residue in or adjacent to a SLiM. Owing to their specific structural and physicochemical properties, PTMs can alter the intrinsic specificity or affinity of a SLiM for its binding partner (29–31). For example, the addition of a PTM to a specific residue within a SLiM is often a prerequisite for a SLiM-mediated interaction (44), whereas a PTM adjacent to a SLiM can enhance the strength of an interaction (45). Alternatively, a PTM can partially or completely inhibit an interaction by rendering a SLiM physicochemically incompatible with its binding partner (46). The “allostery” switches are a second subtype comprising SLiMs or SLiM-binding globular protein domains whose binding properties are modulated indirectly by allosteric effects induced by the addition of a PTM or by the binding of an effector molecule to a site that is distinct from the actual interaction interface (35, 47–49). “Pretranslational” switches are the third and final subtype of binary switches. These are mediated by pretranslational mechanisms that irreversibly alter motif presence or function, either by isoform-specific removal of a SLiM-containing exon or by generating isoforms whose SLiMs show different binding specificities or affinities due to variations in their flanking regions (5, 25–27, 50–53).

Other more complex mechanisms exist to regulate the function of a single motif. The “cumulative” switches modulate motif functionality by multisite modification, that is, the addition of PTMs to multiple residues in or adjacent to the motif (32). A subtype of cumulative switches is the “rheostatic” switches, which gradually alter the affinity of a motif for a single binding partner by the addition of multiple PTMs that additively strengthen or weaken the interaction. This enables fine-tuned control of motif activity depending on the strength of a signal or integration of multiple signals (33, 34, 54). Another more complex switch type acting on single motifs consists of “preassembly” switches, which require prior formation of a complex as a prerequisite for a SLiM-mediated interaction to occur. A subtype of the preassembly switch is the prerequisite formation of a “composite binding site” that spans more than one component of the complex. Because neither component on its own contains a functional motif-binding domain, binding of the motif only occurs in the context of the active, fully assembled complex (55).

The small footprint of SLiMs facilitates the occurrence of regions with high functional density containing multiple motifs that can bind cooperatively or competitively. “Avidity-sensing” switches involve multiple motifs that mediate high-avidity binding, effected by low-affinity single interactions that have a more than additive effect on binding strength (7, 35, 37). These high-avidity binding events can be mediated by multiple motifs within a single protein, even if the motifs are separated by large distances. The motifs can also be distributed over more than one protein, thereby promoting multiprotein complex formation. High-avidity interactions are indispensable for cell signaling because they mediate the assembly of large metastable signaling complexes at a biologically relevant time scale and contribute to the switch-like activation of the emergent properties of these complexes (36, 38). Alternatively, adjacent or overlapping motifs can bind in a mutually exclusive manner and thus promote competitive binding (22). The outcome of the competition depends on the intrinsic specificity and affinity of the SLiMs for, as well as the local abundance of, the competitors; modulating either of these can shift the balance in favor of a specific interactor. Mutually exclusive interaction interfaces are characterized by distinct on states, and “specificity” switches can mediate a regulated exchange of the distinct binding partners through different mechanisms, characterized by their subtypes. “Competition” switches between mutually exclusive SLiMs depend on local target protein abundance, which can be regulated by changing the expression level or subcellular localization of the competitors, as well as by scaffolding (1, 15, 40, 56, 57). “Altered binding specificity” switches tip the balance of the competition for mutually exclusive SLiMs in favor of one of the interactors by PTM-dependent modulation of the intrinsic specificity or affinity of the binding region (39). 
In addition, multiple, successive PTMs allow sequential switching of different binding partners in an ordered manner by stepwise alteration of binding specificity (41). A common and biologically important subset of competition switches involves mutually exclusive interactions where there is strong preferential binding of one of the competing interactors, either due to a large difference in the intrinsic specificity or affinity of the binding region for a specific competitor or due to a large difference in the local abundance of the competitors. The interaction that is favored thus inhibits binding of a mutually exclusive interactor. “Motif hiding” switches can preclude a motif from binding by sterically blocking its accessibility through binding of a protein to an overlapping or adjacent motif (58, 59). Similarly, “domain hiding” switches can sterically block binding of a SLiM to a domain by binding of another molecule to an overlapping site on the domain (60, 61). With respect to the functional outcome, these switches act in a binary fashion, and activation of the masked binding region depends on active removal of the preferred interactor.

Curation and visualization of switches.ELM data

The first release (version 1.0, March 2013) of the switches.ELM resource contained 710 manually curated switch instances, involving 817 distinct interactions mediated by 664 different SLiMs in 409 proteins. For each of these instances, different types of information were annotated to accurately describe the switch mechanism (see Materials and Methods). Each switch instance contains one or more SLiM-mediated interactions, with each interaction comprising two interactors. A specific switch mechanism modulates the function of a SLiM in one of the interactors and thereby regulates its interactions. The switch mechanism depends on a pretranslational mechanism, the addition of a PTM, or the binding of a molecule. It can be reversible or irreversible and can either positively or negatively affect a specific interaction. A positive effect can be either induction or enhancement of an interaction, whereas a negative effect can be partial inhibition or complete abrogation of an interaction. Also, interdependencies between different interactions were annotated. An interaction can be mutually exclusive with one or more other interactions or, in contrast, require one or more previous interactions to occur. Interactions that occur in a specific, defined order were numbered according to their position in the sequence of binding events. The context dependency of the switches is captured by adding temporal, spatial, and contextual conditionality to the interaction data. These data were either manually curated (table S1) or computationally inferred (table S2) (see Materials and Methods). This includes, when applicable, manual curation of the cell cycle phases during which a switch is active. Also, subcellular localization data were manually annotated, retrieved from manually curated localization data in ELM (43), or inferred from the Gene Ontology (GO) resource (62). 
Furthermore, pathways in which a switch is involved were manually annotated or inferred from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (63). For interactions regulated by PTMs, modifying enzymes were manually curated or inferred from the PhosphoSitePlus (64) and Phospho.ELM (65) resources. All curated and inferred data can be downloaded from the Web site in tab-separated values (tsv) format. Data from version 1.0 are supplied in tables S1 and S2.
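The annotation scheme described above can be pictured as a small data model. The following is only a sketch: the class names, field names, and the example values (`ProteinA`, `PartnerX`, `PartnerY`) are hypothetical and do not reflect the actual column layout of the downloadable tsv files.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Effect(Enum):
    """Effect of the switch on one interaction, as described in the text."""
    INDUCTION = "induction"                    # positive
    ENHANCEMENT = "enhancement"                # positive
    PARTIAL_INHIBITION = "partial inhibition"  # negative
    ABROGATION = "abrogation"                  # negative

    @property
    def is_positive(self) -> bool:
        return self in (Effect.INDUCTION, Effect.ENHANCEMENT)


@dataclass
class Interaction:
    motif_protein: str            # interactor that carries the SLiM
    partner: str                  # second interactor
    effect: Effect
    order: Optional[int] = None   # position in an ordered series of binding events
    mutually_exclusive_with: List[str] = field(default_factory=list)
    requires: List[str] = field(default_factory=list)


@dataclass
class SwitchInstance:
    switch_type: str              # e.g. "binary", "specificity" (hypothetical labels)
    trigger: str                  # "pretranslational", "PTM", or "binding"
    reversible: bool
    interactions: List[Interaction]
    # Contextual annotation, either curated or inferred
    cell_cycle_phases: List[str] = field(default_factory=list)
    localizations: List[str] = field(default_factory=list)
    pathways: List[str] = field(default_factory=list)
    modifying_enzymes: List[str] = field(default_factory=list)

    def ordered_interactions(self) -> List[Interaction]:
        """Return only the numbered interactions, sorted by binding order."""
        numbered = [i for i in self.interactions if i.order is not None]
        return sorted(numbered, key=lambda i: i.order)


# Hypothetical instance: a PTM releases PartnerX and recruits PartnerY.
example = SwitchInstance(
    switch_type="specificity",
    trigger="PTM",
    reversible=True,
    interactions=[
        Interaction("ProteinA", "PartnerY", Effect.INDUCTION, order=2,
                    mutually_exclusive_with=["PartnerX"]),
        Interaction("ProteinA", "PartnerX", Effect.ABROGATION, order=1,
                    mutually_exclusive_with=["PartnerY"]),
    ],
)
```

Keeping the effect, ordering, and interdependencies on the interaction rather than on the switch mirrors the statement that one switch instance can contain several interactions that are affected differently.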

The curated switches can be browsed by different categories, which are color-coded, or searched on the basis of all annotated information by user-defined search terms. Each switch instance can be viewed in a visual output that gives a clear and detailed overview of the regulatory mechanisms (fig. S1), including a concise text description, the type of switch, the participating molecules, the binding interfaces that mediate the interactions, and information on how interactions are regulated. The curated and inferred cell cycle phase, subcellular localization, pathway, and modifying enzyme data are provided as contextual information. For a sequence-centric view, an alignment of orthologous proteins corresponding to the protein region containing the SLiM of interest is shown (fig. S1).

Biologically relevant data, including genomic data (66), interaction interfaces (43, 67), modification sites (64, 65), mutation data (66), and structural information (68, 69), are mapped to the region and displayed. A diagram of the complete protein architecture is also provided to situate a switch in the context of the whole protein. Data displayed in this overview include other switch instances curated for the protein, SLiMs, protein domains, PTM sites, and secondary structure. Detailed descriptions for all data used in the switches output view are shown upon hovering over a feature, whereas clicking on the feature navigates to the corresponding external data source (43, 62–68, 70–73).

The switches.ELM exploratory analysis tool

To enable users to submit and analyze a motif of interest for putative switch mechanisms regulating its function, we developed a rule-based analysis tool and integrated it in switches.ELM (fig. S2). The tool is designed to be exploratory and is thus deliberately overpredictive because it puts forward all hypotheses concerning regulation of the entered motif that can be inferred given the available data. Further reading of the relevant literature, which is presented to the user, is necessary to gain an understanding of the true regulatory mechanisms. The switch mechanisms for a motif suggested by the tool are inferred from rules that were defined for the different switch types and subtypes on the basis of SLiM-specific modes of regulation and the curated switch instances (see Materials and Methods).

Some SLiM classes annotated in ELM are not intrinsically active and require switching to be functional. For motifs that can be mapped to the motif patterns defined for the different motif classes in ELM (43), rules specific to these motif classes are applied. For example, some motif classes require a PTM for binding, whereas others are allosterically regulated or depend on the formation of a composite binding site. Also, some SLiM classes are known to mediate multivalent binding when at least two motifs of the same class are present in a protein. Finally, some motifs have been observed to mediate intramolecular interactions, which enable regulation by motif and domain hiding if a protein contains both the motif and the corresponding motif-binding domain.

Additional general rules were defined on the basis of the collected switch instances. The presence of one or more experimentally determined PTM sites, retrieved from PhosphoSitePlus (64) and Phospho.ELM (65), within 20 residues upstream or downstream of a motif is suggestive of an altered physicochemical compatibility or a rheostatic switch mechanism, respectively. For motifs contained within a nonconstitutive exon that is specifically removed from one or more protein isoforms [on the basis of isoform data curated in UniProtKB (66)], pretranslational switching is suggested. The presence of overlapping or adjacent (within 20 residues) motifs might confer regulation by competition or motif hiding. Altered binding specificity is proposed as a switch mechanism if the conditions for both competition and altered physicochemical compatibility are met.
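The general rules in this paragraph can be sketched as a small rule engine. This is an illustration only, not the code behind the switches.ELM tool: the `Motif` class, the function names, and the distance convention are assumptions, the motif-class-specific rules from ELM are left out, and treating "one or more PTM sites" as "at least one site suggests altered physicochemical compatibility, at least two additionally suggest a rheostatic switch" is one possible reading of the text.

```python
from dataclasses import dataclass
from typing import List, Set

PTM_WINDOW = 20    # residues up- or downstream of the motif, as stated in the text
MOTIF_WINDOW = 20  # maximum distance for motifs to count as adjacent


@dataclass
class Motif:
    start: int                          # first residue, 1-based
    end: int                            # last residue, inclusive
    in_nonconstitutive_exon: bool = False


def site_distance(pos: int, motif: Motif) -> int:
    """Distance of a residue to the motif (0 if it lies inside the motif)."""
    if motif.start <= pos <= motif.end:
        return 0
    return motif.start - pos if pos < motif.start else pos - motif.end


def motif_distance(a: Motif, b: Motif) -> int:
    """Distance between two motifs (0 if they overlap); assumed convention."""
    return max(a.start - b.end, b.start - a.end, 0)


def suggest_switches(motif: Motif, ptm_sites: List[int],
                     other_motifs: List[Motif]) -> Set[str]:
    suggestions: Set[str] = set()

    nearby_ptms = [p for p in ptm_sites if site_distance(p, motif) <= PTM_WINDOW]
    if len(nearby_ptms) >= 1:
        suggestions.add("altered physicochemical compatibility")
    if len(nearby_ptms) >= 2:
        suggestions.add("rheostatic")

    if motif.in_nonconstitutive_exon:
        suggestions.add("pretranslational")

    if any(motif_distance(motif, m) <= MOTIF_WINDOW for m in other_motifs):
        suggestions.update({"competition", "motif hiding"})

    # Proposed only when both prerequisite conditions are met.
    if {"competition", "altered physicochemical compatibility"} <= suggestions:
        suggestions.add("altered binding specificity")

    return suggestions


# Hypothetical input: a spliced motif with two nearby PTM sites and a neighbor.
result = suggest_switches(
    Motif(100, 105, in_nonconstitutive_exon=True),
    ptm_sites=[98, 110, 300],
    other_motifs=[Motif(120, 125)],
)
```

Because every rule only adds to the set, the sketch is overpredictive by construction, which matches the exploratory intent described for the tool.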

Analysis of ELM instances for putative switch mechanisms

To suggest novel regulatory mechanisms for previously characterized SLiMs, we applied the set of rules describing the switch mechanisms to the 2070 experimentally validated SLiMs curated in the ELM database as of November 2012 (43) (see Materials and Methods). A total of 1789 switch mechanisms were proposed for the 2070 ELM instances, and hence, on average a motif annotated in ELM was suggested to be regulated by 0.86 switch mechanisms (Fig. 1A). Confidence in the suggested switch mechanisms varies. Several of these are inherently correct, such as pretranslational switching of functional motifs in nonconstitutive exons, whereas others, such as rheostatic switches, would require extensive biochemical characterization to be validated. In all, 1058 (51.1%) of the 2070 ELM instances had proteomic or genomic evidence suggestive of a pre- or posttranslational regulatory switch mechanism (Fig. 1B). Of these, 375 (18.1%) had experimental evidence for a switch mechanism available and were curated in switches.ELM. A total of 681 (32.9%) ELM instances were modified by PTMs on residues within or adjacent (within five residues of a defined residue) to the motif (Fig. 1A), whereas 214 (10.3%) were modified on defined residues (excluding modification sites used to induce binding to protein domains that recognize a specific PTM on a specific motif residue). These defined residues are typically in direct contact with the binding partner and contribute most of the binding specificity and affinity. Hence, the addition of a PTM to such a residue will likely have a strong effect on the binding properties of the motif. In addition, 218 instances (10.5%) were modified by PTMs on multiple adjacent positions (within five residues of a defined residue), potentially enabling integration of upstream signals acting rheostatically, through priming or through independent modification sites (Fig. 1A). 
Also, 115 (5.6%) of the SLiMs annotated in ELM occurred within 20 residues of each other, suggesting that they may mediate mutually exclusive binding events (Fig. 1A). Finally, 304 (14.7%) of the ELM instances occurred in nonconstitutive exons, corroborating a recent study showing preferential splicing of SLiM-containing exons (24) (Fig. 1A). All switch mechanisms suggested by the switches.ELM tool to regulate motifs curated in ELM are listed in table S3 and are available as an interactive results page on the switches.ELM Web site.
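The percentages quoted in this paragraph all use the 2070 ELM instances as the denominator and can be recomputed directly. The short sketch below does this with the counts given in the text; the dictionary keys are shorthand labels chosen here, and the split into the three slices of Fig. 1B assumes that the curated instances are a subset of the 1058 instances with evidence.

```python
TOTAL_ELM = 2070  # experimentally validated ELM instances analyzed

counts = {
    "any switch evidence": 1058,
    "curated in switches.ELM": 375,
    "PTM within or adjacent": 681,
    "PTM on defined residue": 214,
    "multiple adjacent PTMs": 218,
    "motif within 20 residues": 115,
    "nonconstitutive exon": 304,
}

# Percentage of ELM instances per category, rounded as in the text.
percentages = {name: round(100 * n / TOTAL_ELM, 1) for name, n in counts.items()}

# Average number of suggested switch mechanisms per ELM instance.
mean_switches = round(1789 / TOTAL_ELM, 2)

# Slices of the pie chart in Fig. 1B, assuming curated is a subset of "any".
curated = counts["curated in switches.ELM"]
suggested_only = counts["any switch evidence"] - curated
none_identified = TOTAL_ELM - counts["any switch evidence"]

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print(f"mean switch mechanisms per motif: {mean_switches}")
```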

Fig. 1

Analysis of ELM instances. (A) Bar plot of the number of ELM instances suggested to be regulated by a specific switch mechanism (light gray) and the number of these already curated in switches.ELM (dark gray). (B) Pie chart of the number of ELM instances known to be regulated by a switch mechanism (Curated), suggested to be regulated by a switch mechanism using the switches.ELM rules (Suggested), and with no information that indicates regulation by a switch mechanism (None identified). (C) Plot of the proportion of known PTM sites flanking ELM instances. The central vertical line denotes the position of the motif. Twenty-five residues before the first and after the last residue of the motif are plotted. The gray-shaded region indicates the adjacent positions (within five residues of the motif). The horizontal line denotes the mean proportion of residues modified for the 50 residues shown. Black circles denote positions that have more than the mean proportion of PTM sites, whereas gray squares denote positions that have less. Mann-Whitney tests show a statistically significant enrichment of PTM sites adjacent to known motifs (P = 1.48 × 10^−4, U = 378.5, N_A = 11, N_B = 40; P = 1.12 × 10^−3, U = 353.5, N_A = 11, N_B = 40 when only high-throughput PTM data were used; and P = 3.56 × 10^−3, U = 337.5, N_A = 11, N_B = 40 when ELM MOD classes were omitted).

Analysis of ELM instances suggested many switch mechanisms that might regulate known functional motifs. Although direct experimental confirmation could be found in the literature for some of these, evidence was either circumstantial or absent for many others. Some representative examples of the latter are discussed here to demonstrate how hypotheses developed from suggestions made by switches.ELM can direct further experimental elucidation of these systems and how the tool can be useful to guide and focus experimental research aimed at interpreting high-throughput modification data.

The first example looks at transcription intermediary factor 1-β (TIF1-β), which functions as a co-repressor of heterochromatin protein 1 (HP1) homologs to mediate transcriptional repression of genes, including cell cycle enhancers such as cyclin A2 (CCNA2) (74). The switches.ELM tool suggested several switch mechanisms for the HP1-binding motif of TIF1-β, including an altered physicochemical compatibility switch mediated by phosphorylation of the Ser489 residue. This suggestion resulted from integration of data from a large-scale study of cell cycle–regulated protein phosphorylation, which showed increased phosphorylation of TIF1-β in cells arrested in M phase compared to cells arrested in G1 phase of the cell cycle (75). Two sites with increased phosphorylation were Ser473 and Ser489, which are adjacent to and part of, respectively, the HP1-binding motif. Phosphorylation of Ser473 has previously been shown to inhibit binding of TIF1-β to HP1 and to be associated with induction of CCNA2 expression at S phase (74). Thus, direct phosphorylation of the motif on Ser489 might serve the same purpose because it occurs under similar conditions and probably has the same, if not stronger, effect on binding to HP1. This example shows how the switches.ELM tool offers testable hypotheses by integrating motif and PTM data from low- and high-throughput experiments.

The second example looks at the erythropoietin (EPO) receptor (EPOR-F), which stimulates proliferation and terminal differentiation of committed erythroid progenitor cells in response to its ligand. EPO-induced dimerization of the receptor initiates Janus kinase (JAK) and signal transducer and activator of transcription (STAT) signaling. Recruitment of a JAK dimer to the EPOR-F results in tyrosine-specific phosphorylation of the receptor, which creates docking sites for STAT proteins. The latter become phosphorylated by JAK, translocate to the nucleus, and initiate transcription of target genes, including antiapoptotic factors such as B cell lymphoma 2 (BCL2) (76, 77). Early-stage progenitor cells predominantly express the truncated EPOR-T isoform, which acts as a dominant-negative receptor for EPO-mediated signaling. In addition, cells transfected with EPOR-T are more susceptible to apoptosis (78). The switches.ELM tool suggested that these observations might be explained by a pretranslational switch that removes two SH2 (Src homology 2)–binding motifs, which are involved in phospho-dependent recruitment of STAT proteins, from the EPOR-T isoform.

Analysis of ELM instances for enrichment of motif-proximal modification sites

A large subset of the altered physicochemical compatibility switches curated in switches.ELM consisted of SLiMs that are regulated by the addition of a specific PTM to a specific residue in the motif, which induces binding to PTM-binding domains such as the SH2, phosphotyrosine-binding (PTB), and bromo domains (6, 29, 31). However, many motifs annotated in switches.ELM are regulated by the addition of PTMs to residues that are distinct from these position-specific PTM sites that induce binding to PTM-binding domains. Of the 710 curated switches, 80 are regulated by this mechanism, and the vast majority of their switching modifications occur within five residues of the motif [87 (76.3%) of the 114 modification sites involved in this subset of switches]. To assess if this is a general feature of SLiMs, the available modification data (64, 65) were mapped to the ELM instances (43) to determine whether there was an enrichment of modification sites around validated functional SLiMs. The residues adjacent (within five residues of a defined residue) to validated motif instances had an enrichment of modification sites [690 (3.6%) of the 19,112 residues are modified] when compared to a background data set, in which 2151 (2.9%) of the 73,005 residues are modified (see Materials and Methods). The proportion of PTMs present in each set was calculated, and a statistically significant enrichment of modification sites adjacent to known motifs was found (Fig. 1C). Similar results were obtained when modification data were restricted to only high-throughput experimentation, suggesting that the observation is not an acquisition bias due to annotation of the modifications in the motif literature. Also, when ELM motif instances belonging to a MOD class, which are recognized and modified by enzymes, were omitted from the analysis, the enrichment of modification sites adjacent to motif instances was still significant. These findings agree with results from a previous study analyzing coupling between motifs and phosphorylation events (23).
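The size of the enrichment reported above, and the way a Mann-Whitney U statistic is obtained, can be illustrated with a few lines of code. This is a minimal sketch and not the authors' analysis: the per-position proportions behind Fig. 1C are not given in the text, so the U function is shown on its own, and interpreting the reported U as belonging to the adjacent group is an assumption.

```python
def mann_whitney_u(a, b):
    """U statistic for sample a: number of pairs (x in a, y in b) with x > y,
    counting ties as 0.5. U ranges from 0 to len(a) * len(b)."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Proportion of modified residues adjacent to motifs versus the background set.
adjacent = 690 / 19112
background = 2151 / 73005
fold_enrichment = adjacent / background

# With N_A = 11 and N_B = 40 the largest possible U is 11 * 40 = 440.
# U divided by that maximum is the fraction of (adjacent, flanking) position
# pairs in which the adjacent position has the higher PTM proportion.
max_u = 11 * 40
fraction_of_pairs = round(378.5 / max_u, 2)

print(f"adjacent: {adjacent:.1%}, background: {background:.1%}")
print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"U / (N_A * N_B): {fraction_of_pairs}")
```

The fold enrichment is modest in absolute terms, which is why a rank-based test over positions is a natural way to ask whether the adjacent positions are consistently above the flanking ones.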

Analysis of switches.ELM instances

Previous studies indicate that motifs are enriched in IDRs (22). We found that 479 (72.1%) of the 664 SLiMs that participate in curated switches occurred in predicted IDRs. In comparison, 1428 (68.9%) of the 2070 SLiM instances annotated in ELM occur in predicted IDRs (Fig. 2A). This slight difference is not statistically significant and can be explained by the large number of glycosylation sites, which often occur in globular regions, in the ELM data set. Hence, motifs in the switches.ELM repository concur with previous observations that motifs are enriched in IDRs (22); however, as expected, they are no more likely to occur in IDRs than annotated functional motifs in general.

Fig. 2

Analysis of switches.ELM instances. (A) Proportion of motifs in the switches.ELM resource (479 of 664 annotated SLiMs) and motifs in the ELM database (1428 of 2070 annotated SLiMs) predicted to be intrinsically disordered. A Fisher’s exact test indicates that the difference is not statistically significant (P = 0.13). (B) Proportion of motifs in the switches.ELM resource (all: 126 of 532 annotated SLiMs; human: 112 of 395 annotated SLiMs) and motifs in the ELM database (all: 304 of 2070 annotated SLiMs; human: 232 of 1168 annotated SLiMs) located in nonconstitutive exons. A Fisher’s exact test indicates a statistically significant enrichment of SLiMs located in nonconstitutive exons in switches.ELM compared to ELM, both when all motifs in the two data sets were considered and when only human motifs were considered (all: P = 1.57 × 10^−6; human: P = 5.66 × 10^−4). (C) Pie chart of the number of modification sites involved in PTM-dependent switches curated in switches.ELM grouped by the number of modifying enzymes inferred per modification site.

Recent analyses have shown motifs to be enriched in nonconstitutive exons, which enables isoform-specific rewiring of protein interaction networks (5, 24, 25). In addition, motifs involved in switches curated in switches.ELM were more likely to be located in nonconstitutive exons compared to ELM instances. We found that 304 (14.7%) of the 2070 SLiMs in ELM occurred in nonconstitutive exons, whereas this was true for 126 (23.7%) of 532 SLiMs in switches.ELM (excluding the 132 motifs involved in pretranslational switches) (Fig. 2B). This difference is statistically significant and did not result from the bias toward human SLiMs in switches.ELM or the superior coverage for human SLiMs in the available protein isoform data [112 (28.4%) of 395 human SLiMs from switches.ELM compared with 232 (19.9%) of 1168 human SLiMs from ELM] (Fig. 2B). This indicates a tight regulation of the switch mechanisms at both pre- and posttranslational levels.

Modifying enzymes could be inferred for 182 (42.6%) of the 427 modification sites involved in regulating PTM-dependent switches curated in switches.ELM (see Materials and Methods) (table S2). Of the 182 sites for which modifying enzymes could be inferred, 83 (45.6%) have experimental evidence for modification by two or more distinct enzymes, emphasizing the integrative nature of these switch mechanisms (Fig. 2C). More than 50 families of modifying enzymes were inferred to regulate the curated switches, with the CDC2 (cell division control protein 2; also known as cyclin-dependent kinase 1), MAPK (mitogen-activated protein kinase), and the tyrosine protein kinase SRC families being particularly well represented.

Discussion

Intrinsically disordered interfaces, and SLiMs in particular, enable rapid and robust regulation of protein function and activity in response to prevailing conditions and therefore combinatorial control of cell physiology (11, 15, 22, 23). Consequently, dynamic, tunable interactions mediated by IDRs of higher eukaryotic proteomes underlie a substantial portion of the ability of cellular networks to cooperatively make decisions (11, 15, 79–82). The compactness of SLiM interaction interfaces facilitates low-affinity interactions that can be easily modulated and permits high functional density. This enables SLiMs to act as switchable modules that can respond to multiple input signals to direct cell signaling in a context-dependent manner. An additional consequence of the short length and degeneracy of motifs is their propensity for de novo evolution because a functional motif can be gained or lost by a few or even a single mutation (21, 22). The convergent evolution of new motif instances enabling recruitment of existing motif-binding domains can add functionality to proteins by rewiring their interaction networks (21, 83). Furthermore, the addition of SLiMs and modification sites to proteins enables highly complex interface regulation to evolve in a modular and additive way.

The instances of SLiM-based switch mechanisms collected in the switches.ELM resource show that many examples of motif-mediated decision-making interfaces have now been experimentally characterized. Furthermore, using the available genomic and proteomic data (43, 64–66), we showed evidence that many of the known motifs are regulated either pre- or posttranslationally and revealed that modified residues were enriched around known motifs, suggesting that motif-based switches are common in higher eukaryotes. This synergistic relationship between motifs and modification sites, as well as the important regulatory functionality afforded by these cooperatively regulated interfaces, offers an appealing functional explanation for the extensive intrinsic disorder content of higher eukaryotic proteomes (84) and for the many PTMs mapped to these regions (17, 19). These observations indicate that modification sites should be considered as a mechanism for switching the intrinsic binding specificity of the underlying sequence, in addition to the canonical PTM-dependent sites recognized by PTM-binding domains such as the SH2, PTB, and bromo domains (6).

By separating SLiM-based regulatory systems into their most basic parts, such as individual binding interfaces and PTM sites, we can gain insight into the emergent properties enabled by cooperative use of a multiplicity of these simple modules. A publicly available database of experimentally validated motif-based switches will assist research on cell regulatory mechanisms and help extend our knowledge on how these systems conditionally direct cell signaling. As we gain a better understanding of recurring regulatory mechanisms, experimental design will advance to reflect this improved information, providing us with a more complete view of these complex interfaces. By integrating the available information, the rule-based switches.ELM exploratory analysis tool can guide this design by identifying possible context-dependent modes of regulation.

The switches.ELM resource may also prove useful for systems biologists. Currently, interaction data are captured in a binary fashion in bioinformatics resources, with limited information on specific requirements depending on cellular context (85). The switches.ELM resource adds temporal, spatial, and contextual conditionality to one-dimensional interaction data to capture their context dependency and interdependency. This conditional information can be incorporated into rule-based systems models describing the complex regulatory interaction networks in the cell, thereby increasing the quality and utility of such approaches. To facilitate data exchange and analysis, collaborative efforts are under way to integrate switches.ELM with current systems biology resources. In conclusion, switches.ELM adds to and is complementary to a growing list of resources (43, 86–89) that allow a better understanding of SLiMs and can help to direct their in-depth experimental characterization.

Materials and Methods

Collection, curation, and visualization of switches.ELM data

Experimentally validated motif-based switches were collected by a literature survey of publications retrieved from directed searches in PubMed and references annotated in the ELM database (43). Relevant publications were further mined for related citations. In addition, we used the switches.ELM tool to analyze SLiM instances curated in ELM and retrieve literature that experimentally validated the suggested switch mechanisms. Users are encouraged to use the Web site’s submission form to submit publications describing experimentally validated switches not yet annotated in the repository, which will be reviewed and added to the database.

Each switch instance is described by one or more interactions, each of which comprises two interactors that are identified by a name and identifier from an external resource [UniProtKB for proteins (66), the Chemical Entities of Biological Interest (ChEBI) database for small molecules (70), and the European Nucleotide Archive (ENA) for nucleotide sequences (71)]. For each protein interactor, the binding region that mediates the interaction is annotated using an identifier referencing an external resource [ELM for SLiMs (43) and the Protein family (Pfam) (67) or InterPro database (72) for protein domains] and the start and stop position of the binding region in the sequence. If the position of the binding region given by the authors in a publication is different from that found in the available resources, the positions that map this region to the canonical sequence in UniProtKB were chosen.

Each switch instance is categorized by a switch type and subtype that describes how the function of a motif in one of the interactors is switched to regulate its interactions. This involves either a pretranslational mechanism, the addition of a PTM, or the binding of a molecule to a specific interactor. PTMs are annotated by the position of the modified residue in the sequence and the type of modification, using the PSI-MOD controlled vocabulary for protein modifications (90), whereas effector molecules are identified by reference to an external resource (66, 70, 71). The switch mechanism can be reversible or irreversible and can be positive (either induce or enhance an interaction) or negative (either inhibit or abrogate an interaction). Interactions can be intramolecular, mutually exclusive, require preceding interactions, or have a specific order, in which case they are numbered according to their position in the sequence of binding events.

Each switch instance is provided with a concise text description and a list of publications from which it was curated. If available, structural information [Protein Data Bank (PDB) identifier (68)] and affinity values (dissociation constant in micromolar) for the interactions are collected. Contextual information was manually curated from the literature, retrieved from manually curated data in ELM (43), or computationally inferred from external resources [subcellular localization data from GO (62), pathway data from KEGG (63), and enzymatic modification data from PhosphoSitePlus (64) and Phospho.ELM (65)]. Subcellular localization for 353 switch instances could be inferred from colocalization in a cellular compartment of at least two proteins involved in the switch. Similarly, 210 switch instances were mapped to KEGG pathways, such as the ErbB pathway (fig. S3), by co-occurrence in a pathway of at least two proteins involved in the switch. Enzymatic modification data could be mapped to 182 of the modification sites annotated in switches.ELM.
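The co-occurrence criterion used to infer localization and pathway context can be sketched as follows. This is a minimal illustration, not the code used for switches.ELM: the function name, the input layout (a mapping from protein identifier to a set of annotation terms), and the example annotations are all assumptions made for this sketch.

```python
from collections import Counter


def shared_annotations(protein_terms, min_proteins=2):
    """Return terms annotated to at least `min_proteins` proteins of one switch.

    protein_terms: dict mapping a protein identifier to a set of annotation
    terms (for example GO cellular compartments or KEGG pathway names).
    """
    counts = Counter()
    for terms in protein_terms.values():
        counts.update(set(terms))  # count each term once per protein
    return sorted(term for term, n in counts.items() if n >= min_proteins)


# Hypothetical annotations, for illustration only.
switch_proteins = {
    "protein_A": {"nucleus", "cytosol"},
    "protein_B": {"cytosol", "plasma membrane"},
    "protein_C": {"nucleus", "cytosol"},
}
print(shared_annotations(switch_proteins))  # ['cytosol', 'nucleus']
```

A term carried by only one protein of the switch ("plasma membrane" here) is not reported, which mirrors the requirement for at least two proteins.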

Biologically relevant data for the sequence-centric view of the protein region surrounding a SLiM whose function is regulated by a switch mechanism were integrated from various resources, and all data on a switch instance output page are linked to the corresponding resources for more detailed information [ELM (43) for validated SLiMs; Pfam (67) and InterPro (72) for protein domains; PhosphoSitePlus (64) and Phospho.ELM (65) for modification sites; UniProtKB (66) for proteins, protein isoform data, single-nucleotide polymorphisms (SNPs), and experimental mutations; ChEBI (70) for small molecules; ENA (71) for nucleotide sequences; KEGG (63) and Reactome (73) for pathways; PDB (68) for structural information; GO (62) for GO terms; PubMed for publications; disorder scores calculated by IUPred (69); and sequence matches to ELM regular expressions (43)].

Rules to analyze motifs for putative switch mechanisms

The motif class–specific rules were developed from motif classes annotated in ELM that are regulated by a specific switch mechanism and are applied to motifs that can be mapped to the regular expressions of these ELM motif classes using the CompariMotif tool (91):

(i) Altered physicochemical compatibility switches are suggested for ELM classes whose binding is induced by PTM (LIG_14-3-3 classes, LIG_AGCK_PIF_1, LIG_BRCT_BRCA1 classes, LIG_BRCT_MDC1_1, LIG_FHA classes, LIG_ODPH_VHL_1, LIG_PTB_Phospho_1, LIG_SCF_FBW7 classes, LIG_SCF-TrCP1_1, LIG_SH2 classes, LIG_WW_Pin1_4, MOD_TYR_ITAM, MOD_TYR_ITIM, MOD_TYR_ITSM) and substrate motifs for modifying enzymes whose recognition depends on a priming PTM on the motif (MOD_CK1_1, MOD_GSK3_1).

(ii) Allostery switches are suggested for motif classes known to be regulated by allosteric mechanisms (LIG_CORNRBOX, LIG_NRBOX, TRG_ENDOCYTIC_2, TRG_LysEnd_ApsAcLL_1).

(iii) Preassembly switches are suggested for motifs that require preformation of a composite binding site (LIG_Integrin_isoDGR_1, LIG_RGD, LIG_SCF_Skp2-Cks1_1).

(iv) Avidity sensing is suggested for ELM classes known to mediate multivalent binding (MOD_TYR_ITAM, LIG_14-3-3 classes, LIG_EH_1, LIG_AP2alpha_2, LIG_SH2 classes, LIG_SH3 classes). Whereas a single MOD_TYR_ITAM motif mediates multivalent binding, an avidity-sensing switch mechanism is only suggested for the other motif classes when there are at least two instances of the same motif class present within a protein.

(v) Motif and domain hiding, through an intramolecular interaction, is suggested for LIG_SH2 classes, LIG_SH3 classes, and LIG_PDZ classes if the protein containing the motif also contains the corresponding motif-binding domain.
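As an illustration of how a motif class–specific rule might be encoded, the sketch below implements only rule (iv), avidity sensing. It is not the switches.ELM implementation. Matching class families by name prefix is an assumption made here to cover entries such as "LIG_SH2 classes", and the class names in the usage lines serve as examples only.

```python
from collections import Counter

# Class families listed in rule (iv); prefix matching is an assumption.
AVIDITY_PREFIXES = ("LIG_14-3-3", "LIG_EH_1", "LIG_AP2alpha_2", "LIG_SH2", "LIG_SH3")
# A single instance of this class is sufficient according to rule (iv).
SINGLE_INSTANCE_CLASS = "MOD_TYR_ITAM"


def suggests_avidity_sensing(motif_classes):
    """motif_classes: ELM class names, one entry per motif instance in a protein."""
    counts = Counter(motif_classes)
    if counts[SINGLE_INSTANCE_CLASS] >= 1:
        return True
    # Other classes need at least two instances of the same class.
    return any(
        n >= 2 and cls.startswith(AVIDITY_PREFIXES)
        for cls, n in counts.items()
    )


print(suggests_avidity_sensing(["MOD_TYR_ITAM"]))                  # True
print(suggests_avidity_sensing(["LIG_SH2_GRB2"]))                  # False
print(suggests_avidity_sensing(["LIG_SH2_GRB2", "LIG_SH2_GRB2"]))  # True
```

Instances are counted per exact class name, so two motifs from different SH2 classes do not trigger the rule in this sketch.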

General rules were developed on the basis of the curated switch instances and depended on the integration of experimentally derived modification sites (64, 65), sequence information for protein isoforms (66), and experimentally validated SLiMs and sequence matches to ELM regular expressions (43):

(i) Altered physicochemical compatibility switching is suggested if there is an experimentally validated modified residue in or adjacent (within 20 residues) to the motif.

(ii) Pretranslational switches are suggested for motifs contained within a nonconstitutive exon that is specifically removed from one or more isoforms of a protein.

(iii) Rheostatic switching is suggested if multiple experimentally validated modified residues are present in or adjacent (within 20 residues) to the motif.

(iv) Competition and motif hiding switches are suggested for motifs overlapping or adjacent (within 20 residues) to experimentally validated SLiMs or sequences matching the ELM regular expressions.

(v) Altered binding specificity switches are suggested for motifs overlapping or adjacent (within 20 residues) to experimentally validated SLiMs or sequences matching the ELM regular expressions if there is an experimentally validated modified residue present in or adjacent (within 20 residues) to at least one motif.
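The five general rules can be read as a small decision procedure over a motif, the modified residues in its protein, and the other motifs in that protein. The sketch below is one possible encoding under stated assumptions, not the code of the switches.ELM tool: the `Motif` type, the function names, the exact interpretation of "within 20 residues", and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass

WINDOW = 20  # flanking distance in residues, as stated in the general rules


@dataclass
class Motif:
    start: int
    end: int
    in_nonconstitutive_exon: bool = False


def near(position, motif, window=WINDOW):
    """True if a residue lies in the motif or within `window` residues of it."""
    return motif.start - window <= position <= motif.end + window


def motifs_close(a, b, window=WINDOW):
    """True if two motifs overlap or lie within `window` residues of each other."""
    return a.start <= b.end + window and b.start <= a.end + window


def suggest_switches(motif, modified_positions, other_motifs):
    near_mods = [p for p in modified_positions if near(p, motif)]
    neighbours = [m for m in other_motifs if motifs_close(motif, m)]
    suggestions = []
    if near_mods:                                   # rule (i)
        suggestions.append("altered physicochemical compatibility")
    if motif.in_nonconstitutive_exon:               # rule (ii)
        suggestions.append("pretranslational")
    if len(near_mods) >= 2:                         # rule (iii)
        suggestions.append("rheostatic")
    if neighbours:                                  # rule (iv)
        suggestions.append("competition / motif hiding")
        # Rule (v): a modified residue in or near at least one of the motifs.
        if near_mods or any(near(p, m) for m in neighbours for p in modified_positions):
            suggestions.append("altered binding specificity")
    return suggestions


# Hypothetical coordinates, for illustration only.
query = Motif(100, 106, in_nonconstitutive_exon=True)
print(suggest_switches(query, [98, 110, 300], [Motif(120, 125)]))
```

With these example inputs all five mechanisms are suggested, because two modified residues and a second motif fall inside the 20-residue window and the motif lies in a nonconstitutive exon.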

Analysis of ELM instances for putative switch mechanisms

The ELM data set (as of November 2012) consists of 2070 experimentally validated functional SLiMs (ELM instances) in 1325 different proteins. The ELM instances were manually curated and categorized into 178 motif classes (43). The rules applied for the analysis of ELM instances differed from those integrated in the switches.ELM analysis tool, which were less stringent to reflect the exploratory function of the tool. Altered physicochemical compatibility and rheostatic switching were suggested for ELM instances with one or multiple experimentally derived modification sites, respectively, in a motif or within 5 residues of the motif, instead of the 20 residues specified for the switches.ELM tool. In addition, only experimentally validated motifs annotated in ELM were considered here to determine the presence of overlapping or adjacent motifs. Sequences matching the ELM regular expressions were used by the switches.ELM tool, but these were not taken into account in the analysis of ELM instances.

Analysis of ELM instances for enrichment of motif-proximal modification sites

A data set of experimentally derived modification sites (acetylation, methylation, phosphorylation, sumoylation, ubiquitylation) was retrieved from PhosphoSitePlus (64) and Phospho.ELM (65) in November 2012. A high-throughput modification site data set was also created from definitions made by these source databases. These modification sites were mapped to the 1325 SLiM-containing proteins from the ELM data set (containing 2070 SLiMs) or to the ELM data set excluding the MOD motif classes (1622 SLiMs in 1084 proteins) (43). Two sets of modification sites were created: “adjacent sites” are modification sites within five residues of but not including a defined motif residue (690 modifications on 19,112 residues); “background sites” are those greater than 5 and less than or equal to 25 residues from a defined motif residue (2151 modifications on 73,005 residues). The number of modification sites at each position relative to the motif was calculated and corrected for the nonuniform distribution of sample sizes at different distances from the defined motif. The proportions of modified residues at each position in the “adjacent” and “background” sets of positions were compared using a two-tailed Mann-Whitney U test. This is a nonparametric test for assessing whether observations in one sample tend to be larger than observations in another sample; in this case, whether amino acids within 5 residues of a motif tend to have more PTM sites than residues between 5 and 25 residues from a motif.
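The comparison described above can be outlined in a few lines. The sketch below is illustrative only and makes several assumptions not stated in the text: offsets are signed distances from the nearest defined motif residue, the correction for unequal sample sizes is approximated by dividing by the number of residues observed at each offset, and the example counts are invented. It computes the Mann-Whitney U statistic directly from its definition; the P value would normally be obtained from a statistics package.

```python
def proportion_by_offset(modified, totals):
    """Fraction of modified residues at each offset from the motif.

    Dividing by the number of residues observed at each offset is one simple
    way to account for unequal sample sizes at different distances.
    """
    return {d: modified.get(d, 0) / n for d, n in totals.items() if n > 0}


def split_sets(proportions, adjacent_max=5, background_max=25):
    """Split per-offset proportions into adjacent (1-5) and background (6-25)."""
    adjacent = [p for d, p in proportions.items() if 1 <= abs(d) <= adjacent_max]
    background = [p for d, p in proportions.items()
                  if adjacent_max < abs(d) <= background_max]
    return adjacent, background


def mann_whitney_u(xs, ys):
    """U statistic for xs versus ys: each pair with x > y adds 1, each tie 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Hypothetical counts, for illustration only: 100 residues at every offset,
# 5 modified at offsets 1-5 and 3 modified at offsets 6-25, on both sides.
totals = {d: 100 for d in range(-25, 26) if d != 0}
modified = {d: 5 if abs(d) <= 5 else 3 for d in totals}

adjacent, background = split_sets(proportion_by_offset(modified, totals))
u = mann_whitney_u(adjacent, background)
print(len(adjacent), len(background), u)  # 10 40 400.0
```

In this toy data set every adjacent proportion exceeds every background proportion, so U reaches its maximum of 10 × 40 = 400. For the pooled counts reported above, 690/19,112 and 2151/73,005 correspond to 3.6% and 2.9% modified residues, respectively.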

Analysis of switches.ELM instances

The switches.ELM data set (as of March 2013) contained 710 experimentally validated switches comprising 817 distinct interactions that are mediated by 664 different SLiMs in 409 proteins. For all SLiMs in the switches.ELM and ELM data sets, the average disorder score was calculated as the mean IUPred score (69) for residues between the start and end position of the SLiMs. A cutoff of 0.4 was used (22), with values equal to and above this cutoff indicating disordered regions and values below 0.4 indicating globular regions. The proportion of disordered motifs in both data sets was compared by the Fisher’s exact test. Similarly, the number of SLiMs contained within a nonconstitutive exon [assigned from protein isoform data in UniProtKB (66)] was determined for both data sets, and the counts were compared by the Fisher’s exact test. For this analysis, the 132 SLiMs that were involved in pretranslational switches were not included in the switches.ELM data set. Counts for the number of modifying enzymes per modification site were derived by mapping modification data (64, 65) to the modification sites involved in PTM-dependent switches. The modifying enzymes were grouped according to protein families defined in UniProtKB (66).
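For readers who wish to reproduce the contingency-table comparisons, a two-sided Fisher's exact test can be written with the standard library alone. The sketch below is not the code used for the published analysis. It assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one, so the resulting values may differ slightly from those reported in Fig. 2 if another convention was used.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = _log_comb(row1 + row2, col1)

    def log_prob(x):
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denom

    observed = log_prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    tol = 1e-7  # tolerance for floating-point ties, on the log scale
    total = sum(exp(lp) for lp in map(log_prob, range(lo, hi + 1))
                if lp <= observed + tol)
    return min(1.0, total)


# Counts taken from the text: disordered versus globular motifs, and motifs
# inside versus outside nonconstitutive exons (switches.ELM versus ELM).
p_disorder = fisher_exact_two_sided(479, 664 - 479, 1428, 2070 - 1428)
p_exon = fisher_exact_two_sided(126, 532 - 126, 304, 2070 - 304)
print(p_disorder, p_exon)
```

Working with log-probabilities through `lgamma` avoids overflow in the binomial coefficients for tables of this size.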

Supplementary Materials

www.sciencesignaling.org/cgi/content/full/6/269/rs7/DC1

Fig. S1. Visual output of switch instances in switches.ELM.

Fig. S2. Visual output of the switches.ELM exploratory analysis tool.

Fig. S3. Switch instances in switches.ELM mapped to the ErbB signaling pathway annotated in KEGG.

Table S1. Curated data in the first version of switches.ELM.

Table S2. Inferred data in the first version of switches.ELM.

Table S3. Putative switch mechanisms for experimentally validated functional motifs curated in the ELM database.

References and Notes

Acknowledgments: We thank A. Järvelin, M. Seiler, B. Uyar, and N. Haslam for critical evaluation of the manuscript and switches.ELM resource. We also thank our colleagues in the motif biology field for useful discussions and suggestions. Funding: This work was supported by an FP7 Health grant (no. 242129; SyBoSS) from the European Commission (K.V.R.), an EMBL International PhD Programme fellowship (R.J.W.), and an EMBL Interdisciplinary Postdoctoral fellowship (EIPOD) (N.E.D.). Author contributions: K.V.R. and N.E.D. designed the study; K.V.R. and N.E.D. wrote the manuscript; K.V.R. and R.J.W. curated the data; N.E.D. and H.D. developed the switches.ELM Web site; N.E.D. performed the analyses; T.J.G. provided advice and edited the manuscript; and all authors discussed the results and commented on the manuscript. Competing interests: The authors declare that they have no competing interests.