Review

EH and UIM: Endocytosis and More


Science's STKE  16 Dec 2003:
Vol. 2003, Issue 213, pp. re17
DOI: 10.1126/stke.2132003re17

Abstract

Exogenously and endogenously originated signals are propagated within the cell by functional and physical networks of proteins, leading to numerous biological outcomes. Many protein-protein interactions take place between binding domains and short peptide motifs. Frequently, these interactions are inducible by upstream signaling events, in which case one of the two binding surfaces may be created by a posttranslational modification. Here, we discuss two protein networks. One, the EH-network, is based on the Eps15 homology (EH) domain, which binds to peptides containing the sequence Asn-Pro-Phe (NPF). The other, which we define as the monoubiquitin (mUb) network, relies on monoubiquitination, which is emerging as an important posttranslational modification that regulates protein function. Both networks were initially implicated in the control of plasma membrane receptor endocytosis and in the regulation of intracellular trafficking routes. The ramifications of these two networks, however, appear to extend into many other aspects of cell physiology as well, such as transcriptional regulation, actin cytoskeleton remodeling, and DNA repair. The focus of this review is to integrate available knowledge of the EH- and mUb networks with predictions of genetic and physical interactions stemming from functional genomics approaches.

Introduction

Biochemical cell signals that originate both exogenously and endogenously are propagated within the cell by functional and physical networks of proteins, leading to a vast array of biological outcomes. Many of the interconnections within signaling pathways are mediated by protein-protein interactions between binding domains and short peptide motifs (or particular lipids). In some cases, these interactions are constitutive, whereas in others they are inducible by upstream signaling events. In the latter instance, one of the two binding surfaces is frequently created, at least in part, by a posttranslational modification. Here, we will focus on two protein networks. One, the EH-network, is based on the Eps15 homology (EH) domain, which binds to peptides containing the amino acid sequence Asn-Pro-Phe (NPF). The other, which we define as the monoubiquitin (mUb)-network, relies on interactions involving monoubiquitination, which is emerging as an important posttranslational modification that regulates protein function. Many proteins harbor ubiquitin (Ub)-binding regions, such as the Ub-interacting motif (UIM), and are thus able to interact specifically with Ub-containing proteins. Both networks were initially implicated in the control of plasma membrane receptor endocytosis and in the regulation of intracellular trafficking routes. The effects of these two networks, however, appear to extend into many other aspects of cell physiology as well, such as transcriptional regulation, actin cytoskeleton remodeling, and DNA repair.

The ultimate goal in signaling research is to identify the complete repertoire of intracellular transducers and to build physical and functional maps of their complex interactions. High-throughput, low-resolution approaches, coupled with genomics knowledge, offer useful tools to compile such atlases, which in turn provide testable hypotheses for classic high-resolution genetic and biochemical studies. Thus, the major focus of this review will be to integrate current knowledge about the EH- and mUb networks with predictions of genetic and physical interactions stemming from functional genomics approaches.

The EH Domain and the "EH Network"

The EH domain is a protein-protein interaction domain that was originally identified as a motif present in three copies at the N termini of the endocytic proteins eps15 and eps15R (1). Various approaches identified three classes of EH-binding peptides (2, 3). Most EH domains bind preferentially to NPF-containing peptides, designated class I peptides (2). Class II peptides are much less common and are characterized by Trp-Trp (WW), Phe-Trp (FW), or Ser-Gly-Trp (SGW) consensus sequences (2). Finally, a His-Ser/Thr-Phe [H(S/T)F] motif in class III peptides binds exclusively to one EH-containing protein from yeast, End3p (2).

The structures of five different EH domains have been determined [Supplementary Information 1], and the molecular and structural bases of their binding to class I and class II peptides have been elucidated and reviewed extensively (4-10). Briefly, EH domains share the same fold, composed of two closely associated helix-loop-helix motifs, also called EF-hands, connected by a short antiparallel β-sheet. EF-hands are endowed with Ca2+ binding properties (11). All EH domains display two EF-hand structures, but not all of them possess all of the residues required to bind Ca2+, as defined by the canonical and the pseudo EF-hand consensus sequences (12, 13). In the second EH domain of eps15, the binding pocket for the NPF motif is formed mainly by Leu155, Leu165, and Trp169, and the latter two residues are the most conserved among all EH domains (10). Mutations at these positions are not compatible with binding to NPF-containing peptides (7, 14). In the EH domain, the residue at position +3 with respect to the conserved Trp contributes to the recognition specificity of the target peptide. EH domains with a preference for NPF have alanine or serine at this position, whereas domains that bind FW, WW, or SGW display slightly larger amino acids (cysteine or valine) (10, 14).
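The specificity rule described above (the residue at position +3 relative to the conserved Trp) can be illustrated with a minimal Python sketch. It assumes that the index of the conserved Trp is already known, for example from an alignment; the function name, the toy sequences, and the "undetermined" fallback are hypothetical and are not taken from the cited studies.

```python
# Residues at position +3 (relative to the conserved Trp) that the text
# associates with each peptide class.
CLASS_I_RESIDUES = {"A", "S"}   # alanine or serine: preference for NPF
CLASS_II_RESIDUES = {"C", "V"}  # cysteine or valine: FW, WW, or SGW binders


def predict_peptide_class(domain_seq: str, trp_index: int) -> str:
    """Guess the peptide class preferred by an EH domain.

    `trp_index` is the 0-based position of the conserved Trp and is
    ASSUMED to be supplied by the caller (e.g., from an alignment).
    """
    seq = domain_seq.upper()
    if seq[trp_index] != "W":
        raise ValueError("trp_index does not point at a tryptophan")
    pos = trp_index + 3
    if pos >= len(seq):
        raise ValueError("sequence too short to read position +3")
    residue = seq[pos]
    if residue in CLASS_I_RESIDUES:
        return "class I (NPF)"
    if residue in CLASS_II_RESIDUES:
        return "class II (FW, WW, SGW)"
    # The text does not describe any other residue at this position.
    return "undetermined"


# Hypothetical toy fragments, invented for illustration only.
print(predict_peptide_class("LGKWFKA", 3))
print(predict_peptide_class("LGKWFKV", 3))
```

Such a rule is only a first approximation: the text states that the +3 residue contributes to specificity, not that it determines it alone.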

In this review, we will concentrate on the biochemical and functional properties of EH-containing and EH-binding proteins, which together constitute the "EH-network."

The availability of the complete genomic sequences of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae has made it possible to establish the existence of 35 EH-containing proteins in these species (Table 1). In humans, 11 EH-containing proteins have been identified that belong to five subfamilies termed Eps homology domain-containing proteins (EHDs; four members, EHD1, 2, 3, and 4), Eps15s (two members, eps15 and eps15R), intersectins (INTSs; two members, INTS1 and 2), RalBP1-associated Eps-homology domain proteins (REPSs; two members, Reps1 and Reps2/Pob1), and γ-synergin (one member). Only the Eps15 subfamily is present throughout evolution, with a clear-cut homolog, Ede1p, present in yeast (Table 1). The EHD, INTS, and REPS families are also well conserved, with single homologs present in both Drosophila and C. elegans. Conversely, γ-synergin is present in mammals, but not in lower species. Four additional EH-containing proteins are present only in S. cerevisiae with no clear-cut ortholog in higher species (Table 1). The functions of many EH-containing proteins have been elucidated through various approaches, including biological and biochemical studies in mammalian cells and genetic studies in lower organisms. The available knowledge is summarized in Table 2 (and references within).

Twelve EH-binding proteins are now known (Table 3 and references within). In all cases, interactions are mediated through binding of NPF-containing sequences to various EHs. Most EH-binding proteins are well conserved throughout evolution, with homologs conserved down to S. cerevisiae (Table 3). In many cases, the interactions have been reasonably well validated by co-immunoprecipitation experiments. In other instances, however, physical interactions in vivo have not yet been demonstrated. Thus, the physiologic relevance of some EH-interacting proteins to the EH network awaits further confirmation (Table 3). A number of integrated approaches in mammalian cells, mice, and lower organisms have also started to elucidate the individual functions of these proteins (Table 3 and references within).

Known Functions of the EH-network and "Genomics" Predictions

The functions of the EH-network as a whole are best appreciated by the combined analysis of the properties of EH-containing proteins and of the cellular proteins that interact with them, either through EH-mediated interactions or through other domains (Table 1). Some general themes can be inferred (Fig. 1, Table 1, Table 2, and Table 3). First, most EH-network proteins have particular functions at various steps of the endocytic process (and, in higher organisms, also in synaptic vesicle recycling). Second, some EH-network proteins participate in other aspects of intracellular trafficking; for instance, γ-synergin is involved in trafficking from the Golgi to endosomes. Third, EH-network proteins are also involved in regulating the organization of the actin cytoskeleton. The EH-network affects various cellular functions through the regulation of these critical pathways. In addition, a direct involvement of the EH-network in processes such as mitogenic signaling and control of cell proliferation, control of nuclear shuttling, and DNA repair is either experimentally proven or can be postulated on the basis of known interactions (Table 1, Table 2, and Table 3, and references within). A schematic representation of the best-characterized functions of the EH-network, and of their impact on cell physiology, is shown in Figure 1.

Fig 1.

EH-mediated interactions in biological processes. This schematic representation of the best-characterized functions of the EH network, and of their impact on cell physiology, represents a conceptual integration of information obtained in mammals and in yeast. (A) Eps15 and epsin are adaptor proteins involved in EGF receptor (EGFR) internalization. They are most likely recruited to active receptors through monoubiquitination of the receptors (orange circles; see also Fig. 3), and in turn either recruit or stabilize (or both) clathrin and AP2 at the plasma membrane. The EH domain of eps15 interacts with the NPF motifs of epsin [reviewed in (42)]. ENTH, epsin N-terminal homology domain. (B) γ-synergin is involved in trafficking from the Golgi network to endosomes through its interaction with the AP1 adaptor [not shown; see (100)]. Additionally, a role in trans-Golgi network (TGN) dynamics is suggested by the interaction of its EH with NPF motifs present in the endosomal protein SCAMP (101). (C) Intersectin is an endocytic EH-containing protein with a direct role in regulation of the actin cytoskeleton. Through its EH domains, intersectin binds to epsin, and at its five SH3 domains it binds to N-WASP, a potent activator of the Arp2/3 complex (see D and E). It also binds to the guanosine triphosphatase (GTPase) dynamin, a protein critical to endocytosis that has also been implicated in regulation of the actin cytoskeleton [reviewed in (102)]. (D) The long isoform of intersectin contains a Dbl homology and pleckstrin homology (DH/PH) domain and regulates actin assembly through Cdc42 and N-WASP. The DH/PH domain is a guanine nucleotide exchange factor (GEF) for Cdc42. N-WASP binds directly to intersectin and up-regulates the GEF activity of the latter. This leads to the generation of guanosine triphosphate (GTP)-bound Cdc42, which in turn activates N-WASP (103). (E) In yeast, the complex Sla1p-End3p-Pan1p stimulates actin polymerization through activation of the Arp2/3 complex. Multiple EH-mediated interactions keep the complex together; note that the interaction between Sla1p and the EH domains of Pan1p and End3p is not mediated by the NPF motif of Sla1p. Pan1p, in turn, interacts with the Arp2/3 complex (89, 104). Additional interactions with the endocytic machinery are likely, because Pan1p also interacts with Entps and Yap180ps (homologs of mammalian epsins and AP180s adaptors, not shown) [(105) and references therein]. (F) The EH network influences the Crm-1-mediated nuclear export pathway. The EH domain of eps15 interacts with the NPF motifs of Hrb, a cellular cofactor of the Rev export pathway [(98); reviewed in (106)]. (G) Intersectin activates transcription. Its EH domains activate Elk-dependent transcription in a mitogen-activated protein kinase (MAPK)-independent manner. This ability involves interaction with a yet-unknown ligand for its EH domains (107).

Fig 2.

UIM-containing proteins. The schematic diagram shows the domain architecture of selected UIM-containing proteins. The UIM is indicated by red ovals. In the structure of human (Hs) epsin, the second UIM is indicated by a blue oval to indicate the existence of a splice variant lacking it. Other domains are drawn as colored rectangles and are labeled as follows: UBA, Ub-associated domain; EH, eps15-homology domain; ENTH, epsin N-terminal homology domain; VHS, Vps27/Hrs/Stam domain; FYVE, FYVE-finger domain; SH3, Src-homology 3 domain; VWA, von Willebrand factor type A domain; Josephin, Josephin domain. Other abbreviations: Hs, Homo sapiens; Mm, Mus musculus; Dm, Drosophila melanogaster; Ce, Caenorhabditis elegans; Sc, Saccharomyces cerevisiae.

Fig 3.

Interactions between mUb and Ub receptors in endocytosis and vesicular trafficking. The schematic representation of the functions most relevant to endocytosis and vesicular trafficking, mediated by mUb and Ub receptors, represents a conceptual integration of data obtained in mammals and in yeast. Orange circles represent mUb. (A) Eps15 and epsin are adaptor proteins involved in EGFR internalization. They are likely recruited to ubiquitinated plasma membrane receptors through their UIMs [Fig. 1; reviewed in (42)]. (B) The interaction between monoubiquitinated internalized receptors and a Ub receptor (Vps9p, displaying a CUE domain) is critical for endosome fusion in yeast. Vps9p (the yeast homolog of mammalian Rabex-5) is a GEF that regulates the activity of Vps21p (the yeast homolog of mammalian Rab5), which in turn promotes membrane fusion in the endosomes. Biochemical and genetic data support the possibility that the CUE domain of Vps9p inhibits its GEF activity. Upon interaction of Vps9p with ubiquitinated cargo receptors (Ste2p), this inhibition is relieved, allowing activation of Vps21p and endosome fusion. The high degree of conservation of the entire system suggests a similar mechanism of regulation in mammals (67, 108). (C) Hrs functions at the endosomal level at which internalized receptors are sorted to different destinations. Hrs recruits ubiquitinated receptors through the UIM and sorts them to the multivesicular body (MVB), and hence to the lysosome for degradation. The process also involves the ESCRT complex (not shown) and the Ub C-terminal hydrolase Doa4 (Dub) [reviewed in (41)]. (D) Some ubiquitinated biosynthetic cargoes, such as carboxypeptidase S (CPS), are also sorted to the MVB through the UIM of Hrs [reviewed in (41)].

Fig 4.

A map of physical and genetic interactions of EH-, UIM- and UBA-, and NPF-containing proteins in S. cerevisiae. An interaction diagram is shown in which proteins containing EH, UIM or UBA, or NPF are linked to other proteins and then grouped into functional areas. Interaction data were derived from the General Repository of Interaction Datasets (GRID, http://biodata.mshri.on.ca/grid) and from literature. The picture was initially generated using the Osprey software (109), and then edited with Adobe Illustrator. A relevant selection of the reported interactions is shown. See also Supplementary Information 4 for annotations of proteins, their interactions, and PubMed links to references.

Table 1.

EH-containing proteins and their interactors. All genes that encode EH-containing proteins are listed in the first column, grouped into subfamilies. Their presence in Homo sapiens (Hs), Mus musculus (Mm), Drosophila melanogaster (Dm), Caenorhabditis elegans (Ce), and Saccharomyces cerevisiae (Sc) is indicated by a gray box in the grid. Also shown is the modular composition of EH-containing proteins and the known interactors of EH-containing proteins. Interactions mediated through the EH domain are shown in the "Interactors (EH-mediated)" column, whereas proteins interacting through other regions are indicated in the "Other interactors" column. aMus musculus adaptor protein complex 1 (AP1) γ subunit binding protein 1 (Ap1gbp1; GenBank accession number BC056370). bDrosophila melanogaster CG6192 protein; orthology was assigned by sequence similarity. cCaenorhabditis elegans Y39B6A.38 protein; orthology was assigned by sequence similarity (limited to the EH domain). dListed are all known proteins that interact with EH-containing proteins by non-EH-mediated interactions. Yeast protein information was collected from the Saccharomyces Genome Database. RhoGEF, Rho guanine nucleotide exchange factor; COIL, coiled-coil region; C2, Protein kinase C conserved region 2.

Table 2.

Functions of EH-containing proteins. All known EH-containing proteins and genes are listed in the first column, together with known functions and known genetic interactions in various organisms. aYeast protein information was collected from the Saccharomyces Genome Database. KO, knockout; RNAi, RNA interference; IGF-1R, Insulin-like growth factor 1 receptor; TS, temperature-sensitive; WT, wild-type.

Table 3.

EH-binding proteins. All known genes encoding EH-binding proteins are shown together with known functions and known genetic interactions in various organisms. Also shown is the number of NPF motifs that they contain. The column "EH interactors in mammals" lists the mammalian EH-containing proteins that interact by their EH domain with proteins listed in column one. aData were obtained from the Database of the Drosophila Genome. bYeast protein information was collected from the Saccharomyces Genome Database. cNPF motif not present in C. elegans protein. IP, co-immunoprecipitation; IVB, in vitro binding; Y2H, yeast two-hybrid system.

From the perspective of networking and signaling, an interesting unanswered question concerns the extent of the EH-network. Although genomic knowledge makes it possible to unequivocally establish the number of EH-containing proteins in a given organism, the short consensus sequences for EH-binding peptides hamper our ability to make predictions solely on the basis of bioinformatics. Interestingly, all known EH-interacting proteins contain NPF motifs (class I peptides; Table 3). It is possible, therefore, that class II and class III peptides represent mimotopes, which may mimic the NPF peptide conformation in the conditions used for the binding assay (2), rather than physiologically relevant EH-binding peptides. Thus, whereas participation of proteins containing class II or III peptides in the network cannot be formally excluded, it is reasonable to assume that most bona fide EH-interactors possess an NPF motif. Almost four thousand sequences in the analyzed genomes (Table 4) have at least one NPF motif [for information on all NPF-containing proteins, see Supplementary Information 2]. In addition, many EH-binding proteins contain more than one NPF motif, a feature mirrored by the frequent presence of more than one EH domain in the same EH-containing protein. It remains to be established whether, given the low affinity of the EH-NPF interaction (15), this redundancy is necessary to impart sufficient avidity to these proteins to achieve meaningful stoichiometries of interaction in vivo. About 300 proteins out of 4000 contain two or more NPFs in the five genomes under analysis (Table 4); this characteristic may set further limits on the physiologic relevance of candidates for the EH-network.
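The filtering step described above (keeping only proteins with two or more NPF motifs) can be illustrated with a minimal Python sketch. It is not the procedure used to compile Table 4, which relied on ScanProsite and NRDB90; the function names and the toy sequences are hypothetical and serve only to make the counting explicit.

```python
import re

# Class I EH-binding motif as given in the text: Asn-Pro-Phe (one-letter NPF).
NPF = re.compile(r"NPF")


def count_npf(sequence: str) -> int:
    """Return the number of NPF tripeptides in a one-letter protein sequence."""
    return len(NPF.findall(sequence.upper()))


def multi_npf_candidates(proteins: dict, min_motifs: int = 2) -> dict:
    """Map each protein with at least `min_motifs` NPF motifs to its motif count."""
    counts = {name: count_npf(seq) for name, seq in proteins.items()}
    return {name: n for name, n in counts.items() if n >= min_motifs}


# Hypothetical toy sequences, invented for illustration only.
toy_proteome = {
    "toy_two_motifs": "MSANPFGGSTNPFAAK",
    "toy_one_motif": "MKNPFLLQ",
    "toy_no_motif": "MKNPLLLQ",
}

print(multi_npf_candidates(toy_proteome))  # {'toy_two_motifs': 2}
```

Because a tripeptide occurs frequently by chance, a count of this kind can only shortlist candidates; it says nothing about whether a given NPF is accessible or bound in vivo.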

Table 4.

NPF-containing proteins in various organisms. The TrEMBL (translated EMBL) and SwissProt databases were searched for NPF motifs using the ScanProsite tool (164). Retrieved proteins were made non-redundant using the NRDB90 software (165) and by visual inspection. aThe protein FYVE-finger-containing Rab5 effector protein rabenosyn-5 (ZFYVE20) contains five NPFs in human (TrEMBL accession number Q9H1K0) and six NPFs in mouse (TrEMBL accession number Q8K0L6), due to a mutation of NPF to NPL in the human sequence. However, the human hypothetical protein FLJ34993 (TrEMBL accession number Q8NAQ1), which is a shorter form of Q9H1K0, does not have this mutation, and in the human genome the gene ZFYVE20 is predicted to encode a protein with six NPF motifs.

In C. elegans, for instance, about 50 proteins contain two or more NPFs [Table 4; Supplementary Information 2]. A close inspection of this list reinforces the notion that the EH-network is primarily involved in intracellular traffic, because it includes, in addition to the known EH-interactors, proteins such as UNC11, the nematode homolog of the mammalian endocytic adaptor AP180; and F58G6.1, an amphiphysin-like protein. Additional EH-binding candidates include LIN-10 and SEL-5. LIN-10 is part of the LIN-2-LIN-7-LIN-10 complex that mediates basolateral membrane localization of the epidermal growth factor (EGF) receptor LET-23 in vulval epithelial cells (16). LIN-10's mammalian homolog, Mint1, is essential for synaptic vesicle exocytosis (17, 18). SEL-5 is a Ser-Thr kinase that facilitates the activity of receptors of the LIN-12 and Notch family (19, 20). In this connection, it is interesting to note that Numb, a known Notch antagonist in mammals and Drosophila, is also an EH-interacting protein (Table 3). Numb is involved in endocytosis (21), and mutations in the Drosophila endocytic adaptor α-adaptin phenotypically mimic the loss of Numb (22). Also, in Drosophila, Notch requires dynamin (and thus most likely depends on endocytosis) for its function (23, 24). Thus, a complex pathway linking the endocytic process to Notch activation, and involving EH-network control at various steps, might exist [reviewed in (9, 25)].

In conclusion, the extent of the EH-network seems to be reasonably limited, and testable predictions about interactions can be made, on the basis of available genomic knowledge. Thus, the EH-network is a good model system with which to integrate low-resolution genomic approaches with high-resolution biochemical and genetic approaches in mammals and lower organisms, toward the elucidation of the complete functional wiring of a particular signaling network.

Monoubiquitination and Ubiquitin Receptors

A novel modality of intracellular signaling, based on monoubiquitination, is emerging from studies of Ub itself, protein domains or motifs that can bind to Ub, and molecules that harbor such Ub-recognition devices (Ub receptors).

Ubiquitination is a posttranslational modification whereby Ub, a conserved 76-amino-acid protein, is attached to a substrate protein by an isopeptide bond between the C-terminal glycine of Ub and an ϵ-amino group of lysine residues in the substrate [reviewed in (26)]. The best-characterized type of ubiquitination is polyubiquitination, in which the substrate-bound Ub serves as an acceptor for further cycles of ubiquitination, through several of its seven lysine residues (26). Polyubiquitination functions, when present as a chain of four or more Ubs linked at Lys48 of Ub, as a general mechanism for the targeting of the polyubiquitinated substrate to the proteasome, with ensuing proteolytic degradation (27). Some nonproteolytic functions of polyubiquitination have also been described (28, 29). Both proteolytic and nonproteolytic functions of polyubiquitin (pUb) have been extensively reviewed (30-33).

An emerging body of evidence indicates, however, that when Ub is appended as a single moiety to a target protein (monoubiquitination), the posttranslational modification has a completely different biological impact, and serves as a signaling-dependent device to establish networks of protein-protein interactions in the cell (monoubiquitin network). The conclusion that monoubiquitination serves as a signal-dependent mechanism for establishing protein networks is based on a number of published observations. First, monoubiquitination is promoted by extracellular stimuli. Receptor tyrosine kinases (RTKs) induce their own monoubiquitination when engaged by their cognate ligands. This process requires the kinase activity of the receptor and is mediated by recruitment of the Ub ligase Cbl (34). RTKs, which were long thought to be polyubiquitinated, are actually monoubiquitinated at multiple sites (35). In addition, the monoubiquitination of RTKs affects signaling, in that it is sufficient to promote receptor internalization and degradation (35). A similar role for receptor monoubiquitination has been described in yeast (Table 5; 36). Active RTKs, moreover, stimulate the monoubiquitination of a number of intracellular proteins, including eps15, eps15R, epsins, Hrs, and CIN85 (37-39). In some cases, the combined analysis of the functions of these proteins in mammals and yeast indicates a role for their monoubiquitination in endocytosis [reviewed in (30, 40-42)]. The Ub system is also a key regulator of growth hormone receptor (GHR) internalization. Both genetic and molecular evidence shows that GHR, which internalizes constitutively, accumulates at the plasma membrane if the Ub system is impaired (43). Ubiquitination of the GHR itself is not required, as shown by the fact that mutagenesis of all lysine residues in its cytosolic tail does not alter its internalization (44). Thus, other Ub-based mechanisms of regulation must exist, possibly involving other ubiquitinated adaptors.

Table 5.

Monoubiquitinated proteins in yeast and their functions.

Second, signaling stimuli that originate intracellularly also promote monoubiquitination. Molecular genetic studies of Fanconi anemia (FA) reveal that a "core" macromolecular complex (the FA complex) containing five proteins is necessary for the monoubiquitination of the FA complementation group D2 (FANCD2) protein after DNA damage (45, 46). Plant homeodomain (PHD) finger protein 9 (PHF9), a new component of this complex, was recently isolated. PHF9 possesses Ub ligase activity in vitro and may represent the primary E3 ligase for the FANCD2 protein (47). Monoubiquitination leads to the recruitment of FANCD2 into BRCA1-containing nuclear foci associated with DNA repair and checkpoint function (48, 49). Interestingly, FA-dependent monoubiquitination of FANCD2 occurs also during the S-phase of normal cell cycle progression (50).

Third, a series of Ub recognition devices exists. Intracellular proteins that harbor these devices, Ub receptors, can thus physically interact with monoubiquitinated proteins as a consequence of various signaling events. Four major classes of Ub-binding devices have been characterized: the Ub-associated (UBA) domain, the Ub E2 variant (UEV) domain, the Cue1-homologous (CUE) domain, and the UIM [Table 6; reviewed in (40)].

Table 6.

Proteins containing Ub-binding devices in yeast and human. The SMART and PFAM databases were used as sources of information. Retrieved protein sequences were made nonredundant using the NRDB90 software (165) and by visual inspection.

In conclusion, monoubiquitinated proteins and Ub receptors have the potential to establish an intracellular network. The monoubiquitination system has remarkable similarities, including inducibility and rapid reversibility (40, 51), with the extensively characterized phosphorylation system in which phosphoamino acid residues are recognized by specific protein domains, such as Src-homology 2 (SH2), phosphotyrosine-binding (PTB), Forkhead-associated (FHA), WW, and 14-3-3 domains [reviewed in (52, 53)]. We will concentrate on one aspect of this emerging network, the interaction between UIM and mUb-containing proteins.

UIM and UIM-Containing Proteins

The UIM was identified by Hofmann and Falquet (54) through a bioinformatics approach, and was then experimentally validated as a bona fide Ub-binding motif (37, 55-59). Indeed, binding to Ub or to ubiquitinated proteins seems to be common to all UIMs studied so far. The UIM occurs in many proteins and is frequently present in multiple copies within the same protein. It is a very short sequence motif (comprising about 20 residues) that contains the conserved core ψ-X-X-Ala-X-X-X-Ser-X-X-Ac, where "ψ" is a large hydrophobic residue, "X" is any amino acid, and "Ac" is an acidic residue. A block of four preferentially acidic residues that are not conserved precedes this core. Such a short motif is unlikely to form a "domain" in the structural sense, because a "domain" commonly includes more than one short sequence segment, and these segments usually form a series of structurally conserved "folds" (α-helices or β-strands) connected by flexible "loops." As shown by nuclear magnetic resonance (NMR) spectroscopy of several UIMs, and from the crystal structure of the second UIM of Vps27p (60, 61), the UIM forms a short α-helix in which all of the conserved residues are exposed on one face [Supplementary Information 1]. Of great interest is the surprising finding that the motif crystallized as an antiparallel four-helix bundle (60) in which most of the conserved residues (those most likely involved in Ub recognition) are buried in the middle of the bundle. Although this tetrameric assembly immediately suggests modes by which Ub binding might be regulated, it remains to be established whether this really occurs in vivo, and, if so, whether it is biologically relevant.
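The conserved core given above can be written as a regular expression, as in the following minimal Python sketch. This is a simplification and not the profile-based method by which the UIM was identified. Two choices are assumptions made for illustration: "large hydrophobic residue" is read as L, I, V, M, F, W, or Y, and the preceding block of acidic residues is ignored. The function name and the toy sequence are hypothetical.

```python
import re

# Core from the text: psi-X-X-Ala-X-X-X-Ser-X-X-Ac.
# ASSUMPTION: psi (large hydrophobic) = L, I, V, M, F, W, or Y; Ac (acidic) = D or E.
# The lookahead lets overlapping candidate cores be reported as well.
UIM_CORE = re.compile(r"(?=([LIVMFWY]..A...S..[DE]))")


def find_uim_cores(sequence: str) -> list:
    """Return (1-based start, matched segment) for every candidate UIM core."""
    return [(m.start() + 1, m.group(1)) for m in UIM_CORE.finditer(sequence.upper())]


# Hypothetical toy sequence, invented for illustration only.
toy_sequence = "GSEEEDLQKAIELSLKEAGG"
print(find_uim_cores(toy_sequence))  # [(7, 'LQKAIELSLKE')]
```

A pattern this short will match many unrelated sequences by chance and will miss divergent UIMs, so hits from such a scan are candidates for inspection rather than predictions.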

Twenty UIM-containing proteins have been found in humans [an updated list, with known biological functions, is available as Supplementary Information 3]. The motif is well conserved throughout evolution, being present in many proteins from human to yeast [Fig. 2; Supplementary Information 3]. In addition, because the UIM is very short, it is likely that not all biologically relevant instances are identified by profile-based bioinformatics methods; thus, an even greater number of UIM-containing proteins probably exists.

Three major types of proteins harbor UIMs. The first type includes proteins involved in ubiquitination or Ub metabolism, or known to interact with Ub-like modifiers [Supplementary Information 3]. Among these proteins is the proteasomal component Rpn10/S5a, in which the UIM coincides with the Leu-Ala-Leu-Ala-Leu (LALAL) motif that was previously implicated in pUb recognition by the proteasome (62).

The UIM also occurs in a second group, a number of proteins involved in receptor endocytosis [Fig. 2; Supplementary Information 3]. This process was known to involve monoubiquitination of proteins, but its Ub recognition components were elusive (31). A series of recent papers demonstrated Ub-binding abilities for the UIMs of eps15 and eps15R (37), Hrs (37, 55, 58, 61), epsin (37, 55) and Vps27p (55, 59).

The third class of UIM-containing proteins includes species that have putative roles in various biological processes such as DNA repair and mRNA splicing, or have been associated with pathological processes such as neurodegeneration [Supplementary Information 3]. With this class of proteins, a direct role for the UIM-Ub interaction has not yet been reported. Ataxin 3, a protein of yet unknown function, is a member of this group. Expansion of CAG trinucleotide repeats in the coding region of ataxin 3 causes a polyglutamine disorder called spinocerebellar ataxia type 3, which is characterized by intranuclear inclusions that contain Ub (63). Recently, based on predictive computational analysis, the Josephin domain present at the N-terminus of ataxin 3 (Fig. 2) was proposed to be similar to the ENTH and VHS domains present in the other UIM-containing proteins involved in endocytosis (Fig. 2; 64). Thus, ataxin 3 might link intracellular traffic and neurodegenerative disorders.

One final interesting feature of the UIM (and of at least one other Ub-binding device, the CUE domain) is that its presence frequently dictates the monoubiquitination of UIM-containing proteins (37, 56, 57, 65-67). The molecular mechanisms involved are objects of speculation at present and have been reviewed elsewhere (40). The result is that the same protein contains mUb and a Ub binding device, and can thus engage in multiple interactions. These interactions can be either intramolecular (with obvious regulatory implications) or intermolecular, thereby establishing a chain of Ub-mediated contacts that can help propagate the signal (40, 42).

A Monoubiquitin-Based Network?

Many UIM-containing proteins (and Ub receptors in general) cannot be immediately associated with the proteasome pathway and are most likely not involved in the recognition of pUb chains, but rather in establishing a network of interactions with monoubiquitinated proteins. What might be the functions of such a network? One important aspect of the interaction between mUb and Ub receptors is to regulate different steps of the endocytic route and other intracellular trafficking pathways. A number of recent reviews have covered this topic extensively (30, 41, 42, 68), and a schematic representation of the molecular connections involving mUb and Ub receptors in these pathways is portrayed in Figure 3.

The list of known proteins that are modified by monoubiquitination, however, reveals that many other biological functions might be regulated through Ub-mediated interactions (Table 5, Table 7, and references within). These functions include DNA repair, as demonstrated by monoubiquitination of FANCD2 (described above) and proliferating cell nuclear antigen (PCNA) (Table 7), and regulation of histone function, because both core and linker histones are monoubiquitinated (Table 5, Table 7). Recent work has elucidated an important role for monoubiquitinated histone H2B in yeast. The E2-E3 complex of Rad6p and Bre1p ubiquitinates Lys123 in H2B. This, in turn, leads to methylation of Lys4 and Lys79 on histone H3, a modification directly involved in gene silencing (69-72). How the ubiquitination of histone H2B directs the activity of H3 site-specific histone methylases is not known. Of course, not all mUb modifications must necessarily lead to interactions between Ub and the Ub receptor. Indeed, monoubiquitination is a bulky modification that, especially in the context of chromatin, can lead to global changes in folding. However, the emergence of mUb as a protein networking device suggests the intriguing possibility that the interaction of monoubiquitinated histones with yet unidentified Ub receptors plays a role in transcriptional regulation.

Table 7. Monoubiquitinated proteins in mammals and their functions.

The real impact of monoubiquitination on protein function is only starting to be appreciated. The number of characterized monoubiquitinated proteins is quite low (Table 5, Table 7), because of a lack of systematic efforts toward characterizing them. The elucidation of the mUb proteome will therefore advance our knowledge of this network of interactions enormously. In principle, characterization of the mUb proteome is feasible, as demonstrated by the recent initial characterization of the yeast Ub proteome (73). A major difficulty will involve the separation of monoubiquitinated species from polyubiquitinated ones, which are predicted to constitute the vast majority of Ub conjugates in the cell. The availability of anti-phosphotyrosine (pTyr) antibodies was the turning point in the elucidation of pTyr-mediated interactions. By contrast, the production of antibodies that distinguish mUb from pUb might prove extremely difficult, because all epitopes in mUb might also be contained in pUb. Nevertheless, recent findings from Dikic's lab and ours might help circumvent this problem. We have exploited two antibodies, one that recognizes both pUb and mUb, and one that recognizes only pUb, to selectively identify monoubiquitinated RTKs (35). It is thus now possible to envision strategies for the selective enrichment of monoubiquitinated species for proteomics approaches.
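The reasoning behind the two-antibody approach can be sketched as a simple decision rule. This is only an illustration: the function name, the boolean inputs, and the output labels are assumptions made here, and real immunoblot signals are quantitative rather than yes/no.

```python
def classify_ub_status(pan_ub_signal: bool, pub_only_signal: bool) -> str:
    """Interpret reactivity with two antibodies (hypothetical helper).

    pan_ub_signal   -- reactivity with the antibody that recognizes both mUb and pUb
    pub_only_signal -- reactivity with the antibody that recognizes only pUb
    """
    if pan_ub_signal and not pub_only_signal:
        # Ub is present, but no pUb chains are detected: consistent with
        # monoubiquitination (at one or several sites).
        return "mono"
    if pan_ub_signal and pub_only_signal:
        # pUb chains are detected; additional mUb sites cannot be excluded.
        return "poly"
    if not pan_ub_signal and not pub_only_signal:
        return "none"
    # A pUb-only signal without a pan-Ub signal is not expected.
    return "inconsistent"


if __name__ == "__main__":
    for pan, pub in [(True, False), (True, True), (False, False), (False, True)]:
        print(pan, pub, classify_ub_status(pan, pub))
```

A protein scored as "mono" under this rule is a candidate only; the rule cannot distinguish a single mUb from mUb at multiple sites.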

Ub and Ub Receptors: Is the Interaction Specific?

A major unanswered question is whether (and if so, how) specificity exists in the interaction between Ub and Ub receptors. How do different Ub receptors recognize different Ub conjugates specifically, to achieve biochemical and biological diversity? What prevents Ub receptors from interacting with polyubiquitinated proteins, which after all represent the majority of ubiquitinated species within the cell? A related question, stemming from the observation that the UIM-Ub interaction is of low affinity (60), is how significant stoichiometries of interaction are achieved in vivo.

In some cases, where more than one UIM is present, avidity for pUb (or for multiple mUbs, as in the case of RTKs) is predicted to result in stable interactions. This might be the case, for example, for the proteasomal S5a protein. In this instance, specificity might not be a real biological problem, because S5a should operate through efficient recognition of polyubiquitinated proteins regardless of their nature [reviewed in (74)]. In other cases, as exemplified by the RTK-eps15 association, a stable interaction might again result from avidity, because RTKs are monoubiquitinated at multiple sites (35), and eps15, which contains a single functional UIM, is a dimer or a tetramer in the cell (75). In addition, compartmentalization might help maintain the specificity of the interaction.

However, for the interaction between Ub receptors and mUb to be at the core of a signal-inducible protein-protein network, molecular devices should exist to direct specific interaction(s). A reasonable working hypothesis can be derived by extending the analogy with the pTyr-SH2 or phosphotyrosine-binding (PTB) interaction. In the latter case, a double contact takes place that includes a "constant" component, of relatively low affinity, between pTyr and the pTyr pocket in the SH2 or PTB. Specificity (and high affinity) is achieved through an additional "variable" contact, of even lower affinity, between a specificity-determining region (SDR), which is different among various SH2s and PTBs, and amino acids distal or proximal to pTyr (76-78). Thus, additional interactions, analogous to those mediated by the SDR, between Ub receptors and Ub-bearing proteins, might account for specificity. Whatever the case, how selectivity in binding and biological diversity are achieved remains to be determined and warrants further experimental attention.
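The way two individually weak contacts can yield a high-affinity interaction can be illustrated with an idealized calculation. The sketch below assumes that the "constant" and "variable" contacts contribute independently and additively to the binding free energy; the symbols and the numerical values are hypothetical and are not taken from the cited studies.

```latex
% Idealized additivity of two independent contacts (c = constant, v = variable).
% c^{\circ} is the standard-state concentration (1 M); in real systems it is
% replaced by an effective concentration that depends on geometry.
\begin{align*}
\Delta G^{\circ}_{\mathrm{tot}} &\approx \Delta G^{\circ}_{c} + \Delta G^{\circ}_{v},
\qquad \Delta G^{\circ} = RT \ln\!\left(\frac{K_d}{c^{\circ}}\right) \\
K_{d,\mathrm{tot}} &\approx \frac{K_{d,c}\, K_{d,v}}{c^{\circ}} \\
\text{e.g.}\quad K_{d,c} &= 10^{-4}\ \mathrm{M},\quad K_{d,v} = 10^{-3}\ \mathrm{M}
\;\Rightarrow\; K_{d,\mathrm{tot}} \approx 10^{-7}\ \mathrm{M}
\end{align*}
```

Under these assumptions, a receptor that lacks the matching variable contact would remain at the weak affinity of the constant contact alone, which is one way to picture how specificity might arise.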

An Integrated "Genomics" View of the EH- and mUb Networks

The important roles of the EH- and mUb networks in the regulation of endocytosis suggest that the two systems may be functionally coordinated. One possible way to elucidate such a complex issue is to exploit information from functional genomics. In particular, several high-throughput screens aimed at the systematic identification of protein complexes in S. cerevisiae have been performed (79-83). By taking advantage of ad hoc databases, which combine literature-derived and high-throughput interaction data sets (84), we tried to visualize a yeast network based on proteins that contain EH domains, UIMs, UBA domains, NPF motifs, or any combination of these devices. This "conceptual map" is shown in Figure 4 [see also Supplementary Information 4] and underscores how proteins of the EH- and mUb networks might link processes (here operationally defined as "realms") such as endocytosis and trafficking, control of the actin cytoskeleton, DNA repair, nuclear shuttling, and possibly even cell cycle control, through physical or genetic interactions.
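The general idea of assembling such a map can be sketched in a few lines of code. The example below is a minimal illustration, not the procedure used to generate Figure 4: the feature annotations are a toy set drawn from statements in this review, "ProtX" and its edge are hypothetical placeholders, and the function names are assumptions.

```python
from collections import defaultdict

# Toy annotations based on statements in this review (not a database export).
FEATURES = {
    "Ede1p": {"EH", "UBA"},
    "Ent1p": {"UIM", "NPF"},
    "Pan1p": {"EH"},
    "End3p": {"EH"},
    "Sla1p": {"NPF"},
    "ProtX": set(),  # hypothetical protein carrying none of the devices
}

# (protein A, protein B, interaction type); the ProtX edge is hypothetical.
INTERACTIONS = [
    ("Ede1p", "Ent1p", "physical"),
    ("Pan1p", "Sla1p", "physical"),
    ("End3p", "Sla1p", "physical"),
    ("Sla1p", "ProtX", "physical"),
]

DEVICES = {"EH", "UIM", "UBA", "NPF"}


def build_network(features, interactions, devices=DEVICES):
    """Keep only edges whose two partners each carry at least one device."""
    keep = {name for name, feats in features.items() if feats & devices}
    graph = defaultdict(set)
    for a, b, _kind in interactions:
        if a in keep and b in keep:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def eh_npf_candidates(features, graph):
    """List interacting pairs in which one partner has EH and the other NPF."""
    pairs = set()
    for a, neighbours in graph.items():
        for b in neighbours:
            if "EH" in features[a] and "NPF" in features[b]:
                pairs.add((a, b))
    return sorted(pairs)


if __name__ == "__main__":
    network = build_network(FEATURES, INTERACTIONS)
    print(eh_npf_candidates(FEATURES, network))
```

Pairs flagged in this way are hypotheses only. For instance, the Pan1p-Sla1p and End3p-Sla1p pairs would be listed, although the NPF motif of Sla1p is reported not to be required for those interactions (89).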

As expected, the highest concentration of proteins from these two networks is in the realm of endocytosis (Fig. 4). Here, Ede1p and Ent1p provide the best examples of integration of the two networks. In fact, Ede1p and Ent1p (similarly to their mammalian homologs eps15 and epsin) interact both physically, through an EH-NPF contact (85), and genetically, to cooperatively promote the internalization of ubiquitinated receptors (55). They also contain UIMs and UBAs, and their mammalian homologs are monoubiquitinated (37, 39). Aguilar and coworkers (85) have proposed a model in which the UIM of Ent1p and the UBA domain of Ede1p can cooperatively bind to ubiquitinated cargo receptors at the plasma membrane; the complex is further stabilized by the interaction of the EH domains of Ede1p with the NPF motif on Ent1p (Fig. 1 and Fig. 3).

The most clear-cut connection between realms is that between endocytosis and regulation of the actin cytoskeleton (Fig. 4). These two processes are closely linked far beyond the involvement of the EH- and mUb networks, as indicated by numerous studies that have identified mutations that simultaneously affect the two processes [reviewed in (86-88)]. One clear example derives from Tang et al. (89), who demonstrated that Pan1p and End3p (two EH-containing proteins involved in endocytosis) form a complex with Sla1p, a protein that is essential for the proper formation of the cortical actin cytoskeleton (90). However, although these interactions are mediated by the C-terminal portion of Sla1p and the EH domains of Pan1p and End3p, the NPF motif of Sla1p is not required (89). The Sla1p-End3p-Pan1p complex can modulate actin polymerization through the Arp2/3 complex [Fig. 1; (91)]. Interestingly, many other proteins in the actin-regulation realm (Fig. 4) contain NPF motifs, suggesting further levels of regulation governed by EH-mediated interactions.

The mUb network is implicated in the endocytic-cytoskeletal connection through Rsp5p, an E3 ligase that represents the ortholog of mammalian NEDD4. Rsp5p probably influences coordination between the two realms through its ability to monoubiquitinate proteins, rather than through polyubiquitination leading to degradation. Supporting this interpretation, Rsp5p monoubiquitinates several plasma membrane transporters, thereby regulating their internalization (92-97). In addition, RSP5 gene mutants interact genetically and show synthetic lethality with several genes that encode proteins of the endocytic or actin cytoskeleton machinery (Fig. 4), a finding more readily compatible with regulatory interactions than with modulation of proteasomal degradation [Supplementary Information 4]. Interestingly, Rsp5p also displays an NPF motif, although its interaction with the EH-network has not been tested.

Finally, as is evident from Figure 4, EH- or mUb network-based interactions, or both, might also exist among other realms, including control of intracellular traffic, DNA repair, cell cycle control, and nuclear shuttling. Although most of these connections remain highly speculative, coordinated control of endocytosis and nuclear shuttling is suggested by findings in mammalian systems where the Hrb protein (an ortholog of yeast Nup42p or Rip1p) interacts physically with eps15, and through this interaction influences nuclear export pathways [Fig. 2; (98)].

Conclusions

The EH- and mUb networks sit at the heart of endocytosis and connect the machinery of this process to those of a number of other important cellular processes. Knowledge of how these networks function is relevant not only to cell physiology, but also to pathology, because the pathogenesis of a number of diseases can be directly traced to alterations in endocytosis or changes in members of the EH- and mUb networks. These include metabolic diseases, neurodegenerative diseases, and cancer (25, 54, 99, and references therein). The dream of signaling scientists is to unravel the wiring diagram of individual networks and of the inter-network connections, with the ultimate goal of reverse-engineering the cellular machine. This approach will provide the knowledge for the design of target-driven therapeutics. Now, with the advent of postgenomic technologies, this objective appears within reach. At the present rate of effort, it will not be long before this goal is achieved for the EH- and mUb networks.

