Previously identified motifs are marked with (✓). Regular expressions follow POSIX definitions (23). The symbols ‘x’ (in the main residues) and ‘.’ (in the regular expressions) denote any residue.
| Region | Protein (UniProt accession) | Motif | ELM class* | Main residues | Regular expression | Start | End | Sequence† | Binding domain‡ | Interaction partner§ | Interaction type |
| Extracellular | SARS-CoV-2 spike protein (P0DTC2) | RGD | LIG_RGD | RGD | RGD | 403 | 405 | RGD | PF00362 and PF01839 | RGD-binding integrins, most probably α5β1 and αvβ3 | Host:virus |
| | | Multibasic cleavage sites (✓) | – | RRxR | – | 682 | 687 | RRAR|SV | PF00082 or IPR001254 | Furin-like PCs/TMPRSS2 | Host:virus |
| | | | – | KxxKR | – | 811 | 817 | KPSKR|SF | | | |
| | | CendR (✓) | LIG_NRP_CendR_1 | RxxR | [RK].{0,2}[R]$ | 682 | 685 | RRAR | PF00754 | Neuropilin-1 | Host:virus |
| | Integrin αv (similar for other α chains) (P06756) | Multibasic cleavage sites (✓) | – | xKR | – | 888 | 892 | TKR|DL | PF00082 | Furin-like PCs | Host |
| | Integrin β3 (similar for other β chains) (P05106) | MIDAS║ (✓) | – | DxSxS | D.[TS].S | 145 | 149 | DLSYS | – | The acidic part of RGD-like ligands | Host |
| | Furin (P09958) | RGD | LIG_RGD | RGD | RGD | 498 | 500 | RGD | PF00362 and PF01839 | Possibly RGD-binding integrin dimers | Host |
| | | MIDAS║ | – | DxSxS | D.[TS].S | 543 | 547 | DISNS | – | Unknown partner with acidic residue via metal ion coordination | Host |
| | | Multibasic cleavage site (✓) | – | R | – | 697 | 716 | RTEVEKAIRMSRSRINDAFR | IPR001254 | TMPRSS2 | Host |
| Intracellular | ACE2 (Q9BYF1) | I-BAR binding | LIG_IBAR_NPY_1 | NPY | NPY | 779 | 781 | NPY | IPR027681 | I-BAR domain-containing proteins like IRSp53 or IRTKS | Host |
| | | Endocytic sorting signal | TRG_ENDOCYTIC_2 | YPxΦ | Y[^P].[LMVIF] | 781 | 784 | YASI | PF00928 | Adapter protein complex μ2 subunit | |
| | | SH2 binding | – | YxxΦD | ((Y)[DE][^KRHG][DESTAPILVMFYW][^KR])|((Y)[NQSTAILVMFY][^KRHG][ILV][^KR]) | 781 | 785 | YASID | PF00017 | SH2 domain of SFKs | |
| | | LIR autophagy | LIG_LIR_Gen_1 | ExxYxxΦxΦ | [EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM] | 778 | 786 | ENPYASIDI | PF02991 | Related proteins LC3, Atg8, GABARAP. There may be some variation in LIR motif specificity | |
| | | apoPTB | LIG_PTB_Apo_2 | Nxx[FY] | (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].) | 789 | 796 | GENNPGFQ | PF08416 | PTB-containing protein with a preference for NxxF core motifs | |
| | | PBM | LIG_PDZ_Class_1 | TxF$ | [ST].[ACVILF]$ | 800 | 805 | DVQTSF | PF00595 | PDZ-containing proteins with TxF$ preferences such as NHERF3 and SHANK1 | |
| | Integrin β3 (P05106) | apoPTB (✓) | LIG_PTB_Apo_2 | Nxx[FY] | (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].) | 767 | 774 | TANNPLYK | PF00373, PF00630 | Talins (high affinity); Dok1 (low affinity); Filamin-A (binding to both apoPTB motifs simultaneously) | Host |
| | | | | | | 779 | 786 | TFTNITYR | PF00373, PF00630 | Kindlin; Filamin-A (binding to both apoPTB motifs simultaneously) | |
| | | PTB (✓) | LIG_PTB_Phospho_1 | Nxx(Y) | (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)) | 767 | 773 | TANNPLY | PF08416, PF00640, PF02174 | Talins (low affinity); Dok1 (high affinity); Shc (binding to both PTB motifs simultaneously) | |
| | | | | | | 779 | 785 | TFTNITY | PF00640 | Shc (binding to both apoPTB motifs simultaneously) | |
| | | LIR autophagy | LIG_LIR_Gen_1 | ExxYxxΦxΦ | [EDST].{0,2}[WFY][^RKP][^PG][ILMV].{0,4}[LIVFM] | 777 | 783 | TSTFTNI | PF02991 | Atg8 protein family | Host |
| | Integrin β1 (P05556) | apoPTB (✓) | LIG_PTB_Apo_2 | Nxx[FY] | (.[^P].NP.[FY])|(.[ILVMFY].N..[FY].) | 777 | 784 | TGENPIYK | PF00373, PF10480, PF00630 | Talins (high affinity); Dok1 (low affinity); ICAP-1; Filamin-A (binding to both apoPTB motifs simultaneously) | Host |
| | | | | | | 789 | 796 | TVVNPKYE | PF00373, PF00630 | Kindlin; Filamin-A (binding to both apoPTB motifs simultaneously) | |
| | | PTB (✓) | LIG_PTB_Phospho_1 | Nxx(Y) | (.[^P].NP.(Y))|(.[ILVMFY].N..(Y)) | 777 | 783 | TGENPIY | PF10480, PF00640, PF02174 | Talins (low affinity); Dok1 (high affinity); ICAP-1; Shc (binding to both PTB motifs simultaneously) | |
| | | | | | | 789 | 795 | TVVNPKY | PF00640 | Shc (binding to both PTB motifs simultaneously) | |
*Motif identifier as in the ELM resource.
†“|” denotes cleavage points for protease-recognition motifs.
‡Defined through use of Pfam (103) or InterPro (104), where applicable.
§PC, proprotein convertases.
║Not a SLiM but a structural motif.
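
As an illustration of how the Regular expression, Start, and Sequence columns relate, the following minimal Python sketch searches a few of the listed sequence fragments with their regular expressions and converts the match offsets to 1-based protein coordinates. The names `ROWS`, `locate`, and `scan` are hypothetical and not part of the original work. Patterns are copied from the table with line-wrapping whitespace removed. The sketch uses Python's `re` module rather than a POSIX engine, so a parenthesized `(Y)` acts as a plain capture group and carries no phosphorylation information. The `$` anchor is only meaningful here because each fragment is assumed to end at the relevant C-terminus.

```python
import re

# (motif label, regular expression, 1-based fragment start, fragment),
# all values copied from the table; the labels are shortened for readability.
ROWS = [
    ("CendR", r"[RK].{0,2}[R]$", 682, "RRAR"),
    ("MIDAS", r"D.[TS].S", 145, "DLSYS"),
    ("Endocytic sorting signal", r"Y[^P].[LMVIF]", 781, "YASI"),
    ("SH2 binding",
     r"((Y)[DE][^KRHG][DESTAPILVMFYW][^KR])|((Y)[NQSTAILVMFY][^KRHG][ILV][^KR])",
     781, "YASID"),
    ("PBM", r"[ST].[ACVILF]$", 800, "DVQTSF"),
]


def locate(pattern, fragment, fragment_start):
    """Return (start, end, matched_text) in 1-based protein coordinates,
    or None if the pattern does not occur in the fragment."""
    match = re.search(pattern, fragment)
    if match is None:
        return None
    start = fragment_start + match.start()
    end = fragment_start + match.end() - 1  # inclusive end position
    return (start, end, match.group(0))


def scan(rows=ROWS):
    """Apply each pattern to its own fragment and collect the hits by label."""
    return {label: locate(pattern, fragment, start)
            for label, pattern, start, fragment in rows}


if __name__ == "__main__":
    for label, hit in scan().items():
        print(label, hit)
```

Because the fragments are short excerpts, a pattern that needs flanking residues outside the listed window will return `None`; running the same search on the full UniProt sequence would be the more faithful check.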