Research Resource: Posttranslational Modifications

Proteome-wide analysis of arginine monomethylation reveals widespread occurrence in human cells


Sci. Signal.  30 Aug 2016:
Vol. 9, Issue 443, pp. rs9
DOI: 10.1126/scisignal.aaf7329

Appreciating arginine methylation

Posttranslational modifications, such as phosphorylation and ubiquitylation, regulate protein abundance, localization, interactions, and function. Larsen et al. investigated the landscape and functional roles of the posttranslational modification arginine methylation. RNA interference and high-throughput single-cell imaging revealed that arginine methylation regulated two proteins involved in RNA processing and transport. Arginine methylation by distinct arginine methyltransferases controlled the localization and RNA binding functions of the pre-mRNA splicing factor SRSF2 and the RNA-transporting activity of the protein HNRNPUL1. On a broader level, their data provide a rich resource for the future investigation of the function of arginine methylation and indicate that sites modified by this posttranslational modification are hotspots for mutations in disease.

Abstract

The posttranslational modification of proteins by arginine methylation is functionally important, yet the breadth of this modification is not well characterized. Using high-resolution mass spectrometry, we identified 8030 arginine methylation sites within 3300 human proteins in human embryonic kidney 293 cells, indicating that the occurrence of this modification is comparable to phosphorylation and ubiquitylation. A site-level conservation analysis revealed that arginine methylation sites are less evolutionarily conserved compared to arginines that were not identified as modified by methylation. Through quantitative proteomics and RNA interference to examine arginine methylation stoichiometry, we unexpectedly found that the protein arginine methyltransferase (PRMT) family of arginine methyltransferases catalyzed methylation independently of arginine sequence context. In contrast to the frequency of somatic mutations at arginine methylation sites throughout the proteome, we observed that somatic mutations were common at arginine methylation sites in proteins involved in mRNA splicing. Furthermore, in HeLa and U2OS cells, we found that distinct arginine methyltransferases differentially regulated the functions of the pre-mRNA splicing factor SRSF2 (serine/arginine-rich splicing factor 2) and the RNA transport ribonucleoprotein HNRNPUL1 (heterogeneous nuclear ribonucleoprotein U-like 1). Knocking down PRMT5 impaired the RNA binding function of SRSF2, whereas knocking down PRMT4 [also known as coactivator-associated arginine methyltransferase 1 (CARM1)] or PRMT1 increased the RNA binding function of HNRNPUL1. High-content single-cell imaging additionally revealed that knocking down CARM1 promoted the nuclear accumulation of SRSF2, independent of cell cycle phase. Collectively, the presented human arginine methylome provides a missing piece in the global and integrative view of cellular physiology and protein regulation.

INTRODUCTION

Protein arginine methylation is catalyzed by protein arginine methyltransferases (PRMTs) via the transfer of methyl groups from S-adenosyl methionine to the side chains of arginine residues (1). The PRMT enzymes are divided into two categories depending on the type of methylarginine they catalyze. Although all PRMTs can generate omega-N-methylarginine (arginine monomethylation), generation of asymmetric dimethylarginine is primarily catalyzed by PRMT1, coactivator-associated arginine methyltransferase 1 (CARM1; also referred to as PRMT4), PRMT6, and PRMT8, whereas PRMT5 and PRMT9 primarily catalyze the formation of symmetric dimethylarginine. Because monomethylation is a prerequisite for the enzymatic catalysis of dimethylation, identification of monomethylated arginine residues is a strong proxy for arginine methylation sites. The guanidino groups on the arginine side chain are unique among amino acids, because they form strong interactions with biological hydrogen bond acceptors (2). Consequently, addition of methyl groups to the guanidino groups may negatively alter such hydrogen bond interactions or alternatively facilitate stacking with bases of RNA and DNA or aromatic residues as the methylated arginine becomes more hydrophobic (3). As a result, arginine methylation increases the structural diversity of proteins and modulates their function in living cells and has been described to play a prominent role in protein-protein, protein-RNA, and protein-DNA interactions (4). However, which protein substrates are targeted by PRMTs, and what the functional consequences of arginine methylation are on a global scale, are not yet well understood. Furthermore, the cellular extent of arginine methylation, the evolutionary conservation of modification sites, and the site-specific stoichiometry of this modification remain elusive. Together, such information would greatly aid the biological interpretation of protein arginine methylation and, combined with localization and cellular regulation of modification sites, would allow for an improved understanding of the functions of arginine methylation in human cells (5).

Here, we used an improved strategy for systematic and proteome-wide analysis of endogenous arginine methylation sites. We identified 8030 high-confidence omega-NG-monomethylarginine (MMA) sites residing on 3300 proteins in human embryonic kidney (HEK) 293 cells. The increased number of identified arginine methylation sites presented here raises fundamental questions about their properties and biological relevance. For example, we found that the cellular distribution of arginine methylation was analogous to other widespread modifications, such as phosphorylation and lysine ubiquitylation. Using RNA interference (RNAi) and quantitative proteomics, we investigated the regulatory roles of PRMT enzymes and determined the cellular stoichiometry of arginine methylation sites. Our data challenge the current dogma by showing that PRMT enzymes catalyzed methylation independent of arginine sequence context. Although we found that arginine methylation sites generally had a lower somatic mutation rate than did regular arginine residues, modification sites residing on splicing components were enriched in somatic mutations, suggesting a previously unknown and unexplored regulatory function for arginine methylation in human cancer. Using quantitative high-content imaging, we demonstrate that arginine methylation catalyzed by CARM1 affects the nuclear localization of serine/arginine-rich splicing factor 2 (SRSF2) and that arginine methylation by various PRMTs regulates the RNA binding properties of SRSF2 and heterogeneous nuclear ribonucleoprotein U-like 1 (HNRNPUL1). Collectively, these data demonstrate the potential of our approach to uncover hitherto unappreciated processes regulated by arginine methylation. To our knowledge, this resource represents the largest experimentally derived catalog of arginine methylation sites and constitutes a first draft reference of the human arginine methylome.

RESULTS

Arginine monomethylation is a widespread posttranslational modification

To facilitate a systematic analysis of arginine methylation sites in human proteins, we combined high-pH (HpH) prefractionation with antibody-based enrichment of arginine monomethylated peptides for identification of modification sites on a global scale (Fig. 1A). Here, we focused on analyzing monomethylated peptides because they constitute methylation substrates or methylation products for essentially all PRMT enzymes and therefore constitute a general proxy for arginine methylation sites (4). Each antibody-enriched HpH fraction of HEK293 cells was subsequently analyzed on a Q Exactive HF mass spectrometer (MS) using a 77-min liquid chromatography (LC)–MS gradient (further detailed in Materials and Methods), enabling proteome-wide analysis of the arginine methylome within just 22 hours. Replicate analysis among the technical and independent experiments demonstrates strong reproducibility in the established methodology, furthered by high reproducibility and resolution of the HpH fractionation (fig. S1, A to C). Moreover, we find that 73% of identified sites from one replicate were also identified in a second, independent replicate, in support of reliable site and peptide identification by our MS analysis (Fig. 1B). Notably, our described methodology does not entail use of any methanol- or SDS-based sample preparation, which has been previously described to induce artificial in vitro methylations on cysteines and glutamic and aspartic acids (6), which could potentially give rise to incorrect identifications of protein arginine methylation sites during database searches (further detailed in Materials and Methods) (7). Collectively, from the four replicates, we could identify an initial 7866 high-confidence arginine methylation sites (localization probability >0.75) belonging to 3086 human proteins (Fig. 1C).

Fig. 1 Proteome-wide identification of arginine methylation sites.

(A) Overview of the experimental setup for identification of arginine methylation sites in HEK293 cells. (B) Percentage overlap for identified high-confidence sites (localization score >0.75) from four replicates (two technical replicates each from two independent experiments). (C) Total number of methylated arginine sites and proteins identified from four replicate experiments with a false discovery rate (FDR) of 1% for both peptide and protein identifications. (D) Distribution of arginine methylation sites per protein. (E) Correlation of cellular distribution of identified sites with the known distribution of PRMT enzymes. (F) Biological processes significantly enriched among proteins observed to be arginine-methylated. ER, endoplasmic reticulum. (G) Immunoblot (IB) for green fluorescent protein (GFP)–tagged SMARCA5, MCM6, and SMC1A after pulldown for GFP or omega-NG-MMA. WCL, whole cell lysate; IP, immunoprecipitation. (H) Distribution of arginine methylation sites compared to other widespread posttranslational modifications. AA, amino acid. Data are representative of n = 4 experiments.

Compared to publicly available databases (8), the majority (>80%) of identified arginine methylation sites constituted novel modification sites (fig. S1D), thereby enabling a deeper understanding of the cellular extent of arginine methylation and providing novel insight into the biological processes surrounding it. To this end, we first assessed the distribution of arginine methylation sites and found that 52% of identified proteins harbor more than one arginine methylation site, with only 3.7% of identified proteins containing >10 modification sites (Fig. 1D). Gene Ontology (GO) analysis revealed a widespread cellular localization of arginine-methylated proteins with almost equal distribution of cytoplasmic and nuclear targets (Fig. 1E), in agreement with the cellular distribution of the catalyzing PRMT enzymes (9). We observed arginine methylation sites on 373 proteins annotated with mitochondrial localization (table S1). Given that PRMTs are not known to be active or to exist in mitochondria, we speculate that the observed arginine methylations, which exclusively reside on proteins encoded by the nuclear genome, are most likely deposited outside the mitochondria (9). Thus, the functional role for these modifications may relate to activities performed outside mitochondria.

Our data furthermore reveal that arginine methylation sites are present on proteins involved in major cellular processes, including RNA transport, endocytosis, cell cycle, and insulin signaling. Although arginine methylation has previously been reported to play functional roles in these processes (2, 10–12), our data indicate that the role of the modification in these processes is much broader than previously anticipated (Fig. 1F). Likewise, we found a large number of proteins involved in DNA replication to be arginine-methylated (Fig. 1F), supporting emerging observations that arginine methylation is an important modification in this process. For example, our data set confirms the known arginine methylation of Flap endonuclease 1 (FEN1), and we additionally confirmed several other DNA replication–associated factors [structural maintenance of chromosomes protein 1A (SMC1A), the DNA replication licensing factor minichromosome maintenance protein complex 6 (MCM6), and the switch/sucrose nonfermentable (SWI/SNF)–related matrix-associated actin-dependent regulator of chromatin subfamily A member 5 (SMARCA5)] as arginine-methylated substrates by Western blot analysis after immunoprecipitation for omega-NG-MMA (Fig. 1G).

Next, we assessed the cellular extent of the arginine methylome by calculating the overall percentage of arginine residues that harbor a methylation within a given protein. With the identification of more than 3000 arginine-methylated proteins in our data set, we calculated that at least 7% of all arginines in these proteins were observed to be methylated (Fig. 1H). In comparison, 9% of all serine residues and 3% of threonines are observed to be phosphorylated in public phosphoproteomic databases (8), whereas 7% of lysine residues are observed ubiquitylated (13) and 5.7% show SUMOylation (14). Although amino acids occur at different frequencies in the human genome, our data demonstrate that arginine methylation is a widespread modification analogous to other global posttranslational modifications (PTMs), including phosphorylation and ubiquitylation (Fig. 1H).
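The percentage described above can be illustrated with a minimal sketch, assuming that the calculation is a simple ratio of observed methylated arginines to all arginines in the identified proteins. The function name, the input layout, and the toy sequences are hypothetical and are not taken from the study.

```python
def modified_fraction(proteins, sites, residue="R"):
    """Fraction of `residue` occurrences that carry a modification.

    proteins: dict mapping protein ID -> amino acid sequence
    sites:    dict mapping protein ID -> set of 1-based modified positions
    """
    total = 0
    modified = 0
    for pid, seq in proteins.items():
        # 1-based positions of every occurrence of the residue.
        positions = {i + 1 for i, aa in enumerate(seq) if aa == residue}
        total += len(positions)
        # Count only annotated sites that really fall on the residue.
        modified += len(positions & sites.get(pid, set()))
    return modified / total if total else 0.0


# Hypothetical toy data: two short sequences with three arginines each.
toy_proteins = {"P1": "MRGGRAKR", "P2": "ARSRR"}
toy_sites = {"P1": {2, 5}, "P2": {4}}
fraction = modified_fraction(toy_proteins, toy_sites)
```

Applied to other residues and modification catalogs, the same ratio would give the serine, threonine, and lysine percentages used for comparison.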

Arginine monomethylation targets major protein complexes

Intrigued by the presence of arginine-methylated proteins in a wide array of biological processes (Fig. 1F), we next used the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database of physical and functional interactions to quantify the protein interaction properties of arginine-methylated proteins (15). Compared to randomly selected proteins with the same node degree distribution, arginine-methylated proteins had significantly more functional interactions with each other, meaning that they had higher network connectivity (P < 1 × 10^−323), with an average of 6.9 interactions per arginine-methylated protein, in contrast to 3.2 interactions per random node (analyzed using the highest STRING score threshold, >0.9).

To detail the interaction network, we assigned arginine-methylated proteins to known macromolecular complexes using the Comprehensive Resource of Mammalian protein complexes (CORUM) database (16). An unbiased enrichment analysis revealed that membership in protein complexes is a general property of arginine-methylated proteins (table S2; P < 0.001, Fisher’s exact test).

Several known RNA binding complexes, such as the spliceosome (fig. S2A), were highly enriched in arginine-methylated proteins, whereas RNA processing complexes that were not previously associated with arginine methylation were also observed to be enriched in modified proteins (Fig. 2A). Intriguingly, we observed the nuclear pore complex (NPC) to be extensively modified with arginine methylation (Fig. 2B and table S1). Because the NPC mediates the transport of molecules across the nuclear envelope, this finding supports an as yet uncharacterized function for arginine methylation in regulating shuttling of RNA and proteins between the cytoplasm and nucleus.

Fig. 2 Arginine methylation targets protein complexes.

(A to E) Proteins modified with arginine methylation were mapped onto the CORUM database, and significantly enriched protein complexes were visualized using STRING and Cytoscape. High-confidence protein complexes included (A) RNA processing complexes, (B) the nuclear pore complex, (C) cell cycle complexes, (D) several chromatin remodeling complexes, and (E) DNA damage complexes. Data are representative of n = 4 experiments.

Besides its involvement in RNA processing, arginine methylation has been shown to target proteins involved in the cell cycle (17), chromatin remodeling (18), and the DNA damage response (19). However, the extent of the modification within these processes is elusive. In our data set, we found several cell cycle–associated protein complexes enriched in arginine methylation (Fig. 2C), including several members of the mitotic checkpoint complex [such as mitotic arrest deficient-like 1 (MAD2L1), mitotic checkpoint protein BUB3, and cell-division cycle protein 20 (CDC20)], which may shed new light on the regulatory functions of the complex. Moreover, we confirmed that the known chromatin remodeling substrates SMARCA4 (SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily A, member 4) and SMARCC1 are arginine-methylated proteins (fig. S4E), and we additionally found that most of the SWI/SNF, nucleosome remodeling and deacetylase (NuRD), host cell factor 1 (HCF-1), and protein regulator of cytokinesis 1 (PRC1) complex components were extensively modified (Fig. 2D).

Within the DNA damage response, meiotic recombination 11 (MRE11) was previously shown to be arginine-methylated in its glycine-arginine–rich (GAR) motif by PRMT1 (20), with arginine methylation suggested to regulate the exonuclease activity of MRE11 on double-stranded DNA (21). Our proteomic analysis confirms the known arginine methylation site of MRE11, whereas additional methylation sites are observed on Arg388, Arg604, and Arg616. In addition to MRE11, we also found the DNA repair protein RAD50 to be arginine-methylated on four residues, whereas no methylations were found on Nijmegen breakage syndrome 1 (NBS1) (table S1). Other DNA damage factors targeted by arginine methylation include mediator of DNA damage checkpoint 1, which harbors several modification sites in the nuclear localization sequence and within the sequence region interacting with the DNA-dependent protein kinase catalytic subunit complex (22). Besides, DNA damage–associated complexes, such as MCM, histone H2AX complex, DNA methyltransferase 1–associated protein (DMAP1) complex, and the repressor/activator protein 1 (RAP1) complex, are also highly enriched in arginine-methylated proteins (Fig. 2E). Using Western blot analysis, we confirm several protein candidates involved in various protein complexes as arginine-methylated substrates (fig. S2B). Collectively, these data demonstrate that arginine methylation extensively targets protein complexes and forms several tightly connected clusters (table S3).

Arginine monomethylation localizes to nonenzymatic RNA binding domains

To investigate the site-specific properties of arginine methylation, we analyzed local sequence context around the modification sites using the IceLogo web server (Fig. 3A) (23). Arginine methylation has previously been reported to occur in glycine-rich regions (24), in particular within the so-called GAR motifs (2). However, our data set displays a less pronounced preference for GAR motifs than previously observed, with only 31% of identified sites residing in a GAR motif (24). To investigate this discrepancy, we extracted the cellular abundance of each arginine methylation site based on its MS signal intensity and observed that the abundance of arginine methylation sites spans six orders of magnitude (fig. S1F). Hence, previous analyses may have sampled only the more abundant arginine methylation sites, which show a preferred localization within GAR motifs. In support of this, we found that abundant arginine methylation sites were strongly enriched in GAR motifs in comparison to their less abundant counterparts (fig. S1G).
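As a rough illustration of the motif analysis, the sketch below classifies a site by its immediate sequence context. It assumes a deliberately simplified motif definition (a glycine directly before or after the modified arginine); the criteria actually used to define GAR motifs in the study are not given in this section, and the function names and toy sequence are hypothetical.

```python
def has_flanking_glycine(seq, pos):
    """True if the arginine at 1-based `pos` has a glycine at -1 or +1.

    This is a simplified stand-in for a GAR/RG motif definition.
    """
    i = pos - 1
    if seq[i] != "R":
        raise ValueError(f"position {pos} is {seq[i]!r}, not arginine")
    before = i > 0 and seq[i - 1] == "G"
    after = i + 1 < len(seq) and seq[i + 1] == "G"
    return before or after


def rg_fraction(seq, positions):
    """Fraction of the given arginine positions that sit next to a glycine."""
    hits = sum(has_flanking_glycine(seq, p) for p in positions)
    return hits / len(positions)


# Hypothetical toy peptide with arginines at positions 3, 8, and 11.
toy_seq = "AGRGGFKRPLRG"
toy_sites = [3, 8, 11]
toy_fraction = rg_fraction(toy_seq, toy_sites)
```

Running the same classification separately on high-intensity and low-intensity sites would be one way to reproduce the abundance comparison described above.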

Fig. 3 Site-specific analyses of arginine methylation.

(A) Sequence logo for identified arginine methylation sites in relation to previously reported RG motif. (B) Pfam domain analysis for arginine methylation preference in relation to RNA binding domains (RBDs) [RRM, KH, and Asp-Glu-Ala-Asp (DEAD)] and phospho-specific domains [pkinase, major histocompatibility complex (MHC), and Src homology 2 (SH2)]. Statistical significance was calculated with a two-tailed t test assuming unequal variance. (C and D) Arginine methylation colocalization with phosphorylation (C) or lysine ubiquitylation (D) (assessed by Fisher’s exact test). (E) Evolutionary analysis of methylated arginines compared to unmodified counterparts. The abscissa lists species in ascending degree of evolutionary distance to Homo sapiens, whereas the ordinate depicts the overall percentage of arginines at specific positions within multiple sequence alignments of clusters of orthologous genes (COGs). These positions are determined by pairwise comparison of methylated and nonmethylated arginines of the human proteins measured within this study. Data are representative of n = 4 experiments.

Previously, we reported that arginine methylation preferentially resides outside of protein family (Pfam) domains (24). Our current study recapitulates this finding and strengthens the conclusion (fig. S3A). Investigation of the distribution of arginine methylation sites compared to unmodified arginine residues across specific Pfam domains (Fig. 3B) shows a significant enrichment in the RNA recognition motif (RRM) and K homology (KH) RNA binding domains (RBDs), supporting the notion that methylated arginines function in regulating the dynamics and properties of RNA binding proteins.

In contrast, arginine methylation sites were not observed within the DEAD box domain of RNA helicases (Fig. 3B), supporting that arginine methylation primarily participates in regulating the RNA binding properties of nonenzymatic RBDs as compared to enzymatic RBDs (25).

Arginine methylation colocalizes with phosphorylation but not ubiquitylation

Arginine methylation has previously been reported to participate in crosstalk with phosphorylation (26, 27) and ubiquitylation (28, 29). To investigate colocalization between arginine methylation and phosphorylation, we first aligned our arginine methylation sites with those of unmodified serine, threonine, and tyrosine residues. No enrichment of these amino acids was observed in a ±6–amino acid window surrounding arginine methylation sites (fig. S3B). On the contrary, a strong colocalization of arginine methylation and phosphorylation sites (from phosphosite.org) (8) was observed (Fig. 3C), confirming that, on a proteome-wide scale, arginine methylation and phosphorylation localize to the same sequence regions in target substrates.

From the distribution of lysine residues in the near vicinity (±6 amino acids) of arginine methylation sites, we found lysine residues to preferentially localize nearer nonmodified arginine residues (fig. S3C). This may stem from lysine residues having physicochemical properties similar to arginine (hydrophobic and charged). Hence, having an amino acid with similar properties in close vicinity might hamper the biological impact of arginine methylation. Analogously, we found that lysine ubiquitylation sites [diGly-modified lysines (13)] were significantly enriched in the proximity of nonmodified arginine residues (Fig. 3D), supporting that crosstalk between arginine methylation and lysine ubiquitylation may not be prevalent. However, whether crosstalk is occurring, or whether such crosstalk might be antagonistic, sequential, or mutually exclusive, remains to be investigated for each individual protein region.
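The ±6–amino acid window analysis used in this section can be sketched as a simple neighbor count around each arginine. This is a minimal illustration with hypothetical names and a made-up sequence, not the pipeline used in the study; counts obtained this way for methylated and nonmodified arginines could then be compared statistically.

```python
def count_in_window(seq, center, targets, flank=6):
    """Count residues from `targets` within +/- `flank` of a 1-based position.

    The central residue itself is excluded, and the window is clipped at the
    sequence termini.
    """
    i = center - 1
    lo = max(0, i - flank)
    hi = min(len(seq), i + flank + 1)
    return sum(1 for j in range(lo, hi) if j != i and seq[j] in targets)


# Hypothetical toy sequence: arginine at position 8, serines at 1, 11, and 17.
toy_seq = "S" + "A" * 6 + "R" + "AA" + "S" + "A" * 5 + "S"
near_default = count_in_window(toy_seq, 8, "STY")          # window 2..14
near_wider = count_in_window(toy_seq, 8, "STY", flank=7)   # window 1..15
```

Passing "K" as `targets` gives the corresponding count for neighboring lysine residues.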

Arginine methylation sites are less conserved compared to nonmodified counterparts

On the basis of our observation that arginine methylation clusters proximal to other PTMs, we decided to investigate the evolutionary conservation of identified arginine methylation sites. Such evolutionary conservation is often used to infer functionally important modification sites on proteins; however, most human PRMTs originated early in eukaryotic evolution and therefore experience an evolutionary conservation distinct from that of, for example, kinases and acetyltransferases in eukaryotes (30, 31). Our analysis revealed that arginine methylation sites are less conserved across eukaryotic species as compared to their nonmodified counterparts (Fig. 3E) for both disordered and ordered regions (fig. S3E). However, a strong conservation across species closely related to humans was observed, demonstrating that protein arginine methylation constitutes more recent functional innovations. Intriguingly, the evolution of arginine methylation sites hereby follows the general evolution of PRMT enzymes and trails the evolution of spliceosomal introns (32). Thus, the functions of arginine methylation in RNA metabolism appear to evolutionarily overlap with the diversification of the transcriptome presumably through the important function that arginine methylation plays in alternative splicing and RNA metabolism (33).

Arginine methylation sites generally have decreased somatic mutation rates compared to regular arginine residues

For investigation of the correlation between arginine methylation and somatic mutations [single-nucleotide variants (SNVs)], we mapped somatic mutations onto our data set and compared the overlap of somatic mutations occurring at methylated versus nonmethylated arginine residues. According to the Catalogue of Somatic Mutations in Cancer (COSMIC) database (34), arginine residues are more frequently targeted by somatic mutations than any other amino acid (Fig. 4A). This bias relates to the positive correlation of the human mutation rate with CpG (5′-cytosine-phosphate-guanine-3′) content, with CpG being hypermutable (35). Because most cytosines in CpGs are methylated (36), the deamination of cytosine is enhanced, leading to a mutation rate that is 10 to 50 times higher than with cytosines located in any other sequence context (37). Consequently, the increased mutation rate of arginine residues probably relates to the CpG content of four of the six codons encoding arginine (CGU, CGC, CGA, CGG, AGA, and AGG). To confirm this, we looked at the amino acids to which arginine residues are most frequently mutated. As expected, cysteine, histidine, glutamine, arginine (synonymous mutation), and tryptophan are most frequently observed as somatic mutations at arginine residues, consistent with CpG-driven mutations of codons CGU, CGC, CGA, and CGG (fig. S3D).
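The link between CpG deamination and the observed substitutions can be worked through codon by codon. The sketch below assumes the standard genetic code and models only the CpG within the first two codon positions: a C-to-T transition on the coding strand changes position 1, and deamination on the opposite strand appears as a G-to-A change at position 2. It uses the DNA alphabet (T in place of U), and all names are hypothetical. It does not model synonymous changes at the third codon position, and it also yields a stop codon from CGA, an outcome that the text above does not discuss.

```python
# Minimal slice of the standard genetic code (DNA alphabet), covering only
# the codons reachable in this sketch.
CODON_TO_AA = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "TGT": "Cys", "TGC": "Cys", "TGA": "Stop", "TGG": "Trp",
    "CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln",
}


def cpg_transition_products(codon):
    """Return the two codons produced by deamination at a leading CpG."""
    if not codon.startswith("CG"):
        raise ValueError(f"{codon} does not start with a CpG dinucleotide")
    c_to_t = "T" + codon[1:]             # deamination on the coding strand
    g_to_a = codon[0] + "A" + codon[2]   # deamination on the opposite strand
    return c_to_t, g_to_a


# Map each CpG-containing arginine codon to its two transition outcomes.
outcomes = {
    codon: [CODON_TO_AA[m] for m in cpg_transition_products(codon)]
    for codon in ("CGT", "CGC", "CGA", "CGG")
}
```

Under these assumptions, the nonsynonymous outcomes are cysteine, histidine, glutamine, and tryptophan, which matches the substitutions listed above.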

Fig. 4 Arginine methylation sites are targeted by somatic mutations.

(A) Distribution of somatic mutations reported in COSMIC in comparison to amino acid distribution of the human proteome reveals that arginine residues are preferred sites for somatic mutations. (B) Distribution of somatic mutations (SNVs) on arginine methylation sites versus nonmodified arginines. Orange represents occurrence of synonymous mutations, whereas green represents nonsynonymous mutations. (C) Distribution of single-nucleotide polymorphisms (SNPs) occurring at arginine methylation sites as compared to nonmodified arginine residues. No difference is observed (Fisher’s exact test). (D) GO enrichment analysis reveals that genes harboring arginine methylations targeted by somatic mutations are strongly enriched in gene expression and RNA metabolic processes, whereas genes harboring somatic mutations at nonmodified arginines are enriched in protein transportation. Data are representative of n = 4 experiments.

When compared to nonmodified arginine residues, we observe a significant reduction of nonsynonymous SNVs at arginine methylation sites (Fig. 4B, green bars). In contrast, no difference is observed for synonymous SNVs between nonmodified arginines and methylation sites (Fig. 4B, orange bars). To investigate whether this observation is exclusive to somatic mutations, we extracted all SNPs from the COSMIC database and performed a similar analysis. This revealed an identical occurrence of synonymous and nonsynonymous SNPs between nonmodified arginine residues and methylation sites (Fig. 4C), leading us to conclude that arginine methylation sites are less frequently targeted by nonsynonymous SNVs (P < 2.89 × 10^−148, Fisher’s exact test).
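For readers unfamiliar with the test, a two-sided Fisher’s exact test on a 2 × 2 table (for example, methylated versus nonmodified arginines, with or without a nonsynonymous SNV) can be sketched in plain Python as follows. The counts in the example are hypothetical and far smaller than those in the study, and this implementation is only an illustration, not the software used by the authors.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Hypothetical counts: rows = methylated / nonmodified arginines,
# columns = with / without a nonsynonymous SNV.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

With proteome-scale counts, the same calculation can produce extremely small P values such as the one reported above.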

Although somatic mutations occur less frequently at methylated arginines, their occurrence is still preponderant when compared to somatic mutations at other amino acids (Fig. 4A). As a result, the observed differences most likely are not due to arginine methylation being a protective mark against somatic mutations but may instead stem from somatic mutations only targeting arginine methylation sites on specific subclasses of proteins. In fact, several studies have demonstrated that disease-causing mutations commonly alter protein stability (38), protein localization (39), and protein-protein interactions (40), which are all biological functions that correlate with processes affected by arginine methylation (Figs. 1F and 2, A to E). To investigate this in more detail, we compared proteins harboring arginine methylation sites targeted by somatic mutations to proteins that are generally targeted by somatic mutations. To avoid any bias imposed by the arginine methylation itself, we additionally compared the mutated proteins to our entire arginine methylome. Intriguingly, we found that proteins exclusively targeted by somatic mutations at arginine methylation sites were preferentially involved in gene expression and RNA metabolic processes (Fig. 4D), whereas proteins targeted by somatic mutations on nonmodified arginine residues were enriched in processes related to protein localization. Given that arginine methylation plays a prominent role in regulating RNA splicing events (41), our data support the notion that arginine methylation sites residing on splicing factors are specific targets for somatic mutation.

Determination of arginine methylation stoichiometry by cellular knockdown of PRMT enzymes reveals strong enzymatic regulation

To obtain regulatory information on arginine methylation sites, we next combined our established proteomic workflow with stable isotope labeling with amino acids in cell culture (SILAC) quantification and RNAi of several PRMTs (Fig. 5A). Although loss of several PRMTs is known to be embryonically lethal in mice (42, 43), we obtained efficient knockdown in HEK293 cells (Fig. 5B). Besides quantitative investigations of PRMT enzymes via RNAi, we analyzed the regulatory effect of the recently reported PRMT5 inhibitor, EPZ015666 (44).

Fig. 5 Quantitative analysis of the cellular effects of siPRMT.

(A) Schematic representation of the SILAC-based enrichment strategy. HEK293 cells were grown in “light” or “heavy” SILAC medium. Light-labeled cells were treated with RNAi. Cell lysates were separated by HpH fractionation, and for each fraction, separate pulldowns of SILAC-encoded lysates were performed using an arginine methylation–specific antibody. Eluates were subsequently analyzed by high-resolution LC-MS/MS. (B) Immunoblot analysis confirming efficient knockdown of PRMT enzymes. (C) Replicate siPRMT1 experiments correlate well [Pearson coefficient (R) = 0.73]. (D) siPRMT1 and siPRMT5 experiments do not correlate (R = 0.05). (E) Hierarchical clustering of arginine methylation sites under RNAi treatment of various PRMT enzymes and cellular treatment with the PRMT5 inhibitor EPZ015666 (EPZ). High reproducibility is observed as replicate siPRMT1 and siPRMT5 experiments cluster together. (F) Distribution of arginine methylation stoichiometry (occupancy) between different siPRMT experiments, and comparison to the corresponding values for phosphorylation and glycosylation. (G) Distribution of stoichiometry for arginine methylation sites residing in RG or non-RG motif sequences. (H) Distribution of stoichiometry for arginine methylation sites residing in RG or non-RG motif sequences upon siPRMT1 and siPRMT5 treatment. Data for siPRMT1 and siPRMT5 are representative of n = 2 experiments.

Replicate RNAi analysis of PRMT1 and PRMT5 revealed strong Pearson correlations in measured SILAC ratios (Fig. 5C and fig. S4B), demonstrating high biological reproducibility in our experimental setup. This is further supported by the observation that experiments using small interfering RNA (siRNA) against PRMT1 (siPRMT1) and those using siPRMT5 did not correlate (Fig. 5D and fig. S4C). A heatmap revealed that replicate experiments (for PRMT1 or PRMT5, separately) clustered together (Fig. 5E). As expected, the EPZ015666 experiment clustered closely with the PRMT5 replicates, whereas the CARM1 experiment formed a separate cluster. From our quantitative MS data, we further confirmed known substrates of PRMTs and identified new ones, including the DNA damage–associated component replication factor C subunit 5 (RFC5) and the Set1/Ash2 histone methyltransferase complex subunit ASH2 as novel targets of CARM1 (PRMT4) (fig. S4, D to F). Because the latter is also a target of PRMT1 and PRMT5 (45), our data expand current knowledge regarding the intricate and regulatory interactions between the different histone methyltransferases.
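The replicate comparison described above can be illustrated with a minimal sketch that computes a Pearson coefficient between log2-transformed SILAC ratios of two replicates. The ratio values are hypothetical and the log2 transformation is an assumption made for illustration; the sketch is not the analysis pipeline used in this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def log2_ratios(ratios):
    """Log2-transform SILAC ratios so that up- and down-regulation are symmetric."""
    return [math.log2(r) for r in ratios]


# Hypothetical light/heavy SILAC ratios for the same five sites in two replicates.
replicate_1 = [0.25, 0.50, 1.00, 0.80, 0.40]
replicate_2 = [0.30, 0.45, 1.10, 0.70, 0.50]
r = pearson_r(log2_ratios(replicate_1), log2_ratios(replicate_2))
print(f"Pearson R between replicates: {r:.2f}")
```

Only sites quantified in both experiments can enter such a comparison, so the two lists must be matched site by site before the coefficient is computed.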

Using the quantitative information obtained in the conducted SILAC experiments, we assessed the overall stoichiometry of arginine methylation sites, defined as the fraction of a protein that is modified at a given arginine methylation site (table S4) (5). Such investigations are important because, together with the arginine methylation site, the stoichiometry of the modification will determine the effect and extent of changes in protein function. The stoichiometry is derived from an intricate interplay between the activities of enzymes capable of adding and removing the PTM, along with the overall turnover of the modified protein substrate (46). For arginine methylation, this is further complicated by the fact that monomethylation may be further converted into dimethylation, which can affect its stoichiometry; however, this was not taken into consideration in the presented analysis. Still, stoichiometric values may indicate whether a given PTM is dynamic, which is particularly interesting for arginine methylation. Consequently, obtaining comprehensive information on arginine methylation site stoichiometry enables a better understanding of its functional importance and dynamics.

To obtain stoichiometric values, we used the relative signal intensity of the arginine-methylated peptide and related this to the cognate nonmethylated counterpart and the overall protein abundance to infer methylation stoichiometry (47). To obtain information on nonmethylated peptide and protein abundances, we analyzed the cellular proteomes for all investigated RNAi SILAC experiments (table S5). Because all analyses were conducted in a SILAC setting, our calculations enabled us to assess the stoichiometry of arginine methylation in both wild-type cells (untreated cells, heavy SILAC conditions) and cells treated with siPRMT (light SILAC conditions). Overall, we found that the fractional stoichiometry correlated well between investigated experiments, with half of the sites in wild-type cells having less than 26% fractional occupancy (Fig. 5F and fig. S4A). Upon RNAi treatment, the fractional stoichiometry decreased to below 10% for the same sites across individual siPRMT experiments (fig. S4A), demonstrating a strong enzymatic regulation of arginine methylation stoichiometry. We found that the distribution of the fractional stoichiometry of arginine methylation was almost identical to that of the corresponding stoichiometry of phosphorylation (Fig. 5F) (47, 48). Conversely, PTMs that are generated co-translationally occur at higher stoichiometry, such as N-linked glycosylation, where half of glycosylation sites exhibit more than 72% occupancy (Fig. 5F) (49). Thus, arginine methylation exhibits a cellular extent and stoichiometry similar to phosphorylation, supporting the notion that arginine methylation has a dynamic and regulatory nature analogous to phosphorylation.
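One way to infer occupancy from the three SILAC ratios mentioned above (methylated peptide, nonmethylated counterpart, and protein) is sketched below. This is an illustrative derivation under stated assumptions, not necessarily the exact calculation used here: it assumes a single modification site per peptide, that the protein pool is the sum of its methylated and nonmethylated forms, and that all ratios are expressed as light/heavy (siPRMT over untreated). The function name and example values are hypothetical.

```python
def occupancy_from_silac(mod_ratio, unmod_ratio, protein_ratio):
    """Infer site occupancy in the heavy (untreated) and light (siPRMT) states.

    With M and U the methylated and nonmethylated amounts in the heavy state:
        x = M_light / M_heavy          (methylated peptide ratio)
        y = U_light / U_heavy          (nonmethylated peptide ratio)
        z = (M_light + U_light) / (M_heavy + U_heavy)   (protein ratio)
    Substituting gives z * (M + U) = x * M + y * U, hence
        occupancy_heavy = M / (M + U) = (y - z) / (y - x)
        occupancy_light = occupancy_heavy * x / z
    """
    x, y, z = mod_ratio, unmod_ratio, protein_ratio
    if min(x, y, z) <= 0:
        raise ValueError("SILAC ratios must be positive")
    if x == y:
        raise ValueError("occupancy is undetermined when both peptide ratios are equal")
    occ_heavy = (y - z) / (y - x)
    occ_light = occ_heavy * x / z
    if not (0.0 <= occ_heavy <= 1.0 and 0.0 <= occ_light <= 1.0):
        raise ValueError("ratios are mutually inconsistent (occupancy outside 0..1)")
    return occ_heavy, occ_light


# Hypothetical site: methylated peptide drops fourfold upon knockdown, the
# nonmethylated peptide rises slightly, and the protein amount is unchanged.
wild_type, knockdown = occupancy_from_silac(0.25, 1.1875, 1.0)
print(f"untreated: {wild_type:.0%}, siPRMT: {knockdown:.0%}")
```

Because the calculation divides by the difference between the two peptide ratios, sites whose methylated and nonmethylated peptides change by nearly the same factor yield unstable estimates and are best excluded.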

PRMTs target RG and non-RG sequences with equal preference

Most PRMT enzymes (PRMT1, PRMT3, PRMT5, PRMT6, and PRMT8) are thought to preferentially target arginine residues located within arginine-glycine (RG) motifs (2), although arginines that deviate from this consensus motif have been shown to be methylated by PRMTs as well (50). Hence, the global sequence preference of PRMT enzymes is an open question. In our analysis, we found that 31% of all arginine methylation sites were located in RG motifs (Fig. 3A). However, our data showed that these sites preferentially occur on abundant arginines (fig. S1, F and G). Moreover, we found that RBDs preferentially become arginine-methylated (Fig. 3B), which collectively raises the question of whether PRMT enzymes indeed prefer to methylate arginines within RG motifs or whether the observed sequence preference may stem from the cellular abundance of RNA binding proteins harboring RBDs rich in RG motifs.

To investigate motif preferences in more detail, we decided to focus on the stoichiometry values derived from our quantitative siPRMT1 and siPRMT5 experiments. This choice was made on the basis that PRMT1 and PRMT5 are reported to target RG motifs and to preferentially catalyze dimethylation, and hence that monomethylated arginine should be the preferred substrate for these PRMTs. In addition, PRMT1 is the major enzyme responsible for catalyzing arginine methylation in human cells (4). Because the sequence logo analysis only depicts the overall distribution of arginine methylation sites (Fig. 3A), a comparison of stoichiometry values for arginine methylation sites located in RG motifs to those located in non-RG motifs should be a more appropriate measure of PRMT sequence preference. When stoichiometry values were compared, we observed no statistical difference between arginine methylation sites located in RG sequence motifs and those located in non-RG sequence motifs (Fig. 5G). In fact, when comparing arginine methylation values from siPRMT1- and siPRMT5-treated cells (light SILAC conditions), the arginine methylation sites located in non-RG motifs were significantly more reduced in their stoichiometry as compared to arginine methylation sites located in RG motifs (Fig. 5H; P = 1.33 × 10−8, Fisher’s exact test). Although the changes in stoichiometry were derived from replicate analyses of siPRMT1 and siPRMT5 experiments, the catalytic activity of other PRMT enzymes may indirectly be affected upon the loss of PRMT1 or PRMT5. Still, our data support the conclusion that PRMT enzymes target arginine residues located in RG and non-RG sequence motifs with similar preference.
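A Fisher’s exact test of the kind cited above operates on a 2 × 2 table of counts. The sketch below shows one way such a test could be set up for RG versus non-RG sites; how the sites were binned for the reported P value is not specified in the text, so the table layout (sites with reduced stoichiometry versus sites without) and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    The P value is the summed hypergeometric probability of every table with
    the same margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Probability of observing k in the top-left cell given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows are non-RG and RG sites, columns are sites whose
# stoichiometry was reduced upon knockdown and sites whose stoichiometry was not.
non_rg_reduced, non_rg_unchanged = 60, 40
rg_reduced, rg_unchanged = 30, 70
p = fisher_exact_two_sided(non_rg_reduced, non_rg_unchanged, rg_reduced, rg_unchanged)
print(f"P = {p:.2e}")
```

The threshold that defines a “reduced” site is itself an analysis choice, so the resulting P value should be read together with the underlying stoichiometry distributions.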

Quantitative image-based cytometry single-cell analysis reveals CARM1-specific regulation of SRSF2 into nuclear speckles

To assess the applicability of our resource data for uncovering novel biological regulation, we conducted a more detailed analysis of the splicing component SRSF2. Our quantitative MS data identified eight arginine methylation sites on SRSF2, of which several were regulated upon the various siPRMT treatments (table S1). The biological role of SRSF2 (also referred to as SC35) has previously been functionally linked to its localization in nuclear speckles (51), which are dynamic nuclear domains that are enriched with factors involved in pre-mRNA processing and RNA transport (52). SRSF2 is a member of the serine/arginine (SR) protein family, and the ability of SR proteins to bind pre-mRNA is essential for their activity in both constitutive and alternative splicing.

The SR proteins share a modular structure consisting of one or two copies of an RRM along with a C-terminal domain of variable length rich in alternating SR dipeptides (the so-called RS domain). The RRM domain primarily determines substrate specificity, whereas the RS domains participate in protein-protein interactions (53).

All modification sites identified in SRSF2 reside outside the RS domains, with most of the identified arginine methylation sites localizing to the RRM domain (fig. S5A). Because the RRM domain is known to interact with RNA, we surmised that arginine methylation of SRSF2 may constitute a novel regulatory mechanism for its assembly in nuclear speckles. To investigate this in more detail, we used immunofluorescence microscopy to analyze the subcellular localization of endogenously expressed SRSF2 in U2OS cells transfected with siRNA against several PRMT enzymes. We detected an accumulation of SRSF2 into speckle-like structures in both wild-type cells and cells treated with various siPRMTs. However, the localization of SRSF2 into nuclear speckles was more pronounced in cells treated with siCARM1 (Fig. 6A).

Fig. 6 PRMT-dependent regulation of SRSF2.

(A) U2OS cells were left untreated or pretreated with siControl or various siPRMTs, and subsequently immunostained with SRSF2 antibody. Scale bar, 10 μm. (B) Quantification of endogenous SRSF2 speckle intensities per nucleus of the conditions represented in (A). Each dot represents the intensity value observed for a single quantified cell. Data are representative of >15,000 cells quantified per condition. (C) Quantification of the number of endogenous SRSF2 speckles per nucleus of the conditions represented in (A). The red line represents median distribution of cell population. (D) (Left) Western blot analysis of the RNA binding properties for GFP-tagged SRSF2 (top) and HNRNPUL1 (bottom) in HeLa cells transfected with the indicated siRNA. (Right) GFP and HNRNPUL1 loading control for investigated cells and conditions. RBP PD, RNA-binding protein pulldown.

For unbiased and quantitative analyses, we conducted an automated and intensity-based high-content imaging detection of immunofluorescence-stained SRSF2 nuclear speckles based on more than 15,000 cells per investigated condition. The quantitative measurements revealed an increase in speckle SRSF2 intensity in cells treated with siCARM1 (Fig. 6B), whereas the total number of detected speckles was unchanged between investigated conditions (Fig. 6C). Because each dot in these figures represents values derived from single-cell analyses, we conclude from these high-content single-cell analyses that the increased accumulation into nuclear speckles is a general property of SRSF2 in CARM1-deficient cells.

Supporting the specificity of the observed SRSF2 accumulation, no regulation was detected for the splicing factor SRSF5 upon siCARM1 treatment (fig. S5, B and C). Given that CARM1 regulates DNA damage and cell cycle factors, we investigated whether the observed SRSF2 effects are cell cycle–dependent using quantitative image-based cytometry (QIBC) (54). This allowed us to monitor the complete dynamics of siCARM1 treatment with unprecedented detail and in a fully automated and high-content fashion. Using QIBC, we found a cell cycle–independent effect when cells were depleted for CARM1 compared to controls (fig. S6A). Likewise, the normalized speckle intensities showed cell cycle (G1, S, and G2-M)–independent regulation (fig. S6B), supporting the notion that the observed speckle formation is distinct from the cell cycle–dependent localization of SRSF2 caused by, for example, RAS-related nuclear protein–binding protein 2 (55).

To investigate whether CARM1-dependent accumulation of SRSF2 leads to increased RNA binding, we performed an RNA pull-down experiment in HeLa cells expressing GFP-tagged SRSF2 using established methodologies (56, 57). Western blot analysis revealed only a slight increase in RNA binding affinity for SRSF2 upon CARM1 knockdown (Fig. 6D), supporting the notion that CARM1-catalyzed arginine methylation primarily regulates the spatiotemporal dynamics of the nuclear factor SRSF2. Intriguingly, our data revealed that the RNA binding property of SRSF2 was significantly decreased only upon PRMT5 knockdown, supporting the notion that the localization and RNA binding properties of SRSF2 are regulated by arginine methylation but via distinct PRMT enzymes.

To investigate whether the observed PRMT-dependent regulation also applied to other RNA binding proteins, we investigated the RNA binding properties of the heterogeneous nuclear ribonucleoprotein HNRNPUL1 within the same pull-down experiment. Previously, arginine methylation was reported to regulate the interaction between HNRNPUL1 and NBS1 in the DNA damage response (58, 59), whereas our proteomics data reveal that HNRNPUL1 is differentially methylated by various PRMTs (table S1 and fig. S6C). Hence, we reasoned that the observed changes may differentially affect the RNA binding properties of HNRNPUL1, analogous to our observations for SRSF2. Indeed, our RNA pull-down experiment revealed that knocking down CARM1 (siPRMT4 condition) and, to a lesser extent, PRMT1, but not PRMT5, caused a substantial increase in the RNA binding affinity of HNRNPUL1 (Fig. 6D).

Collectively, these results confirm that arginine methylation events catalyzed by different PRMTs regulate separate biological functions of the same substrate and support that these PRMT-dependent functions may be more prevalent than currently anticipated (60). Hence, our data provide novel insights into the diverse biological functions that PRMTs regulate within RNA metabolic processes.

DISCUSSION

Here, we used a streamlined methodology for proteome-wide identification and quantitation of arginine methylation sites to investigate the in vivo human arginine methylome. We identified 8030 arginine methylation sites with localization scores above 0.60 and found that arginine methylation was implicated in a diverse set of cellular functions. We observed good overlap of the measured arginine methylome in replicate experiments, indicating that we sampled a substantial part of all arginine methylation sites in these cells. With the identification of thousands of arginine methylation sites described here, we positioned the size of the arginine methylome parallel to the corresponding phosphoproteome and ubiquitylome.

In strong support of the acquired data, we confirmed covalent arginine methylation of several new targets by in vivo approaches and performed functional analysis of the data set, establishing that this modification extensively occurs in a variety of biological processes.

Although methylated arginine residues are evolutionarily less conserved compared to their unmodified counterparts, their evolution seems to trail the early eukaryotic evolution of most human PRMTs and correlate with the biological importance of PRMTs and the spliceosome in mammalian physiology. With the metabolic costs of methylation events being rather high [12 adenosine triphosphate molecules per methylation event (61)], our analyses support the notion that such an “expensive” enzymatic reaction has not been evolutionarily retained in full.

Our data further provide evidence of a relation between arginine methylation sites and the modification sites of phosphorylation. Such co-occurrence can be used to infer PTM crosstalk and as a feature to assign functional relevance to investigated modifications (62). Notably, crosstalk between various PTMs on histone proteins has been fundamental in establishing the concept of a “histone code” (63). Considering the cellular distribution of arginine methylation reported in this study and the colocalization with other widespread modifications, we suggest that the histone code may not be exclusive to histone proteins, but a more widely occurring phenomenon.

Besides providing new biological insights into arginine methylation at a systems level, our data contain many interesting leads for future functional studies. For example, we found that all transfer RNA synthetases were arginine-methylated, suggesting an important role of this modification in translation, which could pave the way for future studies. We also found that all components of the classical epidermal growth factor (EGF) signaling pathway were extensively modified with arginine methylation, including growth factor receptor–bound protein 2 (GRB2), son of sevenless homolog 1 (SOS1), guanosine triphosphatase RAS, RAF proto-oncogene serine/threonine protein kinase (RAF1), dual-specificity mitogen-activated protein kinase kinase 1 (MEK1), and mitogen-activated protein kinase 1 (MAPK1). These data support a more widespread functional role for arginine methylation within EGF signaling than was anticipated (27, 64) and may in turn lead to new insights into cellular signaling in human cells.

Moreover, using high-content microscopy and biochemical assays, we uncovered previously unknown roles for arginine methylation in the regulation of SRSF2 and HNRNPUL1, which affect subnuclear structures and RNA binding properties. Although SRSF2 and HNRNPUL1 harbor several arginine methylation sites that are regulated by distinct PRMTs, we found that knocking down CARM1 promoted the accumulation of SRSF2 in nuclear speckles and the RNA binding properties of HNRNPUL1. Contrary to this, knocking down PRMT5 prevented the RNA binding of SRSF2, whereas no effect was observed on HNRNPUL1. Collectively, these data demonstrate the analytical advantages of combining unbiased methodologies such as quantitative MS analysis with high-content microscopy and RNA binding pull-down experiments to decipher the functional role of arginine-methylated substrates.

In conclusion, our data extend current knowledge regarding arginine methylation and reinforce the widespread interest in targeting PRMT enzymes as therapeutic candidates in various human diseases (65). This idea is further supported by the development of a selective PRMT5 inhibitor for the treatment of mantle cell lymphoma (44). Hence, a comprehensive and quantitative evaluation of the human arginine methylome may reveal useful regulatory information to interrogate future disease pathways implicated by arginine methylation.

MATERIALS AND METHODS

Cell culture and transfection

HEK293, HeLa, and U2OS cells were grown in Dulbecco’s modified Eagle’s medium (DMEM; Invitrogen) supplemented with 10% fetal bovine serum (FBS) and penicillin/streptomycin (100 U/ml) (Gibco). Stable HeLa-Kyoto cells expressing SMARCA5, EIF4B, EIF3I, CBX4, SRSF1, SRSF2, RFC5, EEF1A1, FUS, and GFP tagged with C-terminal GFP under the control of an endogenous promoter were generated by transfecting BAC transgenes and were provided by A. Hyman (Max Planck Institute, Dresden). Cell lines expressing GFP-tagged MCM6 and SMC1A were provided by M. Vermeulen (Radboud Institute for Molecular Life Sciences, Nijmegen). Selection was maintained by adding G418 (400 μg/ml; Sigma-Aldrich) to the culture medium. SILAC HEK293 cells were grown in SILAC DMEM (Invitrogen) supplemented with 10% dialyzed FBS, l-glutamine, penicillin/streptomycin, and either l-lysine and l-arginine, l-lysine 4,4,5,5-D4 and l-arginine–U-13C6, or l-lysine–U-13C6-15N2 and l-arginine–U-13C6-15N4 (Cambridge Isotope Laboratories) (66). The siRNA oligonucleotides against endogenous proteins were purchased from Life Technologies for human PRMT1 (ID: 10000), PRMT4 (Stealth PRMT4/CARM1 siRNAs; set of 3: HSS116113, HSS116114, and HSS116115), PRMT5 (ID: 135623), and Negative Control siRNA#1. siRNA transfections were performed using Lipofectamine RNAiMAX (Invitrogen) according to the manufacturer’s protocol, and cells were lysed 48 hours after transfection. For inhibitor experiments, cells were treated with 100 nM EPZ015666 (Selleckchem) for 48 hours.

Sample preparation

Cells were harvested by washing with phosphate-buffered saline (PBS) and lysed in modified radioimmunoprecipitation assay (RIPA) buffer [50 mM tris (pH 7.5), 400 mM NaCl, 1 mM EDTA, 1% NP-40, and 0.1% Na-deoxycholate] supplemented with protease inhibitor cocktail (Roche) and 2 mM Na-orthovanadate, 5 mM NaF, and 5 mM glycero-2-phosphate. Lysates were sonicated and then cleared by high-speed centrifugation. Proteins were precipitated by adding fourfold excess volumes of ice-cold acetone and stored at −20°C overnight. Subsequently, proteins were solubilized in a urea solution [6 M urea/2 M thiourea/10 mM Hepes (pH 8.0)]. The RIPA cell pellets were resuspended in 6 M urea/2 M thiourea/10 mM Hepes, sonicated, and, after additional centrifugation, combined with the already solubilized proteins. Protein concentrations in lysates were measured using Bradford assay (Bio-Rad). Next, proteins were reduced by adding dithiothreitol (DTT) to a final concentration of 1 mM and alkylated with chloroacetamide at 5.5 mM. Proteins were digested using endoproteinase Lys-C (1:100, w/w) and modified sequencing grade trypsin (1:100, w/w) after a fourfold dilution in 25 mM ammonium bicarbonate solution. Protease digestion was terminated by slow addition of trifluoroacetic acid to pH 2. Precipitates were removed by centrifugation for 10 min at 3000g. Peptides were purified using reversed-phase Sep-Pak C18 cartridges (Waters). Peptides were eluted off the Sep-Pak with 50% acetonitrile, and the acetonitrile was then removed by vacuum centrifugation. The peptides were subsequently fractionated by HpH fractionation (see separate experimental procedure).

HpH reversed-phase prefractionation

Ten milligrams of peptide mixture from biological samples was fractionated using a Waters XBridge BEH130 C18 3.5 μm 4.6 × 250 mm column on an UltiMate 3000 HPLC (high-performance liquid chromatography) system (Dionex) operating at 1 ml/min as previously described (67). Buffer A was Milli-Q water and buffer B was 100% acetonitrile. Buffer C consisted of 25 mM ammonium hydroxide and was constantly introduced throughout the gradient at 10%. A similar injection protocol and gradient were used for all fractionation experiments; all fractions were collected using a Dionex AFC-3000 fraction collector in a 24–deep well plate at 1-min intervals. Samples were initially loaded onto the column at 1 ml/min for 4 min, after which the fractionation gradient commenced as follows: 5% B to 25% B in 62 min, 60% B in 5 min, and ramped to 70% B for 3 min. At this point, fraction collection was halted, and the gradient was held at 70% B for 5 min before being ramped back to 5% B, where the column was then washed and equilibrated and injection of buffer C was halted. The total number of fractions concatenated was set to 14 throughout all experiments.

Enrichment of peptides modified with omega-NG-MMA

After HpH fractionation, the acetonitrile and the ammonium hydroxide were removed and the volume of the concatenated samples was reduced by vacuum centrifugation at 60°C. PTMScan IAP Buffer (10×) (Cell Signaling) was added to the fractionated peptide samples to a final concentration of 1×. Two vials of PTMScan Mono-Methyl Arginine Motif [mme-RG] Immunoaffinity Beads were equally distributed among the 14 samples, which were incubated with rotation for 2 hours at 4°C. The immunoprecipitates were washed three times in ice-cold immunoprecipitation buffer followed by three washes in water, and modified peptides were eluted with 2 × 50 μl of 0.15% trifluoroacetic acid in Milli-Q water. Peptide eluates were desalted on reversed-phase C18 StageTips as described previously (68).

MS analysis

All MS experiments were performed on a nanoscale EASY-nLC 1000 UHPLC system (Thermo Fisher Scientific) connected to an Orbitrap Q Exactive HF equipped with a nanoelectrospray source (Thermo Fisher Scientific). Each peptide fraction was eluted off the StageTip, autosampled, and separated on a 15-cm analytical column (75 μm inner diameter) in-house packed with 1.9-μm C18 beads (Reprosil Pur-AQ, Dr. Maisch) using a 77-min gradient ranging from 5 to 40% acetonitrile in 0.5% formic acid at a flow rate of 250 nl/min. The effluent from the HPLC was directly electrosprayed into the MS. The Q Exactive HF was operated in data-dependent acquisition mode, and all samples were analyzed using a previously described “sensitive” acquisition method (69). Backbone fragmentation of eluting peptide species was obtained using higher-energy collisional dissociation (HCD), which ensured high-mass accuracy on both precursor and fragment ions (70).

Identification of peptides and proteins

All raw data analysis was performed with MaxQuant software suite version 1.2.6.20 supported by the Andromeda search engine (71, 72). Data were searched against a concatenated target/decoy (forward and reverse) version of the UniProt Human fasta database encompassing 71,434 protein entries (downloaded from www.uniprot.org on 7 March 2013). Mass tolerance for searches was set to a maximum of 7 parts per million (ppm) for peptide masses and 20 ppm for HCD fragment ion masses, making it possible to distinguish arginine monomethylation from other amino acid combinations (such as Ile + Gly, Leu + Gly, and Val + Ala) or from other modified residues such as acetylated lysine residues despite the mass discrepancies only being 66 ppm. Data were searched with carbamidomethylation as a fixed modification and protein N-terminal acetylation, methionine oxidation, and monomethylation on lysine and arginine as variable modifications. A maximum of three miscleavages was allowed while requiring strict trypsin specificity, and only peptides with a minimum sequence length of seven were considered for further data analysis. Peptide assignments were statistically evaluated in a Bayesian model on the basis of sequence length and Andromeda score. Only peptides and proteins with an FDR of less than 1% were accepted, estimated on the basis of the number of accepted reverse hits, and FDR values were finally estimated separately for modified and unmodified peptides. Protein sequences of common contaminants such as human keratins and proteases used were added to the database. For SILAC quantification, a minimum of two ratio counts was required. See also the supplementary note in the Supplementary Materials.
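The near-isobaric combinations mentioned above can be checked with a short calculation. The sketch below uses standard monoisotopic residue masses (rounded to five decimals) and expresses each mass difference in ppm relative to the residue-level mass; under these assumptions, all listed alternatives differ from monomethylated arginine by approximately 66 ppm. The dictionary and function names are hypothetical.

```python
# Standard monoisotopic residue masses in daltons, rounded to five decimals.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "V": 99.06841,
    "L": 113.08406,
    "I": 113.08406,
    "K": 128.09496,
    "R": 156.10111,
}
METHYL_SHIFT = 14.01565  # addition of CH2
ACETYL_SHIFT = 42.01057  # addition of C2H2O


def ppm_difference(mass, reference_mass):
    """Absolute mass difference in parts per million of the reference mass."""
    return abs(mass - reference_mass) / reference_mass * 1e6


def isobaric_ppm_table():
    """ppm difference between monomethyl-arginine and near-isobaric alternatives."""
    methyl_arg = RESIDUE_MASS["R"] + METHYL_SHIFT
    alternatives = {
        "Ile + Gly": RESIDUE_MASS["I"] + RESIDUE_MASS["G"],
        "Leu + Gly": RESIDUE_MASS["L"] + RESIDUE_MASS["G"],
        "Val + Ala": RESIDUE_MASS["V"] + RESIDUE_MASS["A"],
        "acetyl-Lys": RESIDUE_MASS["K"] + ACETYL_SHIFT,
    }
    return {name: ppm_difference(methyl_arg, mass) for name, mass in alternatives.items()}


for name, ppm in isobaric_ppm_table().items():
    print(f"methyl-Arg vs {name}: {ppm:.1f} ppm")
```

Note that the same absolute difference of about 0.011 Da corresponds to fewer ppm on an intact peptide precursor than on a small fragment ion, which is why both the precursor and the fragment tolerances matter for this distinction.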

Bioinformatics analyses

Statistical analysis and hierarchical clustering were performed using Perseus (Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Munich). Significantly enriched GO terms were determined using the Functional Annotation Tool of the DAVID Bioinformatics database version 6.8 Beta (73). Protein interaction networks were analyzed using the interaction data from the STRING database (v. 9.05) (74) and visualized using Cytoscape (v. 2.8.3) (75). Sequence motif analyses were performed using the iceLogo web server (23).

The evolutionary conservation analysis was based on the pairwise comparison of methylated and nonmethylated arginine positions to positions within homologous proteins of eukaryotic species using eggNOG’s multiple sequence alignments (76). Methylated arginine positions were based on MS data, whereas nonmethylated arginines were randomly selected. UniProt accession numbers were mapped to Ensembl proteins (ENSPs) using STRING’s (74) “proteins-aliases” resource. The presence or absence of arginines at the methylated and nonmethylated positions within the alignments was counted, and the overall percentage of arginines at the respective positions was calculated for each species. The comparison was restricted to species classified as “core” in STRING, because of the quality and functional annotation of their genomes. We used eggNOG’s “fine-grained orthologs” functionality to refine the selection of ENSPs used for comparison within each COG, in order to strengthen the specificity of the comparison and to mitigate the elevated influence that a multitude of homologous ENSPs of a single species within a COG would have on the overall results of the given species. iTOL and phyloT (77, 78) were used to generate a phylogenetic tree, which was the basis for ordering the species by ascending evolutionary distance to H. sapiens.
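The per-species counting step described above can be sketched as follows for a single alignment. This is a simplified illustration, not the pipeline used in this study: it assumes one aligned sequence per species with “-” as the gap character, and the sequences, species labels, and function names are hypothetical.

```python
def column_of_residue(aligned_seq, residue_number):
    """Return the 0-based alignment column of the n-th (1-based) non-gap residue."""
    seen = 0
    for column, symbol in enumerate(aligned_seq):
        if symbol != "-":
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError(f"sequence has fewer than {residue_number} residues")


def percent_arginine(alignment, human_key, positions):
    """Percentage of the given human arginine positions that are R in each species."""
    human = alignment[human_key]
    columns = [column_of_residue(human, p) for p in positions]
    for position, column in zip(positions, columns):
        if human[column] != "R":
            raise ValueError(f"human position {position} is not an arginine")
    result = {}
    for species, sequence in alignment.items():
        if species == human_key:
            continue
        conserved = sum(1 for column in columns if sequence[column] == "R")
        result[species] = 100.0 * conserved / len(columns)
    return result


# Hypothetical toy alignment; human positions 2 and 4 are arginines.
toy_alignment = {
    "human": "MR-GRAK",
    "mouse": "MRSGRAK",
    "yeast": "MK-GRSK",
}
conservation = percent_arginine(toy_alignment, "human", [2, 4])
print(conservation)
```

Running the same function once on methylated positions and once on randomly selected nonmethylated arginines, then pooling the counts across proteins, gives the two per-species percentages that are compared in the analysis.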

For mapping of somatic mutations, we downloaded the most recent version of the COSMIC database (http://cancer.sanger.ac.uk/cosmic) containing 3,534,058 somatic mutations, of which 469,855 mutations (corresponding to 13.3% of total) specifically occurred at arginine residues. These arginine somatic mutations were mapped onto the identified arginine methylation sites, and as control, we additionally mapped mutations onto arginine residues detected within the same protein as not being methylated (referred to as nonmodified arginine residues).

GFP and methylated protein precipitations

Cells expressing the tagged versions of the proteins of interest were harvested by washing with PBS and lysed in modified RIPA buffer [50 mM tris (pH 7.5), 400 mM NaCl, 1 mM EDTA, 1% NP-40, and 0.1% Na-deoxycholate] supplemented with protease inhibitor cocktail (Roche) and 2 mM Na-orthovanadate, 5 mM NaF, and 5 mM glycero-2-phosphate. Lysates were diluted in modified RIPA without salt and then sonicated and cleared by high-speed centrifugation. GFP immunoprecipitation was performed with 20 μl of GFP-Trap_A agarose beads (Chromotek), whereas Strep-Tactin precipitation was performed using 20 μl of Strep-Tactin Sepharose beads (IBA). One milligram of protein mixture was incubated with the appropriate beads for 2 hours with rotation at 4°C before washing and subsequent elution with 2× Laemmli sample buffer (Thermo Fisher Scientific) at 90°C. Immunoprecipitation of methylated proteins was performed similarly but using 20 μl of PTMScan Mono-Methyl Arginine Motif [mme-RG] Immunoaffinity Beads (Cell Signaling) and 5 mg of protein mixture.

RNA binding assay

RNA-protein complex isolation was done as previously described (79), with minor changes. In brief, tagged SRSF2 HeLa cells were irradiated with ultraviolet light at 254 nm (0.15 J/cm2) (Dr. Gröbel GmbH), harvested in ice-cold PBS, and spun down. Cell pellets were resuspended in lysis buffer [20 mM tris-HCl (pH 7.5), 500 mM lithium chloride (LiCl), 0.1% Na-deoxycholate, 1 mM EDTA, and 5 mM DTT] and immediately sonicated for 20 s at an amplitude of 70% (Sonics). Magnetic poly(dT) beads (New England Biolabs) were added to 10 mg of cell lysate and incubated at 4°C for 1 hour with gentle rotation. Beads were captured using a magnetic rack (Thermo Fisher Scientific) and washed once with lysis buffer, followed by washes with the following buffers: (i) 20 mM tris-HCl (pH 7.5), 500 mM LiCl, 0.02% (w/v) Na-deoxycholate, 1 mM EDTA, and 5 mM DTT; (ii) 20 mM tris-HCl (pH 7.5), 500 mM LiCl, 1 mM EDTA, and 5 mM DTT; (iii) 20 mM tris-HCl (pH 7.5), 200 mM LiCl, 1 mM EDTA, and 5 mM DTT. Bound RNA and cross-linked proteins were eluted at 55°C for 3 min in elution buffer [20 mM tris-HCl (pH 7.5) and 1 mM EDTA]. Ribonuclease (RNase) buffer (10×) [100 mM tris-HCl (pH 7.5), 1.5 M NaCl, 0.5% (v/v) NP-40, and 5 mM DTT] was added to the sample along with a mixture of RNase A and T1 (Thermo Fisher Scientific), and the sample was incubated for 1 hour at 37°C. After RNase digestion, the eluates were used for Western blot analysis.

Western blotting

The following antibodies were used in this study: rabbit polyclonal PRMT1 (Cell Signaling), rabbit polyclonal PRMT4 (Cell Signaling), rabbit polyclonal PRMT5 (Cell Signaling), mouse monoclonal actin (Sigma-Aldrich), rabbit monoclonal GFP (Cell Signaling), mouse monoclonal GFP (Roche), and monoclonal rabbit hnRNPUL1 (Novus Biologicals). Total cell lysates, together with the eluates, were resolved on 4 to 12% gradient SDS–polyacrylamide gel electrophoresis gels (Thermo Fisher Scientific), and proteins were transferred onto nitrocellulose membranes (Sigma-Aldrich). Membranes were blocked using 5% bovine serum albumin (BSA) solution in PBS supplemented with Tween-20 (0.1%). Secondary antibodies coupled to horseradish peroxidase (Jackson ImmunoResearch Laboratories) were used for immunodetection. The detection was performed with a Novex ECL Chemiluminescent Substrate Reagent Kit (Invitrogen).

Quantitative high-content imaging

For immunofluorescence analysis, U2OS cells were grown on 96-well screenstar microplates (Greiner Bio One) and fixed in 4% formaldehyde at room temperature for 15 min. Cells were permeabilized for 2 min using 0.2% Triton X-100, blocked in 5% BSA, incubated with mouse monoclonal SC35 (Novus Biologicals) or rabbit polyclonal SRSF5 (Sigma-Aldrich) overnight at 4°C and with secondary antibody Alexa Fluor 488 (Invitrogen) for 1 hour at room temperature, stained with 4′,6-diamidino-2-phenylindole (DAPI) for 20 min, and left in PBS. For cell cycle analysis, the Click-iT EdU Alexa Fluor 594 Imaging Kit (Thermo Fisher Scientific) was used. In brief, cells were treated with EdU for 30 min before fixation and stained for 30 min according to the manufacturer’s protocol before continuing the immunofluorescence staining. QIBC for measurement of fluorescence intensities was performed as described previously (54). For quantitative analysis of CARM1/PRMT4-induced relocalization of SRSF2 and SRSF5 into subnuclear speckles, images were acquired on an Olympus ScanR fully automated widefield microscope, equipped with a light-emitting diode light source, single filters DAPI/fluorescein isothiocyanate/Cy3/Cy5, and a digital monochrome Hamamatsu OrcaFlash 4 camera (2048 × 2048 pixels; cell size, 6.5 μm) with a 20× 0.75 UPLSAPO objective. For each condition, image information of at least 15,000 cells was acquired under nonsaturating conditions. Subnuclear speckles were identified using the inbuilt object recognition features of the ScanR Image Analysis Software. To avoid potential interference from nuclear-cytoplasmic protein shuttling, the count and sum of speckle-associated fluorescence intensities normalized by total nuclear fluorescence intensities of SRSF2 served as a measurement for CARM1-induced accumulation of SRSF2 into speckles. The Spotfire software was used to quantify percentages and median values in cell populations and to generate scatter diagrams.
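The normalization described above, in which speckle-associated intensity is divided by total nuclear intensity for each cell before population medians are taken, can be sketched as follows. The ScanR and Spotfire software were used for the actual analysis; the data structure, field names, and values below are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import median


def normalized_speckle_intensity(speckle_intensities, total_nuclear_intensity):
    """Sum of speckle intensities divided by the total nuclear intensity of one cell."""
    if total_nuclear_intensity <= 0:
        raise ValueError("total nuclear intensity must be positive")
    return sum(speckle_intensities) / total_nuclear_intensity


def summarize_population(cells):
    """Return (median normalized speckle intensity, median speckle count) per nucleus."""
    normalized = [
        normalized_speckle_intensity(cell["speckles"], cell["nuclear_total"])
        for cell in cells
    ]
    counts = [len(cell["speckles"]) for cell in cells]
    return median(normalized), median(counts)


# Hypothetical measurements for three nuclei: per-speckle integrated intensities
# and the integrated intensity of the whole nucleus in the same channel.
example_cells = [
    {"speckles": [10, 20], "nuclear_total": 100},
    {"speckles": [5, 5, 10], "nuclear_total": 200},
    {"speckles": [50], "nuclear_total": 100},
]
median_intensity, median_count = summarize_population(example_cells)
print(median_intensity, median_count)
```

Dividing by the total nuclear signal makes the readout insensitive to changes in overall nuclear SRSF2 abundance, so an increase reflects redistribution into speckles rather than a change in nuclear import.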

SUPPLEMENTARY MATERIALS

www.sciencesignaling.org/cgi/content/full/9/443/rs9/DC1

Supplementary note on mass spectrometric identification of arginine-methylated peptides

Fig. S1. Validation of established methodology for mapping arginine methylation sites.

Fig. S2. Arginine methylation sites are found in major protein complexes.

Fig. S3. Site-specific analysis of arginine methylation sites.

Fig. S4. Occupancy analysis of arginine methylation sites.

Fig. S5. High-content microscopy analysis of splicing component SRSF2.

Fig. S6. Cell cycle analysis using QIBC.

Table S1. List of identified arginine methylation sites from all experiments.

Table S2. List of protein complexes found significantly enriched in arginine methylation protein factors.

Table S3. List of Pfam domains where arginine methylation sites occur.

Table S4. Stoichiometry values for identified arginine methylation sites.

Table S5. List of identified proteins used for determination of arginine methylation stoichiometry values.

Reference (81)

REFERENCES AND NOTES

Acknowledgments: We thank members of the Novo Nordisk Foundation Center for Protein Research (NNF-CPR) for fruitful discussions and careful reading of the manuscript. Funding: The work carried out in this study was in part supported by the NNF-CPR, the Novo Nordisk Foundation (grant numbers NNF14CC0001 and NNF13OC0006477), the Lundbeck Foundation, and the Danish Council for Independent Research [grant agreement numbers DFF 4002-00051 (Sapere Aude) and DFF 4183-00322A]. A.M. is supported by a Marie Curie Intra-European Fellowship for Career Development (Project #627187). Author contributions: S.C.L. and K.B.S. designed and performed the HpH fractionation, enrichment of arginine-methylated peptides, and MS and Western blot experiments; analyzed the data; and contributed to the writing of the manuscript. S.C.L. and A.M. designed and performed the microscopy analysis. A.M. analyzed the microscopy data. M.M. performed the RNA binding assay. M.V.M. performed Western blot analysis. D.L. and L.J.J. performed bioinformatics analysis and contributed to the writing of the manuscript. J.A.D. edited the manuscript and discussed the results. M.L.N. designed the experiments, analyzed the data, performed bioinformatics analysis, and wrote the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The MS proteomics data have been deposited to the ProteomeXchange Consortium (www.proteomexchange.org) via the PRIDE (80) partner repository with the data set identifier PXD003700.