Research ArticleInflammation

Integrated in vivo multiomics analysis identifies p21-activated kinase signaling as a driver of colitis

See allHide authors and affiliations

Science Signaling  27 Feb 2018:
Vol. 11, Issue 519, eaan3580
DOI: 10.1126/scisignal.aan3580

Tracking inflammation in disease

Crohn’s disease and ulcerative colitis are forms of inflammatory bowel disease (IBD), which have limited therapeutic options. Thus, a better understanding of the molecular mechanisms underlying IBD is needed to develop new treatment strategies. Lyons et al. combined RNA microarray, total protein mass spectrometry, and phosphoprotein mass spectrometry analyses of tissues isolated from a mouse model of colitis. Integration of these data sets enabled tracking from gene expression to protein phosphorylation in individual animals over time. Computational analysis of the data identified discrepancies between transcriptomic and proteomic measurements and predicted that the kinase Pak1 mediated colonic inflammation. Treatment of mice with a pharmacological inhibitor of Pak1 ameliorated disease, highlighting the importance of proteomic measurement to the understanding of disease pathogenesis.


Inflammatory bowel disease (IBD) is a chronic disorder of the gastrointestinal tract that has limited treatment options. To gain insight into the pathogenesis of chronic colonic inflammation (colitis), we performed a multiomics analysis that integrated RNA microarray, total protein mass spectrometry (MS), and phosphoprotein MS measurements from a mouse model of the disease. Because we collected all three types of data from individual samples, we tracked information flow from RNA to protein to phosphoprotein and identified signaling molecules that were coordinately or discordantly regulated and pathways that had complex regulation in vivo. For example, the genes encoding acute-phase proteins were expressed in the liver, but the proteins were detected by MS in the colon during inflammation. We also ascertained the types of data that best described particular facets of chronic inflammation. Using gene set enrichment analysis and trans-omics coexpression network analysis, we found that each data set provided a distinct viewpoint on the molecular pathogenesis of colitis. Combining human transcriptomic data with the mouse multiomics data implicated increased p21-activated kinase (Pak) signaling as a driver of colitis. Chemical inhibition of Pak1 and Pak2 with FRAX597 suppressed active colitis in mice. These studies provide translational insights into the mechanisms contributing to colitis and identify Pak as a potential therapeutic target in IBD.


Inflammatory bowel disease (IBD), which includes Crohn’s disease (CD) and ulcerative colitis (UC), affects more than 5 million people worldwide. Sufferers experience various debilitating gastrointestinal symptoms that require medical and, eventually, surgical intervention. The ultimate target of medical treatment is mucosal healing, which improves outcome. Nevertheless, this goal remains elusive in many patients. Although there have been advances in therapeutics, the current treatment options for IBD are limited and include general immunomodulators and targeted biologics, such as antibodies targeting tumor necrosis factor–α (TNF-α), the α4β7 integrin, and the cytokines interleukin-12 (IL-12) and IL-23 (1). All of these therapies suffer from variable efficacy and nondurable responses, as well as a spectrum of negative side effects. A better understanding of the molecular pathogenesis of IBD would lead to new therapeutic strategies that could lead to more effective treatments.

Genetic and epidemiological studies in human patients (2, 3), as well as experimental studies in animal models (4), have identified numerous genetic and environmental risk factors for CD and UC but have not identified clear driver mutations that could lead to therapeutic opportunities. To account for the complexity in IBD etiology, researchers have taken transcriptomic (5, 6), proteomic (7, 8), metabolomic (9, 10), and metagenomic (11, 12) approaches in an attempt to understand global disease networks and to identify genes, proteins, and metabolites that may be involved in disease pathogenesis. Although these approaches have provided valuable insights, they have fallen short of identifying potential high-value therapeutic targets in IBD.

Here, we generated a multiomics data set in which transcriptomic, proteomic, and phosphoproteomic measurements were made from the individual colons of mice with and without colitis. We used this data set to understand the relationships between RNA abundance, protein abundance, and protein phosphorylation and to determine what each type of data reveals about gut inflammation. Because all three types of data were collected from each individual sample, we could identify discrepancies between transcriptomic and proteomic measurements; thus, we could predict posttranslational protein regulation and identify changes in gene expression that originated at distant organ sites. Finally, we performed coexpression network analysis to identify signaling pathways that were coordinately or uniquely dysregulated in the different data sets, and we computationally inferred kinase activation from global phosphoproteomic mass spectrometry (pMS) data by collating kinase substrate lists and by using them with the gene set enrichment algorithm (GSEA). These complementary computational approaches implicated p21-activated kinase (Pak) signaling as a potential driver of colitis. We validated the role of Pak signaling in a preclinical therapeutic study in mice, and analysis of gene expression data from humans indicated that this finding was applicable to IBD patients. Together, these studies provide a comprehensive view of dysregulated signaling in colitis and identify a previously unappreciated pathogenic signaling pathway that represents a viable therapeutic opportunity.


Collection of multiomics data from mouse colons

The initial goal of this study was to quantify global transcriptomic and proteomic changes that occurred during chronic colitis. Because this requires a large amount of starting material, we chose to use a mouse model of IBD, namely, the adoptive transfer mouse model of CD, which is known for high penetrance and relatively short latency (13). Rag1-null animals on a C57BL/6J genetic background were injected with 400,000 CD45RBhi naïve T cells or, as a negative control, 200,000 regulatory T cells (Tregs) from isogenic wild-type (WT) animals and then weighed biweekly and assessed for symptoms related to the onset of colitis, such as diarrhea and rectal prolapse. Animals were sacrificed after sustained weight loss of more than 1.5 g in 1 week, which was indicative of severe colitis. Control animals were sacrificed concomitantly. Upon sacrifice, 3-mm sections of tissue from the medial colon were removed and fixed for histological assessment (fig. S1). The remaining colon was opened longitudinally and approximately one-eighth was snap-frozen for microarray analysis, whereas the remaining matched tissue was snap-frozen for mass spectrometry (MS) analysis (fig. S2). With this tissue-processing strategy, we obtained RNA, total protein, and phosphorylated proteins (phosphoproteins) from an individual colon, enabling relative quantification among control and experimental animals.

After sample processing and data collection, we quantified 39,325 named (38,666 unique) RNA transcripts, 7951 proteins, and 3159 phosphopeptides representing 3325 unique phosphorylation sites on 1711 proteins (tables S1 to S3). Unsupervised hierarchical clustering indicated that each of the three measurements segregated the inflamed mice from the noninflamed mice (Fig. 1A). Of the 7951 proteins measured by MS (hereafter, we refer to total protein MS as MS), 7611 (96%) were represented in the RNA data set. RNA transcripts were measured for 1634 (95%) of the 1711 proteins in the pMS data set. Total protein MS data were obtained for 1474 (86%) of 1711 proteins measured by pMS, and 1415 species were measured in all three data sets.

Fig. 1 Multiomics analyses of murine colitis.

(A) Unsupervised clustering of microarray, mass spectrometry (MS), and phosphoproteomic mass spectrometry (pMS) data. Noninflamed (NI) samples are marked in blue in the dendrograms above the heat maps, whereas inflamed (Inf) samples are marked in red. (B) Probability distributions of Spearman correlations for each pairwise comparison between each data set. (C) Average fold changes in Lima1 (Eplin) RNA, protein, and phosphoprotein abundance between inflamed and noninflamed mouse colons. (D) Scatter plot of Eplin-Ser360 and total Eplin protein counts in individual samples. (E) Venn diagrams summarizing the unique and overlapping differential expression events between the RNA, MS, and pMS data sets. Species refers to RNAs, proteins, or phosphopeptides that were detected in each data set from the comparison. Expression events refer to the direct comparison between the data sets for a given species. (F) Cellular localization of representative phospho-signals that were increased in abundance in colitis. Left: The abundances in individual samples, as detected by MS, for Trim28 Ser473 (top) and Map3k3 Ser337 (bottom). **P < 0.01, ***P < 0.001 by unpaired t test. Middle and right: Immunohistochemical analysis of pTrim28 and pMap3k3 in noninflamed and inflamed colons. In all panels, data are from five control (noninflamed) samples and three inflamed samples.

Because we measured RNA, protein, and phosphoproteins from individual samples, we could perform one-to-one matched correlation of individual genes across measurements (Fig. 1B). The probability density functions for Spearman correlations showed a correlation landscape for the RNA-MS comparison that was distinct from those for the RNA-pMS and MS-pMS comparisons. Most RNA-MS gene pairs were positively correlated with each other, with only a few species showing an inverse correlation (Fig. 1B). The RNA-pMS probability density function showed a bimodal distribution, indicating that there were similarly sized subgroups for which there was negative correlation, no correlation, or positive correlation between RNA abundance and phosphopeptide abundance. The MS-pMS probability distribution was similar to the RNA-pMS distribution, but there were more positively correlated MS-pMS species than RNA-pMS species (Fig. 1B). We hypothesized that the inverse correlation between some MS-pMS species (higher MS to pMS) represented proteins that were marked for degradation by phosphorylation. Eplin (which is encoded by the Lima1 gene) has a ubiquitin-priming phosphorylation site and exemplified this type of regulation in our data. Lima1 transcripts were essentially unchanged (1.1-fold reduced) in inflamed versus noninflamed tissue; however, the abundance of Eplin protein was nearly twofold decreased (Fig. 1C). We detected a 3.3-fold increase in the phosphorylation of Eplin at Ser360 (Fig. 1, C and D), which targets the protein for ubiquitylation and degradation (14).

Differential RNA expression and protein and phosphoprotein abundance analysis

We determined the RNAs, proteins, and phosphopeptides that had statistically significantly different abundances in inflamed colons relative to those in noninflamed colons. Overall, 7752 of 38,666 RNA transcripts, 4443 of 7951 proteins, and 2346 of 3325 phosphopeptides had differential abundance (table S4). Of the 7611 species for which we had both RNA and MS data, 1858 showed changes in abundance that were similar (either both increased or both decreased); 1064 RNAs and 2401 proteins had differential abundance only in their respective data set; and the remaining 2288 species were not differentially changed (Fig. 1E). All 1345 proteins detected in the pMS and MS data had differential abundances in either one data set or both (Fig. 1E), whereas none of the 1634 RNAs with comeasured phosphopeptides in the pMS data were differentially expressed, resulting in limited differential abundance similarity between all three data sets (Fig. 1E). This comparison suggested that the pMS data were critical for exploring the molecular pathogenesis of colitis.

Because our goal was to characterize changes to the tissue-level signaling network during colitis, we isolated RNA and protein from the whole colon, which includes the epithelium, lamina propria (including immune, stromal, and vascular cells), and muscularis as the starting material for our RNA and protein data sets. The cellular representation of an inflamed colon is different from that of a normal colon, especially with respect to the influx of inflammatory cells (fig. S1). To explore whether the changes that we detected through transcriptomic and proteomic analyses reflected the influx of inflammatory cells, we used immunohistochemistry to analyze the cellular localization of signals that were enhanced in animals with colitis. We selected phosphorylation of Trim28 at Ser473 and Map3k3 at Ser337 because they were among the few sites that met the criteria of being increased during colitis and having phospho-specific antibodies that worked for immunohistochemistry. Both sites exhibited substantially greater phosphorylation during colitis (Fig. 1F); however, their distribution in the colon differed. Trim28 phosphorylation was increased in the colonic epithelium, whereas Map3k3 phosphorylation was increased in all components of the colonic environment (Fig. 1F). These observations indicated that the changes in RNA, protein, and phosphoprotein abundance that occur during colitis reflect both a major change in the cellular composition of the tissue and changes in the tissue-level signaling network.

Pathway enrichment analysis

Although these analyses described how the multiomics data sets related to one another, we wanted to determine whether the differences between inflamed and noninflamed samples in each data set represented similar pathways and high-level functional categories. We performed GSEA (15) on the 7611 genes represented in both the RNA and MS data sets to identify differentially regulated pathways between inflamed and noninflamed mice. At the RNA level, 19 pathways were positively enriched and 8 pathways were negatively enriched in inflamed mice (Fig. 2A and table S5). Most were the same as the 15 positively enriched and 10 negatively enriched pathways in inflamed mice at the total protein level, suggesting strong functional concordance between the transcriptomic and proteomic data (Fig. 2B and table S5). The finding that 14 of 20 positively enriched and 8 of 10 negatively enriched pathways were similarly regulated in inflamed mice was consistent with the Spearman correlation histogram of the RNA/MS data (Fig. 1B). Coordinately, positively enriched pathways included those involved in the inflammatory response and TNF-α signaling, as well as pathways controlling epithelial proliferation, such as E2F targets, KRAS signaling, and MYC targets (Fig. 2C). The “epithelial-to-mesenchymal transition” gene set was positively enriched at the RNA level and negatively enriched at the protein level, suggesting complex regulation of this particular pathway (Fig. 2C). This observation highlights the potential for RNA analysis to provide misleading information about the role of a particular pathway in driving a disease.

Fig. 2 Differential RNA expression, differential protein abundance, and pathway analysis.

(A) Heat maps of RNA and total protein measurements most strongly contributing to pathway enrichment scores (ES) of gene set enrichment algorithm (GSEA). (B) Venn diagrams summarizing the unique and overlapping positively and negatively correlated pathway enrichments in the RNA and MS data sets. (C) GSEA plots from the RNA and total protein MS data set.

Analysis of noncorrelated RNA and protein signals

Comparative analysis of the changes in RNA and protein abundance presents an opportunity to explore if and how each data set provides unique information on the regulatory events associated with colitis, especially when those data sets do not change in the same direction. For example, our GSEA analysis identified the “interferon alpha response” as positively enriched in the MS data set, but not in the RNA data set (table S5). To expand upon this observation, we examined the correlation between RNA and protein abundance for the 7611 species for which we collected both types of data. Although we observed general concordance in fold change differences (control versus colitis) between RNA and protein measurements, a small number of species were altered at the protein level but exhibited no change at the RNA level (Fig. 3A). To determine whether these discordant species were present in particular functional categories, we performed gene ontology (GO) enrichment analysis for all species that showed greater than 22-fold change in protein abundance and less than 20.75-fold change in RNA abundance. We found enrichment for genes involved in defense response (positively enriched in the MS data set) and extracellular matrix (negatively enriched in the MS data set). The extracellular matrix category of proteins was composed primarily of collagens (Fig. 3A, purple dots) and when we surveyed all of the collagens shared between the microarray and total MS data sets, we found that 12 of 17 collagens were reduced at least twofold in protein abundance but were unchanged or weakly increased in abundance at the RNA level (Fig. 3B). Moreover, we observed an increase in several extracellular matrix metalloproteinases (MMPs), notably MMP3, MMP7, MMP9, and MMP10, in the MS data set. All of these enzymes function to degrade extracellular matrix components, such as collagen and fibronectin (16). This illustrates how comparative RNA-protein abundance analysis can infer genes for which the corresponding protein is regulated by degradation.

Fig. 3 Differential regulation of RNA and protein.

(A) Scatter plot of the fold change (inflamed versus noninflamed) in RNA expression plotted against the fold change in protein abundance for species that were present in both data sets. Colored dots represent extracellular matrix proteins (purple), acute-phase proteins (red), and neutrophil proteins (orange), with the arrowheads indicating genes that were further investigated. (B) Collagen expression in the RNA data and abundance in the protein data. Differential abundance of matrix metalloproteinases (MMPs) and tissue inhibitor of matrix metalloproteinases (TIMPs) in the MS data are indicated in the heat map inset with a key. (C) Tissue expression patterns of acute-phase (red) and neutrophil (orange) transcripts. Each gene was normalized to a maximum of 1.0, and all of the genes from each category were averaged to generate bars. (D) Induction of acute-phase RNA in the liver during inflammation. Bars represent the ratio of liver expression to colon expression for Orm1 and Fga. Data are from two inflamed and two noninflamed animals. Assays were performed in duplicate. *P < 0.05 by unpaired t test. P = 0.085 for Orm1. (E) Loss of gene expression in colonic neutrophils. The abundances of the RNAs for Camp and Elane from bone marrow neutrophils relative to those in neutrophils isolated from colon were determined. The plot represents log-transformed data of two inflamed animals, normalized to the smallest expression value for each gene. For both genes, P < 0.05 by paired t test. norm., normalized. (F) Model depicting RNA expression (R) and protein abundance (P) for acute-phase proteins (APP) and neutrophil proteins (represented by a polymorphonuclear cell symbol) in the colon and distant organ sites.

Within the defense response GO category, we focused on the acute-phase proteins (Fig. 3A, red dots) and a set that are specific to neutrophils (Fig. 3A, orange dots). The liver coordinates the acute-phase response, and therefore, we hypothesized that these genes may be transcribed and translated in the liver and then their encoded proteins might travel to the colon through the blood stream or possibly through bile. According to the publicly available mouse gene atlas (17), all are highly expressed in the liver, with minimal expression in other tissues (Fig. 3C). To confirm this expression pattern in our experimental model, we assayed the RNA abundance of two of the genes, Fga and Orm1, in the colons and livers of animals with and without colitis. For both genes, there was at least 5000-fold higher expression in the liver than in the colon (Fig. 3D). In inflamed animals, this ratio increased to ~12- to 50,000-fold (Fig. 3D), indicating that the transcriptional events that ultimately resulted in increased colonic abundance of acute-phase proteins most likely originated in the liver.

We selected the five neutrophil-specific genes with discordant RNA and protein changes (Fig. 3A, orange dots) because this group of genes was unexpected: The adoptive transfer model of IBD exhibits strong neutrophil recruitment to the colon, and there were many other neutrophil-specific genes that were increased in abundance at the RNA and protein levels in animals with colitis. According to the mouse gene atlas (17), these five genes exhibit bone marrow–specific expression in normal mice (Fig. 3C). However, these genes belong to other groups of genes that show decreasing RNA expression as neutrophils travel from the bone marrow to the circulating blood (18). To assess whether this expression pattern underlies the discordant colonic RNA and protein abundance in our model, we isolated neutrophils by flow cytometry from the bone marrow and colons of animals with colitis. We assessed the abundance of the transcripts for Camp and Elane in neutrophils from these two tissues, which revealed an average of 2257- and 6142-fold higher expression in the bone marrow–derived neutrophils compared to colonic neutrophils from inflamed animals (Fig. 3E). These data suggest that neutrophils transcribe and translate these proteins in the bone marrow and that the RNA message is degraded by the time the neutrophils reach the site of inflammation, leaving only the protein. Together, these examples demonstrate how transcriptomic and proteomic data covering the same genes can be used to infer regulation of expression in distant organ sites. In both cases, the mRNA and protein were initially expressed in a distant organ but carried to the colon through the blood (acute-phase proteins) or within cells recruited to the colon (neutrophil proteins) (Fig. 3F).

Trans-omics coexpression network analysis

To gain further insight into the relative information content of the three data sets, we assessed the co-abundance changes of the 1415 species (1429 RNA transcripts, 1452 proteins, and 3080 phosphopeptides) that were present in all data sets. Spearman correlation coefficients were calculated for all pairs of variables within each data set and dichotomized. Correlations with values >0.9 or <−0.9 were set to 1, and all others were set to 0. We visualized both the correlation networks and the two-step generalized topological overlap matrix (GTOM2) to explore the correlation landscape, modularity, and existence of highly correlated subsets of variables of each data set (Fig. 4, A and B). GTOM2 relates the interconnectedness of two genes by computing the number of shared two-step network neighbors. Clustering the GTOM2 matrix facilitates the identification of groups of highly correlated genes called modules (19, 20). At the RNA level, we noted several small clusters of highly correlated transcripts, whereas the protein abundance and phosphopeptide abundance changes were less modular (Fig. 4B). To estimate the true number of clusters in the RNA, MS, and pMS data sets, we computed the gap statistic for up to 15 clusters, by k-means clustering, of the GTOM2 matrix (21). The gap statistic measures the within-cluster variation of each cluster in a data set compared to a null distribution of expected within-cluster variation. The cluster at which the gap curve begins to level off or decrease as a function of the number of clusters indicates the number of clusters in each data set. On the basis of the gap curves and clustered GTOM2 matrices, we estimated that there were eight RNA clusters, five MS clusters, and five pMS clusters in our data (Fig. 4C).

Fig. 4 Coexpression and co-abundance network landscapes of RNA, MS, and pMS measurements.

(A) Correlation networks for RNA, MS, and pMS data sets. Nodes indicate genes or proteins and edges connect two genes or proteins if the Spearman correlation between the expression of two genes or abundance of the two proteins is greater than 0.9 or less than −0.9. (B) Two-step generalized topological overlap matrices (GTOM2) of the RNA, MS, and pMS data sets clustered by unsupervised hierarchical clustering. Correlations greater than 0.9 or less than −0.9 were set to 1 (yellow), and all others were set to 0 (blue). Square regions indicate highly connected clusters of genes or proteins. (C) Plot of the gap statistic versus the number of clusters in each data set. Clustering cutoff points are marked with a star for each data set based on the gap statistic and GTOM2 topology. (D) Network visualization of module overlap. Nodes indicate particular modules; edges are present if there was statistically significant overlap in genes or proteins between the two modules (Fisher exact test, P < 0.05; FDR, q < 0.25). (E) YourCrosstalker network module for pMS module 3. Nodes are colored by differential phosphorylation status in inflamed relative to uninflamed colons (red, hyperphosphorylated; blue, dephosphorylated; Wilcoxon-Mann-Whitney, P < 0.05; FDR, q < 0.25), and edges are colored by pathway membership of the interaction. Gray edges indicate protein-protein interactions that exist but are not part of an enriched pathway in the subnetwork. Pathways with higher statistical significance determine the interaction pathway association for interactions in multiple pathways. Striped nodes were recruited by the algorithm during the random walk procedure as significantly traversed “crosstalker” nodes. (F) YourCrosstalker network for MS module 5. Nodes are colored by total protein differential abundance status in inflamed relative to uninflamed mice (red, increased abundance; blue, decreased abundance; Wilcoxon-Mann-Whitney, P < 0.05; FDR, q < 0.25). Edges and striped nodes are defined as in (E).

We extracted the genes in each of these clusters (table S6) and compared the gene lists between clusters (table S7). We visualized the statistically significant overlap between modules in a network diagram and found that most RNA modules shared a significant overlap with multiple MS and pMS modules (Fig. 4D). Although some MS and pMS modules overlapped, the connections were sparser and the partially isolated modules (RNA_3, MS_5, and pMS_4) might present interesting cases of specific trans-omics regulation in colitis. To interpret the function of the individual modules in each data set and to examine function across data sets, we used the network analysis tool YourCrosstalker (22, 23) to identify the enriched pathways, filter out network edges that were not associated with protein-protein interactions (PPIs), and identify topologically relevant genes that were not in our original cluster gene lists. We performed the YourCrosstalker random walk on the STRING PPI network and performed pathway enrichment with a combination of Reactome and the National Cancer Institute Pathway Interaction Database (NCI-PID) (2427).

The correlation structure of module RNA_3 was selectively associated with the phosphorylation modules pMS_3 and pMS_4. The 54 genes in RNA_3 showed enrichment only for nerve growth factor (NGF) signaling. The 121 genes in pMS_4 showed enrichment only for processing of capped intron-containing pre-mRNAs. In contrast, we found a greater diversity of pathway enrichments in the YourCrosstalker modules of the 397 genes in pMS_3 (Fig. 4E). Here, we found a central network that connected multiple vascular endothelial growth factor (Vegf) signaling, p38 signaling, cell cycle, and pre-mRNA processing pathways (Fig. 4E). We analyzed the 221 proteins in the module MS_5, a protein module that did not significantly associate with any pMS modules but was significantly connected to RNA modules 2, 4, 5, and 6 (Fig. 4F). Similar pathways were implicated by the genes in MS_5 as in pMS_3. In particular, we noted the shared MS_5–pMS_3 pathways of Vegf, pre-RNA processing, cell cycle, and p38 signaling, as well as the presence of NGF signaling shared with RNA_3 (Fig. 4F). This observation suggests that the signaling network at the intersection of these pathways plays a role in colitis and that the components of this network are dysregulated in distinct ways in the different omics data sets. This discovery was enabled by the integrated picture provided by our multiomics data set and trans-omics analysis approach.

Inferring kinase activity from phosphoproteomic data

Our overarching goal was to understand how dysregulated signaling contributed to the pathogenesis of colitis and to determine whether there were signaling pathways that could represent novel therapeutic targets for IBD. To this end, we reasoned that our pMS data set would most accurately reflect therapeutically tractable changes in signal transduction that occur during colitis. Our pMS analysis identified 2346 differentially phosphorylated peptides in animals with colitis, and although 80% of the phosphopeptides had been previously identified, only 1.68% were functionally annotated (fig. S3). In essence, the biological importance of most of the measured phosphorylation events is presently unknown. We sought to overcome this deficiency in previous knowledge by using known kinase-substrate relationships to computationally infer kinase activity from the pMS data. Lists of kinase-substrate relationships for 348 kinases across a range of kinase families were curated from PhosphoSitePlus (table S8) (28). These were entered as “gene” lists into the GSEA algorithm, enabling us to perform a “kinase activation” analysis from the pMS data. Although many kinases showed substrate enrichment in inflamed colons (table S9 and Fig. 5A), only Pak1, which had an uncertain role in IBD pathogenesis, reached statistical significance as defined by a false discovery rate (FDR) of <0.25 (Fig. 5, A and B). Both Pak1 and Pak2 were part of several of the YourCrosstalker modules that we identified in our coexpression network analysis (Fig. 4, E and F, and table S6), providing a cross validation of the different computational analyses.

Fig. 5 Inferring kinase activity from pMS measurements.

(A) Volcano plot of normalized enrichment score (NES) versus false discovery rate (FDR). Kinases with positive or negative enrichment and an FDR <0.25 are specified. (B) Heat maps of phosphopeptides corresponding to known kinase substrates from noninflamed (NI) and inflamed (Inf) animals. The kinase is indicated to the left of each set of substrates together with the log2 differential abundance (inflamed versus noninflamed) for RNA (left box), protein (right box), and phosphorylation (circles). NES is specified for each kinase. All of the kinases shown had FDR <0.25 and are predicted to be either increased (positive NES) or decreased (negative NES) in abundance in colitis. (C) Validation of Mapk14 and Gsk3α/β phosphorylation in colon samples from inflamed and noninflamed animals. Fluorescence intensity (FI) was measured using Luminex assays specific to each phosphorylation site. Data are from samples from 12 individual noninflamed animals and 25 individual inflamed animals. *P < 0.05 and **P < 0.01 by unpaired t test.

In addition to the predicted activation of Pak1 during colitis, GSEA predicted that six kinases—casein kinase 2A1 (Csnk2a1), Gskα and β, p38α (Mapk14), casein kinase 1D (Csnk1d), and Dyrk1α—were less active in the animals with colitis than in the control animals (Fig. 5, A and B). Of these kinases, only Mapk14, one of the stress-induced mitogen-activated protein kinases (MAPKs), has been linked to inflammation (29). To confirm the predicted kinase activation, we measured the phosphorylation of sites on Gsk3α/β and Mapk14 that regulate their activation states. We performed a Luminex-based phosphorylation assay on samples from an independent cohort of animals with adoptive transfer-induced colitis. This analysis revealed that animals with colitis had decreased phosphorylation of Mapk14 at activating sites (Thr180/Tyr182) and increased phosphorylation of Gsk3α/β at inhibitory sites (Ser21/Ser9) (Fig. 5C). Hence, this experiment validated the reduction in Mapk14 and Gsk3α/β activity as predicted by GSEA of the pMS data set.

Evidence of Pak signaling in human IBD patients

An important consideration with any experimental model system is the generalizability of the findings of that system to the human in vivo context. To compare our mouse data to human, we obtained a publicly available gene expression data set of inflamed (n = 12) and uninflamed (n = 16) IBD patient colonic biopsies (6). Our aim was to assess global concordance between human IBD differential gene expression and, in particular, whether the identification of increased Pak1 and Pak2 activity was conserved in patients. We performed differential expression analysis on the entire human data set and compared the differential abundance of human RNA to mouse RNA, MS, and pMS. There were 1708 genes differentially expressed between inflamed and uninflamed human samples. All 1708 human RNAs were represented in the mouse RNA data, 1040 genes were represented in the mouse MS data, and 529 genes were found in the mouse pMS data. Of the 2710 homologous mouse RNA transcripts that were differentially expressed, 751 were also differentially expressed in humans (Fig. 6A). Of the 2608 homologous mouse proteins with differential abundance, 613 were also represented by differentially expressed transcripts in the human RNA data (Fig. 6A). Finally, of the 867 homologous differentially phosphorylated proteins measured in the mouse pMS data, 197 were differentially expressed in the human RNA data (Fig. 6A). In general, more molecular species (RNA, MS, and pMS) tended to exhibit differential abundance in the mouse relative to the human RNA data. The larger set of differentially abundant mouse RNA, MS, and pMS species represented 43.9, 58.9, and 37.2% of the possible homologous differential expression events. This suggests that the mouse MS data set offers a reasonable experimental representation of the human disease context. Furthermore, similar to the generally weak correspondence between RNA and phosphopeptide differential activity in the mouse, we observed that the products of genes differentially expressed in the human tended to not be differentially phosphorylated in the mouse.

Fig. 6 Overlap between mouse model omics data and human inflammatory bowel disease biopsy transcripts.

(A) Venn diagrams representing the differential expression analysis of human inflammatory bowel disease (IBD) colonic biopsies in inflamed and uninflamed phenotypes (Wilcoxon-Mann-Whitney, P < 0.05; FDR, q < 0.25) compared to differentially expressed RNA, protein abundance, and phosphopeptide abundance between inflamed and uninflamed mouse colons. (B) Differentially expressed genes in the PAK signaling network neighborhood in human IBD. Genes are colored by differential expression direction (red, increased; blue, decreased) in inflamed relative to uninflamed human colonic biopsies. (C) Assessment of the overlap between the genes regulated by kinases that were statistically significantly associated with colitis in the mouse and human genes differentially expressed in the kinase-regulated network (hypergeometric test, P < 0.05).

We next sought to determine the extent to which the Pak1 and Pak2 signaling network neighborhood was differentially active in human IBD. We assembled a human PPI network by querying PAK1 and PAK2 in the Pathway Commons database (26). The PAK network neighborhood contained 529 unique genes and 3666 interactions. We filtered the human expression data and PAK network for overlapping genes. After filtering for expression array coverage, the final PAK neighborhood contained 431 genes and 2534 interactions (fig. S4). When we overlaid the differentially expressed genes from human IBD patients onto the Pak network neighborhood, we found that 95 genes were differentially expressed (Fig. 6B) and the largest connected network of these genes implicated PAK2, STAT1, and STAT3 as key hub nodes. This analysis suggested that the inferred PAK signaling mechanism from the mouse model translates to the human disease and may be a viable therapeutic target in IBD.

To assess the relative importance of PAK signaling compared to that of other kinases implicated by phosphopeptide-based GSEA (Fig. 5), we repeated the analysis of constructing a network neighborhood of the kinase and searching for differentially expressed human genes. A hypergeometric test was applied to assess the statistical significance of the overlap between differentially expressed human genes and the network neighborhood of the kinases MAPK14, CSNK2A1, GSK3A, and GSK3B. The PAK1 neighborhood contained 95 differentially expressed genes and was the most significant kinase (P < 10−15) (Fig. 6C). Although MAPK14, GSK3A, and GSK3B also had significantly active network neighborhoods, the PAK1 neighborhood was statistically the most significant of the kinases (Fig. 6C).

Validation of Pak as a therapeutic target in colitis

With multiple lines of evidence pointing toward dysregulation of Pak signaling in colitis, we investigated whether pathway activation was a cause or consequence of the disease. First, we sought to validate, in an independent cohort of animals, that Pak was activated during colitis. Phosphorylation of Pak1/2 on Ser144/Ser141 is associated with kinase activation (30), and Western blotting analysis of colonic protein lysates from the new cohort confirmed increased phosphorylation in inflamed colons (Fig. 7A). This observation validated the MS analysis and kinase inference that predicted Pak1 activation based on increased phosphorylation of its substrates in inflamed colons (Fig. 5, A and B). Note that Pak2 was not predicted by GSEA to be activated because not enough of its known substrates (table S8) were represented in the pMS data set. We also found that Pak1 autophosphorylation was increased in animals with acute colitis induced with dextran sodium sulfate, indicating that Pak activation is not specific to the adoptive transfer mouse model of colitis (fig. S5).

Fig. 7 Validation of Pak as a therapeutic target in colitis.

(A) Validation of Pak activation in the colons of animals with induced colitis (inflamed). Phosphorylated Pak1 and Pak2 were detected by Western blotting. Each lane represents a sample from a different animal. (B) Inhibition of Pak activity by FRAX597. Merlin phosphorylation on Ser518, a Pak substrate, was assessed by Western blotting analysis of colon samples from inflamed animals that had been treated for 24 hours with FRAX597 (100 mg/kg single dose) or polyethylene glycol and polyvinylpyrrolidone (vehicle). Gapdh, glyceraldehyde phosphate dehydrogenase. (C) Colonoscopic monitoring of colitis. Colonoscopy images of a representative mouse with adoptive transfer–induced colitis. The mouse was imaged before and 7 days after treatment with FRAX597 (100 mg/kg per day). (D) Histological effects of FRAX597 on the colons of mice with induced colitis. (E) Immunological effects of FRAX597 on the colons of mice with induced colitis. The percentages of macrophages and neutrophils were quantified by flow cytometry in the colons of inflamed animals after 7 days of treatment with FRAX597 or vehicle. *P < 0.05 by one-tailed Mann-Whitney test. Data are from three vehicle-treated animals and four FRAX597-treated animals.

Next, we investigated whether inhibition of Pak signaling could suppress colitis. We chose to focus on animals that already had severe inflammation because our goal was to determine whether inhibition of the pathway could be effective in patients with active disease. We found that colons from animals with colitis that were treated acutely with FRAX597, a Pak1/2 inhibitor (31), exhibited reduced phosphorylation of the Pak1/2 substrate Merlin on Ser518 (Fig. 7B). Next, we treated sick animals with FRAX597 for 7 days (100 mg/kg per day) and assessed their phenotype by endoscopic monitoring. After treatment with the Pak inhibitor FRAX597, the animals exhibited decreased mucosal thickness, return of both small and large vessel visible vascular markings, and resolution of contact friability and bleeding (Fig. 7C), all of which are signs that the active colitis had been diminished after Pak inhibition.

At the histologic level, animals treated with FRAX597 exhibited reduced immune cell infiltration into their colons and a return to more normal epithelial crypt morphology (Fig. 7D). The colitis that arises in the T cell transfer model is characterized by increased numbers of colonic macrophages and neutrophils in the lamina propria (32). Pak has a direct role in neutrophil migration by promoting chemotaxis (33). To determine whether Pak inhibition altered the immune milieu in the colon, we used fluorescence-activated cell sorting (FACS) to quantify immune cells from the colon. Consistent with the reduction in the clinical and histologic presentation of colitis, we found that animals treated with FRAX597 had reduced numbers of colonic macrophages and neutrophils relative to those of vehicle-treated controls (Fig. 7E). Together, our computational and experimental studies implicate Pak signaling as a driver of chronic inflammation in the colon.


Here, we present a case in which RNA microarray, total proteomic, and phosphoproteomic measurements have been analyzed and integrated in matched tissue from single animals in a mouse model of colitis. This “all measurements from a single animal” approach enabled us to determine how changes in gene expression are carried through to protein expression and modification. We found that genes that were differentially expressed at the RNA level showed similar patterns of differential regulation in the MS data set but that differential expression at the RNA level did not predict protein phosphorylation status. Furthermore, although most genes differentially expressed at the total protein level were also differentially phosphorylated at the phosphoprotein level, the pMS data contained many differentially expressed phosphopeptides that were unchanged at the RNA and total protein levels (Fig. 1E). Although there are typically a greater number of molecular species measured through transcriptomics, our finding demonstrates that there are additional layers of molecular regulation that are not represented within transcriptomic data sets, underscoring the importance of proteomic measurement for understanding disease pathogenesis at the molecular level.

We observed stronger concordance between the RNA and MS data at the pathway level than at the single gene level (Fig. 2B). The enriched pathways pointed to several dysregulated signaling pathways, such as oxidative phosphorylation and signaling through E2F, KRAS, and MYC, that might provide some therapeutic targets in IBD. Work by Bar et al. (34) demonstrated that mice with dextran sodium sulfate (DSS)– and trinitrobenzene sulfonic acid (TNBS)–induced colitis are protected against more severe symptoms when oxidative phosphorylation is more active. Their work suggests that increasing the activity of this pathway could reduce inflammation and crypt formation in the intestinal epithelium through enhanced nuclear factor κB (NF-κB) signaling (34). Because our analysis identified a link to oxidative phosphorylation in the T cell transfer model of colitis, it appears that the anti-inflammatory activity of this pathway is conserved among the DSS, TNBS, and adoptive transfer models.

Furthermore, whereas the RNA and MS measurements exhibited conserved patterns of expression and higher-order process enrichment, there were also many proteins that were differentially regulated in inflamed versus noninflamed tissue but had unchanged RNA expression. Using GO enrichment analysis, we identified differential regulation in defense response and extracellular matrix. This led us to infer posttranslational regulation of collagens by MMPs (Fig. 3B). In addition, we found a subset of neutrophil and acute-phase proteins that were transcribed and translated in other tissues (bone marrow and liver) and transported to the colon in the blood stream or through infiltrating immune cells (Fig. 3F). These analyses leveraged the transcriptomic and proteomic data to produce hypotheses regarding organismal scale gene and protein regulation that could not have been made with either data set alone, underscoring the value of multiomics measurement. Only multiomics analysis of an intact organism could identify these instances of physiologic regulation of protein expression.

The collection of multiomics data enabled us to perform trans-omics coexpression network analysis to examine the correlation structure between all pairs of RNA transcripts, proteins, and phosphopeptides measured in all three data sets (Fig. 4). By clustering the data sets into modules and performing targeted network analysis on the modules with YourCrosstalker, we were able to identify important functional commonalities between the phosphopeptide module pMS_3 and the total protein module MS_5. We found that pMS_3 associated with all eight RNA modules and was characterized by an intriguing core network of hyperphosphorylated proteins in the RNA metabolism and pre-RNA processing pathways coupled to a dephosphorylated network of Vegf signaling. The MS_5 module did not associate with any pMS modules, and yet it was characterized by similar coupling of pre-RNA processing pathway proteins to Vegf signaling proteins through the cell cycle signaling pathway (Fig. 4, E and F). Although the characteristic proteins of MS_5 and pMS_3 did not overlap, the same signaling pathways and core network architecture were identified by YourCrosstalker, suggesting an important role for these pathways in colitis. RNA processing and cell cycle are general terms, and it is difficult to ascertain specific mechanisms potentially involved in colitis, but Vegf plays a role in angiogenesis and lymphangiogenesis during colitis (35). Vegf therapy is beneficial in mouse models of the disease (36, 37). By extension, Vegf signaling in experimental colitis is likely to be a result of inflammation as the tissue attempts to repair itself.

The primary goal of this study was to use the multidimensional data set to identify new drivers of colitis, and we noted that the pMS analysis provided additional mechanistic insight that was not revealed by the RNA or protein expression studies. Because most of the individual sites identified by pMS were not functionally annotated, we reasoned that phosphoproteomic data coupled with previous knowledge of kinase substrates would enable computational inference of kinase activity. Public databases and software packages provide information on kinase-substrate interactions, kinase recognition motifs, and kinase-substrate predictions, and various algorithms have been used to determine a kinase activity metric based on these relationships. For example, and similar to our analysis, Drake et al. (38) compiled substrate sets and used an algorithm analogous to GSEA to quantify enrichment. Here, GSEA predicted one significant positively enriched and six significant negatively enriched kinases from our pMS data set (Fig. 5A), most of which had unknown or poorly characterized roles in IBD. The activation or inhibition of several of these predicted kinases was confirmed through the measurement of regulatory phosphorylation sites on those kinases (Figs. 5C and 7A). Pak1 was of particular interest because our enrichment analysis indicated that it was activated during colitis, suggesting that inhibition could be a therapeutic strategy. Although Pak2 was not implicated by GSEA, Western blotting and YourCrosstalker network analyses revealed that it is also activated in animals with colonic inflammation and in human patients (Figs. 4, E and F, and 6B). Inhibition of Pak1/2 with FRAX597 suppressed inflammation in animals with active colitis, suggesting that this pathway plays an active role in the pathogenesis of the disease (Fig. 7, C to E). Thus, we demonstrated that chemical inhibition of Pak signaling can revert inflammatory disease in the colon.

Pak1 and Pak2 are members of the family of group 1 p21-activated kinases that regulate inflammatory responses, in part by stimulating assembly of the reduced nicotinamide adenine dinucleotide phosphate oxidase complex in neutrophils (39). Pak1 was previously reported to be increased in abundance in epithelial cells during colitis and is thought to be regulated by mechanistic target of rapamycin (mTOR) signaling (40). Our YourCrosstalker network analysis identified the mTOR pathway as being activated in the T cell transfer model (Fig. 4E), although our GSEA analysis failed to identify enrichment for mTOR substrates in the pMS data from animals with colitis (table S9). Hence, the mechanism of activation of Pak1 in the T cell transfer model is not clear. Nevertheless, the effect of Pak inhibition validates the prediction made by our computational modeling approaches that Pak plays a critical role in the pathogenesis of colitis. In sum, our analyses demonstrate the added value of multiomics measurements by showing how different molecular species in each data set may be acting on similar pathways in distinct ways. We used multiomics comparisons to obtain mechanistic insight into the pathogenesis of chronic inflammation in the colon, in particular, identifying Pak signaling as a therapeutic target. This work highlights the power of analyzing the global proteome and phosphoproteome to uncover dysregulated signaling pathways that are not revealed by transcriptomic studies alone.


T cell transfer model of colitis

T cell transfer was performed according to established methods (13). Briefly, splenocytes were isolated from WT C57BL/6J mice (Jackson Laboratory) and depleted for red blood cells by treatment with ACK lysis buffer. CD4+ T cells were enriched using a Dynal CD4 untouched kit (Thermo Fisher Scientific). Naïve T cells (CD4+CD45RBhi) and Tregs cells (CD4+CD25+) were isolated by FACS. Naïve T cells (400,000) or Tregs (200,000) in phosphate-buffered saline were injected intraperitoneally into C57BL/6J Rag1-null mice (Jackson Laboratory). Recipients were weighed biweekly. Animals treated with naïve T cells were sacrificed after sustained weight loss of 1.5 g for 1 week. Mice injected with Tregs were sacrificed at these time points. Upon sacrifice, the medial colon was formalin-fixed for histology and then matched tissue was snap-frozen for microarray analysis and MS analysis (fig. S2). All animal work was approved by the Institutional Care and Use Committees of Massachusetts General Hospital and Beth Israel Deaconess Medical Center.

Microarray analysis

RNA was isolated from snap-frozen tissue with a Qiagen RNeasy microarray tissue mini kit (Qiagen, 73304). RNA expression was quantified on Affymetrix Mouse Transcriptome 1.0 Arrays, and data were processed using the Affymetrix Expression Console software. Subsequent analysis was performed on named transcripts. Microarray data were submitted to Gene Expression Omnibus (GEO; accession no. GSE95705).

Protein digestion and tandem mass tag labeling for MS

Excised colon tissue was resuspended in mammalian cell lysis buffer [75 mM NaCl, 50 mM Hepes (pH 8.5), 10 mM sodium pyrophosphate, 10 mM NaF, 10 mM β-glycerophosphate, 10 mM sodium orthovanadate, 1 mM PMSF (phenylmethylsulfonyl fluoride), 3% SDS, and complete mammalian protease inhibitor tablet (Roche)]. Suspensions were mixed with zirconium oxide beads (1-mm diameter) and lysed on a mini bead beater (BioSpec) four times for 45 s, cooling the sample in between. Beads were removed, the lysate was centrifuged at 15,000g for 5 min at 4°C, and insoluble debris was discarded. Ditheiothreitol was used to reduce disulfide bonds, and free thiols were alkylated with iodoacetamide as described previously (41). Reduced and alkylated proteins were then precipitated according to the methanol/chloroform method as described previously (42). Precipitated proteins were reconstituted in 300 μl of 1 M urea in 50 mM Hepes (pH 8.5). Vortexing and sonication were used to aid solubility. Proteins were then digested in a two-step process, first with 3 μg of endoproteinase LysC (Wako) for 17 hours at room temperature and then with 3 μg of sequencing-grade trypsin (Promega) for 6 hours at 37°C. The digest was acidified with trifluoroacetic acid (TFA). Peptides were desalted over Sep-Pak C18 solid-phase extraction (SPE) cartridges (Waters). The peptide concentration was determined by bicinchoninic acid assay (Thermo Fisher Scientific), and a maximum of 50 μg of peptides was aliquoted, dried under vacuum, and stored at −80°C before labeling with tandem mass tag (TMT) reagents. Peptides were labeled with 10-plex TMT reagents (Thermo Fisher Scientific). TMT reagents were suspended in dry acetonitrile (43) at a concentration of 20 μg/μl. Dried peptides were resuspended in 30% dry acetonitrile (ACN) in 200 mM Hepes (pH 8.5), and 5 μl of the appropriate TMT reagent was added to the sample, which was incubated at room temperature for 1 hour. The reaction was then quenched by the addition of 6 μl of 5% (w/v) hydroxylamine in 200 mM Hepes (pH 8.5) and incubation for 15 min at room temperature. The solutions were acidified by the addition of 50 μl of 1 % TFA, combined into one sample, and desalted.

Basic pH reversed-phase liquid chromatography sample fractionation

We used basic pH reversed-phase liquid chromatography sample fractionation (bRPLC) to perform sample fractionation with concatenated fraction combining. Briefly, samples were resuspended in 5% formic acid (FA)/5% ACN and separated over a 4.6-mm × 250-mm Zorbax Extend-C18 column (5 μm, 80 Å, Agilent Technologies) on an Agilent 1260 high-performance liquid chromatography (HPLC) system outfitted with a fraction collector, degasser, and variable wavelength detector. A two-buffer system (buffer A: 5% ACN, 10 mM ammonium bicarbonate; buffer B: 90% ACN, 10 mM ammonium bicarbonate) was used for separation, with a 20 to 35% gradient of buffer B over 60 min at a flow rate of 0.5 ml/min. A total of 96 fractions were collected, which were combined in a total of 24 fractions. The combined fractions were dried under vacuum, reconstituted with 8 μl of 5% FA/5% ACN, 3 μl of which was analyzed by LC-MS2/MS3.

Phosphopeptide enrichment

For each sample, 450 μg of total peptides was subjected to phosphopeptide enrichment using a 4:1 ratio of titanium dioxide beads/peptide (w/w). Peptides were resuspended in 2 M lactic acid in 50% ACN and added to 1.8 mg of titanium dioxide beads. The mixture was shaken gently for 1 hour. Beads were collected by centrifugation and washed three times with 2 M lactic acid in 50% ACN and three times with 50% ACN/0.1% TFA. Phosphopeptides were eluted with 2 × 200 μl of 50 mM KH2PO4 (pH 10) and acidified with 1% TFA. Eluted phosphopeptides were desalted, lyophilized, and labeled with 2 μl of 10-plex TMT reagents 127n-130c, as described earlier. The combined sample was enriched for phosphotyrosine-containing peptides using phosphotyrosine antibody–conjugated beads (Cell Signaling Technology) according to the manufacturer’s protocol. Unbound peptides (phosphoserine and phosphothreonine peptides) were desalted, lyophilized, and fractionated by bRPLC using a gradient of 5 to 28% buffer B. A total of 96 fractions were collected, and the fractions were combined into 12 fractions. Bound peptides (phosphotyrosine peptides) were eluted and desalted. All 12 fractions were resuspended in 5% ACN/5% formic acid and analyzed on an Orbitrap Fusion mass spectrometer using LC-MS2/MS3 for the identification and quantification of the phosphopeptides.

MS data acquisition and analysis

TMT-labeled peptides were subjected to multiplexed quantitative proteomics analysis on an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific) coupled to an EASY-nLC 1000 integrated autosampler and HPLC pump system. Peptides were separated over a 100-μm inner diameter microcapillary column, packed in-house with 0.5 cm of Magic C4 resin (5 μm, 100 Å, Michrom Bioresources), followed by 0.5 cm of Maccel C18 resin (3 μm, 200Å, Nest Group), followed by 29 cm of GP-C18 resin (1.8 μm, 120 Å, Sepax Technologies). Samples were eluted over 165 min at a flow rate of 300 nl/min over a gradient of 6 to 25% ACN/0.125% formic acid. TMT-labeled peptides were identified using MS2 spectra and quantified using a MultiNotch [simultaneous precursor selection (SPS)] MS3 method (41, 44) in a data-dependent mode. Each scan sequence began with acquisition of a full MS spectrum (MS1) acquired in the Orbitrap [mass/charge ratio (m/z) range, 500 to 1200; resolution, 60,000; automatic gain control (AGC)target, 5 × 105; maximum injection time, 100 ms]. From this spectrum, the 10 highest-intensity peptide ions were subjected to MS2 analysis, where acquisition time was optimized in an automated fashion (top speed, 5 s). Peptides were fragmented by collision-induced dissociation (CID) (normalized collision energy, 30%), and low-resolution MS2 scans were performed in the linear ion trap (quadrupole isolation width, 0.5 Th; AGC target, 1 × 104; maximum injection time, 35 ms). From each MS2 spectrum, the 10 highest-intensity fragment ions were selected for SPS MS3 analysis. Fragment ions were restricted to an m/z range of 400 to 2000, an m/z range of −40 to +15 around the precursor peptide ion m/z was excluded from selecting fragment ions, and “TMT” was selected for isobaric tag loss exclusion settings. This group of MS2 fragment ions was further fragmented by higher-energy collisional dissociation (HCD) (normalized collision energy, 50%), and high-resolution MS3 scans were performed in the Orbitrap (resolution, 60,000; AGC target, 5 × 104; maximum injection time, 250 ms). When analyzing phosphopeptide samples, two MS2 spectra were acquired per peptide: a 15,000 resolution spectrum in the Orbitrap upon HCD fragmentation (normalized collision energy, 40%) and a low-resolution CID-MS2 spectrum, as described earlier. Precursor ion selection for MS3 spectra was done on the basis of the low-resolution MS2 spectral data using the top 10 intensity fragment ions. Data analysis was performed on an in-house, SEQUEST-based software platform (45, 46). Raw files were converted into the mzXML format using a modified version of ReAdW.exe. Peptide identification was performed as reported previously (47), searching against a protein sequence database containing all protein sequences in the mouse open reading frame database (downloaded, 1 January 2014), as well as that of known contaminants. For phosphopeptide data, high- and low-resolution spectra were annotated into two separate searches and subsequently combined. Phosphorylation of serine, threonine, and tyrosine residues (79.966331 Da) was set as a variable modification, and up to three modifications were allowed. Peptides were quantified on the basis of the TMT reported ion intensities in the collected MS3 spectra, as reported previously (47, 48). Quantified peptides were required to have a summed signal-to-noise value greater than 386 and an isolation specificity greater than 0.75 (41). TMT intensities for all peptides assigned to a protein were summed for protein quantification. Both protein and phosphopeptide quantitative data were normalized in a two-step procedure. First, the average intensity of each species (protein or phosphopeptide) was calculated and normalized to the median of all of these average intensities. Second, to account for any mixing errors, the intensity of each species was normalized to the ratio of the median intensity for a given TMT channel to the median of all species intensities. Total protein and phosphoprotein MS data sets were deposited into the Mass Spectrometry Interactive Virtual Environment (MassIVE, accession no. MSV000081198).

Reverse transcription polymerase chain reaction confirmation of acute-phase and neutrophil genes

To confirm distant tissue expression, mice were subjected to T cell transfer–induced colitis as described earlier. Inflamed and negative control animals were sacrificed, and whole liver and colon tissue were snap-frozen in liquid nitrogen. In addition, bone marrow was collected from one femur from each animal. Briefly, femurs were flushed with Hanks’ balanced salt solution and passed through a 45-μm filter. Red blood cells were lysed in ACK buffer for 3 min, and remaining cells were stained with APC-Cy7 CD45 (BioLegend) and Alexa-700 LY6G 1A8 (BioLegend) for 10 min at room temperature. In addition, colons were dissociated in a collagenase solution for 1 hour at 37°C with agitation. Tissue was then passed through a 45-μm filter and stained with APC-Cy7 CD45 (BioLegend) and Alexa-700 LY6G 1A8 (BioLegend), as described earlier. Cells were gated on CD45+ and LY6Ghi population (neutrophils), sorted directly into TRIzol on the BD Biosciences ARIA flow cytometer, and stored at −80°C until processing. RNA was isolated from the whole tissue and sorted neutrophils by extraction with TRIzol. For whole-tissue extraction, liver and colon segments were homogenized in TRIzol by chopping with a razor blade. After RNA extraction, complementary DNA (cDNA) was produced using the TaqMan High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems), and preamplification was performed according to manufacturers’ protocols (Applied Biosystems). For liver genes, we assayed Fibrinogen alpha chain (Fga; TaqMan Mm00802584_m1) and alpha-1-acid glycoprotein 1 (Orm1; TaqMan Mm00435456_g1). For neutrophils, we assayed cathelicidin antimicrobial peptide (Camp; Mm00438285_m1) and elastase neutrophil expressed (Elane; Mm01168928_g1). 18S ribosomal RNA (Mm04277571_s1) was used as a standard in each case, and expression levels were determined by the ΔCtCt method. For liver genes, graphs represent the ratio of liver/colon transcripts in inflamed and noninflamed tissue. For neutrophil genes, all measurements were normalized with the lowest value set to one for each gene. Graph represents log2 normalized expression.

Western blotting for Pak1 and Merlin

Lysates from additional inflamed and noninflamed animals (generated using the protocol described earlier) were run on 4 to 12% Novex tris-glycine gels (Invitrogen). After transfer, membranes were probed with primary antibodies as follows: phospho-Merlin (Ser518) from Cell Signaling Technology (13281) used at a 1:1000 dilution; phospho-Pak1/2 Ser144/Ser141 from Cell Signaling Technology (2606) used at a 1:1000 dilution; actin from Santa Cruz Biotechnology (sc-1616) used at a 1:10,000 dilution; and Gapdh (glyceraldehyde phosphate dehydrogenase) used at a 1:10,000 dilution from Cell Signaling Technology (D16H11). Secondary antibodies were anti-mouse and anti-rabbit immunoglobulin G horseradish peroxidase linked as appropriate (Cell Signaling Technology, 7076 and 7074; used at a 1:5000 dilution).

Luminex analysis for p38 and Gsk3α/β phosphorylation

Bio-PlexPro (Bio-Rad) phosphoprotein measurements were performed according to the manufacturer’s instructions with the following kits: Gsk3α/β Ser21/Ser9 (171-V50007M) and p38 MAPK Thr180/Tyr182 (171-V50014M).

Pak inhibition in vivo

Mice that had received adoptive transfer of naïve T cells were monitored for the development of colitis with regular weight measurements. Once mice developed signs of colitis, they underwent rigid endoscopy to confirm evidence of inflammation, using a validated endoscopic scoring system for colitis (49). Once inflammation was demonstrated, mice were treated with FRAX597 (100 mg/kg once a day) or vehicle by oral gavage for 7 days (31). A subset of mice underwent posttreatment endoscopy, and then all mice were sacrificed at 7 days. The colons were resected and opened longitudinally; two side portions (one-fifth) from the entire length were reserved, one for flow cytometry and one placed in Bio-Plex lysis buffer. The rest of the tissue was used for flow cytometric analysis.

Flow cytometry

Tissue was homogenized in serum-free Dulbecco’s modified Eagle’s medium with collagenase type I C (2 mg/ml) (VWR 234153-100MG) and incubated for 1 hour at 37°C. After incubation, the sample was strained through a 45-μm filter and centrifuged for 5 min at 700g. Cells were then stained with the following antibodies from BioLegend (1:300 in FACS buffer) for 10 min: fluorescein isothiocyanate CD4 (116004) (BV-421 F4/80 (123131), BV-605 CD4 (100547), BV-510 cd11b (101245), Alexa-700 Ly6G (127622), phycoerythrin (PE)/Cy7 cd11c (117317), APC/Cy7 CD45 (103116), and PE CD45RB (103308). Cells were analyzed on a 5-laser LSR II flow cytometer (Becton Dickinson SORP).

Unsupervised clustering

Unsupervised hierarchical clustering was performed using the clustergram function in MATLAB, using default parameters. Input data included log2-transformed RNA, MS, and pMS data sets. For the purposes of visualization, RNA data were masked using the genevarfilter, genelowvalfilter, and geneentropyfilter as described at

Differential gene expression and pathway analysis

To compare univariate differential expression in each data set, we analyzed RNA, MS, and pMS data using the Wilcoxon-Mann-Whitney test with Benjamini-Hochberg FDR correction (P < 0.05, q < 0.25). Pathway analysis was performed on the RNA and MS data sets using GSEA. GSEA was performed with weighted log2 ratio ranking and 1000 gene set permutations using the “hallmarks gene sets” from MSigDB.

Coexpression network analysis

Spearman correlations were calculated for all pairs of species represented in all three data sets. Correlation coefficients less than 0.9 or greater than −0.9 were set to 0, and all others were set to 1 to define the undirected network adjacency matrix. The topological similarity of nodes in each data set was calculated using GTOM2 (19, 20). We then clustered the GTOM2 using k-means clustering and determined the number of GTOM2 clusters in each data set by calculating the gap statistic for 1 to 15 possible clusters. The genes were then extracted, and module overlap was computed using Fisher’s exact test with Benjamini-Hochberg FDR correction (P < 0.05, q < 0.25). All analysis was performed in MATLAB R2016a, and networks were generated using Cytoscape 3.4.0 (50). GTOM2 visualizations and calculations were performed using the implementation described in

YourCrosstalker network analysis

Coexpression network modules were analyzed using the YourCrosstalker network analysis tool ( (22, 23). Given an input set of gene seeds and a PPI network database, Crosstalker identifies subnetworks of highly connected genes as scored by a random walk with restarts at the seed genes. As the random walk reaches a steady state, seed genes are removed if they are not topologically related in the PPI network and additional genes are recruited if significantly traversed (“crosstalkers”), indicating possible association with the seed genes. Pathway enrichment is then tested using Fisher’s exact test against a database of curated pathways, and network edges are colored by pathway. The NCI-PID pathway databases and STRING PPI network (high confidence interactions, edge weights >0.7) were used in the analysis (2427).

Differential regulation analysis

We selected 122 genes/proteins as having greater than twofold increased or decreased abundance at the protein level and less than 2.75-fold increased or decreased expression at the RNA level. GO enrichment was performed using Gorilla ( with the background list as all of the shared RNA and protein species. GO terms that were enriched with a P value <10−3 were deemed significant. Tissue expression of acute-phase and neutrophil genes was assessed using the mouse tissue expression atlas. Each gene was normalized to 1 by the highest expression for that gene. After normalization, the average value for all of the genes of a given category was calculated and plotted for each of the indicated tissues.

GSEA of kinase substrates

Functional annotation was assessed on the basis of previous knowledge obtained from downloads available on PhosphoSitePlus. Identified proteins were obtained from Phosphorylation_site_data set, which contains all of the published mouse phosphorylation sites found in their database; functional annotation was obtained from Regulatory_sites, and we included all of the phospho-sites that had known effects on function; upstream kinase information was obtained from kinase lists. Kinase substrate lists were obtained from PhosphoSitePlus ( Lists were composed of known mouse, rat, and human kinase substrate sites. Human and rat sites were converted to mouse numbering. This produced a total of 348 kinase substrate sets with a maximum of 558 unique phospho-sites and a median of 6. Enrichment was performed with a rank-based test using GSEA software developed at the Broad Institute ( Parameters can be found along with the output of the analysis in table S4. Briefly, of 348 kinases sets, 29 met the minimum size threshold of overlap with our data set (four phospho-sites). Input data were the ratio of phosphopeptides to total peptides from match samples. The ratio measurement was used because this is more reflective of specific activation of phosphorylation. Phospho-sites were ranked by “log2 ratio of classes” and permuted 35,000 times to the gene set (because of the limited number of samples, phenotype permutation is not recommended). The kinase substrate lists have been added to the MsigDB so that they can be easily applied to the GSEA software.


Fig. S1. Histological analysis of colon samples.

Fig. S2. Schematic of the experimental design.

Fig. S3. Functional annotation of phosphorylation sites for mammalian proteins.

Fig. S4. Human PAK1 and PAK2 signaling network neighborhood.

Fig. S5. Pak activation during acute colitis.

Table S1. Affymetrix microarray quantification of gene expression in individual samples.

Table S2. MS-based quantification of proteins in individual samples.

Table S3. MS-based quantification of phosphopeptides in individual samples.

Table S4. Differential expression analysis for each data set.

Table S5. Pathway analysis for each data set.

Table S6. Modules from trans-omics coexpression network analysis.

Table S7. Statistics for trans-omics coexpression network analysis.

Table S8. Phosphosite lists for each kinase used in GSEA.

Table S9. Statistics for GSEA-based kinase enrichment.


Acknowledgments: The authors would like to acknowledge T. Libermann at the BIDMC (Beth Israel Deaconess Medical Center) Genomics Core and S. Levine at the MIT (Massachusetts Institute of Technology) BioMicroCenter for microarray processing and instructive conversation. We would also like to thank M. Haigis for helpful comments on the manuscript. Funding: J.L. and L.S.-L. were supported by Research Fellowship Awards from the Crohn’s and Colitis Foundation of America. S.D.S. was supported by a National Science Foundation Graduate Research Fellowship under grant no. 1122374. K.M.H. was funded by grants from the NIH (R01-GM088827) and the U.S. Department of Defense (W81XWH-16-1-0042). J.L.K. was funded by a grant from the NIH (R01-CA124495). D.A.L. and D.K.B. were funded by grants from Boehringer-Ingelheim as part of the SHINE (Strategic Hub for Innovation New Therapeutic Concept Exploration) program and from the Army Research Office Institute for Collaborative Biotechnologies (W911NF-09-0001). Author contributions: J.L., P.C.G., K.R.B., and L.S.-L. performed mouse experiments. J.L. performed GSEA and noncorrelated RNA-MS analyses. D.K.B. performed statistical analyses and YourCrosstalker network analysis. A.E. and M.B. performed pMS. S.D.S. developed a PubMed searching algorithm and contributed to development of kinase substrate lists. Y.-J.L. performed bone marrow isolation. J.L.K. provided the PAK inhibitor for mouse studies. J.L., D.A.L., V.Y., J.L.K., W.H., and K.M.H. contributed to the conception, planning, and execution of experiments and analyses. J.L., D.K.B., and K.M.H. wrote the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: Microarray data were submitted to GEO under accession no. GSE95705. Total protein and phosphoprotein MS data sets were deposited in the MassIVE under accession no. MSV000081198.

Stay Connected to Science Signaling

Navigate This Article