Research ArticleCancer

Profiles of Basal and Stimulated Receptor Signaling Networks Predict Drug Response in Breast Cancer Lines

See allHide authors and affiliations

Science Signaling  24 Sep 2013:
Vol. 6, Issue 294, pp. ra84
DOI: 10.1126/scisignal.2004379


Identifying factors responsible for variation in drug response is essential for the effective use of targeted therapeutics. We profiled signaling pathway activity in a collection of breast cancer cell lines before and after stimulation with physiologically relevant ligands, which revealed the variability in network activity among cells of known genotype and molecular subtype. Despite the receptor-based classification of breast cancer subtypes, we found that the abundance and activity of signaling proteins in unstimulated cells (basal profile), as well as the activity of proteins in stimulated cells (signaling profile), varied within each subtype. Using a partial least-squares regression approach, we constructed models that significantly predicted sensitivity to 23 targeted therapeutics. For example, one model showed that the response to the growth factor receptor ligand heregulin effectively predicted the sensitivity of cells to drugs targeting the cell survival pathway mediated by PI3K (phosphoinositide 3-kinase) and Akt, whereas the abundance of Akt or the mutational status of the enzymes in the pathway did not. Thus, basal and signaling protein profiles may yield new biomarkers of drug sensitivity and enable the identification of appropriate therapies in cancers characterized by similar functional dysregulation of signaling networks.


Large-scale sequencing of human tumors has identified an increasing number of genes encoding signaling proteins that are mutated, overexpressed, or deleted in cancer; examples include the genes encoding the kinase Akt, the lipid phosphatase PTEN (phosphatase and tensin homolog), the epidermal growth factor (EGF) receptors ErbB2 and ErbB1, mitogen-activated protein kinases (MAPKs), and the proto-oncoprotein Raf (1) (see table S1 for a list of gene names and abbreviations used here). Many drugs targeting these proteins are in clinical use or development, but most drugs work in only a subset of patients (27). In a few cases, single genetic factors are highly predictive of drug response in cell lines and human tumors: Bcr-Abl translocation predicts sensitivity to imatinib in leukemia (8), and the BRAFV600E mutation predicts at least initial sensitivity to vemurafenib in melanoma (4, 5). However, the situation is usually more complex.

Early-stage breast cancer is generally treated surgically, and adjuvant drugs are chosen on the basis of the morphology of the cancer and its molecular subtype, which is defined by the abundance of three receptors (9). The HER2amp subtype is defined by amplification of the receptor tyrosine kinase (RTK) ErbB2 (also known as Her2) and is typically scored using immunohistochemistry or by assaying for gene amplification. Overexpression of the estrogen receptor (ER) or progesterone receptor (PR) defines the HR+ (hormone receptor–positive) subtype. In triple-negative breast cancers (TNBCs), the abundance of all three receptors is low. HER2amp status serves as a biomarker for therapy with antibodies that target ErbB2, such as trastuzumab or pertuzumab (2, 3, 1013), and HR+ status is a biomarker for therapy with HR antagonists, such as tamoxifen (14, 15). TNBCs are usually treated with cytotoxic chemotherapy (14, 16), sometimes in combination with ErbB1 inhibitors (17) and function-blocking antibodies targeting ErbB family members (18). However, breast cancer subtypes are heterogeneous (1921), classical molecular subtypes (as described above) and those defined by whole-genome expression profiling are not identical (22), and even the best available biomarker, HER2amp status, correctly predicts response to trastuzumab in only a subset of patients (2, 3, 10, 11). The need for better biomarkers is particularly urgent for TNBCs because they appear to be genetically more heterogeneous than other breast cancer subtypes (23) and patients with these tumors have poor prognosis (24).

Projects like the Cancer Cell Line Encyclopedia aim to identify genomic features, such as gene amplification, mutation, deletion, or epigenetic modifications, that correlate with and are ultimately predictive of drug response (20, 21, 2527). However, biochemical data on drug targets, such as abundance or phosphorylation status, are potentially more predictive of drug response than genomic features (2830). Despite this and the availability of some systematic steady-state protein data (21, 22), few studies have tried to relate the biochemistry of signal transduction to drug response on a large scale (31).

Here, we investigated whether measurements of the basal and stimulated states of signal transduction networks are predictive of drug response as measured by the GI50 value, the concentration of drug that reduces cell number by 50% relative to a no-drug control when assayed at a fixed time after drug exposure (20). We measured the abundance and basal phosphorylation state of nuclear and cell surface receptors, and of downstream signaling kinases in a standardized NCI-ICBP43 cell line collection (21). The choice of which receptors to measure was a practical one: We focused on RTKs, because they are clearly implicated in breast cancer biology, and assayed all receptors for which we could verify the specificity and linearity of plate-based immune assays. Biological ligands present in the microenvironment alter drug sensitivity (32, 33), and thus, some features of signal transduction may not be apparent by steady-state profiling. Therefore, we also measured the activities of downstream signaling kinases before and after exposing cells to a diverse set of growth factors and cytokines. We then evaluated how the resulting set of ~3 × 105 receptor and cell response measurements segregated with molecular subtype and whether they could predict the sensitivity of cells to a range of targeted anticancer drugs (20). We found that predictors could be constructed for more than half of the drugs we analyzed and that data on molecular subtype, signaling biochemistry, and genetic status could be combined to construct hybrid response biomarkers that are mechanistically plausible and may translate into the clinic.


Mapping protein profiles onto subtypes

We profiled two aspects of signaling networks in cell lines of the ICBP43 panel that could reliably be grown as adherent cells (table S2). The “basal profiles” set was defined as the data for the abundance and basal phosphorylation of 22 receptors and 3 downstream kinases [ERK (extracellular signal–regulated kinase), Akt, and Src] (table S3) and the abundance of the ER and PR. All basal phosphorylation and protein abundance data were obtained by enzyme-linked immunosorbent assay (ELISA), and where possible, we estimated steady-state levels in molecules per cell by calibration with recombinant protein (see Materials and Methods). For the RTK phosphorylation data, we measured the amount of total phosphotyrosine rather than of individual phosphorylation sites, with the exception of IGF1R, for which we measured phosphorylation of Tyr1131. Note that such an approach ignores the fact that different phosphorylation events on the same protein often have different biological effects. To generate the “signaling profiles” set, we exposed the cells individually to 22 growth factors or cytokines (table S4) for 10, 30, or 90 min and monitored the response by immunofluorescence microscopy of key signaling kinases and transcription factors (table S5). Activation of NF-κB (nuclear factor κB) was based on nuclear translocation of its subunit p65; activation of all other signaling proteins was based on phosphorylation. Not every ligand was analyzed with every assay (fig. S1).

Basal profiling revealed that some RTKs, such as ErbB2, ErbB1, ErbB3, c-Met, and IGF1R (insulin-like growth factor 1 receptor), were present in many of the cell lines (Fig. 1A), whereas others, such as PDGFR (platelet-derived growth factor receptor) and VEGFR (vascular endothelial growth factor receptor), were detectable in only a few lines (data S1). Absolute abundance varied from ~107 molecules per cell for ErbB2 to ~100 molecules per cell for c-Kit or IGF1R (data S1). The abundance of ErbB2 varied 10-fold among HER2amp cells (and ~103-fold across the collection), but the HR+ line with the greatest amount of ErbB2 had only about 3-fold less receptor than the HER2amp line with the lowest amount (Fig. 1A). The same was true for amounts of ER and PR, which varied 100-fold with intermingling of subtypes (Fig. 1B). More generally, we found that all of the broadly distributed receptors in our data set exhibited high variability in abundance across all molecular subtypes. Thus, these results show that the dichotomous classification of breast cancers by receptor presence or absence belies the fact that actual receptor abundance varies in a graded fashion across subtypes.

Fig. 1 Abundance and phosphorylation status of RTKs reveal heterogeneity in receptors within clinical subtypes in 39 breast cancer cell lines.

(A) Bars indicate the abundance of the RTKs, colored by clinical subtypes (red, TNBC or nonmalignant; yellow, HER2amp; purple, HR+). The black line is the amount of phosphorylated receptor. Dashed lines show the detection threshold for total protein abundance (gray) and phosphoprotein abundance (black). (B) Abundance of the ER and PR (ER, solid; PR, dashed; sum of ER and PR as bars). Horizontal gray lines show the detection thresholds (ER, solid; PR, dashed). (C) Clustering of cell lines according to their basal profile defines three main clusters (BI, blue; BII, green; BIII, yellow; outliers, black). Color intensity in the heat map indicates abundance of the indicated protein or phosphoprotein in picogram per cell for protein abundance and in arbitrary units (a.u.) for phosphorylation measurements (measurements below detection threshold are in gray). Black horizontal lines show the cluster separations. (D) Projection of cell lines on the first two principal axes of a PCA (fig. S3) of the basal measurements. (E) Projection of cell lines on the indicated key variables identified by PCA. Dashed lines show the detection thresholds for each measurement.

Among the receptors profiled, ErbB2 was unique in that basal phosphorylation was highly correlated with protein abundance (Fig. 1A; Spearman’s R = 0.86; P = 3 × 10−12), whereas the abundance and basal phosphorylation status of other receptors exhibited little or no correlation. A likely explanation is that ErbB2 can self-associate when present in high abundance and autoactivate in the absence of ligand (34), whereas most other RTKs are only activated by ligand binding. The amount of phosphorylated ErbB2 (pErbB2) in HER2amp cells also correlated closely with amounts of pErbB1, pErbB3, and pErbB4 (fig. S2A) (34, 35) and with the phosphorylated forms of the insulin and insulin-like growth factor receptors (pInsR and pIGF1R; fig. S2A). Although pIGF1R can cross-activate ErbB2 in trastuzumab-resistant HER2amp cells (36), our data showed that high phosphorylation of IGF1R occurred in 8 of 11 HER2amp cell lines (Fig. 1A and fig. S2B). IGF1R (but not ErbB2) was also highly phosphorylated in a subset of TNBC cells (fig. S2B). High phosphorylation of IGF1R and InsR is linked to poor patient survival across all breast cancer subtypes (37), and IGF1R has been targeted therapeutically (38, 39).

Hierarchical clustering of basal profiles revealed three clusters with several outliers: cluster BI included most of the TNBC lines (12 of 15), and BII most of the HR+ lines (7 of 10); HER2amp lines were present in all three clusters (Fig. 1C). Nonmalignant cell lines (NM) did not have increased amounts of ErbB2, ER, or PR and clustered closely with a subset of the TNBC lines in BI. Principal components analysis (PCA) revealed a similar distribution as found by hierarchical clustering with the TNBC lines tightly grouped in BI, HR+ lines in BII, and HER2amp lines widely distributed (Fig. 1D). Variables identified by PCA (fig. S3) explained the differences between the clusters. In BI, ErbB1 and c-Met were highly abundant, whereas FGFR4 (fibroblast growth factor receptor 4) and ErbB3 were of low abundance in these cell lines; cell lines in BII exhibited the converse pattern (Fig. 1E).

Analysis of the responses to growth factors or cytokines in 37 of the cell lines, which defined the signaling profiles, yielded four clusters, with TNBC cells falling together in SI and HR+ lines in SII (Fig. 2A, only the 20 most variable measurements are shown), consistent with their placement in BI and BII, respectively (Fig. 1C). When analyzing the degree to which phosphorylation of ERK1/2 (pERK) could be induced, we found that TNBC cells were generally more responsive to EGF and the c-Met ligand HGF (hepatocyte growth factor), whereas HR+ cells responded more robustly to FGF-2 and heregulin (HRG), a ligand that binds to ErbB2-ErbB3 heterodimers (Fig. 2B). To validate the signaling profiles collected by microscopy, we measured ligand responses using ELISA assays for pAkt and pERK, the linearity of which can be tested using recombinant protein standards. The profiles collected using the imaging and ELISA methods were highly correlated (fig. S4).

Fig. 2 Signaling response to extracellular ligands shows commonalities within breast cancer cell subtypes and identifies outliers.

(A) Heat map of variables representing the 20 most variable responses to growth factors or cytokines in 37 breast cancer cell lines identified by PCA. Color intensities indicate the fold change from vehicle-treated cells. Cell lines are clustered (SI, blue; SII, green; SIII, yellow; SIV, red; outliers, black) according to all ligand responses measured by high-throughput microscopy. (B) Projection of cell lines on the pERK response to the indicated RTK ligands at 100 ng/ml. (C) Projection of cell lines on the phosphorylated signal transducer and activator of transcription 3 (pSTAT3) response to IL-6 versus the pSTAT1 response to interferon-γ (IFN-γ) at 100 ng/ml. Responses and ligands in (B) and (C) were selected from the most variable measurements identified by PCA (fig. S3).

Unsupervised clustering showed that HR+ and TNBC lines in the ICBP43 collection were distinguishable by the abundance and responsiveness of receptors that are not conventionally part of the subtype definition. These receptors include c-Met (for TNBC) and FGF receptors (for HR+), both of which are activated by components of the tumor microenvironment that are implicated in drug resistance (32, 33, 40) and are being intensively targeted by therapeutic antibodies and small-molecule drugs (41, 42). From a molecular and diagnostic perspective, the HER2amp subtype is generally regarded as the most distinctive, but unsupervised clustering divided HER2amp cells into multiple groups (Fig. 1C) based on 10- to 100-fold differences in the abundance of c-Met, FGFR4, and EGFR (epidermal growth factor receptor) (Fig. 1E). Although many HER2amp cell lines responded weakly to most growth factors, perhaps because high pErbB2 causes constitutive downstream signaling, a subset of HER2amp cells was highly responsive to interleukin-6 (IL-6) (Fig. 2, A and C). High amounts of IL-6 have been observed in some primary breast cancers and are associated with poor clinical outcomes (4345). These results suggested that HER2amp cell lines exhibit substantial heterogeneity at the signaling and receptor levels, which may possibly be a consequence of factors known or suspected to be involved in tumor progression and drug resistance (40, 46).

Predicting drug response from basal and signaling data

Previous attempts to correlate genomic features with drug response often start with a binary division into resistant and sensitive classes, with some exceptions (26). Although it is broadly true that subtype enriches for sensitivity to some drugs (HER2amp cells to ErbB2 inhibitors), in most cases, GI50 varies continuously across the cell line panel (20). There is, for example, no principled way to divide the trimodal distribution of GI50 values for the ErbB1 inhibitor erlotinib into resistant and sensitive classes (Fig. 3A). Therefore, rather than create a classifier, we attempted to predict actual GI50 values directly using partial least-squares regression (PLSR). The quality of each prediction was assessed by a q2 value derived from a leave-one-out cross-validation, and the significance of prediction was assessed by the P value of the correlation between predictions and measurements (see text S1 and fig. S5). In the case of erlotinib, subtype was a poor response predictor (20), but we were able to build a PLSR model (q2 = 0.36 and P = 6.15 × 10−4) to correctly identify sensitive TNBC and HR+ lines (in blue in Fig. 3A) from a large set of insensitive lines. Although erlotinib is not a standard treatment for breast cancer, it is part of 32 ongoing trials (47).

Fig. 3 Basal profiles and signaling profiles predict responses to targeted inhibitors.

(A) Projection of the fit of the PLSR model based on the signaling profile data predicting GI50 values for erlotinib responses against the measured response for each cell line (red, resistant; blue, sensitive). Lower plot shows the trimodal distribution of the measured GI50 values across the cell lines. (B) Fraction of drugs for each class (see Table 1) predicted by the totality of the data (black), the signaling profile based on the maximal response to the 15 growth factors (white), and the basal profile (gray). (C) Distribution of the q2 for drugs significantly predicted by the basal or signaling profile grouped by primary drug targets (Aktp, Akt pathway; MAPKp, MAPK pathway; CC, cell cycle). Nonsignificant predictions (NS) are set to zero. Colors represent different classes of drugs.

We built PLSR models to predict previously reported GI50 values for 43 targeted drugs (20) (data S2) using different partitions of our data sets that corresponded to the basal or signaling profiles (table S6). Overall, 23 models were statistically significant at a false discovery rate (FDR) of 0.15 (Table 1; see data S3 for model coefficients). Most of the drugs targeting ErbB, the MAPK pathway or phosphoinositide 3-kinase to Akt (PI3K/Akt) pathway, and histone deacetylases (HDACs) were predicted with statistical significance (Fig. 3B). The quality of predictions of the signaling profiles collected by ELISA (used for validation) and microscopy has a Pearson’s correlation of R = 0.52, P = 3.0 × 10−4 (fig. S6 and data S4). In general, models based on signaling profiles yielded higher q2 values for drugs with intracellular targets, like the PI3K/Akt pathway, than did those based on basal profiles (Fig. 3C).

Table 1 List of drugs for which GI50 values are significantly predicted using PLSR models.

Quality of the prediction (q2) for the drugs significantly predicted by any of the data sets used (see table S6). The Benjamini-Hochberg procedure with FDR = 0.15 was used for significance, and the P value of the data set with the highest q2 is reported.

View this table:

Predicting sensitivity to PI3K/Akt inhibitors from the response to heregulin

Drugs targeting proteins in the PI3K/Akt pathway are important investigational agents for breast cancer with 24 agents in ~150 trials (47). Furthermore, PIK3CA (encoding the PI3K p110α subunit) and PTEN (encoding a phosphatidylinositol phosphatase) are frequently mutated in breast cancer (1, 22). We found that mutational status was not a significant predictor of drug response for ICBP43 cell lines (table S7), and no clear correlation has been reported in the literature (48). We constructed statistically significant PLSR models for six drugs that target various enzymes in the PI3K/Akt pathway—three targeting PI3K, two targeting mechanistic target of rapamycin (mTOR), and one targeting Akt—using either the signaling profile data set or the basal profile data set (Fig. 4A). Signaling profile data produced statistically significant PLSR models for five drugs targeting the PI3K/Akt pathway. Although the sensitivity to some drugs was also predicted by basal profiles, models that used signaling data yielded substantially higher q2 values for four of six drugs. Inspection of the models showed that the abundance of pERK in cells exposed to the ligand HRG had the highest coefficients, suggesting that responsiveness to HRG, but not other ligands that induce pAkt, is linked to sensitivity to PI3K/Akt inhibitors (fig. S7A). HRG responses were also the key variables in models built on the ligand responses collected by ELISA (fig. S7B).

Fig. 4 Response to HRG and ErbB3 abundance are key predictors for response to drugs targeting the PI3K/Akt pathway.

(A) Quality of the predictions for the drugs targeting the PI3K/Akt pathway using their basal and signaling (based on maximal response) profiles. Nonsignificant predictions (NS) are set to zero. (B) Projection of cell lines according to the pERK response to HRG and cumulative amounts of ErbB2 and ErbB3 overlaid with the GI50 value to GSK2126458 (red, resistant; blue, sensitive). (C) Quality of the prediction for drugs targeting the PI3K/Akt pathway using a linear model with a single variable or the variables ErbB3 and ErbB2.

Ligand response can be measured in cultured cells but not easily in tumors. We therefore looked for basal measurements that might serve as a surrogate for the HRG response. HRG binds to heterodimers of ErbB2 and ErbB3, and we observed that the sum of their abundances could substitute for the amount of pERK in HRG-treated cells (Fig. 4B; illustrated for GSK2126458). Using ErbB2 and ErbB3 amounts, we predicted GI50 values for six drugs with similar or better accuracy than was achieved with the original PLSR models (Fig. 4C). We did not arrive at the [ErbB2 + ErbB3] predictor by systematically searching through the data set for measures correlated with drug response; the danger of such a search is that it potentially uncovers random correlations between a large number of independent variables (biochemical measurements) and a dependent variable (GI50 value), particularly when the number of cell lines is small. Instead, we used PLSR to identify the most predictive model in a principled way and then generated a related, but easier to measure, predictor by combining information on top coefficients of the models and the previous knowledge that HRG functions through the ErbB2-ErbB3 complex. Although one might have expected total or phosphorylated amounts of Akt to serve as an effective surrogate, we found that measure of Akt was not an effective predictor of the response (Fig. 4C), although cells carrying either mutations in the region of PIK3CA encoding the helical domain or in PTEN did have increased pAkt (P = 0.029) compared with cell lines lacking either of these mutations (data S1), consistent with previous reports (28). The [ErbB2 + ErbB3] predictor is potentially assayable in clinical samples and appears to distinguish sensitive from insensitive TNBC cells, an area in which the clinical need is greatest.

Improving drug prediction by stratification of cancer cells

A second way in which to combine experimental data and previous knowledge is to build PLSR models for different subsets of cell lines based on their molecular subtype or mutational status. We applied this to the pan-PI3K inhibitor GSK2126458, which is only poorly predicted by basal profiles when all cell lines are considered (q2 = ~0.1; Fig. 5A). By dividing the cell lines according to their subtype, TNBC versus HR+ and HER2amp (HR+/HER2amp), we obtained a good prediction for both groups (q2 = ~0.35; Fig. 5A). TNBC cell lines are generally resistant, but a few are as sensitive as HR+/HER2amp cells, and the model captured this (Fig. 5B). Subtype-specific models predicted GI50 values for TNBC and HR+/HER2amp cells using different biochemical features: ErbB3 abundance alone could discriminate sensitive and insensitive TNBC cells (confirming one component of the [ErbB2 + ErbB3] predictor), whereas basal pERK was the key variable for HR+/HER2amp cells (Fig. 5, C and D, and data S5). The abundance of ErbB3 had no predictive value for the HR+/HER2amp cells because the amount of ErbB3 varied little among the cell lines of these subtypes (Fig. 1A).

Fig. 5 Inclusion of clinical and genomic information can improve predictions of drug response.

(A) Quality of the prediction for models stratified according to subtype (TNBC versus HR+/HER2amp) for the pan-PI3K inhibitor GSK2126458. Predictions by subset are shown according to a model built on just the indicated subset of cell lines (yellow and purple bars) or all of them (black bar). (B) Distribution of GI50 values for GSK2126458 by subtype. (C and D) Correlation of ErbB3 (C) and pERK (D) with GSK2126458 GI50 values for each of the three cancer subtypes. A dashed line shows a nonsignificant correlation. (E) Quality of the prediction of models stratified according to mutational status of PIK3CA and PTEN for triciribine. Full, all cell lines; WT, cell lines wild type for both PIK3CA and PTEN; Mut, cell lines with mutations in either or both of PIK3CA and PTEN. Nonsignificant predictions (NS) are set to zero. (F) Projection of the abundance of pc-Met and VEGFR1 overlaid with the triciribine GI50 values only for cell lines wild type for PIK3CA and PTEN (red, resistant; blue, sensitive). Dashed lines show the detection threshold for each measurement. (G) Quality of the prediction of models stratified according to mutational status of PIK3CA and PTEN (WT versus Mut) for drugs targeting the PI3K/Akt pathway. Only those drugs for which stratification improves prediction are shown.

Whereas relatively few models were improved by stratifying by subtype, stratification by mutational status had a larger impact. In the case of triciribine, an inhibitor of Akt that also targets other kinases, we could not construct a significant PLSR model using biochemical data alone, but excluding cells carrying PIK3CA or PTEN mutations resulted in a highly predictive model (Fig. 5E and data S5). In the model for wild-type lines, cells sensitive to triciribine had high amounts of phosphorylated c-Met and high VEGFR1 and were distributed across all three clinical subtypes (Fig. 5F, blue). When we built models stratified by mutational status for the 11 drugs targeting the PI3K/Akt pathway in our data set, prediction was improved for either wild-type or PIK3CA and PTEN mutant cell lines in six cases (Fig. 5G), resulting in significant predictions using either a general or a stratified model for 10 of the 11 drugs. Because of varying sample sizes (not all drugs were tested on all cell lines), it is difficult to precisely quantify the degree of improvement, but our findings suggest that molecular determinants of drug response differ in wild-type and PIK3CA/PTEN mutant cells.

Identifying response outliers with modeling

Drugs targeting ErbB are a special case in breast cancer because they are important clinically and well predicted by HER2amp status. However, the correlation is not perfect, which creates two challenges: to predict HER2amp lines that are resistant to ErbB-targeted drugs and to identify TNBC and HR+ lines that are sensitive. We built PLSR models for ErbB-targeted drugs using basal profiles and visualized cell lines on the landscape of model variables (Fig. 6A). This landscape was built by drawing edges between pairs of cell lines that lie within a threshold distance in the space of model variables. In the case of lapatinib, a standard-of-care drug for combination therapy (49) or in patients who relapse after trastuzumab-based therapy (50), we identified one resistant cluster (R1) and two sensitive clusters (S1 and S2) (Fig. 6A). The sole sensitive HR+ line fell into S1 (MDA-MB-175, blue arrow), and the sole resistant HER2amp line fell in R1 (MDA-MB-361, red arrow). The variables that differentiate R1 from S1 and S2 were pERK, which in MDA-MB-175 was significantly higher than in other HR+ lines (P = 1.4 × 10−4), and pErbB2, which in MDA-MB-361 was lower than in other HER2amp lines (P = 4.0 × 10−4). The abundance of pc-Met and pErbB3 distinguished S1 from S2 (fig. S8). The abundance of pERK and pErbB2 are markers of sensitivity to lapatinib (Fig. 6B) and were also the variables that identified the single TNBC line (MDA-MB-453) that is sensitive to afatinib, another approved ErbB2 inhibitor (Fig. 6C, black arrow). MDA-MB-453 lay at the intersection between the R1 cluster and the S1 and S2 clusters (black arrow, Fig. 6A), and it had a significantly higher pErbB2 abundance than the other TNBC lines (P = 3.6 × 10−6). Thus, we found that pathway activity (represented by pErbB2 and pERK abundance) yielded higher q2 for models of ErbB-targeted drugs than did HER2amp status (Fig. 6D). These results showed that biochemical profiles can identify drug-sensitive outliers among HR+ and TNBC lines.

Fig. 6 The amounts of pErbB2 and pERK are key predictors of response to most drugs targeting ErbB1 or ErbB2.

(A) Network representation of the cell lines in high-dimensional space defined by the variables of the predictive models for lapatinib using the basal profile, with connections indicating similar amounts of the proteins or phosphoproteins. Color corresponds to GI50 values (red, resistant; blue, sensitive; red arrow, MDA-MB-361 HER2amp; blue arrow, MDA-MB-175 HR+; black arrow, MDA-MB-453 TNBC). (B and C) Projection of pErbB2 and pERK overlaid with the GI50 values for lapatinib (B) and afatinib (C). Color corresponds to GI50 values (red, resistant; blue, sensitive). Dashed lines show the detection threshold for pErbB2; arrows point the same cell lines as in (A) (GI50 value for MDA-MB-361 not available for afatinib). (D) Quality of the prediction for drugs targeting ErbB1 or ErbB2 using a linear model with a single variable or the variables pERK and pErbB2. Nonsignificant predictions (NS) are set to zero.


Here, we showed that biochemical profiles of basal and ligand-stimulated signaling pathways can be used to construct PLSR-based models that predict responsiveness to 23 of 43 targeted drugs examined by Heiser et al. (20). In contrast to many genomic classifiers of drug sensitivity in breast cancer, our models predicted continuous GI50 values, which rarely divide cleanly into resistant and sensitive classes. We found that signaling profiles yielded more predictive models for drugs targeting intracellular targets than did basal profiles. This seems logical because signaling profiles report on network activity; however, larger data sets are necessary to assess the statistical significance of the differential predictive power of basal and signaling profiles.

Inspection of PLSR model variables revealed which features of signal transduction biochemistry were predictive of sensitivity to which drug. In many cases, the number of variables was small (5 to 12). Surprisingly, our data indicated that target abundance or target phosphorylation rarely correlated with the sensitivity of cells to a drug against that target, except in the case of ErbB2, which also had a positive correlation between abundance and basal phosphorylation. In this sense, ErbB2 is not a prototype for a protein-based biomarker, but rather a special and likely rare exception. More generally, we found that classification of breast cancer lines using a dichotomous high-low score for receptor abundance obscured the graded variation that is observed across cell lines. Unsupervised clustering divided lines into clusters primarily on the basis of the abundance or ligand responsiveness of receptors such as c-Met, FGFR4, and IL-6R, none of which are in the subtype definitions of TNBC, HER2amp, or HR+, but which are implicated in drug resistance (5153) or are the targets of investigational antibody-based therapeutics (5458). Moreover, large-scale sequencing of HER2amp tumors also shows that the subtype constitutes several distinguishable classes (22). Subdividing the HER2amp subtype could be beneficial clinically if future studies establish a correlation of genetic and biochemical profiles with preexisting or acquired resistance to trastuzumab, currently the most important drug for this type of tumor.

Breast cancers are frequently mutated at PIK3CA or PTEN loci (1, 22), and many drugs targeting the PI3K/Akt pathway are in active clinical development (47). Unfortunately, neither mutational status nor subtype is very predictive of sensitivity to these drugs. However, we could build predictive models for most drugs targeting the PI3K/Akt pathway based on network profiling. Nevertheless, even among drugs that supposedly have the same molecular target, the models differed, which may be due to differences in either the isoform specificity or the off-target activities of the individual drugs. Rather than the mutational status of PIK3CA or PTEN, or the abundance of pAkt, we found that for a subset of PI3K/Akt-targeted drugs, responsiveness to the ligand HRG had significant predictive value, as did the abundance of ErbB2 and ErbB3, which are easier to measure in tumor samples. These results suggest that pathway activity or the potential for inducibility is an effective indicator of drug sensitivity. Because predictors are correlative, we cannot say precisely how responsiveness to the ErbB2-ErbB3 ligand HRG and sensitivity to PI3K/Akt inhibitors are linked mechanistically, but HRG and ErbB3 are both implicated in breast cancer biology (59, 60). Additional work with clinical samples is required to determine whether such a biomarker will be practically useful.

In many cases, our ability to predict GI50 values from pathway data is improved by including information about the cell line’s mutational status. Remarkably, prediction of sensitivity to drugs that target ErbB2 (such as lapatinib) or ErbB1 (such as gefitinib) is also improved by adding data on PIK3CA or PTEN mutational status. These findings highlight the value of including both genomic and biochemical data in predictors of drug sensitivity and demonstrate a need for better understanding of dysregulated Akt signaling in breast cancer cells. Our data also suggest that PIK3CA or PTEN mutational status may act as a cofounding factor rather than a predictor of responses to targeted drugs.

The experiments described here serve as proof of principle to study drug response across panels of cells lines based on systematic biochemical analysis of basal and induced cell states. The protein profiles in the current work contain only 50 to 200 variables per cell line, many fewer than could be collected by whole-genome sequencing or mRNA expression profiling (although this may change with high-throughput mass spectrometry). Nonetheless, our results suggest that even a limited number of protein measurements centered on pathways targeted by drugs can be highly informative. If we compare this work to the transcriptional and genetic profiling in Cancer Cell Line Encyclopedia (26), we find that our models have an equal or better Pearson’s correlation for the predictions of all four drugs common to our work. We speculate that the wide range of genetic changes that occur in cancer are canalized into a simpler set of changes at the level of signal transduction networks. If so, systematic mechanistic analysis of signaling pathway status across diverse cancer genotypes should substantially simplify our understanding of tumorigenesis. Of course, genomic and transcriptional data are easier to collect from patients than are proteomic measurements, but we found that, in some cases, very specific proteomic data were predictive of drug response. Combining such proteomic data on signaling networks with genomic data already routinely collected in the clinic might yield hybrid clinical biomarkers that could improve therapy selection in cancer patients.

Materials and Methods

Cell culture

All cells were obtained from the American Type Culture Collection and grown according to recommendations, except for BT-474, MCF7, MDA-MB-415, and MDA-MB-436. All culture conditions are provided in table S2. All cells were free of Mycoplasma. Cells were plated either in 15-cm dishes or in 96-well dishes to achieve about 75% confluency at the time of lysis. Cells were grown for 24 hours and then starved in serum-free medium without additives for 18 hours before exposure to ligands for signal profiling or lysis for basal profiling.

Extracellular ligands

All ligands used in this study are listed in table S4. Ligands were dissolved according to recommendations from the manufacturer at 100 μg/ml, except for IFN-α, which was supplied as a solution of 106 U/ml. For treatment of each cell line with final concentrations of 100 and 1 ng/ml (103 and 10 U/ml for IFN-α), ligands were further diluted into 10× treatment solution with the same medium used for serum starvation.

ELISA—Basal profile lysate collection

Cells were plated in 9× 15-cm dishes, washed with 12 ml of phosphate-buffered saline (PBS), and drained for 30 s. Two plates were treated for about 5 min with 1.5 ml of 0.05% trypsin (Corning, 25-052-CI) until cells detached from the plates. Trypsin was stopped with 8.5 ml of 10% fetal bovine serum in PBS and gently pipetted up and down until it was in single-cell suspension. Cells from each plate were counted in duplicate using a Cellometer Auto T4 (Nexcelcom). Seven plates were lysed with 1 ml of lysis buffer containing the following: Mammalian Protein Extraction Buffer (M-PER; Thermo Scientific, 78501) supplemented with protease inhibitor cocktail (Sigma-Aldrich, P2714), 1 mM sodium orthovanadate (Sigma-Aldrich, S6508), 5 mM sodium pyrophosphate (Sigma-Aldrich, 221368), 50 μM oxophenylarsine (EMD Biosciences, 521000), and 10 μM bpV(phen) (EMD Biosciences, 203695). Lysed cells were scraped off the plate, collected in microcentrifuge tubes, and incubated on ice for 30 min. Membranes and cell debris were sedimented by centrifugation at 20,000g for 10 min at 4°C, and the supernatants were pooled, aliquoted, and subsequently stored at −80°C.

ELISA—Ligand response lysate collection

Cells were plated in two 96-well plates and treated with growth factors for 10, 30, or 90 min. The cells were washed in 200 μl of cold PBS and lysed in 30 μl of the M-PER lysis buffer with the inhibitors listed above. Plates were sealed, rocked on ice for 15 min, and stored at −80°C.

ELISA measurements

All ELISA assays used for this study are listed with their targets in table S3. For protein measurements, 384-well, black, flat-bottomed, polystyrene, high-binding ELISA plates (Corning, 3577) were incubated overnight at room temperature with capture antibodies (table S3) and then blocked with 2% bovine serum albumin in PBS for 1 hour. Plates were washed four times with 0.05% Tween 20 in PBS (PBS-T) and then incubated with lysates and recombinant protein standards for 2 hours at room temperature. After each antibody incubation, plates were washed four times with PBS-T. ELISAs were incubated with primary and secondary antibodies for 2 hours and 1 hour, respectively. All ELISAs were visualized with SuperSignal ELISA Pico Chemiluminescent Substrate (Pierce, 37069), and luminescent signal was measured with an EnVision Plate Reader (PerkinElmer).

Data from ELISA measurements were extracted using MATLAB Statistics Toolbox. Briefly, raw data were background-subtracted (raw values from assay buffer were subtracted from raw values of each sample). A dilution series of recombinant protein (standard curve) was used to convert raw signal into known protein concentration (pg/ml). A linear best-fit line was calculated on the basis of the standard curve, then background-subtracted values for each sample were converted into pg/ml using the linear equation (y = mx + b) of the standard curve. Regressed data were multiplied by the dilution factor for each sample, then mean and SE of the mean values were calculated for at least four technical replicates for the basal profile measurements and two biological replicates for the ligand response measurements. Data above or below the range of detection were set to the upper or lower detection limit, respectively. For steady-state protein levels, ELISA measurements (pg/ml) were normalized to the number of cells in the lysate (cells/ml) to get picograms of protein per cell.

High-throughput microscopy—Ligand response

Experiments were performed as previously described (61). In short, cells were plated in 6× 96-well plates and treated with growth factors and cytokines for 10, 30, or 90 min. The cells were fixed for 10 min at 25°C in 2% paraformaldehyde. Plates were washed with 200 μl of PBS-T and stored at 4°C until assaying. Cells were permeabilized with 100 μl of methanol for 10 min at 25°C, washed with 200 μl of PBS-T, and blocked with 40 μl of Odyssey Blocking Buffer (OBB; LI-COR) for 1 hour at 25°C. Cells were treated with 40 μl of primary antibody diluted 1:400 in OBB, sealed, and incubated overnight at 4°C on rocking platform. Cells were washed twice with 200 μl of PBS-T and treated with 40 μl of secondary antibody diluted 1:2000 in OBB incubated for 1 hour at 25°C. Cells were washed in 200 μl of PBS-T, then 200 μl of PBS, and were stained with 40 μl of Hoechst 33342 (250 ng/ml) (Invitrogen) and 1:1000 Whole Cell Stain (blue; Thermo Scientific) in PBS. Cells were washed two times with 200 μl of PBS and imaged in an imageWoRx high-throughput microscope (Applied Precision). Data were extracted using ImageRail and stored for processing in semantic data cubes (61). All antibodies used for high-throughput microscopy are listed with their targets in table S5.

Clustering parameters

The clustering in Figs. 1C and 2A was made using the MATLAB statistical toolbox. The distance used in the algorithm is the correlation distance (that is, 1 − R, where R is the Pearson’s correlation), and the “average” linkage was used to draw the clustering tree.

Prediction algorithm

Our prediction uses a linear model based on a PLSR with variable selection based on variable importance projection (62). All codes are written in MATLAB using the standard embedded function such as “simpls” for building the PLSR model. The detailed description of the algorithm can be found in text S1.

Supplementary Materials

Text S1. Prediction algorithm and statistical tests.

Fig. S1. Experimental design and illustration of the collected data sets.

Fig. S2. Correlations of the basal amounts of phosphorylated RTKs and intracellular kinases.

Fig. S3. Loading and variance of the principal components of the PCA of the basal profiles.

Fig. S4. Correlation between the ligand responses measured by ELISA and high-throughput microscopy.

Fig. S5. Description of the leave-one-out PLSR algorithm used for prediction.

Fig. S6. Correlation between predictions made with ligand responses measured by ELISA or imaging.

Fig. S7. Coefficients of the models based on the signaling profiles for drugs targeting the PI3K/Akt pathway.

Fig. S8. Distribution of model variables in each cluster for the lapatinib network representation.

Table S1. Names, abbreviations, and UNIPROT IDs of the receptors and downstream signaling proteins mentioned in the main text.

Table S2. Cell lines and culture conditions.

Table S3. Description of the ELISA kits used for measuring the basal profiles.

Table S4. Description of ligands used for measuring the signaling profiles.

Table S5. Description of the antibodies used for the high-throughput microscopy imaging assays.

Table S6. Descriptions of the data sets used for the prediction of GI50 values.

Table S7. Enrichment analysis of the GI50 values by PTEN and PI3KCA mutational status.

Data S1. Measures of the ligand response and basal levels grouped by data set as described in table S6 and raw values.

Data S2. Signaling-targeted drugs used for drug sensitivity predictions and their corresponding reported GI50 in the cell lines.

Data S3. Results of the predictions and coefficients for the significant models for the different data sets used (basal profiles and signaling profiles collected by microscopy).

Data S4. Results of the predictions and coefficients for significant models using the ligand responses measured by ELISA.

Data S5. Results of the stratified models using the basal levels and model coefficients.

References (63, 64)

References and Notes

Acknowledgments: We thank S. Chopra, C. Zechner, S. Rosenthal, J. Gray, and I. Overton for helpful discussions. Funding: This work was supported by the Stand Up to Cancer Project (AACR-SU2C-DT0409), NIH grants U54-HG006097 and CA112967 (acquisition of the ICBP43 cell line collection), and a fellowship from the Swiss National Science Foundation (PBELP3_140652) to M.H. In-kind support for E.A.P., D.H.C., and B.S. was supplied by Merrimack Pharmaceuticals. Author contributions: M.N., E.A.P., M.C., D.H.C., and L.Z. performed all experiments; M.H. performed all analysis; and all authors wrote and edited the paper. Competing interests: P.K.S. is a founder of Merrimack Pharmaceuticals and chair of its scientific advisory board; this relationship is actively managed by Harvard Medical School (HMS) in accordance with NIH regulations. B.S. is a shareholder in Merrimack Pharmaceuticals and a board member of SilverCreek Pharmaceuticals. Data and materials availability: Supplementary materials and HMS LINCS (Library of Integrated Network-based Cellular Signatures) Web site (
View Abstract

Stay Connected to Science Signaling

Navigate This Article