De novo expression of human polypeptide N-acetylgalactosaminyltransferase 6 (GalNAc-T6) in colon adenocarcinoma inhibits the differentiation of colonic epithelium

Aberrant expression of O-glycans is a hallmark of epithelial cancers. Mucin-type O-glycosylation is initiated by a large family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases (GalNAc-Ts) that target different proteins and are differentially expressed in cells and organs. Here, we investigated the expression patterns of all of the GalNAc-Ts in colon cancer by analyzing transcriptomic data. We found that GalNAc-T6 was highly up-regulated in colon adenocarcinomas but absent in normal-appearing adjacent colon tissue. These results were verified by immunohistochemistry, suggesting that GalNAc-T6 plays a role in colon carcinogenesis. To investigate the function of GalNAc-T6 in colon cancer, we used precise gene targeting to produce isogenic colon cancer cell lines with a knockout/rescue system for GALNT6. GalNAc-T6 expression was associated with a cancer-like, dysplastic growth pattern, whereas GALNT6 knockout cells showed a more normal differentiation pattern, reduced proliferation, normalized cell–cell adhesion, and formation of crypts in tissue cultures. O-Glycoproteomic analysis of the engineered cell lines identified a small set of GalNAc-T6–specific targets, suggesting that this isoform has unique cellular functions. In support of this notion, the genetically and functionally closely related GalNAc-T3 homolog did not compensate for the effects observed for GalNAc-T6. Taken together, these data strongly suggest that aberrant GalNAc-T6 expression and site-specific glycosylation are involved in oncogenic transformation.

Malignant transformation is closely associated with changes in the glycosylation of proteins and lipids (1,2). One well-documented example, which is observed in the majority of epithelial cancers and premalignant lesions, is cancer-associated changes in GalNAc O-glycosylation (1,3-5). GalNAc-type O-glycosylation is an abundant and diverse form of post-translational modification (6). It is initiated by a family of up to 20 enzymes, termed polypeptide GalNAc-transferases (GalNAc-Ts), that catalyze the addition of GalNAc residues to the hydroxyl groups of selected serine and threonine residues in proteins (6-19). In healthy cells, the initiating GalNAc residue (GalNAcα1, also known as the Tn antigen) is elongated, branched, and capped with different carbohydrate structures in sequential processing steps. In contrast, cancer cells are often characterized by the expression of immature and truncated O-glycan structures, such as Tn and sialylated Tn (STn) (20). The expression of these short and truncated O-glycans strongly correlates with poor prognosis (21-24). Although the association between the expression of truncated O-glycans and cancer prognosis per se is well-established, the importance of GalNAc-T-mediated site-specific glycosylation in cancer is unclear, due mainly to technical limitations. Until recently, it was not possible to conduct a global analysis of GalNAc-T function to identify which proteins carry GalNAc O-glycans and at which sites within those proteins. We recently developed a differential global glycoproteomic strategy, which, in combination with genetic engineering, enables us to investigate the function of specific GalNAc-Ts in cell line models and to begin to investigate the site-specific functions of GalNAc glycosylation in cancer (25-27).
GalNAc-Ts control the initiation of O-glycan biosynthesis and are differentially expressed in cells and tissues. They have distinct, partly overlapping acceptor substrate specificities (6,28,29). GalNAc-T glycosylation has been implicated in numerous important biological functions, including proprotein processing, ectodomain shedding, cell signaling, and cell adhesion (30-33). Furthermore, GalNAc-Ts are reported to influence several key processes that are important for tumor formation, including growth (34-37), immune evasion (38,39), and invasion and metastasis (36,37,40-44). Notably, the exact relationship between site-specific O-glycosylation and tumor formation is unknown, and we do not yet understand how up-regulation of selected GalNAc-Ts affects carcinogenesis.
In this study, we examined the expression patterns of all GalNAc-Ts in colon cancer using transcriptomic data analysis, and we observed selective up-regulation of GalNAc-T6 but not GalNAc-T3. To shed light on the function of GalNAc-T6 in colon cancer development, we developed a cell model system in which specific ablation of the gene encoding GalNAc-T6 (GALNT6) was followed by detailed polyomic analysis. Precise genome-targeted knockout of GALNT6, GALNT3, or a combination of the two in the LS174T colon cancer cell line demonstrated that GalNAc-T6 expression was essential for the acquisition of oncogenic features such as hyperproliferation, loss of normal colonic epithelial architecture, and the disruption of cell-cell adhesion. Thus, LS174T GALNT6 knockout cells showed terminal differentiation traits and formed crypt-like structures that resembled the tissue architecture of a healthy colon, features that were reverted upon reintroduction of exogenous GalNAc-T6. Differential transcriptomic analysis confirmed that the expression profile of the GalNAc-T6-expressing LS174T cells resembled that of colon cancer cells, whereas LS174T GALNT6 knockout cells had an expression profile that was more similar to that of normal colon tissue. Furthermore, differential O-glycoproteomic analysis identified unique GalNAc-T6 targets, including several important cellular adhesion proteins. These results support the notion that aberrantly expressed GalNAc-T6 plays an important role in colorectal carcinogenesis.

Selective up-regulation of GalNAc-T6 in colon cancer tissue
We first used The Cancer Genome Atlas (TCGA) colon adenocarcinoma data set to identify potential cancer-associated changes in the expression of GalNAc-Ts. Expression profiles for all 20 GalNAc-T isoforms were analyzed in 288 colon adenocarcinomas and in 41 healthy colon tissue samples using RNAseq transcriptome data (https://genome-cancer.ucsc.edu/proj/site/hgHeatmap/) (Fig. 1a and Fig. S1). Of the 20 GalNAc-T isoforms, GalNAc-T6 was the only one that was expressed de novo in colon cancer, i.e. was absent from healthy colon tissue. In contrast, the majority of GalNAc-Ts were either unchanged or down-regulated in colon cancer (Fig. 1a and Fig. S1). To confirm the cancer-specific up-regulation of GalNAc-T6 at the protein level, we evaluated the expression of GalNAc-T6 in 39 cases of colorectal carcinoma and in healthy colorectal mucosa by immunostaining. The expression pattern of GalNAc-T6 was compared with the expression of its close homolog GalNAc-T3 (Fig. 1b; Table 1). GalNAc-T6 expression was detected in 34 of 39 carcinomas, with antibody labeling restricted to the perinuclear area, suggesting its localization to the Golgi apparatus. GalNAc-T6 expression was not detected in normal colorectal mucosa in four of four cases. High levels of GalNAc-T3 were detected in all 39 colorectal carcinomas, and GalNAc-T3 was almost homogeneously expressed in all layers of the normal-appearing crypts as well as in the tumor tissues. Thus, in contrast to GalNAc-T3, GalNAc-T6 was expressed in tumor tissue but not in normal colon tissue. This establishes that GalNAc-T6 is up-regulated at both RNA and protein levels in a cancer-specific manner and suggests that aberrantly expressed GalNAc-T6 has a unique function in colon cancer progression.
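The mean-centering described in the Fig. 1 legend (each sample value minus the mean of the RNAseq values, shown in red or blue) can be illustrated with a minimal Python sketch. It assumes that the mean is taken per isoform across all samples, which is one reading of the legend, and it uses made-up numbers; the function names and the toy values are hypothetical and are not part of the original analysis.

```python
from statistics import mean


def mean_center(expression):
    """Subtract each gene's mean (across samples) from every sample value.

    `expression` maps a gene name to a list of per-sample values.
    Missing values are given as None and are left as None.
    """
    centered = {}
    for gene, values in expression.items():
        observed = [v for v in values if v is not None]
        gene_mean = mean(observed)
        centered[gene] = [None if v is None else v - gene_mean for v in values]
    return centered


def color_code(value):
    """Map a centered value to the legend colors: red >0, white =0, blue <0, gray = no data."""
    if value is None:
        return "gray"
    if value > 0:
        return "red"
    if value < 0:
        return "blue"
    return "white"


# Hypothetical toy values: two "normal" samples followed by three "tumor" samples.
toy = {
    "GALNT6": [0.0, 0.0, 4.0, 6.0, 5.0],
    "GALNT3": [5.0, 5.0, 5.0, 5.0, 5.0],
}
centered = mean_center(toy)
colors = {gene: [color_code(v) for v in vals] for gene, vals in centered.items()}
```

In this toy input the GALNT6 row turns blue for the first two samples and red for the last three, whereas the unchanged GALNT3 row stays white, which mirrors the qualitative pattern described for Fig. 1a.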

GalNAc-T6 disrupts the formation of actin-lined lumens and is associated with the expression of cancer-associated genes in vitro
We next used the well-differentiated human LS174T colon adenocarcinoma cell line as a cell model to evaluate colon cell growth in the presence and absence of GalNAc-T6. LS174T cells exhibit unrestricted growth and grow as separate clusters of cells, reportedly due to inhibited p21WAF1 expression (69). GALNT6 and GALNT3 were knocked out in LS174T cells, individually or combined, using zinc finger nuclease (ZFN)-based genome editing to produce ΔT6 and ΔT3 cells. Successful out-of-frame mutagenesis was confirmed in individual single-cell clones (Table S1). RNAseq verified that nonsense-mediated RNA decay had removed the targeted transcripts (Fig. 2a), and loss of the corresponding proteins was shown by immunofluorescence staining (Fig. 2b). Knockout of GALNT6 was accompanied by an increase in GALNT3 transcripts, and similarly, the knockout of GALNT3 was associated with an increase in GALNT6 transcripts, which suggests that these two enzymes can compensate for each other (Fig. 2a). There were no notable changes in the expression of other GalNAc-Ts after selective knockout of either GALNT3 or GALNT6. Knockout of both GALNT3 and GALNT6, however, resulted in an increase in the expression of GALNT1.
In accordance with previous reports (69), wildtype (WT) LS174T cells formed multilayered colonies, thereby replicating colon cancer growth. Phalloidin staining, which detects F-actin, showed that WT LS174T colon cancer cells, expressing high levels of GalNAc-T6, grew as clusters of cells with dense tubular structures and multiple small, actin-lined lumens, which could resemble the disordered crypts seen in colon cancer tissue (Fig. 3, a, b, and d). Intriguingly, knockout of GALNT6 resulted in cells that grew as colonies with one large actin-lined lumen surrounded by a wall of cells of varying thickness. Staining of healthy colon tissue revealed similarity of these luminal structures with healthy colonic crypts (Fig. 3d).
To confirm that the phenotypic change observed in LS174TΔT6 cells was the result of GALNT6 knockout rather than a clonal effect, we re-introduced functional, constitutively expressed GALNT6 into ΔT6 cells to create ΔT6+T6 cells. This was accomplished using a recently published site-specific ZFN-mediated knockin strategy (Fig. 3c and Table S1) (70,71). GALNT6 re-introduction rescued the phenotype, and the ΔT6+T6 cells formed disorganized clusters of cells with multiple small actin-lined lumens (Fig. 3a). Interestingly, no major phenotypic changes were observed after knocking out the close homolog GALNT3 in a WT or ΔT6 cell background (ΔT3 and ΔT3ΔT6) (Fig. 3a). When we examined more than 300 colonies of ΔT6 and ΔT3ΔT6 cells, we found that 69 and 64% formed crypt-like structures, respectively, compared with 1.5% of WT cells, 1% of ΔT3 cells, and 13% of ΔT6+T6 cells (Fig. 3e). Taken together, these results indicate that the specific up-regulation of GalNAc-T6 expression during malignant transformation disrupts colon crypt formation.

Figure 1. [...] ucsc.edu/proj/site/hgHeatmap show the expression of GalNAc-Ts in 288 colon adenocarcinomas and 44 healthy colon tissue samples. Red >0, white = 0, blue <0, gray = no data. The data are normalized by subtracting the mean of the RNAseq values from each sample value for each of the 20 GalNAc-Ts and shown in red or blue color. GalNAc-T6 is specifically up-regulated in colon adenocarcinoma, whereas GalNAc-T3 expression is unchanged. b, immunofluorescence staining of GalNAc-T6 (mAb 2F3) and GalNAc-T3 (mAb 2D10) (green) in colorectal adenocarcinoma and healthy colon mucosa (blue, DAPI). GalNAc-T6 is strongly expressed in tumor tissue and absent in normal tissue, whereas GalNAc-T3 is expressed in both types of tissue. Hematoxylin and eosin (H&E) staining shows the morphology of tumor tissue compared with normal tissue in the present sample. Scale bar, 50 μm.

Table 1 GalNAc-T6 and GalNAc-T3 expression in colon adenocarcinoma
Tissues were evaluated as positive when more than 25% of the cells were labeled. Labeling intensities were scored from 0 (negative) to 3 (high intensity staining).

Cancer-associated expression of human pp-GalNAc-T6

GalNAc-T6 influences the proliferation and differentiation of colon cancer cells
To investigate the functional role of GalNAc-T6 in colon carcinogenesis, we performed RNAseq analysis of LS174T cells that did or did not express GalNAc-T6 and GalNAc-T3 (Table S2). We defined 122 genes that were significantly down-regulated in LS174TΔT6 cells (RPKM(WT) ≥ 10, log2(ΔT6/WT) ≤ −2) (Table S3). STRING (http://string-db.org) and functional enrichment analysis (Gene Ontology terms) revealed that half of these genes (62/122) are involved in the cell cycle (Fig. 4a). Investigation of the proliferative potential of LS174T cells showed slower population doubling for cells lacking GalNAc-T6 than for wildtype cells (Fig. 4b). When GalNAc-T6 expression was restored in LS174TΔT6 cells, the doubling curves phenocopied those of the WT cells. No changes in proliferation were observed in ΔT3 cells. These findings suggest that GalNAc-T6 promotes cellular proliferation in colon cancer.
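The selection criteria quoted above, together with the mirrored criterion stated later in the text for up-regulated genes (RPKM(ΔT6) ≥ 10, log2(ΔT6/WT) ≥ 2), can be written as a short Python sketch. This is only an illustration of the thresholds: the function names, the handling of zero RPKM values, and the example numbers are assumptions and do not describe the pipeline actually used.

```python
import math


def log2_fold_change(ko, wt):
    """log2(KO/WT) with explicit handling of zero RPKM values (an assumption)."""
    if ko == 0 and wt == 0:
        return 0.0
    if wt == 0:
        return math.inf
    if ko == 0:
        return -math.inf
    return math.log2(ko / wt)


def classify_gene(wt_rpkm, ko_rpkm, min_rpkm=10.0, min_abs_log2=2.0):
    """Apply the thresholds from the text to one gene.

    down: RPKM(WT) >= 10 and log2(KO/WT) <= -2
    up:   RPKM(KO) >= 10 and log2(KO/WT) >= 2
    """
    lfc = log2_fold_change(ko_rpkm, wt_rpkm)
    if wt_rpkm >= min_rpkm and lfc <= -min_abs_log2:
        return "down"
    if ko_rpkm >= min_rpkm and lfc >= min_abs_log2:
        return "up"
    return "unchanged"


# Hypothetical (WT, knockout) RPKM pairs; gene names are placeholders.
toy = {
    "gene_a": (40.0, 10.0),   # 4-fold lower in the knockout
    "gene_b": (8.0, 1.0),     # large drop, but WT expression below 10 RPKM
    "gene_c": (3.0, 12.0),    # 4-fold higher in the knockout
    "gene_d": (40.0, 20.0),   # only 2-fold lower
}
calls = {gene: classify_gene(wt, ko) for gene, (wt, ko) in toy.items()}
```

The expression floor is applied to the condition in which the gene is more highly expressed, so weakly expressed genes with large but noisy ratios (such as the hypothetical gene_b) are not called.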
In normal colon crypts, cells arise from stem cells located in the crypt base. As the cells differentiate, they lose their ability to proliferate (Fig. 4c). In cancer, however, a lack of differentiation allows sustained proliferation. The formation of organized crypts in LS174TΔT6 cells suggests a shift toward a more differentiated state and might explain the decrease in proliferative potential. To test this hypothesis, we stained LS174T cells for the leucine-rich repeat-containing G-protein-coupled receptor 5 (LGR5).
LGR5 is considered one of the most selective stem cell markers in the intestine (72), and its expression is restricted to the crypt base of normal colon mucosa. LS174TΔT6 cells showed positive LGR5 staining only at the crypt base of the colonies (Fig. 4d), mimicking the LGR5 expression pattern in healthy colon tissue. In contrast, LS174T cells that expressed GalNAc-T6 had a high number of LGR5-positive cells in the upper layers of the colonies, with increasing staining intensity at the luminal surface, similar to the expression pattern of LGR5 in human colon cancer (73). This may indicate that GalNAc-T6 expression induces a cancer-like LGR5 expression pattern in these colonies (Fig. 4d). The number of LGR5-positive cells was higher in ΔT6+T6 cultures than in ΔT6 cultures, with increased staining toward the top of the colonies, although reintroduction of GalNAc-T6 expression did not completely revert the ΔT6 phenotype.
Next, we assessed the expression of markers that define various differentiation stages in the colon in the transcriptome of LS174T cells, and we found a correlation between the loss of GalNAc-T6 and increased transcription of differentiation marker genes. Conversely, the transcriptome of LS174T cells that expressed GalNAc-T6 had an expression signature that was characteristic of non-differentiated colon cells (Fig. 4e, upper panel, and Table S4). We used the colon adenocarcinoma TCGA transcriptome database to assess whether this GalNAc-T6-dependent gene expression pattern resembled the expression pattern in human colon cancer tissue (Fig. 4e, lower panel). Remarkably, the generally low expression of differentiation markers in LS174T WT cancer cells mimics the low expression of these markers in human colon adenocarcinomas (Fig. 4e). Conversely, the expression profile of LS174T cells lacking GalNAc-T6 resembled the expression profile of healthy colon tissue in that both had decreased expression of stem cell markers and increased expression of genes associated with the differentiated state. Knockout of GALNT3 did not affect the expression of differentiation markers.
The ability to form crypts, keep a low proliferation rate, and retain a highly differentiated state are all properties of healthy colon cells that are lost during carcinogenesis. In LS174T cancer cells, elimination of GalNAc-T6 slows down proliferation and allows the cells to differentiate and form crypt-like structures, which suggests a shift toward a less dysplastic phenotype. To investigate whether the change in expression detected upon knockout of GALNT6 mirrors the difference in gene expression between healthy colon tissue and colon adenocarcinoma, we used the dataset from the TCGA transcriptome database. The vast majority of the 122 genes down-regulated in LS174TΔT6 cells were also expressed at lower levels in healthy colon tissue when compared with colon adenocarcinomas (Fig. 4f). The trend was less clear for the 43 genes that were significantly up-regulated in ΔT6 cells (RPKM(ΔT6) ≥ 10, log2(ΔT6/WT) ≥ 2) (Table S3). In this case, we found only a portion of the genes to be up-regulated in healthy colon tissue compared with colon adenocarcinomas (data not shown). In contrast, the expression of only a few genes was changed in ΔT3 cells (Tables S2 and S3 and data not shown).

GalNAc-T6 controls cell-cell adhesion but does not induce the EMT
GalNAc-T6 expression has previously been associated with induction of the epithelial-mesenchymal transition (EMT) (68). We therefore investigated whether the LS174T transcriptome expressed markers that are associated with the epithelial phenotype (E-cadherin/CDH1) or mesenchymal markers (N-cadherin/CDH2; vimentin/VIM; fibronectin/FN1) (Fig. 5a). We did not detect any notable change in any of these markers or in inducers of EMT (SNAI1/Snail; SNAI2/Slug; TWIST1/Twist; TGFB1/transforming growth factor β), which implies that GalNAc-T6 expression alone does not induce EMT. Knockdown of GalNAc-T6 was recently shown to induce a switch from P-cadherin to E-cadherin expression in pancreatic cancer cells (67). However, our colon transcriptome data indicated the opposite shift, from E-cadherin to P-cadherin expression (Fig. 5b). In addition to P-cadherin, several other proteins involved in adhesion, predominantly cell-cell adhesion, were up-regulated upon GALNT6 knockout, including cadherin-17, CD44, and versican (Fig. 5, b and c). This indicates that expression of GalNAc-T6 disrupts the intercellular adhesive potential. To test this hypothesis, we performed a cellular dissociation assay on confluent sheets of LS174T cells that did or did not express GalNAc-T6. Cell-cell adhesion was significantly stronger in LS174TΔT6 cells than in LS174T WT cells, where the confluent cell sheet was easily disrupted (Fig. 5, d and e). Reintroduction of GalNAc-T6 expression in LS174TΔT6 cells rescued the WT phenotype, indicating that GalNAc-T6 decreases intercellular adhesion in LS174T cell cultures (Fig. 5, d and e). However, whether the change in expression of adhesion molecules is responsible for the changes in cell-cell adhesion that we observed in LS174TΔT6 cells remains to be determined.

Differential O-glycoproteomic analysis identifies substrates glycosylated by GalNAc-T6
To further investigate the role of GalNAc-T6 in colon cancer development, we analyzed GalNAc-T6 target sites using the SimpleCell strategy. The SimpleCell strategy (Fig. 6a) uses homogeneous truncation of O-glycosylation to produce short glycan structures (Tn and STn) that allow enrichment of glycopeptides and determination of glycosites by nanoflow liquid chromatography-tandem mass spectrometry with electron transfer dissociation (ETD). This strategy can be combined with targeted knockout and knockin of individual GALNTs for broad ex vivo discovery of GalNAc-T isoform-specific functions (27,29). SimpleCell versions of LS174T (LS174T SC) cells that do or do not express GalNAc-T6 and/or GalNAc-T3 were developed by ZFN-mediated knockout of COSMC (Fig. 6b and Table S1) (29,32). Stable isotope dimethyl labeling (74) of total tryptic peptide digests from isogenic cell pairs with either light (L) or medium (M) reagent allowed quantitative profiling of GalNAc O-glycopeptides using sensitive O-glycoproteomic mass spectrometry (Fig. 6a) (75). The strategy and the light/medium pairs are shown in Fig. 6c. Elimination of specific GalNAc-T isoforms in LS174T SC cells is thus expected to reveal glycosylation sites specific to those isoforms. Specifically, we looked for down-regulation or loss of M-labeled glycopeptides compared with L-labeled glycopeptides by analysis of the complete data sets from all of the differential O-glycoproteomes (Table S5). Based on earlier observations (27,29), we considered 10-fold down-regulated M-labeled glycopeptides (log10(M/L) ≤ −1) to be GalNAc-T6-specific and/or GalNAc-T3-specific sites. When comparing GalNAc-T3 and GalNAc-T6 directly (Fig. 6c), specific GalNAc-T3 targets were the result of a loss of L-labeled LS174T SC ΔT3 glycopeptides, producing log10(M/L) values that were equal to or higher than 1. The analysis was performed on total cell lysates (TCL) as well as on secretomes (SEC) with two or three replicates for each set.
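The ratio cutoffs described above can be summarized in a brief Python sketch. It assumes that each glycopeptide has one light and one medium intensity, and it treats a completely absent channel as an infinite ratio; the function names, labels, and intensities are hypothetical, and the sketch does not reproduce the replicate filtering or proteome controls used in the study.

```python
import math


def log10_ml_ratio(medium, light):
    """Return log10(M/L); a missing channel gives -inf or +inf, both missing gives None."""
    if medium == 0 and light == 0:
        return None
    if medium == 0:
        return -math.inf
    if light == 0:
        return math.inf
    return math.log10(medium / light)


def classify_glycopeptide(medium, light, cutoff=1.0):
    """Apply the 10-fold cutoff from the text to one glycopeptide.

    medium_depleted: log10(M/L) <= -1, i.e. lost from the M-labeled cell line
    light_depleted:  log10(M/L) >= 1, i.e. lost from the L-labeled cell line
    """
    ratio = log10_ml_ratio(medium, light)
    if ratio is None:
        return "not_quantified"
    if ratio <= -cutoff:
        return "medium_depleted"
    if ratio >= cutoff:
        return "light_depleted"
    return "unchanged"


# Hypothetical (medium, light) intensities for four glycopeptides.
toy = {
    "peptide_1": (1.0, 50.0),    # 50-fold lower in the medium channel
    "peptide_2": (200.0, 10.0),  # 20-fold lower in the light channel
    "peptide_3": (10.0, 12.0),   # essentially unchanged
    "peptide_4": (0.0, 5.0),     # absent from the medium channel
}
calls = {name: classify_glycopeptide(m, l) for name, (m, l) in toy.items()}
```

Which isoform a depleted site is assigned to depends on which knockout line carries which label in a given pair (Fig. 6c), so the sketch reports only the depleted channel.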
We identified 43 potential GalNAc-T6-specific glycosylation sites in LS174T SC ΔT6 cells and 67 sites from the LS174T SC ΔT3 versus LS174T SC ΔT6 comparison; 12 sites were found using both approaches, leaving a total of 98 potential GalNAc-T6-specific sites. In addition, 102 potential GalNAc-T3 targets were identified (35 sites from the LS174T SC versus LS174T SC ΔT3 comparison and 72 sites from the LS174T SC ΔT3 versus LS174T SC ΔT6 comparison, with five sites found using both approaches). Finally, 130 potential GalNAc-T3- and/or GalNAc-T6-specific targets were identified by the comparison of LS174T SC and LS174T SC ΔT3ΔT6 cells (Fig. 6d and Table S5). To refine the results, we only considered targets found in at least two datasets. Furthermore, we rejected all glycosites that were targets of both GalNAc-T6 and GalNAc-T3. Finally, we performed focused proteome analysis (data not shown) to ensure that the observed loss of a glycosite was not due to down-regulation of the acceptor substrate. After this rigorous selection, six glycopeptides in six different proteins were found to be selectively glycosylated by GalNAc-T6 (Fig. 6d and Table 2). Another 82 glycopeptides in 68 proteins were glycosylated by GalNAc-T3 and/or GalNAc-T6 (Fig. 6e and Table S6). The six GalNAc-T6-specific targets that we identified were as follows: melanoma inhibitory activity protein [...]

Up-regulated GalNAc-T6 levels were previously reported in various types of cancer, including breast (60-64), gastric (40), renal (56), and pancreatic (49) carcinomas. Here, we report de novo expression of GalNAc-T6 in colon adenocarcinoma. Approximately 95% of colon cancers are of the adenocarcinoma type. The remaining 5% includes sarcomas and squamous cell and carcinoid tumors. However, the regulation of GalNAc-T6 in these less common tumor types remains to be evaluated.
To assess the importance of GalNAc-T6 expression in differentiation of colonic epithelia, we used the LS174T tumor cell line, which is derived from a human adenocarcinoma (81). LS174T cells are well-differentiated and exhibit unrestricted growth as separate clusters of cells; this is thought to be due to inhibited p21WAF1 expression (69). We demonstrated that the knockout of GALNT6 in LS174T cells reverted their cancer-like growth characteristics and promoted defined tissue organization with the formation of crypt-like structures. In contrast, knockout of the close homolog GALNT3, which is expressed in both healthy colon tissue and in colon cancer, did not induce any changes in growth.
The increased expression of GalNAc-T6 has previously been suggested to promote morphological changes in several human cancers (66,82). In breast cancer, overexpression of GalNAc-T6 was reported to decrease cellular adhesion and disrupt mammary acinar morphogenesis (66,67,82). Furthermore, it has been shown that GalNAc-T6 expression in pancreatic cancer causes a switch from E-cadherin expression to P-cadherin expression, affecting cellular adhesion to the underlying matrix (67). The cancer-associated expression of GalNAc-T6 has also been proposed to induce EMT (66,68,83), as evidenced by decreased E-cadherin expression and enhanced expression of mesenchymal markers (68). Although our results support the idea that GalNAc-T6 expression negatively regulates adhesion of epithelial cells, we did not observe any changes in E-cadherin expression and hence could not confirm the potential effect of GalNAc-T6 on EMT previously observed in prostate cancer cell lines (68).
Because a large proportion of the proteins that pass through the secretory apparatus are potential substrates for GalNAc-T6 (84,85), it is challenging to determine the specific molecular mechanisms underlying the observed oncogenic effects of GalNAc-T6. Using the isogenic LS174T cell model system, we began to characterize the effects associated with the loss of the GALNT6 gene at both the global transcriptomic and the global O-glycoproteomic level.
We confirmed that the transcriptomic profile of LS174T cancer cells resembled the expression profile of human colon cancer found in the CancerBrowser UCSC database. Both data sets showed low expression of colon differentiation markers, such as pyruvate dehydrogenase kinase 1 (PDK1), trefoil factor 2 (TFF2), and alanyl aminopeptidase (ANPEP), and high expression of stem cell markers, such as olfactomedin 4 (OLFM4), achaete-scute family bHLH transcription factor 2 (ASCL2), and SPARC-related modular calcium-binding 2 (SMOC2). In contrast, the expression profile of GALNT6 knockout cells had many features that are found in the transcriptomic profile of healthy colon tissue, e.g. higher expression of differentiation markers and down-regulated expression of stem cell markers. Thus, the transcriptomic data confirmed that loss of GalNAc-T6 causes the LS174T cells to resemble cells found in normal human colon tissue. Furthermore, the transcriptomic analysis revealed a much larger impact by the loss of GalNAc-T6 than by the loss of GalNAc-T3. This implies that inactivation of GALNT6 in LS174T cells has wider consequences than inactivation of GALNT3, supporting the phenotypic characterization of cells with and without GalNAc-T3 and GalNAc-T6. These findings are intriguing, because GalNAc-T3 and GalNAc-T6 have previously been suggested to perform very similar functions.

Figure 6. [...] Red, potential GalNAc-T6 O-glycosylation targets; blue, potential GalNAc-T3 targets; purple, potential GalNAc-T3 and/or GalNAc-T6 targets. d, 98 GalNAc-T6-specific glycosylation sites were identified. Of 25 reproducible targets, 19 targets were also detected as GalNAc-T3 specific, non-specific, or false positives due to up-regulated protein levels, leaving six specific targets for GalNAc-T6. Of 102 GalNAc-T3 targets, no specific targets were detected. e, 82 GalNAc-T3/GalNAc-T6 targets were detected, which included six GalNAc-T6-specific targets. There were no GalNAc-T3-specific targets, suggesting that GalNAc-T3 and GalNAc-T6 have overlapping functions, with just a few GalNAc-T6-specific O-glycosylation targets.

Table 2 GalNAc-T6-specific O-glycosylation sites
Six GalNAc-T6-specific O-glycosylation sites were identified as described in Fig. 6. CCDC14 was excluded from the table due to its presumed cytoplasmic localization, which could question the nature of the HexNAc residues. When several O-glycans are depicted, the precise position of the GalNAc-T6-specific O-glycosylation could not be determined.

Our global differential O-glycoproteomic analysis of cells with and without GalNAc-T3 and GalNAc-T6 further confirmed specific functions related to GalNAc-T6 but not GalNAc-T3 in LS174T cells. We used our previously developed SimpleCell strategy (26) and performed proteome-wide analysis to identify O-glycoproteins and their O-glycan attachment sites, directly comparing the O-glycoproteomes of isogenic cell lines with and without GalNAc-T6 and/or GalNAc-T3. We identified 81 shared O-glycosylation targets for GalNAc-T6 and GalNAc-T3. Surprisingly, of these targets, only six were GalNAc-T6-specific, whereas none of the targets were specific for GalNAc-T3. Interestingly, several of the GalNAc-T6-specific target proteins play important roles in cell-cell and cell-matrix adhesion. For example, MIA3 (melanoma inhibitory activity family, member 3) has an ortholog, TANGO1, that is important for transportation and polarized secretion of collagen 7 in Drosophila and mice (86-88). Another interesting GalNAc-T6 target is the ephrin type-B receptor 6 (EphB6) (89). Ephrin receptors are a family of transmembrane proteins involved in cell adhesion and migration (90). Interestingly, the GalNAc-T6 glycosylation site in EphB6 is located in close proximity to the proposed binding pocket, which opens the possibility that glycosylation interferes with ephrin B ligand binding. A few other GalNAc-T6-specific targets were found, including SLITRK1, which is associated with Tourette's syndrome (91) and is homologous to the Slit proteins. Slit proteins are important for regulating axonal guidance, cell migration, and axonal branching by altering cellular adhesiveness and cytoskeletal organization (92,93). The effect of SLITRK1 on synaptic adhesion is mediated through two extracellular leucine-rich repeats (LRR1 and LRR2) that interact with presynaptic leukocyte common antigen-related receptor protein-tyrosine phosphatases (LAR-RPTPs) (94).
Interestingly, the GalNAc-T6-specific site in SLITRK1 is localized in close proximity to the binding interface with LAR-RPTPs. It is, however, an open question whether the GalNAc-T6-mediated glycosylation influences SLITRK1 functions, such as lateral assembly of LAR-RPTP-Slitrk complexes (94), and whether the interaction between SLITRK1 and LAR-RPTPs is important for adhesion between colonic cells. Notably, the molecular mechanisms by which these individual site-specific glycosylation events affect the function of proteins are not easy to elucidate and will require an in-depth analysis of each individual protein in future studies. We and others have demonstrated that site-specific O-glycosylation of proteins affects protein function in several ways, including proprotein processing, modulation of the ligand-binding properties of receptors, and regulation of ectodomain shedding and cell signaling (30-33). One prominent example of the function of single O-glycan sites in receptor dimerization comes from the demonstration that site-specific O-glycosylation of the granulocyte-CSF receptor regulates receptor homodimerization and signaling. Moreover, a common somatic mutation in a single O-glycosite is a cancer driver in a large percentage of patients with chronic neutrophilic leukemia (95). It is thus very likely that GalNAc-T6 can influence the function of its protein targets in several ways, but further studies are needed to identify the molecular mechanisms underlying GalNAc-T6-mediated regulation of protein function.
A potential role for GalNAc-T6 in modulation of the interaction with the immune system was suggested by the site-specific glycosylation of HLA class II. The GalNAc-T6-specific site was localized in the HLA class II histocompatibility antigen γ chain, also known as CD74 or invariant chain, which is involved in the stabilization and transport of MHC class II proteins (96,97). Disrupting the interaction between CD74 and MHC class II molecules, which could be the result of GalNAc-T6-specific O-glycosylation, might decrease the presence of MHC class II molecules on the cell surface, leading to immune evasion of cancer cells. Although validation in future studies is required, it would be intriguing if a site-specific cancer-associated glycosylation event directly interfered with the expression of MHC class II on the cancer cell surface. The cancer-specific expression of GalNAc-T6 might also interact with the immune system indirectly through induction of cancer-associated autoantibodies targeting novel cancer-specific glycopeptide epitopes (98-102). Such site-specific glycosylation events may also represent potential targets for monoclonal antibody therapy (98,103), and our findings may provide support for the development of immunotherapeutic strategies that target aberrant O-glycoproteins or glycopeptide epitopes created by GalNAc-T6. Furthermore, the cancer-specific expression of GalNAc-T6 in colon cancer also suggests that isoform-specific inhibitors could prove useful in the future treatment of human cancers.
In conclusion, we here present evidence that overexpression of GalNAc-T6, as observed in several types of epithelial cancer, intrinsically promotes an oncogenic phenotype in the LS174T colon cancer cell model. This phenotype is characterized by increased proliferation and dysplasia, compromised adhesion, and the loss of normal differentiation, all of which are characteristics of colon cancer (104). Our study demonstrates that GalNAc-T6 is a potential key regulator of the malignant phenotype in colon cancer and suggests that overexpression of GalNAc-T6 is an early event during cancer development that provides a permissive environment for malignant evolution. In this context, it will be of interest to test the effect of overexpression of GalNAc-T6 in healthy colon cells in future studies.

Tissues
Tissue microarrays were purchased from U.S. Biomax, Inc. The frozen tissue samples on the tissue microarray (FCO401b) were from colorectal adenocarcinoma patients. Healthy control samples were evaluated on a frozen multiple-organ normal tissue array. The sections were fixed in cold 10% buffered neutral formalin for 15 min or in cold acetone for 10 min. Immunohistochemistry was performed as described under "Immunofluorescence" below.

Cell culture
The human colon adenocarcinoma LS174T cell line was chosen as a model system based on its phenotypic characteristics and stable genome (70,81). Cells were grown in 50% DMEM 1965 and 50% Ham's F12 supplemented with 10% FBS and 1% L-glutamine. Prior to staining, cells were trypsinized and seeded on diagnostic slides, dried overnight, and fixed in ice-cold acetone for 5-10 min. Alternatively, cells were cultured directly on coverslips in 24-well plates (5 × 10^4 cells/well) for 4 days and fixed directly in the wells with 4% paraformaldehyde and 1% Triton X-100 for at least 2 h at room temperature. Bound mAbs were detected with FITC-conjugated rabbit anti-mouse immunoglobulin (1:100; Dako, Denmark), anti-mouse Alexa 488 (1:500; Invitrogen), or swine anti-rabbit FITC (1:100; Dako). Actin was detected with Alexa 594-conjugated phalloidin (1:500; Invitrogen). Slides were mounted with Prolong Gold antifade reagent with DAPI (Invitrogen). Fluorescence micrographs were obtained on a Leica wide-field fluorescence microscope or a Zeiss LSM710 confocal microscope. Image processing was performed in ImageJ.

ZFN knockout gene targeting
LS174T ΔT3 and/or ΔT6 cells in a WT and SimpleCell (COSMC knockout) background were generated as described previously (26). Briefly, ZFN constructs targeting COSMC, GALNT6, and GALNT3 were custom produced (Sigma), and LS174T cells were transfected with 2 μg of GFP- or DsRed2-tagged ZFN plasmids (105) using nucleofection with Amaxa Nucleofector (Lonza). GFP+/DsRed2+ cells were enriched by fluorescence-activated cell sorting (FACS) and single-cell cloned by limiting dilution. Indels at the respective target sites were characterized by Indel Detection by Amplicon Analysis (IDAA) (106), and indels identified in the selected cell clones were confirmed by Sanger sequencing.

Precise GALNT6-targeted integration
Precise targeted integration of GALNT6 into GALNT6−/− cells for stable expression of GALNT6 was performed using the ObLiGaRe knockin strategy (107) targeting the AAVS1 locus (also known as the PPP1R12C locus) on human chromosome 19. Briefly, an ObLiGaRe donor scaffolding vector was constructed encompassing the left and right inverted AAVS1 ZFN-binding sites flanking a CMV-GALNT6-Bgh-UTR expression cassette surrounded by insulator sequences. LS174T ΔT6 cells were transfected with 5 μg of ObLiGaRe-GALNT6 vector plasmid and 2.5 μg of each AAVS1 ZFN pair (Sigma) using nucleofection (Amaxa Nucleofector; Lonza). Nucleofected cells were single-cell cloned by limiting dilution, and GALNT6 knockin clones were screened for GalNAc-T6 expression by immunohistochemistry using our well-characterized anti-GalNAc-T6 monoclonal antibody (2F3) (108). Precise targeted integration was verified by junction PCR across the target integration sites.

RNA transcriptomic analysis
Total RNA was extracted from exponentially growing cells using the RNeasy kit (Qiagen). RNA integrity and quality were assessed using Bioanalyzer instrumentation (Agilent Technologies). The analysis was performed on total RNA from one clone of each of the following types of cells: LS174T WT, ΔT6, ΔT6+T6, ΔT3, and ΔT3ΔT6. Transcriptome analysis of the extracted total RNA samples was performed by the Beijing Genomics Institute (BGI) as described previously (29). Briefly, a library was constructed using the Illumina TruSeq RNA Sample Preparation Kit and subjected to PCR amplification and quality control before undergoing next-generation sequencing with the Illumina HiSeq 2000 System (Illumina, San Diego).

Bioinformatics analysis
Bioinformatics analysis was performed as described previously (29). Briefly, aligned reads from the RNA-seq analysis were analyzed using the DESeq (109) and edgeR (110) packages for R and Bioconductor to identify differentially expressed transcripts. DESeq and edgeR analyses were run using default parameters following previously described protocols (111).

Proliferation assay
A total of 5 × 10^4 cells/well were plated in duplicate in 24-well dishes on day 0 for each time point. Cells were trypsinized and counted on days 4, 6, and 8.

Mass spectrometry and data analysis
An EASY-nLC 1000 UHPLC (Thermo Fisher Scientific) interfaced via a nanoSpray Flex ion source to an LTQ-Orbitrap Velos Pro mass spectrometer (Thermo Fisher Scientific) was used for analysis. The nLC was operated in a single analytical column setup using PicoFrit Emitters (New Objective, 75 μm inner diameter) packed in-house with Reprosil-Pure-AQ C18 phase (Dr. Maisch, 1.9-μm particle size, 19-21 cm column length). Each sample dissolved in 0.1% formic acid was injected onto the column and eluted in a gradient from 2 to 20% B in 95 min, from 20 to 80% B in 10 min, and 80% B for 15 min at 200 nl/min (solvent A, 100% H2O; solvent B, 100% acetonitrile; both containing 0.1% (v/v) formic acid).
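The elution program described above can be written out as a small piecewise function. This is only an illustrative sketch: it assumes that each ramp is linear and that the three steps run back to back (120 min in total), neither of which is stated explicitly in the text. The function name `percent_b` is hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Return the solvent B percentage at time t_min (minutes).

    Segments taken from the text: 2 -> 20% B over 95 min,
    20 -> 80% B over 10 min, then 80% B held for 15 min.
    Linear ramps are an assumption of this sketch.
    """
    # (start time, end time, %B at start, %B at end)
    segments = [
        (0.0, 95.0, 2.0, 20.0),
        (95.0, 105.0, 20.0, 80.0),
        (105.0, 120.0, 80.0, 80.0),
    ]
    if t_min < 0.0 or t_min > 120.0:
        raise ValueError("time outside the 0-120 min program")
    for start, end, b_start, b_end in segments:
        if t_min <= end:
            # Linear interpolation within the current segment.
            return b_start + (b_end - b_start) * (t_min - start) / (end - start)
    raise AssertionError("unreachable")


# Total solvent delivered at the stated flow rate of 200 nl/min.
TOTAL_VOLUME_UL = 200 * 120 / 1000  # 24 microlitres over 120 min
```

For example, `percent_b(100)` falls halfway through the steep second ramp and evaluates to 50% B under the linearity assumption.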
A precursor MS1 scan (m/z 350-1,700) of intact peptides was acquired in the Orbitrap at a nominal resolution setting of 60,000, followed by Orbitrap HCD-MS2 and ETD-MS2 (m/z 120-2,000) of the five most abundant multiply charged precursors in the MS1 spectrum. A minimum MS1 signal threshold of 50,000 was used for triggering data-dependent fragmentation events, and MS2 spectra were acquired at a resolution of 15,000 for HCD-MS2 and 30,000 for ETD-MS2. Isolation width was 3 mass units, and usually one microscan was collected for each spectrum. Automatic gain control targets were 500,000 ions for Orbitrap MS1 and 100,000 for MS2 scans. Supplemental activation (25%) of the charge-reduced species was used in the ETD analysis to improve fragmentation. Dynamic exclusion for 60 s was used to prevent repeated analysis of the same components. Polysiloxane ions at m/z 445.12003 were used as a lock mass in all runs.
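The data-dependent selection rules in this paragraph (five most abundant multiply charged precursors, a minimum signal of 50,000, and 60 s of dynamic exclusion) can be sketched as follows. This illustrates the logic only and is not the instrument's acquisition software: the `Peak` class, the `select_precursors` function, and the `mz_tol` matching window for exclusion are hypothetical, and the width of the exclusion window is not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float
    intensity: float
    charge: int


def select_precursors(peaks, now_s, excluded, top_n=5,
                      min_signal=50_000.0, exclusion_s=60.0, mz_tol=0.01):
    """Pick up to top_n precursors from one MS1 scan.

    `excluded` maps a previously selected m/z to the time (s) it was
    selected and is updated in place. `mz_tol` is an assumed value.
    """
    # Forget exclusions that are older than the exclusion window.
    for mz in [m for m, t in excluded.items() if now_s - t >= exclusion_s]:
        del excluded[mz]
    # Keep multiply charged peaks above threshold that are not excluded.
    candidates = [
        p for p in peaks
        if p.charge >= 2
        and p.intensity >= min_signal
        and not any(abs(p.mz - m) <= mz_tol for m in excluded)
    ]
    # Most abundant first, truncated to the top_n precursors.
    chosen = sorted(candidates, key=lambda p: p.intensity, reverse=True)[:top_n]
    for p in chosen:
        excluded[p.mz] = now_s
    return chosen
```

Calling the function on successive scans with the same `excluded` dictionary shows the effect of dynamic exclusion: precursors fragmented in one scan are skipped until 60 s have elapsed, which lets less abundant species be sampled in between.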
Data processing was performed using Proteome Discoverer 1.4 software (Thermo Fisher Scientific) with the Sequest HT node as a search engine. In all cases, the precursor mass tolerance was set to 10 ppm and the fragment ion mass tolerance to 20 milli-mass units. All spectra were initially searched at the full cleavage specificity, filtered according to the confidence level (medium, low, and unassigned), and further searched with the semi-specific enzymatic cleavage. Up to two missed cleavages were allowed. Carbamidomethylation on cysteine residues was used as a fixed modification. Methionine oxidation and HexNAc and HexHexNAc attachment to serine, threonine, and tyrosine were used as variable modifications for ETD-MS2. All HCD-MS2 spectra were pre-processed as described (75) and searched under the same conditions mentioned above using only methionine oxidation as a variable modification. All spectra were searched against a concatenated forward/reverse human-specific database (UniProt, January 2013, containing 20,232 canonical entries) using a target false discovery rate of 1%. In addition, 251 common contaminants and 3,187 entries of viruses known to infect humans were included in the search.
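The mass tolerances and the false discovery rate criterion used in the search can be illustrated with a short sketch. It does not reproduce Proteome Discoverer or Sequest HT internals: the function names are hypothetical, and the decoys/targets ratio is one common estimator for concatenated forward/reverse searches, assumed here rather than taken from the text.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def precursor_ok(observed_mz, theoretical_mz, tol_ppm=10.0):
    """Precursor tolerance from the text: 10 ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def fragment_ok(observed_mz, theoretical_mz, tol_mmu=20.0):
    """Fragment tolerance from the text: 20 milli-mass units (0.020 m/z)."""
    return abs(observed_mz - theoretical_mz) <= tol_mmu / 1000.0


def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score at which decoys/targets stays <= max_fdr.

    `psms` is a list of (score, is_decoy) pairs; higher scores are better.
    Returns None if no cutoff satisfies the criterion.
    """
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in sorted(psms, key=lambda x: x[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

At m/z 500, a 10 ppm window corresponds to ±0.005, so the precursor criterion is considerably tighter than the absolute 0.020 fragment window at that mass.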

Cell-cell adhesion assay
Cells grown to 100% confluence in 24-well plates were washed in HBSS and treated with 2.4 mg/ml dispase in HBSS for 20 min, leaving intact cell layers that were detached from the plastic wells. PBS (1 ml) was added carefully, and after the cell suspension was pipetted up and down three times, fractions with more than 20 cells were counted.