Global Mapping of O-Glycosylation of Varicella Zoster Virus, Human Cytomegalovirus, and Epstein-Barr Virus

Herpesviruses are among the most complex and widespread viruses, infection and propagation of which depend on envelope proteins. These proteins serve as mediators of cell entry as well as modulators of the immune response and are attractive vaccine targets. Although envelope proteins are known to carry glycans, little is known about the distribution, nature, and functions of these modifications. This is particularly true for O-glycans; thus we have recently developed a “bottom up” mass spectrometry-based technique for mapping O-glycosylation sites on herpes simplex virus type 1. We found wide distribution of O-glycans on herpes simplex virus type 1 glycoproteins and demonstrated that elongated O-glycans were essential for the propagation of the virus. Here, we applied our proteome-wide discovery platform for mapping O-glycosites on representative and clinically significant members of the herpesvirus family: varicella zoster virus, human cytomegalovirus, and Epstein-Barr virus. We identified a large number of O-glycosites distributed on most envelope proteins in all viruses and further demonstrated conserved patterns of O-glycans on distinct homologous proteins. Because glycosylation is highly dependent on the host cell, we tested varicella zoster virus-infected cell lysates and clinically isolated virus and found evidence of consistent O-glycosites. These results present a comprehensive view of herpesvirus O-glycosylation and point to the widespread occurrence of O-glycans in regions of envelope proteins important for virus entry, formation, and recognition by the host immune system. This knowledge enables dissection of specific functional roles of individual glycosites and, moreover, provides a framework for design of glycoprotein vaccines with representative glycosylation.

Herpesviridae is a family of enveloped viruses that infect a wide range of hosts (1), including humans, causing clinical manifestations of varying severity. Herpesvirus infection is characterized by a primary lytic infection followed by a lifelong latency established in the host (2). The Herpesviridae family is classified into three subfamilies: alphaherpesviruses (HSV-1, HSV-2, and VZV), betaherpesviruses (HCMV, HHV-6, and HHV-7), and gammaherpesviruses (EBV and Kaposi's sarcoma-associated herpesvirus) (3). Although most herpesvirus infections are self-limiting, they can lead to severe complications, particularly in immunocompromised patients (4-9). Furthermore, gammaherpesvirus infections have been associated with cancer development (10,11).
Herpesviruses have large genomes, encoding from ~70 to more than 230 distinct viral proteins depending on the virus type (12). The viral proteins include enzymes involved in viral DNA replication, proteins that form the viral particle, and viral envelope glycoproteins (3). The viral envelope glycoproteins play major roles in the early interactions, attachment, and penetration of the viral particle into the host cell and are involved in modulation of the immune system necessary to maintain the lifelong relationship between a herpesvirus and the infected host (13-15).
The envelope glycoproteins are processed and modified in the secretory compartment of the host cell where they are decorated with glycans that contribute to their biological properties (16). The most common classes of glycans found on viral envelope proteins are N-linked glycans, attached to asparagine residues of the polypeptide, and GalNAc-type O-linked glycans, attached to serine, threonine, or tyrosine residues (hereafter referred to as O-linked glycans) (17,18). Whereas the functions and structures of N-linked glycans of viral glycoproteins have been elucidated in detail for most enveloped viruses, including herpesviruses, the distribution, structures, and functions of the O-glycans on viral glycoproteins have mostly remained elusive (19-28). This is largely due to the lack of a reliable prediction algorithm and analytical difficulties in characterizing O-glycan sites on a proteome-wide basis. Moreover, the O-glycosylation capacity of cells varies, and thus analysis of O-glycosylation of a virus needs to take the host cell type into account. O-Glycosylation is initiated by up to 20 polypeptide GalNAc transferase isoforms with distinct substrate specificities, and cells express different subsets of these isoforms to regulate the O-glycosylation capacity (18,29,30). Furthermore, a large repertoire of elongating and branching enzymes creates a heterogeneous pool of O-glycan structures within a given cell and even at a given site, making it technically challenging to identify O-glycosylation sites in the right biological context. These characteristics of O-glycosylation have all contributed to the elusive nature of site localization. However, our recent introduction of a mass spectrometry-based proteome-wide discovery platform for mapping viral O-glycosylation sites has changed this view considerably (31,32). Using HSV-1 and HSV-2 as models, we uncovered an unprecedented number of O-glycosites in functionally relevant regions of viral envelope glycoproteins (31,33). In contrast to the general consensus, we also demonstrated that HSV-1 envelope glycoproteins carry more O-glycans than N-glycans (31).
O-Linked glycans may be found as single scattered glycans, or they may be concentrated in dense clusters of glycans in mucin-like domains of viral glycoproteins (31,33). Important biological functions are associated with both types of glycans. For instance, clustered O-glycans of the distinct HSV-1 glycoprotein C have been found to be necessary not only for adjusting viral binding to its initial receptor, heparan sulfate, but also for preventing progeny virus from entrapment on the dying cell in which the virus replicated (21). An example of the functional role of single O-glycans emerged with the demonstration of a few O-glycosites on HSV-1 gB involved in the interaction with entry receptor paired immunoglobulin-like type 2 receptor α, possibly important for immune evasion (20,31). Altogether, such data, including our recent demonstration of the importance of O-glycans for HSV-1 and HSV-2 propagation as well as HSV-2 immune sensing (31,33,34), emphasize important functions of O-glycans on herpesvirus envelope glycoproteins.
Based on these findings, we hypothesized that other herpesvirus family members may also be extensively O-glycosylated. To provide the first global map of O-glycosylation of a large virus family, in the present study we characterized the O-glycoproteomes of three clinically relevant Herpesviridae family members: VZV, HCMV, and EBV. Using our proteome-wide discovery platform, we identified a large number of O-glycosylation sites in multiple glycoproteins from the three viruses. In addition, we report for the first time the O-glycoproteome obtained from a clinical VZV specimen. The presented data sets serve as a resource for exploring biological functions of specific O-glycan sites and structures, as well as a reference for design and testing of vaccines.

Experimental Procedures
Cells and Viruses - Diploid human embryonic lung fibroblasts (35) (HEL, obtained from the cell culture collection at the Sahlgrenska University Hospital, Department of Clinical Microbiology, Gothenburg, Sweden) at a low passage level were cultivated in Eagle's minimum essential medium (Gibco, Life Technologies) with 10% FCS (Sigma), 100 IU/ml penicillin, 100 μg/ml streptomycin (Gibco, Life Technologies), and 2 mM L-glutamine. P3HR1 (ATCC HTB-62) B lymphocytes isolated from Burkitt's lymphoma were maintained at a concentration between 4 × 10⁵ and 8 × 10⁵ cells/ml in RPMI 1640 (Gibco, Life Technologies), supplemented as above. All cells were maintained at 37°C and 5% CO2. The HCMV laboratory strain Towne (ATCC-977) was used throughout the study. The virus titers were determined by plaque titration on HEL cells as previously described (36). A VZV patient isolate C821 that was typed by PCR (37) and subsequently passaged in HEL cells was used for the VZV cell culture experiments. The patient VZV isolate DE14 8565 from a zoster blister was obtained with a specimen collection swab (eSwab; Copan Diagnostics, Murrieta, CA) and stored in 1 ml of Amies medium until preparation. Written informed consent was obtained from the patient prior to sampling. Except for age, no clinical information about the patient was registered. Because patient and sample identity was anonymized, ethical approval was not required.
VZV Infection in Cell Culture - Cell-associated VZV (180,000 particles/cell as determined by quantitative PCR) was added to HEL fibroblasts in T175 cell culture flasks (6×). 5 ml of VZV-infected HEL fibroblasts was added to each T175 cell culture flask, generating a final ratio of 0.25 VZV-infected cells per uninfected HEL fibroblast. The virus was allowed to attach to the cells for 3 h at 37°C and 5% CO2 before the inoculum was removed and fresh growth medium was added. The cells were incubated for 3-6 days until a strong cytopathic effect (+++ CPE) was detected. The cells were washed with ice-cold PBS and harvested by scraping with a rubber policeman in ice-cold PBS, followed by centrifugation (500 × g, 10 min at 4°C).
HCMV Infection in Cell Culture - HCMV Towne at a multiplicity of infection of 0.1 pfu/cell was added to confluent monolayers of HEL fibroblasts in T175 cell culture flasks (5×). The viral particles were allowed to attach to the cells for 1 h at 37°C and 5% CO2 before the inoculum was removed, and new growth medium was added. The cells were harvested after 14 days by scraping with a rubber policeman in ice-cold PBS, followed by centrifugation (500 × g, 10 min at 4°C).
Stimulation of P3HR1 Cells for EBV Activation - 4 × 10⁷ to 8 × 10⁷ pelleted cells (216 × g for 5 min at room temperature) were resuspended in 100 ml of growth medium supplemented with 20 ng/ml phorbol 12-myristate 13-acetate and 3 mM sodium butyrate (Sigma-Aldrich) in a 165-cm² culture flask. The cells were incubated at 37°C and 5% CO2 for 4 days and then harvested by centrifugation (320 × g for 10 min at 4°C).
O-Glycoproteomic Analysis - O-Glycoproteomic analysis of infected cell lysates was performed as previously described (31) with several modifications. Briefly, the cell pellet was resuspended in 0.1% RapiGest (Waters) in 50 mM ammonium bicarbonate and lysed using a sonic probe. Cleared cell lysates were reduced, alkylated, and treated with 5 units of peptide-N-glycosidase F (Roche) overnight at 37°C, followed by digestion with trypsin (Roche) or chymotrypsin (Roche) for 12 h at 37°C. The clinical VZV sample was digested only with trypsin because of a limited amount of material. The peptide-N-glycosidase F treatment was then repeated, followed by a 2-h incubation with trypsin/chymotrypsin. A 1:100-1:200 protease to protein ratio by weight was used, with 75% of the protease amount added for the 12-h incubation and 25% added for the 2-h incubation. The samples were then treated with concentrated trifluoroacetic acid (up to 0.5% (v/v), 20 min at 37°C) and cleared by centrifugation (10,000 × g, 10 min). The cleared digests were purified on C18 Sep-Pak (Waters) and treated with 0.1 unit/ml of Clostridium perfringens neuraminidase (Sigma) in 50 mM sodium acetate, pH 5.0, at 37°C for 2 h. T (Galβ1-3GalNAcα1-O-Ser/Thr) and Tn (GalNAcα1-O-Ser/Thr) glycopeptides were sequentially enriched using peanut agglutinin and Vicia villosa lectin weak affinity chromatography. C18 Stage-tip desalted lectin weak affinity chromatography fractions were screened by preliminary LC-MS for glycopeptide content, and those most enriched in glycopeptides were pooled together and further fractionated by isoelectric focusing.
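The protease dosing described above (a 1:100 to 1:200 protease to protein ratio by weight, split 75%/25% between the 12-h and 2-h incubations) is simple arithmetic, and can be sketched as follows. The function name `protease_split` and the example protein amount are hypothetical and are not taken from the original protocol.

```python
def protease_split(protein_ug, ratio=100, first_fraction=0.75):
    """Return (first_dose_ug, second_dose_ug) of protease.

    protein_ug     -- total protein in the digest, in micrograms (illustrative input)
    ratio          -- N in a 1:N protease-to-protein ratio by weight (100 to 200 in the text)
    first_fraction -- share of protease added for the 12-h incubation (0.75 in the text)
    """
    if protein_ug < 0 or ratio <= 0 or not 0 <= first_fraction <= 1:
        raise ValueError("invalid input")
    total_protease = protein_ug / ratio
    first_dose = total_protease * first_fraction
    return first_dose, total_protease - first_dose


# Example with a hypothetical 1000 ug of protein at the two ends of the stated range.
print(protease_split(1000, ratio=100))
print(protease_split(1000, ratio=200))
```

For a hypothetical 1 mg of protein, the 1:100 ratio corresponds to 10 μg of protease in total, of which 7.5 μg would go into the 12-h step and 2.5 μg into the 2-h step.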
nLC-MS2 - An EASY-nLC 1000 UHPLC (Thermo Scientific) interfaced via a nanoSpray Flex ion source to an Orbitrap Fusion MS (Thermo) was used for analysis. The nLC was operated in a single analytical column setup using PicoFrit Emitters (New Objective; 75-μm inner diameter) packed in-house with Reprosil-Pure-AQ C18 phase (Dr. Maisch, 1.9-μm particle size, 19-21-cm column length). Each sample dissolved in 0.1% formic acid was injected onto the column and eluted in a gradient from 2 to 20% B in 95 min, from 20% to 80% B in 10 min, and 80% B for 15 min at 200 nl/min (solvent A, 100% H2O; solvent B, 100% acetonitrile; both containing 0.1% (v/v) formic acid). A precursor MS1 scan (m/z 350-1,700) of intact peptides was acquired in the Orbitrap at a nominal resolution setting of 120,000, followed by Orbitrap HCD-MS2 and ETD-MS2 (m/z of 75-2,000) of the five most abundant multiply charged precursors in the MS1 spectrum; a minimum MS1 signal threshold of 50,000 was used for triggering data-dependent fragmentation events; MS2 spectra were acquired at a resolution of 60,000 for both HCD-MS2 and ETD-MS2. Maximum injection times were 75 and 150 ms for HCD and ETD fragmentation, respectively; isolation width was at 3 with quadrupole, and usually one microscan was collected for each spectrum. Automatic gain control targets were 5 × 10⁴ for MS1 and 1 × 10⁵ for MS2 scans. Supplemental activation (25%) of the charge-reduced species was used in the ETD analysis to improve fragmentation. Dynamic exclusion for 60 s was used to prevent repeated analysis of the same components. Polysiloxane ions at m/z 445.12003 were used as a lock mass in all runs. For the clinical VZV sample, an LTQ-Orbitrap Velos Pro spectrometer (Thermo Scientific) was used as previously described (31).
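The elution program above amounts to a 120-min piecewise profile. As a minimal sketch, and assuming that each ramp is linear and that the program starts at 2% B at time zero (neither is stated explicitly in the text), the solvent B percentage at any time point can be computed as follows. The names `GRADIENT` and `percent_b` are illustrative only.

```python
# (time in min, % solvent B) breakpoints taken from the stated gradient:
# 2 -> 20% B in 95 min, 20 -> 80% B in 10 min, then 80% B held for 15 min.
GRADIENT = [(0.0, 2.0), (95.0, 20.0), (105.0, 80.0), (120.0, 80.0)]


def percent_b(t_min):
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-120 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 47.5, 95, 100, 120):
    print(t, percent_b(t))
```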
Data Analysis - Data processing was performed using Proteome Discoverer 1.4 software (Thermo Scientific) as previously described with small changes (32). The Sequest HT node was used instead of Sequest. All spectra were initially searched at the full cleavage specificity, filtered according to the confidence level (medium, low, and unassigned), and further searched with the semi-specific enzymatic cleavage. Up to two missed cleavages were allowed. For Orbitrap Fusion MS-derived data, the precursor mass tolerance was set to 5 ppm, and the fragment ion mass tolerance was set to 20 mmu (milli mass units). For LTQ-Orbitrap Velos Pro MS-derived data, the precursor mass tolerance was set to 7 ppm. Carbamidomethylation on cysteine residues was used as a fixed modification. Methionine oxidation and HexNAc and HexHexNAc attachment to serine, threonine, and tyrosine were used as variable modifications for ETD-MS2. All HCD-MS2 were preprocessed as described (32) and searched under the same conditions mentioned above using only methionine oxidation as variable modification. All spectra were searched against a concatenated forward/reverse human-specific database (UniProt, January 2013, containing 20,232 canonical entries) using a target false discovery rate of 1%; another 251 common contaminants and 3187 entries of viruses known to infect humans were also included in the search. An additional database of HCMV Towne protein entries was used because they were not present in the abovementioned database. The false discovery rate was calculated using the target decoy PSM validator node, a part of the Proteome Discoverer workflow. The resulting list was filtered to include only peptides with glycosylation as a modification. This resulted in a final glycoprotein list identified by at least one unique glycopeptide. ETD-MS2 data were used for unambiguous site assignment. HCD-MS2 data were used for unambiguous site assignment only if the number of HexNAc residues was equal to the number of potential sites on the peptide. Data analysis was assisted by manual validation.
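Two of the rules in this section lend themselves to a short illustration: the precursor mass tolerance expressed in ppm, and the criterion for accepting HCD-MS2 data for site assignment. The sketch below is not the Proteome Discoverer implementation; the function names are hypothetical, and "potential sites" is assumed to mean every serine, threonine, or tyrosine residue in the peptide, in line with the variable modifications listed above.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the precursor lies within tol_ppm (5 ppm for Fusion data, 7 ppm for Velos Pro data)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def hcd_sites_unambiguous(peptide, n_hexnac):
    """HCD rule from the text: sites are unambiguous only when the number of
    HexNAc residues equals the number of potential sites (assumed here: S, T, Y)."""
    potential_sites = sum(peptide.count(residue) for residue in "STY")
    return n_hexnac > 0 and n_hexnac == potential_sites


# One of the gC tandem-repeat peptides has three potential sites (T, S, S).
print(hcd_sites_unambiguous("KPDPAVAPTSAASR", 3))
print(hcd_sites_unambiguous("KPDPAVAPTSAASR", 2))
```

With three HexNAc residues on a peptide that has exactly three Ser/Thr/Tyr residues, every candidate position must be occupied, so no fragment-level localization is needed. With fewer HexNAc residues than candidate positions, ETD-MS2 evidence is required.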

Results
Mapping O-Glycosites in Human Herpesviruses - We applied our recently developed mass spectrometry-based approach (31) to map O-glycosites in VZV, HCMV, and EBV. VZV- or HCMV-infected HEL fibroblasts, as well as the EBV-transformed P3HR1 human Burkitt's lymphoma B cell line, were used for O-glycoproteomic analysis. In addition, we had a unique opportunity to analyze VZV-infected clinical material taken directly from a herpes zoster patient and compare it with the glycoproteome derived from a cell lysate. The major O-glycan structures produced in HEL fibroblasts are sialylated core-1 O-glycans (ST; Neu5Acα2-3Galβ1-3GalNAcα1-O-Ser/Thr), and during HSV-1 infection we have observed an increased amount of the truncated O-glycan structure Tn (GalNAcα1-O-Ser/Thr) (31). Because it is essential to enrich glycopeptides in total protease digests of complex mixtures of proteins, we used our established two-step sequential lectin enrichment strategy by peanut agglutinin and V. villosa lectin to capture desialylated T (Galβ1-3GalNAcα1-O-Ser/Thr) and Tn (GalNAcα1-O-Ser/Thr) glycopeptides from virus-infected cell digests. To increase protein sequence coverage, we used trypsin and chymotrypsin digestion in parallel. We identified 53, 122, and 41 novel GalNAc-type O-glycosites on 6, 28, and 6 glycoproteins in VZV, HCMV, and EBV, respectively (Tables 1-3 and supplemental Data Sets S1-S5). Comparable numbers of T and Tn glycopeptides were identified in VZV-infected samples (Table 1 and supplemental Data Sets S1-S3), whereas markedly higher numbers of Tn glycopeptides were found in HCMV- and EBV-infected samples (Tables 2 and 3 and supplemental Data Sets S1, S4, and S5). Among the identified viral glycopeptides, by far the majority (>98.5%) belonged to proteins exposed to the lumenal side of the secretory pathway (supplemental Data Set S1). We did identify a few glycopeptides that were mapped to proteins described as nuclear or cytoplasmic proteins and hence not known to enter the secretory pathway (supplemental Data Set S1). These were not included in the analysis and most likely represent random contamination of O-GlcNAc glycopeptides found in the enriched fraction, where some peptides are also found (38). The individual glycoproteomes are discussed in detail in the following sections.
The VZV O-Glycoproteome - We identified 53 O-glycosylation sites on six of nine VZV envelope glycoproteins (Table 1 and supplemental Data Sets S1-S3), combining results obtained from infected cell lysate and clinical VZV sample (39,40). Four envelope glycoproteins are essential for VZV replication in vitro: gB, gH, gL, and gE, of which gB and the gH-gL complex are thought to mediate both virion-cell and cell-cell fusion (41-44). Two of the fusion complex proteins were found to be O-glycosylated, with seven and eight sites identified in gB and gH, respectively (Table 1). Six gB O-glycosites localized to the N-terminal region of gB, whereas one O-glycosite was situated at the tip of the membrane-proximal domain potentially involved in cell fusion (Fig. 1A) (45). In gH, seven of the eight glycosites were found within the N-terminal domain I, which interacts with gL. Three of the gH glycosites were located at the exposed N-terminal tip of domain I (subdomain IA), which is important for viral replication in human skin in vivo (42). The remaining four glycosites were found in the poorly structured region of subdomain IB. In addition, a single O-glycosite was situated in domain II, which is critical for gH maturation (Fig. 1A) (42).
VZV gE is the most abundant VZV glycoprotein and is essential for infectivity and cell-to-cell spread in cell culture (46,47). Twenty O-glycosites were identified on gE (Table 1), 13 of which were dispersed throughout the N-terminal part of the ectodomain. The remaining glycosites were found in the unstructured linker region or juxtamembrane stem region (Fig. 1A). Eleven of the sites were situated within the unique and non-conserved (48) extreme N-terminal domain (amino acids 1-188), five of which were located within the region (amino acids 24-71) essential for binding the cellular entry receptor insulin-degrading enzyme (49,50). Two O-glycosites, Thr183 and an ambiguous site spanning amino acids 224-225 (224-225 (1x)), mapped to two distinct regions important for gE-gI interaction, which determines gE trafficking and VZV virulence in skin (46,47,50). The binding partner gI, which is indispensable for infectivity of T cells and skin in vivo (51), possessed 10 O-glycosites, all of which were located in the stem region and away from the gE-binding domain (52). A single O-glycosite was identified in the C-terminal loop of the multi-span gM, which is also important for efficient cell-cell spread (Table 1 and Fig. 1A) (39).
Another important protein for VZV virulence in skin in vivo is gC (53). In accordance with the predicted dense glycosylation of the mucin-like tandem repeat domain in gC (54), we found seven O-glycosites in this region (Table 1 and Fig. 1A). Six of the sites were situated on two tryptic peptides, KPDPAVAPTSAASR and KPDPAVAPTSAATR, found five and two times within the protein sequence, respectively. Clearly it is not possible to discriminate by mass spectrometry how many of the identical repeats are glycosylated within the region. However, presuming that all tandem repeats are occupied with O-glycans, the total number of sites would increase to 22. In VZV blister and the infected fibroblasts, 21 and 50 viral O-glycosites were identified, respectively (Table 4 and supplemental Data Sets S1-S3). The fewer sites obtained in the clinical material could in part be explained by the limited amount of the material, only allowing single digestion with trypsin. Five of the six viral proteins, gB, gC, gE, gH, and gI, were found to be glycosylated in both the clinical sample and the infected cell lysate, with the remaining viral protein (one glycosite on gM) only identified with chymotrypsin digestion of infected fibroblasts. Except for gC, a relatively smaller number of sites were identified in most of the viral proteins found in the clinical specimen (Table 4 and supplemental Data Sets S1 and S2). Despite differences in coverage, the O-glycosites found in the clinical sample correlated well with the ones found in the infected cells (Fig. 2). In both samples, tryptic digest-derived glycosites located to analogous parts of proteins, including the N terminus of gB and gH, the membrane-proximal region of gI, and the tandem repeat region of gC (Fig. 2). In contrast to the clinical sample, we only identified one of the two different tandem-repeated gC glycopeptides in the more abundant total cell lysate. For gE, similar clusters of sites were found in both samples, corresponding to the linker region between two structural domains and the juxtamembrane stem region. In contrast, only one O-glycosite was identified at the N-terminal unique region of gE in the clinical sample, compared with seven in infected cells (Fig. 2).
Table footnotes: Sites mapping to secreted proteins/protein regions facing the lumenal part of the secretory pathway are listed (see all identified glycopeptides in supplemental Data Set S1). (c) The same glycopeptide is mapped to both UL7 and UL8.
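The estimate of 22 gC sites given for fully occupied tandem repeats can be reproduced with a few lines of arithmetic. The sketch below assumes that the six repeat-associated sites are split evenly between the two tryptic peptides (three per peptide), which matches the three Ser/Thr residues in each sequence but is not stated explicitly in the text. The variable names are illustrative.

```python
observed_sites = 7  # O-glycosites identified on VZV gC

# Copies of each tryptic repeat peptide within the gC sequence, as stated in the text.
repeat_copies = {"KPDPAVAPTSAASR": 5, "KPDPAVAPTSAATR": 2}

# Assumption: six sites on two peptides, i.e. three sites per peptide.
sites_per_repeat_peptide = 3

# Sites identified outside the repeat peptides: 7 - 6 = 1.
sites_outside_repeats = observed_sites - sites_per_repeat_peptide * len(repeat_copies)

# If every copy of every repeat is fully occupied: 3 * (5 + 2) + 1.
total_if_all_occupied = (
    sites_per_repeat_peptide * sum(repeat_copies.values()) + sites_outside_repeats
)

print(total_if_all_occupied)
```

Under these assumptions the seven repeat copies contribute 21 sites, and the single site outside the repeat peptides brings the total to 22.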
The HCMV O-Glycoproteome - HCMV has one of the largest genomes among human herpesviruses (more than 160 open reading frames) and encodes at least six well characterized virion-associated envelope glycoproteins with known functions in viral replication: gB, gH, gL, gM, gN, and gO (55,56). However, at least 40 more protein sequences contain a predicted signal peptide, which would allow their potential modification with glycans in the secretory pathway. We identified 122 novel O-glycosylation sites on 28 HCMV proteins, including gB, gH, gL, gN, and gO (Table 2 and supplemental Data Sets S1 and S4). Most of the identified glycoproteins had a relatively small number of sites (1-8), with the exception of RL12 and UL22A, in which we found 18 and 14 sites, respectively. Five HCMV envelope glycoproteins are indispensable for replication: gB, gH, gL, gM, and gN (57). Glycoprotein B and gH-gL comprise the conserved fusion machinery; however, gH-gL can additionally be complexed with gO or UL128-131A to promote infectivity (58). In agreement with alphaherpesviruses, we found two O-glycosites at the N terminus of gB (Fig. 1B) (31,33). Four O-glycosites were identified on gH, whereas three were found on gL. It is difficult to compare the site localization to alphaherpesviruses because of low sequence identity for both gH and gL. However, presuming similar protein architecture (59), three of four gH sites were located at the membrane-distal domain I, similarly to alphaherpesviruses (Fig. 1B). In addition, a single glycosite was found in domain II, consistent with the findings for VZV gH. The gH-gL-associated protein gO was glycosylated at one position (Fig. 1B), whereas four O-glycosites were seen at the N terminus of gN (Fig. 1B).
Most of the other identified glycoproteins with known functions were involved in counteracting the host defense mechanisms. Eight of the identified glycoproteins, RL11, RL12, UL4, UL5, UL7, UL8, UL10, and UL11, are members of the RL11 multigene family of HCMV proteins (60). Two of them, RL11 and RL12, are known to bind human IgG (61,62). O-Glycosites mainly localized to either the very N terminus or the juxtamembrane stem region of the RL11 family members and not the characteristic RL11D domain (Fig. 1B) (60). UL119, which is an unrelated virion glycoprotein with eight identified O-glycosites, also has the capacity of Fcγ binding (61). Two of the sites were situated on the Ig-like domain of the protein (Fig. 1B). Two of the identified glycoproteins, US28 and UL78, belong to the GPCR family of seven-transmembrane domain receptors and have known functions in binding host chemokines (63) or chemokine receptors (64). The O-glycosites were located at the extracellular N termini of the proteins (Fig. 1B), which are often found O-glycosylated in human GPCRs (38). For US28, the O-glycosite (2-14 (1x)) was potentially located in the region essential for chemokine binding (amino acids 10-16) (65). US16, US20, and US21, members of the HCMV US12 family of putative seven-transmembrane domain proteins (66), were O-glycosylated at the C terminus (Fig. 1B), suggesting that their orientation in the membrane could be opposite to that of GPCRs (67). The secreted RANTES-specific chemokine receptor UL22A (also known as UL21.5) was found heavily glycosylated, as previously reported (68,69). One of the HCMV-encoded chemokines, vCXCL1, which is thought to attract neutrophils for dissemination, was also found to be O-glycosylated, as recently shown (70,71). Four of the HCMV immunoevasins, UL16, US3, US8, and US20, were identified to carry O-glycans (Fig. 1B). UL16 interacts with NKG2D ligands and reduces their expression on the cell surface. Three of the four O-glycosites were found at the membrane-proximal stem region of the molecule, one of which (Tyr162) mapped to a distinct protein-protein contact area with MICB (Fig. 1B) (72). US3 and US8, which have the capacity of binding MHC class I, were glycosylated at the stem region and the N terminus, respectively (Fig. 1B) (73).
The EBV O-Glycoproteome - At least nine EBV virion-associated envelope glycoproteins are known (74). Six of those (gB, gN, gp42, gp78, gp150, and gp350) were found to be O-glycosylated, with a total of 41 glycosites identified (Table 3 and supplemental Data Sets S1 and S5). EBV depends on gB and gH-gL for entry into host cells. Fusion effector gB was O-glycosylated at five positions, three of which were located at the extreme N terminus of the protein, in accordance with VZV and HCMV gB glycosylation (Fig. 1, A-C). In contrast to other herpesviruses, we did not identify any O-glycosites on gH or gL (Fig. 1C). Infection of B cells requires an additional viral protein, gp42 (75). gp42 is proteolytically cleaved, releasing a secreted form that links gH-gL on the virion to HLA class II on B cells, thereby bringing the two membranes into close proximity (76). A single O-glycosite was identified on gp42, localized in one of the regions essential for high affinity binding to gH-gL and just C-terminal to the proteolytic cleavage site required for release of gp42 from the membrane (Fig. 1C) (77,78). Similar to betaherpesviruses, the gM-gN protein complex is particularly important for EBV viral particle formation (79). In agreement with the results obtained for HCMV, four O-glycosites were found at the N terminus of EBV gN (Fig. 1C). Although dispensable for infectivity, gp350 is important for the initial attachment to B cells (80) and has been shown to be highly O-glycosylated (81). gp350 also represents a very potent immunogen (82,83). On gp350, 19 O-glycosites were identified, 18 of which were located in the Pro/Ser/Thr-rich mucin-like stem region, and 1 glycosite was found at the tip of one of the N-terminal domains (Fig. 1C). The remaining glycosites observed in EBV were situated on late proteins gp150 (BDLF3) and gp78 (BILF2), with most O-glycosites found in the stem region or at the N terminus, respectively (Fig. 1C).

Discussion
Until recently, most evidence for viral O-glycosylation has originated from the interrogation of densely glycosylated Pro/Ser/Thr-rich mucin-like sequences such as those found in HSV-1 gC and Ebola virus glycoprotein (84,85). With our recent introduction of a mass spectrometric strategy for global mapping of viral O-glycosylation sites, we substantially expanded the number of identified O-glycosylation sites in alphaherpesviruses HSV-1 and -2 and demonstrated the importance of elongated viral O-glycans for virus propagation (31), as well as early immune recognition (33). The aim of the present study was to provide knowledge on the global O-glycosylation of three additional clinically important members of the Herpesviridae family, VZV, HCMV, and EBV, representing the three distinct herpesvirus subfamilies. The identified O-glycosites were widely distributed on most of the viral envelope glycoproteins, and importantly we identified conserved glycosylation patterns in distinct regions of homologous viral proteins, suggesting that O-glycosylation at certain regions is of importance for herpesviruses. In addition, we generated an O-glycoproteome from a clinical VZV sample obtained from a zoster blister, representing the first O-glycoproteome from a primary source of virus unaffected by artificial propagation in cell culture. Identification of O-glycosylation sites is hampered not only by the lack of reliable prediction algorithms but also by the unique differential biosynthetic regulation of O-glycosylation, underlining the importance of direct experimental analysis (18). The glycoproteomic strategy employed here is highly sensitive and combines enrichment of the most prevalent glycoforms (simple core-1 O-glycans) produced in the host cells used for viral propagation (31). Limitations, however, include the failure to enrich for peptides exclusively expressing core-2 or other more complex glycoforms and lack of stoichiometry for the glycosites identified (86). We have previously estimated that simple core-2 structures constitute ~10-15% of total O-glycans in HEL fibroblasts (31). It is noteworthy that in the present study we have identified a small fraction (~6%) of the virus-derived glycosites potentially carrying both core-2 and core-1 O-glycans on the same sites in the peanut agglutinin-enriched peptides (87). The finding suggests that despite the presence of more complex O-glycans at certain sites, we might still detect them as biosynthetic intermediates. We are not, however, in a position to predict the exact proportion of glycosites missed by our method. Another limitation is that available protease cleavage sites will determine protein coverage. To increase the coverage we therefore utilized both trypsin and chymotrypsin digestion in parallel, which expanded the number of identified sites for certain proteins. Despite these limitations, we were able to capture the majority of glycoforms expressed in the infected cells and achieve high coverage for abundant glycoproteins.
The majority of identified glycopeptides (>98.5%) were mapped to proteins exposed to the lumenal side of the secretory pathway. A few glycopeptides, however, were mapped to nuclear or cytoplasmic proteins that are not known to enter the secretory pathway. Such glycopeptides most likely represent a minor contamination with cytoplasmic O-GlcNAc glycopeptides. An argument against this interpretation is that several of the identified HexNAc residues were elongated with hexose, suggesting that they represent genuine T-structures. It cannot, however, be completely excluded that an initial GlcNAc residue could be elongated by a galactose residue by the highly efficient galactosyltransferases present in cell lysates alongside released donor substrates (88), and more detailed analyses are required to establish the exact nature of the identified glycans.
The characterization of the O-glycoproteomic landscape of herpesviruses provides a first step in being able to appreciate and probe the biological functions of this prevalent modification of herpesvirus envelope glycoproteins. Information on site-specific O-glycosylation of virus and viral glycoproteins produced in different cellular systems could prove to be important because we predict that O-glycosylation may not only vary with respect to structures, but more importantly also vary considerably with respect to sites of O-glycan attachment. This is because the repertoire of polypeptide GalNAc transferases that controls the O-glycosylation capacity is cell-specific and may also be influenced by the viral infection itself, as evidenced by the induction of GalNAc-T3 by influenza A virus (18,89). We have recently developed a quantitative differential O-glycoproteomic strategy to address non-redundant contributions of individual GalNAc transferase isoforms to the O-glycosylation capacity of a cell (90), and this could be applied to address changes in viral O-glycosylation between clinical isolates or samples propagated in different cell types. Because O-glycans may affect immunity by shielding protein epitopes or introducing glycopeptide epitopes (91-93), it is important to consider O-glycans in the context of vaccine design. It is also important to consider O-glycans for innate immune targeting ligands to augment immunity (33,94).
The three Herpesviridae subfamilies, alphaherpesviruses, betaherpesviruses, and gammaherpesviruses, diverged from a common ancestor more than 200 million years ago (1). More than 40 genes are conserved between all herpesviruses and are referred to as core genes. Of those, gB, gH, gL, gM, and gN are the conserved envelope glycoproteins (3). In addition, four more envelope glycoproteins, gC, gE, gI, and gK, are conserved between alphaherpesviruses (95). Using the O-glycoproteomes from five different human herpesviruses, we sought to investigate the extent to which sites and patterns of O-glycosylation were conserved on homologous envelope glycoproteins between HSV-1, HSV-2, VZV, HCMV, and EBV (Fig. 3 and supplemental Figs. S1-S7). One of the most conserved patterns of O-glycosites found in all herpesviruses was localized at the extreme N terminus of the fusogenic protein gB, despite high variability in the protein sequence. In addition, we found several more glycosites in common between VZV and the other two alphaherpesviruses, HSV-1 and -2, that were located on more highly conserved regions, including the O-glycans on the membrane-proximal domain that contains putative fusion loops of gB (96) (Fig. 3 and supplemental Fig. S1). Concordant glycosylation was also found in the N-terminal mucin-like regions of gC, as well as in two clusters of O-glycosites in VZV gE, homologous to sites in HSV-1 and HSV-2 gE (Fig. 3 and supplemental Figs. S1 and S3). It should be mentioned, however, that we also found O-glycosites that were not shared between the different family members. In conclusion, certain regions of homologous Herpesviridae envelope glycoproteins share similar patterns of O-glycosylation that potentially could be linked to specific functions, although virus-specific differences are also observed.
O-Glycosylation occurs in two principally distinct patterns on proteins that may be related to their biosynthesis and function. Isolated O-glycosites are often important for regulated proteolytic processing and exert co-regulatory functions in basic cellular processes (97). Densely clustered sites, on the other hand, are often present in vulnerable protein regions and confer protection from non-regulated proteolysis. Similarly to human proteins, viral proteins accommodate a substantial number of isolated O-glycosites outside the mucin-like regions (31,33,98,99). In mammalian proteins, such sites have been demonstrated to play important regulatory roles in basic cellular processes such as secretion, pro-protein processing, and ectodomain shedding (100-104). It could thus be speculated that single-site O-glycosylation on viral proteins affects the cleavage of viral proteins with importance for infection. As an example, we identified a single O-glycosite adjacent to a proteolytic cleavage site of EBV gp42 essential for infection of B cells (78,105). Given that O-glycosylation often protects from cleavage, the different extent of glycosylation could potentially be a co-regulatory mechanism for cell tropism, because gp42 is not required for infection of epithelial cells (75,106). The same site might also play a role in interaction with gH-gL, because it mapped to one of the regions required for high affinity binding (107). The immunoevasin UL16 in HCMV represents another example where a glycosylation site is mapped to an interaction surface, which down-regulates NKG2D ligand MICB expression at the cell surface and subsequent detection by NK cells (72). We also identified a number of single or clustered O-glycans at the extracellular termini of five different HCMV seven-transmembrane domain receptors similar to what has been observed in human multi-span receptors (38).
Viral chemokine receptor UL78 has been demonstrated to heteromerize with human chemokine receptors, modulating their function (64). Based on the relatively short N-terminal regions of these receptors, it is quite unlikely for them to be involved in dimerization, as seen for GPCRs bearing large extracellular domains (108). However, glycosylation could potentially modulate ligand binding or limited proteolysis-associated receptor turnover, as hypothesized for the N terminus of the β1-adrenergic receptor (109). Among other specific functions, O-linked glycans may contribute to transport and stable cell surface expression of viral proteins. This has been suggested for HCMV UL11 and for VZV gB (28,110). Interestingly, we identified an ambiguous site in VZV gB spanning a region (amino acids 126-129) containing the potential O-glycosylation site at Thr129, investigated for Ala substitution, suggesting that the O-glycan in question may indeed be important for surface expression of gB (28). In addition, we found glycosylation in the EBV gB linker region (T621), which has been suggested to be relevant for gB oligomerization and surface expression (111,112).

FIGURE 3. Conservation of O-linked glycosylation sites on homologous envelope glycoproteins of human herpesviruses. Clustal Omega server was used to align amino acid sequences of gB, gH, gL, and gN between HSV-1 (31), HSV-2 (33), VZV, HCMV, and EBV, as well as gC, gE, and gI between the alphaherpesviruses. Conserved glycoprotein M was not included, because it was only found glycosylated in one of the investigated viruses. Protein backbones are depicted as broken black lines, where spaces represent gaps in the alignment. Individual alignments were drawn to scale (indicated below each graph). Sequence conservation is indicated above the aligned sequences for each set and is represented by a grayscale barcode that maps to the Clustal alignment score, as shown in the legend. In brief, for the Clustal alignment score, an asterisk indicates a position with a fully conserved residue, and a colon indicates conservation of an amino acid with strongly similar properties, whereas a period indicates conservation of an amino acid with weakly similar properties. Predicted signal peptides and transmembrane regions are shaded in pink and blue, respectively. Unambiguous O-glycosylation sites are shown as yellow squares, whereas ambiguous sites are marked as yellow lines within the protein backbone, where the number below indicates the number of glycosites. It should be noted that O-glycosylation sites on VZV are derived both from the clinical sample and the infected total cell lysate. All identical potentially glycosylated VZV tandem repeats are shown occupied. Two ambiguous O-glycosylation sites from our previous publication (31) (HSV-1 gB 109-123 (HexHexNAc) and gE 135-143 (HexHexNAc)) were omitted from the graph, because we cannot exclude the possibility they could be part of an elongated structure on an adjacent site. Reference strain sequences were used for HSV-2, VZV, and EBV because of incomplete or unavailable annotation of investigated strains.
In agreement with our previous findings in HSV-1 and HSV-2, we found dense glycosylation at the N-terminal tandem repeat region of gC in VZV. The function of these densely glycosylated areas is not clear, but it is proposed that O-glycans in the distally located mucin-like region in gC have a direct role in modulating the interaction with cell surface proteoglycans (21). In a similar way it can be speculated that the abundant O-glycosylation found on the unique N-terminal domain of VZV gE affects the multiple functions specified by the region, including interactions with cell entry receptor insulin-degrading enzyme and binding partner gI (50). O-Glycosylation was also enriched in multiple members of the RL11 family and at the N-terminal domain of gN in HCMV. N-terminal HCMV gN glycosylation has been suggested to protect from neutralization by antibodies (91). Moreover, dense glycosylation was found in the stem region of several glycoproteins in VZV, HCMV, and EBV, suspected to protect the otherwise vulnerable region from unspecific proteolytic cleavage (113). Another potential function could be to provide structural support for keeping the ectodomain away from the membrane (114).
Viruses such as HIV-1, HCV, and Hendra virus exploit N-glycans to shield themselves from the host immune response (23,25,115,116). In a similar way, O-glycans have also been suggested to shield immunodominant epitopes in herpesviruses (91,92). Nevertheless, there is limited information regarding the impact of individual O-glycans on shielding underlying peptide epitopes or the capacity of host immunity to present and recognize glycosylated antigens (93,94). Here we show that herpesviruses are broadly O-glycosylated including protein regions subject to immune recognition. In this context we found O-glycans (Ser62, Ser70, and Ser71; Ser71 and Ser79; Tyr102; Tyr154) localized to four distinct immunodominant human B cell epitopes, previously mapped to the N-terminal region of VZV gE using non-glycosylated synthetic peptides (117). Two of these epitopes represent neutralizing antibody epitopes (Ser71 and Ser79; Tyr154) (118). In addition, three T cell epitopes in mice were also mapped to VZV gE regions with identified O-glycans (major epitope: Ser71 and Ser79; minor epitopes: Tyr102; Thr118) (118). Despite immunogens being produced in yeast and thereby lacking mucin-type O-glycosylation (18), mouse immune sera were able to detect and neutralize virus produced in mammalian cells, suggesting either that the epitope recognition is not affected by adjacent glycosylation or that putative O-linked glycans only partly occupy and protect the epitopes. We also found O-glycans located within a neutralizing peptide epitope at the N terminus of HCMV gH (S31) (119) and a discontinuous neutralizing antibody epitope on VZV gH (Ser177 and Thr184) (120). These findings suggest that O-linked glycans could have a role in the masking of immunodominant epitopes from antibodies and cytotoxic T cells. However, detailed studies are required to investigate the contribution of distinct O-glycans to epitope masking or recognition by immune cells.
It should be mentioned that we could only confirm a subset of the gE glycosylation sites in the clinical VZV sample. This could be due to the low coverage caused by the scarce material available for analysis, or it could simply signify incomplete protection of these epitopes. Finally, specific O-glycans have not only been shown to protect but also to evoke immune responses (121,122) and hence could serve as potential diagnostic markers, as well as vaccine targets.
As discussed, one of the potential caveats of O-glycoproteomic analysis in vitro is that viral glycosylation is placed in the context of host cell glycosylation capacity. To compare O-glycosylation in artificially infected cells to O-glycosylation in vivo, we analyzed VZV-infected fibroblasts and virus obtained from a primary clinical isolate in parallel. A relatively high degree of glycosylation overlap was identified between the clinical VZV sample and VZV derived from infected cells, although a significant number of O-glycosites were not found in the clinical sample. There could be several explanations for the lower number of glycosylation sites, including substantially scarcer clinical material compared with infected cell lysate, which only allowed us to perform a single digestion with trypsin. Another factor that could influence the identification of sites is genetic differences between the clinical and laboratory VZV isolates. For example, variable numbers of mucin-like tandem repeats can be found in viral glycoproteins, such as HSV-1 gI (123) and VZV gC, derived from different clinical isolates and laboratory strains. Even though VZV is one of the most conserved human herpesviruses (124), it cannot be excluded that minor deviations of the isolate-specific VZV peptide sequences from the sequences available in the search database could prevent the identification of all peptides. The same issues are valid for the investigated EBV P3HR1 and HCMV Towne strains, which are not completely annotated in the available protein databases. Despite the lower overall coverage in the clinical sample, O-glycosites identified in the laboratory VZV strain represent the in vivo glycosylation well, with the only sites exclusively identified in the clinical isolate residing within specific tandem repeats of VZV gC. This would support the use of laboratory strain-derived glycoproteins for addressing relevant biological questions. 
This study represents the first attempt to characterize the O-glycoproteome of a clinical virus specimen. In the future, efforts should be made to evaluate the occupancy of individual glycosites and the individual structures on viruses. Comprehensive glycomic characterization of clinical isolates may lead to identification of sites and structures important for protein-protein interaction or raising potent immune responses. Using our expanding library of glycoengineered cell lines would enable production of designer viruses presenting defined glycostructures on envelope glycoproteins for antiviral vaccine development (125).
In conclusion, we generated the most comprehensive O-glycoproteomes of VZV, HCMV, and EBV to date and showed that certain regions of conserved proteins are consistently glycosylated in herpesviruses. O-Glycans on viral envelope glycoproteins can play multiple roles from inducing extended molecular conformations and protection from unspecific cleavage to more regulated events, such as protein-protein interaction, modulation of limited proteolysis, and immune recognition. The results should enable more focused studies of O-glycosylation at individual sites, which may confer new knowledge in specific areas of herpesvirus biology. Moreover, the results provide a reference base for design and development of vaccines taking both N- and O-glycosylation into account.