Four N-linked Glycosylation Sites in Human Toll-like Receptor 2 Cooperate to Direct Efficient Biosynthesis and Secretion

Most higher organisms have a system of innate immune defense that is mediated by a group of evolutionarily related, germ line-encoded receptors, the so-called Toll-like receptors. In mammals, Toll-like receptors signal in response to pathogen-associated microbial structures. For example, Toll-like receptor 2 appears to mediate responses to bacterial peptidoglycan and acylated lipoproteins and Toll-like receptor 4 to bacterial lipopolysaccharide. However, the structural principles that underlie recognition of these structures are poorly understood. Toll-like receptors have leucine-rich repeats in their extracellular domains and are thus believed to adopt solenoid structures, similar to that found in platelet glycoprotein Ib. Additionally, all Toll-like receptors contain N-linked glycosylation consensus sites, and Toll-like receptor 4 requires glycosylation for function. Toll-like receptor glycosylation is also likely to influence receptor surface representation, trafficking, and pattern recognition. Using circular dichroism spectroscopy, we show here that purified human Toll-like receptor 2 and 4 proteins have secondary structure contents similar to glycoprotein Ib. We have also analyzed where consensus glycosylation sites are located in the extracellular domains of other human Toll-like receptors. We found that there are significant differences in the location

In humans, innate immune responses provide the first line of defense against invading bacterial pathogens. Pathogen-associated molecular patterns such as lipopolysaccharide from Gram-negative bacteria, peptidoglycan, and foreign nucleic acids are sensed by several different immune cell types such as macrophages and dendritic cells. These cells then mediate complex inflammatory and antiviral responses. Stimulation of dendritic cells by pathogen patterns is also required for T-cell maturation and the development of an adaptive immune response (see Ref. 1 for a review).
In recent years it has become clear that the human Toll-like receptors (TLRs) are required to mediate these responses. These molecules are single-pass transmembrane receptors and are related to Drosophila Toll, a protein involved in dorsoventral patterning and antifungal innate immunity in the fly (2-4). Drosophila Toll and TLRs all have ectodomains with characteristic blocks of leucine-rich repeats and a cytoplasmic signaling domain of about 200 residues called the Toll/interleukin 1 receptor domain. The family of Toll receptors appears to use common components in the postreceptor signaling pathway, resulting in the activation of the transcription factor NF-κB (5, 6).
An important area in TLR research is to understand the way in which pathogen patterns are able to activate these receptors at the biochemical and structural level. In the case of Drosophila Toll, pathogen patterns indirectly activate an endogenous cytokine-like ligand, Spätzle, and this dimeric protein activates signal transduction by dimerizing the Toll receptor (7). However, the human TLRs appear to have a different mechanism of activation that involves co-receptor proteins and direct sensing of the pattern. For example, signaling by TLR4 probably requires direct binding of lipopolysaccharide to the accessory co-receptor protein MD2 and consequent dimerization of the receptors (8).
Studying these molecular recognition events biochemically requires the production of the receptor domains in a pure and functional form. This has proved difficult to achieve, and there are few reports of expression and purification of receptor ectodomains in the literature. The ectodomains consist primarily of leucine-rich repeat sequences (LRRs), 24-amino acid motifs that fold together into a solenoidal structure (see Fig. 1). LRRs are found in a large superfamily of proteins (5) and seem adapted for the rapid evolution of diverse protein binding specificities. Although no structures of TLRs are presently known, the crystal structures of two related LRR extracellular proteins, platelet glycoprotein Ib (9) and the Nogo receptor (10), have recently been solved in complex with specific protein binding partners. These structures show that, unlike some other LRR proteins (for example, ribonuclease/angiogenin inhibitor (11)), the convex surface of the LRRs is not α-helical but of variable, extended secondary structure (see Fig. 1).
In this study we have described the expression and purification of TLR2 and TLR4 ectodomains and have shown that they lack α-helical secondary structure. We have also probed the glycosylation status of TLR2. Using site-directed mutagenesis, we found that all four predicted sites are used and that one of these, the conserved site 4, is heterogeneously glycosylated. Mutation of all the sites severely affects the biosynthesis and secretion of the TLR2 ectodomain.

EXPERIMENTAL PROCEDURES
Sequence Alignments and Bioinformatic Analysis-TLR protein sequences were retrieved from the EMBL database using SRS (sequence retrieval system). Retrieved protein sequence accession numbers are as follows: human TLR1 Q9UG90, murine TLR1 Q9EPW5; bovine TLR2 Q9GL66, chicken TLR2 variant 1 Q9DD78, chicken TLR2 variant 2 Q9DGB6, equine TLR2 AAR08196_P, hamster Q9R1F8, human TLR2 O15454, macaca TLR2 Q95M53, murine TLR2 Q9DBC4, pig TLR2 AAQ84520, rat TLR2 variant 1 AAN86524_P, rat TLR2 variant Q9BXR5. For analysis of site conservation, sequences from different species were aligned using ClustalW and color-coded using Joy (12). To analyze where sites were located within the LRR solenoid structure, individual LRRs were manually aligned based on the LRR definitions of Bell et al. (13). Alignments were then colored using Joy as before. Structural representations were created using PyMOL (www.pymol.org), based on the crystal structure of the Nogo receptor (10).
Cell Culture-Sf9 and T.ni cells for baculovirus generation and protein expression were grown in suspension culture in serum-free SF900 II medium (Invitrogen) containing 0.1% pluronic acid (Sigma). For the expression of glycosylation mutants, HEK293 (human embryonic kidney) cells were used and cultured in HEPES-modified Dulbecco's modified Eagle's medium (Sigma), supplemented with L-Gln, 10% fetal bovine serum, and antibiotics (penicillin/streptomycin).
Expression of Recombinant Toll-like Receptors 2 and 4-Expression constructs for residues 1-587 and 1-631 of the TLR2 and TLR4 extracellular domains, respectively, were generated by PCR on a vector containing the TLR2 open reading frame using a 5′ primer encoding a BamHI site and Kozak sequence and a 3′ primer encoding an rTEV protease cleavage site, His6 tag, STOP codon, and NotI restriction site. The PCR product was purified and cloned into the plasmid pFastBac (Invitrogen). A recombinant baculovirus was generated using the Bac-to-Bac system (Invitrogen) with Sf9 cells. Large scale protein expression was carried out in T.ni cells by infection of 10-liter cultures at a multiplicity of infection of 0.1 and a cell density of 1.0 × 10⁶ cells/ml. The cell culture supernatant was harvested 3 days postinfection and concentrated to 500 ml using a tangential flow filtration unit (Pall Filtron), and the buffer was exchanged to 150 mM NaCl, 20 mM Tris-HCl, pH 7.5, 5 mM imidazole. The concentrate was purified using Superflow Ni-NTA acid-agarose (Qiagen) on an ÄKTA FPLC system (Amersham Biosciences) at 4°C. Protein-containing fractions were pooled and further purified by gel filtration on Superdex 200 columns (Amersham Biosciences).
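The infection conditions above imply a simple piece of arithmetic for the amount of virus stock required. The sketch below works it through in Python, reading the stated density as 1.0 × 10⁶ cells per ml. The virus stock titer is not given in the text, so the value used here (1 × 10⁸ pfu/ml) and the function name are hypothetical and serve only to illustrate the calculation.

```python
def virus_volume_ml(moi, cells_per_ml, culture_volume_ml, titer_pfu_per_ml):
    """Volume of virus stock (ml) needed to infect a culture at a given MOI."""
    total_cells = cells_per_ml * culture_volume_ml
    pfu_needed = moi * total_cells  # infectious units required
    return pfu_needed / titer_pfu_per_ml


# From the text: MOI 0.1, 1.0e6 cells/ml, 10-liter (10,000 ml) culture.
# The titer of 1e8 pfu/ml is a hypothetical value, not taken from the paper.
volume = virus_volume_ml(0.1, 1.0e6, 10_000, 1.0e8)
print(f"{volume:.1f} ml of virus stock")
```

Under these assumptions the culture contains 10¹⁰ cells and requires 10⁹ pfu; the volume scales inversely with whatever titer the stock actually has.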
CD Spectroscopy and Data Analysis-TLR2 and TLR4 proteins were dialyzed into 20 mM NaCl, 20 mM sodium phosphate buffer, pH 7.2, and spectra were recorded between 190 and 250 nm on an Aviv Model 215 circular dichroism spectrometer with a protein concentration of ~10 μM. Raw data were averaged from duplicate runs and a buffer control subtracted. Data were analyzed using Excel software. Circular dichroism data were analyzed using the program Selcon3 (14). Raw CD data were converted to molar ellipticity and transferred into input files as outlined in the CDPro documentation. Secondary structure information for platelet glycoprotein Ib and Ng-R was retrieved from Protein Data Bank structure files (1GWB and 1OZN, respectively) using Procheck (15).
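The processing steps described above (averaging duplicate runs, subtracting the buffer control, and converting to molar ellipticity) can be sketched as follows. This is a minimal illustration using the standard conversion [θ] = θ_obs(mdeg) / (10 · c · l), with c in mol/liter and l in cm. It is not the authors' spreadsheet: the function names are hypothetical, and the cuvette path length and residue count are not stated in the text and must be supplied by the user.

```python
def average_and_subtract(run_a, run_b, buffer):
    """Average two spectra point by point and subtract the buffer spectrum (all in mdeg)."""
    return [(a + b) / 2.0 - buf for a, b, buf in zip(run_a, run_b, buffer)]


def molar_ellipticity(theta_mdeg, conc_molar, path_cm):
    """Convert observed ellipticity (mdeg) to molar ellipticity (deg cm^2 dmol^-1)."""
    return theta_mdeg / (10.0 * conc_molar * path_cm)


def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Molar ellipticity divided by the number of peptide bonds (n_residues - 1)."""
    return molar_ellipticity(theta_mdeg, conc_molar, path_cm) / (n_residues - 1)


# Hypothetical two-point spectrum; 10 uM protein as in the text,
# 0.1 cm path length assumed for illustration.
corrected = average_and_subtract([-10.0, -12.0], [-14.0, -12.0], [-2.0, -2.0])
converted = [molar_ellipticity(t, 10e-6, 0.1) for t in corrected]
print(corrected, converted)
```

Programs such as Selcon3 generally work on a per-residue basis, so the per-residue form is included; which unit a given input file expects should be checked against the CDPro documentation.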
Glycosidase Treatment-TLR2 samples were dialyzed into 100 mM phosphate buffer, pH 7.2. For denatured samples, 1% (v/v) Nonidet P-40, 1% (v/v) β-mercaptoethanol, and 0.1% (w/v) SDS were added and the samples boiled for 10 min. Subsequently, 20 units of PNGase F, 100 milliunits of endoglycosidase H, 20 milliunits of neuraminidase, and/or 20 milliunits of O-glycosidase were added to 200 μl of protein sample as required; the final volume was adjusted to 260 μl. Mixtures were then incubated overnight at 37°C and the samples analyzed by SDS-PAGE.

TLR2 Mutagenesis and Expression of Mutants in HEK293 Cells-Site-directed mutagenesis was carried out according to the instructions in the Stratagene QuikChange kit. The following primers (and their reverse complements) were used for mutagenesis (mutated codons in uppercase): N114S, atcc tat aat tac tta tct TCT tta tcg tct tcc tgg ttc aag ccc; S114N, atcc tat aat tac tta tct AAT tta tcg tct tcc tgg ttc aag ccc; N199D, aagt ttg aag tca att cag GAC gta agt cat ctg atc cttc; D199N, aagt ttg aag tca att cag AAC gta agt cat ctg atc cttc; T416A, tg ctc act ctg aaa aac ttg GCT cta aca ttg ata tca gta ag; A416T, tg ctc act ctg aaa aac ttg ACT cta aca ttg ata tca gta ag; N442D, ca gaa aag atg aaa tat ttg GAC tta tcc agc aca cga ata cac; D442N, ca gaa aag atg aaa tat ttg AAC tta tcc agc aca cga ata cac. The mutagenized inserts were sequenced and then backcloned into the original pcDNA3.1 vector backbone to avoid mutations outside the sequenced insert region. For expression of the mutant constructs, plasmids were transfected into HEK293 cells using LipofectAMINE 2000 according to the manufacturer's instructions. Samples were collected 72 h posttransfection. Supernatant samples (usually ~1.5 ml) were collected and dialyzed against 2 × 5 liters of buffer A + 5 mM imidazole. Subsequently, the samples were loaded onto Ni-NTA acid spin columns in several centrifugation runs at 1500 rpm. Columns were then washed twice with buffer A containing 30 mM imidazole, and His-tagged proteins were eluted with 200 μl of 250 mM imidazole-containing buffer. Cells were collected by scraping and then washed twice in ice-cold phosphate-buffered saline. The cell pellet was subsequently lysed in 150 μl of radioimmune precipitation assay buffer by incubation on ice for 30 min with pulse vortexing every 10 min. The sample was then centrifuged for 15 min in a chilled centrifuge and the cleared lysates transferred to a fresh tube.
Both supernatant samples and cell lysates were analyzed by SDS-PAGE and subsequent immunoblot using TLR2 antibodies (gift from G. Squires) (16). Bands were analyzed with ImageJ software (rsb.info.nih.gov/ij).
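As a quick consistency check on the primer list above, the mutated codon of each primer can be translated and compared with the residue named in the primer label. The sketch below does this in Python with a deliberately minimal codon table restricted to the six codons that occur in the list (standard genetic code). The dictionary and function names are illustrative and not part of the original protocol.

```python
# Standard genetic code, restricted to the codons used in the primers above.
CODON_TABLE = {
    "TCT": "S", "AAT": "N", "GAC": "D",
    "AAC": "N", "GCT": "A", "ACT": "T",
}

# Primer label -> mutated codon, copied from the primer list.
PRIMER_CODONS = {
    "N114S": "TCT", "S114N": "AAT",
    "N199D": "GAC", "D199N": "AAC",
    "T416A": "GCT", "A416T": "ACT",
    "N442D": "GAC", "D442N": "AAC",
}


def check_primers(primer_codons):
    """Return the labels whose codon does not encode the residue the label ends with."""
    mismatches = []
    for label, codon in primer_codons.items():
        target_residue = label[-1]  # e.g. "S" in "N114S"
        if CODON_TABLE[codon.upper()] != target_residue:
            mismatches.append(label)
    return mismatches


print(check_primers(PRIMER_CODONS))  # an empty list means every codon matches its label
```

Note that T416A alters the threonine of an N-X-T consensus motif rather than the asparagine itself, which removes the site just as effectively.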

RESULTS
A Single Conserved Glycosylation Site Is Present in TLR2 and Is Located on the Conserved Face of the Leucine-rich Repeat Solenoid-Analysis of the primary sequence of TLR2 ectodomains reveals the presence of four potential glycosylation sites. All of these lie within consensus leucine-rich repeats, and three are predicted to lie in the constant regions of the LRR (Fig. 1, A and B). Site 1 is predicted to be solvent-exposed on the convex surface of the LRR solenoid. By contrast, sites 2 and 3 are located on the concave surface but in a position predicted to be solvent-exposed and to make the asparagine residues available for modification. On the other hand, site 4 is located on the terminal residue of the parallel β-sheet of LRR16, a structure that forms part of the inner surface of the solenoid; access to this site in the folded state might be sterically restricted. We examined whether any of these four sites is conserved in the 11 TLR2 sequences currently known. As shown in Fig. 1C, only site 4 is conserved in all species; the other three are present in about 6 of 11 species.
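Potential N-linked sites of the kind described above are conventionally identified by scanning for the consensus sequon N-X-S/T, where X is any residue except proline. The text does not state which tool was used for the prediction, so the following Python sketch is only an illustration of the consensus rule. The function name is hypothetical and the test sequence is an invented toy peptide, not TLR2.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead keeps matches zero-width beyond the Asn, so overlapping
# sequons (for example "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein_seq):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq.upper())]


# Toy peptide (hypothetical): N3 is followed by L-S, N8 by P-T (excluded),
# N13 by N-S, and N14 by S-T.
toy = "MANLSAPNPTQQNNSTK"
print(find_sequons(toy))  # [3, 13, 14]
```

A sequon is necessary but not sufficient for modification, which is why occupancy of each predicted site has to be tested experimentally.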

The Ectodomains of Toll-like Receptors 2 and 4 Are Secreted in a Monomeric and N-glycosylated Form when Expressed in Insect Cell Culture-The extracellular domains of TLR2 and TLR4 were secreted from Sf9 insect culture cells infected with baculovirus expression constructs. The proteins were purified from culture supernatants using metal affinity chromatography and gel filtration chromatography (Fig. 2A). Both TLR2 and TLR4 ectodomains eluted as single peaks at positions that indicate the proteins are monomeric (Fig. 2B). To study the glycosylation status of TLR2, we treated samples with several glycosidases: PNGase F, an amidase that cleaves between the innermost GlcNAc and asparagine residues of high mannose, hybrid, and complex oligosaccharides from N-linked glycoproteins; endoglycosidase H, which cleaves the chitobiose core of high mannose and some hybrid oligosaccharides from N-linked glycoproteins; neuraminidase (also called sialidase), which releases terminal N-acetyl-neuraminic (sialic) acid structures from complex sugars; and O-glycosidase, which releases the disaccharide Galβ(1-3)GalNAc from O-glycans bound to serine or threonine (17). Protein samples were denatured beforehand or left untreated and then incubated overnight in the presence of glycosidases at 37°C under mildly denaturing or native conditions as indicated. Samples were analyzed by SDS-PAGE (see Fig. 2C). Both protein preparations are insensitive to treatment with neuraminidase and O-glycosidase, which suggests that there is no or very little O-glycosylation. By contrast, both endoglycosidase H and PNGase F deglycosylate TLR2, demonstrating the presence of N-linked sugars.
Circular Dichroism Spectra of TLR2 and TLR4 Indicate a Lack of α-Helical Structure-To analyze the secondary structure adopted by TLR2 and TLR4 ectodomains, we measured circular dichroism spectra of the purified proteins. As shown in Fig. 3, both samples display a negative band at ~217 nm. This indicates the presence of β structure, but the lack of bands at 208 and 222 nm suggests a low content of α-helix in the proteins. The secondary structure content of the spectra was estimated using the program Selcon3 (14) and compared with known LRR protein structures (Table I). This result strongly suggests that the LRRs of TLR2 and 4 adopt an overall architecture similar to that of platelet glycoprotein Ib and the Nogo receptor (see Fig. 1) (9, 10) and unlike that of ribonuclease inhibitor, which has a high α-helical content (18).
All Four Predicted Glycosylation Sites in TLR2 Are Modified and Are Required for Efficient Secretion-To study the role of TLR2 glycosylation further, the four predicted N-linked sites were sequentially removed by site-directed mutagenesis. To conserve function and folding, sites 1 to 3 were changed to residues found at those positions in TLR2 from other species (Fig. 1C), and the fully conserved site 4 was mutated to aspartate, a residue frequently found at position 6 in other extracellular LRRs. The four mutant proteins were then expressed in HEK293 cells, followed by immunoblot assay of both supernatant and cell lysate samples (Fig. 4). Each successive mutation reduced the apparent molecular mass of the recombinant protein, demonstrating that the mutations prevented the addition of N-glycans. The wild-type TLR2 protein expressed in the presence of tunicamycin, an inhibitor of N-linked glycosylation, co-migrates with mutant 4, suggesting that mutant 4 is fully unglycosylated (Fig. 4). Taken together, these results demonstrate that each of the four sites is occupied in the wild-type TLR2 ectodomain. It is noticeable that not all protein bands are single species. In the supernatant samples, where bands are generally broad and fuzzy, this is hard to distinguish, but in the cell lysate blot doublet bands are discernible. Analysis of these bands using ImageJ software confirms this observation (Fig. 4B). Differences in electrophoretic mobility manifesting themselves as multiplet bands are most likely caused by inefficient core glycosylation. That there are only doublet bands and no triplet or higher order multiplets is an indication that only one of the sites is partially glycosylated. The lower doublet band in lane 11 probably corresponds to unglycosylated protein because it co-migrates with mutant 4. In other samples, upper and lower bands correspond to mono- and diglycosylated (mutant 2), di- or triglycosylated (mutant 1) and so forth. The doublet is present in wild-type and all the mutants up to mutant 3 (lanes 8-11) but disappears in mutant 4 (lane 12). This suggests that site 4, the last remaining site in mutant 3, is the partially occupied site, removal of which generates a universally unglycosylated and unsecreted product.
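The band-counting argument above can be made explicit with a small enumeration. The Python sketch below is a simplified model, not an analysis of the blot data: it assumes that mutant n lacks sites 1 through n (the sequential removal described above), that fully occupied sites always carry a glycan, that a partially occupied site carries one on only some molecules, and that each glycan count runs as a distinct band. The function name is hypothetical.

```python
def glycan_counts(present_sites, partial_sites):
    """Possible numbers of attached N-glycans for one construct.

    present_sites: sites still intact in the construct.
    partial_sites: sites assumed to be occupied on only a fraction of molecules.
    """
    present = set(present_sites)
    partial = present & set(partial_sites)
    always_occupied = len(present - partial)
    # Each partially occupied site may or may not carry a glycan.
    return list(range(always_occupied, always_occupied + len(partial) + 1))


ALL_SITES = [1, 2, 3, 4]
for n_removed in range(5):
    remaining = ALL_SITES[n_removed:]  # mutant n lacks sites 1..n (assumption)
    label = "wild-type" if n_removed == 0 else f"mutant {n_removed}"
    print(label, glycan_counts(remaining, partial_sites=[4]))
```

With site 4 as the only partially occupied site, the model gives two glycan counts (a doublet) for wild-type and mutants 1 to 3 and a single unglycosylated species for mutant 4, matching the pattern described. Declaring two sites partial would instead predict triplets, which are not observed.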
No Permutation of Double or Triple Glycosylation Mutants Restores Efficient Secretion of TLR2 Ectodomain-To determine whether any TLR2 ectodomain that retains a single, homogeneous glycosylation site can be efficiently secreted, the three remaining triple mutants were made. As shown in Fig. 5, none of these proteins is secreted by HEK293 cells. However, a substantial amount of protein can be detected intracellularly, and it appears that the single glycosylation sites are used, as the bands migrate more slowly than the mutant lacking all the sites. In addition, three further permutations not involving site 4 cannot be secreted (Fig. 6). Interestingly, although the single mutant lacking site 1 is secreted at only a slightly reduced level compared with wild-type, the site 4 single mutant is severely impaired. This points to the heterogeneous site 4 as being particularly critical for secretion of TLR2.

DISCUSSION
The expression studies and purification of TLR2 and TLR4 presented here show that folded and monodisperse proteins can be produced by baculovirus-infected insect culture cells in amounts sufficient for biochemical and structural analysis. The ability of the TLR2 ectodomain to be secreted in a heterologous expression system is consistent with the behavior of endogenous human TLR2. TLR2 is detected on the surface of monocytes, and TLR2 ectodomains are shed from these cells (19). It is proposed that shed TLR2 may have a physiological role in regulating responses to pathogens. The circular dichroism spectroscopy studies presented here indicate that the LRRs of TLR2 and TLR4 are likely to fold into an overall structure comparable with that of platelet glycoprotein Ib and the Nogo receptor. This finding has allowed us to predict the likely arrangement of the polysaccharide chains in TLR2 and TLR4 (see Fig. 1 and Supplemental Fig. S1).

FIG. 2. The TLR2 and TLR4 extracellular domains can be expressed and purified from insect cell culture, and the TLR2 extracellular domain is glycosylated. A, expression and purification of TLR2 and TLR4 extracellular domains. Culture supernatants of insect cells infected with TLR2 and TLR4 recombinant baculoviruses were concentrated and purified by Ni-NTA purification (lanes 1 and 3, respectively) and gel filtration (lanes 2 and 4, respectively). Samples were analyzed on 4-20% SDS-PAGE and stained with Coomassie Blue. B, gel filtration profiles of purified TLR2 and TLR4 ectodomains. Left, purified TLR2 extracellular domain analyzed on a calibrated analytical Superdex 75 gel filtration column.
We have shown that human TLR2 is glycosylated when expressed in a human cell line and thus confirmed the character of TLR2 as a glycoprotein. Although glycosylation of the cell surface TLR2 receptor protein may be different (20), our data suggest that all four glycosylation sites are modified in the mature TLR2 protein. Other studies suggest that this is also the case for TLR4. TLR4 requires glycosylation for receptor function (21); its functional importance may be reflected in the high degree of conservation of TLR4 glycosylation sites, all of which are conserved (see Fig. S1). In TLR1 two of six and in TLR6 five of nine glycosylation sites are conserved across species. In TLR2, which has the fewest glycosylation sites of all the TLRs, only site 4 is conserved. An analysis of predicted glycosylation sites in all the TLRs is presented in supplementary data (Table S1). The degree of conservation observed may reflect the requirements of glycosylation in receptor function. Interestingly, the conserved site in TLR2 shows inefficient core glycosylation, whereas all other sites are efficiently substituted. Thus, this site may be of importance for protein structure or secretion rather than for signaling function. In this regard, protein binding to both platelet glycoprotein Ib and the Nogo receptor does not require glycosylation but does involve interaction with LRR residues on the concave surface of the LRR solenoid.
Comparing the TLR2 mutants generated here, the level of protein biosynthesis appears to be similar between the different mutants despite the differences in the nature and number of mutated residues. Our observations illustrate that glycosylation is an important determinant of TLR2 secretion in mammalian cells. All mutants are impaired in their level of secretion in comparison to wild-type protein, and not only entirely unglycosylated but also monoglycosylated mutants secrete poorly. The decrease in secretion between mutants 1 and 4 (Fig. 4) shows that the overall number of glycosylation sites has a strong influence on the ability of the protein to secrete, potentially by decreasing interactions of the nascent protein chain with the cellular folding machinery. Based on these data, TLR2 appears to belong to the group of glycoproteins that require most, if not all, glycosylation sites for secretion (22, 23).
Our data suggest that the conserved site 4 is one of the major determinants for proper TLR2 biosynthesis. All mutants in Figs. 5 and 6 lack site 4 and do not secrete efficiently, although they still carry some glycosylation. On the other hand, mutants 1-3, which still have site 4, are secreted even though at decreasing levels (Fig. 4). Significantly, the single mutant lacking only site 1 retains a reasonable level of secretion, whereas the single mutant of site 4 cannot be detected in supernatants, again suggesting a particular importance for this glycosylation event. The role played by particular glycosylation sites in determining the ability of a protein to secrete has been studied before (20, 24-26). Our data also suggest that this site is partially occupied, and this is consistent with enzymatic deglycosylation experiments (Fig. 2). It is conceivable that this is because of low accessibility on the inner LRR solenoid surface. Because this region has been implicated as the primary ligand binding region in LRR proteins, it is possible that site 4 is functionally important (27).

FIG. 6. Diglycosylated mutants of TLR2 cannot support secretion. Immunoblot of culture supernatants (A) or cell lysates (B) from HEK293 cells transfected with TLR2 glycosylation mutants 1 and 8-11 as indicated or transfected with empty vector (x). Samples were collected and processed as before.