Mapping the Hyaluronan-binding Site on the Link Module from Human Tumor Necrosis Factor-stimulated Gene-6 by Site-directed Mutagenesis*

Link modules are hyaluronan-binding domains found in extracellular proteins involved in matrix assembly, development, and immune cell migration. Previously we have expressed the Link module from the inflammation-associated protein tumor necrosis factor-stimulated gene-6 (TSG-6) and determined its tertiary structure in solution. Here we generated 21 Link module mutants, and these were analyzed by nuclear magnetic resonance spectroscopy and a hyaluronan-binding assay. The individual mutation of five amino acids, which form a cluster on one face of the Link module, caused large reductions in functional activity but did not affect the Link module fold. This ligand-binding site in TSG-6 is similar to that determined previously for the hyaluronan receptor, CD44, suggesting that the location of the interaction surfaces may also be conserved in other Link module-containing proteins. Analysis of the sequences of TSG-6 and CD44 indicates that the molecular details of their association with hyaluronan are likely to be significantly different. This comparison identifies key sequence positions that may be important in mediating hyaluronan binding across the Link module superfamily. The use of multiple sequence alignment and molecular modeling allowed the prediction of functional residues in link protein, and this approach can be extended to all members of the superfamily.

Hyaluronan (HA) is a ubiquitous high molecular weight glycosaminoglycan, composed of repeating disaccharides of D-glucuronic acid and N-acetyl-D-glucosamine, which has diverse biological roles in vertebrates. For instance, this polysaccharide, a vital structural component of extracellular matrix (e.g. cartilage, skin, and brain), is required for successful embryonic development (1), and is involved in cell migration (2,3). The wide range of functional activities derives from the large number of HA-binding proteins, which can be intracellular, secreted, or on the cell surface. Many of the extracellular hyaladherins contain a common domain of ~100 amino acids, termed a Link module, which is involved in HA binding (4,51). This domain was first described in link protein (containing an immunoglobulin module and two contiguous Link modules (5)), which together with HA and aggrecan forms huge multimolecular complexes that provide articular cartilage with its load-bearing properties. Aggrecan interacts with HA via its N-terminal G1 domain, and this has the same organization of modules as link protein (6); it also has another pair of tandem Link modules within its G2 domain, but these do not bind HA (7-9). In the aggrecan G1 domain and link protein it has been found that both Link modules participate in HA binding (9,10).
CD44 is the primary receptor for HA and has a range of functions such as anchoring the extracellular matrix to the surface of cells (e.g. in cartilage (11)) and mediating the migration of activated lymphocytes to sites of inflammation (3). CD44 has a single Link module that forms part of its HA-binding domain (12), and functionally important amino acids within this region have been identified (13,14).
The inflammation-associated protein TSG-6 (the secreted product of tumor necrosis factor-stimulated gene-6 (15)) contains a single Link module. TSG-6 has been implicated in the regulation of leukocyte migration (16,17), and its pattern of expression and ligand specificity indicate that it may be involved in extracellular matrix remodeling (18-20). Previously, we have expressed the Link module from human TSG-6 in Escherichia coli (21,22) and shown, using a microtiter plate assay, that this material (referred to here as Link_TSG6) interacts with HA (18,19,23). In addition, nuclear magnetic resonance (NMR) spectroscopy on Link_TSG6 has revealed that the Link module is composed of two α-helices and two triple-stranded β-sheets arranged around a large hydrophobic core (23).
Here, we report the production of 21 Link_TSG6 mutants and their characterization by NMR spectroscopy and an HA-binding assay. Five amino acids, which are clustered on one face of the TSG-6 Link module, were identified as having an important role in binding. Comparison of the HA interaction surfaces in TSG-6 with those determined previously for CD44 has allowed the prediction of functional residues in link protein and other members of the Link module superfamily.
Expression, Purification, and Characterization of Wild-type and Mutant Link_TSG6 -Wild-type and mutant proteins were expressed, refolded, and purified to homogeneity as described previously (21,22). Mutants (at 5-7 pmol/µl in 50% (v/v) acetonitrile, 0.2% (v/v) formic acid) were analyzed by electrospray ionization mass spectrometry on a Micromass BioQ II-ZS spectrometer calibrated with horse heart myoglobin (average molecular mass of 16,591.48 Da) and scanned over the mass range 600-1,600 Da.
NMR Spectroscopy-Lyophilized wild-type and mutant Link_TSG6 proteins were resuspended in 600 µl of 10% (v/v) D2O, 0.02% (w/v) NaN3 and adjusted to pH 6.0 with NaOH to give concentrations in the range 0.4-1.3 mM. One-dimensional NMR spectra (128 scans) were recorded at 25°C on a home-built/GE Omega spectrometer, operating at a frequency of 500 MHz. The NMR data were processed using FELIX 2.3 (Biosym Inc.), applying sine bell and Gauss-Lorentz window functions for resolution enhancement. Proton chemical shifts were referenced to H2O at 4.74 ppm.
Protein Concentration-The concentrations of the wild-type and mutant proteins, used in the NMR analysis, were determined by amino acid analysis (24) on an Applied Biosystems 420A derivatizer/analyzer and on-line narrow bore high performance liquid chromatography system (Applied Biosystems). These "stock solutions" were stored at 4°C and used subsequently in the HA-binding assays (see below).
Biotinylation of HA-Rooster comb HA (Sigma) was biotinylated using a modification (see footnote 2) of the method of Yu and Toole (25). Briefly, 20 µl of 250 mM biotin-LC-hydrazide (Pierce and Warriner, Chester, U. K.) in dimethyl sulfoxide was added to 1 ml of 5 mg/ml HA (in 0.1 M MES, pH 5.5) followed by 13 µl of 25 mg/ml EDAC in 0.1 M MES, pH 5.5, and the reaction mixture was stirred at room temperature overnight. The sample was dialyzed extensively against water, and particulate material was removed by centrifugation (12,000 × g for 1 min). The concentrations of HA samples (either biotinylated or unmodified) were determined using the metahydroxybiphenyl reaction (26) relative to standards made from rooster comb HA dried in vacuo over cobalt chloride.
Analysis of HA Binding-The HA-binding activities of wild-type and mutant Link_TSG6 were determined colorimetrically using a microtiter plate assay that measures the binding of biotinylated HA to protein-coated wells (18,19,23). All dilutions, incubations, and washes were performed in 50 mM sodium acetate, 100 mM NaCl, 0.05% (v/v) Tween 20, pH 6.0, unless otherwise stated; it has been shown previously that the interaction between Link_TSG6 and HA is maximal at pH 6.0 (19). Maxisorp F96 plates (Nunc) were coated overnight with 200 µl/well protein solution (25 pmol/well; protein concentrations were determined for stock solutions as described above) in 20 mM Na2CO3, pH 9.6. Control wells were incubated with buffer alone and then treated as for sample wells. The coating solution was removed and the plates washed three times. Nonspecific binding sites were blocked by incubation with 1% (w/v) bovine serum albumin for 90 min at 37°C followed by three more washes. A 200-µl solution containing 12.5 ng of biotinylated HA was added to each well, in the absence or presence of 2,500 ng of unmodified rooster comb HA, and incubated at room temperature for 4 h. Plates were washed three times, and 200 µl of a 1:10,000 dilution of ExtraAvidin alkaline phosphatase (Sigma) was added and incubated for 30 min. After three more washes, wells were incubated for 10 min with 200 µl of a 1 mg/ml solution of disodium p-nitrophenyl phosphate (Sigma) in 100 mM Tris-HCl, 100 mM NaCl, 5 mM MgCl2, pH 9.3. The absorbance at 405 nm was determined on an MKII Titertek Multiscan Plus plate reader. All absorbance measurements were corrected by subtracting values from uncoated control wells. Each interaction was investigated in quadruplicate in three separate plate assays (i.e. n = 12).
Footnote 2: S. Banerji, personal communication.
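The data reduction described for the plate assay (background subtraction, then a mean and standard error over n = 12 wells, then activity as a percentage of wild type) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function names and all absorbance readings are hypothetical, and it assumes that the control correction uses the mean of the uncoated wells and that the percentage is taken on the corrected means.

```python
from math import sqrt
from statistics import mean, stdev


def subtract_background(sample_wells, control_wells):
    """Correct each sample reading by the mean of the uncoated control wells."""
    background = mean(control_wells)
    return [a - background for a in sample_wells]


def mean_and_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


def percent_of_wild_type(mutant_mean, wild_type_mean):
    """Express a mutant's mean corrected absorbance relative to wild type."""
    return 100.0 * mutant_mean / wild_type_mean


# Hypothetical A405 readings (NOT data from the paper): 3 plates x 4 wells = 12.
wild_type = [0.82, 0.79, 0.85, 0.80, 0.78, 0.83, 0.81, 0.84, 0.80, 0.79, 0.82, 0.83]
mutant = [0.21, 0.19, 0.22, 0.20, 0.18, 0.21, 0.20, 0.23, 0.19, 0.20, 0.21, 0.22]
controls = [0.05, 0.04, 0.06, 0.05]

wt_mean, wt_se = mean_and_se(subtract_background(wild_type, controls))
mut_mean, mut_se = mean_and_se(subtract_background(mutant, controls))
print(f"wild type: {wt_mean:.3f} +/- {wt_se:.3f}")
print(f"mutant:    {mut_mean:.3f} +/- {mut_se:.3f} "
      f"({percent_of_wild_type(mut_mean, wt_mean):.0f}% of wild type)")
```

The same functions could be applied to wells incubated with competing unlabeled HA to estimate the non-competable fraction of the signal.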
FIG. 1. DNA and translated protein sequences of the VII-6-1mut8-5 plasmid used in site-directed mutagenesis. The amino acids of Link_TSG6 targeted for mutagenesis are indicated in bold; residues are numbered according to the sequence of the expressed Link module (21). The sequences of oligonucleotides, denoted 1-18, used in the mutagenesis reactions are shown in italics, aligned below the wild-type sequence, with the altered nucleotides in bold. Oligonucleotides 1 and 2 are selection primers that change the wild-type NcoI restriction site (CCATGG) to NdeI (CATATG). Oligonucleotide 1 is used in conjunction with the mutagenesis primers 3-18, whereas oligonucleotide 2 is a combined selection and mutagenesis primer. Oligonucleotides 5, 7, and 14 are degenerate primers, each with the potential to produce four different mutant sequences.
Isothermal Titration Calorimetry-The interactions between six Link_TSG6 mutants and an octasaccharide of HA (HA8) were investigated on a MicroCal VP-ITC instrument at 25°C in 5 mM Na-MES, pH 6.0. A 335 µM solution of HA8, prepared by digestion of human umbilical cord HA with ovine testicular hyaluronidase and purified by gel filtration and ion exchange chromatography, was added in 5-µl injections (28 in total) to protein (ranging from 7.3 to 25.7 µM) in the 1.4-ml calorimeter cell. Data were fitted to a one-site model by nonlinear least squares regression with the Origin software package, after subtracting the heats resulting from the addition of HA8 into buffer alone as described previously (27).
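For orientation, the mass-action relation that underlies a one-site binding model can be sketched as below. This is an illustrative simplification, not the Origin fitting routine used here: the dissociation constant is a hypothetical placeholder, the ligand concentration in the cell is computed by simple dilution without the displaced-volume correction that ITC software normally applies, and no heats are modeled.

```python
from math import sqrt


def complex_conc(p_total, l_total, kd):
    """Concentration of 1:1 complex from total protein, total ligand, and Kd.

    Solves Kd = (P - PL)(L - PL) / PL for PL (all values in the same units).
    """
    s = p_total + l_total + kd
    return (s - sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def titration(p_cell_uM, l_syringe_uM=335.0, cell_ml=1.4,
              inj_ul=5.0, n_inj=28, kd_uM=1.0):
    """Return (injection, ligand:protein ratio, fraction of protein bound).

    kd_uM is a HYPOTHETICAL value chosen for illustration only.
    Simplification: ligand is treated as diluted into a fixed cell volume.
    """
    rows = []
    for i in range(1, n_inj + 1):
        added_ml = i * inj_ul / 1000.0
        l_total = l_syringe_uM * added_ml / cell_ml
        bound = complex_conc(p_cell_uM, l_total, kd_uM)
        rows.append((i, l_total / p_cell_uM, bound / p_cell_uM))
    return rows


for inj, ratio, frac in titration(p_cell_uM=25.7)[::9]:
    print(f"injection {inj:2d}: HA8/protein = {ratio:.2f}, fraction bound = {frac:.2f}")
```

With the volumes and concentrations stated above, 28 injections bring the ligand-to-protein molar ratio to roughly 1.3 at the highest protein concentration, under the simple-dilution assumption.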
Homology Modeling-The three-dimensional structures of the two Link modules from link protein (Lp1 and Lp2) were each modeled using the program Modeller4 (38) on the basis of the coordinates of the human TSG-6 Link module (23) and the alignment in Fig. 2. In each case, 100 independent models were generated, and the model with the lowest energy (based on the value of the molecular probability density function) was chosen. XPLOR version 3.8 (39) was used to add hydrogen atoms and disulfide bonds and to carry out energy minimization and molecular dynamics simulations with the CHARMm22 force field (40). Briefly, three rounds of energy minimization were carried out with the backbone fixed. In the first, the electrostatic term was excluded, and a purely repulsive nonbonded force field was used. In the following rounds, the full CHARMm22 force field, which included a Lennard-Jones potential, was used, but electrostatic interactions were only included in the third round. This was followed by molecular dynamics where all atoms of residues 1-71 and 76-99 in Lp1, and amino acids 1-11, 14-39, and 43-96 in Lp2, were fixed so that only regions corresponding to insertions or deletions, when compared with Link_TSG6 (Fig. 2), were free to move. A final energy minimization was performed using the full CHARMm22 force field as in round three above. PROCHECK (41) was used to determine that the number of stereochemical violations in the final models was similar to that of the solution structure of the TSG-6 Link module.
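The residues left free during the molecular dynamics step follow directly from the fixed ranges quoted above. The short sketch below derives them; it assumes that the models are numbered from 1 to the module length, with lengths taken from the link protein residue ranges given in the Fig. 2 legend (159-257 for Lp1 and 259-354 for Lp2). The function name is hypothetical.

```python
def free_residues(length, fixed_ranges):
    """Residues (1-based) not covered by any inclusive fixed range."""
    fixed = set()
    for start, end in fixed_ranges:
        fixed.update(range(start, end + 1))
    return sorted(set(range(1, length + 1)) - fixed)


# Module lengths from the residue ranges in human link protein (Fig. 2 legend).
LP1_LENGTH = 257 - 159 + 1
LP2_LENGTH = 354 - 259 + 1

# Fixed ranges as stated in the molecular dynamics protocol.
lp1_free = free_residues(LP1_LENGTH, [(1, 71), (76, 99)])
lp2_free = free_residues(LP2_LENGTH, [(1, 11), (14, 39), (43, 96)])

print("Lp1 free residues:", lp1_free)
print("Lp2 free residues:", lp2_free)
```

Under this numbering assumption, only a handful of residues in each module (those at insertions or deletions relative to Link_TSG6) are allowed to move.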

RESULTS AND DISCUSSION
Residue Selection and Mutagenesis-Fifteen sequence positions of Link_TSG6 were selected for mutagenesis. Eight of these residues (i.e. Lys-11, Tyr-12, Tyr-59, Lys-72, Asp-77, Tyr-78, Arg-81, and Glu-86), which form a coherent patch on the Link module surface, were chosen because they have been predicted previously to be involved in HA binding (23). Arg-8 was picked because it is adjacent to this patch, and a basic amino acid at this position is involved in HA binding in human CD44 (14). Asp-89, which is completely buried in the hydrophobic core, was chosen because it could be involved in mediating the unusual pH dependence of HA binding to Link_TSG6 (19,42). Asn-67, Phe-70, and Ile-75 are located on the β4/β5 loop and have been demonstrated to be perturbed significantly on binding to HA8 (27). Glu-6 and Lys-13 were selected because they have been implicated in TSG-6-mediated inhibition of neutrophil migration in an in vivo model of inflammation (16).
In total, 21 Link_TSG6 mutant constructs were generated and verified by DNA sequencing (listed in Table I). All of the mutants were found to express at levels similar to wild-type Link_TSG6 (21). Electrospray ionization mass spectrometry revealed that the mutant proteins had molecular masses that differed by less than 1.5 Da from their theoretical masses (data not shown).
FIG. 2. Alignment of Link_TSG6 with Lp1 and Lp2. Lp1 and Lp2, which correspond to residues 159-257 and 259-354, respectively, in human link protein (32), are aligned with residues 1-97 of Link_TSG6 (23). This alignment (23) was used for molecular modeling of Lp1 and Lp2 on the basis of the Link_TSG6 coordinates.

TABLE I
HA-binding activities of Link_TSG6 mutants and the effect of mutagenesis on protein structure. Each mutant was analyzed by one-dimensional NMR spectroscopy to determine the effect of mutagenesis on the Link module structure. As shown in Fig. 3, mutants can be classified into three groups (wild-type fold, perturbed fold, and unfolded). The HA-binding activities of all mutants were determined and compared with wild-type protein. Only mutants that have wild-type folds (see Fig. 4) provide information on whether a particular amino acid is involved in HA binding.

Structural Characterization of Link_TSG6 Mutants-One-dimensional NMR spectroscopy was used to assess the effect of each of the mutations on the Link module fold. Wild-type Link_TSG6 has a characteristic one-dimensional NMR spectrum (Fig. 3), with well dispersed signals (e.g. in the amide region ~7.5-9.5 ppm) and the methyl resonances from Val-57 shifted to high field (-0.5 and -1.1 ppm) because of their proximity to Trp-51 and Trp-88 in the hydrophobic core (19,23). Thirteen of the mutants (Table I) give NMR spectra (data not shown) that are essentially identical to that of the wild-type protein (e.g. Y59F illustrated in Fig. 3). Thus, it can be concluded that these amino acid substitutions have no effect on the Link_TSG6 fold. However, other mutations give rise either to unfolded protein (e.g. E86A; Fig. 3) or to a Link module that, while folded, is structurally different from that of wild-type Link_TSG6 (e.g. Y78S; Fig. 3). Therefore, all of the mutants can be classified as having a wild-type fold, a perturbed fold, or being unfolded on the basis of their NMR spectra (Table I).
Clearly, only the mutants that have wild-type folds can be used to provide information on the role of a particular amino acid sequence position in ligand binding.
HA-Binding Experiments-The HA-binding activities of wild-type and mutant Link_TSG6 were analyzed using a microtiter plate assay that we have described previously (18,19,23). For wild-type Link_TSG6, maximum binding of biotinylated HA (12.5 ng) was seen when protein was coated at 25 pmol/well (data not shown). Amino acid analysis of coating solutions, following incubation overnight in the microtiter plate, indicated that greater than 90% of the protein (wild-type and eight mutants tested) was adsorbed onto the well (data not shown). From Fig. 4, which shows the experimental data for the mutants with wild-type folds, it can be seen that the binding of biotinylated HA is highly specific because it is greatly reduced by the presence of unlabeled HA. Some mutants show a degree of nonspecific (i.e. non-competable) binding, but in the worst case (N67L) this is less than 15% of the value for wild-type protein determined in the absence of competitor (Fig. 4). Table I shows the HA-binding activities of all of the Link_TSG6 mutants (as a percentage of wild-type binding). The mutants that either have a perturbed fold (i.e. N67S, I75A, D77A, Y78S, and D89A) or are unfolded (i.e. R81A, E86A, and E86S) have a greatly reduced HA-binding function (with between 2 and 26% of wild-type binding). Because these mutations affect the Link module structure (see above), it is impossible to tell whether the loss of activity results from the residue being involved in binding or from the perturbation of the interaction surface. Therefore, they provide no information on the role of a particular amino acid in binding.
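The filtering logic used here (only mutants with a wild-type fold are informative about binding) can be made explicit with a small sketch. The fold classes below are those stated in the text for the 21 mutants named in this section; the dictionary layout and helper names are hypothetical, and no binding values are encoded.

```python
# Fold class of each mutant, as stated in the text (13 + 5 + 3 = 21 mutants).
FOLD_CLASS = {
    **{m: "wild-type" for m in (
        "E6A", "R8A", "K11Q", "Y12F", "Y12V", "K13A", "Y59F", "Y59S",
        "N67L", "F70V", "K72A", "Y78F", "Y78V")},
    **{m: "perturbed" for m in ("N67S", "I75A", "D77A", "Y78S", "D89A")},
    **{m: "unfolded" for m in ("R81A", "E86A", "E86S")},
}


def position(mutant):
    """Sequence position from a mutant name, e.g. 'Y78F' -> 78."""
    return int(mutant[1:-1])


# Mutants whose binding data can be interpreted at the residue level.
informative = sorted(m for m, fold in FOLD_CLASS.items() if fold == "wild-type")

# Positions for which every mutant made is perturbed or unfolded:
# no conclusion about their role in HA binding is possible.
all_positions = {position(m) for m in FOLD_CLASS}
informative_positions = {position(m) for m in informative}
unresolved_positions = sorted(all_positions - informative_positions)

print(len(informative), "informative mutants")
print("unresolved positions:", unresolved_positions)
```

Positions 67 and 78 remain interpretable because each has at least one mutant with a wild-type fold, even though N67S and Y78S are perturbed.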
FIG. 3. One-dimensional 1H-NMR spectra of wild-type and mutant proteins. The wild-type Link_TSG6 (WT) has a well dispersed spectrum with the methyl resonances of Val-57 (V57), which forms part of the stable hydrophobic core, being shifted to high field. Y59F has an NMR spectrum that is essentially identical to that of the wild type and therefore can be classified as having a wild-type fold. E86A has poorly dispersed resonances, with no high field-shifted Val-57 methyls, and the spectrum is characteristic of that of an unfolded protein. The spectrum of Y78S, although having some features of a folded protein (i.e. with high field-shifted methyls), is significantly different from that of wild type. This mutant therefore can be classified as having a perturbed fold.
The binding data for the 13 mutants with wild-type folds are presented in Fig. 4. From this it can be seen that E6A, R8A, K13A, N67L, and K72A have functional activities similar to that of wild-type protein with 93, 108, 89, 79, and 88% of wild-type binding, respectively (Table I). Therefore, it can be concluded that Glu-6, Arg-8, Lys-13, Asn-67, and Lys-72 are unlikely to be involved in the interaction of Link_TSG6 with HA.
The mutation of Lys-11, Tyr-12, Tyr-59, Phe-70, or Tyr-78 (i.e. mutants K11Q, Y12F, Y12V, Y59F, Y59S, F70V, Y78F, and Y78V) each leads to a large reduction in activity (7-30% of wild-type binding; see Table I). Table II shows that the mutation of these amino acids also leads to a significant reduction in the affinities of HA binding in solution, whereas K72A exhibits wild-type activity. This indicates that the results obtained with the microtiter plate assay are reliable and are not an artifact caused by immobilization of the protein on the plate. These data indicate that Lys-11, Tyr-12, Tyr-59, Phe-70, and Tyr-78 are likely to participate directly in HA binding. For example, Lys-11 could be making an ionic interaction with a carboxyl group of HA; basic amino acids have been implicated previously in protein-HA interactions (13,14,43-45). Recent calorimetry studies indicate that the interaction of Link_TSG6 with HA8 involves the formation of one or two salt bridges (46).
FIG. 4. Comparison of the HA-binding activities of Link module mutants with wild-type Link_TSG6. The binding of biotinylated HA to wild-type (WT) or mutant proteins was determined using a colorimetric assay in the absence or presence of competing unlabeled HA (200-fold molar excess). Values are plotted as the mean absorbance (n = 12) at 405 nm after a 10-min development time ± the S.E. The mutants shown here are those that have wild-type folds (Table I).
FIG. 5. Position of the HA-binding site on Link_TSG6. The TSG-6 Link module structure (23) is shown as a space-filling representation (generated using the program RasMol (50)) in four orientations. Mutated amino acids are color-coded according to the effect of the amino acid substitution on HA-binding activity or the structural integrity of the Link module fold. Residues in which all of the mutations made lead to a perturbed/unfolded structure are shown in pink; no conclusions can be made about their role in ligand binding. Amino acids that are important for HA binding (i.e. the mutation leads to a large reduction in functional activity) are colored red; those that are not involved are denoted in green.
The conservative replacement of any of the three tyrosines (i.e. Tyr-12, Tyr-59, and Tyr-78) with phenylalanine leads to a large drop in functional activity (Fig. 4), indicating that the hydroxyl groups in these residues make an important contribution to HA binding. Y12F and Y78F have activities that are similar to those of Y12V and Y78V, respectively, showing that in Tyr-12 and Tyr-78 the hydroxyls alone (but not the aromatic rings) are involved in the interaction. The serine mutant of Tyr-59 (Y59S) has a slightly reduced binding capacity compared with Y59F, suggesting that the aromatic ring, in this case, may also take part. Mutation of Phe-70 to Val (F70V) reduces HA-binding activity significantly, indicating that its aromatic ring makes an important contact with HA.
The individual mutation of these five amino acids (i.e. Lys-11, Tyr-12, Tyr-59, Phe-70, and Tyr-78) leads to a large reduction in functional activity, indicating that there is an extensive network of interactions between the protein and polysaccharide, and loss of any one of these (such as a hydrogen bond from Tyr-12 to HA) can have a dramatic effect on HA binding.
Localization of the HA-Binding Site on Link_TSG6 -The positions of the 15 amino acids mutated here were mapped onto the structure of the TSG-6 Link module (Fig. 5). The five residues that are implicated in HA binding (colored red) form a cluster on one face of the molecule. Therefore, it is likely that this represents the position of the HA-binding surface on Link_TSG6. This is consistent with recent NMR studies (27) identifying the residues of Link_TSG6 that exhibit significant chemical shift changes (for HN, NH, Cα, and Cβ atoms) on binding to HA, which include Lys-11, Tyr-59, Phe-70, and Tyr-78. Other amino acids, located on the β4/β5 loop (residues 61-74), which were found here not to be involved in HA binding, also experienced large shift perturbations (i.e. Asn-67 and Lys-72 (27)), indicating that this region of the Link module undergoes a ligand-induced conformational change. This structural alteration may be mediated, in part, by the interaction of the aromatic ring of Phe-70 with HA.
As described above, eight of the amino acids selected for mutagenesis were predicted to be involved in the interaction with HA (23). Of these, Lys-11, Tyr-12, Tyr-59, and Tyr-78 have been found to participate in HA binding, whereas Lys-72 is not involved. Mutation of Asp-77, Arg-81, and Glu-86 compromises the structural integrity of the Link module, such that no conclusion can be made regarding their role in HA binding. However, their involvement cannot be excluded.
FIG. 6. Comparison of HA-binding sites on TSG-6 and CD44. In A, the Link modules from TSG-6 and CD44 (modeled on the Link_TSG6 coordinates (4)) are shown in similar orientations on the basis of their secondary structural elements. Residues that are involved in HA binding in Link_TSG6 (determined here) are colored red; amino acids of CD44 which are critical or important for interaction with HA are shown in dark blue or light blue, respectively. The functional residues of CD44 were identified by site-directed mutagenesis as described in Bajorath et al. (14) and are numbered accordingly. All of the HA-binding residues on the CD44 Link module are visible apart from Lys-68, which is on the opposite face of the protein. B, the Link_TSG6 structure (as in A) showing the 11 sequence positions that can contribute to HA binding in TSG-6 and/or CD44 and form a coherent patch on one face of the Link module surface. These are color-coded, as described below, and numbered 1-11. Sequence positions that are involved in HA binding in either TSG-6 or CD44 alone are colored as in A (i.e. red or blue, respectively). Positions 2 and 3, which mediate HA binding in both TSG-6 and CD44, are depicted in purple.

FIG. 7. Alignment of Link module sequences. Residues of TSG-6 and CD44 which have been demonstrated by mutagenesis to interact with HA are colored as in Fig. 6A; amino acids that are not involved in HA binding are shown in lowercase. Asterisks denote sequence positions that can contribute to HA binding in TSG-6 and/or CD44 (numbered 1-11 as in Fig. 6). These are colored (as in Fig. 6B) to indicate whether the sequence position is functionally TSG-6-specific, CD44-specific, or utilized by both proteins. This color coding is also used to indicate whether an amino acid capable of making an interaction with HA (i.e. salt bridges or hydrogen bonds) is found at these positions in the Link modules from other members of the Link module superfamily. Residues are underlined if they are identical to, or a conservative replacement of, functional amino acids in TSG-6 or CD44. Residues shown in green in Lp2 may also be involved in HA binding (see Fig. 8 and "Results and Discussion").

Glu-6 and Lys-13, which have been implicated in TSG-6-mediated inhibition of neutrophil migration (16), are clearly not involved in HA binding (Fig. 3 and Table I). In this study we mutated both of these residues to alanine, whereas Wisniewski et al. (16) altered them to lysine and glutamic acid, respectively. It is possible, therefore, that these latter mutants (equivalent to E6K and K13E) may have reduced HA-binding capabilities (e.g. because of having perturbed structures).
HA-Binding Sites in TSG-6 and CD44 Link Modules-The residues of the CD44 Link module which mediate HA binding have been identified by site-directed mutagenesis (14). Four amino acids (shown in dark blue on Fig. 6A) are essential for high affinity binding (i.e. mutation of any one of these greatly reduces functional activity), and five other amino acids (light blue) are involved but not critical. From Fig. 6A it can be seen that the positions of the HA-binding sites on Link_TSG6 and CD44 map to the same face of the Link module. In addition, the essential HA-binding residues Arg-41 and Tyr-42 in CD44 are found at sequence positions identical to those of Lys-11 and Tyr-12, respectively, in Link_TSG6 (Fig. 7). This indicates that the location of the HA-binding surface may be conserved across the Link module superfamily. Consistent with this, the epitope recognized by a monoclonal antibody that inhibits HA binding to link protein (47) also maps to this face of the module.
Further consideration of the HA-binding amino acids in CD44 and TSG-6 suggests that, although the positions of the binding surface are similar, the molecular details of the interactions are likely to be significantly different. Apart from the amino acids described above (i.e. Lys-11 and Tyr-12 in Link_TSG6, and Arg-41 and Tyr-42 in CD44), none of the other HA-binding residues are found at equivalent sequence positions in these proteins. As can be seen from Fig. 7, the critically important residues Arg-78 and Tyr-79 in CD44 are replaced in TSG-6 by alanines (Ala-48 and Ala-49), which are unable to make ionic or hydrogen bonds to the sugar. In addition, Lys-38 and Asn-101 in CD44 are both involved in the interaction with HA, whereas the corresponding residues in Link_TSG6 (i.e. Arg-8 and Lys-72, respectively) have been shown here not to participate in binding (Fig. 4).
HA-Binding Consensus-Comparison of the functional residues determined for TSG-6 and CD44 indicates that at least 12 amino acid sequence positions of the Link module can be involved in HA binding. As shown in Fig. 6B (for Link_TSG6) 11 of these form a coherent surface patch on one face of the module. Only positions 2 and 3 are utilized in both TSG-6 and CD44, whereas all of the others are either TSG-6-specific (positions 6, 7, and 11) or CD44-specific (positions 4, 5, 8, 9, and 10). These 11 sequence positions are likely to be of functional importance in other members of the Link module superfamily; it is expected that a particular protein will utilize a certain combination of these "consensus" residues to form its HA-binding surface. Given the conservation of binding residues at positions 2 and 3 in TSG-6 and CD44 it is probable that these represent key determinants in the HA interaction, across the superfamily as a whole. Fig. 7 shows 18 Link modules from 10 different human proteins aligned with TSG-6 and CD44, where the residues (at consensus positions 1-11) which have the potential to mediate carbohydrate binding (i.e. by making ionic or hydrogen bonds (48,49)) are highlighted. For example, Lp1 has potential HA-binding residues at consensus positions 1, 2, 3, 4, 6, 8, 9, 10, and 11 (Fig. 7). This is also illustrated in Fig.  8, which shows the locations of these residues on a homology model of Lp1, generated on the basis of the Link_TSG6 coordinates (see "Experimental Procedures"). It is possible that some of these residues may be more likely to participate in the interaction with HA than others because they correspond to identities or conservative replacements of functional amino acids in TSG-6 or CD44 (i.e. in Lp1 this corresponds to positions 1, 2, 3, 6, 8, 9, and 11, which are underlined on Fig. 7).
It should be noted that the Link module-containing proteins KIA0246, CAB61358, and KIA0527 (Fig. 7) have not yet been shown to have an HA-binding function. As mentioned above, there are likely to be large networks of interactions required to stabilize HA-protein complexes. Given the number of conserved residues (compared with CD44 and TSG-6; underlined in Fig.  7) at the consensus sequence positions (one in KIA0246 at position 3; three in CAB61358 at positions 3, 6, and 11; none in KIA0527) we predict that KIA0246 and KIA0527 are unlikely to be HA-binding proteins, although it is possible that CAB61358 may be functionally active.
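The counting argument used in this comparison can be sketched as a simple tally of conserved residues at the consensus positions. The position classes and the conserved positions per module are those stated in the text; position 1 is left unassigned because its class is not stated here. The data layout and function name are hypothetical, and no activity threshold is implied.

```python
# Functional class of each consensus position, as stated in the text.
# Position 1 is not classified in the text and is treated as "unassigned".
POSITION_CLASS = {
    2: "both", 3: "both",
    6: "TSG-6", 7: "TSG-6", 11: "TSG-6",
    4: "CD44", 5: "CD44", 8: "CD44", 9: "CD44", 10: "CD44",
}

# Consensus positions with conserved (underlined) residues, from the text.
CONSERVED = {
    "Lp1": [1, 2, 3, 6, 8, 9, 11],
    "KIA0246": [3],
    "CAB61358": [3, 6, 11],
    "KIA0527": [],
}


def by_class(positions):
    """Group consensus positions by their functional class."""
    groups = {}
    for p in positions:
        groups.setdefault(POSITION_CLASS.get(p, "unassigned"), []).append(p)
    return groups


# Rank modules by the number of conserved consensus residues.
ranking = sorted(CONSERVED, key=lambda name: len(CONSERVED[name]), reverse=True)
for name in ranking:
    print(f"{name}: {len(CONSERVED[name])} conserved, {by_class(CONSERVED[name])}")
```

A tally like this only orders candidates by how many consensus residues they retain; whether a given module binds HA still has to be tested experimentally.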
All of the HA-binding activity of aggrecan is located in its G1 domain, which contains two contiguous Link modules (aggrecan1 and aggrecan2 in Fig. 7) (7-9). Both of these Link modules are required for high affinity HA binding (9). Visual inspection of the nonfunctional Link modules of the G2 domain (i.e. aggrecan3 and aggrecan4) shows that these have sequences similar to those of aggrecan1 and aggrecan2 (Fig. 7), respectively. Therefore, it is not obvious why modules 3 and 4 are inactive. The only significant difference is that aggrecan4 does not have a basic residue at consensus position 2. However, given the importance of this sequence position in TSG-6 and CD44, this may be enough to render this module, and hence the G2 domain, inactive.
The analysis described above does not exclude the possibility that amino acids at other sequence positions are also involved in HA binding. In this regard, visual inspection of the Lp2 model indicates that Arg-66, Lys-85, and Tyr-87 (colored green on Figs. 7 and 8) may contribute to the interaction with HA because they are located in close proximity to the consensus HA-binding residues. From Fig. 8 it can be seen that Tyr-87 is at a location on the Link module surface very similar to that usually occupied by consensus position 3 (which is a leucine in Lp2 rather than a tyrosine). It is also interesting to note that all of the Link modules that have a leucine at position 3 (Lp2, BRAL1-2, brevican2, and neurocan2) have an arginine in an equivalent sequence site to Arg-66 in Lp2. Therefore, careful analysis of the Link module alignment (Fig. 7) in conjunction with individual Link module models (such as those for Lp2 in Fig. 8) allows the identification of amino acids that have a reasonable probability of being involved in HA binding. Clearly, not all of the residues identified in this manner will be involved in binding, but they provide excellent candidates for programs of site-directed mutagenesis.
FIG. 8. Lp1 and Lp2 were modeled on the basis of the Link_TSG6 coordinates. The models are shown in the same orientation (on the basis of secondary structure elements) as for Link_TSG6 and CD44 in Fig. 6. Amino acids that could participate in HA binding are colored (as in Fig. 6B and Fig. 7) to indicate whether the sequence position at which they are found is TSG-6-like (red), CD44-like (dark or light blue), or common (purple). In Lp2, additional amino acids can be identified (green), which could contribute to HA binding and are in close proximity to the consensus residues.
Conclusions-Site-directed mutagenesis has identified five amino acids in the Link module of human TSG-6 which contribute to HA binding. Comparison of this ligand-interaction surface with that determined previously for CD44 has led to a prediction of the HA-binding residues in other members of the Link module superfamily by using a combination of sequence alignment and molecular modeling.