Crystal structure of the C-terminal peptidoglycan-binding domain of human peptidoglycan recognition protein Ialpha.

Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors of the innate immune system that bind, and in some cases hydrolyze, peptidoglycans (PGNs) on bacterial cell walls. These molecules, which are highly conserved from insects to mammals, participate in host defense against both Gram-positive and Gram-negative bacteria. We report the crystal structure of the C-terminal PGN-binding domain of human PGRP-Ialpha in two oligomeric states, monomer and dimer, to resolutions of 2.80 and 1.65 A, respectively. In contrast to PGRPs with PGN-lytic amidase activity, no zinc ion is present in the PGN-binding site of human PGRP-Ialpha. The structure reveals that PGRPs exhibit extensive topological variability in a large hydrophobic groove, located opposite the PGN-binding site, which may recognize host effector proteins or microbial ligands other than PGN. We also show that full-length PGRP-Ialpha comprises two tandem PGN-binding domains. These domains differ at most potential PGN-contacting positions, implying different fine specificities. Dimerization of PGRP-Ialpha, which occurs through three-dimensional domain swapping, is mediated by specific binding of sodium ions to a flexible hinge loop, stabilizing the conformation found in the dimer. We further demonstrate sodium-dependent dimerization of PGRP-Ialpha in solution, suggesting a possible mechanism for modulating PGRP activity through the formation of multivalent adducts.

The innate immune system is a host defense mechanism, evolutionarily conserved from insects to humans, that mediates the early recognition and control of invading microorganisms (1,2). The basis for innate immune responses resides in the ability of the host to recognize conserved products of microbial metabolism that are unique to microorganisms and are not produced by the host (pathogen-associated molecular patterns, or PAMPs). Examples of PAMPs recognized by pattern recognition receptors of the innate immune system include lipopolysaccharide (LPS) of Gram-negative bacteria, DNA sequences containing unmethylated CpG dinucleotides (CpG DNA), and peptidoglycan (PGN), present in both Gram-positive and Gram-negative bacteria (1,2).
In Drosophila, PGRPs activate two different signaling pathways that induce production of antimicrobial peptides. PGRP-SA interacts with the lysine-containing PGN from Gram-positive bacteria, which activates the Toll receptor pathway (2,12). PGRP-LC and PGRP-LE recognize the diaminopimelic acid-containing PGN from Gram-negative bacteria and activate the Imd/Relish pathway (2,13). The functions of mammalian PGRPs are much less well understood. However, mouse PGRP-S, which is present in neutrophil tertiary granules, inhibits the growth of certain Gram-positive bacteria in culture media and participates in the intracellular killing of bacteria in neutrophils (5). Mice deficient in PGRP-S exhibit increased susceptibility to intraperitoneal infections with low pathogenicity Gram-positive bacteria (15). Bovine PGRP-S (also termed oligosaccharide-binding protein), found in neutrophil and eosinophil granules, is bacteriostatic or bacteriocidal for both Gram-positive and Gram-negative bacteria (16). This PGRP kills, in addition, some microorganisms that completely lack PGN, suggesting that certain PGRPs recognize envelope components other than PGN. Human and mouse PGRP-L, like several Drosophila PGRPs (8,17), hydrolyze the amide bond between the N-acetylmuramic acid and L-alanine moieties of PGNs (9,18). Mouse PGRP-S has been shown to form a cytotoxic complex with the major stress protein Hsp70 that induces apoptotic death in various tumor lines at subnanomolar concentrations (19).
All insect and mammalian PGRPs contain a conserved C-terminal domain of ~165 amino acids (designated the PGRP domain) with ~30% sequence similarity to bacteriophage T7 lysozyme, a zinc-dependent N-acetylmuramoyl-L-alanine amidase (2,11,20). Recently, the crystal structure of Drosophila PGRP-LB was reported (17), showing an active site cleft with a bound zinc ion, in common with T7 lysozyme. To define the structural relationship between insect and mammalian PGRPs, we determined the crystal structure of the C-terminal PGN-binding domain of human PGRP-Iα. Unexpectedly, the protein was found to exist in two oligomeric states, monomer and domain-swapped dimer, in both the crystal and solution. In contrast to Drosophila PGRP-LB, no zinc ion is present in the PGN-binding site of human PGRP-Iα. The structure further revealed that PGRPs are characterized by considerable variability in a large hydrophobic groove, located behind the PGN-binding site, which may serve as a binding site for host proteins or PAMPs besides PGN.

EXPERIMENTAL PROCEDURES
DNA Synthesis, Protein Expression, and Purification-A synthetic gene encoding full-length human PGRP-Iα (341 residues) was assembled in vitro using recursive PCR techniques. The DNA sequence was designed for optimum expression in Escherichia coli based on codon usage preferences. DNA fragments encoding residues 177-341 or 179-341 of the C-terminal PGN-binding domain (designated PGRP-IαC1 and PGRP-IαC2, respectively), or residues 19-176 of the N-terminal PGN-binding domain (PGRP-IαN), were generated by PCR from the full-length gene and cloned into the bacterial expression vector pT7-7 (Novagen). The proteins were expressed as inclusion bodies in E. coli BL21(DE3) cells (Stratagene). Bacteria were grown at 37°C to an absorbance of 0.7 at 600 nm, and isopropyl-β-D-thiogalactoside was added to a concentration of 1 mM. After incubation for 3-4 h, the bacteria were harvested by centrifugation and resuspended in 50 mM Tris-HCl (pH 8.0) containing 0.1 M NaCl, 2 mM EDTA, and 0.5% (v/v) Triton X-100; cells were disrupted by sonication. The inclusion bodies were washed three times with 50 mM Tris-HCl (pH 8.0), 0.1 M NaCl, 2 mM EDTA, and 0.5% Triton X-100, and another three times with the same buffer without Triton X-100, then solubilized in 50 mM Tris-HCl (pH 8.0), 8 M urea, 2 mM EDTA, and 1 mM dithiothreitol.
For in vitro folding, solubilized PGRP-IαC1, PGRP-IαC2, and PGRP-IαN were diluted to a final concentration of 20-50 μg/ml into 1.0 M arginine, 50 mM Tris-HCl (pH 8.5), 2 mM EDTA, 5 mM cysteamine, and 0.5 mM cystamine. After 3 days at 4°C, the folding mixture was concentrated, dialyzed against 50 mM Tris-HCl (pH 8.5), and applied to a Mono Q anion exchange column (Amersham Biosciences) equilibrated in the same buffer; the protein was eluted with a linear NaCl gradient. Further purification was carried out by size exclusion using a Superdex 75 HR column (Amersham Biosciences). Mass spectrometry and N-terminal sequencing established the identities of the recombinant proteins.
For data collection, crystals of PGRP-IαC1 were cryoprotected by brief soaking in mother liquor containing 15% (v/v) glycerol and flash-cooled in liquid propane. X-ray diffraction data to 3.30 Å resolution were recorded in-house at 100 K using an R-AXIS IV++ image plate detector. Higher resolution data (2.80 Å) were collected on beamline X25 of the Brookhaven National Synchrotron Laboratory using a Quantum-4 charge-coupled device detector. Crystals of PGRP-IαC2 were cryoprotected by soaking in mother liquor containing 15% (v/v) ethylene glycol before being flash-cooled. Data to 1.65 Å resolution were measured at 100 K on beamline X26C of the Brookhaven National Synchrotron Laboratory. The data were processed and scaled using DENZO/SCALEPACK (21) (Table I).
Structure Determination and Refinement-The structure of PGRP-IαC1 was determined by molecular replacement with the program AMoRe (22) using the 3.30 Å data set. The crystal structure of Drosophila PGRP-LB (PDB accession code 1OHT) served as the search probe. The translation function search gave a clear solution with a correlation coefficient of 36.8 and R factor of 47.2% in the resolution range 8.0-4.0 Å. After rigid-body refinement, the correlation coefficient was 45.8 and the R factor was 44.2%. The solution contains one monomer in the asymmetric unit. The structure of PGRP-IαC2 was solved by molecular replacement using the refined PGRP-IαC1 model as the search probe. When checking crystal packing in XtalView (23), clashes were observed involving residues 316-327 in symmetry-related molecules. Rebuilding these residues according to omit electron density maps revealed swapping of the C-terminal α-helical domain (residues 316-341).
Refinement of the PGRP-IαC1 model was performed with CNS (24) using the 2.80 Å data set; σA-weighted 2Fo − Fc and Fo − Fc electron density maps were calculated for model adjustment. After several rounds of rebuilding in XtalView (23), including gradual residue replacement to the human PGRP-Iα sequence, Rcryst was reduced to 34.8% and Rfree to 35.9%. Group temperature factor (B) refinement was done with further model adjustment. As individual B-factor refinement failed to reduce Rfree, it was not applied to the final model, which comprises residues 177-341, two nickel ions, one sulfate, and 11 water molecules. The final Rcryst is 21.9% and Rfree is 25.2% for all data between 30.0 and 2.80 Å (Table I). The PGRP-IαC2 structure was refined similarly; the final model includes residues 179-341, one sodium ion, and 137 water molecules. The final Rcryst and Rfree are 19.6 and 21.2%, respectively, for data between 30.0 and 1.65 Å (Table I).
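As a reminder of what the reported Rcryst and Rfree values measure, the following minimal Python sketch computes a conventional crystallographic R factor, R = Σ||Fo| − |Fc|| / Σ|Fo|, from two lists of structure factor amplitudes. The function name `r_factor` and the sample amplitudes are illustrative assumptions; the sketch assumes the two lists are already on a common scale and is not the procedure implemented in CNS.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(||Fo| - |Fc||) / sum(|Fo|).

    f_obs and f_calc are equal-length sequences of structure factor
    amplitudes, assumed here to be on a common scale already.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Hypothetical amplitudes, for illustration only (not data from this study).
example_obs = [100.0, 50.0]
example_calc = [90.0, 55.0]
example_r = r_factor(example_obs, example_calc)
```

Rfree would be obtained by applying the same function to a subset of reflections excluded from refinement, and Rcryst to the remaining working set.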
Sedimentation Equilibrium-Sedimentation equilibrium studies were conducted in an Optima XL-I/A analytical ultracentrifuge (Beckman Coulter, Fullerton, CA) at a temperature of 4°C and at several rotor speeds between 20,000 and 45,000 rpm. Protein (180 μl) was loaded into Epon double-sector centerpieces at concentrations of 0.72 and 0.24 mg/ml in 0.5 M sodium tartrate (pH 7.0) (buffer A), and at 3.5, 1.2, and 0.4 mg/ml in 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl (buffer B). Equilibrium absorbance profiles were acquired at wavelengths of 250, 280, and 300 nm. In buffer A, the observed weight-average buoyant molar mass values were concentration-dependent, ranging from 3670 to 5230 Da, clearly indicating self-association. Based on the known protein monomer molar mass, the buoyant molar mass at the lowest concentrations led to effective solvated partial specific volumes of 0.749 (A) and 0.732 ml/g (B), consistent with the values of 0.750 (A) and 0.736 ml/g (B) theoretically predicted for these buffer conditions based on the amino acid composition under fully hydrated conditions (using the software SEDNTERP, kindly provided by Dr. John Philo, Alliance Protein Laboratories). The protein extinction coefficient at 280 nm was calculated as 28,730 A280/(M cm) based on amino acid composition, and the extinction coefficients at all other wavelengths were calculated as part of the global multi-wavelength analysis using the software SEDPHAT (www.analyticalultracentrifugation.com), including multiple scans of the same solution at different wavelengths (25). Data analysis was performed by global least-squares analysis of data from multiple concentrations, multiple rotor speeds, and multiple data acquisition wavelengths, based on the well known superposition of the Boltzmann distributions of ideal species in the centrifugal field, using conservation of mass constraints (25).
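The Boltzmann distribution of a single ideal species at sedimentation equilibrium can be sketched as follows, using the standard form c(r) = c(r0) exp[Mb ω² (r² − r0²) / (2RT)], where Mb = M(1 − v̄ρ) is the buoyant molar mass. This is a minimal illustration only, not the SEDPHAT global model: the function names, the solvent density, and all numerical inputs in the example are assumptions chosen for readability.

```python
import math

R_GAS = 8.314462618  # gas constant, J / (mol K)


def buoyant_molar_mass(molar_mass, vbar_ml_per_g, density_g_per_ml):
    """Buoyant molar mass M_b = M * (1 - vbar * rho)."""
    return molar_mass * (1.0 - vbar_ml_per_g * density_g_per_ml)


def equilibrium_profile(c0, r0_cm, radii_cm, buoyant_mass_g_per_mol, rpm, temp_k):
    """Concentration of one ideal species at each radius, relative to c0 at r0."""
    omega = rpm * 2.0 * math.pi / 60.0          # angular velocity, rad/s
    m_b = buoyant_mass_g_per_mol / 1000.0       # g/mol -> kg/mol
    r0 = r0_cm / 100.0                          # cm -> m
    profile = []
    for r_cm in radii_cm:
        r = r_cm / 100.0
        exponent = m_b * omega ** 2 * (r ** 2 - r0 ** 2) / (2.0 * R_GAS * temp_k)
        profile.append(c0 * math.exp(exponent))
    return profile


# Hypothetical inputs: an 18 kDa monomer, vbar = 0.75 ml/g, solvent density 1.0 g/ml.
example_mb = buoyant_molar_mass(18000.0, 0.75, 1.0)
example_profile = equilibrium_profile(0.3, 6.8, [6.8, 6.9, 7.0], example_mb, 20000, 277.15)
```

A mixture of monomer and dimer would be described by summing two such terms, with the dimer term using twice the buoyant molar mass.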
For the model describing reversible dimerization, the monomer and dimer distributions were linked by the mass action law (26). The redistribution of a known small non-participating species of ~3 kDa with a relative abundance of <10% of the loading concentration was included in the analysis; its consideration only slightly influenced the quantitative aspects of the results, without affecting the qualitative conclusions. Similarly, a term for non-ideal sedimentation due to hard-sphere repulsion (second virial coefficient B2 = 3 ml/g, assuming the absence of nonideality from Donnan effects and electrostatic interactions in the high ionic strength buffers used) (27) was considered in the analysis, leading to negligible changes in the quantitative results.
Protein Data Bank Accession Codes-Coordinates and structure factors for PGRP-IαC1 and PGRP-IαC2 have been deposited under accession codes 1SK3 and 1SK4, respectively.
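The mass action link between monomer and dimer can be illustrated with a short Python sketch. It assumes the convention K_D = [M]²/[D], with the total concentration expressed in monomer units, c_total = [M] + 2[D]; this convention, the function names, and the example concentration are assumptions made for illustration and are not taken from the SEDPHAT analysis itself.

```python
import math


def free_monomer(c_total, k_d):
    """Free monomer concentration for a reversible monomer-dimer equilibrium.

    Solves 2*m**2/k_d + m = c_total for m, where c_total is the total
    protein concentration in monomer units and k_d = [M]**2 / [D].
    Both arguments must use the same concentration unit.
    """
    if c_total <= 0 or k_d <= 0:
        raise ValueError("c_total and k_d must be positive")
    return (-k_d + math.sqrt(k_d * k_d + 8.0 * k_d * c_total)) / 4.0


def dimer_mass_fraction(c_total, k_d):
    """Fraction of all protein chains that are present in dimers."""
    return 1.0 - free_monomer(c_total, k_d) / c_total


# Hypothetical loading concentration of 40 micromolar (monomer units),
# with a dissociation constant of 130 micromolar.
example_fraction = dimer_mass_fraction(40.0, 130.0)
```

Under these assumptions, a 40 μM loading concentration and a 130 μM dissociation constant give a dimer fraction of roughly 0.30; the result depends directly on the monomer molar mass used to convert mg/ml into molar units.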

RESULTS
Overview of the Structure-Based on predictions using the domain identification algorithm SMART (28), and on the distribution of cysteine residues, two versions of the putative PGN-binding C-terminal domain of human PGRP-Iα were expressed in bacterial inclusion bodies and folded in vitro: 1) residues 177-341 (designated PGRP-IαC1) and 2) residues 179-341 (PGRP-IαC2), which differs from PGRP-IαC1 only in lacking Val177 and Cys178. The structures of PGRP-IαC1 and PGRP-IαC2 were determined by molecular replacement to resolutions of 2.80 and 1.65 Å, respectively (Table I). Both structures contain a central β-sheet composed of five β-strands (four parallel and one, β5, antiparallel) and three α-helices, similar to the zinc-dependent amidases T7 lysozyme (20) and Drosophila PGRP-LB (17) (Fig. 1A). In addition, both PGRP-IαC1 and PGRP-IαC2 contain a long N-terminal segment (residues 177-198 and 179-198, respectively), absent from T7 lysozyme, which is also found in Drosophila PGRP-LB (Fig. 1B); however, the structure of this PGRP-specific segment differs substantially in the human and insect proteins (see below). By analogy to T7 lysozyme (20), the putative PGN-binding site of PGRP-IαC resides in a cleft flanked by the α1-helix and four loops (β3-α1, α1-β4, β5-β6, and β7-α3) that project above the β-sheet platform.
The PGRP-IαC1 structure includes a disulfide bond linking Cys178 in the N-terminal PGRP-specific segment to Cys300 in the α2-helix (Fig. 1B). PGRP-IαC2 does not form this disulfide, as it lacks Cys178; the unpaired Cys300 is oxidized to cysteine sulfenic acid in the crystal. The Cys178-Cys300 disulfide is not required for PGN binding (see below). However, the most notable difference between PGRP-IαC1 and PGRP-IαC2 lies in the position of the α3-helix, which is distal to the core of the latter protein (Fig. 1A). This helix mediates formation of a domain-swapped dimer both in the PGRP-IαC2 crystal and in solution, as discussed later.
Disulfide Bonds in PGRP Domains-The human PGRP-IαC domain is cross-linked by three disulfide bonds (Cys178-Cys300, Cys194-Cys238, and Cys214-Cys220), of which only one (corresponding to Cys214-Cys220) is present in Drosophila PGRP-LB (Fig. 1A). This buried disulfide, which appears to be present in all known PGRPs except Drosophila PGRP-LE, may be required for maintaining the structural integrity of the PGRP domain. The two additional disulfides in human PGRP-Iα, Cys178-Cys300 and Cys194-Cys238, tether the N-terminal PGRP-specific segment to the α2-helix and loop α1-β4, respectively (Fig. 1). These disulfides, which are conserved in the I and S classes of mammalian PGRPs (Fig. 2), may serve to stabilize these proteins in the harsh environment of the extracellular compartment or leukocyte granules, where they are believed to reside (5,15,16).
In insects, the Cys178-Cys300 disulfide is likely found in several PGRP-S proteins (but in no PGRP-L), including Drosophila PGRP-SA and Anopheles PGRP-S3 (10). The Cys194-Cys238 disulfide, however, is not present in insect PGRPs. In mammals, this disulfide distinguishes non-catalytic from catalytic PGRPs, as it is retained by all non-catalytic PGRPs, but is absent from PGRPs with demonstrated PGN-hydrolyzing activity (human and mouse PGRP-L) (9,18). The Cys194-Cys238 disulfide cannot form in catalytic PGRPs (mammalian or insect) due to replacement of one, or both, of the requisite cysteines (Fig. 2).
Structural Variability in PGRP-specific Segments-Although human PGRP-IαC adopts an overall fold very similar to that of Drosophila PGRP-LB, the human and insect PGRPs differ markedly in the structures of their N-terminal segments (residues 177-198 in PGRP-IαC), which form a hydrophobic groove with the α2-helix on the face opposite the PGN-binding site of each protein. Whereas in Drosophila PGRP-LB this region comprises two β-strands (β1 and β2) linked by a 3₁₀ helix, a corresponding 3₁₀ helix links two stretches of coil in the human PGRP-IαC structures (Fig. 1B). Superposition of the two PGRP-specific segments showed that they consist of a structurally conserved central region (residues Arg184-Arg190 in PGRP-IαC), which includes the 3₁₀ helix, flanked by two variable regions, comprising residues 177-183 and 191-198 (Fig. 3A).
Three residues in the central region of the PGRP-IαC N-terminal segment, Arg184, Trp187, and Arg190, make extensive interactions with the main body of the PGRP domain: Arg184 and Arg190 form bidentate salt bridges with Asp239 and Asp233, respectively, while Trp187 makes a tight hydrophobic cluster with Val224, Tyr253, and Ala189 (Fig. 3B). Significantly, all three residues, and their specific interactions, are preserved in Drosophila PGRP-LB (Fig. 3C). Among all PGRPs, Trp187 is invariant and Arg184 is highly conserved; residues 189 and 190 exhibit moderate conservation (Fig. 2). By contrast, the regions flanking the central region (residues 177-183 and 191-198) differ at 15 out of 15 positions in human PGRP-IαC compared with Drosophila PGRP-LB, which also contains an insertion at position 30 in its sequence. Similar sequence variability is apparent in alignments of other PGRP-specific segments (Fig. 2). The result is marked differences in the topology and dimensions of the hydrophobic groove formed by the two PGRP-specific segments and the α2-helix (Fig. 4), which was proposed as a potential protein-binding site (17). The different orientations of the N terminus relative to the α2-helix make this groove much narrower at this end in the PGRP-IαC structure (8.6 Å between Val177 Cα and Glu304 Cα) compared with PGRP-LB (18.2 Å between Thr12 Cα and Phe139 Cα) (Fig. 1B). In addition, loop β5-β6 protrudes into the putative protein-binding groove of PGRP-IαC, but not that of PGRP-LB. Both grooves contain a prominent hydrophobic pocket that is, however, deeper and wider in PGRP-IαC than in PGRP-LB (Fig. 4).
Based on sequence comparisons (Fig. 2), a degree of structural variability similar to that observed between human PGRP-IαC and Drosophila PGRP-LB probably exists among other PGRP-specific segments. Thus, individual PGRPs are distinguished by unique N-terminal regions, which may mediate specific interactions with different effector or signaling proteins (29,30), or with microbial ligands besides PGN, such as LPS (13,31) and 1,3-β-D-glucan (32).
The Peptidoglycan-binding Site-The putative PGN-binding cleft of PGRP-IαC is ~27 Å long, with a shallow (6-7 Å) end, flanked by helix α1 and loops β3-α1 and β6-α2, and a deep (12-13 Å) end, flanked by loops β5-β6 and β7-α3. Although no crystal structure of any PGRP (or T7 lysozyme) in complex with PGN has been reported, it is tempting to speculate that the ligand would be oriented with its bulky N-acetylmuramic acid moiety situated at the shallow end of the binding groove, and the short PGN peptide held in an extended conformation at the deep end.
In contrast to Drosophila PGRP-LB, no zinc ion is visible in the PGN-binding site of human PGRP-IαC (Fig. 5), despite the inclusion of 1 mM Zn2+ in one of the crystallization buffers (see "Experimental Procedures"). This difference is explained by the substitution of zinc-coordinating residues His42 and Cys160 in Drosophila PGRP-LB by Ile207 and Ser324 in human PGRP-IαC. Since zinc is required for the PGN-hydrolyzing amidase activity of Drosophila PGRP-LB and other catalytic PGRPs, the absence of this metal ion in human PGRP-IαC is consistent with its ability to bind, but not hydrolyze, PGNs. Other noncatalytic PGRPs, which constitute the majority of known insect and mammalian PGRPs, also do not preserve the zinc-coordinating residues of Drosophila PGRP-LB (Fig. 2). Topological differences between the human and insect PGRPs are not restricted to this particular region of the binding site cleft, but are distributed along its entire length (not shown). Thus, human PGRP-IαC differs from Drosophila PGRP-LB at 18 of 28 residues lining the groove, suggesting that the two PGRPs may exhibit distinct PGN binding specificities. Indeed, PGRPs have been shown to discriminate among PGNs from different microbes (3,12), presumably through differential recognition of their variable peptide moieties.
PGRP-Iα Comprises Tandem PGN-binding Domains-Human PGRP-Iα, and the closely related PGRP-Iβ, were predicted to be transmembrane proteins, based on the identification of potential membrane-spanning segments in their N-terminal regions (residues 1-176) (7,11). However, re-examination of the N-terminal sequence of human PGRP-Iα, in light of the crystal structure of its C-terminal PGN-binding domain, revealed that the N- and C-terminal regions share very high sequence similarity (>40% identity), with strict conservation of the four cysteines forming disulfide bonds Cys194-Cys238 and Cys214-Cys220 in the PGRP-IαC structure (Fig. 2). Sequence alignments of human PGRP-Iβ, and of mouse PGRP-Iα and PGRP-Iβ, produced similar results. Therefore, mammalian PGRPs of the I class are most likely soluble intra- or extracellular proteins composed of two tandem PGN-binding domains. To test this prediction, we expressed the entire N-terminal region of human PGRP-Iα (PGRP-IαN), excluding the putative 18-residue signal peptide, and examined its ability to recognize PGN. Indeed, the individual PGRP-IαN and PGRP-IαC1 (or PGRP-IαC2) domains appeared to bind insoluble PGN from S. aureus equally well (Fig. 6A), though this assay can probably only discriminate between large affinity differences. The existence of PGRPs having multiple PGN-binding domains is not restricted to mammals. For example, sequence analysis (28) of insect PGRPs indicated that Drosophila PGRP-LF, like human PGRP-Iα, comprises two tandem PGN-binding domains and is unlikely to be transmembrane.
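For readers who wish to repeat this kind of comparison, percent identity over a pairwise alignment can be computed as in the minimal Python sketch below. The function name, the choice to skip gapped columns, and the toy sequences are assumptions made for illustration; they are not the PGRP-Iα sequences or the alignment procedure used for Fig. 2.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns with a gap in either sequence are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences, for illustration only: 4 of 5 ungapped columns match.
example_identity = percent_identity("ACDE-F", "ACDQWF")
```

Other conventions (for example, dividing by the length of the shorter sequence) give somewhat different values, so the denominator should be stated whenever identities are compared.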
While the functional role of tandem PGN-binding domains remains to be elucidated, two possibilities may be considered. First, incorporation of two or more PGN-binding domains within a single PGRP might serve to augment avidity by permitting multivalent attachment to the bacterial cell wall. Second, tandem domains may confer multiple PGN-binding specificities to individual PGRPs, thereby expanding the ability of a relatively small number of receptors to detect a broad spectrum of microbial pathogens. In this regard, human PGRP-IαN differs from PGRP-IαC at 21 of 28 potential PGN-contacting residues in the binding site cleft (Fig. 6B), implying that the two domains could possess different fine specificities. Notably, Tyr242, His264, Thr265, and Asn269 form a contiguous patch of conserved residues on the floor of the binding groove at its deep end, which may be the site for accommodation of the peptide chain of PGNs (see above). Moreover, with the exception of Thr265, these residues are >80% conserved in other PGRPs.
PGRP-Iα Dimer Formation-Although we had expected the two crystal forms of PGRP-IαC (Table I) to yield essentially identical structures, this was not the case: whereas in PGRP-IαC1 crystals the PGRP domain exists as a monomer, it forms a domain-swapped dimer in PGRP-IαC2 crystals, with the two monomers related by a 2-fold rotational symmetry axis (Fig. 7A). That is, two PGRP molecules exchange their C-terminal α3 helices, such that each helix makes the same interactions in the dimer as in the monomer, but the interactions are inter- rather than intramolecular. A secondary interface is introduced between the subunits in the domain-swapped dimer through juxtaposition of loop β5-β6 and helix α1 (Fig. 7B). Although dimerization appears to partially occlude the PGN-binding sites of both monomers (Fig. 7B), the PGRP-specific segments are fully exposed for possible interactions with other microbial ligands (13,31,32), or with host proteins (29,30).
It should also be noted that full-length PGRP-Iα includes an N-terminal PGRP domain with PGN binding activity (Fig. 6A). Consequently, dimerization of full-length PGRP-Iα, in the manner observed in PGRP-IαC2 crystals (Fig. 7A), would create an assembly having two identical PGN-binding sites at opposite poles.
To form the domain-swapped dimer, the β7-α3 hinge loop connecting the C-terminal α-helix of PGRP-IαC to the rest of the structure adopts a more extended conformation than in the monomer, where it folds back on itself (Fig. 7, C and D). During refinement of the PGRP-IαC2 structure, a strong persistent peak of electron density was observed in both 2Fo − Fc and Fo − Fc maps at the center of the hinge loop, surrounded by a cluster of oxygen atoms. Efforts to model this density using various ions (Zn2+, Ca2+, Mg2+, K+, Na+), or water, yielded satisfactory results only for Na+, which was present in the crystallization buffer of dimeric, but not monomeric, PGRP-IαC (see "Experimental Procedures"). The temperature factor of the bound sodium ion (15.5) is comparable to the average B value for the main chain (19.1). The sodium is pentahedrally coordinated by the carbonyl oxygens of Ser317, Val319, and Ile322, and by the hydroxyl oxygen of Ser324; a water molecule provides a fifth sodium ligand (Fig. 7C). All the metal-ligand distances (2.3-2.5 Å) are within the range seen for sodium ions in high resolution protein structures (33). The bound Na+ could contribute to stabilizing the particular conformation of the hinge loop found in the dimer, as suggested by the fact that the structure of this loop in monomeric PGRP-IαC, which crystallized in the absence of sodium, is incompatible with Na+ coordination (Fig. 7D).
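A distance check of the kind used to judge whether candidate ligands fall in the 2.3-2.5 Å range can be sketched in Python as follows. The coordinates are invented placeholders rather than values from the deposited model, and the function name and the small tolerance are assumptions made for illustration.

```python
import math


def coordination_report(ion_xyz, ligands, d_min=2.3, d_max=2.5, tol=0.01):
    """Return {ligand name: (distance in angstroms, within expected range)}.

    ligands maps a label to an (x, y, z) coordinate in angstroms. The range
    check allows a small tolerance to absorb rounding in the coordinates.
    """
    report = {}
    for name, xyz in ligands.items():
        d = math.dist(ion_xyz, xyz)  # Euclidean distance (Python 3.8+)
        report[name] = (d, (d_min - tol) <= d <= (d_max + tol))
    return report


# Placeholder coordinates (hypothetical, not taken from the deposited model).
example_ion = (0.0, 0.0, 0.0)
example_ligands = {
    "carbonyl O (a)": (2.35, 0.0, 0.0),
    "carbonyl O (b)": (0.0, 2.40, 0.0),
    "distant O": (0.0, 0.0, 3.10),
}
example_report = coordination_report(example_ion, example_ligands)
```

In practice the ion and ligand coordinates would be read from the refined coordinate file, and atoms flagged as out of range would be excluded from the coordination sphere.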
To address whether PGRP-IαC dimerizes in solution, as well as in the crystal, and whether Na+ promotes dimerization, the protein was characterized by size exclusion chromatography. In the absence of sodium, PGRP-IαC behaved as a monomer of 16 kDa, with little or no detectable dimer (Fig. 8A). However, in the presence of 300 mM NaCl, at pH 8, a clear dimer peak of 34 kDa, representing ~20% of the total protein, was observed (Fig. 8B). These results were corroborated by analytical ultracentrifugation. In the presence of sodium tartrate (pH 7), the salt used for crystallization, sedimentation equilibrium profiles showed 31% of PGRP-IαC in a dimeric state, with an estimated K_D of 130 μM (Fig. 9). Similar results were obtained using NaCl. Thus, sodium-mediated dimerization of this pattern recognition molecule occurs in both crystal and solution states.

DISCUSSION
Except for members of the PGRP-S class, which are clearly secreted in both insects and mammals (3-7, 15, 16, 32), most other PGRPs have not been identified experimentally as transmembrane, intracellular, or secreted molecules. Indeed, considerable caution is warranted in classifying individual PGRP-I and PGRP-L receptors as transmembrane or soluble based solely on structure predictions. Accordingly, we demonstrated the absence of a putative transmembrane segment in the N-terminal portion of human PGRP-Iα by folding the entire region in aqueous solution into a functional PGN-binding domain. This result, in conjunction with structure-based sequence alignments, indicates that mammalian PGRP-Iα and PGRP-Iβ, and certain insect PGRP-L (e.g. Drosophila PGRP-LF), are soluble intracellular or secreted proteins consisting of tandem PGRP domains. As evidence that mammalian PGRP-L is also soluble, a 64 kDa N-acetylmuramoyl-L-alanine amidase purified from human serum was found to have an N-terminal sequence identical to that predicted for mature PGRP-L (34).
Since all mammalian PGRPs contain leader sequences, they should be localized in intracellular vesicles, where they may encounter phagocytosed bacteria, or be secreted upon fusion of these vesicles with the cell membrane. Another pattern recognition molecule, Toll-like receptor 9, was recently shown to require translocation from the endoplasmic reticulum to a CpG DNA-containing lysosomal compartment for ligand binding and signal transduction (35).
Recent evidence indicates that PGNs may not be the sole microbial surface determinants recognized by PGRPs. For example, the unexpected finding that bovine PGRP-S is microbicidal for some organisms that lack PGN (Cryptococcus neoformans), or in which PGN is obscured by LPS (Salmonella typhimurium), suggests that envelope components other than PGN may interact, directly or indirectly, with certain PGRPs (16). In support of this view, Drosophila PGRP-LC is required for activation of the Imd/Relish pathway by LPS, as well as PGN (13,31). Similarly, recombinant PGRP-S proteins from the large beetle Holotrichia diomphalia were found to specifically bind both 1,3-β-D-glucan and PGN (32). In addition to microbial products, PGRPs interact with yet unidentified host effector molecules responsible for downstream signaling, with other pattern recognition receptors such as Gram-negative-binding protein 1 (29,30), and with heat shock proteins (19). It seems unlikely that the relatively conserved PGN-binding site of PGRPs can recognize such an array of structurally diverse microbial and host ligands. Rather, it is more probable that the PGRP-specific segment, which we have shown exhibits considerable topological variability (Fig. 4), mediates binding to non-PGN ligands, with different segments forming functionally distinct binding sites. However, direct binding studies will be required to test this hypothesis.
Three-dimensional domain swapping, the process whereby two or more proteins form a dimer or higher order oligomer by exchanging an identical structural element, has now been observed in over 30 crystal structures (36,37). As the number of domain-swapped proteins continues to rise, the question of biological relevance grows in importance. While some domain-swapped dimers are undoubtedly crystallization artifacts (36,37), a variety of proteins have been shown to exist as domain-swapped oligomers in vivo, including T7 helicase (38), T4 endonuclease VII (39), coagulation factors IX/X-binding protein (40), and bleomycin resistance protein (41). In addition, domain swapping regulated by ligands has been described for glyoxalase I (42) and the cell cycle regulatory protein suc1 (43,44), where the equilibrium between monomer and dimer is altered by glutathione and phosphopeptide, respectively, suggesting a role for domain swapping in allostery and signal transduction.
We do not know whether the domain swapping observed in PGRP-IαC (Fig. 7A) has functional significance; however, several possibilities exist. While dimerization appears to block the PGN-binding sites of the PGRP-IαC monomers (Fig. 7B), the PGRP-specific segments remain completely available for the potential recognition of other PAMPs (13,31,32) or effector proteins (17,29,30). In addition, since full-length PGRP-Iα includes an N-terminal PGRP domain with PGN binding activity (Fig. 6A), dimerization of full-length PGRP-Iα would generate a molecule with two identical PGN-binding sites at its opposite ends. Such a dimer would be bivalent not only with respect to the PGN-binding site of the N-terminal PGRP domain, but also with respect to each of the two PGRP-specific segments present in full-length PGRP-Iα (Fig. 2). Domain swapping may therefore serve to augment the binding avidity of PGRPs, whose monomeric affinities for certain ligands might be weak. Alternatively, dimerization may create multivalent templates for directing the assembly of multiprotein signaling complexes as PGRPs concentrate on the cell walls of invading microbes. Finally, an exchange of α3 helices between different PGRPs would generate heterodimers, whose existence has been postulated to explain the simultaneous requirement of two splice forms of Drosophila PGRP-LC for the response to LPS (13,31). In this regard, coagulation factors IX/X-binding protein is a domain-swapped heterodimer formed between homologous C-type lectin domains (40).
Sodium-binding sites have been identified in numerous protein structures (33). However, as in the case of domain swapping, the possible role of bound sodium ions in regulating protein function or stability is generally not well understood (45). A notable exception is the allosteric serine protease thrombin (46,47), whose specificity switches from anticoagulant to procoagulant substrates upon binding Na+, as a result of subtle conformational changes in the primary specificity pocket (48). Here we have shown that sodium ions promote dimerization of PGRP-Iα by stabilizing a particular conformation of a flexible hinge loop through coordination of specific loop residues, suggesting a novel role for bound Na+ in altering protein structure. Since PGRP-Iα exists in solution in monomer-dimer equilibrium, this finding raises the intriguing possibility that monovalent cations could modulate PGRP function by promoting homo- or heterodimerization of these pattern recognition receptors. Future studies will address this and other functional questions through direct measurements of PGRP binding to its ligands.

FIG. 9. Sedimentation equilibrium analysis of PGRP-IαC dimerization. a, absorbance distributions for the sedimentation of PGRP-IαC (0.72 mg/ml) in 0.5 M sodium tartrate (pH 7.0) at 4°C at rotor speeds of 20,000 rpm (circles), 25,000 rpm (triangles), and 30,000 rpm (squares). Only every second data point is shown. Distributions were analyzed as part of a global fit to absorbance data at multiple wavelengths and loading concentrations. Solid lines are the global best-fit distributions using a reversible monomer-dimer model with an equilibrium constant of K_D = 130 μM. b, residuals of the fit, with an r.m.s. deviation from the data shown of 0.004 A300. c, for comparison, the residuals of the best-fit distributions obtained with an otherwise identical model that assumes the absence of dimer.