Crystal structure of the allergen Equ c 1. A dimeric lipocalin with restricted IgE-reactive epitopes

The three-dimensional structure of the major horse allergen Equ c 1 has been determined at 2.3 Å resolution by x-ray crystallography. Equ c 1 displays the typical fold of lipocalins, a β-barrel flanked by a C-terminal α-helix. The space between the two β-sheets of the barrel defines an internal cavity that could serve, as in other lipocalins, for the binding and transport of small hydrophobic ligands. Equ c 1 crystallizes in a novel dimeric form, which is distinct from that observed in other lipocalin dimers and corresponds to the functional form of the allergen. Binding studies of point mutants of the allergen with specific monoclonal antibodies raised in mouse and IgE serum from horse allergic patients allowed the identification of putative B cell antigenic determinants. In addition, total inhibition of IgE serum recognition by a single specific monoclonal antibody revealed the restricted nature of the IgE binding target on the molecular surface of Equ c 1.

The incidence of allergic diseases is increasing in developed countries, as a result of growing exposure to allergens, altered stimulation of the immune system during development, and probably an adjuvant effect of the environment. Allergy to animals is distinguished by its intensity and by the possibility of sensitization through limited contact with dander or hair. The animals most often responsible for allergy are familiar domestic cats and dogs, but the horse is also frequently incriminated. Five main horse allergens have been isolated and purified (1,2). Among these, the Equ c 1 protein (molecular mass 21.5 kDa, 187 amino acids, pI = 4.5) has been defined as a major allergen, because it induces an IgE-mediated type I allergic reaction in a majority of the patients allergic to horses.
Whether a protein exhibits special structural characteristics that are responsible for its allergenic properties is an issue that remains poorly understood. Although the three-dimensional structures of some allergenic lipocalins are known (mMUP (9), rat urinary α2-globulin (10), bovine β-lactoglobulin (17,18), and Bos d 2 (19)), little information is available about the nature of the B epitopes recognized by the IgE immunoglobulins. It has been suggested that allergenic proteins may share some common structural features capable of eliciting an IgE response (14) and also that sequence similarities between allergenic lipocalins could indicate putative IgE binding regions (19). To gain further insight into the nature of the IgE B epitopes, we describe here a crystallographic and immunochemical study of recombinant Equ c 1. The three-dimensional structure of the protein has been determined at 2.3 Å resolution, and binding studies of Equ c 1 surface mutants by specific monoclonal antibodies provided valuable information to map the Equ c 1 antigenic determinants and to delineate the regions of the protein surface primarily recognized by IgE serum from allergic patients.

EXPERIMENTAL PROCEDURES
Crystallization and Data Collection-Protein expression, purification, and crystallization of recombinant sublingual gland (rSLG) Equ c 1 have been previously described (3,4). Briefly, rSLG Equ c 1 was expressed as a His-tagged protein in Escherichia coli, purified by metal-ion affinity chromatography, treated with factor Xa to excise the His tag, and concentrated to 6.0 mg/ml for crystallization. Tetragonal bipyramidal crystals, space group P4₁2₁2, were grown at 291 K using the hanging drop vapor diffusion technique (4). A first diffraction data set (with an overall R-merge value of 8.9%) was collected at 2.9 Å resolution from a flash-frozen (110 K) crystal using synchrotron radiation at EMBL/DESY (Hamburg). Subsequently, a second data set was recorded at room temperature to a higher (2.3 Å) resolution using the X31 beamline at EMBL/DESY (λ = 1.2 Å) from a capillary-mounted crystal (the statistics of data collection are shown in Table I). All diffraction data were processed with the programs DENZO and SCALEPACK (20). The unit cell parameters measured at room temperature (a = b = 84.04 Å, c = 58.48 Å) are slightly different from those found for the frozen crystal (a = b = 84.36 Å, c = 54.89 Å). In particular, the significant shrinkage (8%) of the c parameter upon flash freezing of the crystal could be related to the loss of diffracting power of the crystal at low temperature.
Structure Determination and Refinement-The structure was solved by molecular replacement methods using the program AMoRe (21). The atomic coordinates of the mouse mMUP protein (PDB code 1MUP) (9), which shares 49% sequence identity with Equ c 1 (Fig. 1), were used as the search model. A clear solution was obtained using the low temperature data set, with a correlation coefficient of 0.39 and a crystallographic R-factor of 48.7%. Initial rigid body refinement was carried out with the program XPLOR (22), followed by 31 cycles using the slow cool protocol of XPLOR for all reflections with F > 3σ(F) between 10 and 2.9 Å resolution. At this stage (R-factor = 27.5%, free R-factor = 37.9%), the L3 loop involving residues 82-86 and the C-terminal end of the protein (beyond the long α-helix) were not visible in the electron density map.
The current model was then positioned in the room temperature unit cell by molecular replacement, and further crystallographic refinement was carried out against the 2.3 Å data set obtained at room temperature using the maximum likelihood refinement program REFMAC (23) from the CCP4 package (24). At each stage of the refinement, errors in the model were detected by examination of (Fo − Fc) and (2Fo − Fc) maps with the program O (25). Toward the end of the refinement the whole polypeptide chain could be traced unambiguously in the electron density map, and 79 water molecules were added to the protein model using the program ARP (26). Refinement converged to a final R-factor of 19.5% (free R-factor = 25.2%) for intensities between 15 and 2.3 Å (Table I). The average real space correlation coefficient for all main chain atoms of the final model is 0.95 (0.89 for side-chain atoms). The atomic coordinates of Equ c 1 have been deposited with the Protein Data Bank (code 1EW3).
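The agreement statistics quoted above follow the conventional definition of the crystallographic R-factor, which can be sketched as below. This is a minimal illustration and not the REFMAC or XPLOR implementation: the function name `r_factor` and the toy amplitudes are hypothetical, and the observed and calculated amplitudes are assumed to be on a common scale already. The free R-factor is the same statistic evaluated on a test set of reflections that is kept out of refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Toy structure-factor amplitudes (hypothetical values, not data from this study).
work_obs = [100.0, 50.0, 25.0, 25.0]
work_calc = [90.0, 55.0, 20.0, 30.0]
r_work = r_factor(work_obs, work_calc)
print(f"R = {100.0 * r_work:.1f}%")
```

In practice the working and test sets are disjoint subsets of the same data set, so the function would be called once on each subset.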
Production and Binding of Monoclonal Antibodies-Eight-week-old Balb/c mice were injected three times with natural Equ c 1. Conventional procedures were used to fuse immune spleen cells to SP2/0 cells with polyethylene glycol, and specific IgG-secreting hybridomas were then selected after screening of the supernatants. For ELISA screenings, purified Equ c 1, either natural (from horse dander extract) or recombinant (wild-type or point mutants), was coated at 10 μg/ml in 0.1 M carbonate/bicarbonate buffer, pH 9.6, in 96-well microtitration plates. After saturation with 0.5% bovine serum albumin, serial concentrations of mouse mAb (0.01 to 100 μg/ml), rabbit polyclonal Ab (0.01 to 100 μg/ml), or human serum IgE (1/2 to 1/100) were added in duplicate or triplicate to each well and incubated at 37°C for 1 h 30 min. Binding was revealed with peroxidase-labeled goat anti-mouse IgG, goat anti-rabbit Ig, or biotinylated goat anti-human IgE + peroxidase-coupled streptavidin, respectively, followed by o-phenylenediamine according to the manufacturer's recommendations.
Competitive experiments were performed by co-incubating serial dilutions of the competitor Ab (0.1 to 100 μg/ml) with a predetermined concentration of the anti-Equ c 1 Ab, either biotinylated (for the experiments reported in Fig. 4) or not (for the experiments reported in Fig. 5), in Equ c 1-coated microtitration plates. Residual binding was revealed with peroxidase-labeled streptavidin or the appropriate peroxidase-labeled secondary Ab, respectively.
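Residual binding in a competitive ELISA is commonly expressed as a percentage of the signal obtained without competitor. The sketch below shows one way to compute it; the function names, the blank subtraction step, and the optical density readings are assumptions made for illustration and are not taken from the protocol described here.

```python
def residual_binding_percent(od_with_competitor, od_without_competitor, od_blank=0.0):
    """Percentage of binding that remains in the presence of a competitor antibody."""
    signal = od_with_competitor - od_blank
    reference = od_without_competitor - od_blank
    if reference <= 0:
        raise ValueError("reference signal must be above the blank")
    return 100.0 * signal / reference


def inhibition_percent(od_with_competitor, od_without_competitor, od_blank=0.0):
    """Complement of the residual binding: 100% means total inhibition."""
    return 100.0 - residual_binding_percent(
        od_with_competitor, od_without_competitor, od_blank
    )


# Hypothetical optical density readings for one well pair.
residual = residual_binding_percent(0.30, 1.20, od_blank=0.10)
inhibition = inhibition_percent(0.30, 1.20, od_blank=0.10)
print(f"residual binding {residual:.1f}%, inhibition {inhibition:.1f}%")
```

With duplicate or triplicate wells, the readings would be averaged before being passed to these functions.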

RESULTS
The Overall Structure of Equ c 1-The final model of the Equ c 1 monomer (Fig. 2a) includes 159 amino acid residues (from Ala 23 to Gly 181). The final parameters of refinement and model stereochemistry are summarized in Table I. Only the solvent-exposed side chain of Arg 131 in a surface loop was not visible in the electron density map and was modeled as alanine, although other exposed amino acid residues (Arg 26, Asp 67, Glu 82, Glu 94, Glu 148, and Glu 165) also display very high temperature factors for their side-chain atoms. All but one of the nonglycine and nonproline residues (99.3%) have main-chain dihedral angles that fall within allowed regions of the Ramachandran diagram, as defined by PROCHECK (27). The outlier, Glu 115 (φ = 70.3°, ψ = −35.6°), is located in the exposed L6 loop, a structurally conserved region in lipocalins (28), which often displays unfavorable main-chain conformations (18).
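The main-chain dihedral angles quoted for Glu 115 are torsion angles defined by four consecutive backbone atoms (C(i−1), N, Cα, C for φ; N, Cα, C, N(i+1) for ψ). A minimal sketch of the torsion calculation is given below. It is not the PROCHECK implementation; the helper names and the test coordinates are hypothetical, and real input would be atomic coordinates read from the deposited model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond for four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal to the plane p0, p1, p2
    n2 = _cross(b2, b3)          # normal to the plane p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical points forming a quarter turn about the z axis.
example = torsion_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                      (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(example, 1))  # 90.0
```

Applying `torsion_deg` to each residue's backbone atoms gives the (φ, ψ) pairs that are plotted in a Ramachandran diagram.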
The overall structure of Equ c 1 is similar to that of other lipocalins (6-9, 17-19, 29-32), an 8-stranded β-barrel flanked by an α-helix. In Equ c 1, there is a short additional β-strand and one helical turn at the C-terminal end of the polypeptide chain. A conserved disulfide bridge links Cys 83 in loop L3 with Cys 176 at the C-terminal region of the chain. As in most lipocalins, the β-barrel defines an internal cavity, which can accommodate a small hydrophobic ligand. Loops L2 (connecting βB-βC), L4 (βD-βE), and L6 (βF-βG) define the "closed" end of the β-barrel, whereas loops L1 (partly helical), L3, and L5 form the entry site for ligand binding (Fig. 2). The protein backbone of the horse allergen is structurally similar to that of the mouse allergen mMUP (9), with an overall root mean square deviation between main-chain atoms of 1.124 Å (Figs. 1 and 2b). In particular, the core of the structure (the two β-sheets and the α-helix) and the long L1 loop, which partially closes the ligand entry site, are well conserved. The most important differences between the two structures involve the N- and C-terminal ends of the polypeptide chain. Minor, but significant, differences are also observed for loops L2 (which includes a single amino acid insertion in Equ c 1, see Fig. 1), L3, L6, and L7 (Fig. 2b).
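The root mean square deviation used above to compare the Equ c 1 and mMUP backbones can be sketched as follows. The example assumes that the two structures have already been superposed and that equivalent main-chain atoms are listed in the same order; the superposition step itself is not shown, and the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superposed atoms (in Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates for two equivalent atom pairs.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
deviation = rmsd(model_a, model_b)
print(deviation)  # 1.0
```

Because the value depends on which atoms are paired, insertions such as the one in loop L2 have to be left out of the comparison or handled through the sequence alignment.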
The Equ c 1 Dimer-Equ c 1 is found to form a dimer in the crystal, in agreement with gel filtration experiments in solution at neutral pH (4). The crystallographic dimer is formed by the side-to-side packing of the β-barrels through the interaction of strands βF, βG, and βH from each monomer, with the two α-helices on the same side of the dimer and running antiparallel to each other (Fig. 2c). The entry sites to the putative binding sites of Equ c 1 (loops L1, L3, and L5) are located on opposite sides of the dimer and are accessible for ligand binding. The tight association buries 1070 Å² of exposed molecular surface from each monomer (as calculated with the program [...]). Although monomeric forms of lipocalins have been characterized (such as the allergen Bos d 2 (19), the retinol-binding protein (8), or the human neutrophil gelatinase-associated protein (31)), lipocalins often exist as dimers or higher oligomers. However, no consistent pattern of dimerization is obvious from the analysis of known three-dimensional structures. Indeed, the mode of dimerization of Equ c 1 has not been previously observed in other lipocalin dimers studied by x-ray diffraction (Fig. 3). In mMUP, the dimer interface includes the β-sheet composed of strands B, C, and D, and two Cd²⁺ ions (9), whereas bovine β-lactoglobulin dimerizes by joining the first β strand and the L1 loop of each monomer (17), and dimerization of the odorant-binding protein involves the swapping of the α-helix between the two monomers (7,8).
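The surface buried on dimer formation, as quoted above for Equ c 1, is conventionally obtained by comparing the solvent-accessible surface area (ASA) of the isolated monomers with that of the assembled dimer. The sketch below shows only this bookkeeping step; computing the ASA values requires a surface program and is not shown. The function name is hypothetical, and the areas are invented so that the arithmetic reproduces the quoted figure; they are not values from this study.

```python
def buried_area_per_monomer(asa_monomer_a, asa_monomer_b, asa_dimer):
    """Average surface area (Å^2) that each monomer loses on dimer formation."""
    total_buried = asa_monomer_a + asa_monomer_b - asa_dimer
    if total_buried < 0:
        raise ValueError("dimer ASA exceeds the sum of the monomer ASAs")
    return total_buried / 2.0


# Invented ASA values in Å^2, chosen only to illustrate the calculation.
per_monomer = buried_area_per_monomer(8500.0, 8500.0, 14860.0)
print(per_monomer)  # 1070.0
```

For a symmetric dimer the two monomers contribute equally, so halving the total is exact; for an asymmetric interface the per-chain contributions would be computed separately.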
The Putative Ligand Binding Site-The biological role of Equ c 1 is unknown. As in most proteins belonging to the lipocalin family, its physiological role might be concerned with the binding and transport of small hydrophobic ligands. The internal space between the two β-sheets of Equ c 1 defines a hydrophobic pocket similar to that observed for mMUP. Indeed, amino acid residues that are in contact with the pheromone ligand in mMUP are also well conserved in Equ c 1 (Fig. 1). The mMUP residues Leu 44, Val 58, Phe 60, Phe 94, Ala 107, Leu 109, Leu 120, and Tyr 124 are replaced in Equ c 1 by Val 58, Ala 73, Tyr 75, Phe 109, Leu 122, Leu 124, Leu 135, and Tyr 139, respectively. The presence of the additional hydroxyl group of Tyr 75 in Equ c 1 (Phe 60 in mMUP) is compensated by the smaller volume occupied by the neighboring residues Ala 73 and Val 58 (Val 58 and Leu 44 in mMUP), therefore suggesting that Equ c 1 could bind small hydrophobic molecules similar to those recognized by mMUP.
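The residue correspondence listed above can be tabulated to make the conserved and substituted positions of the binding pocket explicit. The sketch below uses only the residue pairs given in the text; the variable names are hypothetical.

```python
# (mMUP residue, Equ c 1 residue) pairs lining the ligand pocket, in the order
# given in the text.
pocket_pairs = [
    (("Leu", 44), ("Val", 58)),
    (("Val", 58), ("Ala", 73)),
    (("Phe", 60), ("Tyr", 75)),
    (("Phe", 94), ("Phe", 109)),
    (("Ala", 107), ("Leu", 122)),
    (("Leu", 109), ("Leu", 124)),
    (("Leu", 120), ("Leu", 135)),
    (("Tyr", 124), ("Tyr", 139)),
]

# Positions where the residue type is identical in both proteins.
conserved = [pair for pair in pocket_pairs if pair[0][0] == pair[1][0]]
# Positions where the residue type differs.
substituted = [pair for pair in pocket_pairs if pair[0][0] != pair[1][0]]

for (mup_name, mup_num), (equ_name, equ_num) in pocket_pairs:
    status = "conserved" if mup_name == equ_name else "substituted"
    print(f"mMUP {mup_name} {mup_num:>3} -> Equ c 1 {equ_name} {equ_num:>3}  {status}")
```

All substitutions in this list exchange one hydrophobic or aromatic side chain for another, which is consistent with the hydrophobic character of the pocket described above.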
The partially helical L1 loop connecting β-strands A and B folds over the entrance to the cavity, forming a lid that apparently precludes ligand access to the putative binding site (Fig. 2). The temperature factor values of L1 residues are similar to those of the protein core, and the observed conformation of the loop is stabilized by several hydrogen bonding interactions (Table II). Because no residues from this loop are involved in crystal contacts, the observed closed conformation appears to be an intrinsic property of Equ c 1, at least under the current crystallization conditions (high ionic strength, basic pH). Indeed, no substitution was observed in crystals of Equ c 1 soaked with different lipocalin ligands (histamine, retinol, retinoic acid, S-β-citronellol, methone, α-ionone, benzyl benzoate, or 3-pyridine-propanol), lending further support to the hypothesis of a blocked binding site. It is conceivable that more physiological conditions (i.e. lower ionic strength and/or more acidic pH) could favor a modification of the L1 loop toward an "open" conformation allowing the access of ligands to the binding pocket.
[Figure legend fragment: ... Fig. 2c); B, bovine β-lactoglobulin (PDB entry code 1BEB (18)); C, bovine odorant-binding protein (PDB entry code 1OBP (8)); D, mMUP (PDB entry code 1MUP (9)). For each protein, one monomer (in light gray) is shown in a similar orientation, except for mMUP, which has been slightly rotated for clarity. This figure was produced with the program MOLSCRIPT (42).]
Restricted IgE Epitopes-To gain further insight into putative regions of the Equ c 1 molecule that might define B cell epitopes, mAbs were raised in Balb/c mice immunized with natural Equ c 1 purified from horse hair dander extract (3). Four hybridomas were selected for the secretion of specific antibodies (IgG1, κ) reacting with the natural form of Equ c 1 (Table III). Dissociation constants as measured by ELISA (34) are in the range of values reported for conventional specific Abs (Table III). Treatment of Equ c 1 with the reducing compound dithiothreitol did not affect the binding of mAbs 118, 220, and 65 but abolished the reactivity of mAb 197, suggesting that the epitope recognized by the latter depends on the integrity of the disulfide bridge. Competition experiments in ELISA (Fig. 4) revealed that mAbs 118 and 197 mutually inhibited each other for binding to natural Equ c 1, indicating that both mAbs recognize overlapping epitopes. However, mAb 220 binds to a different target on the molecular surface, because its reactivity with natural Equ c 1 was not affected by the presence of mAbs 118 or 197 (and vice versa).
Analysis of the overall structure of the Equ c 1 monomer revealed four protruding, solvent-accessible regions that could determine putative targets for antibody binding (35). Charged amino acid residues belonging to these regions were substituted by site-directed mutagenesis: R26S, E82A, K129S, and E148A (see Fig. 6). The four point mutants of rSLG Equ c 1 exhibited the same molecular weight, as evaluated from SDS-polyacrylamide gel electrophoresis (molecular mass 19.5 kDa), and the same isoelectric point (pI 4.5) as the unmutated form (data not shown). Their immunoreactivity, as compared with natural and recombinant wild-type Equ c 1, was tested in direct binding ELISA with the four available anti-Equ c 1 mAbs and human IgE from various horse allergic patients (Table IV). Mouse mAb 65 (which binds to natural glycosylated Equ c 1 but not to recombinant wild-type Equ c 1) failed to recognize any of the four Equ c 1 mutants, thus confirming that protein-bound carbohydrate should be part of the mAb 65 epitope. The other three mAbs (118, 197, and 220) and the IgE serum were able to recognize all four point mutants as evaluated from direct ELISA, with only one exception: mAb 197 failed to react with E82A. This lack of binding further supported the hypothesis that the L3 loop is part of the mAb 197 epitope and is also consistent with the absence of reactivity of this mAb to reduced natural Equ c 1 (Table III), the substitution at position 82 being close to one of the cysteine residues involved in the intramolecular disulfide bridge (Cys 83-Cys 176) of Equ c 1.
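The direct ELISA reactivities summarized in Table IV are reported on a qualitative scale relative to the binding observed with natural Equ c 1 (below 10%, 10 to 50%, 50 to 75%, and above 75%). That scale can be written as a small lookup, sketched below. The function name is hypothetical, ASCII symbols stand in for the table's minus and plus signs, and the assignment of values lying exactly on a boundary (50% or 75%) to the lower class is an assumption, because the table legend does not specify it.

```python
def binding_score(percent_of_natural):
    """Map binding, as a percentage of natural Equ c 1 binding, to a symbol."""
    if percent_of_natural < 0:
        raise ValueError("relative binding cannot be negative")
    if percent_of_natural < 10:
        return "-"
    if percent_of_natural <= 50:   # boundary handling is an assumption
        return "+"
    if percent_of_natural <= 75:   # boundary handling is an assumption
        return "++"
    return "+++"


# Hypothetical relative binding values, one per class.
scores = [binding_score(value) for value in (5, 30, 60, 90)]
print(scores)  # ['-', '+', '++', '+++']
```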
To further delineate putative epitopes recognized by human serum IgE from horse allergic patients, competitive experiments were performed with the four anti-Equ c 1 mAbs. Serial dilutions of competitor antibody, either rabbit polyclonal (used as control inhibitor) or mouse monoclonal, were incubated with coated natural or recombinant Equ c 1 together with individual human serum IgE. As expected, the binding of IgE serum from a representative patient (similar results were reproduced with IgE serum from various horse allergic patients taken individually) was totally inhibited by the anti-Equ c 1 rabbit polyclonal Ab, which should contain a complete spectrum of antibodies directed against various regions of Equ c 1 (Fig. 5). No inhibition of IgE binding was noticed with mAbs 118 or 197, whereas mAb 65 could partially inhibit recognition of natural Equ c 1 (Fig. 5A), suggesting that some IgE epitopes overlap with the glycosylated epitope recognized by mAb 65. More interestingly, total inhibition of serum IgE binding to either natural or recombinant Equ c 1 was observed with mAb 220, indicating that the IgE-reactive epitopic regions of Equ c 1 are highly restricted. Although conformational changes of the allergen induced by mAb 220 could also explain the observed inhibition, such a possibility is unlikely because binding of mAb 220 to Equ c 1 did not affect the reactivities of mAbs 197 and 118 (Fig. 4).
FIG. 4. Competitive ELISA between three mouse anti-Equ c 1 mAbs. Natural Equ c 1 (10 μg/ml) was coated on the microtitration plate, and biotinylated mAbs (1 μg/ml) (*) as indicated were co-incubated with competitor mAb 118 (open bar), mAb 197 (hatched bar), or mAb 220 (closed bar) at 10 μg/ml. Results are expressed as the percentage of residual biotinylated mAb binding revealed by peroxidase-labeled streptavidin.

TABLE IV
Immunoreactivity of natural and recombinant Equ c 1 and point mutants of rSLG Equ c 1, as determined by direct ELISA
Results are expressed qualitatively with respect to natural Equ c 1 binding. −, <10%; +, between 10% and 50%; ++, between 50% and 75%; +++, >75%. mAbs were tested at 10 μg/ml.

DISCUSSION
The allergic response is determined by a number of factors such as the genetic background and immunological history of the individual and the dose, mode of entry, and biochemical nature of the allergen. Structural knowledge of allergens may therefore have considerable impact on the generation of tools for the immunotherapy of allergic diseases, by providing insight into the mechanisms of immune recognition (14,19) and the physiological role (36) of allergens.
Equ c 1 is the major allergen in patients sensitive to horses. It is a member of the lipocalin family of proteins that includes several well studied allergens, such as Mus m 1 (mMUP), Bos d 2, Bos d 5 (β-lactoglobulin), Bla g 4, Can f 1, Can f 2, and Rat n 1. Although the physiological role of Equ c 1 is not understood, it presumably relies on the ability of most lipocalins to bind and transport small hydrophobic molecules. The nature and size of the amino acid residues defining the internal pocket and the overall structural similarity of Equ c 1 with Mus m 1 suggest that both allergens could bind similar types of ligands. In addition, preliminary experiments using affinity chromatography with histamine-bound agarose gel suggested that Equ c 1 may bind histamine, a mediator released by basophils and mast cells, which plays a crucial role in allergy. Such a ligand has been described for other lipocalins like those found in the saliva of the blood-feeding insect Rhodnius prolixus (37) and Rhipicephalus appendiculatus ticks (38). However, the specificity of the horse allergen for histamine needs to be confirmed, because the hydrophilic nature of the ligand has no counterpart within the hydrophobic binding site of Equ c 1, and soaking trials of Equ c 1 crystals with various lipocalin ligands (including histamine) were inconclusive.
Equ c 1 crystallizes as a tight dimer with an extended contact surface between monomers. Because a dimer has also been observed in solution at neutral pH (4), it should represent the relevant form of Equ c 1 in vivo, when it interacts with the immune system of the human host. Two putative N-glycosylation sites, Asn 53 and Asn 68, can be predicted from the amino acid sequence of Equ c 1, although analysis of the saccharide composition of natural Equ c 1 indicated that only one of these sites is occupied by carbohydrate (3). The three-dimensional structure of rSLG Equ c 1 showed that the side chains of the two Asn residues are exposed to the solvent (Fig. 6), suggesting that in each case the covalent attachment of carbohydrate would not interfere with either the monomer structure or dimer formation.
Binding studies of mouse mAbs specific for Equ c 1 made it possible to outline putative B cell antigenic determinants of the allergen. The epitope recognized by mAb 65 was shown to involve the region close to the single glycosylation site of Equ c 1 and might coincide in part with some IgE epitopes, according to the partial inhibition of IgE binding observed in competition experiments (Fig. 5A). On the other hand, mAbs 118 and 197 bind to a mutually overlapping, though not identical, region of the molecule, because both mAbs compete with each other for binding (Fig. 4) but react differently with reduced Equ c 1 (Table III). Their corresponding epitopes probably involve residues from the L3 loop, as indicated by the lack of binding of mAb 197 to both the E82A mutant (close to the disulfide bridge Cys 83-Cys 176) and dithiothreitol-reduced Equ c 1.
Unexpectedly, binding of IgE serum from allergic patients was totally inhibited by a single monoclonal antibody, mAb 220 (Fig. 5). This observation can be explained either by the overlap of the corresponding IgE and mAb 220 epitopes or alternatively by conformational changes in the allergen upon binding of mAb 220 that could destroy nonoverlapping IgE epitopes elsewhere on the allergen surface. However, the second hypothesis is unlikely, because mAb 220 binding did not affect the reactivity of the other mouse monoclonal antibodies in competitive assays (Fig. 4). It follows that the dominant IgE epitopes of Equ c 1 are confined to a restricted region of the allergen surface, which overlaps with the mAb 220 epitope. These results differ from those found by Olson and Klapper (39), who used a similar approach (inhibition studies with murine mAbs) to identify putative IgE binding sites on antigen E, an allergen isolated from short ragweed pollen. These authors identified two major IgE antigenic sites in inhibition and double bind solid-phase ELISA, but in their case the binding of IgE in pooled human serum from allergic individuals was only partially (50%) blocked by murine mAbs specific for each epitope.
Several observations help to narrow down candidate regions for the putative binding targets of human IgE and mouse mAb 220 in Equ c 1: (i) the epitopes are unlikely to include a protein-bound carbohydrate because mAb 220 and IgE serum display a similar reactivity for natural and recombinant Equ c 1 (Table IV and Fig. 5), (ii) the region around the L3 loop of the allergen can also be excluded from the epitope because mAb 197 (which binds to this region) does not compete with IgE or mAb 220 binding (Figs. 4 and 5), and (iii) the four single point mutations of Equ c 1 (involving in each case the substitution of a charged residue by Ala or Ser) did not modify the reactivity of the molecule with mouse mAb 220 or horse allergic patient IgE (Table IV), suggesting that these amino acid residues are not part of the corresponding epitopes. As a consequence, the highly accessible region comprising the L2 loop and part of the adjacent β-sheet (strands A-B-C-D) on the allergen surface (Fig. 6) appears as the most promising candidate for a critical IgE epitope.
A second putative target, on the opposite face of the dimer, is provided by the first part of the C-terminal loop of the molecule (residues 158-172 following the α-helix). However, the similar reactivity of mAb 220 with nonreduced and reduced Equ c 1 (Table III) seems to argue against this possibility, because the disruption of the intramolecular disulfide bridge Cys 83-Cys 176 is expected to modify the conformation of the 158-172 segment. Moreover, an epitope within this region would be confined to the space delimited by the two Glu 148 residues from each monomer (see Fig. 6), because mAb 220 recognizes the single mutant E148A. Therefore, given the molecular size of the antibody combining site, the actual epitope should extend through the dimer interface and could not represent a common determinant of allergenic lipocalins with different modes of dimerization.
It is possible that the IgE immune response to exogenous (nonself) lipocalin allergens could be directed to specific binding targets on the molecular surface (14), in agreement with the restricted nature of B and T cell responses observed against exogenous lipocalins (40,41). However, very limited information is available about B-cell epitopes on allergenic lipocalins. Rouvinen et al. (19) compared the amino acid sequences of several lipocalin allergens and detected a resemblance between the molecules in certain areas of the surface, which could indicate common determinants of allergenicity. These regions include the N- and C-terminal ends of β-strand A, the L3 loop, and two highly conserved charged residues (Glu 148 and Glu 151 in Equ c 1) in the exposed face of the α-helix. The binding studies of Equ c 1 reported here provide evidence against the involvement of the L3 loop and the conserved glutamates in a common IgE epitope. In contrast, both conserved ends of β-strand A are exposed to the solvent close to the putative IgE binding targets identified in our study; the N-terminal end is adjacent to the L2 loop and the C-terminal end is in contact with residues from the loop 158-172 following the α-helix. These results lend support to the possible existence of a common critical determinant of allergenicity in lipocalins and suggest putative IgE binding targets of the major horse allergen for further mutational studies.