Probing the molecular basis of allergy. Three-dimensional structure of the bovine lipocalin allergen Bos d 2.

The three-dimensional structure of the major bovine allergen Bos d 2 has been determined by using x-ray diffraction at 1.8-Å resolution. Structurally Bos d 2 is a member of the lipocalin family comprising proteins with transport functions. There is a flat small cavity inside the Bos d 2 protein core suitable for ligand binding, and it is possible that Glu 115 and Asn 37 inside the core are able to make hydrogen bonds with the ligand. Many allergens from different animals belong to the lipocalin family. The amino acid residue similarities between these lipocalins indicate putative regions for IgE binding. Comparison with the available allergen structures from other sources suggests that these allergens are roughly the same size and that their shape is more spherical than elliptical.

Allergy is a result of an inappropriate immune response in persons with a genetic predisposition or atopy. At least 20% of the population in the industrialized countries suffer from different forms of allergies (1). The development of allergy is a multistep process, and the mechanisms leading to sensitization, production of IgE antibodies, and allergic diseases are complex and not fully understood. However, allergies arise in response to certain otherwise innocuous proteins, allergens. The list of denominated allergens contains more than 170 proteins from different species (2). These proteins are structurally and functionally a heterogeneous group, including hydrolytic enzymes, enzyme inhibitors, and transport molecules, but there are also a number of proteins without a known function (3). One group of proteins, which has been found recently to include several animal allergens, is the lipocalin family (4).
Recognition of a protein by IgE antibodies and by receptors on the T helper type 2 lymphocytes is a critical requirement for allergenicity. The receptor of T helper type 2 cells recognizes allergens as processed linear peptides in association with major histocompatibility complex class II molecules on antigen-presenting cells (5). IgE antibodies, on the other hand, bind to the surface of the allergen molecules, and therefore the binding sites or B-cell epitopes are often conformational and thus dependent on the three-dimensional structure. The allergic reaction is triggered by the binding of an allergen to IgE molecules on mast cells. The cross-linking of mast cell-bound IgE molecules by an allergen requires that the allergenic protein contain more than one B-cell epitope.
During the past few years, the primary structure of a large number of different allergens has been resolved, and the three-dimensional structures of some allergens have also been determined by using x-ray diffraction or NMR. To date, the coordinates of seven different allergenic proteins are available at the Brookhaven Data Bank (Table I). Four of the proteins are from plants: birch pollen Bet v 1 (6) and Bet v 2 (7), mouse ear cress Ara t (8), and timothy grass pollen Phl p 2 (9). Three proteins are from animals: house dust mite Der f 2 (10), mouse urine Mus m 1 (11), and bovine milk Bos d 5 (β-lactoglobulin) (12).
The three-dimensional structures of allergens are important in understanding the molecular basis of allergenicity. Structural information in relation to the information on B-cell and T-cell epitopes is also necessary for a rational strategy to design effective, safe recombinant derivatives of allergenic proteins for specific immunotherapy. In this study, we have determined the three-dimensional structure of the major bovine dander allergen Bos d 2¹ at 1.8-Å resolution. The structure of Bos d 2 is compared with other allergenic proteins within the lipocalin protein family. We also make a general comparison between currently released allergen structures in an attempt to find features shared by these molecules.

EXPERIMENTAL PROCEDURES
X-ray Crystallography-The crystals of recombinant Bos d 2 were grown at 20°C by the hanging drop method as described (13). Bos d 2 crystallized in space group P2₁2₁2₁ with cell dimensions a = 37.0, b = 55.6, c = 77.2 Å and one molecule in the asymmetric unit. The native data set was collected at room temperature with an R-AXIS IIc imaging plate using CuKα radiation and processed with the DENZO and SCALEPACK programs (14). The native data set consisted of 14,230 unique reflections (redundancy 2.9) up to 1.8-Å resolution with an R-merge of 6.6%.
Bos d 2 is homologous with two lipocalins for which three-dimensional structures have been determined. Its identity to the odorant-binding protein (PDB code 1OBP) is 26% and to the major mouse urinary protein (1MUP) 20%. Because of the low identity, some heavy atom derivatives were sought. One data set obtained from a crystal soaked overnight in 10 mM Hg(OAc)₂ showed a single heavy atom site in the difference Patterson map. Using this one mercury site, the single isomorphous replacement map was calculated with both hands using the program XtalView (15). Both maps showed positions of protein molecules in the unit cell but no other details.
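The identity percentages quoted above can be illustrated with a minimal sketch. It assumes a pairwise alignment is already available and that identity is counted over aligned, non-gap columns; the text does not state which denominator was used, so that convention is an assumption, and the toy sequences are hypothetical rather than real Bos d 2 or 1MUP fragments.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap column pairs whose residues are identical.

    Assumed convention: columns with a gap in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "GKW-YLEA"
toy_b = "GRWHYL-A"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")  # 5 of 6 compared columns match: 83.3%
```

Other conventions (for example, dividing by the length of the shorter sequence) give somewhat different numbers, which matters when identities are as low as 20-26%.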
The molecular replacement calculations were performed with the program AMoRe (16). Because the odorant-binding protein is a dimer and part of its structure has swapped, it was not used. Therefore, the major murine urinary protein 1MUP without the ligand was used as a search model. The rotation function was calculated between the data at 8 and 4 Å, and a 22-Å integration radius was used. This resulted in eight peaks, which were 3.9-4.0σ above the mean. The first 20 peaks from the rotation function were used in the translation search at 4-10-Å resolution. The fifth and sixth solutions from the rotation function produced peaks 18.1σ and 15.1σ above the mean, respectively. The rigid body refinement resulted in an R factor of 52.1% for both solutions. The solutions are partially overlapping so that one part of the β-sheet of the search model is common to the solutions and the rest are not. When comparing the solutions with the single isomorphous replacement maps of both hands, we found that the first solution overlapped with one of the single isomorphous replacement maps. In consequence, this model was used as a starting model in the structure determination.

* This work was supported by the Academy of Finland, Maud Kuistila Foundation, and Kuopio University Hospital (Grant 5035). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The atomic coordinates and structure factor (code 1BJ7) have been deposited in the Protein Data Bank, Brookhaven Laboratory, Upton, NY.
§ To whom correspondence should be addressed. Tel.: 358-13-2513318; Fax: 358-13-2513390; E-mail: juha.rouvinen@joensuu.fi.
¹ The abbreviations used are: Bos d 2, bovine lipocalin allergen Bos d 2; rBos d 2, recombinant Bos d 2; 1MUP, major mouse urinary protein; ELISA, enzyme-linked immunosorbent assay; mAb, monoclonal antibody.
The building of the model was performed with the program O and the refinement with the program X-PLOR (17). The initial refinement with the mouse urinary protein against the Bos d 2 data resulted in a crystallographic R factor of 39.3% and a free R factor of 54.2%. The residues of the model were then gradually replaced by the residues of Bos d 2, and after several cycles of manual rebuilding, simulated annealing refinement, and addition of water molecules, the R factor dropped to 18.4% and the free R factor to 23.2%. The statistics of the final model are given in Table II.
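For readers less familiar with the quantities being tracked during refinement, the following is a minimal sketch of the conventional crystallographic R factor, R = Σ||Fobs| - k|Fcalc|| / Σ|Fobs|. The free R factor uses the same formula, evaluated only on a test set of reflections excluded from refinement. The function name, the scale factor argument, and the toy amplitudes are assumptions for illustration; this is not the X-PLOR implementation.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R factor between observed and calculated amplitudes.

    `scale` is the factor k that puts |Fcalc| on the scale of |Fobs|
    (assumed to be known here; real programs determine it from the data).
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - scale * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy amplitudes (hypothetical). The flag marks reflections held out as the test set.
reflections = [
    # (|Fobs|, |Fcalc|, in_test_set)
    (100.0, 90.0, False),
    (50.0, 55.0, False),
    (25.0, 30.0, True),
    (80.0, 70.0, False),
]
work = [(fo, fc) for fo, fc, test in reflections if not test]
free = [(fo, fc) for fo, fc, test in reflections if test]

r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free])
print(f"R = {100 * r_work:.1f}%, free R = {100 * r_free:.1f}%")
```

A free R factor that stays well above the working R factor, as with the initial 39.3% versus 54.2%, indicates a model that fits the refined reflections much better than the held-out ones.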
Immunologic Assays-The capacity of natural Bos d 2 (nBos d 2) and rBos d 2 to inhibit the binding of patient IgE or the binding of monoclonal antibodies, mAb1 and mAb3 (18), to the solid-phase rBos d 2 or nBos d 2 was examined by the ELISA technique as described previously (19). First, six 10-fold dilutions of each protein preparation (0.001-100 μg/ml) were allowed to react with a predetermined dilution of an antibody in a test tube for 30 min at 37°C. The inhibited antibody as well as the uninhibited control antibody were then transferred into the wells of a microwell plate. In IgE inhibitions, the ELISA procedure was completed using rabbit anti-human IgE (Dako A/S, Glostrup, Denmark; A0094), biotinylated goat anti-rabbit IgG (Dako; E0432), and Vectastain ABC Elite Kit (Vector Laboratories). In the inhibitions of mAbs, the bound antibody was detected with horseradish peroxidase-conjugated rabbit anti-mouse immunoglobulins (Dako; P161). The color reaction was developed, and optical density values were measured at 450 nm. The values are the means of two replicate wells calculated after subtracting the reagent blank value.
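The arithmetic behind the inhibition assay can be sketched as follows: a 10-fold dilution series, the blank-corrected mean of replicate wells, and a percent-inhibition value. The first two follow the description above; the percent-inhibition formula is a commonly used convention and is an assumption here, as are all function names and optical density readings.

```python
def tenfold_series(highest, steps):
    """Concentrations from `highest` downward in 10-fold steps."""
    return [highest / 10 ** i for i in range(steps)]


def blank_corrected_mean(replicate_ods, blank_od):
    """Mean optical density of replicate wells minus the reagent blank."""
    return sum(replicate_ods) / len(replicate_ods) - blank_od


def percent_inhibition(inhibited_od, control_od):
    """Assumed convention: reduction in signal relative to the uninhibited control."""
    return 100.0 * (1.0 - inhibited_od / control_od)


# Six 10-fold dilutions spanning 100 down to 0.001 (same units as the stock).
concentrations = tenfold_series(100.0, 6)

# Hypothetical OD450 readings for two replicate wells each.
control = blank_corrected_mean([1.25, 1.15], blank_od=0.05)
inhibited = blank_corrected_mean([0.35, 0.30], blank_od=0.05)
print(concentrations)
print(f"inhibition: {percent_inhibition(inhibited, control):.0f}%")
```

Plotting percent inhibition against the logarithm of inhibitor concentration for nBos d 2 and rBos d 2 gives curves of the kind that are compared per weight unit in such assays.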
Skin prick tests were performed according to Nordic recommendations (20) on the forearm of cow-asthmatic patients in duplicate with nBos d 2 and rBos d 2. Five 10-fold dilutions (0.02-200 μg/ml) of each preparation were used with appropriate controls. After 15 min, the wheals were marked and the diameters were measured.

RESULTS AND DISCUSSION
Immunological Properties of Recombinant Bos d 2-Recombinant Bos d 2 (rBos d 2) was produced in Pichia pastoris and crystallized as described earlier (13). To ascertain that the observations made on the relationship between the allergenicity and the structure of the rBos d 2 were valid, we characterized the immunoreactivity of rBos d 2 by the ELISA inhibition technique and skin prick tests. In the ELISA inhibition, the capacity of rBos d 2 or natural Bos d 2 (nBos d 2) to inhibit the binding of serum IgE of four allergic patients to the allergens was tested as described under "Experimental Procedures." All sera gave identical results, showing that rBos d 2 was as effective an inhibitor per weight unit as nBos d 2 (Fig. 1). The complete inhibition of the binding of the two monoclonal antibodies revealed that the two different epitopes recognized by these antibodies were intact.
The relevance of rBos d 2 as an allergen was further confirmed with in vivo skin prick tests in ten patients. Again, the results showed (Fig. 2) that rBos d 2 was as effective as nBos d 2, indicating that rBos d 2 was recognized by IgE antibodies on the effector cells leading to cross-linking, release of mediators, and to positive skin prick tests. In preliminary experiments with two Bos d 2-specific T-cell clones (19), rBos d 2 also induced stimulation identical to that caused by nBos d 2 (results not shown).
The results described above showed that structural characteristics responsible for the clinically relevant, conformational B-cell epitopes were present in the rBos d 2. One of the aims when modified allergens are designed for immunotherapy is to reduce allergic side effects by minimizing the IgE binding capacity. Once the structural sites in the molecule critical for B-cell epitopes have been localized, this can be achieved by recombinant DNA technology.
The Structure of Bos d 2-The three-dimensional structure of rBos d 2 was determined by using x-ray diffraction at 1.8-Å resolution and then refined to a crystallographic R factor of 18.4%. The structure was solved with a molecular replacement technique by using the mouse urinary protein 1MUP as a search model. The refined model contains 150 residues (7-156) and 76 water molecules. A short N-terminal region of the protein was disordered and was not included in the model. The structure is monomeric with no glycosylation or other observable post-translational modifications. Previous molecular weight determinations by ion trap mass spectrometry suggested that the N-terminal residue appears either as glutamine or as pyrrolidone carboxylic acid. Because the first six N-terminal residues were not visible, the existence of pyrrolidone carboxylic acid could not be verified by the crystal structure analysis. The final electron density was of good quality except in the two loops B-C and C-D, which indicated disorder in the crystal and resulted in high B factors. The side chains for two residues, Arg 17 and Leu 117, were built using two conformations. Bos d 2 is clearly a member of the lipocalin protein family, sharing all the structural elements characteristic of other lipocalins (4). The central β-barrel is composed of eight antiparallel β-strands (A-H) with (+1)₈ topology forming a continuous hydrogen-bonded flattened barrel. The structure also contains a short β-strand (I), a 3₁₀-helix between strands A and B, an α-helix between strands H and I, and a C-terminal 3₁₀-helix (Fig. 3A).
The electron density unambiguously showed the existence of two disulfide bridges. Cys 44-Cys 48 is located in the loop between strands B and C. Cys 63-Cys 154 is located between the N terminus of β-strand D and the C terminus of the protein, a structure shared by other proteins in the lipocalin protein family, for example, 1MUP.
Of lipocalins with known three-dimensional structures, 1MUP (11) and odorant-binding protein (1OBP) (21) have the highest sequence identities with Bos d 2, 20 and 26%, respectively. The structures of the three proteins were compared with the program O (22). We obtained a root mean square deviation of 1.645 Å for 138 Cα atoms of 1MUP and 1.623 Å for 129 Cα atoms of 1OBP. These values are close to those reported between the insect biliverdin-binding protein and the human retinol-binding protein of the lipocalin protein family, namely 1.65 Å for 97 Cα atoms (23). For comparison, a root mean square deviation of 1.557 Å for 144 Cα atoms is obtained when the crystal structure of Bos d 2 and the model based on comparative modeling (24) are compared.
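The root mean square deviation values above follow from a simple formula once equivalent Cα atoms have been paired and the structures superimposed. The sketch below computes only that final step; it assumes the coordinates are already superimposed and matched one-to-one (the least-squares superposition itself, done here with the program O, is not reproduced), and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched lists of (x, y, z) points.

    Assumes the two coordinate sets are already superimposed and that
    coords_a[i] and coords_b[i] are equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Three hypothetical atom pairs, each displaced by exactly 1 Å.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0), (0.0, 2.0, 0.0)]
print(rmsd(structure_a, structure_b))  # 1.0
```

Because the deviation depends on which atoms are counted as equivalent, it is only meaningful together with the number of Cα atoms included, which is why both are reported.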
The biological function of Bos d 2 is currently unknown. There is only indirect evidence of the biological function of Bos d 2 (25). However, most proteins in the lipocalin family are ligand-binding proteins capable of binding and transporting small hydrophobic molecules. The VOIDOO program was used to calculate the molecular surface of the protein and detect possible cavities (26). A flat cavity was found inside the protein, and its size would approximately correspond to two small fused rings, as in indan. Because a recombinant protein was used in the study, the ligand binding cavity was only occupied by two water molecules. The ligand binding pocket is usually formed by a number of hydrophobic residues. However, the striking feature of Bos d 2 is the existence of hydrophilic residues in the pocket (Fig. 4). An important residue is Glu 115, in which the first carboxylate oxygen makes hydrogen bonds with Tyr 39 and Thr 100. The second carboxylate oxygen is hydrogen-bonded with two water molecules. The second important hydrophilic residue in the pocket is Asn 37, with the side chain amino group hydrogen-bonded to the main chain carbonyl of Pro 34 and the side chain oxygen hydrogen-bonded to the same water molecule as Glu 115. These two hydrogen bonds from Glu 115 and Asn 37 to the water molecule are well oriented geometrically. Because both residues are hydrogen bond acceptors, it is plausible to suggest that the endogenous ligand of Bos d 2 contains a heteroatom(s), which can donate two hydrogen bonds. One such group could be the charged nitrogen with two hydrogen atoms.

[...] three-dimensional structures are only available for Mus m 1 and Bos d 5, but their allergenic properties have not been discussed in terms of structure. The best known allergens of the lipocalin family exist as monomers. Mus m 1 was reported to be a dimer in the crystal, but when observing the packing of the molecules it is difficult to believe that it is a true dimer.
On the other hand, the dimeric structure of β-lactoglobulin is well characterized. In this protein, the longer β-strand I forms an important part of the dimer interface containing a number of hydrogen bonds. The protein exists as a dimer at neutral pH but as a monomer at acidic pH. Four of the allergens, Bla g 4, Can f 1, Can f 2, and Equ c 1, have putative N-glycosylation sites (Fig. 5). These sites are located on the protein surface between β-strands A and B, and between β-strands B and C, and in β-strand D.
The main unifying features of allergenic lipocalins are the three-dimensional structure and the transport function. Therefore, the relationship between the structure and B-cell epitopes could reveal valuable information on the determinants of allergenicity of lipocalins. Three different methods have been used to map B-cell epitopes: the use of polypeptide fragments of allergenic proteins, mutagenesis of allergenic proteins, and investigation of surfaces of protein models, e.g. by mapping homologous residues. Generally, much less is known about B-cell (conformational) epitopes than T-cell (sequential) epitopes. We have reported earlier that the C-terminal part of Bos d 2 is important for IgE binding (19). Among the other lipocalin allergens, B-cell epitopes have been characterized only for β-lactoglobulin (Bos d 5) by using peptide fragments. Ball et al. (28) determined that polypeptide 97-108 inhibited the antibody binding of native β-lactoglobulin. This fragment is composed mainly of β-strand G and the preceding loop. Many residues of this fragment would be inside the core in the native protein structure, and it is questionable whether this kind of short fragment would have a native extended β-strand conformation.

FIG. 3. [...] A small red stick model shows the position of the putative ligand binding pocket inside the protein core. The picture has been created with the program Setor (34). B, the molecular surface of Bos d 2. The surface is mapped according to the "cost" from the sequence alignment of allergenic lipocalins (Table II).

Because the knowledge of B-cell epitopes of lipocalin allergens is very limited, we have used another method to probe the properties of these molecules. First, a multiple amino acid sequence alignment was carried out for seven proteins with the program GeneWorks (Fig. 5).
Despite the probable structural similarity, the identity between the sequences was very low, showing only the two identical residues, Gly 14 and Trp 16 (Bos d 2 numbering), of the main conserved segment of the lipocalin family. The computer program used also estimates a penalty or a "cost," a number from 0 to 9 for alignment; 0 represents strongest similarity and 9 weakest. These values are shown in Fig. 5 for each residue and are also mapped on the surface of Bos d 2 (Figs. 3, B and C). A resemblance between the molecules can be observed in several areas of the surface. These areas include the α-helix, N- and C-terminal ends of β-strand A, and the loop between β-strands C and D. Some well conserved charged residues can be found in these areas, especially on the surface of the α-helix (Glu 126 and Glu 129). It would be interesting to continue the analysis to other, nonallergenic lipocalins, but problems would arise, because we do not know if these molecules are potential allergens or not. However, we can conclude that surprisingly many characterized allergens belong to the lipocalin family. These allergens are predominantly monomeric molecules. Some, but not all, are glycosylated, which would indicate that glycosylation is not a common critical determinant of allergenicity in lipocalins. We have identified surface residues of high similarity between lipocalin allergens. These regions would be suitable targets for mutational studies in the development of new preparations for allergen immunotherapy.
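The step from per-residue alignment costs to candidate conserved positions can be sketched as a simple threshold filter. This is an illustrative assumption about how such a list might be drawn up, not the procedure used with GeneWorks. Only the cost of 0 for the identical residues Gly 14 and Trp 16 follows from the text; every other cost value below, and the threshold itself, is hypothetical.

```python
def conserved_positions(costs, max_cost=1):
    """Residue numbers whose alignment cost is at or below `max_cost`, sorted.

    `costs` maps residue number -> cost (0 = strongest similarity, 9 = weakest).
    """
    return sorted(pos for pos, cost in costs.items() if cost <= max_cost)


# Hypothetical costs keyed by Bos d 2 residue number.
toy_costs = {14: 0, 16: 0, 50: 7, 90: 9, 126: 1, 129: 2}

print(conserved_positions(toy_costs))              # [14, 16, 126]
print(conserved_positions(toy_costs, max_cost=2))  # [14, 16, 126, 129]
```

In practice such a list would still need to be restricted to solvent-exposed residues before it could suggest surface regions, since buried conserved positions cannot take part in IgE binding.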
Comparison with Allergens of Other Families-In the past few years, the three-dimensional structures of allergens of other families have also been determined. The coordinates for four plant and one mite allergen are available at the Brookhaven Data Bank (Table I).
The major birch pollen allergen Bet v 1 contains a seven-stranded antiparallel β-sheet, which wraps around a long α-helix. The structure also contains two shorter α-helices. The biological function of Bet v 1 is unknown (6). The second birch pollen allergen, Bet v 2, is a profilin and partly resembles Bet v 1, since it also has a seven-stranded antiparallel β-sheet, but it has a slightly different strand order and helix positions (7). Mouse ear cress profilin Ara t has a similar structure to birch pollen profilin Bet v 2 (8). Major mite allergen Der f 2 is a single domain β-protein with an immunoglobulin fold, and it has been suggested that it is related to the antibacterial defense system (10). Only preliminary structural data have been published for timothy grass pollen Phl p 2 (9). It is composed of an eight-stranded antiparallel β-barrel, thus resembling lipocalins, but the connection between the β-strands is clearly different.
Knowledge of the B-cell epitopes of these allergens is also very limited. Peptides have been used to study the IgE binding of Bet v 2. In particular, it has been suggested that N- and C-terminal fragments are important for binding (7). Nishiyama et al. (32) have studied Der f 2 by using site-directed mutagenesis. They have found that two small areas on the opposite faces of the protein are important for IgE binding. One area is composed of polar or charged residues and the other of more hydrophobic residues.
FIG. 5. The sequence alignment of allergenic lipocalins. Putative N-glycosylation sites are shown in boldface. The first row shows the secondary structure elements of Bos d 2: ~, electron density weak or missing; →, β-strand; =, 3₁₀ or α-helix. Cost is the estimation of similarity according to the GeneWorks program (0 means absolute identity).

When comparing the available allergen structures, it is easy to note that they are composed mainly of β-structures. Der f 2 and Phl p 2 contain only β-structures, whereas lipocalins, Bet v 1, and profilins also have some helices. Because the experimental knowledge of B-cell epitopes is very limited, it is not possible to suggest any structural motif or sequence pattern common to all allergenic proteins. Therefore we have compared dimensions and shapes of available allergenic molecules. Since many molecules seemed to be a bit flat, we calculated the ellipticity of the molecules (Fig. 6, Table I). The shortest dimension for these molecules is on average 34 Å, the middle dimension 37 Å, and the longest dimension 44 Å. When looking at the figure, it is evident that these allergens have similar dimensions. Islam et al. (33) defined the ellipticity to be the ratio between the longest ellipsoidal axis and the shortest. These allergens are only slightly elliptical (ratio 1.3). Islam et al. (33) have calculated the ellipticity for most single domain proteins to range from 1.25 to 2.25, which would suggest that these allergens should be considered more spherical than elliptical.
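The ellipticity figure can be checked with a short calculation using the definition quoted above, the ratio of the longest to the shortest axis. The sketch applies it to the average dimensions given in the text; whether the reported ratio of 1.3 was obtained from these averages or by averaging per-molecule ratios is not stated, so treating the averages as a single ellipsoid is an assumption. The function names are illustrative.

```python
def ellipticity(dimensions):
    """Ratio of the longest to the shortest axis of a molecule."""
    return max(dimensions) / min(dimensions)


def within_range(value, low, high):
    """True if value lies in the closed interval [low, high]."""
    return low <= value <= high


# Average shortest, middle, and longest dimensions from the text, in Å.
average_dimensions = (34.0, 37.0, 44.0)
ratio = ellipticity(average_dimensions)

print(round(ratio, 1))                  # 1.3
print(within_range(ratio, 1.25, 2.25))  # True
```

The value sits at the lower end of the 1.25-2.25 range reported for single domain proteins, which is the basis for describing these allergens as more spherical than elliptical.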