Structures of Two Major Allergens, Bla g 4 and Per a 4, from Cockroaches and Their IgE Binding Epitopes

Inhalant allergens from cockroaches are an important cause of asthma to millions of individuals worldwide. Here we report for the first time the structures of two major cockroach allergens, Bla g 4 and Per a 4, that adopt a typical lipocalin fold but with distinct structural features as compared with other known lipocalin allergens. Both Bla g 4 and Per a 4 contain two long-range disulfide bonds linking the N and C termini to a β-barrel. The C-terminal helix of Bla g 4 is bent and greatly extended toward the N terminus. Bla g 4 is found to be a monomer, whereas Per a 4 exists as a dimer in solution with a novel dimeric interface involving residues from loops at the top and bottom of the β-barrel. Putative ligand binding sites of both allergens are determined by docking of the juvenile hormone III inside the β-barrel and found to interact with the ligand using non-conserved residues. Bla g 4 and Per a 4 are found to be cross-reactive in sera IgE binding, at least in the Singaporean Chinese population tested. A major IgE binding epitope unique to Per a 4 is found on the loops at the bottom of the β-barrel that may aid the development of hypoallergens for immunotherapy.

Rising incidence of asthma and other allergy-related diseases has been one of the major healthcare concerns worldwide. The great importance of these diseases has led to the isolation and identification of many allergenic proteins involved in eliciting inflammatory responses. Allergens bind to IgE antibodies and cross-link their receptors on mast cells, subsequently causing the release of histamine and other pro-inflammatory mediators.
Despite the identification of many allergens from several different sources, characterization and epitope mapping studies on these proteins remain limited. Consequently, the basis of IgE binding for many of these allergens also remains unknown.
Inhalant allergens from cockroaches have been recognized for the last 30 years as an important cause of asthma. Sensitization to cockroach allergens is a major risk factor, especially due to the high levels of these proteins found in the urban environment (1). It has been reported that 44% of 755 allergy clinic patients living in New York tested positive in skin tests to cockroach allergens (2). In Singapore, sensitization rates among patients with asthma and/or allergic rhinitis to cockroaches are 56.4% (Blattella germanica) and 59.5% (Periplaneta americana), second only to the dust mite (3). Both species of domestic cockroaches produce several potent allergens, including Bla g 1 and Per a 1 (food intake and digestion); Bla g 2 and Per a 2 (inactive aspartic protease); Per a 3 (arylphorin-like storage protein); Bla g 4 and Per a 4 (male pheromone transport lipocalin); Bla g 5 and Per a 5 (glutathione S-transferase); Bla g 6 and Per a 6 (troponin C); and Bla g 7 and Per a 7 (tropomyosin) (4).
The prevalence of serum IgE antibody to recombinant Bla g 4 in 73 cockroach allergic patients with asthma ranged from 40 to 60%, indicating that it is a major allergen (5). Bla g 4 belongs to the lipocalin family, a group of small extracellular ligand-binding proteins (6). Typically, members of the lipocalin family are highly divergent in sequence, but all share a highly conserved overall structure (7). Generally, lipocalin structures consist of eight anti-parallel β-strands forming a β-barrel, which contains a ligand binding cavity (7). Several allergenic lipocalins have been characterized so far, such as Bos d 2 (8), Bos d 5 or bovine β-lactoglobulin (9), Equ c 1 (10), Mus m 1 (11), and rat urinary α2 globulin (12). Although crystal structures of several of these lipocalins have been determined, the exact IgE epitope residues of these allergens are not well understood. Consequently, the basis of allergenicity of lipocalin allergens remains unknown.
Previously, we isolated the cDNA of variants of Bla g 4 and Per a 4 from the expressed sequence tag (EST) libraries of B. germanica and P. americana, respectively. Here we report the crystal structures of variants of two allergenic lipocalins, Bla g 4 and Per a 4, from the cockroaches B. germanica and P. americana, which showed typical lipocalin folds but with distinct structural features. The putative ligand binding site has been identified by docking with the physiological ligand. The cross-reactivity and key IgE binding residues of these two allergens are also discussed.

EXPERIMENTAL PROCEDURES
Expression and Purification of Recombinant Bla g 4 and Per a 4-The gene of a Bla g 4 variant (GenBank accession code 1146297, residues 1-176) or a Per a 4 variant (GenBank accession code 1146802, residues 1-167), both without their intact signal peptides, was subcloned into a modified pET-32a vector (Novagen; with thioredoxin and S-tag removed). BL21 Origami B cells (Novagen) were used for recombinant protein expression. Overnight cultures of the bacterial cells were grown in 2 liters of Luria broth containing 100 µg/ml ampicillin, 15 µg/ml kanamycin, and 12.5 µg/ml tetracycline. For expression in BL21(DE3) cells, Luria broth containing 100 µg/ml ampicillin was used. Cultures were induced with 0.3 mM isopropyl β-D-thiogalactoside for overnight expression at 20 °C. Selenomethionine (Se-Met)-labeled Bla g 4 and Per a 4 were expressed similarly in M9 minimal media supplemented with Se-Met. Both Bla g 4 and Per a 4 were expressed as soluble His-tagged proteins and purified using nickel-nitrilotriacetic acid resins (Qiagen). The N-terminal His tag was cleaved using thrombin digestion (3 units/mg, Sigma) and the protein was further purified on a HiLoad 16/60 Superdex 75 column (GE Healthcare) equilibrated with 50 mM Tris-Cl, pH 7.9, and 0.2 M NaCl. Proteins were dialyzed overnight against 20 mM Tris-Cl, pH 7.5, prior to crystallization experiments.
Crystallization and Data Collection-Both the native and Se-Met-labeled Bla g 4 were concentrated to 12 mg/ml prior to crystallization, whereas native and Se-Met-labeled Per a 4 were concentrated to 27 mg/ml. Initial crystallization conditions were identified using Hampton Research screens for both proteins. A final crystallization condition was optimized for Bla g 4 to contain 0.1 M HEPES, pH 7.5, 1.5 M sodium citrate, and 3% acetone. Diffraction quality crystals grew after 7 days at room temperature. Per a 4 crystals grew in 10% PEG 4000, 2 M NaCl, and 3% acetone at room temperature after 3 days. Synchrotron datasets for both protein crystals were processed and scaled using the program HKL2000 (13).
Structure Determination and Refinement-Four selenium sites in the asymmetric unit of Bla g 4 crystals were located by the program Solve (14) and the initial phases were further improved by RESOLVE (15). RESOLVE built ~75% of the molecule. The remaining parts of the model were built manually using the program Coot (16). Further cycles of model building alternating with refinement using CNS (17) resulted in a final model, with reflections of I > σ(I) used in the refinement. The model was later refined against the native data at 1.9 Å resolution. The final model comprised 338 residues in the asymmetric unit (Asn 7-Lys 135 and Met 141-His 176 of each monomer), and 179 water molecules were added to the model. PROCHECK (18) analysis showed >84.5% of residues in the most favorable region with 0.7% of residues in the disallowed region of the Ramachandran plot. Per a 4 was refined by procedures similar to those for Bla g 4 but with non-crystallographic symmetry restraints used throughout the refinement. The final model comprised 314 residues (Asp 3-Tyr 87 and Ser 94-Tyr 165 of each monomer) with no water molecules added due to the limited resolution. The density for residues Gly 88 to Trp 93 was not observed and these residues were therefore not modeled. A well defined density belonging to a PEG molecule was observed in the model. The Ramachandran plot showed 76.6% of residues in the most favorable region with 0.7% of residues in the disallowed region.
Docking of JH III Ligand on Bla g 4 and Per a 4-The crystal structures of Bla g 4 and Per a 4 were used for docking with JH III using the program Autodock 4.0 (19). The initial three-dimensional structure of JH III was generated using the program Discovery Studio 2.0 from Accelrys. Coordinate files of the macromolecule as well as the ligand used for autogrid and autodock calculations were prepared using the program AutoDockTools (19). A grid with a spacing of 0.375 Å and 74 × 68 × 66 points was used for Bla g 4, and a grid with a spacing of 0.375 Å and 74 × 68 × 80 points was used for Per a 4. The grid center was selected to include all residues in the putative ligand binding cavity. The GA-LS algorithm was adopted using default settings except that the maximum number of generations was changed to 500,000 and the maximum number of energy evaluations was increased to 25,000,000. For each docking job, 100 hybrid GA-LS runs were carried out. A total of 100 possible binding conformations were generated and grouped into clusters based on a 2.0-Å cluster tolerance.
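The grid and clustering figures quoted above can be illustrated with a short sketch. This is a simplified stand-in and not the AutoDock implementation: it assumes that the quoted counts are numbers of grid intervals (so that each box edge is count × spacing), that each pose is compared only with the lowest-energy member of a cluster, and that no symmetry handling is needed. The function names and the toy single-atom poses are hypothetical.

```python
import math

def box_edges(npts, spacing):
    """Edge lengths (in Å) of a grid box, assuming npts counts grid intervals."""
    return tuple(n * spacing for n in npts)

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def cluster_poses(poses, tolerance=2.0):
    """Greedy clustering of (energy, coords) poses.

    Poses are visited from lowest to highest energy. A pose joins the first
    cluster whose seed (its lowest-energy member) lies within `tolerance` Å
    RMSD; otherwise it seeds a new cluster.
    """
    clusters = []
    for pose in sorted(poses, key=lambda p: p[0]):
        for cluster in clusters:
            if rmsd(pose[1], cluster[0][1]) <= tolerance:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters

# Grid boxes described in the text (spacing 0.375 Å).
bla_g_4_box = box_edges((74, 68, 66), 0.375)
per_a_4_box = box_edges((74, 68, 80), 0.375)

# Hypothetical single-atom poses: (energy, coordinates).
toy_poses = [
    (-5.0, [(10.0, 0.0, 0.0)]),
    (-7.0, [(0.0, 0.0, 0.0)]),
    (-6.5, [(1.0, 0.0, 0.0)]),
]
toy_clusters = cluster_poses(toy_poses, tolerance=2.0)
```

Under these assumptions the search boxes span roughly 25 to 30 Å along each axis, which is large enough to enclose the β-barrel cavity of a lipocalin of this size.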
Titration of Bla g 4 and Per a 4 with JH III-JH III (Sigma) was dissolved in dimethyl sulfoxide to a final concentration of 50 mM. Titration studies were carried out using Bla g 4 or Per a 4 at a protein concentration of 0.5 mM and with a final protein:ligand molar ratio of 1:1. HSQC experiments were performed at 308 K in a buffer containing 20 mM acetate, pH 4.5, for Bla g 4 or 20 mM phosphate, pH 6.0, for Per a 4. Dimethyl sulfoxide alone was titrated into the protein sample as a blank control.
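The arithmetic behind a 1:1 titration end point can be sketched as follows. The concentrations are those given above; the 500 µl sample volume is a hypothetical value chosen only for illustration, as are the function names.

```python
def stock_volume_ul(protein_mm, sample_ul, stock_mm, ligand_per_protein=1.0):
    """Volume (µl) of ligand stock needed to reach a ligand:protein molar ratio.

    mM multiplied by µl gives nmol, and a stock in mM holds that many nmol per µl.
    """
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    ligand_nmol = protein_mm * sample_ul * ligand_per_protein
    return ligand_nmol / stock_mm

def dmso_percent(added_ul, sample_ul):
    """Final solvent content (% v/v) after adding the stock to the sample."""
    return 100.0 * added_ul / (sample_ul + added_ul)

# 0.5 mM protein, 50 mM JH III stock, hypothetical 500 µl sample.
added = stock_volume_ul(protein_mm=0.5, sample_ul=500.0, stock_mm=50.0)
solvent = dmso_percent(added, 500.0)
```

With a 100-fold concentrated stock, the end point is reached after adding 1% of the sample volume, so the dimethyl sulfoxide content stays near 1% (v/v), which is why a solvent-only blank is a useful control.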
IgE Binding by Immuno Dot-blot-Sera from 700 consecutive individuals (80% Singaporean Chinese, 14% Malaysians, and 6% Indians) with asthma or allergic rhinitis were screened by dot blotting for Bla g 4- and Per a 4-specific IgE responses. For mapping of IgE epitopes, 24 and 32 sera from individuals with positive responses to Bla g 4 and Per a 4, respectively, were tested against Bla g 4 or Per a 4 proteins (wild type or mutant). The immuno dot-blot was performed as previously described (20).
IgE Binding by ELISA-The specific binding of Bla g 4 and Per a 4 (wild type or mutant) to IgE from the serum of a patient was determined to map the IgE binding epitopes. ELISA was performed as previously described (20).
Cross-inhibition of Bla g 4 and Per a 4 by Competitive ELISA-Microtiter plates (Nunc) were coated with Bla g 4 or Per a 4. Before use, pooled sera from 73 reactive individuals were preabsorbed with serial dilutions of recombinant proteins, Bla g 4 or Per a 4. Competitive ELISA was performed as described previously (21).
Coordinates-Coordinates for the crystal structures of Bla g 4 and Per a 4 have been deposited at the Protein Data Bank with accession codes 3EBK and 3EBW, respectively.

RESULTS
Structures of Bla g 4 and Per a 4-The models of Bla g 4 and Per a 4 were refined and both showed good stereochemical parameters (Table 1 and Fig. 1A). Both Bla g 4 and Per a 4 crystallized with two molecules in the asymmetric unit. Bla g 4 and Per a 4 adopted a typical lipocalin fold with an eight-stranded anti-parallel β-barrel (βA to βH) followed by a single C-terminal α-helix that packed against the β-barrel (Fig. 1B). The 9th β-strand (βI), which can be found in many lipocalins, was absent in Bla g 4 and Per a 4. In Bla g 4, a short helical segment preceding the βA strand crossed and closed the bottom of the β-barrel, whereas the CD loop and three 3₁₀ helices from the AB loop, EF loop, and GH loop surrounded the entrance of the cavity on top of the β-barrel (Fig. 1, B and C). Per a 4 does not have any secondary structural element in its loops or N- and C-terminal regions. The bottom of the β-barrel is closed by the loop region preceding βA, and the entrance of the cavity is surrounded by the AB loop, CD loop, and GH loop. The AB loop is much longer, reaching the center of the entrance and thus conceivably acting as a lid for the cavity. The βE strand is especially short in Per a 4, causing the EF loop to "peel" away from the β-barrel and making the entrance to the cavity much more open and accessible as compared with Bla g 4 (Fig. 1, B and C). All lipocalins contain a C-terminal helix that packs on one face of the β-barrel. The most distinct feature of Bla g 4 is that this α-helix is much longer and bent, whereas the equivalent helix in Per a 4 and other lipocalins is much shorter and straight (Fig. 1B and Fig. 2). This bent α-helix of Bla g 4 has a unique hydrogen bonding pattern. The carbonyl oxygens (C′O) of residues Val 150 and Asn 151 are both hydrogen-bonded to the NH of residue Lys 154. The C′O of residue Gln 152 does not form a hydrogen bond with the NH of Ala 156. The NH of residues Lys 155 and Ala 156, as well as the C′O of Gln 152, are left free without forming any hydrogen bonds. Other residues in this α-helix adopt the regular hydrogen bonding pattern (C′O of residue i to NH of residue i+4).
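The irregularity at the bend can be made explicit by classifying each backbone hydrogen bond by the sequence offset between the C′O donor residue and the NH acceptor residue. The sketch below is illustrative only: the offset-to-helix mapping (i+3 for 3₁₀, i+4 for α, i+5 for π) is the standard convention, whereas the function and variable names are hypothetical and only the two bonds reported in the text are included.

```python
# Standard backbone hydrogen-bond offsets for the three helix types.
HELIX_TYPES = {
    3: "3-10 helix (i to i+3)",
    4: "alpha helix (i to i+4)",
    5: "pi helix (i to i+5)",
}

def classify_hbond(co_residue, nh_residue):
    """Classify a C'O(i) to NH(j) hydrogen bond by its sequence offset j - i."""
    offset = nh_residue - co_residue
    return HELIX_TYPES.get(offset, f"non-helical (offset {offset})")

# Hydrogen bonds reported at the bend of the Bla g 4 helix: (C'O residue, NH residue).
reported_bonds = {
    "Val 150 to Lys 154": (150, 154),
    "Asn 151 to Lys 154": (151, 154),
}

for label, (co, nh) in reported_bonds.items():
    print(label, "->", classify_hbond(co, nh))
```

Read this way, the Val 150 bond is a regular α-helical contact, whereas the Asn 151 bond has the i to i+3 spacing of a 3₁₀ turn, which is consistent with a local distortion of the helix at this position.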
Another characteristic structural feature of lipocalins is the long-range disulfide bond that connects either both the N and C termini or just the C terminus of the protein to the β-barrel (Figs. 1C and 2). This is the first report to reveal the simultaneous presence of both disulfide bonds in allergenic lipocalins. Both Bla g 4 and Per a 4 contain two long-range disulfide bonds similar to those found in many non-allergenic lipocalins. One long-range disulfide bond connects the N terminus to the end of βG, whereas the other connects the start of βB to the C terminus. A similar pattern of long-range disulfide bonds is also observed in many non-allergenic lipocalins such as insecticyanin from the tobacco hornworm (22), bilin-binding protein from the cabbage butterfly (23), and human apolipoprotein D (24). All previously identified lipocalin allergens, such as Mus m 1 (12), rat α2u-globulin (25), Equ c 1 (10), Fel d 4 (26), Bos d 2 (8), and Bos d 5 (9), contain a single long-range disulfide bond that connects the start of βD to the C terminus.
FIGURE 2. Structure-based sequence alignment of Bla g 4 and Per a 4 with allergenic lipocalins (…, 1BEB) and endogenous lipocalins (human tear lipocalin, Protein Data Bank code 1XKI; and bilin-binding protein, Protein Data Bank code 1BBP) using the protein structure comparison service SSM at European Bioinformatics Institute. Note that residues 1-6 and 136-140 of Bla g 4, and residues 1-2, 88-93, and 166-167 of Per a 4, are absent from the structure alignment due to the lack of electron density for these residues. β-Strands, α-helices, and 3₁₀ helices are highlighted in yellow, cyan, and magenta, respectively. Boundaries and types of secondary structures are determined using the program MOLMOL (49). Three SCRs are boxed with solid black lines and highly conserved residues are in bold face. The pairs of cysteine residues involved in the same disulfide bond are boxed and marked in the same color (red, blue, or green). The sequence identity between Bla g 4 and Per a 4 is 21%. Sequence identities between Bla g 4 and Mus m 1, Bos d 2, and Equ c 1 are 17, 10, and 17%, respectively. Sequence identities between Per a 4 and Mus m 1, Bos d 2, and Equ c 1 are 15, 4, and 14%, respectively.

Bla g 4 and Per a 4 share low sequence identity with each other and with other lipocalins (Fig. 2). Despite the very low sequence identity, there exist some conserved sequence motifs among lipocalins that allow them to be classified (7,27). Three conserved sequence motifs that are found in the three "structurally conserved regions" (SCR1, SCR2, and SCR3) are GXW, TDYXXY, and Arg, respectively (7) (Fig. 2). Bla g 4 is a kernel lipocalin as it shares all three conserved sequence motifs, whereas Per a 4 is an outlier lipocalin that shares only the GXW and TDYXXY motifs (Fig. 2). Fig. 2 shows the structure-based sequence alignment of selected allergenic and nonallergenic lipocalins using the protein structure comparison service SSM at European Bioinformatics Institute (28). Overall, the structures of all lipocalins are similar, adopting the basic lipocalin fold. However, the C-terminal α-helix of Bla g 4 is extended far toward the N terminus and this region cannot be aligned with other lipocalins (Fig. 2). For comparison of structures, we have overlaid the structures of Bla g 4 and Per a 4 on each other and on their structurally most homologous nonallergenic and allergenic lipocalins, selected based on the DALI search (29) (Fig. 3). Structure superposition was performed using the TopMatch (30) web service. Bla g 4 superimposed with Per a 4 with an r.m.s. deviation of 1.6 Å for Cα atoms of 119 structurally equivalent residues. Main differences are found at the N-terminal part of the bent α-helix of Bla g 4 and residues at the CD, EF, and GH loops located at the entrance of the ligand binding cavity and the C terminus located at the top of the β-barrel (Fig. 3A). A similar pattern has been observed in the superposition of Bla g 4 with the bacterial lipocalin Blc with an r.m.s. deviation of 2.4 Å for Cα atoms of 121 structurally equivalent residues (Fig. 3B). For superposition with Mus m 1, additional regions from βG and βH adjacent to the GH loop cannot be aligned with Bla g 4. The overall number of structurally equivalent residues was reduced to 112 with an r.m.s. deviation of 2.6 Å for Cα atoms (Fig. 3C). Per a 4 superimposed well with the bilin-binding protein with an r.m.s. deviation of 2.6 Å for Cα atoms of 138 structurally equivalent residues, with main exceptions residing at the EF and GH loops at the entrance to the cavity, and also at the C terminus at the top of the molecule (Fig. 3D). Per a 4 superimposed poorly with Bos d 5, with an r.m.s. deviation of 3.0 Å for Cα atoms of 118 structurally equivalent residues and with main discrepancies at the N and C termini as well as the AB and EF loops surrounding the entrance of the cavity (Fig. 3E). Overall, both Bla g 4 and Per a 4 are structurally more similar to nonallergenic than allergenic lipocalins. Moreover, due to its much longer C-terminal α-helix, Bla g 4 aligned less well with other lipocalins than Per a 4.
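For readers who wish to see how a sequence identity figure is obtained from an alignment, a minimal sketch is given here. It is not the procedure used by SSM: the denominator (all aligned columns in which neither sequence has a gap) is one common convention chosen as an assumption, and the function name is hypothetical. The two 10-residue segments are those quoted later in this paper as corresponding regions of Bla g 4 (residues 98-107) and Per a 4 (residues 99-108); treating them as an ungapped alignment is also an assumption.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over aligned columns without a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

bla_g_4_segment = "VLATDYENYA"  # Bla g 4, residues 98-107
per_a_4_segment = "MVGTDYQTYS"  # Per a 4, residues 99-108

segment_identity = percent_identity(bla_g_4_segment, per_a_4_segment)
```

The local identity over this conserved segment is considerably higher than the 21% reported for the full-length proteins, which illustrates how conserved motifs can persist in otherwise divergent lipocalin sequences.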
The Per a 4 Dimer-Both Bla g 4 and Per a 4 crystallized with 2 molecules in the asymmetric unit. However, only Per a 4 was found to exist as a dimer in solution. When the proteins were expressed using Origami cells, gel filtration studies showed that Per a 4 eluted as a dimer, whereas Bla g 4 eluted as a monomeric protein. Formation of the disulfide bonds in Per a 4 may be essential for its dimer formation, as both Bla g 4 and Per a 4 eluted as monomeric proteins when expressed in BL21(DE3) cells (Fig. 4C). The two monomers of Per a 4 are packed side by side, leaving the entrances to the ligand binding cavities open and pointing in opposite directions. The 2 α-helices are on opposite sides of the dimer, running in an anti-parallel fashion (Fig. 4, A and B). Residues from the loops (AB and GH) and the C terminus at the top of one monomer are in close contact with residues from the loops (BC and DE) and the N terminus at the bottom of the other monomer (Fig. 4, A and B).
There is an extensive network of hydrogen bonds in the dimeric interface. A total of 12 hydrogen bonds (7 backbone to side chain, 1 backbone to backbone, and 4 side chain to side chain) are observed in the dimeric interface. There is no significant hydrophobic interaction involved in the dimeric interface, unlike in other lipocalins. For instance, Equ c 1 has an extended hydrophobic patch at the center of the dimeric interface (10). The dimeric interface of Per a 4 buries an area of 858 Å², which is ~10% of the total solvent accessible surface. The presence of a tight dimeric interface suggests that this is a physiologically relevant Per a 4 dimer.
The Putative Ligand Binding Sites of Bla g 4 and Per a 4-Lipocalins carry out their function by binding small hydrophobic ligands such as pheromones, steroids, and retinoids (31). Many lipocalins have been suggested to be involved in regulation of cell homeostasis (32). Recent studies have found that the lipocalin Bla g 4 is expressed only in the adult male reproductive system of the cockroach, specifically in the utricles and the conglobate gland, and is transferred to the female within the spermatophore during copulation (33). Topical application of juvenile hormone III (10-epoxy-methylfarnesoate, JH III) (Fig. 5A) stimulated Bla g 4 production by both the conglobate gland and utricles (33). We have performed titration of JH III into NMR samples of Bla g 4 and Per a 4 and found significant perturbation of chemical shifts in the ¹H-¹⁵N HSQC spectra (supplemental Fig. S1), indicating the interaction of JH III with both Bla g 4 and Per a 4. During crystallization of Per a 4, polyethylene glycol (PEG) from the buffer co-crystallized with Per a 4 in a conformation almost perpendicular to the axis of the β-barrel and with one end of the molecule curved upward toward the entrance of the β-barrel (Fig. 5B). We have thus docked JH III on Bla g 4 and Per a 4 using Autodock 4.0 with rigid protein side chains (19) to determine whether JH III will adopt a conformation similar to that of PEG.
Docking of JH III on Bla g 4 showed that the JH III molecule lies almost perpendicular to the axis of the β-barrel and is located close to the entrance of the β-barrel (Fig. 5C). The docked ligand in Bla g 4 is not exposed to the solvent, and the entrance to the ligand binding cavity seems to be closed by side chains from residues Tyr 42 of the AB loop, Tyr 59 at the C-terminal end of βC, and Phe 92 of the EF loop of Bla g 4. Residues lying in close contact with the docked JH III molecule are mainly hydrophobic. The hydrophobic center of the JH III molecule sits in a pocket formed by residues Leu 39 (AB loop), Trp 45 (βB), Phe 92 (EF loop), and Tyr 122 (βH). The terminal COOCH₃ group and epoxy group of JH III may also form hydrogen bonds with polar residues in the vicinity (Fig. 5E).
When docked onto Per a 4, JH III adopted an orientation opposite to that of JH III in Bla g 4 (Fig. 5D). In addition, the ligand molecule is exposed and not blocked by residues at the entrance of the protein molecule, likely due to the much wider opening of the β-barrel in Per a 4. The hydrophobic center of the JH III molecule lies in a hydrophobic pocket with residues Tyr 46 (βB), Val 56 (βC), and Phe 83 (βE) at the bottom, and residues Tyr 97 (βF) and Val 110 (βG) at the side. The terminal COOCH₃ and epoxy group of JH III may also form hydrogen bonds with surrounding polar residues (Fig. 5F). Among the residues that are in close contact with JH III in Per a 4, Val 56, Phe 83, Tyr 97, and Val 110 are also in close contact with the co-crystallized PEG molecule in Per a 4.
IgE Binding Epitopes of Bla g 4 and Per a 4-Previously, various IgE binding epitopes of lipocalin allergens have been reported by several other groups. Using ELISA and various synthetic peptides, the IgE epitope of β-lactoglobulin was defined by monoclonal antibody binding and was located at residues 94-103 (VLDTDYKKYL). It contains the conserved DTDY motif that is also found in human retinol-binding protein (34). Based on these findings, we have selected charged residues in the corresponding regions on Bla g 4 (residues 98-107, VLATDYENYA) and Per a 4 (residues 99-108, MVGTDYQTYS) for site-directed mutagenesis study. Equ c 1 is another lipocalin allergen whose IgE binding epitopes have been mapped. The L2 loop (LDNSS) and part of the adjacent β-sheet including the start of βA (RNFDISKISG) and the end of βD (DKTEEDG) were identified to be the most promising IgE binding epitopes (10). Based on this information, we have targeted charged residues within the corresponding regions on Bla g 4 including the BC loop (residues 51-54, YDDA), the start of βA (residues 18-27, PNLDYERFRG), and the end of βD (residues 75-81, KFEGNKF) for site-directed mutagenesis to determine IgE binding epitopes. Similarly, we have also carried out site-directed mutagenesis studies on Per a 4, with targets including the BC loop (residues 48-53, LDENGV), the start of βA (residues 13-21, TGLDMTKYV), and the end of βD (residues 77-84, SWTEDTFD).
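The correspondence between the three peptide segments quoted above can be checked by locating the TDYXXY motif of SCR2 in each of them. The sketch below uses only the segment sequences and first-residue numbers given in the text; the regular expression (with "." standing for the variable positions X), the dictionary layout, and the function name are illustrative assumptions.

```python
import re

# TDYXXY motif of SCR2; "." matches any residue at the two variable positions.
SCR2_MOTIF = re.compile(r"TDY..Y")

# Segment name -> (number of first residue, segment sequence), as quoted in the text.
segments = {
    "beta-lactoglobulin": (94, "VLDTDYKKYL"),
    "Bla g 4": (98, "VLATDYENYA"),
    "Per a 4": (99, "MVGTDYQTYS"),
}

def find_motif(first_residue, sequence, pattern=SCR2_MOTIF):
    """Return (residue number of motif start, matched text), or None if absent."""
    match = pattern.search(sequence)
    if match is None:
        return None
    return first_residue + match.start(), match.group()

motif_hits = {name: find_motif(start, seq) for name, (start, seq) in segments.items()}

for name, hit in motif_hits.items():
    print(name, hit)
```

All three segments contain the motif at the same offset within the segment, which supports treating them as equivalent regions when choosing residues for mutagenesis.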
IgE binding activities of different mutants of Bla g 4 and Per a 4 were determined by immuno dot-blot screening with up to 32 different sera from patients with allergic diseases. Screening results suggest that the selected residues may not be essential IgE binding epitopes for Bla g 4. Only 2 out of 24 sera (8%) showed a significant drop in IgE binding to below 80% in mutants R24A, R26A, and K75A as compared with the wild type protein (Fig. 6A). All these residues are positively charged and located at the bottom of the β-barrel. Arg 24 and Arg 26 are located at the short helical region preceding βA, and Lys 75 is located on βD (Fig. 6E). ELISA results confirmed that mutations at residues Arg 24, Arg 26, or Lys 75 did not cause significant reduction in IgE binding for serum from a selected patient, indicating that these residues may not be major epitopes of Bla g 4 (Fig. 6C). In contrast, results obtained for mutants of Per a 4 are very different. Immuno dot-blot screening showed that 50% of the sera tested had significant drops in IgE binding (<80% compared with wild type) for mutations at each of residues Asp 16, Asp 49, Glu 80, or Gln 105. Results are consistent among these atopic individuals. All 4 mutants showed a significant drop in IgE binding, indicating that all 4 residues are important for IgE binding. Residue Asp 84 is less essential as only around 30% of the sera tested showed significant drops in IgE binding (Fig. 6B). ELISA results also confirmed these findings and suggested that Asp 16, Asp 49, Glu 80, and Gln 105 are four essential residues for IgE binding in Per a 4 (Fig. 6D). The mutation at Asp 84 did not cause any significant drop in IgE binding for this particular serum, consistent with results from some individuals in immuno dot-blot screening (Fig. 6B). These four residues are negatively charged or polar and are located on loop regions (Asp 16 on the loop region preceding βA; Asp 49 on the BC loop; Glu 80 on the DE loop; and Gln 105 on the FG loop) at the bottom of the β-barrel (Fig. 6F).
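The screening criterion used above (IgE binding of a mutant falling below 80% of the wild type value) can be expressed as a simple tally over sera. The sketch below is illustrative: the function name is hypothetical and the ratios are invented placeholder values, not data from this study. They are chosen only so that 2 of 24 sera fall below the threshold, as reported for the Bla g 4 mutants.

```python
def fraction_reduced(relative_binding, threshold=0.80):
    """Fraction of sera whose mutant/wild-type IgE binding ratio is below threshold."""
    if not relative_binding:
        raise ValueError("at least one serum is required")
    below = sum(1 for ratio in relative_binding if ratio < threshold)
    return below / len(relative_binding)

# Placeholder mutant/wild-type ratios for 24 sera (hypothetical values).
example_ratios = [0.95] * 22 + [0.60, 0.72]

reduced = fraction_reduced(example_ratios)
print(f"{100 * reduced:.0f}% of sera below the 80% threshold")
```

A residue would be regarded as a candidate epitope residue when this fraction is large, as for the Per a 4 mutants, and as non-essential when it is small.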
FIGURE 5. … Surfaces are colored in magenta and shown with a transparency of 50%. Docked JH III and bound PEG are shown as ball and stick models and CPK colored. E and F, ribbon representations of close-ups of ligand binding cavities of Bla g 4 and Per a 4, respectively. Side chains of residues that are in the vicinity of the docked ligand are labeled, CPK colored, and shown as stick models. Docked JH III molecules are CPK colored and shown as ball and stick models. Graphic images were produced using the UCSF Chimera package (50).

Cross-reactivity of Bla g 4 and Per a 4-To determine the IgE cross-reactivity of Bla g 4 and Per a 4, we have performed an extensive immuno dot-blot assay using 700 sera from atopic individuals. Fig. 7A shows that there is a very high correlation in IgE binding activities of patient sera to Bla g 4 and Per a 4. This result can arise from either (a) the patient being co-sensitized to both Bla g 4 and Per a 4, with two different types of IgE binding to specific epitopes of these two proteins, or (b) the same IgE binding to similar epitopes that are found in both Bla g 4 and Per a 4. To distinguish these situations, the cross-inhibition of IgE binding by Bla g 4 and Per a 4 on each other was investigated. Competitive ELISA assays indeed showed that Bla g 4 and Per a 4 shared similar epitopes and that Per a 4 could inhibit IgE binding to Bla g 4 as strongly as the self-antigen (Fig. 7B). In contrast, Bla g 4 can only inhibit up to 60% of IgE binding in Per a 4, even at the highest protein concentration tested (Fig. 7C). We propose that there is a unique IgE epitope in Per a 4, whereas most IgE epitopes found in Bla g 4 are also present in Per a 4, at least for the population in Singapore where the American cockroach (P. americana) predominates over the German cockroach (B. germanica). This finding agrees with our results that the unique IgE epitope mapped in Per a 4 is not the major epitope for Bla g 4. The IgE binding epitopes common to both Bla g 4 and Per a 4 remain to be determined.
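Inhibition values such as the 60% plateau described above are commonly derived from the ELISA signal measured with and without the competing protein. The sketch below shows this conventional calculation. Whether the cited protocol (21) uses exactly this formula is an assumption, and the function name and absorbance readings are hypothetical.

```python
def percent_inhibition(signal_with_inhibitor, signal_without_inhibitor):
    """Percent inhibition = 100 * (1 - inhibited signal / uninhibited signal)."""
    if signal_without_inhibitor <= 0:
        raise ValueError("uninhibited signal must be positive")
    return 100.0 * (1.0 - signal_with_inhibitor / signal_without_inhibitor)

# Hypothetical absorbance readings on a plate coated with one allergen.
uninhibited = 1.0
self_inhibited = 0.05   # serum preabsorbed with the coating allergen itself
cross_inhibited = 0.40  # serum preabsorbed with the other allergen

self_value = percent_inhibition(self_inhibited, uninhibited)
cross_value = percent_inhibition(cross_inhibited, uninhibited)
```

In this hypothetical reading, a cross-inhibition that plateaus well below the self-inhibition would indicate that some IgE recognizes epitopes present only on the coating allergen, which is the argument made for the unique epitope of Per a 4.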

DISCUSSION
Here, we have described the crystal structures of two major cockroach allergens, Bla g 4 and Per a 4, which assumed a basic lipocalin fold with an eight-stranded anti-parallel β-barrel and a C-terminal α-helix packed against the barrel. The most distinct structural feature is the two long-range disulfide bonds, in contrast to a single disulfide bond found in other known lipocalin allergens. The presence of disulfide bonds could be essential for IgE binding by maintaining stability of the allergen, especially for conformational epitopes. The allergenicity of a lipocalin allergen, Bos d 2, was shown to be diminished after substituting or deleting the four most C-terminal amino acids including the cysteine that forms a disulfide bond (35). Superposition of different lipocalin structures showed that they are different mainly at the loop regions surrounding the entrance to the ligand binding cavity. This may be related to highly varied ligand specificities in different lipocalins (36).
Comparison with other known dimeric lipocalins showed that Per a 4 could be yet another novel dimeric lipocalin. So far, there is no consistent pattern of dimerization among different lipocalins. The Equ c 1 dimer is formed by side-to-side packing of the β-barrels through the interaction of strands βF, βG, and βH from each monomer, with the two α-helices on the same side of the dimer and running anti-parallel to each other (10). Sides of β-barrels of MUP1 pack against each other through βB, βC, βD, and two Cd²⁺ ions. Entrances to the ligand binding cavity open up to the same side of the dimer. α-Helices run antiparallel to each other and are located on two ends of the dimer (12). On the other hand, dimerization of the bovine odorant-binding protein involves swapping of α-helices between the 2 monomers (37,38). How the dimerization is related to the physiological function of lipocalin remains unknown to date. However, the dimerization of the lipocalin allergen, β-lactoglobulin, has been proposed to increase the efficiency of cross-linking of two IgE-FcεRI receptor complexes and thus the potency of the protein to operate as an allergen (39). Moreover, a recent study on an aspartic protease-like cockroach allergen Bla g 2 also suggested a functional role of dimerization in allergenicity, as mutant proteins that did not dimerize induced less β-hexosaminidase release from mast cells (40).
The physiological function of an allergen may not be directly related to its allergenicity. Bla g 2 is an inactive aspartic protease but it is the most potent cockroach allergen (41). The ligand binding cavity of a lipocalin is buried at the center of the protein, and a correlation between ligand binding and allergenicity of the protein is unlikely. JH III is a putative ligand for both Bla g 4 and Per a 4, but in the docking experiment it adopted different bound conformations in these two proteins. The locations and identities of JH III binding residues of these two proteins are very different from each other. Among all the identified ligand binding residues, only Tyr 96 of Bla g 4 is conserved with Tyr 97 of Per a 4. The locations of ligand binding residues have also been demonstrated in other lipocalin proteins such as β-lactoglobulin or Bos d 5 (42), human apolipoprotein D (36), retinol-binding protein (43), human complement protein C8γ (44), and lipocalin-type prostaglandin D synthase (45). However, highly conserved ligand binding residues are lacking among various lipocalins based on structure alignment, and this agrees with the diverse ligand-binding activities of lipocalins (46). The absolutely conserved Trp residue of the GXW sequence on strand βA is reported to be involved in ligand binding in four of the five proteins listed above, the exception being the retinol-binding protein. This absolutely conserved Trp residue (Trp 29 in Bla g 4 and Trp 24 in Per a 4), however, is not involved in direct contact with the ligand JH III in the model complexes with either Bla g 4 or Per a 4.

FIGURE 6. Mapping of IgE binding epitopes on Bla g 4 and Per a 4. Immuno dot-blot screening of A, Bla g 4; and B, Per a 4 mutants with sera from atopic patients. Data from the same serum are linked by straight lines. Numbers of sera used for screening Bla g 4 and Per a 4 were 24 and 32, respectively. C and D, IgE binding ELISA experiments using mutants from Bla g 4 and Per a 4, respectively, and serum from a single patient. E and F, locations of identified IgE binding epitope residues on Bla g 4 and Per a 4, respectively, based on immuno dot-blot screening. Side chains of the epitope residues are shown in ball and stick model and CPK colored. Graphic images were produced using the UCSF Chimera package (50).
The group 1 cockroach allergens, Bla g 1 and Per a 1, are found to be cross-reactive with each other (47). This is not surprising as the two allergens are highly similar to each other, with a sequence identity of 70%. Surprisingly, Bla g 4 is found to be cross-reactive with Per a 4 even though the sequence identity between the two allergens is only 21%. The cross-reactivity is likely due to similar structural features rather than specific amino acid sequences. We have identified a major epitope on Per a 4 at the loops on the bottom of the β-barrel. However, the common epitope shared by Bla g 4 and Per a 4 remains to be elucidated. A recent structure of the complex between a recombinant IgE Fab fragment and β-lactoglobulin showed that the IgE epitope is located on the flat β-sheet region of β-lactoglobulin and is formed mainly by the first four β-strands (39). Site-directed mutagenesis on corresponding regions of Bla g 4 and Per a 4 may help to determine the shared IgE epitopes of these two allergens.